Stanford Study: AI Detection Tools Falsely Accuse International Students of Cheating

Artificial intelligence (AI) detection tools are biased against non-native English speakers and falsely accuse them of cheating, according to a Stanford study.

Johns Hopkins University professor Taylor Hahn first noticed the issue when he got an alert while grading an international student’s paper, according to a report by The Markup.

Hahn had uploaded the student’s paper to Turnitin, a software tool reportedly used by more than 16,000 academic institutions around the world to detect plagiarism and, more recently, to spot AI-generated text.

Turnitin ended up labeling more than 90 percent of the paper as AI-generated, so Hahn set up a meeting to ask the student about it.

“This student, immediately, without prior notice that this was an AI concern, they showed me drafts, PDFs with highlighter over them,” Hahn recalled of his meeting with the student.

The professor was therefore convinced that Turnitin’s AI-detection tool had made a mistake.

In another instance, Hahn had worked directly with a non-native English-speaking student on an outline and drafts for a paper, only to find later that Turnitin flagged the majority of the paper as AI-generated.

Against this backdrop, a group of Stanford computer scientists conducted an experiment to see how reliable AI detectors are when it comes to writing by non-native English speakers. Their paper, published last month, found a clear bias in AI-detection tools.

While the Stanford study did not involve Turnitin, it found that seven other AI detectors flagged writing by non-native English speakers as AI-generated 61 percent of the time, and that the incorrect verdict was unanimous across all seven tools for roughly 20 percent of the papers.

Meanwhile, the AI detectors almost never made the same mistakes when evaluating writing by native English speakers.

One theory for why AI detectors show this bias is that the tools are typically programmed to flag content whose word choice is predictable and simple, a pattern that non-native English speakers are more likely to exhibit in their writing.

When writing in a first language, a person usually has a larger vocabulary and a better grasp of complex grammar, which is not always the case when writing in a second language.

AI tools such as the popular chatbot ChatGPT, meanwhile, mimic human writing by drawing on everything they have processed and composing sentences from the most common words and phrases, The Markup noted.
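To give a rough sense of the "predictability" idea, the toy sketch below flags text whose average word choice looks too common. It is not how Turnitin or the detectors in the Stanford study actually work; the word list, probabilities, and threshold are all invented for the example.

```python
# Illustrative toy example only: real detectors score text with large language
# models, not a hand-made word list. All numbers below are made up.
import math

# Hypothetical relative frequencies for a handful of common English words.
COMMON_WORD_PROB = {
    "the": 0.05, "is": 0.02, "good": 0.005, "very": 0.004,
    "important": 0.002, "people": 0.003, "think": 0.002, "idea": 0.001,
}
RARE_WORD_PROB = 1e-6  # fallback for any word not in the list

def predictability(text: str) -> float:
    """Average log-probability per word; higher means more predictable text."""
    words = text.lower().split()
    if not words:
        return float("-inf")
    log_probs = [math.log(COMMON_WORD_PROB.get(w, RARE_WORD_PROB)) for w in words]
    return sum(log_probs) / len(words)

def flag_as_ai(text: str, threshold: float = -8.0) -> bool:
    """Flag text whose average predictability exceeds an arbitrary threshold."""
    return predictability(text) > threshold

# Plain, common wording scores as "predictable" and gets flagged,
# even though a person writing in a second language could easily have written it.
print(flag_as_ai("the people think the idea is very good"))   # True (flagged)
print(flag_as_ai("quixotic ruminations obfuscate pedagogy"))  # False (not flagged)
```

The point of the sketch is that plain, common wording looks "predictable" whether it came from a chatbot or from someone writing in a second language, which is the overlap the Stanford researchers describe.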

ChatGPT has become a major problem in the world of academia, as students are increasingly using the tool as their go-to source for cheating.

As Breitbart News previously reported, students at an elite academic program at a Florida high school were accused of cheating by using ChatGPT to write their essays.

A study published earlier this year also found that 17 percent of students at Stanford University admitted to using ChatGPT on their final exams.

You can follow Alana Mastrangelo on Facebook and Twitter at @ARmastrangelo, and on Instagram.
