The false promise of generative AI detectors

In the months since ChatGPT and other generative artificial intelligence (AI) applications made their public debut, a whole slate of different — yet closely related — applications have also risen to prominence: generative AI detectors like GPTZero and’s AI Content Detector.

These tools claim to identify text generated by large language models (LLMs). But how well do they actually work? As it turns out, these AI-detection tools are pretty biased themselves, and they probably aren’t the silver bullet for detecting and weeding out AI-generated content.

A team of researchers at Stanford University’s department of computer science recently published a study that found “an alarming bias in GPT detectors against non-native English speakers.” 

When the researchers ran 91 English-language essays written by native Chinese speakers through seven different popular generative AI detection tools, they found an average false-positive rate of 61%, meaning that the tools erroneously flagged, on average, 61% of these human-written essays as AI-generated. One detection tool even flagged 97.8% of the essays written by non-native speakers as AI-generated.

“Many teachers consider GPT detection as a critical countermeasure to deter ‘a 21st-century form of cheating,’ but most GPT detectors are not transparent,” the researchers write. “The design of many GPT detectors inherently discriminates against non-native authors, particularly those exhibiting restricted linguistic diversity and word choice.”

The researchers believe that, because generative AI detectors typically measure perplexity (roughly, how unpredictable or “random” a text’s structure and vocabulary appear to a language model) to determine whether a text was human-created, they have a bias against the simpler, less linguistically complex sentences often produced by non-native speakers of a language. It’s clear that more sophisticated measurements must be developed before these tools can be considered reliable.
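To make the idea of perplexity concrete, here is a minimal sketch in Python. It is not how any particular detector works: real tools score text with a large language model, whereas this toy uses simple word counts from a made-up reference sentence, and the function name `unigram_perplexity` and both sample sentences are invented for illustration. It only shows the general pattern the researchers describe, in which text built from common, predictable words scores a lower perplexity than text with more varied vocabulary.

```python
import math
from collections import Counter


def unigram_perplexity(text, reference_counts, vocab_size):
    """Perplexity of `text` under an add-one-smoothed unigram (word-count) model."""
    tokens = text.lower().split()
    if not tokens:
        raise ValueError("text must contain at least one token")
    total = sum(reference_counts.values())
    log_prob_sum = 0.0
    for token in tokens:
        # Add-one smoothing keeps unseen words from getting zero probability.
        # Counter returns 0 for words it has never seen.
        prob = (reference_counts[token] + 1) / (total + vocab_size)
        log_prob_sum += math.log(prob)
    # Perplexity is the exponential of the average negative log-probability.
    return math.exp(-log_prob_sum / len(tokens))


# Hypothetical reference text standing in for a language model's training data.
reference = "the cat sat on the mat and the dog sat on the rug"
counts = Counter(reference.split())
vocab_size = len(counts) + 1  # one extra slot for any unseen word

# Simple, predictable wording versus rarer, more varied wording (both invented).
plain = unigram_perplexity("the cat sat on the mat", counts, vocab_size)
varied = unigram_perplexity("a feline reclined upon woven fabric", counts, vocab_size)

print(f"plain wording:  {plain:.2f}")
print(f"varied wording: {varied:.2f}")
```

A detector that treats low perplexity as a sign of machine generation would be more suspicious of the first sentence than the second, even though a person could easily have written either one. That is the mechanism behind the bias the study describes.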

Such tools are often used by educators and hiring managers to flag AI-generated content. But the fact of the matter is that they are an unreliable measure of whether or not a given text is actually AI-generated — potentially making their use in such high-stakes situations as irresponsible as using tools like ChatGPT in the same scenarios.

Some AI detectors recognize this: GPTZero, for instance, includes a disclaimer that “The nature of AI-generated content is changing constantly. As such, these results should not be used to punish students. While we build more robust models for GPTZero, we recommend that educators take these results as one of many pieces in a holistic assessment of student work.” However, others provide no such disclaimer, making their use in high-stakes settings a well-meaning yet ultimately quite reckless endeavor.

Andrew Warner
Andrew Warner is a writer from Sacramento. He received his B.A. in linguistics and English from UCLA and is currently working toward an M.A. in applied linguistics at Columbia University. His writing has been published in Language Magazine, Sactown Magazine, and The Takeout.


MultiLingual Media LLC