How AI Detection Works in 2025 — And How to Beat It

Published June 10, 2025 · 7 min read

AI-generated text is everywhere. Students use ChatGPT for essays, marketers use it for ad copy, and developers lean on Copilot for documentation. In response, an entire industry of AI detection tools has emerged — GPTZero, Turnitin’s AI module, Originality.ai, Copyleaks, and many others. But how do they actually work under the hood? And more importantly, why do they so often get it wrong?

In this article we will break down the core techniques behind AI detection: statistical metrics like perplexity and burstiness, classifier-based approaches, and watermarking schemes. We will also explain why false positives are an inherent problem — and how you can legitimately humanize AI-assisted text so it reads the way you actually write.

What Is Perplexity in NLP?

Perplexity is a measure of how “surprised” a language model is by a piece of text. Technically, it is the exponentiated average negative log-likelihood of each token given the preceding context. In plain language: if a sentence is highly predictable — meaning the next word is almost always the most probable choice — it has low perplexity. If the sentence takes unexpected turns, uses unusual word choices, or employs creative phrasing, it has high perplexity.
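To make the definition concrete, here is a minimal sketch in Python. The per-token probabilities are invented for illustration — a real detector would get them from a reference language model:

```python
import math

def perplexity(token_probs):
    """Perplexity = exp of the average negative log-likelihood
    of each token given its preceding context."""
    nll = [-math.log(p) for p in token_probs]
    return math.exp(sum(nll) / len(nll))

# A "predictable" sentence: the model assigned high probability to each token.
predictable = [0.9, 0.8, 0.95, 0.85]
# A "surprising" sentence: several low-probability word choices.
surprising = [0.3, 0.05, 0.6, 0.1]

assert perplexity(predictable) < perplexity(surprising)
```

If every token had probability 0.5, the perplexity would be exactly 2 — intuitively, the model was choosing between two equally likely options at each step.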

Human writing tends to sit in a wide perplexity range. We use idioms, make grammatical shortcuts, switch registers mid-paragraph, and occasionally pick a less-obvious word just because it sounds better. Large language models, on the other hand, are optimized to select the most statistically likely next token. The result is text that is fluent, polished — and suspiciously predictable. This predictability is exactly what detectors exploit.

What Is Burstiness?

Burstiness refers to the variation in sentence structure, length, and complexity across a document. Think about how a skilled human author writes: one paragraph might contain a long, carefully subordinated sentence with multiple clauses, followed by a punchy three-word fragment. Then a question. Then a list. The rhythm is uneven, almost musical.

AI-generated text rarely does this. Because each token is sampled from a probability distribution that favors smooth, coherent output, the sentences tend to converge toward a uniform length and complexity. A typical ChatGPT paragraph contains sentences of roughly 15 to 25 words each, with consistent clause depth and minimal structural variation. Detectors measure this uniformity and flag it.
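A crude burstiness proxy can be computed from sentence lengths alone. The sketch below uses the coefficient of variation of sentence length; real detectors also weigh clause depth and punctuation patterns, and the threshold-free scoring here is purely illustrative:

```python
import statistics

def burstiness(text):
    """Variation in sentence length: stdev / mean (coefficient of variation).
    0.0 means perfectly uniform sentences; higher means more 'bursty'."""
    for p in "?!":
        text = text.replace(p, ".")
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.stdev(lengths) / statistics.mean(lengths)

uniform = "The cat sat on the mat. The dog lay on the rug. The bird sat on the wire."
varied = ("It rained. The storm that had been building over the hills "
          "all afternoon finally broke. Chaos.")

assert burstiness(varied) > burstiness(uniform)
```

The uniform passage scores 0.0 (every sentence is six words), while the varied one scores above 1.0 — exactly the uneven rhythm detectors look for.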

Why AI Text Has Low Perplexity and Low Burstiness

These two metrics work together. Low perplexity means each word is the “expected” next word. Low burstiness means every sentence looks structurally similar to the last. Combined, you get text that reads like it was produced by a machine running on autopilot — which it was. A 2023 study from the University of Maryland found that GPT-3.5-generated essays had an average perplexity 40% lower than human essays on the same topics, and a burstiness index less than half that of human writers.

This is the fundamental signal that most first-generation detectors rely on. If a document scores below a certain perplexity threshold and its burstiness is too uniform, it gets flagged.

How GPTZero and Similar Detectors Use These Metrics

GPTZero, one of the most widely used detection tools, was built specifically around perplexity and burstiness scoring. It processes input text through a reference language model, computes per-sentence perplexity, then analyzes the distribution of those scores across the document. If the average perplexity is low and the variance (burstiness) is also low, the tool assigns a high probability of AI origin.

Some tools go further, computing perplexity at multiple granularities — word-level, sentence-level, and paragraph-level — and comparing them to known distributions of human and AI text. Others use sliding windows so they can identify which specific passages within a longer document were likely machine-generated, producing a per-sentence highlight map.
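The combined decision rule of a first-generation detector can be sketched in a few lines. The thresholds below are invented for illustration — they are not GPTZero's actual values — and the per-sentence perplexities would come from a reference language model:

```python
import statistics

def flag_document(sentence_perplexities, ppl_threshold=20.0, burst_threshold=0.3):
    """Toy first-generation detector: flag as likely AI when average
    perplexity is low AND the spread across sentences is also low."""
    avg = statistics.mean(sentence_perplexities)
    spread = statistics.stdev(sentence_perplexities) / avg
    return avg < ppl_threshold and spread < burst_threshold

# Hypothetical per-sentence perplexity scores:
ai_like = [12.0, 13.5, 11.8, 12.9]    # low and uniform -> flagged
human_like = [15.0, 48.0, 9.5, 33.0]  # wide spread -> not flagged

assert flag_document(ai_like) is True
assert flag_document(human_like) is False
```

Note how the rule requires both signals at once: low average perplexity alone is not enough, which is why a single clichéd sentence in an otherwise varied document does not trigger a flag in this sketch.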

Classifier-Based Detection

Statistical metrics alone are not enough. The second major approach is supervised classification. Companies like Originality.ai and Copyleaks train neural networks — typically fine-tuned transformer models — on large datasets of paired human and AI text. The classifier learns subtle distributional patterns that go beyond simple perplexity: token frequency distributions, positional biases, phrase-level repetition patterns, and even punctuation habits.

These classifiers can be quite accurate on in-distribution data (text generated by the same model they were trained against). The problem is generalization. A classifier trained on GPT-3.5 output may struggle with Claude or Gemini output, and vice versa. As new models are released every few months, detection classifiers are locked in a perpetual game of catch-up.
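The shape of such a classifier — stylometric features in, probability of AI origin out — can be sketched with a toy linear model. The features, weights, and bias below are all invented for illustration; production systems fine-tune transformer models on millions of examples rather than hand-picking three features:

```python
import math

def classify(features, weights, bias):
    """Toy linear classifier over stylometric features.
    Returns P(AI-generated) via a sigmoid over a weighted sum."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical features: [scaled avg perplexity, burstiness, type/token ratio].
# Negative weights: lower values of each feature push the score toward "AI".
weights = [-2.0, -1.5, -1.0]
bias = 2.5

p_ai = classify([0.3, 0.2, 0.5], weights, bias)     # low-perplexity, uniform text
p_human = classify([0.9, 0.8, 0.9], weights, bias)  # varied, surprising text

assert p_ai > p_human
```

The generalization problem described above shows up here too: the weights are only meaningful for the distribution they were fitted on, so text from a newer model can land anywhere in feature space.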

Watermarking Approaches

A third, more experimental technique is statistical watermarking. In this approach, the AI model itself embeds an invisible signal during generation — for example, by slightly biasing token selection toward a cryptographically determined subset of the vocabulary. The watermark is undetectable to human readers but can be verified by anyone who knows the secret key.

Research from the University of Maryland and Google DeepMind has demonstrated working watermark schemes. However, watermarking requires cooperation from the model provider, can be defeated by paraphrasing, and raises civil-liberties concerns. It also does nothing for text generated by open-source models like LLaMA or Mistral, where anyone can strip or modify the generation pipeline. For these reasons, watermarking remains a supplementary tool rather than a primary detection method.
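The green-list idea behind these schemes can be sketched as follows. This is a heavily simplified illustration: real schemes use a keyed hash (so only the provider can verify), bias token sampling rather than forcing green tokens, and test significance with a z-score. The vocabulary and seeding below are invented for this example:

```python
import hashlib
import random

def green_list(prev_token, vocab, fraction=0.5):
    """Partition the vocabulary into a 'green' subset, seeded by a hash
    of the previous token. Generation favors green tokens; detection
    checks how often tokens land in their green list."""
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16)
    rng = random.Random(seed)
    shuffled = sorted(vocab)
    rng.shuffle(shuffled)
    return set(shuffled[: int(len(shuffled) * fraction)])

def green_fraction(tokens, vocab):
    """Detection side: fraction of tokens that fall in the green list
    determined by their predecessor. Unwatermarked text hovers near
    the green-list fraction (0.5 here); watermarked text sits well above."""
    hits = sum(1 for prev, tok in zip(tokens, tokens[1:])
               if tok in green_list(prev, vocab))
    return hits / max(1, len(tokens) - 1)

vocab = [f"w{i}" for i in range(100)]
# Watermarked "generation": always pick a token from the green list.
tokens = ["w0"]
for _ in range(20):
    tokens.append(min(green_list(tokens[-1], vocab)))

assert green_fraction(tokens, vocab) == 1.0
```

This also makes the paraphrasing weakness obvious: replacing even a few tokens breaks the prev-token/green-list pairing, pulling the detected fraction back toward chance.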

Why False Positives Are Inevitable

Here is the uncomfortable truth: every detection method produces false positives. Non-native English speakers often write with low perplexity because they rely on common, well-learned phrases. Technical writing and legal documents are inherently formulaic, which depresses burstiness. Students who follow a five-paragraph essay template will produce text that looks statistically uniform.

Multiple studies have shown that AI detectors disproportionately flag writing by non-native speakers. A 2023 Stanford study found that popular AI detectors flagged over 60% of TOEFL essays written by real humans as AI-generated. This is not a minor edge case — it is a systemic flaw baked into the metrics themselves. When your detection signal is “text that is too clean and too uniform,” you will inevitably catch humans who write cleanly and uniformly.

How to Legitimately Humanize AI-Generated Text

If you use AI as a writing assistant — to draft, brainstorm, or restructure — you should not be penalized for it. The key is to make the final text genuinely yours. Here are the core strategies, mapped directly to the metrics detectors use:

Increase perplexity. Replace generic phrasing with specific, concrete language. Instead of “It is important to note,” write “Here is what most people miss.” Swap common collocations for less predictable word choices. Use metaphors, analogies, and domain-specific jargon that a general-purpose LLM would not default to.

Add burstiness. Vary your sentence lengths deliberately. Follow a 40-word sentence with a 6-word one. Use fragments. Ask rhetorical questions. Insert parenthetical asides (like this one). Break the monotony of subject-verb-object ordering with inverted clauses and fronted adverbials.

Vary vocabulary and register. Mix formal and informal tones within the same piece. Use contractions in some places but not others. Introduce first-person anecdotes or opinions. Real human text is messy, opinionated, and inconsistent — lean into that.

Restructure, do not just rephrase. Changing individual words is not enough. Reorder paragraphs, merge ideas, split arguments across sections differently than the AI originally laid them out. The organizational structure of a document carries its own statistical fingerprint.

How HumanizeIt Does This Automatically

Doing all of this manually is tedious and time-consuming. That is exactly why we built HumanizeIt. Our platform uses advanced algorithms that target the same signals detectors look for — but in reverse.

HumanizeIt analyzes your text at the token, sentence, and paragraph level. It identifies passages with suspiciously low perplexity and injects controlled lexical variation — swapping predictable tokens for contextually appropriate but less probable alternatives. It measures sentence-level burstiness and restructures passages to introduce the kind of natural rhythm that human writing exhibits: short bursts followed by longer elaborations, rhetorical pivots, and tonal shifts.

Unlike simple paraphrasing tools that just swap synonyms (often introducing errors or awkward phrasing), HumanizeIt preserves your original meaning, tone, and intent. The output is not “spun” text — it is text that genuinely reads like a human wrote it, because the statistical profile matches human writing patterns.

The result? Text that passes GPTZero, Turnitin, Originality.ai, and other major detectors — not by exploiting a loophole, but by producing output that is statistically indistinguishable from human-authored content.

Ready to humanize your AI text?

Paste your text into HumanizeIt and get human-quality output in seconds. No more false flags. No more detector anxiety.

Get Started Free