AI detectors are often treated like final judges, especially by students who worry that a normal-sounding paragraph might still get flagged. So we ran a 100-sample test: AI-written text was rewritten with Stealthwriter AI, then checked in QuillBot’s AI Detector. The result was not a murky middle ground. It was a sharp split: most rewrites looked fully human to QuillBot, while the rest failed badly.
How the Test Worked
Each sample had three parts: the original text, the Stealthwriter rewrite, and the QuillBot detector result. QuillBot reports what share of the text it considers likely AI-generated, so each result was converted into a human score. In simple terms, a higher human score means QuillBot was more likely to treat the text as human-written.
For this review, a score of 70% human or higher is counted as a pass. That is not a universal rule; it is just an easy cutoff for comparing results. Think of it like a class test: 70% and above means the rewrite cleared the line, while anything below that still looked suspicious.
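To make the scoring rule concrete, here is a minimal sketch of that conversion and cutoff in Python. The 100-minus-AI-percentage conversion and the `human_score`/`passes` helpers are assumptions made for illustration, not QuillBot’s own formula or API.

```python
# Minimal sketch of the scoring convention used in this review (an assumption,
# not QuillBot's internal formula): human score = 100 - reported AI percentage.
PASS_CUTOFF = 70.0  # the 70%-human line used throughout this article


def human_score(percent_ai: float) -> float:
    """Convert a detector's AI-likelihood percentage into a human score."""
    return 100.0 - percent_ai


def passes(percent_ai: float) -> bool:
    """A rewrite counts as a pass if its human score clears the cutoff."""
    return human_score(percent_ai) >= PASS_CUTOFF


print(human_score(12.0), passes(12.0))  # 88.0 True  -> cleared the line
print(human_score(85.0), passes(85.0))  # 15.0 False -> still looks suspicious
```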
Also Read: [STUDY] StealthWriter vs Sapling AI: Can 100 Humanized Rewrites Slip Through?
Main Findings From the 100-Sample Dataset
- Average human score: 81.35% across all 100 rewrites.
- Median human score: 100%. The median is the middle value after all scores are sorted.
- Pass rate: 79 out of 100 samples scored 70% human or higher.
- Failure group: 21 samples failed, and their average human score was only 11.2%.
- Perfect scores: 79 samples received a full 100% human score from QuillBot.
- Rewrite length: Stealthwriter rewrites were 7.3% longer on average.
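The headline numbers above can be reproduced from a plain list of per-sample human scores. The sketch below shows the calculation with made-up placeholder scores, not the actual 100-sample dataset.

```python
# Reproducing the summary statistics from a list of human scores.
# The scores below are illustrative placeholders, not the real dataset.
from statistics import mean, median

human_scores = [100, 100, 8, 100, 23, 100, 0, 100, 100, 15]  # hypothetical
PASS_CUTOFF = 70

passed = [s for s in human_scores if s >= PASS_CUTOFF]
failed = [s for s in human_scores if s < PASS_CUTOFF]

print("average human score:", round(mean(human_scores), 2))
print("median human score:", median(human_scores))
print("pass rate:", f"{len(passed)} of {len(human_scores)}")
print("failure-group average:", round(mean(failed), 2) if failed else "n/a")
print("perfect 100% scores:", human_scores.count(100))
```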
The Results Were Surprisingly Polarized
The most interesting part of the test was not just that Stealthwriter performed well. It was how uneven the detector response was. There were almost no middle-ground results. A rewrite usually landed at 100% human, or it fell below 30% human. That means QuillBot did not behave like a smooth slider in this dataset. It behaved more like a switch.
Also Read: [STUDY] Can Stealthwriter Really Bypass Copyleaks? What 100 Samples Show
This matters because a high average can hide risk. An 81.35% average sounds excellent, but it does not mean every rewrite was safely in the 80s. Instead, the dataset shows 79 strong passes and 21 weak misses. For students, that is a useful warning: a tool can look impressive overall and still fail badly on a specific paragraph.
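One quick way to see this switch-like behavior is to bucket the human scores into bands and count how many samples land in each. The sketch below does that with placeholder scores; in the real dataset, nearly everything piled up at the two extremes.

```python
# Bucket human scores into 10-point bands to visualize the "switch, not
# slider" pattern. Scores here are placeholders, not the actual dataset.
from collections import Counter

human_scores = [100, 100, 100, 5, 100, 0, 100, 27, 100, 100]  # hypothetical

bands = Counter((s // 10) * 10 for s in human_scores)
for band in range(0, 101, 10):
    label = f"{band}-{min(band + 9, 100)}%"
    print(f"{label:>9}: {'#' * bands.get(band, 0)}")
```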
Average Score: Strong, But Not Invincible
When all samples are combined, the rewritten text earned an average human score of 81.35%. In detector terms, that is a strong result. It suggests that Stealthwriter was highly effective at making most samples look human to QuillBot.
The catch is simple: the failed rewrites were not close calls. The average human score among the failed group was only 11.2%. So when Stealthwriter worked, it worked very well. When it missed, QuillBot usually saw the text as strongly AI-generated.
Also Read: [STUDY] Can Stealthwriter Outsmart GPTZero? A 100-Sample Test
Proof From the Test Screenshots
The screenshots below show the same pattern from another angle. Several Stealthwriter outputs received very high human scores inside Stealthwriter’s own interface, while the QuillBot checks ranged from clean passes to more mixed results.
Longer Text Did Not Automatically Mean Better Detection Scores
Stealthwriter usually made the text longer. The original samples averaged 182.9 words, while the rewritten samples averaged 195.5 words. That is an average increase of 12.6 words, or about 7.3%.
However, adding words was not a magic trick. Some failed samples were also longer after rewriting. This suggests QuillBot was reacting to sentence style, wording patterns, or structure rather than simply rewarding longer text.
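If you want to run the same length check on your own rewrites, a minimal sketch looks like the one below. The sample records, field names, and texts are invented placeholders; only the 70% cutoff comes from this review.

```python
# Compare word counts before and after rewriting, alongside pass/fail.
# The records below are invented placeholders, not rows from the dataset.
samples = [
    {"original": "Short draft about study habits written by an AI model.",
     "rewrite": "A slightly longer rewritten draft about study habits produced by the humanizer tool.",
     "human_score": 100},
    {"original": "Another AI draft, this time about campus recycling programs.",
     "rewrite": "Another rewritten draft about recycling programs on campus.",
     "human_score": 10},
]


def word_count(text: str) -> int:
    return len(text.split())


for s in samples:
    change = word_count(s["rewrite"]) - word_count(s["original"])
    verdict = "pass" if s["human_score"] >= 70 else "fail"
    print(f"{verdict}: word-count change of {change:+d}")
```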
The Quality Problem: Passing a Detector Is Not the Same as Good Writing
The detector scores look impressive, but the rewrites were not perfect. A human reader could still spot problems in several samples. This is important for students because a detector score only answers one question: Does this look AI-generated to the detector? It does not answer: Is this accurate, natural, and ready to submit?
- Meaning drift: In one battery-safety rewrite, “unplug your device once it reaches full charge” became “discharge your gadget after it is fully charged,” which changes the advice.
- Incorrect wording: A Mars rover sample turned into “exploration spaceships on Mars,” even though rovers are not spaceships.
- Awkward grammar: Examples included phrases like “be able to adapts,” “With targeting your form,” and “the experiences that you can experience.”
- Typos and wrong word choice: One VR rewrite said “Immense yourself” instead of “Immerse yourself.”
- Formatting issues: One numbered list started at 2 instead of 1, and another had a lowercase list heading: “3. smell the vegetables.”
- Unchanged output: One sample appeared unchanged after rewriting and received a 0% human score.
These issues do not destroy the overall result, but they change how the result should be understood. Stealthwriter was strong at reducing AI-detection risk in QuillBot, yet it still needs human editing. A rewrite can pass a detector and still sound clumsy, over-formal, or slightly inaccurate.
Also Read: Stealthwriter vs ZeroGPT: I Tested 100 Rewrites, and the Results Were Complicated!
What Students Should Take Away
For students, the lesson is not “use a humanizer and stop thinking.” The lesson is that AI detectors are imperfect, and rewriting tools can change how detectors judge text. That makes detector results useful, but not final.
- Do not trust the score alone. A 100% human score does not prove the writing is clear or correct.
- Read every rewrite yourself. Look for broken grammar, changed meaning, odd vocabulary, and missing list formatting.
- Use AI ethically. A tool can help with phrasing and revision, but the final thinking, sources, and responsibility should still be yours.
Final Verdict
Based on this 100-sample test, Stealthwriter AI was highly effective at getting rewritten text past QuillBot’s AI Detector. It achieved a 79% pass rate using a 70% human-score cutoff, and the overall average human score was 81.35%.
But the result is not a blank check. The failures were severe, and the rewrite audit found grammar problems, formatting mistakes, unchanged text, and a few meaning shifts. The best conclusion is balanced: Stealthwriter can strongly reduce QuillBot AI flags, but human review is still the real final detector.

