When Edgar Allan Poe Fails the AI Detector: Why These Tools Are Destroying Writer Credibility
Disclaimer: These tests were run on only one AI detector (which will remain nameless). It is one of the most popular detection websites, and I know many teachers, editors, and readers who use it. While results may vary across different detectors, the fundamental issue remains: if a tool this widely used can flag Poe and Irving as AI, we have a serious credibility problem.
I was talking to a friend the other day, a fellow teacher, and he said that if a piece of work turned in by a student, or anything he finds on the internet, tests at even 20% AI, he rejects it outright. More than that, he claims to know WITH CERTAINTY that it’s AI-written. The detector said 20%, so case closed.
This got me thinking. How many pieces are testing as AI but were actually written by human hands?
I decided to run an experiment. First, I tested some stories I wrote in college. Many of them came back flagged at 20% AI or higher, according to a popular online detector. I wrote those stories twenty years ago, long before ChatGPT was even a glimmer in some programmer’s eye.
Then I thought, what about the masters? What about the writers we’ve been teaching for generations? So I grabbed some famous short stories and ran them through the same detector. The results shocked me.
The Method
I copied roughly 10,000 words from each story (the detector I used maxes out at 15,000 words) and pasted the text into the tool. I believe this mirrors how many educators and editors actually check web stories and articles: a quick cut-and-paste job to see whether the content is “real” or not.
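If you want to replicate the prep step, it’s easy to script. Below is a minimal Python sketch, assuming you have plain-text copies of the stories (all of the ones I tested are in the public domain and available from sources like Project Gutenberg); the file name is just a placeholder, not part of my actual setup:

```python
# Trim a story to roughly 10,000 words so it fits comfortably
# under the detector's 15,000-word input limit.

def trim_for_detector(path: str, max_words: int = 10_000) -> str:
    with open(path, encoding="utf-8") as f:
        words = f.read().split()
    return " ".join(words[:max_words])

if __name__ == "__main__":
    # "rip_van_winkle.txt" is a hypothetical local file name
    excerpt = trim_for_detector("rip_van_winkle.txt")
    print(f"{len(excerpt.split()):,} words ready to paste into the detector")
```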
The Results
First, let me say that many stories tested at about 20%. I ran maybe 50 stories through the detector in total, then put the 10 most famous with the highest detection percentages on the list. Here’s what I found:
| Title | Author | Year Published | AI Detection % |
| --- | --- | --- | --- |
| Rip Van Winkle | Washington Irving | 1819 | 78.33 |
| The Cask of Amontillado | Edgar Allan Poe | 1846 | 69.59 |
| The Monkey’s Paw | W. W. Jacobs | 1902 | 65.44 |
| The Tell-Tale Heart | Edgar Allan Poe | 1843 | 62.95 |
| Rikki-Tikki-Tavi | Rudyard Kipling | 1894 | 62.85 |
| An Occurrence at Owl Creek Bridge | Ambrose Bierce | 1890 | 61.06 |
| The Gift of the Magi | O. Henry | 1905 | 58.78 |
| The Black Cat | Edgar Allan Poe | 1843 | 50.34 |
| A Rose for Emily | William Faulkner | 1930 | 41.91 |
| The Necklace | Guy de Maupassant | 1884 | 40.87 |
Washington Irving scores 78%. Edgar Allan Poe would be rejected from my friend’s class three times over. Rudyard Kipling? Sorry, that’s AI. O. Henry? Flagged.
These aren’t borderline cases hovering around 20%. Some of these are over 60%, over 70%. If I didn’t know these stories were written a century or more ago, I might believe they were AI-generated myself based solely on these numbers.
What This Actually Means
AI detectors don’t detect AI. They detect patterns. They flag clear, direct prose. They penalize good grammar and logical sentence structure. They mistake efficiency for artificiality.
Is it the characteristics of good writing that are being detected? Teachers have spent decades telling students to be clear, to be concise, to use active voice, to structure their thoughts logically. We’ve taught them to revise until their prose flows smoothly. Have we trained them to write exactly the way these detectors now flag as “artificial”?
It seems like the detectors, at this point, are identifying some of the hallmarks of competent writing and calling them fake.
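To make the “patterns” point concrete: detectors generally lean on statistical proxies for machine-like text, such as how uniform sentence lengths are or how predictable the word choices run. The toy Python sketch below measures one such proxy, sentence-length variation (sometimes called burstiness). This is my own illustration of the general idea, not the algorithm of the detector I tested or of any real product:

```python
import re
import statistics

def sentence_lengths(text: str) -> list[int]:
    """Split on end punctuation and count the words in each sentence."""
    sentences = re.split(r"[.!?]+", text)
    return [len(s.split()) for s in sentences if s.strip()]

def burstiness(text: str) -> float:
    """Standard deviation of sentence length. Low values mean very
    uniform prose, the regularity detectors tend to read as machine-like."""
    lengths = sentence_lengths(text)
    return statistics.pstdev(lengths) if lengths else 0.0

# Evenly paced, polished prose scores LOWER here than ragged prose,
# which is exactly why careful revision can look "artificial."
clean = "The cat sat on the mat. The dog lay by the fire. The rain fell all night."
ragged = "Rain. It fell all night, hammering the roof while the dog, restless, paced. The cat sat."
print(burstiness(clean))   # ~0.5
print(burstiness(ragged))  # ~4.8
```

The point isn’t that this toy score is accurate. It’s that any metric built on surface regularity will, by construction, penalize the evenness that careful revision produces.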
Fears of Real Damage
Think about what happens when we rely on these tools:
A student turns in a well-written essay. It’s clear, organized, and grammatically sound. The detector flags it at 35% AI. The teacher assumes cheating. The student gets a zero and, in some cases, faces everything from academic suspension to expulsion. These are worst-case scenarios, but they have happened.
Magazines and literary journals are now stuck. Do they run every submission through a detector? Do they trust their own editorial judgment? What happens when a detector flags a submission at 45%, but the editor genuinely believes it’s human-written? The safe choice becomes rejecting anything that scores high. But does “safe” mean losing quality work and damaging relationships with good writers?
A freelance writer submits an article to a publication. The editor runs it through a detector as standard practice now. It comes back 40% AI. The writer loses the assignment, gets quietly blacklisted, and wonders why their career is stalling. They wrote every word themselves, but the machine says otherwise, and the machine doesn’t make mistakes, right?
Except the machine is making mistakes. Constantly.
How many writers are being rejected right now because their prose is too clean? How many students are being accused of cheating because they actually learned what their teachers taught them?
The Deeper Issue
There’s something else happening here that worries me more. These detectors may be creating an environment where bad writing is safer than good writing. If clear, efficient prose gets flagged, then students and writers will adapt. They’ll write worse on purpose. They’ll add unnecessary complexity, throw in awkward phrasing, make deliberate mistakes. They’ll sabotage their own work just to pass the detector.
Are we now incentivizing mediocrity? Are we teaching people that writing well is suspicious?
The Bottom Line
Edgar Allan Poe would fail my friend’s AI test. So would Washington Irving, Rudyard Kipling, and William Faulkner. If we’re going to reject anyone who scores above 20% as definitively using AI, we better be prepared to reject some of the greatest writers in the English language.
My friend who rejects anything over 20% with certainty isn’t detecting AI. He’s detecting competence and calling it fraud.
These tools are not reliable. Are they beginning to destroy the credibility of honest writers and honest students while giving us a false sense of security? Running a paper through a detector and treating the result as gospel isn’t protecting academic integrity—it’s abandoning it.
The detectors can’t tell the difference between Poe and a chatbot. Maybe it’s time we stopped pretending they can.
Leave us a comment. If you test other famous short stories, let us know.