Best AI Detector | Free & Premium Tools Compared
AI detectors are tools designed to detect when a text was generated by an AI writing tool like ChatGPT. AI content may look convincingly human in some cases, but these tools aim to provide a way of checking for it. We’ve investigated just how accurate they really are.
To do so, we used a selection of testing texts including fully ChatGPT-generated texts, mixed AI-and-human texts, fully human texts, and texts modified by paraphrasing tools. We ran all these texts through 12 different AI detectors to see how accurately each tool labelled them.
Our research indicates that if you’re willing to pay, the most accurate AI detector available right now is Scribbr’s premium AI Detector, which identified 84% of our texts correctly. If you don’t want to pay, the best choices are QuillBot’s and Scribbr’s free AI Detectors: they are totally free and both scored 78% accuracy, a tie for the second highest score among all tools.
Tool | Accuracy | False positives | Free? | Star rating |
---|---|---|---|---|
1. Scribbr (premium) | 84% | 0 | 4.2 | |
2. QuillBot | 78% | 0 | 3.9 | |
3. Scribbr (free) | 78% | 0 | 3.9 | |
4. Originality.AI | 76% | 1 | 3.7 | |
5. Sapling | 68% | 0 | 3.4 | |
6. CopyLeaks | 66% | 0 | 3.3 | |
7. ZeroGPT | 64% | 1 | 3.1 | |
8. GPT-2 Output Detector | 58% | 0 | 2.9 | |
9. CrossPlag | 58% | 0 | 2.9 | |
10. GPTZero | 52% | 1 | 2.5 | |
11. Writer | 38% | 0 | 1.9 | |
12. AI Text Classifier (OpenAI) | 38% | 1 | 1.8 |
General conclusions
In general, our research showed that because of how AI detectors work, they can never provide 100% accuracy. The companies behind some tools make strong claims about their reliability, but those claims are not supported by our testing. Of the tools we tested, only Scribbr’s premium AI Detector surpassed 80% accuracy; the best free tools, QuillBot’s free AI Detector and Scribbr’s free AI Detector, both scored 78%.
We also observed some other interesting trends:
- False positives (human-written texts flagged as AI) do happen. Four of the 12 tools we tested had a false positive, including one of the overall best tools, Originality.
- GPT-4 texts were generally harder to detect than GPT-3.5 texts. However, most tools do still detect GPT-4 texts in some cases.
- AI texts that have been combined with human text or paraphrased are hard to detect. Sribbr’s premium AI Detector does best with them but still finds only 60%.
- AI detectors generally don’t detect the use of paraphrasing tools on human-written text. Of the tools we tested, only Originality detected this in more than half of cases (60%).
- AI texts on specialist topics seem slightly harder to detect than those on general topics (67% vs. 76% accuracy).
- While most detectors show a percentage, they are often binary in their judgements – showing close to 100% or close to 0% in most cases, even when a text is about half-and-half.
Overall, AI detectors shouldn’t be treated as absolute proof that a text is AI-generated, but they can provide an indication in combination with other evidence. Educators using these tools should bear in mind that they are relatively easy to get around and can sometimes produce false positives.
1. Scribbr (premium)
- The most accurate detection out of all the tools we tested
- Very low false positive rate
- Detects a high proportion of edited AI texts and up to 100% of GPT-4 texts
- Provides a percentage
- Also includes a plagiarism check
- Offers a money-back happiness guarantee and live support
- Premium quality has a price
- Doesn’t always detect use of paraphrasing tools
- Doesn’t highlight text to indicate AI content
Scribbr’s premium AI Detector stood out as the best tool we tested in terms of accuracy. It had the highest overall accuracy score at 84%, did not incorrectly label any human text as AI-generated, and detected every GPT-4 text. Additionally, it was the best tool for detecting AI content that was combined with human text or run through a paraphrasing tool (it caught 60% of these texts).
The information provided is clear: the tool provides a percentage that indicates the likelihood it considers a text to be AI-generated. Additionally, the premium AI check is provided as an add-on to a premium plagiarism check. This dual approach provides comprehensive content assessment.
Users can get a premium AI Detector check for free in combination with a plagiarism check. There’s no word limit, so users can upload as big a document as they want.
Try Scribbr’s premium AI Detector now
2. QuillBot
- Highly accurate free tool
- Very low false positive rate
- Detects a high proportion of edited AI texts and up to 100% of GPT-4 texts
- Gives a percentage
- No sign-up needed
- Unlimited number of checks
- Doesn’t highlight text to indicate AI content
- Doesn’t always detect use of paraphrasing tools
QuillBot’s free AI Detector received an overall score of 78%. This made it one of the most accurate free tools we tested (alongside Scribbr’s free AI detector). It detected all GPT-3.5 and GPT-4 texts with 100% accuracy, and it had no false positives.
Additionally, it performed above average when used to detect AI content that had been combined with human-written text or altered using a paraphrasing tool (it accurately detected 50% of these texts).
QuillBot’s free AI Detector is quick and easy to use; You can paste in text or upload a file to test it immediately. The information provided is clear: the tool gives a percentage likelihood for text it considers to be AI-generated. However, it doesn’t highlight this text.
The tool is completely free and doesn’t require you to sign-up. It can scan up to 1200 words per check, and there’s no limit to the number of checks users can perform.
Try QuillBot’s Free AI Detector
3. Scribbr (free)
- Highly accurate free tool
- Very low false positive rate
- Detects up to 100% of GPT-4 texts and a high proportion of edited AI texts
- Gives a percentage
- No sign-up needed
- Unlimited number of checks
- Doesn’t highlight text to indicate AI content
- Doesn’t always detect use of paraphrasing tools
Scribbr’s free AI Detector had an overall score of 78%, making it one of the most accurate free tools we tested (alongside QuillBot).
The tool detected all GPT-3.5 and GPT-4 texts with 100% accuracy and did not incorrectly label any human-written text as AI-generated. It performed better than most tools when used to detect mixed AI and human-written text or text that had been modified using a paraphrasing tool (it detected 50% of these types of texts correctly).
Scribbr’s free AI Detector is fast and user-friendly. The tool provides clear information, giving a percentage likelihood for text it identifies as AI-generated.
Additionally, it is completely free to use and doesn’t require sign-up. Users can scan up to 500 words per check, and they can perform an unlimited number of checks.
Try Scribbr’s Free AI Detector
4. Originality.AI
- High accuracy
- Detects up to 100% og GPT-4 texts
- Sometimes detects use of paraphrasing tools
- Gives a percentage
- Highlights text to indicate likelihood of AI content
- Relatively high cost
- Requires sign-up to use
- Relationship between percentage and highlighting is not very clear
Originality.AI, another premium tool, performed almost as well as Scribbr, but with slightly lower overall accuracy (76%) and one false positive. However, it was the only tool in our testing to detect the use of paraphrasing tools more than half the time (60%); if you’re interested in this kind of detection, Originality is likely the best choice.
Originality gives a percentage likelihood that a text is AI-generated and highlights text in various colors to label it as AI or human. The highlighting doesn’t always have a clear relationship to the percentage shown, though. It’s not fully clear how the user should interpret the two pieces of information.
It’s worth noting that Originality’s pricing is fairly generous at $0.01 per 100 words, but there is a minimum spend of $20. Still, for that price, you get 200,000 words.
5. Sapling
- Free
- Quite accurate
- Low false positive rate
- Gives a percentage
- No sign-up needed – just paste in text
- Not clear how to interpret the two different kinds of highlighting
Sapling performed quite well for a free tool, with an overall score of 68%. It detected all GPT-3.5 texts and over half of GPT-4 texts (60%). It also had no false positives and did better than most tools at correctly highlighting the AI content in mixed AI-and-human texts.
Sapling is very quick and straightforward to use. There’s no sign-up required; you just paste in the text you want to check and get an instant result.
You get a percentage score followed by two highlighted versions of the text. It’s not really clear how the user is meant to interpret these two different highlighted texts, since they give different information. The first one is the one that matches the percentage given most closely.
6. CopyLeaks
- Free
- Accurate for a free tool
- Low false positive rate
- No sign-up required – just paste in text
- Doesn’t give an overall percentage
- Information provided is not clearly explained
- Limits on number of daily checks (even if you sign up)
CopyLeaks is one of the better free tools in terms of accuracy, at 66% (though this is much lower than the 99% claimed on the site). Like Sapling, it found all GPT-3.5 texts and over half of the GPT-4 texts, and it had no false positives.
However, CopyLeaks has some unfortunate downsides in terms of usability. There’s a limit on daily checks, which can be increased (but not removed) by signing up for a free account. Additionally, the results shown are very unclear compared to those of other tools.
Instead of an overall percentage, you just get a highlighted text. When you mouse over part of the text, a percentage is shown, but this is not the overall AI content percentage. It seems likely that this percentage represents the tool’s confidence in its label for that piece of text, but this is a guess – it’s not explained anywhere in the interface. As such, the tool is not user-friendly.
7. ZeroGPT
- Free
- Accurate for a free tool
- Gives a percentage, highlighting, and text assessment
- No sign-up required – just paste in text
- Not always clear how text assessment relates to percentage
- Doesn’t always detect GPT-3.5 texts
ZeroGPT performed quite well for a free tool, with 64% accuracy overall. It identified four of the five GPT-3.5 texts and three of the five GPT-4 texts. It performed particularly well at finding texts that consisted of paraphrased AI content or mixed AI-and-human content, finding 50% of these texts.
We found the tool straightforward to use. You can just paste in text (or upload a file) to test it immediately, and the results show a text label such as “Your Text is AI/GPT Generated”, a percentage, and text highlighting indicating which parts of the text are most likely AI.
We did find it hard to understand exactly how the text label related to the percentage, since very different percentages would sometimes show the same label, or vice versa. Additionally, the tool did have one false positive, identifying a human-written text as AI.
8. GPT-2 Output Detector
- Free
- Low false positive rate
- No sign-up required – just paste in text
- Provides a percentage
- Below-average accuracy
- No text highlighting
- Doesn’t always detect GPT-3.5 texts
GPT-2 Output Detector performed slightly below average in our testing, at 58%. It caught the same number of GPT-4 texts as Sapling and CopyLeaks, but it missed one of the GPT-3.5 texts. It had no false positives but was otherwise not very impressive in accuracy terms.
The interface provided is simplistic but clear, and there’s no sign-up required. You simply paste in your text and get percentages representing how much text is “real” and how much “fake”. There’s no text highlighting to indicate which is which, though.
GPT-2 Output Detector is an OK option, but there’s no real reason to use it instead of the more accurate and equally accessible QuillBot.
9. CrossPlag
- Free
- Low false positive rate
- No sign-up required – just paste in text
- Provides a percentage
- Below-average accuracy
- No text highlighting
- Doesn’t always detect GPT-3.5 texts
CrossPlag performs at the same level of accuracy as GPT-2 Output Detector: 58% (though they got slightly different things wrong, suggesting they’re not using identical technology). Like that tool, it had no false positives and got one of the GPT-3.5 texts wrong.
The information provided is also very similar: just a percentage, without any text highlighting or other information. CrossPlag presents the information in a slightly more attractive interface, but there’s no real difference in terms of content.
Because of this, there’s very little distinguishing these two tools. They’re both middling options for AI detection that are outperformed by other free tools like QuillBot or Scribbr.
10. GPTZero
- Free
- Provides stats that other tools don’t
- Highlights text
- No sign-up required – just paste in text
- Below-average accuracy
- No percentage shown
- Only seems to give binary judgements
GPTZero is unusual in the way it presents its results. Instead of a percentage, it gives a sentence stating what it detected in your text (e.g., “Your text is likely to be written entirely by AI”). In our testing, it only ever said that a text was entirely AI or entirely human, suggesting it’s unable to detect mixed AI-and-human texts.
Because of these binary judgements, it got the relatively low accuracy score of 52%. The tool does also highlight text to label it AI, but again, we found that it only ever highlighted either the whole text or none of it.
Further stats – perplexity and burstiness – are shown, but these are not likely to be helpful to the average user, and it’s unclear how exactly they relate to the judgement. While GPTZero is straightforward to use, we found the information it provided to be inadequate and not very accurate.
11. Writer
- Free
- Low false positive rate
- No sign-up required – just paste in text
- Often fails to load results
- Very low accuracy
- Struggles to detect GPT-4 texts
- Low character limit
- No text highlighting
The AI detector on Writer’s website didn’t work very well for us. On our first attempt, the results consistently failed to load, making the tool useless. When we tried again a few days later, results did usually load correctly, although they still failed every few checks, requiring a lot of retries.
When the tool was working, its results were still some of the least accurate we saw, at 38%. While it had no false positives, it detected none of the GPT-4 texts and only 70% of the GPT-3.5 texts. Its ability to detect paraphrased or mixed AI texts was the worst of all the tools we tested.
In terms of the information shown, Writer provides a percentage of “human-generated content” but no highlighting to indicate what content has been labelled AI. It also has a character limit of 1,500, the lowest of the tools we tested. We don’t recommend this tool.
12. AI Text Classifier (OpenAI)
- Free
- Quick and straightforward to use
- Very low accuracy
- Very vague results, with no percentage or highlighting
- Sign-up required (same account as ChatGPT)
Although it was developed by OpenAI, the company behind ChatGPT itself, we found that the AI Text Classifier did not provide enough information to be useful. It doesn’t give a percentage or any kind of highlighting, just a statement that the text is very unlikely/unlikely/unclear if it is/possibly/likely AI-generated.
The overall accuracy of the tool was 38%, the same as that of Writer. However, unlike Writer, it did unfortunately have one false positive – a significant problem if you want to use the tool to assess student submissions, for example.
According to Open AI, as of July 20, 2023, the AI text classifier is no longer available for use.
Research methodology
To carry out this research, we first selected 12 AI detectors that currently show up prominently in search results. We looked mostly at free tools but also included two premium tools with reputations for high accuracy.
We tested all 12 tools with the same texts and the same scoring system for accuracy. The usability and pricing of the tools are discussed in the individual reviews but were not included in the scoring system, which is based purely on accuracy and the number of false positives.
Testing texts
In our testing, we used six categories of texts, with five texts in each category and therefore a total of 30 texts. Each text was between 1,000 and 1,500 characters long (AI detectors are usually inaccurate with texts any shorter than this). The categories were:
- Completely human-written texts
- Texts generated by GPT-3.5 (from ChatGPT)
- Texts generated by GPT-4 (from ChatGPT)
- Parts of the human-written texts, combined with GPT-3.5 text (from ChatGPT)
- The GPT-3.5 texts, but paraphrased by QuillBot
- The human-written texts, but paraphrased by QuillBot
The human-written texts were all on different topics – two quite technical specialist topics and three more general topics – and from different kinds of publication:
- A thesis introduction about chronic obstructive pulmonary disease
- An academic report about artificial intelligence
- A Wikipedia article about the French Revolution
- An online article about Romanticism
- An analysis article about gun control in the US
To make the comparison as fair as possible, all other texts were on the same five topics (e.g., we prompted ChatGPT to “Write a college essay about the French Revolution”). We used the same prompts for the GPT-3.5 and GPT-4 texts, and we used the same settings for all QuillBot paraphrasing (“Standard” mode, maximum number of synonyms).
You can see all the testing texts in the document below, including links to the sources of the human-written texts:
Accuracy scoring
For each scan, we gave one of the following scores:
- 1: Accurately labelled the text as AI or human (within 15% of the right answer)
- 0.5: Not entirely wrong, but not fully accurate (within 40% of the right answer)
- 0: Completely wrong (not within 40% of the right answer)
For example, if a text is 50% AI-generated, then a tool gets 1 for labelling it 55% AI, 0.5 for labelling it 27%, and 0 for labelling it 2% (or 98%). When a tool didn’t show a percentage, we converted the information it did give into a percentage (e.g., OpenAI’s “likely” label = 81–100%).
These scores were added up and turned into accuracy percentages. However, we excluded the paraphrased human-written texts from this score. So a tool that scored 1 for every text (excluding paraphrased human texts) would have a 100% accuracy score.
We excluded these texts because AI detectors are not really designed to detect paraphrasing tools, only purely AI-generated text. It’s interesting to investigate whether they can sometimes detect these tools anyway, but it’s not fair to include this in the score.
The scores indicated in the table at the start are:
- Accuracy (as defined above)
- False positives: How many of the five purely human-written texts are wrongly flagged as AI
- Star rating: The accuracy percentage, turned into a score out of 5, with 0.1 subtracted for each false positive
Frequently asked questions about AI detectors
- How accurate are AI detectors?
-
AI detectors aim to identify the presence of AI-generated text (e.g., from ChatGPT) in a piece of writing, but they can’t do so with complete accuracy. In our comparison of the best AI detectors, we found that the 10 tools we tested had an average accuracy of 60%. The best free tool had 68% accuracy, the best premium tool 84%.
Because of how AI detectors work, they can never guarantee 100% accuracy, and there is always at least a small risk of false positives (human text being marked as AI-generated). Therefore, these tools should not be relied upon to provide absolute proof that a text is or isn’t AI-generated. Rather, they can provide a good indication in combination with other evidence.
- How can I detect AI writing?
-
Tools called AI detectors are designed to label text as AI-generated or human. AI detectors work by looking for specific characteristics in the text, such as a low level of randomness in word choice and sentence length. These characteristics are typical of AI writing, allowing the detector to make a good guess at when text is AI-generated.
But these tools can’t guarantee 100% accuracy. Check out our comparison of the best AI detectors to learn more.
You can also manually watch for clues that a text is AI-generated – for example, a very different style from the writer’s usual voice or a generic, overly polite tone.
- Can I use AI tools to write my essay?
-
Using AI writing tools (like ChatGPT) to write your essay is usually considered plagiarism and may result in penalisation, unless it is allowed by your university. Text generated by AI tools is based on existing texts and therefore cannot provide unique insights. Furthermore, these outputs sometimes contain factual inaccuracies or grammar mistakes.
However, AI writing tools can be used effectively as a source of feedback and inspiration for your writing (e.g., to generate research questions). Other AI tools, like grammar checkers, can help identify and eliminate grammar and punctuation mistakes to enhance your writing.
Cite this Scribbr article
If you want to cite this source, you can copy and paste the citation or click the ‘Cite this Scribbr article’ button to automatically add the citation to our free Reference Generator.