I’ve heard a few people say that, since generative AI systems can hallucinate, we’re always going to have to double-check their work in great detail if we want to be confident it’s correct.
I disagree.
Why?
Because AI systems can do much of this checking themselves.
And, when the cost/benefit tradeoff is right, I believe that’s exactly what we’ll get them to do.
How Do Humans Check Documents?
Imagine the kind of checking you might do as a human reviewer if you wanted to check that an article was accurate:
- Where the article cites online sources, you might go and check those sources to make sure any factual claims in the article are backed up by those sources.
- Where the article doesn’t cite sources, you might try to find appropriate sources to corroborate or refute specific claims.
- Where the article makes inferences, you might check that those inferences are reasonable.
AI Can Check Documents, Too!
It turns out that all these things can be done pretty well with the help of current LLMs and the kind of online content retrieval that tools such as Deep Research are built around. (I recently prototyped something along these lines myself, using some web search APIs.)
If an LLM hallucinates a claim in the middle of an article it is generating, it’s entirely possible that the same LLM, when asked to focus specifically on that claim (and given the same source materials), can identify the claim as problematic.
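To make that second pass concrete, here is a minimal Python sketch of the idea, not a description of how any existing tool works. It assumes the claims have already been extracted from the article and the source texts already retrieved. The names `ask_model`, `ClaimCheck`, `build_prompt`, `check_claims` and `flagged` are all hypothetical, and `ask_model` stands in for whatever LLM call you have available (prompt in, text reply out).

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-in for an LLM call: takes a prompt, returns the reply text.
AskModel = Callable[[str], str]


@dataclass
class ClaimCheck:
    claim: str
    supported: bool
    reply: str  # the model's full reply, kept so a human can read the reasoning


def build_prompt(claim: str, sources: List[str]) -> str:
    """Ask the model to judge ONE claim against the given source texts."""
    numbered = "\n".join(f"[{i}] {text}" for i, text in enumerate(sources, start=1))
    return (
        "Using only the sources below, decide whether the claim is supported.\n"
        "Answer SUPPORTED or UNSUPPORTED on the first line, then a brief reason.\n\n"
        f"Sources:\n{numbered}\n\nClaim: {claim}"
    )


def check_claims(
    claims: List[str], sources: List[str], ask_model: AskModel
) -> List[ClaimCheck]:
    """Run a separate, focused check for each claim."""
    results = []
    for claim in claims:
        reply = ask_model(build_prompt(claim, sources))
        lines = reply.strip().splitlines()
        first_line = lines[0].strip().upper() if lines else ""
        # Anything other than an explicit SUPPORTED verdict is flagged for review.
        supported = first_line.startswith("SUPPORTED")
        results.append(ClaimCheck(claim, supported, reply))
    return results


def flagged(results: List[ClaimCheck]) -> List[str]:
    """Return the claims that did not get an explicit SUPPORTED verdict."""
    return [r.claim for r in results if not r.supported]
```

The design choice worth noting is that each claim gets its own prompt, so the model is judging one statement against the sources rather than rereading a whole article. Unclear or empty replies are treated as unsupported, which errs on the side of sending a claim to a human.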
We’ll Get AI To Check Documents (When It’s Worth It)
I don’t think this kind of mechanism has been built into many tools so far, but I suspect it’s coming.
LLM costs are declining steeply and there are, surely, cases where documents are important enough to pay extra for increased accuracy.
Conclusion
I suspect that, very soon, we’ll start seeing tools along the lines of Deep Research that (perhaps for a premium price) incorporate AI-powered fact checking in order to provide an extra level of trustworthiness.
So the good news is that, despite what you may have heard, you won’t need to carefully check everything your AI tools write for you.