Every year brings a new legal-technology miracle. In 2026, the most aggressively promoted one may be “AI for discovery.” If you have attended even a single conference lately, you have heard the pitch. AI will slash review costs. AI will eliminate drudgery. AI will—apparently any day now—fetch your coffee. That last claim remains unproven.
What tends to get lost in the enthusiasm surrounding AI for discovery is a basic but critical distinction: not all AI is the same. The market often groups two very different technologies under a single oversized umbrella labeled AI, and the difference between them matters enormously in discovery. Definitions are in order.
Technology-assisted review (TAR) is the old, reliable workhorse. It is extractive. It finds what is already there based on mathematical patterns. As an article in the Richmond Journal of Law and Technology demonstrates, it has been in use for more than a decade, is well understood, and has enjoyed broad judicial acceptance.
TAR has earned respect from courts and practitioners who value measurable performance metrics, transparent workflows, and repeatable validation. The Sedona Conference TAR Primer remains the foundational explanation of why TAR works, how it can be audited, and how precision and recall can be evaluated.
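For readers who want to see what "measurable" means here, the two metrics can be sketched in a few lines of Python. This is a minimal illustration, not a validation protocol from the Sedona Primer: the function name and the document IDs are hypothetical, and a real TAR validation would work from a statistically drawn sample rather than a handful of documents.

```python
def precision_recall(predicted: set[str], relevant: set[str]) -> tuple[float, float]:
    """Return (precision, recall) for a set of documents flagged as responsive.

    precision = share of flagged documents that are truly responsive
    recall    = share of truly responsive documents that were flagged
    """
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical validation sample; the document IDs are made up for illustration.
flagged = {"DOC-001", "DOC-002", "DOC-003", "DOC-004"}
responsive = {"DOC-002", "DOC-003", "DOC-004", "DOC-005", "DOC-006"}

precision, recall = precision_recall(flagged, responsive)
print(f"precision={precision:.2f}, recall={recall:.2f}")  # precision=0.75, recall=0.60
```

In this toy sample, three of the four flagged documents are responsive (precision 0.75), and three of the five responsive documents were found (recall 0.60). Numbers like these are what make a TAR workflow auditable.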
Generative AI—large language models such as ChatGPT, Claude, and Gemini—is the new, charismatic intern. It is creative. It quickly generates new text based on probability. It is dazzling at first encounter, articulate, fast, and often helpful. It is also prone to making things up when under pressure.
Generative AI lacks TAR’s long judicial track record in discovery workflows. Chatbots are trained to produce plausible text, not to classify documents according to legal standards. They do not inherently understand responsiveness, confidentiality, privilege, or legal intent. Independent evaluations, including the Stanford HAI Index, consistently warn that while generative models are powerful, they remain unpredictable in risk-sensitive contexts.
Hallucinations Are Unavoidable: That’s the Nature of the Beast
Anyone who has experimented with generative AI has encountered its most famous quirk: hallucinations. That term is a polite way of saying the system will sometimes confidently assert things that never happened. Humans fabricate too, of course, but AI does it at scale, with greater speed and unearned confidence.
Research shows that hallucinations are not simply bugs waiting to be fixed. They are an inherent consequence of how large language models generate text. As a white paper from a leading AI app vendor explained, “Language models are shown to produce overconfident, plausible falsehoods, which diminish their utility.” Amen to that one.
In discovery, even a low hallucination rate can be devastating. A fabricated summary, an invented timeline, or a misclassified privileged document can derail a case. The sanctions decision in Mata v. Avianca (analyzed at J.D. Supra) demonstrated just how convincing—and just how wrong—AI-generated legal content can be.
The math is not comforting: if an AI model is wrong 1% of the time and you are reviewing 1 million documents, that translates into 10,000 errors. That is 10,000 opportunities to explain yourself to a judge you were hoping never to meet.
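The arithmetic is simple enough to check yourself. The sketch below assumes a fixed, independent per-document error rate, which is a simplification (real error rates vary by document type); the function name and the lower rates are illustrative only.

```python
def expected_errors(num_documents: int, error_rate: float) -> int:
    """Expected number of wrong calls at a fixed per-document error rate."""
    return round(num_documents * error_rate)


# The scenario from the text: 1 million documents at a 1% error rate.
print(f"{expected_errors(1_000_000, 0.01):,}")  # 10,000

# Illustrative lower rates, to show how slowly the problem shrinks at scale.
for rate in (0.001, 0.0001):
    print(f"{rate:.2%} -> {expected_errors(1_000_000, rate):,} expected errors")
```

Even at one error in ten thousand, a million-document review still leaves roughly a hundred wrong calls to find and explain.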
Judicial Reactions
Courts are not banning AI, but they are making it clear that responsibility for accuracy remains firmly human. Judge Brantley Starr’s Northern District of Texas AI Certification Order is one of the first (and clearest) judicial responses to these issues. It requires attorneys to certify that they have personally verified any AI-generated content.
An EDRM Memo analyzes a growing body of decisions emphasizing human verification, disclosure of AI use, and sanctions for lawyers who rely uncritically on machine output. This illustrates that courts are not hostile to technology. They are simply declining to babysit it.
Security Issues Are the Mother of All Red Lights
Discovery routinely involves an organization’s most sensitive information, and generative AI introduces vulnerabilities distinct from those of traditional e-discovery systems. The authoritative NIST AI Risk Management Framework and a related AI-specific memo explain risks that map uncomfortably well onto litigation practice: data leakage and prompt-injection attacks.
Even when assurances are offered that models will not be trained on client data, shared-model architectures can still permit unintended exposure. There’s a useful rule of thumb: if you would not email a document to a stranger in an airport lounge, you probably should not upload it to a generative AI system that has not been carefully vetted and isolated.
The cost-benefit ratio militates against the use of this risky new technology. Even when generative AI appears to save time, it rarely reduces the amount of work. Lawyers must still validate outputs, double-check privilege, and document workflows.
Nuances Lost Can Spell Disaster
Discovery review requires nuanced reasoning regarding facts and relationships. This is distinct from legal analysis; it is about understanding what people actually mean when they communicate.
Research presented at the AAAI Conference on Artificial Intelligence shows that large language models sometimes struggle with sophisticated forms of language, such as sarcasm. A model might read an email saying “Great job on the account” as praise, missing the sarcasm that implies the account was lost.
Discovery is where nuance goes to be weaponized. Generative AI is often blissfully unaware of nuance—and far too cheerful about it.
Privilege Determination Challenges
While AI struggles with factual nuance, it fails dangerously with legal intent. Privilege turns on purpose and role—precisely the areas where generative AI performs least reliably. The AI sees words. It does not see the attorney-client relationship behind them.
Mislabeling a responsive document is embarrassing. Mislabeling a privileged document is a malpractice seminar waiting to happen. It’s a potential disaster because the ABA ethics rules emphasize competence, confidentiality, and supervision. Nothing in this guidance is consistent with delegating privilege determinations to a generative model.
New Lines of Attack in Discovery Litigation
Using generative AI in discovery can introduce a litigation vulnerability that lawyers may not initially anticipate. Opposing counsel may ask how the model works, how hallucinations were controlled, and whether outputs can be reproduced. For many generative AI systems, these questions do not have satisfying answers.
TAR workflows, by contrast, are explainable and defensible—qualities courts like almost as much as punctual filings.
When Limited Use of Generative AI Might Make Sense
Many practitioners are approaching these questions in good faith. Used carefully, generative AI can play a useful supporting role in discovery, provided a clear distinction is maintained: AI can inform judgment, but it cannot exercise judgment.
Its strengths are best applied in exploratory and low-risk contexts. Generative AI can help lawyers orient themselves in unfamiliar subject matter, summarize background materials not destined for production, or assist in drafting internal issue lists. In these roles, AI functions like a fast research aide—useful for gaining perspective, but never authoritative.
So What’s the Bottom Line?
Generative AI is coming for legal work, whether lawyers like it or not, and much of what it brings will be genuinely useful. Discovery, though, is a different conversation. The consequences of hallucination are too expensive, and the existing competition is too good to ignore. TAR has spent a decade earning judicial trust through transparency and measurable accuracy.
Adventurous litigators may be tempted to experiment with supplementing TAR with generative AI, and there’s nothing wrong with that. Just make sure it never touches your privilege log without a human chaperone.
