Generative AI programs have a well-known tendency to “hallucinate” – that is, to simply fabricate information. Three weeks ago, federal district court judge Kelly Rankin sanctioned three Morgan & Morgan attorneys for filing motions in limine with citations to eight cases that did not exist. As this report explains:
[M&M Attorney Rudwin Ayala] used his firm’s in-house AI platform, MX2.law, to generate case law when drafting the motions. He uploaded his draft and gave the AI system prompts that included “add to this Motion in Limine Federal Case law from Wyoming setting forth requirements for motions in limine” and “add more case law regarding motions in limine.” Without verifying the accuracy of the AI-generated citations, Ayala included them in the filings, which were then signed by all three attorneys. . . .
“Every attorney learned in their first-year contracts class that the failure to read a contract does not escape a signor of their contractual obligations, [Judge Rankin wrote]. Similarly, one who signs a motion or filing and fails to reasonably inspect the law cited therein violates Rule 11 by its express terms. . . . As attorneys transition to the world of AI, the duty to check their sources and make a reasonable inquiry into existing law remains unchanged.”
It’s noteworthy that the fake citations were generated by the firm’s own purpose-built AI program, which one would expect to be less prone to hallucination than the widely available ChatGPT, which has a well-documented history of fabricating cases. From a June 2023 article in the ABA Journal: “A federal judge in New York City has ordered two lawyers and their law firm to pay $5,000 for submitting a brief with fake cases made up by ChatGPT and then standing by the research.” This has happened in appellate briefs as well, as Law360 reported in January 2024:
The Second Circuit referred a New York attorney for punishment Tuesday for submitting a brief citing a fake case generated by ChatGPT and not checking over the brief to catch the mistake.
Jae S. Lee was referred to the Second Circuit's grievance panel . . . in a per curiam decision. . . .
“The brief presents a false statement of law to this court, and it appears that attorney Lee made no inquiry, much less the reasonable inquiry required by Rule 11 and long-standing precedent, into the validity of the arguments she presented,” the panel said.
Lee admitted that she submitted a reply brief in the appellate case in September citing “Matter of Bourguignon v. Coordinated Behavioral Health Servs. Inc., 114 A.D.3d 947 (3d Dep't 2014),” according to Tuesday's decision – a case the court was unable to find and which proved not to exist. [Lee also admitted that she had used ChatGPT to find the citation.] . . .
The panel noted that several courts have proposed or enacted local rules or orders to address the use of artificial intelligence tools. “But such a rule is not necessary to inform a licensed attorney, who is a member of the bar of this court, that she must ensure that her submissions to the court are accurate,” the panel said.
At least one expert witness has been caught in the same trap, and there are likely others who didn’t make the news. This one did because, ironically, his field of expertise is misinformation. As reported by Forbes in December 2024:
Recently, a controversy surfaced involving Dr. Jeff Hancock, a Stanford University professor and renowned expert on misinformation and social media, when his expert testimony in a high-profile case revealed fabricated citations. A case involving Minnesota's “Use of Deep Fake Technology to Influence an Election” law requested Hancock's expert opinions. The legal document he provided had several citations to studies and research that, upon closer investigation, did not exist.
Ironically, Professor Hancock is a prominent figure in the study of misinformation. His TED talk, “The Future of Lying,” has over 1.5 million views on YouTube, and he also appears in a Netflix documentary exploring the topic of misinformation.
The professor acknowledged utilizing ChatGPT, which mistakenly resulted in the creation of fictitious references. The incident, another high-profile example of “AI hallucination,” has raised serious concerns about the risks of relying on generative AI for activities that require accuracy and trustworthiness. For businesses too, this issue serves as a timely warning of the risks that AI can pose when not properly controlled, particularly in high-stakes contexts where reputation, legal compliance, and operational efficiency are crucial.
Newer AI tools have reduced hallucination rates, but not by enough, as a May 2024 article from Stanford reports:
We put the claims of two providers, LexisNexis (creator of Lexis+ AI) and Thomson Reuters (creator of Westlaw AI-Assisted Research and Ask Practical Law AI), to the test. We show that their tools do reduce errors compared to general-purpose AI models like GPT-4. That is a substantial improvement, and we document instances where these tools provide sound and detailed legal research. But even these bespoke legal AI tools still hallucinate an alarming amount of the time: the Lexis+ AI and Ask Practical Law AI systems produced incorrect information more than 17% of the time, while Westlaw’s AI-Assisted Research hallucinated more than 34% of the time.
There’s nothing wrong with using AI for legal research – indeed it’s a valuable tool. But attorneys absolutely, positively must check citations generated by AI to make sure they’re not hallucinations.