Latest GPT 'Fake Case' Kerfuffle Brings New, More Exciting Ways For AI To Mess Up
It's a mistake... but not the worst.
Remember when those lawyers filed a brief riddled with fake cases generated by ChatGPT? The mainstream media had an old-fashioned freakout about the dangers of artificial intelligence without ever once considering that not reading the cases you’re citing in a brief is a very, very human lawyer problem more than a technological one. But “AI is ruining the law” is a better headline and so here we are.
Fast forward to today: lawyers are now appropriately forewarned against blindly trusting generative AI legal research, and trusted legal research providers have developed AI tools tailored — hopefully — to protect lawyers from the fever dream hallucinations of the algorithm.
Unfortunately, hallucinations aren’t the only flies AI can introduce to the ointment:
A lawyer has asked a Virginia federal judge not to impose sanctions after he used incorrect case citations and quotes in a court filing, arguing the errors were unintentional and stemmed from “good-faith reliance” on artificial intelligence tools.
Note that the problem — despite the judge’s initial concerns in the case — is not “fake” cases but incorrectly cited and quoted ones. It’s a step removed from the wild days of 2023, when algorithms might just magic up a bogus case to give the user what they wanted to hear, but it’s still a problem for lawyers.
In the attorney’s show cause response, Thad Guyer explained that the cases cited were both very real and very substantively relevant, but…
The two cases cited by Defendant and the Court as apparently non-existent, United Therapeutics Corp. v. Watson Laboratories, Inc. and United States v. Mosby, do exist but were miscited (Guyer Decl. ¶ 11). United Therapeutics, though incorrectly cited as “No. 3:17-cv00081, 2017 WL 2483620, at 1 (E.D. Va. June 7, 2017),” is a real case reported at 200 F. Supp. 3d 272 (D. Mass. 2016) (Guyer Decl. ¶ 11(1)). Moreover, the proposition for which this case is cited in Plaintiff’s Objections is accurate and applicable, notwithstanding the misquotations. (Guyer Decl. ¶ 14). Similarly, United States v. Mosby, though miscited as “2021 WL 2827893, at 4 (D. Md. July 7, 2021),” is a real case reported at 2022 WL 1120073 (D. Md. Apr. 14, 2022). (Guyer Decl. ¶ 11(2)). The proposition for which this case was cited finds analogous support in a Fourth Circuit case cited within Mosby. (Guyer Decl. ¶ 15).
We have achieved AI that sucks at Bluebooking! COMPUTERS — THEY’RE JUST LIKE US!
Unlike the earlier hallucination cases, this kind of error threatens to become a stumbling block for a lot of attorneys, because the platform actually gives the user the correct case to read. The lawyer faithfully checks that case to make sure it isn’t misleading the court, but if they aren’t vigilant about the citation and quotes themselves, they can parrot the bad cite and generate a lot of unnecessary confusion.
Alas, the citations weren’t the only problem:
The attributed quotations that do not appear verbatim in the cited cases, Graves v. Lioi and Bostock v. Clayton County, nonetheless accurately reflect principles discussed in those cases (Guyer Decl. ¶ 12). The “decided by necessary implication” language in Graves v. Lioi, while not a direct quote, correctly states the law of the case doctrine as applied by the Fourth Circuit (Guyer Decl. ¶ 17). Similarly, the “make a mockery of the law” phrase in Bostock v. Clayton County, though misquoted, aligns with the Supreme Court’s emphasis on avoiding statutory constructions that would lead to absurd consequences (Guyer Decl. ¶ 18).
Christine Lemmer-Webber is credited with dubbing generative AI “mansplaining as a service.” By design it’s trying to give the user what they want, and it’s willing to loudly and confidently talk out of its digital ass to do it. That’s how hallucinations happened, and now it’s how the technology willingly massages the language it pulls to hand the lawyer fake quotes. As Guyer explains, the quotes aren’t actually wrong about the cases. In fact…
I now understand that when the GPTs “see” a case, they see the extended document, including internal citations to authority. The GPTs saw the Wilson reference to the Touhy statute as incorporated into Mosby. I should have included the citation to United States v. Wilson.
For example, Guyer explains in his declaration that the “make a mockery of the law” quote should have been “absurdity,” because that’s the language in the Bostock opinion; the AI had grabbed “mockery” from quotes in the Bostock string cite without realizing that those were entirely different — if still helpful — cases.
These are all much more nuanced problems with AI research than we saw last year, but they’re nonetheless problems. Indeed, they might be worse problems because, unlike entirely fake cases, errors like these could get overlooked by everyone involved and then end up in an opinion that becomes “garbage in” for future researchers.
But is it sanction-worthy? Meh. He cited real cases with real relevance and quotes that might have been wrong but weren’t misleading about the holdings. Guyer makes a pretty good case that the media spotlight on AI and the slings and arrows of grandstanding judges have fostered unduly itchy sanction trigger fingers.
And until the advent of the GPT’s and a spate of sensationalized 2024 cases, lawyer miscitation and misquotation seldom drew the mention of Rule 11, and almost never resulted in a sanction, and at least once drew “Levity”. But in the dizzying pace of courtroom technology of ChatGPT and the “Zoom cat lawyer”, our profession is sensitive to technological practice error.
A corollary of not rushing to blame technology for human screw-ups is not acting like the sky is falling whenever tech makes a mistake that could’ve just as easily come from a human. Had some lowly associate cut and pasted the wrong WL citation or jumbled up “absurdity” and “mockery,” and had those mistakes made it to the final draft, we wouldn’t be talking about sanctions. The court would just move on with its day and the associate would live with debilitating guilt. It’s a proven model!