Paul L. Caron
Dean





Tuesday, March 25, 2025

Can AI Hold Office Hours?

Lisa Larrimore Ouellette (Stanford), Amy Motomura (Loyola-L.A.), Jason Reinecke (Marquette) & Jonathan S. Masur (Chicago), Can AI Hold Office Hours?

Rapid improvements in AI tools offer transformative opportunities in legal education, including the possibility of students using AI tools to answer questions that students might otherwise ask during office hours. But a critical challenge is the accuracy of these tools’ responses. Both general-purpose and law-specific AI models have been shown to “hallucinate” incorrect responses to a range of legal questions. Here, we evaluate the current capabilities of AI models when given the more constrained task of answering questions about a specific legal text. We provided three AI tools—OpenAI’s GPT-4o, Anthropic’s Claude 3.5 Sonnet, and Google’s NotebookLM—with the text of Masur & Ouellette’s Patent Law: Cases, Problems, and Materials, a free patent casebook that has been adopted at over seventy law schools. We then asked each tool to answer 185 questions based solely on the casebook, including questions asked by students in our own patent law classes, and we graded the responses. 

We found that a substantial number of responses were unacceptable in the sense of being harmful for learning (26% for GPT, 14% for Claude, and 31% for NotebookLM), and many more responses failed to fully answer the question or had minor errors (25% for GPT, 31% for Claude, and 32% for NotebookLM). Based on our review, none of us would recommend these tools to our students at this time. We also do not think they would increase our own efficiency at answering student questions due to the time involved in reviewing AI responses for extraneous, incorrect, or misleading information. Our results contribute both to the literature assessing AI capabilities across different tasks and to the literature on how AI tools should be incorporated into legal education. In particular, they highlight that restricting AI to providing answers based on a designated body of information does not solve the problem of wrong answers and hallucinations. They also point fairly conclusively in one direction: AI is not ready to start holding office hours.


https://taxprof.typepad.com/taxprof_blog/2025/03/can-ai-hold-office-hours.html
