AI Models on Law School Exams

The question of how well AI can do on law school exams interests me, since I give exams and want them to measure how much my students have learned (as opposed to their skill at using AI, although I want them to learn that too). Others appear to be interested too; just look at the SSRN download counts for papers on this topic. Caveat: I can't pretend to have more than the shallowest understanding of AI models. But this cool new paper I came across might be of interest to folks.

The paper is from a set of scholars at ETH in Zurich (a place long known for its excellent research). As I understand the draft (and, to repeat, I don't understand a lot of this stuff), the paper finds that large language models (LLMs) don't do all that well once you increase the level of reasoning required on the exam. I was also intrigued to read (I think) that LLMs are not necessarily better on multiple-choice exams than on essay-type ones. From the abstract, here is a sentence that stood out: "Our evaluation on both open-ended and multiple-choice questions present significant challenges for current LLMs; in particular, they notably struggle with open questions that require structured, multi-step legal reasoning."

The paper is "LEXam: Benchmarking Legal Reasoning on 340 Legal Exams."

Another cool thing about this paper, to me, is how collaborative it is: students, professors, and even judges are involved. That degree of collaboration reflects well on the culture of the institution.