

Many of the assumptions that make generic AI effective in other domains do not align with the practice of law. This blog will address those assumptions and explain why they break down when applied to legal research.
The first of these assumptions is that users value plausibility over accuracy. Generic AI, operating under that assumption, is built to generate language that reads well and sounds correct. This is why it often hallucinates, producing outputs that sound confident whilst being entirely incorrect – see more on that phenomenon here. Outputs under this model are persuasive, but not authoritative.
This assumption is problematic when applied to legal research, because the effectiveness of legal research is judged not by how persuasive a proposition sounds, but by whether the proposition is supported by an authoritative source. Yet a generic AI model can confidently produce wrong answers that sound plausible.
The frequency with which generic AI models are confidently wrong poses a fatal threat to the edifice of accuracy upon which successful legal research rests. The assumption guiding these models – that users want plausibility more than accuracy – is therefore concerning when applied to legal research.
Generic AI is, in its purest form, a token predictor. These systems take a prompt, break it into units of text known as ‘tokens’, and then generate the tokens statistically most likely to follow, based on patterns learned from vast datasets scraped from the Internet. That is not a criticism of these models in itself, but an explanation of their design.
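To make that concrete, the sketch below is a deliberately simplified, purely illustrative next-token predictor: it counts which word follows which in a tiny invented corpus and then strings together whatever is statistically most likely. The corpus, the bigram counts and the greedy selection are assumptions made for illustration only; nothing here reflects how any production model is actually built.

```python
# Toy illustration of next-token prediction: a bigram frequency model built
# from a tiny, invented corpus. Real systems use neural networks trained on
# vast datasets, but the principle is the same: output whatever is
# statistically likely to follow, with no notion of an authoritative source.
from collections import Counter, defaultdict

corpus = (
    "the court held that the contract was void "
    "the court held that the claim was statute barred "
    "the contract was valid and enforceable"
).split()

# Count which token follows which token in the corpus.
follows = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current][nxt] += 1

def predict_next(token: str) -> str:
    """Return the statistically most likely next token, or a stop marker."""
    candidates = follows.get(token)
    return candidates.most_common(1)[0][0] if candidates else "<end>"

# Generate a fluent-sounding continuation from a one-token prompt.
token, output = "the", ["the"]
for _ in range(8):
    token = predict_next(token)
    if token == "<end>":
        break
    output.append(token)

print(" ".join(output))
# Prints plausible legal-sounding text, but nothing has checked whether the
# proposition is supported by any authority at all.
```

Even this toy version produces text that reads fluently, yet at no point does it consult, cite or represent an authoritative source – which is precisely the design gap described above.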
The problem with treating questions as token inputs is that legal research is not a token-completion task. A legal question is not an arbitrary arrangement of words; it is a request for a defensible answer within a jurisdiction, anchored to relevant authority and sensitive to context, including doctrinal development.
Treating legal questions as ‘token inputs’ gives rise to a second assumption, tolerable in daily life but unsafe in the practice of law: that ‘close enough’ is ‘good enough’.
This assumption is generally fine in everyday life – applied to the earlier examples, if a recipe is slightly off, you can improvise, and if a travel itinerary is mostly correct, you can fill in the blanks yourself. In legal research, however, ‘mostly right’ can have serious consequences, as seen earlier this month when lawyers faced regulators after using AI to prepare their legal documents.
In many domains, users do not need to know the source of an answer to benefit from it. You don’t need to know where the recipe or the travel itinerary came from – if it looks right, you move on.
In law, however, the pathway matters – a lawyer must be able to show their work, in the form of the sources relied upon to reach an answer. Legal work is constantly scrutinised, and a conclusion without an auditable trail simply creates further work for those tasked with scrutinising it. When a generic AI model produces an answer without clear underlying authorities or traceable citations, the supervising lawyer is forced into one of two positions:
1. Reconstruct the research manually, locating and validating the authorities relied upon from scratch, which undermines the time-saving promise of the AI model; or
2. Rely on an output that cannot be traced, which creates professional risk without providing the tools to manage it.
Neither option is acceptable for a practice that values accuracy. The opacity of generic AI is thus a structural flaw when applied to legal research.
Habeas operates under none of these assumptions. We are optimised not for sounding correct but for being correct, and for providing a transparent pathway from question to output. If you want to see what that looks like and how it could benefit your practice, book a demo here.
Generic AI fails in legal research not because it is poorly designed, but because it is designed for the wrong problem. Its underlying assumptions – prioritising fluency over accuracy, treating questions as token sequences, and tolerating opaque reasoning – are reasonable in many domains. These assumptions are, however, incompatible with the demands of legal practice.
That does not mean AI has no place in legal work. It means legal AI must be built differently: grounded in jurisdictional authority, designed around legal reasoning, and structured with traceable pathways to conclusions that support scrutiny rather than obscure it. Unless those structural requirements are met, as they are with Habeas, AI tools remain useful in daily life but unsuitable as authoritative legal research tools.
