When You Build What You Cannot Afford
Someone posts a "Show HN" — an open-source legal AI. The project is eight hours old, two commits deep. The lawyers arrive in the comments within the hour.
The verdict comes fast: "this is just a wrapper around regular LLMs, nothing you couldn't achieve yourself with the right prompting."
They're right. Mike — the tool in question — is an interface. A clean one: document reading, verbatim citations, contract drafting, support for Claude and Gemini. Self-hosted. AGPL. Zero marginal cost if you already pay for API access. But underneath it, you're still querying a general-purpose model. And underneath that, the thing legal professionals actually rely on — comprehensive case law databases, jurisdiction-specific repositories, the ability to verify whether a precedent still holds — isn't there. You can't query Westlaw from a two-commit open-source project.
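One of those features, verbatim citations, is worth pausing on, because it's enforceable without any legal database at all: reject any quote the model emits that doesn't appear, character for character (modulo whitespace), in the source document. A minimal sketch of how such a client-side check could work; the function names are illustrative, not taken from Mike's codebase:

```python
import re

def normalize(text: str) -> str:
    """Collapse runs of whitespace so line wrapping doesn't break matching."""
    return re.sub(r"\s+", " ", text).strip()

def is_verbatim(quote: str, document: str) -> bool:
    """True only if the quote appears verbatim in the source document."""
    return normalize(quote) in normalize(document)

contract = "Customer shall indemnify and\nhold harmless Vendor against all claims."
print(is_verbatim("indemnify and hold harmless Vendor", contract))  # True
```

A check like this can't tell you whether a cited precedent still holds; it can only guarantee the model isn't inventing quotes. That distinction is exactly the gap the lawyers are pointing at.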
The lawyers aren't wrong. The gap is real.
Here's what they're missing.
Legal work has a dependency chain. At the top: the interface, the chat window, the document reader — what Mike provides. Below that: document parsing and citation tracking. Below that: institutional databases, case law repositories, and the exclusive contracts Thomson Reuters has spent decades assembling. Below that: the accumulated judgment of practitioners who've seen what happens when a clause bites you three years after signing.
Commercial legal AI is priced at the bottom of that stack. Most users only need the top.
The indie founder reviewing a vendor agreement doesn't need Westlaw. The small nonprofit trying to understand a lease doesn't need case law verification. They need: read this document, tell me what it says, flag where the risk is. A capable LLM with a well-built interface can do that — and has been able to for a while. What's been missing is a self-hosted, open-source wrapper that doesn't come at enterprise rates.
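That top-layer workflow can be sketched in a few lines. This is a hypothetical illustration, not Mike's implementation: a keyword pre-filter picks out clauses worth a closer look, and a prompt builder hands each one to whatever general-purpose LLM you already pay for. The risk-term list and function names are mine, for illustration only:

```python
# Terms that commonly mark clauses worth a closer look (illustrative list).
RISK_TERMS = [
    "indemnif",                  # indemnification / indemnify
    "limitation of liability",
    "auto-renew",
    "liquidated damages",
    "non-compete",
]

def split_clauses(text: str) -> list[str]:
    """Naively split a contract into clauses on blank lines."""
    return [c.strip() for c in text.split("\n\n") if c.strip()]

def flag_risky_clauses(text: str) -> list[str]:
    """Return clauses containing any known risk term (case-insensitive)."""
    return [
        clause for clause in split_clauses(text)
        if any(term in clause.lower() for term in RISK_TERMS)
    ]

def build_review_prompt(clause: str) -> str:
    """Build a prompt asking a general-purpose LLM to explain one clause."""
    return (
        "Explain this contract clause in plain language, quote it verbatim, "
        "and say whether it departs from standard terms:\n\n" + clause
    )

flagged = flag_risky_clauses(
    "Delivery within 30 days.\n\n"
    "Customer shall indemnify and hold harmless Vendor against all claims."
)
# flagged now holds only the indemnification clause, ready for build_review_prompt
```

The pre-filter is crude on purpose. The point is that the expensive part of this workflow, the judgment, lives in the model you already have API access to, not in the wrapper.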
Mike doesn't solve all of legal AI. But it makes the useful tier accessible to the people the big players can't serve profitably.
This is what alignment over force looks like in practice.
You don't fight a $500/hour rate by arguing it should be lower. You don't fight enterprise legal AI pricing by competing on their terms. You identify the segment they're ignoring — the people who need the top layer of the capability chain, not the whole thing — and you build exactly that. Self-hosted. No data leaving your infrastructure. API keys you already have.
The leverage shifts not because the law changed. Not because the pricing changed. Because the tooling did.
This is the surfer's wave — place yourself where reality's power carries the work forward. The wave in legal AI isn't Harvey or Legora. The wave is that general-purpose LLMs are now capable enough for a real slice of legal document work. Most of the market hasn't made that accessible at prices that reach the people who need it. Mike is a surfboard. Two commits old, but a surfboard.
The honest accounting:
The HN lawyers are right: what Mike can't do is exactly what the highest-stakes work requires. If you're going to court, you need a lawyer. If you're in a complex commercial transaction, you need someone who can verify case law currency and catch the clause that looks standard but isn't. A two-commit open-source wrapper won't save you there — and pretending otherwise would be worse than not having the tool at all.
So the question isn't "does this replace a lawyer?" It's "does this change when you need one?"
The person who uses Mike to understand what they're signing — who catches the non-standard indemnification clause before they've already agreed to it — that person makes a better client. They know when the stakes require a professional. They come to the meeting having done the first layer of work themselves, not paying paralegal rates to have someone explain what a warranty provision means. That's not disruption. It's a calibration.
The most durable thing about building Mike isn't the tool itself. It's what building it teaches you.
To build a legal AI that works, you have to understand legal AI's architecture well enough to see where your implementation hits a wall. You learn — from the ground up, not from reading about it — that the interface is the easy part. The hard dependencies are in the data: the case law repositories, the jurisdiction-specific coverage, the verification layer. You discover this by building without it and watching where things break.
That comprehension is more valuable than the tool. Someone who builds Mike and runs into the case law limit doesn't just know the tool is limited — they know where it's limited and why. They can make a calibrated decision about when to use it and when to call someone who has access to Westlaw. That's a better position than using commercial legal AI at full price and treating it as a black box you don't fully understand.
"Build what you cannot afford" isn't just a cost optimization. It's a design philosophy. The act of building forces you to understand the architecture of what you're replacing — layer by layer, dependency by dependency. And a map you drew yourself, especially in the places where you had to figure out why the path ended, is worth more than a finished product you never had to understand from the inside.
The tool is two commits old. The map it's already drawing is the interesting part.
source · Hacker News — Mike: open-source legal AI