Case Study • Legal RAG
AI Legal Research Assistant for a Law Firm
Agentic RAG system for a law firm that answers complex legal questions over a corpus of 10K+ documents — laws, regulations, and court decisions — with grounded, source-cited responses in seconds
96%
Cited Accuracy
8×
Faster Research
10K+
Documents
<2 sec
Time to First Token
Client's Task
- Lawyers spend 30–60 minutes per query manually sifting through hundreds of pages of contracts, regulations, and court decisions
- Cross-document questions require referencing multiple contracts, laws, and rulings at once — nearly impossible to do accurately by hand
- Clients demand precise source references for every legal opinion — manual citation tracking leads to errors and outdated provisions
- Internal memos, brief banks, and closed-matter files are scattered across iManage, DMS, and SharePoint — keyword search only
- Off-the-shelf LLM tools hallucinate case citations — in legal practice, that is a professional liability risk
Project Goals:
- Cut per-query research time from 30–60 minutes down to seconds
- Every answer must come with an exact page-level reference to its source
- Handle cross-document questions across hundreds of PDFs at once, accurately
- Strict refusal on out-of-corpus questions — return null, never fabricate
Solution
Agentic legal RAG with page-level grounding and zero hallucinated citations
A retrieve → rerank → LLM pipeline with agentic quality control. Hybrid search (vector + BM25) finds the right passages, a reranker tightens the result set, and an LLM-as-Judge verifies every answer for correctness and source attribution
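To illustrate the hybrid-search step, here is a minimal sketch of how vector and BM25 result lists can be merged before reranking. This is not the delivered system's code; it uses reciprocal rank fusion (a common fusion method, assumed here for illustration) over two hypothetical ranked lists of document IDs:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked result lists into one hybrid ranking.

    rankings: lists of doc IDs, each ordered best-first — e.g. one
    from vector similarity, one from BM25 full-text search.
    k dampens the influence of any single list's top ranks.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    # Highest fused score first; the reranker then tightens this set.
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["doc_12", "doc_7", "doc_3"]    # ranked by embedding similarity
bm25_hits = ["doc_12", "doc_7", "doc_99"]     # ranked by keyword match
print(reciprocal_rank_fusion([vector_hits, bm25_hits]))
# → ['doc_12', 'doc_7', 'doc_3', 'doc_99']
```

Documents that score well on both signals rise to the top, which is what makes hybrid retrieval robust to both paraphrased queries and exact statutory terms.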
Multi-Type Answer Support: separate optimized pipelines for deterministic answers (numbers, booleans, names, dates) and free-text explanations
Hybrid Retrieval: vector similarity + BM25 full-text + reranker for precise retrieval across a corpus of 10K+ legal PDFs
Source-Grounded Responses: every answer includes an exact page reference — no hallucination, full traceability
LLM-as-Judge Evaluator: cascaded check on every answer for correctness and citation faithfulness
Anti-hallucination: when the corpus has no answer, the system returns null with empty grounding
Production-grade latency: under 2 seconds time-to-first-token — fast enough for real-time legal research
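The judge-then-refuse behavior can be sketched as a simple gate. This is an illustrative stub, not the production evaluator: the judge score, threshold, and response shape are assumptions standing in for the real LLM-as-Judge pass.

```python
GROUNDING_THRESHOLD = 0.8  # hypothetical judge-score cutoff


def gated_answer(draft_answer, citations, judge_score):
    """Release the drafted answer only if the judge confirms it is
    grounded in the retrieved pages; otherwise refuse with null.

    judge_score: hypothetical 0-1 faithfulness score produced by an
    LLM-as-Judge pass over the answer and its cited passages.
    """
    if not citations or judge_score < GROUNDING_THRESHOLD:
        # Out-of-corpus or unsupported: null answer, empty grounding —
        # never a fabricated citation.
        return {"answer": None, "grounding": []}
    return {"answer": draft_answer, "grounding": citations}


# A supported answer passes through with its page-level citations.
print(gated_answer("Liability is capped per clause 7.2.",
                   [{"doc": "msa.pdf", "page": 14}], 0.95))
# An unsupported answer is refused rather than invented.
print(gated_answer("It might be permitted.", [], 0.30))
# → {'answer': None, 'grounding': []}
```

The key design choice is that refusal is the default path: an answer must carry citations and clear the judge before it ever reaches a lawyer.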
Results After 1 Month
| Metric | Before | After |
|---|---|---|
| Time per query | 30–60 minutes | <2 minutes |
| Research speed | — | 8× faster |
| Cited answer accuracy | — | 96% |
| Hallucinated citations | Occasional | 0 on benchmark |
| Delivery timeline | — | Prototype in 3 weeks, production in 10 weeks (2.5 months) |
What the Client Gained
Lawyers reclaim hours of working time each day from manual research
Every citation is verifiable to the page — meets client due diligence expectations
Cross-document questions across hundreds of PDFs are answered automatically
No fabricated prior cases — lower professional liability exposure
Fast delivery: prototype in 3 weeks, production in 2.5 months
Technologies
Voyage-4 • GPT-5 • Voyage Rerank-2.5 • Gemini 2.5 Flash • Qdrant • PostgreSQL • Python • FastAPI • Redis • React