Case Study • Legal RAG
AI Legal Research Assistant for a Law Firm
Agentic RAG system for a law firm that answers complex legal questions over a corpus of 10K+ documents — laws, regulations, and court decisions — with grounded, source-cited responses in seconds
96%
Cited Accuracy
8×
Faster Research
10K+
Documents
<2 sec
Time to First Token
Client's Task
- Lawyers spend 30–60 minutes per query manually sifting through hundreds of pages of contracts, regulations, and court decisions
- Cross-document questions require referencing multiple contracts, laws, and rulings at once — nearly impossible to do accurately by hand
- Clients demand precise source references for every legal opinion — manual citation tracking leads to errors and outdated provisions
- Internal memos, brief banks, and closed-matter files are scattered across iManage, DMS, and SharePoint — keyword search only
- Off-the-shelf LLM tools hallucinate case citations — in legal practice, that is a professional liability risk
Project Goals:
- Cut per-query research time from 30–60 minutes down to seconds
- Every answer must come with an exact page-level reference to its source
- Handle cross-document questions across hundreds of PDFs at once, accurately
- Strict refusal on out-of-corpus questions — return null, never fabricate
Solution
Agentic legal RAG with page-level grounding and zero hallucinated citations
A retrieve → rerank → LLM pipeline with agentic quality control. Hybrid search (vector + BM25) finds the right passages, a reranker tightens the result set, and an LLM-as-Judge verifies every answer for correctness and source attribution
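To illustrate the hybrid-search step, here is a minimal sketch of how vector and BM25 result lists can be merged before reranking. This is not the delivered system's code; it uses reciprocal rank fusion (a common fusion method, assumed here for illustration) over two hypothetical ranked lists of document IDs:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked result lists into one hybrid ranking.

    rankings: lists of doc IDs, each ordered best-first — e.g. one
    from vector similarity, one from BM25 full-text search.
    k dampens the influence of any single list's top ranks.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    # Highest fused score first; the reranker then tightens this set.
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["doc_12", "doc_7", "doc_3"]    # ranked by embedding similarity
bm25_hits = ["doc_12", "doc_7", "doc_99"]     # ranked by keyword match
print(reciprocal_rank_fusion([vector_hits, bm25_hits]))
# → ['doc_12', 'doc_7', 'doc_3', 'doc_99']
```

Documents that score well on both signals rise to the top, which is what makes hybrid retrieval robust to both paraphrased queries and exact statutory terms.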
Multi-Type Answer Support: separate optimized pipelines for deterministic answers (numbers, booleans, names, dates) and free-text explanations
Hybrid Retrieval: vector similarity + BM25 full-text + reranker for precise retrieval across a corpus of 10K+ legal PDFs
Source-Grounded Responses: every answer includes an exact page reference — no hallucination, full traceability
LLM-as-Judge Evaluator: cascaded check on every answer for correctness and citation faithfulness
Anti-hallucination: when the corpus has no answer, the system returns null with empty grounding
Production-grade latency: under 2 seconds time-to-first-token — fast enough for real-time legal research
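The judge-then-refuse behavior can be sketched as a simple gate. This is an illustrative stub, not the production evaluator: the judge score, threshold, and response shape are assumptions standing in for the real LLM-as-Judge pass.

```python
GROUNDING_THRESHOLD = 0.8  # hypothetical judge-score cutoff


def gated_answer(draft_answer, citations, judge_score):
    """Release the drafted answer only if the judge confirms it is
    grounded in the retrieved pages; otherwise refuse with null.

    judge_score: hypothetical 0-1 faithfulness score produced by an
    LLM-as-Judge pass over the answer and its cited passages.
    """
    if not citations or judge_score < GROUNDING_THRESHOLD:
        # Out-of-corpus or unsupported: null answer, empty grounding —
        # never a fabricated citation.
        return {"answer": None, "grounding": []}
    return {"answer": draft_answer, "grounding": citations}


# A supported answer passes through with its page-level citations.
print(gated_answer("Liability is capped per clause 7.2.",
                   [{"doc": "msa.pdf", "page": 14}], 0.95))
# An unsupported answer is refused rather than invented.
print(gated_answer("It might be permitted.", [], 0.30))
# → {'answer': None, 'grounding': []}
```

The key design choice is that refusal is the default path: an answer must carry citations and clear the judge before it ever reaches a lawyer.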
Results After 1 Month
| Metric | Before | After |
|---|---|---|
| Time per query | 30–60 minutes | <2 minutes |
| Research speed | — | 8× faster |
| Cited answer accuracy | — | 96% |
| Hallucinated citations | Occasional | 0 on benchmark |
| Delivery timeline | — | Prototype in 3 weeks, production in 10 weeks (2.5 months) |
What the Client Gained
Lawyers reclaim hours of working time each day from manual research
Every citation is verifiable to the page — meets client due diligence expectations
Cross-document questions across hundreds of PDFs are answered automatically
No fabricated prior cases — lower professional liability exposure
Fast delivery: prototype in 3 weeks, production in 2.5 months
Technologies
Voyage-4 • GPT-5 • Voyage Rerank-2.5 • Gemini 2.5 Flash • Qdrant • PostgreSQL • Python • FastAPI • Redis • React