

Cutting case-research time from 6 hours to 22 minutes with a RAG system over 80,000 court filings

RAG legal-research assistant for a litigation boutique.

  • Industry: Legal services
  • Use case: RAG retrieval over a private corpus
  • Role: Managing Partner
  • Engagement: Single AI Sprint
  • Timeline: 2 weeks
  • −94% research time per case — from 6 hours to 22 minutes (median)
  • 80,000 filings indexed — PACER + state court PDFs across 11 jurisdictions
  • 99.2% citation accuracy — validated against attorney spot-checks
  • $340k/yr billable-hour recapture — estimated, based on hours redirected to higher-value work

"We did not buy software. We built a research associate that reads everything we have ever written and never forgets a case."

— Managing Partner, litigation boutique (12 attorneys)

The challenge

At this 12-attorney litigation boutique, every new case opened with two to six hours of associate time spent reading prior filings to understand opposing counsel's patterns, the judge's tendencies, and the precedents in play. That work was billable, but it was also bottlenecking the firm's capacity to take on new matters.

Existing legal-tech AI tools were too generic — they searched public case law but did not have access to the firm's accumulated 80,000-document corpus of filings, briefs, and internal memos. The partners refused to upload anything client-confidential to a third-party vendor.

How we approached it

Days 1–4 — Ingest pipeline. Built a Lambda-driven ingest that watched the firm's S3 bucket of court filings (PDF + DOCX). OCR via Textract for scanned documents. Chunked at the section level with overlap, embedded with Cohere v3, indexed in Pinecone. Tagged every chunk with jurisdiction, court, judge, opposing counsel, date, and our internal matter ID.
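The section-level chunking with overlap can be sketched roughly as below. The function name `chunk_sections`, the overlap size, and the metadata keys are illustrative assumptions, not the firm's actual code; in the real pipeline each chunk would then be embedded and upserted to Pinecone with this metadata attached.

```python
from dataclasses import dataclass, field

@dataclass
class Chunk:
    text: str
    meta: dict = field(default_factory=dict)

def chunk_sections(sections, overlap_chars=200, meta=None):
    """Chunk a filing at the section level, carrying the tail of the
    previous section forward as overlap so cross-section context survives."""
    chunks = []
    prev_tail = ""
    for section in sections:
        text = (prev_tail + "\n" + section).strip() if prev_tail else section
        # Every chunk carries the same document-level retrieval metadata.
        chunks.append(Chunk(text=text, meta=dict(meta or {})))
        prev_tail = section[-overlap_chars:]
    return chunks

# Hypothetical filing with two sections, tagged for filtered retrieval.
sections = ["I. Statement of facts ...", "II. Argument ..."]
meta = {"jurisdiction": "NY", "judge": "Doe", "matter_id": "M-1042"}
chunks = chunk_sections(sections, overlap_chars=50, meta=meta)
```

Tagging every chunk (rather than every document) is what lets retrieval filter by judge, jurisdiction, or opposing counsel at query time.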

Days 5–8 — Retrieval + reranker. Hybrid retrieval: vector search for semantic matches plus BM25 for cite-string matching (Westlaw-style citations). Cohere's reranker scores the top 50 candidates to surface the eight most relevant. Claude Sonnet 4 serves as the answer model under a strict system prompt: answer only from the provided documents, and cite every claim with the matter ID and page number.
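One common way to merge the vector and BM25 result lists before the reranker is reciprocal rank fusion; the case study does not say which fusion method was used, so this is a minimal sketch under that assumption, with made-up document IDs.

```python
def reciprocal_rank_fusion(ranked_lists, k=60, top_n=50):
    """Merge several ranked lists of document IDs into one.
    Each document scores sum(1 / (k + rank)) across the lists, so items
    ranked highly by either retriever float to the top."""
    scores = {}
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    merged = sorted(scores, key=scores.get, reverse=True)
    return merged[:top_n]

vector_hits = ["d3", "d1", "d7"]   # semantic matches
bm25_hits = ["d7", "d9", "d3"]     # cite-string matches
candidates = reciprocal_rank_fusion([vector_hits, bm25_hits], top_n=50)
# `candidates` would then go to the reranker, which keeps the top 8.
```

The constant `k` damps the influence of any single list's top rank, which is why RRF works without score normalization across retrievers.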

Days 9–10 — UI and evals. Lightweight Next.js app inside the firm's Cloudflare-protected intranet. 100 historical research questions used as the eval set, scored against attorney-validated answers. Hit 99.2% citation accuracy on first ship.
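A citation-accuracy check like the one in the eval harness can be sketched as follows. The bracketed "[M-1042 p.12]" citation format, the regex, and the scoring policy are assumptions for illustration; the real harness compares against attorney-validated answers.

```python
import re

# Assumed citation format: "[M-<matter> p.<page>]" appended to each claim.
CITE_RE = re.compile(r"\[(M-\d+)\s+p\.(\d+)\]")

def citation_accuracy(answer, valid_citations):
    """Fraction of citations in `answer` found in the validated set.
    Answers with no citations score 0.0, so uncited claims are penalized."""
    found = CITE_RE.findall(answer)
    if not found:
        return 0.0
    correct = sum(1 for cite in found if cite in valid_citations)
    return correct / len(found)

answer = "The judge excluded similar testimony [M-1042 p.12] and [M-0991 p.4]."
gold = {("M-1042", "12"), ("M-0991", "4")}
score = citation_accuracy(answer, gold)
```

Scoring each question this way and averaging across the 100-question eval set yields a single accuracy number that can be re-run on every model change.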

The whole system runs in the firm's AWS account. No client data ever leaves their tenant.

The outcome

Median case-research time dropped from 6 hours to 22 minutes. Associates now use the assistant as a first-pass and spend their billable hours on the harder analysis — strategy, drafting, argumentation. The firm took on three additional matters in the quarter following launch without adding headcount.

The Managing Partner estimated $340k/year in recaptured billable hours redirected to higher-value work. At that rate, the system paid for itself within its first month of use.

Stack

  • Anthropic Claude Sonnet 4
  • Cohere reranker
  • Pinecone
  • AWS Lambda
  • Linear-driven workflow

Team

One senior engineer + a legal-domain reviewer

Handoff

Full code in their AWS org with IaC templates. Eval harness re-runnable on every model change. Quarterly model-upgrade reviews. The firm now adds new jurisdictions to the index in-house.

Your engagement, next on this list

Tell us the workflow. We will scope the receipt.