

Cutting case-research time from 6 hours to 22 minutes with a RAG system over 80,000 court filings

RAG legal-research assistant for a litigation boutique.

  • Industry: Legal services
  • Use case: RAG retrieval over a private corpus
  • Role: Managing Partner
  • Engagement: Single AI Sprint
  • Timeline: 2 weeks
  • −94% research time per case — from 6 hours to 22 minutes (median)
  • 80,000 filings indexed — PACER + state court PDFs across 11 jurisdictions
  • 99.2% citation accuracy — validated against attorney spot-checks
  • $340k/yr billable-hour recapture — estimated, based on hours redirected to higher-value work

"We did not buy software. We built a research associate that reads everything we have ever written and never forgets a case."

— Managing Partner, litigation boutique (12 attorneys)

The challenge

At this 12-attorney litigation boutique, every new case opened with two to six hours of associate time spent reading prior filings to understand opposing counsel's patterns, the judge's tendencies, and the precedents in play. That work was billable, but it was also bottlenecking the firm's capacity to take on new matters.

Existing legal-tech AI tools were too generic — they searched public case law but did not have access to the firm's accumulated 80,000-document corpus of filings, briefs, and internal memos. The partners refused to upload anything client-confidential to a third-party vendor.

How we approached it

Days 1–4 — Ingest pipeline. Built a Lambda-driven ingest that watched the firm's S3 bucket of court filings (PDF + DOCX). OCR via Textract for scanned documents. Chunked at the section level with overlap, embedded with Cohere v3, indexed in Pinecone. Tagged every chunk with jurisdiction, court, judge, opposing counsel, date, and our internal matter ID.
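The section-level chunking with overlap can be sketched roughly as below. The function name `chunk_sections`, the overlap size, and the metadata keys are illustrative assumptions, not the firm's actual code; in the real pipeline each chunk would then be embedded and upserted to Pinecone with this metadata attached.

```python
from dataclasses import dataclass, field

@dataclass
class Chunk:
    text: str
    meta: dict = field(default_factory=dict)

def chunk_sections(sections, overlap_chars=200, meta=None):
    """Chunk a filing at the section level, carrying the tail of the
    previous section forward as overlap so cross-section context survives."""
    chunks = []
    prev_tail = ""
    for section in sections:
        text = (prev_tail + "\n" + section).strip() if prev_tail else section
        # Every chunk carries the same document-level retrieval metadata.
        chunks.append(Chunk(text=text, meta=dict(meta or {})))
        prev_tail = section[-overlap_chars:]
    return chunks

# Hypothetical filing with two sections, tagged for filtered retrieval.
sections = ["I. Statement of facts ...", "II. Argument ..."]
meta = {"jurisdiction": "NY", "judge": "Doe", "matter_id": "M-1042"}
chunks = chunk_sections(sections, overlap_chars=50, meta=meta)
```

Tagging every chunk (rather than every document) is what lets retrieval filter by judge, jurisdiction, or opposing counsel at query time.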

Days 5–8 — Retrieval + reranker. Hybrid retrieval: vector search for semantic matches plus BM25 for cite-string matching (Westlaw-style citations). Cohere's reranker scores the top 50 candidates to surface the eight most relevant. Claude Sonnet 4 serves as the answer model under a strict system prompt: answer only from the provided documents, and cite every claim with the matter ID and page number.
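One common way to merge the vector and BM25 result lists before the reranker is reciprocal rank fusion; the case study does not say which fusion method was used, so this is a minimal sketch under that assumption, with made-up document IDs.

```python
def reciprocal_rank_fusion(ranked_lists, k=60, top_n=50):
    """Merge several ranked lists of document IDs into one.
    Each document scores sum(1 / (k + rank)) across the lists, so items
    ranked highly by either retriever float to the top."""
    scores = {}
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    merged = sorted(scores, key=scores.get, reverse=True)
    return merged[:top_n]

vector_hits = ["d3", "d1", "d7"]   # semantic matches
bm25_hits = ["d7", "d9", "d3"]     # cite-string matches
candidates = reciprocal_rank_fusion([vector_hits, bm25_hits], top_n=50)
# `candidates` would then go to the reranker, which keeps the top 8.
```

The constant `k` damps the influence of any single list's top rank, which is why RRF works without score normalization across retrievers.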

Days 9–10 — UI and evals. Lightweight Next.js app inside the firm's Cloudflare-protected intranet. 100 historical research questions used as the eval set, scored against attorney-validated answers. Hit 99.2% citation accuracy on first ship.
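A citation-accuracy check like the one in the eval harness can be sketched as follows. The bracketed "[M-1042 p.12]" citation format, the regex, and the scoring policy are assumptions for illustration; the real harness compares against attorney-validated answers.

```python
import re

# Assumed citation format: "[M-<matter> p.<page>]" appended to each claim.
CITE_RE = re.compile(r"\[(M-\d+)\s+p\.(\d+)\]")

def citation_accuracy(answer, valid_citations):
    """Fraction of citations in `answer` found in the validated set.
    Answers with no citations score 0.0, so uncited claims are penalized."""
    found = CITE_RE.findall(answer)
    if not found:
        return 0.0
    correct = sum(1 for cite in found if cite in valid_citations)
    return correct / len(found)

answer = "The judge excluded similar testimony [M-1042 p.12] and [M-0991 p.4]."
gold = {("M-1042", "12"), ("M-0991", "4")}
score = citation_accuracy(answer, gold)
```

Scoring each question this way and averaging across the 100-question eval set yields a single accuracy number that can be re-run on every model change.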

The whole system runs in the firm's AWS account. No client data ever leaves their tenant.

The outcome

Median case-research time dropped from 6 hours to 22 minutes. Associates now use the assistant as a first-pass and spend their billable hours on the harder analysis — strategy, drafting, argumentation. The firm took on three additional matters in the quarter following launch without adding headcount.

The Managing Partner estimated $340k/year in recaptured billable hours redirected to higher-value work. At that rate, the system paid for itself within its first month of use.

Stack

  • Anthropic Claude Sonnet 4
  • Cohere reranker
  • Pinecone
  • AWS Lambda
  • Linear-driven workflow

Team

One senior engineer + a legal-domain reviewer

Handoff

Full code in their AWS org with IaC templates. Eval harness re-runnable on every model change. Quarterly model-upgrade reviews. The firm now adds new jurisdictions to the index in-house.

Your engagement, next on this list

Tell us the workflow. We will scope the receipt.