Just Think AI


Cutting first-response time 71% with a support copilot built on Zendesk + Notion

Support copilot for a 220-person B2B SaaS company.

  • Industry: B2B SaaS
  • Use case: Customer support copilot
  • Role: VP of Customer Experience
  • Engagement: 2-week AI Sprint, then 4-week extension
  • Timeline: 6 weeks total
  • −71% first-response time: from a 14 min to a 4 min median
  • 34% tickets auto-resolved: deflected with answers, without a human reply
  • +8 pts CSAT delta: quarter over quarter on AI-handled tickets
  • 6 weeks payback period: sprint cost recovered in deflection alone

"We had tried two AI vendors before. Both were demos. This is the first one that actually understands our product because it actually reads our docs."

VP of Customer Experience, B2B SaaS, 220 employees

The challenge

The client's support team was drowning: 1,800 tickets per week, 10% month-over-month growth, and a hiring freeze that ruled out adding headcount. Their existing macros covered roughly 40% of common questions, but agents had to know which macro existed and where to find it. Median first-response time had crept past 14 minutes, and CSAT was sliding.

They had tried two off-the-shelf AI chatbots. Both produced confident wrong answers because neither had access to the team's actual knowledge base — a mix of Notion runbooks, Zendesk macros, and Slack-pinned notes. Trust evaporated within a week and both were rolled back.

How we approached it

We scoped a 2-week AI Sprint with a hard rule: no autonomous replies in week 1. The copilot drafts; humans send. That single constraint unlocked the rest.

Day 1–3 — Ingest and rank. Built a Postgres + pgvector index over the client's Notion workspace, Zendesk macros, and the last 90 days of resolved tickets. Tagged every chunk with source, last-updated date, and a confidence prior based on agent thumbs-ups in historical replies.
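A minimal sketch of the ingest-and-rank step described above. The table layout, column names, and the 1536-dimension embedding size are illustrative assumptions, not the client's actual schema; only the three metadata fields (source, last-updated date, confidence prior) come from the case study.

```python
# Hypothetical pgvector-backed schema for knowledge-base chunks.
CREATE_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS kb_chunks (
    id               BIGSERIAL PRIMARY KEY,
    source           TEXT NOT NULL,   -- 'notion' | 'zendesk_macro' | 'resolved_ticket'
    body             TEXT NOT NULL,
    last_updated     DATE NOT NULL,
    confidence_prior REAL NOT NULL,   -- derived from agent thumbs-ups
    embedding        vector(1536)     -- dimension is an assumption
);
"""

def confidence_prior(thumbs_up: int, thumbs_down: int) -> float:
    """One plausible way to turn historical agent thumbs-ups into a prior:
    Laplace smoothing, so chunks with no feedback start at a neutral 0.5."""
    return (thumbs_up + 1) / (thumbs_up + thumbs_down + 2)
```

With this smoothing, a chunk with eight thumbs-ups and no thumbs-downs gets a 0.9 prior, while an untouched chunk stays at 0.5 rather than 0 or 1.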

Day 4–7 — The copilot loop. Wired into Zendesk via the Apps SDK. When an agent opens a ticket, the copilot retrieves the top three relevant docs, drafts a reply with citations, and shows the agent a confidence band. Above 80% confidence the draft pre-fills the reply box. Below, it shows as a suggestion the agent can pull in or ignore.
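The retrieval-and-routing loop above can be sketched as follows. The 80% threshold and the prefill-vs-suggestion split are from the case study; the query text and function are assumptions about how such a loop is typically wired up with pgvector.

```python
# Hypothetical top-3 retrieval query: pgvector's `<=>` operator is cosine
# distance, so ascending order returns the closest chunks first.
TOP3_SQL = """
SELECT id, body, source, confidence_prior
FROM kb_chunks
ORDER BY embedding <=> %(ticket_embedding)s
LIMIT 3;
"""

def route_draft(confidence: float, prefill_threshold: float = 0.80) -> str:
    """Decide how a draft surfaces in the agent's Zendesk view.

    At or above the threshold the draft pre-fills the reply box ("prefill");
    below it, the agent sees it as an optional suggestion ("suggest").
    """
    return "prefill" if confidence >= prefill_threshold else "suggest"
```

Keeping the routing rule a pure function of the confidence score makes the threshold easy to tune per team without touching the retrieval path.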

Day 8–10 — Eval harness + ship. Built a 200-ticket eval set from historical resolutions. The copilot scored 91% match against agent-validated answers on first pass. Shipped to production behind a feature flag for one team of six agents.
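The eval harness can be reduced to a small scoring loop like the sketch below. The function shape and the comparison hook are assumptions; the real harness almost certainly used a fuzzier semantic comparison than the exact match shown in the example.

```python
def eval_match_rate(cases, draft_fn, is_match) -> float:
    """Score a draft function against agent-validated answers.

    `cases` is a list of (ticket_text, validated_answer) pairs.
    `draft_fn` produces the copilot's draft for a ticket.
    `is_match` compares a draft to the validated answer.
    Returns the fraction of cases where the draft matched.
    """
    hits = sum(1 for ticket, answer in cases if is_match(draft_fn(ticket), answer))
    return hits / len(cases)
```

Running this over a 200-ticket set of historical resolutions is what yields a single first-pass number like the 91% reported above.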

Week 3–6 — Hardening. Rolled out to the full support org. Added an autonomous-reply lane for a narrow set of FAQ-class tickets (password resets, plan changes, billing receipts) that scored 99%+ in eval. Trained the team on what "good copilot output" looks like and built a feedback loop that auto-promotes high-rated answers into the index.
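The auto-promotion feedback loop can be sketched as a simple filter over rated answers. The rating scale and both thresholds are illustrative assumptions; the case study only says that high-rated answers are promoted into the index.

```python
def promotion_candidates(rated_answers, min_rating=4.5, min_votes=3):
    """Pick high-rated agent replies to fold back into the retrieval index.

    `rated_answers` maps an answer id to its list of 1-5 star ratings.
    An answer qualifies only with enough votes AND a high enough average,
    so a single enthusiastic rating cannot promote an answer on its own.
    """
    return [
        answer_id
        for answer_id, ratings in rated_answers.items()
        if len(ratings) >= min_votes
        and sum(ratings) / len(ratings) >= min_rating
    ]
```

Requiring a minimum vote count guards the index against one-off ratings polluting the autonomous-reply lane.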

The outcome

Six weeks in, the copilot was drafting on 100% of incoming tickets and autonomously closing 34% without human reply. First-response time dropped from a 14-minute median to 4 minutes. CSAT on AI-touched tickets came in 8 points higher than the baseline because agents could spend their time on harder issues instead of typing the same answer for the hundredth time.

The bigger win was retention. Two senior agents who had been considering leaving stayed because the copilot eliminated the most demoralizing part of the job: typing the same five answers all day. The client paid back the sprint cost in deflection alone within six weeks.

Stack

  • OpenAI GPT-4.1
  • Zendesk API
  • Notion API
  • Postgres + pgvector
  • Vercel

Team

Two engineers + design + a fractional product lead

Handoff

Full code in their Vercel + GitHub org. Runbook covering eval re-runs, doc re-ingest cadence, and the autonomous-reply expansion playbook. Two pairing sessions with their senior backend engineer.

Your engagement, next on this list

Tell us the workflow. We will scope the receipt.