Intermediate · 11 min read · Automations

The AI customer support agent that resolves 70% of tickets

A realistic design for an AI customer support agent that resolves the common cases, escalates the hard ones, and doesn't make the kind of mistake that ends up on Hacker News. The architecture, the prompts, the guardrails.

What you should be able to do

Evaluate the implementation pattern, failure modes, and guardrails before building.

May 15, 2026

The numbers people throw around for AI customer support — "resolves 80% of tickets," "saves $5 per ticket," "responds in 30 seconds" — are real for some companies and fictional for others. The difference is not the model. It is the design.

A well-built AI support agent in 2026 can genuinely resolve 60-75% of incoming tickets without human intervention, with customer satisfaction comparable to or better than human-only support. A poorly built one produces the kind of hallucinated, frustrating responses that end up on social media. The architecture matters more than the model choice.

This article is the realistic version: the architecture that works, the prompts that produce good responses, the guardrails that prevent disasters, and the parts where humans still need to be in the loop.

The four jobs of a support agent

A useful AI support agent does four things, in order:

1. Understand the ticket. What is the customer actually asking? What are they emotionally bringing to this? What category of issue is this?
2. Look up the right context. The customer's account, their history with you, the relevant documentation, similar resolved tickets.
3. Decide what to do. Reply with the answer, ask a clarifying question, route to a human, take an action on the account.
4. Execute the decision. Send the reply, ask the question, escalate, or perform the account action, and log everything for audit.

Most failed support agents fail at job 2 (no real customer context) or job 3 (no clear routing logic). The model itself is rarely the problem.

The architecture

Roughly:

Incoming ticket
    ↓
[Triage agent: classify, prioritise, route]
    ↓
[Context gathering: customer data, history, knowledge base RAG]
    ↓
[Reasoning agent: decide action]
    ↓
[Response drafter / action executor]
    ↓
[Quality check]
    ↓
[Send or escalate]

Each step is a distinct concern. You can build this in n8n, in a dedicated agent framework like LangGraph or CrewAI, or as a set of microservices. The architectural pattern is the same regardless of platform.
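As a rough illustration of "each step is a distinct concern," the flow above can be sketched as a list of stages that run in order and hand off to a human the moment one of them declines to continue. This is a minimal sketch, not a framework: `Ticket`, `Stage`, `run_pipeline`, and the stub stages are hypothetical names, and the stubs stand in for the real model and API calls.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Ticket:
    customer_email: str
    body: str
    # Each stage writes its output here so later stages and the audit log can read it.
    state: dict = field(default_factory=dict)

# A stage returns True to continue, or False to stop and hand the ticket to a human.
Stage = Callable[[Ticket], bool]

def run_pipeline(ticket: Ticket, stages: list[tuple[str, Stage]]) -> str:
    """Run stages in order; escalate as soon as any stage declines to continue."""
    for name, stage in stages:
        if not stage(ticket):
            ticket.state["stopped_at"] = name
            return "escalate"
    return "send"

# Stub stages standing in for the real triage and quality-check calls.
def triage(t: Ticket) -> bool:
    t.state["urgency"] = "normal"
    return True

def quality_check(t: Ticket) -> bool:
    return "refund" not in t.body.lower()  # placeholder rule, not a real check

ticket = Ticket("anna@example.com", "How do I export my data?")
outcome = run_pipeline(ticket, [("triage", triage), ("quality_check", quality_check)])
```

Keeping stages as plain functions makes it easy to test each one in isolation and to move the same logic into n8n nodes or LangGraph nodes later.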

We will walk through each step.

Step 1: Triage

The triage agent receives the raw incoming ticket and classifies it.

A reliable triage system prompt:

You are a triage agent for [Company]'s customer support. Classify each incoming ticket on three dimensions:

1. CATEGORY: one of
   - account_access (login, password, MFA, account locked)
   - billing (charges, refunds, plan changes, invoices)
   - product_question (how-to, feature questions, configuration)
   - bug_report (something broken or unexpected)
   - feature_request (asking for something we don't have)
   - complaint (frustrated customer, not a specific technical issue)
   - other

2. URGENCY: one of "critical" (production down, billing dispute), "normal", "low" (informational).

3. EMOTIONAL_TONE: one of "calm", "frustrated", "very_angry". Be honest.

Output JSON with the fields category, urgency, emotional_tone, and confidence (0.0-1.0). If you are unsure about the category, set confidence below 0.7.

Triage runs cheaply on a fast model (GPT-5 fast variant or Claude Haiku). You don't need a reasoning model for this; it's pattern matching.

The output of triage feeds two decisions:

High-urgency or very-angry tickets go directly to a human, even if the agent could handle them. The brand risk of "AI gave a frustrated customer the wrong answer" is too high.
Category determines which knowledge base and which tools are made available downstream.
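The two routing decisions above are simple enough to enforce in code rather than trusting the model to follow them. The sketch below assumes the triage model returns JSON with the keys `category`, `urgency`, `emotional_tone`, and `confidence`; those key names, the `route_after_triage` function, and the choice to send low-confidence or malformed output to a human are illustrative assumptions.

```python
import json

CATEGORIES = {"account_access", "billing", "product_question", "bug_report",
              "feature_request", "complaint", "other"}
URGENCIES = {"critical", "normal", "low"}
TONES = {"calm", "frustrated", "very_angry"}

def route_after_triage(raw_json: str) -> dict:
    """Decide whether a ticket goes straight to a human or on to the agent."""
    try:
        t = json.loads(raw_json)
    except json.JSONDecodeError:
        return {"route": "human", "reason": "unparseable triage output"}
    if (not isinstance(t, dict) or t.get("category") not in CATEGORIES
            or t.get("urgency") not in URGENCIES
            or t.get("emotional_tone") not in TONES):
        return {"route": "human", "reason": "invalid triage fields"}
    # High-urgency or very angry tickets bypass the agent entirely.
    if t["urgency"] == "critical" or t["emotional_tone"] == "very_angry":
        return {"route": "human", "reason": "critical urgency or very angry tone"}
    # Assumption: an unsure classification is safer with a human than a wrong tool set.
    conf = t.get("confidence")
    if not isinstance(conf, (int, float)) or conf < 0.7:
        return {"route": "human", "reason": "low triage confidence"}
    # The category selects the knowledge base and tools used downstream.
    return {"route": "agent", "category": t["category"]}
```

Treating anything unexpected as "route to human" keeps a malformed model response from silently reaching the customer.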

Step 2: Context gathering

This is where most agents are made or broken. Without good context, the agent is just an LLM guessing.

Three sources of context to pull:

Customer data. Who is this customer? Plan, account age, recent activity, payment status, any open issues. This usually comes from your CRM or product database via an API call.

Conversation history. Has this customer contacted you before? What about? How was it resolved? Avoid the "I just told you that yesterday" failure mode.

Knowledge base (via RAG). Documentation, help center articles, internal runbooks. Retrieved by semantic search against the ticket content. (RAG fundamentals are in our other articles.)

A reliable context-gathering pattern:

Given the ticket [content], gather context:

1. Look up the customer by email. If found, retrieve plan, account_age_days, recent_actions (last 7 days), open_tickets.

2. Look up the customer's ticket history (last 90 days). Retrieve up to 5 most recent tickets with their resolution.

3. Search the knowledge base for relevant articles. Retrieve top 3 by semantic similarity. Include article titles, summaries, and URLs.

4. Search resolved tickets in our database for similar issues. Retrieve top 2 with their resolutions.

Combine into a context object.

This step takes 2-5 seconds and dramatically improves what the agent has to work with.
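One way to keep this step fast is to run the lookups concurrently and to tolerate a failed source instead of failing the whole ticket. The sketch below assumes each lookup is wrapped in a zero-argument callable; `gather_context` and the source names are hypothetical, and the lambdas stand in for real CRM, ticketing, and vector-search calls.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable

def gather_context(sources: dict[str, Callable[[], Any]]) -> dict[str, Any]:
    """Run every lookup in parallel and combine the results into one context object."""
    context: dict[str, Any] = {"missing": []}
    with ThreadPoolExecutor(max_workers=max(len(sources), 1)) as pool:
        futures = {name: pool.submit(fn) for name, fn in sources.items()}
        for name, future in futures.items():
            try:
                context[name] = future.result()
            except Exception:
                # Record the gap so the reasoning agent knows what it could not see.
                context[name] = None
                context["missing"].append(name)
    return context

# Stand-in lookups; real ones would call your CRM, ticket store, and search index.
context = gather_context({
    "customer": lambda: {"plan": "pro", "account_age_days": 412},
    "ticket_history": lambda: [],
    "kb_articles": lambda: 1 / 0,  # simulates a failing search backend
})
```

Passing the `missing` list to the reasoning agent lets it lower its confidence, or escalate, when a key source was unavailable.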

Step 3: The reasoning agent

Now the agent decides what to do. The system prompt:

You are a customer support specialist for [Company]. Your job is to resolve the customer's issue.

For each ticket:

1. Read the ticket and the context carefully. The context includes the customer's account, their history with us, and relevant documentation.

2. Decide on one of these actions:
   - RESOLVE: you have a confident answer or solution. Draft a response.
   - CLARIFY: you need more information. Draft a clarifying question.
   - ESCALATE: this needs a human. Explain why.
   - ACT_AND_RESOLVE: you can perform an action on the account (issue refund, reset password, change plan, etc.) using available tools, then respond.

3. Your tone is direct, warm, and competent. Match the customer's register. Never patronise. Never apologise more than once. Never use "we appreciate your patience."

4. When citing documentation, link to the specific article. Do not paraphrase from memory.

5. If the customer is frustrated, acknowledge it briefly and clearly, then move to the resolution.

6. Always escalate if:
   - The customer asks to speak to a human.
   - The issue involves a financial dispute over €100 / $100.
   - You are not confident in your answer (< 70% certainty).
   - The customer's tone is angry and the issue is not a simple one-step resolution.
   - The issue involves a security or privacy concern.
   - The issue involves a complaint about a person on our team.

7. Your output must be JSON:
{
  "action": "<resolve|clarify|escalate|act_and_resolve>",
  "confidence": <0.0-1.0>,
  "reasoning": "<brief explanation>",
  "response_draft": "<the email body>",
  "escalation_reason": "<if applicable>",
  "action_to_take": "<if act_and_resolve, the specific action and arguments>"
}

This is the heart of the agent. Use a strong model here — Claude Sonnet 4.5 or GPT-5 — because the quality of this decision shapes the whole experience.
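Because everything downstream depends on this JSON, it is worth validating it before acting on it. The sketch below is one possible guard, assuming the output format from the prompt above: it converts malformed output, unknown actions, low confidence, and empty drafts into an escalation. `parse_decision` is a hypothetical helper name.

```python
import json

VALID_ACTIONS = {"resolve", "clarify", "escalate", "act_and_resolve"}

def parse_decision(raw_json: str) -> dict:
    """Validate the reasoning agent's output; fall back to escalation when in doubt."""
    def escalate(reason: str) -> dict:
        return {"action": "escalate", "escalation_reason": reason}

    try:
        decision = json.loads(raw_json)
    except json.JSONDecodeError:
        return escalate("model output was not valid JSON")
    if not isinstance(decision, dict) or decision.get("action") not in VALID_ACTIONS:
        return escalate("unknown or missing action")
    if decision["action"] == "escalate":
        return decision  # keep the agent's own escalation reason
    # Enforce the 70% confidence rule in code, not only in the prompt.
    confidence = decision.get("confidence")
    if not isinstance(confidence, (int, float)) or confidence < 0.7:
        return escalate("confidence below 0.7")
    if not decision.get("response_draft"):
        return escalate("no response draft provided")
    if decision["action"] == "act_and_resolve" and not decision.get("action_to_take"):
        return escalate("act_and_resolve without a specified action")
    return decision
```

The prompt states the rules; this check makes sure a response that ignores them still ends up with a human.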

Step 4: Action execution

For RESOLVE and CLARIFY, the action is straightforward — send the email.

For ESCALATE, the action is routing to a human queue (Zendesk, Intercom, your internal tool) with the agent's analysis included so the human starts informed.

For ACT_AND_RESOLVE, the agent is performing an account action. This needs careful handling:

Allowlist of actions. Don't let the agent call arbitrary tools. Be explicit: "the agent can issue refunds up to €50, reset passwords, change subscription tier within the same plan family, and cancel subscriptions per request."
Confirmation thresholds. For higher-value actions (refunds over €50, account cancellations on annual plans), require a human review even if the agent is confident.
Logging. Every action gets logged with the agent's reasoning. Audit trail matters for support quality and for regulatory compliance.
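The three safeguards above fit naturally into a single authorization function that runs before any tool call. This is a minimal sketch: the action names, the `amount_eur` and `billing_period` argument keys, and the `authorize` function are hypothetical, and the "same plan family" check for tier changes is left out for brevity.

```python
import logging

logger = logging.getLogger("support_agent.actions")

ALLOWED_ACTIONS = {"issue_refund", "reset_password", "change_tier", "cancel_subscription"}
REFUND_AUTO_LIMIT_EUR = 50  # refunds above this go to a human

def authorize(action: str, args: dict, reasoning: str) -> str:
    """Return 'execute', 'human_review', or 'reject' for a proposed account action."""
    if action not in ALLOWED_ACTIONS:
        verdict = "reject"
    elif action == "issue_refund" and args.get("amount_eur", float("inf")) > REFUND_AUTO_LIMIT_EUR:
        # A missing amount is treated as over the limit.
        verdict = "human_review"
    elif action == "cancel_subscription" and args.get("billing_period") == "annual":
        verdict = "human_review"
    else:
        verdict = "execute"
    # Log every decision, including rejections, with the agent's reasoning.
    logger.info("action=%s args=%s verdict=%s reasoning=%s", action, args, verdict, reasoning)
    return verdict
```

Note that the function only decides; the actual tool call happens elsewhere, so the allowlist cannot be bypassed by a creative prompt.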

Step 5: Quality check

The final step before sending: a quality gate. This is usually a separate, cheaper AI call that reviews the drafted response.

You are a quality reviewer for AI-generated customer support responses.

Given the original ticket and the drafted response, check:

1. Does the response actually address the customer's question?
2. Is it accurate based on the context provided (no hallucinated facts)?
3. Is the tone right (warm, direct, not patronising, not over-apologetic)?
4. Are any links broken or wrong?
5. Does it contain any of these red flags:
   - Promising something we cannot deliver
   - Apologising for things that aren't our fault
   - Sounding angry or sarcastic
   - Using internal jargon
   - Disclosing internal information

Output: APPROVE or REVISE (with specific suggested fixes).

If the quality check returns APPROVE, send the response. If it returns REVISE, either auto-fix the draft (a cheap, fast model can apply the suggested changes) or queue it for human review.

In practice, this quality gate catches errors in roughly 5-10% of the main agent's drafts. Worth its cost.
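The APPROVE / REVISE handling can be sketched as a small loop. Here `review` and `fix` are hypothetical callables wrapping the reviewer model and the auto-fix model, and the verdict dictionary shape is an assumption. The sketch also assumes that an auto-fixed draft is reviewed again before sending, with a cap on fix attempts so a stubborn draft ends up with a human.

```python
from typing import Callable

def quality_gate(
    draft: str,
    review: Callable[[str], dict],     # returns {"verdict": "APPROVE" | "REVISE", "fixes": [...]}
    fix: Callable[[str, list], str],   # applies the suggested fixes and returns a new draft
    max_fixes: int = 1,
) -> tuple[str, str]:
    """Return (outcome, final_draft), where outcome is 'send' or 'human_review'."""
    for attempt in range(max_fixes + 1):
        result = review(draft)
        if result.get("verdict") == "APPROVE":
            return "send", draft
        if attempt < max_fixes:
            draft = fix(draft, result.get("fixes", []))
    # Still not approved after the allowed fix attempts.
    return "human_review", draft

# Toy reviewer and fixer to show the control flow.
def toy_review(text: str) -> dict:
    if "so sorry" in text:
        return {"verdict": "REVISE", "fixes": ["remove over-apology"]}
    return {"verdict": "APPROVE", "fixes": []}

def toy_fix(text: str, fixes: list) -> str:
    return text.replace("I'm so sorry. ", "")

outcome, final = quality_gate("I'm so sorry. Your refund is issued.", toy_review, toy_fix)
```

With `max_fixes=0` the gate never auto-fixes and sends every REVISE straight to human review, which is a reasonable starting point while you build trust in the fixer.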

The knowledge base: where most agents fail

The single biggest factor in agent quality is the knowledge base. If your help center is stale, contradictory, or incomplete, your agent will be confidently wrong.

Practical principles:

Audit before deploying. Walk through your top 100 most common ticket types and verify the knowledge base has the right answer for each. Fill gaps. Resolve contradictions. Update stale articles. This is a week of work and the highest-impact investment you can make.

Structure for retrieval. Articles should be short, focused on one issue each, with clear titles. Long monolithic articles get retrieved partially and produce bad responses.

Include explicit "do NOT" sections. Many support tickets are about how to do something the customer should not do. Knowledge base articles should explicitly say "if you are trying to X, here is why we don't recommend that, and here is the alternative."

Tag every article with applicability. "Free plan only," "EU customers only," "iOS app only." The agent uses these to filter retrievals.

Refresh quarterly. Most companies' knowledge bases drift. Schedule a quarterly review where someone goes through and flags stale content.
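The applicability tags mentioned above only help if retrieval actually uses them. A minimal sketch of that filter follows, assuming each retrieved article carries a `tags` dictionary that maps a customer attribute to the set of values it applies to, and that results arrive sorted by similarity. The field names and both helper functions are hypothetical.

```python
def applicable(article_tags: dict, customer: dict) -> bool:
    """An article applies if every tag it declares matches the customer.
    Attributes the article does not tag are treated as applying to everyone."""
    return all(customer.get(key) in allowed for key, allowed in article_tags.items())

def filter_retrievals(articles: list[dict], customer: dict, top_k: int = 3) -> list[dict]:
    """Drop articles that don't apply to this customer, then keep the top_k best matches."""
    return [a for a in articles if applicable(a.get("tags", {}), customer)][:top_k]

customer = {"plan": "pro", "region": "eu", "platform": "ios"}
retrieved = [  # already sorted by semantic similarity, best first
    {"title": "Exporting data on the Free plan", "tags": {"plan": {"free"}}},
    {"title": "Exporting data (EU data residency)", "tags": {"region": {"eu"}}},
    {"title": "Exporting data: overview", "tags": {}},
]
usable = filter_retrievals(retrieved, customer)
```

Filtering after retrieval is the simplest option; many vector stores also support metadata filters at query time, which avoids wasting the top results on articles that will be discarded.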

The escalation patterns that matter

A common failure mode is an agent that escalates everything (lazy) or never escalates (overconfident). Get the escalation patterns right:

Always escalate:

Explicit human requests
Anger above a threshold (especially after one bad agent turn)
Disputes involving real money
Security or privacy concerns
Health, safety, or legal implications
Repeated tickets from the same customer about the same issue
Cases where the agent's confidence is below 70%

Never escalate (low value):

Trivial questions with clear answers in the KB
Account housekeeping (password reset, basic profile changes)
Status queries ("did my refund go through?")
Feature requests (route to product team, not human support)

The middle ground is where the agent's judgement matters. Build instrumentation that lets you see: of all the cases the agent could have escalated but didn't, what fraction did the customer come back about? Of all the cases the agent escalated, how many did the human resolve trivially?
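Those two questions translate into two rates you can compute from ticket logs. The sketch below assumes each logged ticket has boolean fields `escalated`, `customer_came_back`, and `human_resolved_trivially`; the field names and the function are hypothetical. As a simplification, it uses all non-escalated tickets as the denominator for the first rate rather than only the borderline ones.

```python
def escalation_metrics(tickets: list[dict]) -> dict:
    """Compute the re-contact rate for agent-handled tickets and the
    trivial-resolution rate for escalated ones."""
    handled = [t for t in tickets if not t.get("escalated")]
    escalated = [t for t in tickets if t.get("escalated")]

    def rate(subset: list[dict], key: str):
        # None signals "no data" rather than a misleading 0%.
        if not subset:
            return None
        return sum(1 for t in subset if t.get(key)) / len(subset)

    return {
        # High value: the agent is overconfident and should escalate more.
        "recontact_rate_when_not_escalated": rate(handled, "customer_came_back"),
        # High value: the agent is escalating tickets it could have handled.
        "trivial_rate_when_escalated": rate(escalated, "human_resolved_trivially"),
    }
```

Tracking both numbers over time shows which direction to move the confidence threshold, instead of tuning it by feel.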

The "70% resolution" math, honestly

For a typical SaaS support queue:

20-30% are simple, clearly-documented questions. AI handles these well.
30-40% are medium-complexity questions where the agent needs context and judgement. AI handles these well if the knowledge base is strong and the agent has good tools.
20-30% need a human. Complex troubleshooting, emotional situations, edge cases, policy decisions.
10-20% are bug reports or feature requests that need product/engineering, not support.

Adding up the two AI-handleable buckets (20-30% simple plus 30-40% medium) gives a realistic range of 50-70%. The companies hitting 70%+ have invested heavily in their knowledge base and their agent's tool integrations. The companies stuck at 30% usually have a poor KB and a generic agent.
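The arithmetic is simple enough to write down, and doing so makes the dependence on knowledge-base quality explicit. In the sketch below the bucket shares come from the list above, while the per-bucket success rates are hypothetical inputs you would measure on your own queue. `resolution_rate` is an illustrative helper.

```python
def resolution_rate(simple_share, medium_share, simple_success=100, medium_success=100):
    """All arguments are percentages. Returns the percentage of all tickets
    the agent resolves, given how often it succeeds within each bucket."""
    return (simple_share * simple_success + medium_share * medium_success) / 100

# Ceiling: the agent resolves every ticket in both AI-handleable buckets.
ceiling_low = resolution_rate(20, 30)    # 50.0
ceiling_high = resolution_rate(30, 40)   # 70.0

# Hypothetical weak-KB scenario: half of the medium-complexity tickets still need a human.
weak_kb = resolution_rate(25, 35, simple_success=100, medium_success=50)  # 42.5
```

The medium-complexity bucket is where the spread comes from: its share is large, and its success rate is the number most sensitive to knowledge-base and tooling quality.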

What customers actually want

Surveys consistently show:

Fast resolution is the top priority.
Accurate answers are the second.
Feeling heard matters, but less than the first two.
Speaking to a human is far less important than "getting my problem solved."

This is good news for AI support: speed and accuracy are exactly the things AI is good at. The "I want to speak to a human" sentiment usually appears only after the AI has failed once. Get the first AI response right and customers prefer it over waiting in a queue.

The thing customers absolutely hate is the agent loop without escalation — talking to an AI, the AI doesn't solve the problem, the AI keeps trying, and the customer can't get to a human. Build the escalation triggers tight to avoid this.
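A tight escalation trigger for this loop can be as small as a counter plus a phrase check. The sketch below is deliberately crude: the phrase list, the default of one unresolved AI turn, and the `should_hand_off` function are all assumptions, and in production you would more likely detect human requests with the triage model than with substring matching.

```python
HUMAN_REQUEST_PHRASES = (
    "speak to a human", "talk to a human", "real person", "human agent",
)

def should_hand_off(unresolved_ai_turns: int, latest_message: str,
                    max_unresolved_turns: int = 1) -> bool:
    """Hand off when the customer asks for a human, or when the agent has
    already used up its allowed number of unsuccessful replies."""
    text = latest_message.lower()
    if any(phrase in text for phrase in HUMAN_REQUEST_PHRASES):
        return True
    return unresolved_ai_turns >= max_unresolved_turns
```

Call it on every follow-up message in a thread, before the reasoning agent runs, so the customer can never be stuck in a retry loop.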

A few specific patterns

Personalisation matters. "Hi Anna, I see you're on our Pro plan and have been with us since 2023" lands very differently from "Hi Customer." Use the context.

Acknowledge the wait. If a customer waited 30 minutes for a response, the agent's first sentence should acknowledge that. Don't pretend everything is fine.

Quote the customer back. "You mentioned that your import was failing on records with special characters in the company name." Shows the agent actually read the ticket.

End with the next step. "I've issued the refund — you'll see it in 3-5 business days." Not "let me know if you need anything else."

Don't apologise unprompted. "I'm so sorry for the inconvenience" before knowing what happened comes across as insincere. Apologise once, specifically, where warranted.

A worked example

The customer writes:

Hi, I've been trying to log in for three days and it just keeps saying my password is wrong. I'm sure it's the right password — I've used it for two years. I'm starting to think you've been hacked.

A good agent response (after context-gathering):

Hi Anna,

Three days is a long time to be locked out. I've checked your account: you have multi-factor authentication enabled, and I can see ten failed login attempts from your usual IP starting Tuesday morning. The most likely cause is a corrupted MFA code, not a hack — your account hasn't been accessed by anyone else, and your password hasn't been changed.

The fastest fix:

1. I've sent a password reset link to anna@somecompany.fi (just sent now).
2. Click it and set a new password.
3. When you log in, you'll be prompted to re-set up MFA. Please do this on the device you'll actually use.

If anything looks off when you get in (settings changed, missing data), reply here and I'll loop in our security team for a closer look.

— AI Expert Support

That response is direct, specific, acknowledges the customer's worry without melodrama, takes a real action, and ends with a clear next step and escalation path.

The takeaway

A 70% resolution rate is realistic with good design. The model is rarely the bottleneck. The four levers are:

A clean, structured knowledge base.
Solid context-gathering (customer data, history, KB retrieval, similar resolved tickets).
A reasoning agent with clear decision criteria and escalation rules.
Guardrails (allowlists, quality checks, audit logging).

Build those well and your support quality improves while volume per human goes down. Build them poorly and you create a frustration machine.

Most teams in 2026 are either deploying support AI carelessly (and getting bad results) or refusing to deploy it (and missing the productivity gains). The right path is in the middle: deploy carefully, measure, iterate. The good news is that the design patterns are now well-understood, and the failure modes are well-documented enough to avoid.
