agentic-ai-engineering · Rollout Readiness

Support Triage Agent

Generated 2026-05-16 · Skill: deployment-readiness · Command: /agentic-ai-engineering:agentic-ops
Readiness: Caution · Gates passed: 7 / 10 · Open blockers: 2
Core infrastructure and HITL posture are in place, but two blockers must clear before full production launch: adversarial eval coverage (B1) and a circuit breaker for the downstream CRM write tool (B2). Recommend a limited-traffic canary (5%, read-only classification with CRM writes disabled) while the blockers are being resolved.

Production Gate Checklist

Passed
Eval suite in CI
22 golden cases run on every PR; pass rate gate at 95%.
Passed
HITL approval gates defined
High-severity escalations require human confirmation before Tier-2 assignment.
Passed
PII handling validated
Redaction layer tested; GDPR-relevant fields stripped before LLM context.
Passed
Tool permissions scoped
All tools operating at minimum required permission tier (read-only for lookup tools).
Passed
Fallback behavior tested
Tool-failure fallback routes to human queue with full context; tested in staging.
Passed
Rollback procedure documented
Feature flag controls agent routing; toggling it off disables the agent in under 2 minutes.
Passed
Observability instrumented
MELT stack deployed; OpenTelemetry spans on all agent-to-tool calls.
Not Met
Adversarial eval coverage
Prompt injection and adversarial ticket fixtures not yet in eval suite. Blocker B1.
Not Met
CRM tool circuit breaker
No circuit breaker defined for CRM write operations. Blocker B2.
Partial
Load test completed
Tested at 10 concurrent tickets; target production peak is 80. Full load test pending.
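
The eval-suite gate above reduces to a little arithmetic: with 22 golden cases and a 95% threshold, at least 21 cases must pass, so one failure is tolerated (21/22 ≈ 95.5%) and two are not (20/22 ≈ 90.9%). A minimal sketch, assuming the gate is a plain passed/total ratio; the function and constant names are hypothetical and not taken from the actual CI configuration.

```python
GOLDEN_CASES = 22     # from the checklist
PASS_RATE_GATE = 95   # percent, from the checklist


def min_passing(total: int, gate_pct: int) -> int:
    """Smallest number of passing cases with passed / total >= gate_pct / 100.

    Integer ceiling division avoids floating-point edge cases.
    """
    return -(-total * gate_pct // 100)


def gate_passes(passed: int, total: int = GOLDEN_CASES,
                gate_pct: int = PASS_RATE_GATE) -> bool:
    """True when the PR's eval run meets the pass-rate gate."""
    return passed * 100 >= total * gate_pct


if __name__ == "__main__":
    print(min_passing(GOLDEN_CASES, PASS_RATE_GATE))  # minimum passing cases
    print(gate_passes(21), gate_passes(20))
```

One consequence worth noting: adding adversarial fixtures (Blocker B1) changes the total, and with it the number of failures the same 95% gate tolerates.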

HITL Posture

Dimension Setting
Autonomy level L2 Assisted
Approval gates High-severity tickets require human confirmation before Tier-2 assignment; refunds over $500 require explicit approval
Override path Support agent can override any routing decision via the triage UI override button; logged and audited
Escalation trigger Confidence below 0.7, unrecognized category, customer sentiment flagged as high-distress, or any mention of legal/compliance terms
Fallback behavior Route to human queue with full ticket context and agent reasoning summary; max 4-hour SLA
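
The escalation triggers and approval gates in the table above can be sketched as two small predicates. This is an illustration under stated assumptions, not the agent's actual implementation: the `Triage` record, the category set, and the legal-term list are hypothetical placeholders; only the 0.7 confidence floor, the high-severity rule, and the $500 refund threshold come from the table.

```python
from dataclasses import dataclass

CONFIDENCE_FLOOR = 0.7               # from the table
REFUND_APPROVAL_THRESHOLD_USD = 500  # from the table

# Hypothetical placeholders; the real lists are not given in this report.
KNOWN_CATEGORIES = {"billing", "technical", "account"}
LEGAL_TERMS = ("lawsuit", "attorney", "gdpr", "subpoena")


@dataclass
class Triage:
    text: str
    category: str
    confidence: float
    high_distress: bool
    severity: str = "low"
    refund_usd: float = 0.0


def escalation_reasons(t: Triage) -> list[str]:
    """Return every trigger that fires; an empty list means no escalation."""
    reasons = []
    if t.confidence < CONFIDENCE_FLOOR:
        reasons.append("low_confidence")
    if t.category not in KNOWN_CATEGORIES:
        reasons.append("unrecognized_category")
    if t.high_distress:
        reasons.append("high_distress")
    lowered = t.text.lower()
    if any(term in lowered for term in LEGAL_TERMS):
        reasons.append("legal_or_compliance_term")
    return reasons


def needs_human_approval(t: Triage) -> bool:
    """Approval gate: high severity, or a refund strictly over the threshold."""
    return t.severity == "high" or t.refund_usd > REFUND_APPROVAL_THRESHOLD_USD
```

Returning the list of reasons rather than a single boolean keeps the fallback's "agent reasoning summary" easy to populate. Plain substring matching for legal terms is a simplification and would likely need something more robust in practice.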

Observability Readiness

Instrument Status Notes
Distributed tracing (OpenTelemetry) In Place CHAIN, LLM, and TOOL spans instrumented. Trace IDs propagated to CRM and ticketing system calls.
LLM call logging In Place Prompt/completion pairs logged (PII stripped) with model, temperature, and token counts.
Token cost metrics Partial Total tokens per session logged; per-tool-call cost breakdown not yet available. Cost dashboards not yet configured.
Error rate alerting In Place PagerDuty alert on error rate > 5% over 5-minute window; escalates to on-call engineer.
Latency p95 dashboard Partial Overall latency measured; per-span breakdown (model inference vs tool vs network) not yet visible in dashboard.
Session replay Missing Full agent session replay (for post-incident diagnosis) not implemented. Logs available but no replay interface.
Audit trail (HITL actions) In Place All human override and approval actions logged with timestamp, agent ID, and routing decision delta.
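
To make the error-rate alert rule above concrete (error rate > 5% over a 5-minute window), here is a minimal sliding-window sketch. It only illustrates the rule's semantics; it is not the PagerDuty or metrics-backend configuration, which this report does not describe. The class name and the assumption that events arrive in non-decreasing timestamp order are hypothetical.

```python
from collections import deque


class ErrorRateWindow:
    """Tracks call outcomes and reports the error rate over a trailing window."""

    def __init__(self, window_s: float = 300.0, threshold: float = 0.05):
        self.window_s = window_s    # 5 minutes, from the table
        self.threshold = threshold  # 5%, from the table
        self._events = deque()      # (timestamp_seconds, is_error)

    def record(self, ts: float, is_error: bool) -> None:
        # Assumes timestamps are non-decreasing.
        self._events.append((ts, is_error))
        self._evict(ts)

    def _evict(self, now: float) -> None:
        # Drop events that have aged out of the trailing window.
        while self._events and self._events[0][0] <= now - self.window_s:
            self._events.popleft()

    def error_rate(self, now: float) -> float:
        self._evict(now)
        if not self._events:
            return 0.0
        errors = sum(1 for _, is_error in self._events if is_error)
        return errors / len(self._events)

    def should_alert(self, now: float) -> bool:
        # Strictly greater than the threshold, matching "> 5%".
        return self.error_rate(now) > self.threshold
```

A low-traffic canary can make this ratio noisy: at a handful of requests per window, a single error exceeds 5%, so a minimum-sample guard may be worth considering.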

Rollout Risk Register

High
Adversarial ticket injection
A malicious customer could embed instructions in ticket text that override the agent's routing behavior, potentially routing tickets to wrong queues or leaking routing rules.
Add prompt injection detection layer before ticket text reaches agent context; implement adversarial eval suite (Blocker B1).
High
CRM write without circuit breaker
The CRM status-update tool has no circuit breaker. A runaway loop or cascading failure could write incorrect statuses to hundreds of tickets before detection.
Implement circuit breaker on CRM write tool: max 5 writes per session, pause after 3 consecutive errors (Blocker B2).
Medium
Load capacity gap
Peak production load (80 concurrent tickets) is 8x the tested load (10). Performance characteristics at production scale are unknown.
Complete full load test at 80 concurrent tickets before expanding beyond canary; set auto-scaling thresholds.
Medium
Context drift on long ticket threads
Ticket threads with 20+ messages approach context window limits. Agent behavior at window boundary not tested; may produce inconsistent routing decisions.
Implement context summarization for threads exceeding 15 messages; add 3 long-thread eval fixtures.
Medium
Model version drift
Agent is pinned to claude-3-5-sonnet-20241022. If the model is updated or deprecated, routing accuracy may regress silently without re-evaluation.
Keep the model version pinned in config; record the model version in eval metadata; when the model is updated, re-run the eval suite and set a 90-day re-eval reminder.
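
The circuit-breaker mitigation for the CRM write tool (max 5 writes per session, pause after 3 consecutive errors) could look roughly like the sketch below. It is one possible shape under stated assumptions, not the planned B2 implementation: the class and exception names are hypothetical, one breaker instance is assumed per agent session, and failed attempts are assumed to count against the write budget.

```python
class CircuitOpen(Exception):
    """Raised when the breaker refuses a CRM write."""


class CrmWriteBreaker:
    """Per-session guard around a CRM write callable (hypothetical wrapper)."""

    def __init__(self, max_writes_per_session: int = 5,
                 max_consecutive_errors: int = 3):
        self.max_writes = max_writes_per_session
        self.max_errors = max_consecutive_errors
        self.writes = 0
        self.consecutive_errors = 0
        self.paused = False

    def call(self, write_fn, *args, **kwargs):
        if self.paused:
            raise CircuitOpen("paused after consecutive errors")
        if self.writes >= self.max_writes:
            raise CircuitOpen("session write budget exhausted")
        # Count attempts, so failed writes also consume the budget (assumption).
        self.writes += 1
        try:
            result = write_fn(*args, **kwargs)
        except Exception:
            self.consecutive_errors += 1
            if self.consecutive_errors >= self.max_errors:
                self.paused = True
            raise
        self.consecutive_errors = 0  # any success resets the error streak
        return result

    def resume(self) -> None:
        """Manual resume, e.g. after a human has reviewed the failures."""
        self.consecutive_errors = 0
        self.paused = False
```

When the breaker refuses a write, the caller would presumably fall back to the human queue, in line with the tested tool-failure fallback. Requiring an explicit `resume()` rather than a timed retry is a conservative choice for a write path.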

Open Blockers

Blocker
B1: Adversarial eval suite not implemented — prompt injection detection not validated. Agent may misroute tickets with embedded instructions. Blocks production launch.
Owner: ML Engineer (Jane D.) · Target: 2026-05-23
Blocker
B2: CRM write tool circuit breaker absent. Runaway writes possible under error cascade. Blocks production launch for all tickets that trigger CRM updates.
Owner: Backend Engineer (Mark L.) · Target: 2026-05-20

Rollout Recommendation

Deploy to canary at 5% traffic (read-only ticket classification only — disable CRM write tool in canary config) while blockers B1 and B2 are resolved. Monitor error rate, latency p95, and HITL escalation rate during canary. Once B1 and B2 are cleared and load test passes, expand to 25% then full traffic in 1-week increments. Do not enable CRM write tool at any traffic tier until B2 (circuit breaker) is implemented and tested.
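
The recommendation above can be expressed as a small gating function: traffic stays at the 5% canary until B1, B2, and the load test are all cleared, and CRM writes stay disabled until B2 is cleared regardless of tier. A minimal sketch with hypothetical names; it assumes the caller invokes `next_tier` at most once per week to honor the 1-week increments.

```python
from dataclasses import dataclass

TRAFFIC_TIERS = (5, 25, 100)  # percent: canary, partial, full (from the text)


@dataclass(frozen=True)
class GateState:
    b1_cleared: bool        # adversarial eval suite in place
    b2_cleared: bool        # CRM circuit breaker implemented and tested
    load_test_passed: bool  # full load test at 80 concurrent tickets


def crm_write_enabled(state: GateState) -> bool:
    """CRM writes stay off at every tier until B2 is cleared."""
    return state.b2_cleared


def max_traffic_pct(state: GateState) -> int:
    """Ceiling on traffic: canary only until all three conditions hold."""
    if state.b1_cleared and state.b2_cleared and state.load_test_passed:
        return TRAFFIC_TIERS[-1]
    return TRAFFIC_TIERS[0]


def next_tier(current_pct: int, state: GateState) -> int:
    """Next traffic tier, or the current one if expansion is not allowed."""
    ceiling = max_traffic_pct(state)
    for tier in TRAFFIC_TIERS:
        if current_pct < tier <= ceiling:
            return tier
    return current_pct
```

This sketch never reduces traffic on its own; rolling back remains the job of the feature flag described in the checklist.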