Incident Responder Agent
Alert triage and hotfix coordination
The Incident Responder Agent helps teams quickly diagnose and resolve production incidents through automated triage and runbook execution.
What It Does
- Triages alerts - Assesses severity and impact
- Correlates errors - Links related issues across logs
- Executes runbooks - Runs predefined remediation steps
- Coordinates response - Notifies on-call and stakeholders
- Creates hotfix PRs - Generates fixes for known patterns
Severity Levels
| Level | Description | Response Time |
|-------|-------------|---------------|
| SEV1 | Service down, major impact | Immediate |
| SEV2 | Degraded service | 15 minutes |
| SEV3 | Minor issues | 1 hour |
| SEV4 | Low priority | Next business day |
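The response times in the table can be read as deadlines measured from the moment an alert arrives. The following is a minimal sketch of that reading, not the agent's actual implementation: the names `RESPONSE_WINDOWS` and `response_deadline` are hypothetical, and SEV4 is left without a fixed deadline because "next business day" depends on a calendar that this page does not define.

```python
from datetime import datetime, timedelta
from typing import Optional

# Response windows taken from the severity table. SEV4 ("next business day")
# has no fixed duration, so it is represented as None in this sketch.
RESPONSE_WINDOWS = {
    "SEV1": timedelta(0),           # immediate
    "SEV2": timedelta(minutes=15),
    "SEV3": timedelta(hours=1),
    "SEV4": None,
}


def response_deadline(severity: str, received_at: datetime) -> Optional[datetime]:
    """Return the latest acceptable response time, or None if not fixed."""
    window = RESPONSE_WINDOWS[severity.upper()]
    if window is None:
        return None
    return received_at + window
```

For example, a SEV2 alert received at 12:00 would need a response by 12:15 under this reading.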
Configuration
agents:
  - name: incident-responder
    template: incident-responder
    triggers:
      alert:
        sources:
          - pagerduty
          - datadog
          - opsgenie
    config:
      # Auto-acknowledge alerts
      auto_acknowledge: true
      # Runbook location
      runbook_path: ".bamboosnow/runbooks/"
      # Escalation policy
      escalation:
        sev1:
          notify: ["#incidents", "@oncall"]
          timeout: 5m
        sev2:
          notify: ["#incidents"]
          timeout: 15m
      # Known error patterns
      known_errors:
        - pattern: "OOMKilled"
          runbook: "memory-pressure.md"
        - pattern: "connection refused"
          runbook: "database-connectivity.md"
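To illustrate how `known_errors` and `runbook_path` might work together, here is a small sketch that resolves an error message to a runbook file. It assumes patterns are matched as case-insensitive substrings; this page does not say whether the agent treats them as substrings or regular expressions. The function name `find_runbook` is hypothetical.

```python
from pathlib import PurePosixPath
from typing import Optional

# Values copied from the configuration above.
RUNBOOK_PATH = ".bamboosnow/runbooks/"
KNOWN_ERRORS = [
    {"pattern": "OOMKilled", "runbook": "memory-pressure.md"},
    {"pattern": "connection refused", "runbook": "database-connectivity.md"},
]


def find_runbook(message: str) -> Optional[str]:
    """Return the path of the first runbook whose pattern occurs in message.

    Assumption: case-insensitive substring matching, first match wins.
    """
    lowered = message.lower()
    for entry in KNOWN_ERRORS:
        if entry["pattern"].lower() in lowered:
            return str(PurePosixPath(RUNBOOK_PATH) / entry["runbook"])
    return None
```

Under these assumptions, a message containing `OOMKilled` resolves to `.bamboosnow/runbooks/memory-pressure.md`, and a message that matches no pattern returns `None`, which would leave the incident to the escalation policy.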
Incident Response Flow
1. Alert Received - Agent ingests alert from monitoring
2. Triage - Assesses severity, identifies affected systems
3. Runbook Lookup - Finds relevant runbooks for error pattern
4. Initial Response - Executes automated remediation
5. Escalation - Notifies appropriate teams if unresolved
6. Documentation - Creates incident timeline for post-mortem
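The flow above can be sketched as a single function that records each step in a timeline. This is an illustrative outline only: `Incident`, `handle_alert`, and the three callbacks (`lookup_runbook`, `run_runbook`, `notify`) are hypothetical names standing in for whatever the agent does internally, and severity is assumed to be known by the time the function is called.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Incident:
    message: str
    severity: str
    timeline: List[str] = field(default_factory=list)
    resolved: bool = False


def handle_alert(
    incident: Incident,
    lookup_runbook: Callable[[str], Optional[str]],
    run_runbook: Callable[[str], bool],
    notify: Callable[[Incident], None],
) -> Incident:
    """Walk one incident through the flow, appending a timeline entry per step."""
    incident.timeline.append("alert received")
    incident.timeline.append(f"triaged as {incident.severity}")

    runbook = lookup_runbook(incident.message)
    if runbook is None:
        incident.timeline.append("no runbook matched")
    else:
        incident.timeline.append(f"runbook found: {runbook}")
        # run_runbook returns True when the remediation resolved the incident.
        incident.resolved = run_runbook(runbook)
        outcome = "succeeded" if incident.resolved else "failed"
        incident.timeline.append(f"remediation {outcome}")

    # Escalate only when automated remediation did not resolve the incident.
    if not incident.resolved:
        notify(incident)
        incident.timeline.append("escalated")
    return incident
```

The resulting `timeline` list is the raw material for the post-mortem record; passing the steps in as callbacks keeps the sketch independent of any particular alert source or notification channel.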