Automation -- 24/7 Autonomous Agents
Ten AI agents run as Azure DevOps (ADO) pipelines, triggered by webhooks handled in AWS Lambda. They use the same skills as the local workflow and require zero human interaction. Only Azure DevOps is supported today; Jira webhook support is planned.
Event Flow Architecture
From tracker event to agent execution -- every webhook is routed, deduplicated, rate-limited, and dispatched.
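The four stages can be pictured as a short chain of checks in front of the dispatcher. The sketch below is illustrative only: `handleWebhook`, the status strings, and the injected `deps` helpers are hypothetical names, and the real Lambda may order or implement the stages differently.

```javascript
// Hypothetical sketch of the webhook flow: route -> dedupe -> rate-limit -> dispatch.
// Every function and field name here is an assumption made for illustration.
function handleWebhook(event, deps) {
  const agent = deps.route(event); // which agent, if any, handles this event
  if (!agent) return { status: 'ignored' };

  if (deps.isDuplicate(event.id)) return { status: 'duplicate', agent };
  if (!deps.underRateLimit(agent)) return { status: 'rate-limited', agent };

  deps.dispatch(agent, event); // for example, queue the matching pipeline
  return { status: 'dispatched', agent };
}
```

Passing the stages in as `deps` keeps the ordering visible in one place and lets each stage be swapped for a stub when testing.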
Ten Autonomous Agents
Two PR agents are triggered automatically; eight work-item agents are triggered by ADO tags (Jira label support is planned).
PR Agents (automatic — no tags)
PR Reviewer
Triggered by a build validation policy on PR creation. Runs /dx-pr-review, which posts inline comments with severity levels, architecture checks, and standards compliance.
PR Answerer
Triggered by a webhook on PR comments. Runs /dx-pr-answer, which researches the codebase, drafts replies, and applies agreed fixes automatically.
Work Item Agents (tag + KAI-TRIGGER)
How Triggers Work
All work-item agents require the KAI-TRIGGER tag in addition to the agent-specific tag.
Add both tags to the ADO work item to activate the agent (Jira label support is planned).
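The two-tag rule amounts to a simple membership check. The sketch below assumes the tags arrive as ADO's semicolon-separated `System.Tags` string and compares them case-insensitively; `parseTags`, `shouldTrigger`, and the example agent tag are hypothetical.

```javascript
// Sketch of the two-tag rule. ADO exposes work item tags as one
// semicolon-separated string (the System.Tags field).
const TRIGGER_TAG = 'KAI-TRIGGER';

function parseTags(systemTags) {
  return (systemTags || '')
    .split(';')
    .map((t) => t.trim().toLowerCase())
    .filter(Boolean);
}

// agentTag is whatever tag the agent is registered under (hypothetical here).
// Case-insensitive matching is an assumption of this sketch.
function shouldTrigger(systemTags, agentTag) {
  const tags = parseTags(systemTags);
  return tags.includes(TRIGGER_TAG.toLowerCase()) && tags.includes(agentTag.toLowerCase());
}
```

A work item tagged only with the agent tag, or only with KAI-TRIGGER, returns `false` and no agent runs.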
DoR Checker
Validates Definition of Readiness — checks story completeness, acceptance criteria, and technical detail.
DoD Checker
Validates Definition of Done — checks completion criteria, test coverage, and documentation.
DoD Fixer
Fixes DoD gaps found by the checker — adds missing tests, docs, or acceptance criteria evidence.
BugFix
Analyzes the bug, finds affected code, creates a fix branch and PR. Supports cross-repo delegation.
DevAgent
Full implementation from story — researches, plans, codes, tests, and creates PR automatically.
QA Agent
Verifies AEM component implementation with browser automation, screenshots, and accessibility checks.
DOC Agent
Generates wiki documentation from completed story specs. Posts to ADO Wiki or Confluence with authoring guides.
Estimation
Estimates story points with detailed reasoning based on codebase complexity analysis.
Setup Flow
Two paths: full hub setup (7 steps) or lightweight consumer setup (3 steps).
Hub (Full Setup)
/auto-init → /auto-provision → /auto-pipelines → /auto-deploy → /auto-lambda-env → /auto-webhooks → /auto-alarms
Creates AWS resources, deploys Lambdas, configures all 10 pipelines, sets up monitoring.
Consumer (Lightweight)
/auto-init → /auto-pipelines (6 only) → /auto-webhooks (PR only)
No AWS provisioning. The consumer uses the hub’s Lambda and must be registered in the hub’s pipeline maps.
Cross-Repo Delegation
Agents automatically hand off work to the correct repository when the fix lives elsewhere.
How It Works
BugFix starts in repo A → triage discovers the fix belongs in repo B → writes
delegate.json → pipeline reads it → queues the matching pipeline in repo B.
The receiving agent picks up the work with full context from the original ticket.
BugFix
Frontend bug traced to a missing Sling Model field? Delegates to Brand-Backend automatically.
DevAgent
Story requires backend + frontend changes? DevAgent delegates the backend portion to the correct repo.
DoD Fixer
DoD gap requires a change in another repo? The DoD Fixer delegates to the repo that owns the affected code.
Key Environment Variables
DX_PIPELINE_MODE — tells the agent it is running in pipeline context (enables delegation).
CROSS_REPO_PIPELINE_MAP — maps repo names to pipeline IDs for cross-repo queuing.
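To make the hand-off concrete, here is a minimal sketch of how a pipeline step might resolve the target pipeline. The text above does not specify any formats, so the following are assumptions: CROSS_REPO_PIPELINE_MAP holds a JSON object of repo name to pipeline ID, DX_PIPELINE_MODE is the string "true" inside pipelines, and delegate.json carries a `targetRepo` field. `resolveDelegation` is a hypothetical helper.

```javascript
// Sketch: decide which pipeline to queue for a delegation request.
// The env value formats and the `targetRepo` field are assumptions, not
// confirmed by the documentation.
function resolveDelegation(env, delegate) {
  if (env.DX_PIPELINE_MODE !== 'true') {
    return { queue: false, reason: 'not running in pipeline mode' };
  }
  let map;
  try {
    map = JSON.parse(env.CROSS_REPO_PIPELINE_MAP || '{}');
  } catch (err) {
    return { queue: false, reason: 'CROSS_REPO_PIPELINE_MAP is not valid JSON' };
  }
  if (map === null || typeof map !== 'object') {
    return { queue: false, reason: 'CROSS_REPO_PIPELINE_MAP is not an object' };
  }
  if (!Object.prototype.hasOwnProperty.call(map, delegate.targetRepo)) {
    return { queue: false, reason: `no pipeline mapped for ${delegate.targetRepo}` };
  }
  return { queue: true, pipelineId: map[delegate.targetRepo] };
}
```

Returning a reason instead of throwing lets the calling step log why a delegation was skipped and carry on with the rest of the run.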
Hub vs Consumer
The hub owns shared infrastructure. Consumers are lightweight, relying on the hub for AWS resources.
| Capability | Hub | Consumer |
|---|---|---|
| AWS resources | Creates and owns | Uses hub’s |
| Pipelines | 10 (all agents) | 6 (PR Review, PR Answer, Eval, DevAgent, BugFix, DoD-Fix) |
| WI hooks | Creates (tag-filtered) | N/A |
| PR Answer hook | Creates (repo-scoped) | Creates (points to hub’s Lambda) |
| Lambda deployment | /auto-deploy | Skipped |
| CloudWatch alarms | /auto-alarms | Skipped |
| Cross-repo delegation | Sends to consumers | Receives from hub |
Safety & Governance
Multiple layers of protection keep autonomous agents in check -- no runaway execution, no surprise bills.
Rate Limiting
Each agent type is capped at 20-30 runs per day. This prevents runaway execution from webhook storms or misconfigured triggers.
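A daily cap per agent type can be sketched as a counter keyed by agent and calendar day. This is an illustration, not the actual implementation: the production counters presumably live in shared storage, while the sketch uses an in-memory Map, and `createDailyLimiter` and its default limit are hypothetical.

```javascript
// Sketch of a per-agent daily run counter. An in-memory Map stands in for
// whatever shared store the real system uses.
function createDailyLimiter(limits, defaultLimit = 20) {
  const counts = new Map(); // key: "<agent>:<YYYY-MM-DD>" -> runs so far

  return function tryAcquire(agent, now = new Date()) {
    const day = now.toISOString().slice(0, 10); // UTC calendar day
    const key = `${agent}:${day}`;
    const used = counts.get(key) || 0;
    const limit = limits[agent] ?? defaultLimit;
    if (used >= limit) return false; // over the cap: drop or defer the run
    counts.set(key, used + 1);
    return true;
  };
}
```

Because the day is part of the key, the counter resets naturally at midnight UTC without a separate cleanup step.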
Token Budget
Monthly cap with three states: normal (full execution) → suggest-only (analysis only, no code changes) → halted (all agents stopped).
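The three states can be expressed as a function of tokens used against the monthly cap. The point at which agents drop to suggest-only is not stated above, so the 80% ratio in this sketch is an assumed value, and `budgetState` is a hypothetical helper.

```javascript
// Sketch of the three budget states. The suggest-only threshold (80% of the
// cap by default) is an assumption, not a documented value.
function budgetState(tokensUsed, monthlyCap, suggestOnlyRatio = 0.8) {
  if (tokensUsed >= monthlyCap) return 'halted';
  if (tokensUsed >= monthlyCap * suggestOnlyRatio) return 'suggest-only';
  return 'normal';
}
```

An agent would check this state before starting: run fully in `normal`, skip code changes in `suggest-only`, and exit immediately in `halted`.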
Deduplication
Processed events are recorded in DynamoDB with a 1-hour TTL. Duplicate webhook events arriving within that window are silently dropped, so there is no double-processing and no wasted tokens.
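One common way to get this behaviour is a conditional write: the put succeeds only the first time an event ID is seen, and a failed condition marks the event as a duplicate. The sketch below only builds the request parameters (in the plain-value style used by the DynamoDB document client) and does not call AWS. The table name, attribute names, and `buildDedupPut` are hypothetical.

```javascript
// Sketch: parameters for a conditional DynamoDB put used as a dedup guard.
// Table and attribute names are hypothetical.
const DEDUP_TTL_SECONDS = 60 * 60; // 1 hour

function buildDedupPut(eventId, nowMs = Date.now()) {
  const nowSec = Math.floor(nowMs / 1000); // DynamoDB TTL uses epoch seconds
  return {
    TableName: 'webhook-dedup', // hypothetical table name
    Item: { eventId, expiresAt: nowSec + DEDUP_TTL_SECONDS },
    // DynamoDB deletes expired items lazily, so also allow overwriting a
    // record whose TTL has already passed.
    ConditionExpression: 'attribute_not_exists(eventId) OR expiresAt < :now',
    ExpressionAttributeValues: { ':now': nowSec },
  };
}
```

When the condition fails, DynamoDB rejects the write with a `ConditionalCheckFailedException`, which the handler can treat as "already processed" and return early.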
Decision Journal
Every agent decision logged with reasoning, evidence, and outcome. Full audit trail for every autonomous action.
Execution Bundles in S3
Every pipeline run saves its full execution context (specs, diffs, decisions) to S3 with 90-day retention.
CloudWatch Alarms
DLQ depth, Lambda errors, and throttle alarms. SNS notifications to the team when thresholds are breached.
Human Control
Interactive mode for debugging, policy gates that restrict what each agent role can do, and tag-based activation — no agent runs unless explicitly triggered.
Pipeline Agent
pipeline-agent.js -- the universal runner that powers all 10 agents using the Claude Agent SDK.
Streaming Execution
Uses includePartialMessages for real-time streaming. Agent output visible in pipeline logs as it happens.
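As a rough sketch of what consuming the stream could look like, the code below pulls text out of partial messages and writes it to the log as it arrives. It assumes partial messages have the shape `{ type: 'stream_event', event }`, where `event` is a raw streaming event carrying text deltas; check that shape against the SDK version in use. `extractTextDelta` and `pipeToLog` are hypothetical helpers, and `stream` is any async iterable of messages.

```javascript
// Sketch: pull printable text out of partial messages as they arrive.
// The message shape is an assumption; verify it against the SDK in use.
function extractTextDelta(message) {
  if (!message || message.type !== 'stream_event') return '';
  const event = message.event;
  if (
    event &&
    event.type === 'content_block_delta' &&
    event.delta &&
    event.delta.type === 'text_delta'
  ) {
    return event.delta.text;
  }
  return '';
}

// `stream` is any async iterable of messages, for example the value returned
// by the SDK's query call when includePartialMessages is enabled.
async function pipeToLog(stream, write = (s) => process.stdout.write(s)) {
  for await (const message of stream) {
    const text = extractTextDelta(message);
    if (text) write(text);
  }
}
```

Keeping the extraction in a pure function makes it easy to test without a live agent session.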
60s Heartbeat
Sends a heartbeat message every 60 seconds to keep the ADO pipeline alive during long-running agent operations.
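A heartbeat of this kind is typically a timer that prints a log line while the agent works. The following is a minimal sketch under that assumption; `startHeartbeat` and the message text are hypothetical.

```javascript
// Sketch: emit a log line on a fixed interval while a long task runs.
// The message text and log function are illustrative.
function startHeartbeat(log = console.log, intervalMs = 60_000) {
  const startedAt = Date.now();
  const timer = setInterval(() => {
    const elapsedSec = Math.round((Date.now() - startedAt) / 1000);
    log(`[heartbeat] agent still running (${elapsedSec}s elapsed)`);
  }, intervalMs);
  timer.unref(); // do not keep the Node.js process alive just for the heartbeat
  return () => clearInterval(timer); // call this to stop the heartbeat
}
```

Calling the returned function in a `finally` block ensures the timer is cleared whether the agent finishes, fails, or times out.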
Configurable
MAX_TURNS, TIMEOUT_MINUTES, ALLOWED_TOOLS, CLAUDE_MODEL — each agent pipeline sets its own parameters.
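Reading these four parameters might look like the sketch below. The variable names come from the text above; the default values, the comma-separated ALLOWED_TOOLS format, and the `loadAgentConfig` helper are assumptions made for illustration.

```javascript
// Sketch: read per-pipeline settings from environment variables.
// Defaults and the comma-separated ALLOWED_TOOLS format are assumptions.
function loadAgentConfig(env = process.env) {
  const toInt = (value, fallback) => {
    const n = Number.parseInt(value, 10);
    return Number.isFinite(n) && n > 0 ? n : fallback;
  };
  return {
    maxTurns: toInt(env.MAX_TURNS, 50),
    timeoutMinutes: toInt(env.TIMEOUT_MINUTES, 30),
    allowedTools: (env.ALLOWED_TOOLS || '')
      .split(',')
      .map((t) => t.trim())
      .filter(Boolean),
    model: env.CLAUDE_MODEL || null, // null: let the runner pick its default
  };
}
```

Falling back to defaults on missing or malformed values keeps a misconfigured pipeline variable from crashing the runner at startup.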
Plugin Discovery
Reads PLUGIN_BASE_DIR to discover and load dx-core, dx-aem, and dx-automation plugins at runtime.
Skill Dispatch
Each pipeline invokes a specific skill command. The runner translates pipeline variables into the correct /dx-* invocation.
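The translation step can be pictured as a lookup from agent type to skill command. In the sketch below, only /dx-pr-review and /dx-pr-answer are taken from this document; the AGENT_TYPE and TARGET_ID variable names, the agent type keys, and the argument format are hypothetical.

```javascript
// Sketch: turn pipeline variables into a slash-command prompt.
// Variable names and the argument format are assumptions; only the two
// skill names below appear in this document.
const SKILL_BY_AGENT = new Map([
  ['pr-review', '/dx-pr-review'],
  ['pr-answer', '/dx-pr-answer'],
]);

function buildSkillInvocation(vars) {
  const skill = SKILL_BY_AGENT.get(vars.AGENT_TYPE);
  if (!skill) {
    throw new Error(`No skill registered for agent type "${vars.AGENT_TYPE}"`);
  }
  return vars.TARGET_ID ? `${skill} ${vars.TARGET_ID}` : skill;
}
```

Failing loudly on an unknown agent type surfaces a misconfigured pipeline in the first log lines instead of letting the agent start with an empty prompt.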
Graceful Shutdown
On timeout or error, the agent writes a summary of progress and posts partial results back to ADO before exiting.
Operations & Monitoring
Four skills for monitoring, diagnosing, evaluating, and testing the automation layer.
/auto-status
Live dashboard: DLQ depth, token budget remaining, rate limit counters, Lambda invocation metrics, and pipeline run history.
/auto-doctor
Health check: file integrity, pipeline configuration, Lambda state, webhook connectivity, DynamoDB table status.
/auto-eval
Quality evaluation: runs agents against known test cases and scores output quality, accuracy, and adherence to standards.
/auto-test
Dry-run against real data: executes the full agent flow without making changes, validates routing and pre-flight checks.