Automation -- 24/7 Autonomous Agents

11 AI agents run as ADO pipelines, triggered by AWS Lambda webhooks. They use the same skills as the local workflow and need no human interaction. Azure DevOps is supported today; Jira webhook support is planned.

Event Flow

Event Flow Architecture

From tracker event to agent execution -- every webhook is routed, deduplicated, rate-limited, and dispatched.

Tracker Events (ADO): WI updated + tag, PR comment, PR created

Service Hooks: WI hook, PR Answer hook, PR Review policy

API Gateway: AWS entry point

Lambda Routers: deduplicate, rate limit, token budget

ADO Pipelines: 11 agent pipelines

Agents

Eleven Autonomous Agents

Two PR agents triggered automatically, eight work-item agents triggered by ADO tags, and SimpleAgent triggered by a comment via an Azure-native Service Hook (Jira label support planned).

PR Agents (automatic — no tags)

PR Reviewer

Build validation policy on PR creation. Runs /dx-pr-review — posts inline comments with severity levels, architecture checks, and standards compliance.

PR created build policy

PR Answerer

Webhook on PR comment. Runs /dx-pr-answer — researches codebase, drafts replies, and applies agreed fixes automatically.

PR comment webhook

Work Item Agents (tag + KAI-TRIGGER)

How Triggers Work

Every work-item agent requires the KAI-TRIGGER tag in addition to its agent-specific tag. Add both tags to the ADO work item to activate the agent; the corresponding Jira labels are planned, not yet supported.
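
The two-tag rule can be sketched as a small check. This is an illustration only, not the router's actual code: it assumes the work item's tags arrive as one semicolon-separated string, that tag matching is case-insensitive, and that an item carrying more than one agent tag is skipped. The function name resolve_agent is hypothetical.

```python
from typing import Optional

TRIGGER_TAG = "KAI-TRIGGER"

# Agent-specific tags as listed in this section.
AGENT_TAGS = (
    "KAI-DOR-AUTOMATION",
    "KAI-DOD-AUTOMATION",
    "KAI-DOD-FIX-AUTOMATION",
    "KAI-BUGFIX-AUTOMATION",
    "KAI-DEV-AUTOMATION",
    "KAI-QA-AUTOMATION",
    "KAI-DOC-AUTOMATION",
    "KAI-ESTIMATION-AUTOMATION",
)


def resolve_agent(tags_field: str) -> Optional[str]:
    """Return the agent tag to run, or None if the work item is not armed."""
    # Assumption: tags come as "tag1; tag2; ..." and compare case-insensitively.
    tags = {t.strip().upper() for t in tags_field.split(";") if t.strip()}
    if TRIGGER_TAG not in tags:
        return None  # the agent tag alone never activates anything
    matches = [t for t in AGENT_TAGS if t in tags]
    # Assumption: exactly one agent tag per run; ambiguous items are skipped.
    return matches[0] if len(matches) == 1 else None
```

With this sketch, "KAI-TRIGGER; KAI-BUGFIX-AUTOMATION" resolves to the BugFix agent, while either tag on its own resolves to nothing.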

DoR Checker

Validates Definition of Readiness — checks story completeness, acceptance criteria, and technical detail.

KAI-DOR-AUTOMATION /dx-agent-re

DoD Checker

Validates Definition of Done — checks completion criteria, test coverage, and documentation.

KAI-DOD-AUTOMATION /dx-req-dod

DoD Fixer

Fixes DoD gaps found by the checker — adds missing tests, docs, or acceptance criteria evidence.

KAI-DOD-FIX-AUTOMATION /dx-req-dod

BugFix

Analyzes the bug, finds affected code, creates a fix branch and PR. Supports cross-repo delegation.

KAI-BUGFIX-AUTOMATION /dx-bug-all

DevAgent

Full implementation from story — researches, plans, codes, tests, and creates PR automatically.

KAI-DEV-AUTOMATION /dx-agent-all

QA Agent

Verifies AEM component implementation with browser automation, screenshots, and accessibility checks.

KAI-QA-AUTOMATION /aem-qa

DOC Agent

Generates wiki documentation from completed story specs. Posts to ADO Wiki or Confluence with authoring guides.

KAI-DOC-AUTOMATION /dx-doc-gen + /aem-doc-gen

Estimation

Estimates story points with detailed reasoning based on codebase complexity analysis.

KAI-ESTIMATION-AUTOMATION estimation

Comment-triggered (Azure-native — no Lambda)

How SimpleAgent Triggers

A comment containing @kai-simple on a work item fires an ADO Service Hook (event: work item commented on, filter: comment contains the token) that posts to an Incoming WebHook service connection the pipeline listens on. No Lambda, no API Gateway. The same event starts the first run and resumes a blocked one — Phase 0 decides which.

SimpleAgent

Applies a small AEM change (a11y label, color, spacing, copy) along one of two paths: authoring (JCR write) or code (file edits → PR). 9 confidence gates, strict scope limits, resumable recovery.

@kai-simple /dx-simple

Setup

Setup Flow

Two paths: full hub setup (7 steps) or lightweight consumer setup (3 steps).

Hub (Full Setup)

/auto-init → /auto-provision → /auto-pipelines → /auto-deploy → /auto-lambda-env → /auto-webhooks → /auto-alarms

Creates AWS resources, deploys Lambdas, configures all 11 pipelines, sets up monitoring.

Consumer (Lightweight)

/auto-init → /auto-pipelines (6 only) → /auto-webhooks (PR only)

No AWS provisioning. Uses the hub’s Lambda. The consumer is registered in the hub’s pipeline maps.

Cross-Repo

KAI-HUB Multi-Repo Routing

A central router resolves which repos a work item touches and fans one dual-mode worker pipeline out per repo. No peer-to-peer pipeline delegation.

How It Works

A human comments @kai-<agent> on a work item. The KAI-HUB router (ado-cli-hub.yml) parses the tag, deduplicates the event, and runs /dx-discover-repos to resolve the touched repos, then queues the agent’s single worker pipeline once per repo. Each worker clones its target repo dynamically from repos.json, so it picks up that repo’s full context.

Central Router

One webhook entry point. Parses @kai-<agent>, looks up the worker in agents.json, and fans it out per resolved repo (cross-project, Basic PAT auth).

Dual-Mode Workers

Each worker keeps its own resources.webhooks for single-repo direct triggers and also accepts a targetRepo param for hub mode — clone-and-work on any registered repo.

Two Registries

repos.json (alias → clone metadata) and agents.json (tag → workerPipelineId) are the single source of truth. Adding a repo is a registry edit, not a per-pipeline map update.
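
A rough sketch of how the two registries could drive the fan-out. The registry schemas are not shown in this document, so the JSON shapes below, the aliases, the pipeline ID, and the function names parse_agent and plan_fan_out are all hypothetical. The list of touched aliases stands in for the output of /dx-discover-repos.

```python
import json
import re
from typing import Optional

# Hypothetical registry contents; the real repos.json / agents.json may differ.
REPOS = json.loads(
    '{"web": {"project": "ProjA", "repo": "web-frontend"},'
    ' "api": {"project": "ProjB", "repo": "api-backend"}}'
)
AGENTS = json.loads('{"kai-bugfix": {"workerPipelineId": 101}}')

MENTION = re.compile(r"@(kai-[a-z0-9-]+)")


def parse_agent(comment: str) -> Optional[str]:
    """Extract the first @kai-<agent> mention, lowercased, or None."""
    match = MENTION.search(comment.lower())
    return match.group(1) if match else None


def plan_fan_out(comment: str, touched_aliases: list) -> list:
    """Return one queued-run description per registered, touched repo."""
    agent = parse_agent(comment)
    if agent is None or agent not in AGENTS:
        return []  # no recognised agent mention, nothing to queue
    pipeline_id = AGENTS[agent]["workerPipelineId"]
    # The same worker pipeline is queued once per repo; unregistered
    # aliases are dropped rather than guessed at.
    return [
        {"pipelineId": pipeline_id, "targetRepo": alias}
        for alias in touched_aliases
        if alias in REPOS
    ]
```

In this sketch, registering a new repo only means adding an alias to the repos registry; plan_fan_out needs no change.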

Key Environment Variables

DX_PIPELINE_MODE marks a pipeline run and enables pipeline-only behaviors. The targetRepo parameter is empty for single-repo direct mode and is set by the hub for fan-out. See the Cross-Repo page for the full model.

Comparison

Hub vs Consumer

The hub owns shared infrastructure. Consumers are lightweight, relying on the hub for AWS resources.

Capability	Hub	Consumer
AWS resources	Creates and owns	Uses hub’s
Pipelines	11 (all agents)	6 (PR Review, PR Answer, Eval, DevAgent, BugFix, DoD-Fix)
WI hooks	Creates (tag-filtered)	N/A
PR Answer hook	Creates (repo-scoped)	Creates (points to hub’s Lambda)
Lambda deployment	/auto-deploy	Skipped
CloudWatch alarms	/auto-alarms	Skipped
Cross-repo delegation	Sends to consumers	Receives from hub

Safety

Safety & Governance

Multiple layers of protection keep autonomous agents in check -- no runaway execution, no surprise bills.

Rate Limiting

Max 20-30 runs per day per agent type. Prevents runaway execution from webhook storms or misconfigured triggers.
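
A minimal sketch of a per-agent daily cap, using an in-memory counter as a stand-in for whatever store the Lambda routers actually use (not specified here). The class name, the default limit of 20 (the lower end of the quoted range), and the calendar-day window are assumptions.

```python
from collections import defaultdict
from datetime import date


class DailyRateLimiter:
    """Counts runs per (agent type, calendar day) and refuses over the limit."""

    def __init__(self, limit: int = 20):  # assumption: lower end of 20-30
        self.limit = limit
        self._counts = defaultdict(int)  # (agent_type, ISO day) -> runs so far

    def allow(self, agent_type: str, today: date) -> bool:
        key = (agent_type, today.isoformat())
        if self._counts[key] >= self.limit:
            return False  # over budget for today: drop the trigger
        self._counts[key] += 1
        return True
```

Because the day is part of the key, counters reset naturally at the date boundary, and one noisy agent type cannot consume another's allowance.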

Token Budget

Monthly cap with three states: normal (full execution) → suggest-only (analysis only, no code changes) → halted (all agents stopped).
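
The three budget states can be expressed as a pure function of usage. The thresholds are not given in this document; the 80% switch to suggest-only below is an assumed value for illustration, as is the function name.

```python
def budget_state(tokens_used: int, monthly_cap: int,
                 suggest_only_ratio: float = 0.8) -> str:
    """Map month-to-date token usage onto normal / suggest-only / halted."""
    if monthly_cap <= 0:
        raise ValueError("monthly_cap must be positive")
    if tokens_used >= monthly_cap:
        return "halted"  # all agents stopped
    if tokens_used >= suggest_only_ratio * monthly_cap:
        return "suggest-only"  # analysis only, no code changes
    return "normal"  # full execution
```

For example, with a cap of 1000 tokens, usage of 799 stays normal, 800 switches to suggest-only, and 1000 halts.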

Deduplication

1-hour DynamoDB TTL. Duplicate webhook events are silently dropped — no double-processing, no wasted tokens.
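
The deduplication idea, sketched with an in-memory dictionary in place of the DynamoDB table. The class and method names are hypothetical, and the event ID is assumed to be a stable identifier taken from the webhook payload. The expiry is checked on read because DynamoDB's TTL cleanup is asynchronous, so an expired item may still be present for a while.

```python
import time

DEDUP_TTL_SECONDS = 3600  # the 1-hour window described above


class DedupStore:
    """Remembers event IDs for a fixed window; in-memory stand-in for a table."""

    def __init__(self, ttl: int = DEDUP_TTL_SECONDS, clock=time.time):
        self.ttl = ttl
        self.clock = clock  # injectable so tests do not need to sleep
        self._expires = {}  # event_id -> expiry timestamp

    def first_seen(self, event_id: str) -> bool:
        """True if the event should be processed, False if it is a duplicate."""
        now = self.clock()
        expiry = self._expires.get(event_id)
        if expiry is not None and expiry > now:
            return False  # still inside the window: drop silently
        self._expires[event_id] = now + self.ttl
        return True
```

A real implementation would need the check and the write to be one atomic operation so that two concurrent deliveries cannot both pass.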

Decision Journal

Every agent decision logged with reasoning, evidence, and outcome. Full audit trail for every autonomous action.

Execution Bundles in S3

Every pipeline run saves its full execution context (specs, diffs, decisions) to S3 with 90-day retention.

CloudWatch Alarms

DLQ depth, Lambda errors, and throttle alarms. SNS notifications to the team when thresholds are breached.

Human Control

Interactive mode for debugging, policy gates that restrict what each agent role can do, and tag-based activation — no agent runs unless explicitly triggered.

Runtime

Pipeline Agent

pipeline-agent.js -- the universal runner that powers all 11 agents using the Claude Agent SDK.

Streaming Execution

Uses includePartialMessages for real-time streaming. Agent output visible in pipeline logs as it happens.

60s Heartbeat

Sends a heartbeat message every 60 seconds to keep the ADO pipeline alive during long-running agent operations.

Configurable

MAX_TURNS, TIMEOUT_MINUTES, ALLOWED_TOOLS, CLAUDE_MODEL — each agent pipeline sets its own parameters.
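
One way the four settings could be read from pipeline variables. The runner itself is JavaScript; Python is used here only for illustration. The default values, the comma-separated format of ALLOWED_TOOLS, and the names RunnerConfig and load_config are assumptions.

```python
from dataclasses import dataclass
from typing import Mapping, Tuple


@dataclass(frozen=True)
class RunnerConfig:
    max_turns: int
    timeout_minutes: int
    allowed_tools: Tuple[str, ...]
    model: str


def load_config(env: Mapping[str, str]) -> RunnerConfig:
    """Build a config from environment-style variables (pass os.environ)."""
    # Assumption: ALLOWED_TOOLS is a comma-separated list of tool names.
    tools = tuple(
        t.strip() for t in env.get("ALLOWED_TOOLS", "").split(",") if t.strip()
    )
    return RunnerConfig(
        max_turns=int(env.get("MAX_TURNS", "50")),  # assumed default
        timeout_minutes=int(env.get("TIMEOUT_MINUTES", "60")),  # assumed default
        allowed_tools=tools,
        model=env["CLAUDE_MODEL"],  # required: no default model is guessed
    )
```

Taking a mapping instead of reading os.environ directly keeps the function easy to test and makes each pipeline's overrides explicit.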

Plugin Discovery

Reads PLUGIN_BASE_DIR to discover and load dx-core, dx-aem, and dx-automation plugins at runtime.

Skill Dispatch

Each pipeline invokes a specific skill command. The runner translates pipeline variables into the correct /dx-* invocation.
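
A small sketch of the translation step. The tag-to-skill pairs are taken from the agent list in this document, but the variable names AGENT_TAG and WORK_ITEM_ID and the "skill followed by work item ID" argument shape are assumptions, not the runner's documented interface.

```python
from typing import Mapping

# Tag-to-skill pairs copied from the agent list in this document.
SKILLS = {
    "KAI-DOD-AUTOMATION": "/dx-req-dod",
    "KAI-BUGFIX-AUTOMATION": "/dx-bug-all",
    "KAI-DEV-AUTOMATION": "/dx-agent-all",
}


def build_invocation(variables: Mapping[str, str]) -> str:
    """Turn pipeline variables into a single skill invocation string."""
    skill = SKILLS[variables["AGENT_TAG"]]  # KeyError for unknown tags
    # Parsing as int rejects anything that is not a plain numeric ID.
    work_item_id = int(variables["WORK_ITEM_ID"])
    return f"{skill} {work_item_id}"
```

Failing loudly on an unknown tag or a non-numeric ID keeps a misconfigured pipeline from invoking the wrong skill.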

Graceful Shutdown

On timeout or error, the agent writes a summary of progress and posts partial results back to ADO before exiting.

Operations

Operations & Monitoring

Four skills for monitoring, diagnosing, evaluating, and testing the automation layer.

/auto-status

Live dashboard: DLQ depth, token budget remaining, rate limit counters, Lambda invocation metrics, and pipeline run history.

monitoring

/auto-doctor

Health check: file integrity, pipeline configuration, Lambda state, webhook connectivity, DynamoDB table status.

diagnosis

/auto-eval

Quality evaluation: runs agents against known test cases and scores output quality, accuracy, and adherence to standards.

evaluation

/auto-test

Dry-run against real data: executes the full agent flow without making changes, validates routing and pre-flight checks.

testing