Automation -- 24/7 Autonomous Agents

Ten AI agents run as Azure DevOps (ADO) pipelines, triggered by webhooks routed through AWS Lambda. They use the same skills as the local workflow and require no human interaction. Only Azure DevOps is supported today; Jira webhook support is planned.

Event Flow

Event Flow Architecture

From tracker event to agent execution -- every webhook is routed, deduplicated, rate-limited, and dispatched.

Tracker Events (ADO): WI updated + tag, PR comment, PR created
→ Service Hooks: WI hook, PR Answer hook, PR Review policy
→ API Gateway: AWS entry point
→ Lambda Routers: deduplicate, rate limit, token budget
→ ADO Pipelines: 10 agent pipelines
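
The order of checks inside a router can be sketched as follows. This is a minimal illustration, not the actual Lambda code: `routeEvent`, the guard names, and the result shape are all hypothetical, and the three guards are injected as plain functions so the sketch runs without AWS.

```javascript
// Hypothetical sketch of a router's decision order:
// deduplicate -> rate limit -> token budget -> dispatch.
// Function names and the result shape are assumptions for illustration.
function routeEvent(event, guards) {
  if (guards.isDuplicate(event)) {
    return { action: "drop", reason: "duplicate" };
  }
  if (guards.isRateLimited(event.agent)) {
    return { action: "drop", reason: "rate-limited" };
  }
  const budget = guards.budgetState(); // "normal" | "suggest-only" | "halted"
  if (budget === "halted") {
    return { action: "drop", reason: "budget-halted" };
  }
  // In suggest-only mode the agent still runs, but without code changes.
  return { action: "dispatch", agent: event.agent, mode: budget };
}

// Stub guards that let everything through.
const guards = {
  isDuplicate: () => false,
  isRateLimited: () => false,
  budgetState: () => "normal",
};

const decision = routeEvent({ id: "evt-1", agent: "BugFix" }, guards);
console.log(decision.action, decision.mode);
```

Running the cheap duplicate check first means a webhook storm is rejected before it touches the rate-limit counters or the budget.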

Agents

Ten Autonomous Agents

Two PR agents triggered automatically, eight work-item agents triggered by ADO tags (Jira label support planned).

PR Agents (automatic — no tags)

PR Reviewer

Build validation policy on PR creation. Runs /dx-pr-review — posts inline comments with severity levels, architecture checks, and standards compliance.

PR created build policy

PR Answerer

Webhook on PR comment. Runs /dx-pr-answer — researches codebase, drafts replies, and applies agreed fixes automatically.

PR comment webhook

Work Item Agents (tag + KAI-TRIGGER)

How Triggers Work

All work-item agents require the KAI-TRIGGER tag in addition to the agent-specific tag. Add both tags to the ADO work item to activate the agent. Jira labels are planned to work the same way once Jira support is available.

DoR Checker

Validates Definition of Readiness — checks story completeness, acceptance criteria, and technical detail.

KAI-DOR-AUTOMATION /dx-agent-re

DoD Checker

Validates Definition of Done — checks completion criteria, test coverage, and documentation.

KAI-DOD-AUTOMATION /dx-req-dod

DoD Fixer

Fixes DoD gaps found by the checker — adds missing tests, docs, or acceptance criteria evidence.

KAI-DOD-FIX-AUTOMATION /dx-req-dod

BugFix

Analyzes the bug, finds affected code, creates a fix branch and PR. Supports cross-repo delegation.

KAI-BUGFIX-AUTOMATION /dx-bug-all

DevAgent

Full implementation from story — researches, plans, codes, tests, and creates PR automatically.

KAI-DEV-AUTOMATION /dx-agent-all

QA Agent

Verifies AEM component implementation with browser automation, screenshots, and accessibility checks.

KAI-QA-AUTOMATION /aem-qa

DOC Agent

Generates wiki documentation from completed story specs. Posts to ADO Wiki or Confluence with authoring guides.

KAI-DOC-AUTOMATION /dx-doc-gen + /aem-doc-gen

Estimation

Estimates story points with detailed reasoning based on codebase complexity analysis.

KAI-ESTIMATION-AUTOMATION estimation
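
The two-tag rule can be sketched as a small matching function. The tag names are copied from the agent list above; everything else is an assumption: `matchAgentTags` is a hypothetical helper, matching is assumed to be case-sensitive, and the input is assumed to be the semicolon-separated tag string that ADO stores on a work item.

```javascript
// Tag names come from the agent list above; the helper itself is hypothetical.
const TRIGGER_TAG = "KAI-TRIGGER";
const AGENT_TAGS = new Set([
  "KAI-DOR-AUTOMATION",
  "KAI-DOD-AUTOMATION",
  "KAI-DOD-FIX-AUTOMATION",
  "KAI-BUGFIX-AUTOMATION",
  "KAI-DEV-AUTOMATION",
  "KAI-QA-AUTOMATION",
  "KAI-DOC-AUTOMATION",
  "KAI-ESTIMATION-AUTOMATION",
]);

// Returns the agent tags to act on, or [] when KAI-TRIGGER is missing.
// Assumption: tags arrive as one string such as "tag-a; tag-b".
function matchAgentTags(tagString) {
  const tags = (tagString || "")
    .split(";")
    .map((tag) => tag.trim())
    .filter(Boolean);
  if (!tags.includes(TRIGGER_TAG)) {
    return [];
  }
  return tags.filter((tag) => AGENT_TAGS.has(tag));
}

console.log(matchAgentTags("KAI-TRIGGER; KAI-BUGFIX-AUTOMATION; frontend"));
// -> [ 'KAI-BUGFIX-AUTOMATION' ]
console.log(matchAgentTags("KAI-BUGFIX-AUTOMATION"));
// -> [] (agent tag alone is not enough)
```
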

Setup

Setup Flow

Two paths: full hub setup (7 steps) or lightweight consumer setup (3 steps).

Hub (Full Setup)

/auto-init → /auto-provision → /auto-pipelines → /auto-deploy → /auto-lambda-env → /auto-webhooks → /auto-alarms

Creates AWS resources, deploys Lambdas, configures all 10 pipelines, sets up monitoring.

Consumer (Lightweight)

/auto-init → /auto-pipelines (6 only) → /auto-webhooks (PR only)

No AWS provisioning: the consumer uses the hub’s Lambda and must be registered in the hub’s pipeline maps.

Cross-Repo

Cross-Repo Delegation

Agents automatically hand off work to the correct repository when the fix lives elsewhere.

How It Works

BugFix starts in repo A → triage discovers the fix belongs in repo B → writes delegate.json → pipeline reads it → queues the matching pipeline in repo B. The receiving agent picks up the work with full context from the original ticket.

BugFix

Frontend bug traced to a missing Sling Model field? Delegates to Brand-Backend automatically.

DevAgent

Story requires backend + frontend changes? DevAgent delegates the backend portion to the correct repo.

DoD-Fix

DoD gap requires a change in another repo? DoD-Fix delegates to the repo that owns the affected code.

Key Environment Variables

DX_PIPELINE_MODE — tells the agent it is running in pipeline context (enables delegation). CROSS_REPO_PIPELINE_MAP — maps repo names to pipeline IDs for cross-repo queuing.
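
A rough sketch of the hand-off step, under stated assumptions: the source does not show the shape of delegate.json or the format of CROSS_REPO_PIPELINE_MAP, so the `targetRepo` and `workItemId` fields, the JSON encoding of the map, the pipeline ID 101, and the `resolveDelegation` helper are all hypothetical.

```javascript
// Hypothetical: resolve which pipeline to queue for a delegation request.
// Assumes CROSS_REPO_PIPELINE_MAP holds JSON like {"Brand-Backend": 101}
// and that any non-empty DX_PIPELINE_MODE enables delegation.
function resolveDelegation(delegate, env) {
  if (!env.DX_PIPELINE_MODE) {
    return null; // not in pipeline context, so no delegation
  }
  const pipelineMap = JSON.parse(env.CROSS_REPO_PIPELINE_MAP || "{}");
  const pipelineId = pipelineMap[delegate.targetRepo];
  if (pipelineId === undefined) {
    throw new Error(`No pipeline mapped for repo "${delegate.targetRepo}"`);
  }
  return { pipelineId, workItemId: delegate.workItemId };
}

// Example with made-up values.
const exampleEnv = {
  DX_PIPELINE_MODE: "true",
  CROSS_REPO_PIPELINE_MAP: '{"Brand-Backend": 101}',
};
const exampleDelegate = { targetRepo: "Brand-Backend", workItemId: 4711 };
console.log(resolveDelegation(exampleDelegate, exampleEnv));
```

Failing loudly on an unmapped repo is a deliberate choice in this sketch: a silent skip would leave the original ticket looking handled when no agent ever picked it up.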

Comparison

Hub vs Consumer

The hub owns shared infrastructure. Consumers are lightweight, relying on the hub for AWS resources.

| Capability | Hub | Consumer |
|---|---|---|
| AWS resources | Creates and owns | Uses hub’s |
| Pipelines | 10 (all agents) | 6 (PR Review, PR Answer, Eval, DevAgent, BugFix, DoD-Fix) |
| WI hooks | Creates (tag-filtered) | N/A |
| PR Answer hook | Creates (repo-scoped) | Creates (points to hub’s Lambda) |
| Lambda deployment | /auto-deploy | Skipped |
| CloudWatch alarms | /auto-alarms | Skipped |
| Cross-repo delegation | Sends to consumers | Receives from hub |

Safety

Safety & Governance

Multiple layers of protection keep autonomous agents in check -- no runaway execution, no surprise bills.

Rate Limiting

Max 20-30 runs per day per agent type. Prevents runaway execution from webhook storms or misconfigured triggers.
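
A per-day, per-agent counter is one way to picture this limit. The sketch below keeps counts in memory purely for illustration; a Lambda would need a shared store, since memory does not persist reliably across invocations. `createRateLimiter` and the UTC day boundary are assumptions.

```javascript
// Hypothetical daily rate limiter, keyed by agent type and UTC date.
// In-memory only: a real Lambda would keep these counters in a shared store.
function createRateLimiter(maxPerDay) {
  const counts = new Map();
  return function tryAcquire(agentType, now = new Date()) {
    const day = now.toISOString().slice(0, 10); // e.g. "2025-01-31"
    const key = `${agentType}:${day}`;
    const used = counts.get(key) || 0;
    if (used >= maxPerDay) {
      return false; // over the daily cap, reject this run
    }
    counts.set(key, used + 1);
    return true;
  };
}

const tryAcquire = createRateLimiter(20);
console.log(tryAcquire("BugFix")); // first run of the day is allowed
```
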

Token Budget

Monthly cap with three states: normal (full execution) → suggest-only (analysis only, no code changes) → halted (all agents stopped).
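
The three states can be expressed as a threshold function. The state names come from the description above; the point at which the budget switches to suggest-only is not given in the source, so the 80% default below is an assumption, as is the `budgetState` helper itself.

```javascript
// Hypothetical mapping from token usage to budget state.
// The 0.8 suggest-only threshold is an assumed value, not from the source.
function budgetState(usedTokens, monthlyCap, suggestOnlyAt = 0.8) {
  if (usedTokens >= monthlyCap) {
    return "halted"; // all agents stopped
  }
  if (usedTokens >= monthlyCap * suggestOnlyAt) {
    return "suggest-only"; // analysis only, no code changes
  }
  return "normal"; // full execution
}

console.log(budgetState(100, 1000)); // normal
console.log(budgetState(850, 1000)); // suggest-only
console.log(budgetState(1000, 1000)); // halted
```
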

Deduplication

1-hour DynamoDB TTL. Duplicate webhook events are silently dropped — no double-processing, no wasted tokens.
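
One common way to get this behaviour from DynamoDB is a conditional write: the put succeeds only if the event key is new, and a failed condition marks a duplicate. The sketch below only builds the request parameters (in the plain-value form used by the AWS SDK document client), so it runs without AWS. The table name, the attribute names `pk` and `expiresAt`, and the `buildDedupPut` helper are hypothetical. DynamoDB removes expired items some time after their TTL rather than at the exact second, so the condition also accepts an expired record to keep the window at one hour.

```javascript
// Hypothetical: build a conditional PutItem request for webhook deduplication.
// Table and attribute names are assumptions for illustration.
const DEDUP_TTL_SECONDS = 60 * 60; // the 1-hour window from the text

function buildDedupPut(eventKey, nowSeconds) {
  return {
    TableName: "kai-webhook-dedup", // hypothetical table name
    Item: {
      pk: eventKey,
      // TTL attributes hold a Unix timestamp in seconds.
      expiresAt: nowSeconds + DEDUP_TTL_SECONDS,
    },
    // Succeeds only for a new key or an expired leftover record.
    ConditionExpression: "attribute_not_exists(pk) OR expiresAt < :now",
    ExpressionAttributeValues: { ":now": nowSeconds },
  };
}

const params = buildDedupPut("wi-updated:4711:rev-12", 1700000000);
console.log(params.Item.expiresAt); // 1700003600
```

When the condition fails, DynamoDB rejects the write with a conditional check failure, which the router can treat as "already seen" and drop the event.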

Decision Journal

Every agent decision logged with reasoning, evidence, and outcome. Full audit trail for every autonomous action.

Execution Bundles in S3

Every pipeline run saves its full execution context (specs, diffs, decisions) to S3 with 90-day retention.

CloudWatch Alarms

DLQ depth, Lambda errors, and throttle alarms. SNS notifications to the team when thresholds are breached.

Human Control

Interactive mode for debugging, policy gates that restrict what each agent role can do, and tag-based activation — no agent runs unless explicitly triggered.

Runtime

Pipeline Agent

pipeline-agent.js -- the universal runner that powers all 10 agents using the Claude Agent SDK.

Streaming Execution

Uses includePartialMessages for real-time streaming. Agent output visible in pipeline logs as it happens.
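
With partial messages enabled, the runner receives incremental events alongside complete messages and can print text as it arrives. The sketch below shows only the filtering step on a hand-built message. The message shape (a `stream_event` wrapper around a `content_block_delta` with a `text_delta`) is an assumption about what the SDK emits, and `textFromPartial` is a hypothetical helper, not part of the SDK.

```javascript
// Hypothetical helper: pull streamed text out of a partial message.
// The message shape is an assumption about the SDK's partial-message format.
function textFromPartial(message) {
  if (!message || message.type !== "stream_event") {
    return null; // complete messages are handled elsewhere
  }
  const event = message.event;
  const isTextDelta =
    event &&
    event.type === "content_block_delta" &&
    event.delta &&
    event.delta.type === "text_delta";
  return isTextDelta ? event.delta.text : null;
}

// Hand-built example message, for illustration only.
const sample = {
  type: "stream_event",
  event: {
    type: "content_block_delta",
    delta: { type: "text_delta", text: "Analyzing bug..." },
  },
};
const chunk = textFromPartial(sample);
if (chunk !== null) {
  process.stdout.write(chunk + "\n"); // shows up in the pipeline log
}
```
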

60s Heartbeat

Sends a heartbeat message every 60 seconds to keep the ADO pipeline alive during long-running agent operations.
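
A heartbeat of this kind is typically a timer that writes a log line at a fixed interval. A minimal sketch, assuming the heartbeat is a plain log line; `startHeartbeat` and the message format are hypothetical.

```javascript
// Hypothetical heartbeat: logs a line every intervalMs until stopped.
function startHeartbeat(log = console.log, intervalMs = 60000) {
  const startedAt = Date.now();
  const timer = setInterval(() => {
    const elapsedSeconds = Math.round((Date.now() - startedAt) / 1000);
    log(`[heartbeat] agent still running (${elapsedSeconds}s elapsed)`);
  }, intervalMs);
  // Do not let the heartbeat alone keep the Node.js process alive.
  timer.unref();
  return function stopHeartbeat() {
    clearInterval(timer);
  };
}

const stopHeartbeat = startHeartbeat();
// ... long-running agent work would happen here ...
stopHeartbeat();
```

Returning a stop function keeps cleanup in one place, so the runner can call it from both its success and its error path.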

Configurable

MAX_TURNS, TIMEOUT_MINUTES, ALLOWED_TOOLS, CLAUDE_MODEL — each agent pipeline sets its own parameters.
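
Reading these four variables might look like the sketch below. The variable names come from the text; the default values, the comma-separated format of ALLOWED_TOOLS, and the `loadAgentConfig` helper are assumptions.

```javascript
// Hypothetical config loader for the four pipeline variables.
// Defaults and the comma-separated ALLOWED_TOOLS format are assumptions.
function loadAgentConfig(env) {
  const toInt = (value, fallback) => {
    const parsed = Number.parseInt(value, 10);
    return Number.isNaN(parsed) ? fallback : parsed;
  };
  return {
    maxTurns: toInt(env.MAX_TURNS, 50),
    timeoutMinutes: toInt(env.TIMEOUT_MINUTES, 30),
    allowedTools: (env.ALLOWED_TOOLS || "")
      .split(",")
      .map((tool) => tool.trim())
      .filter(Boolean),
    model: env.CLAUDE_MODEL || null, // null means "use the runner's default"
  };
}

const config = loadAgentConfig({ MAX_TURNS: "80", ALLOWED_TOOLS: "Read, Grep" });
console.log(config);
```

In the runner itself the argument would be `process.env`; passing the environment in explicitly keeps the function easy to test.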

Plugin Discovery

Reads PLUGIN_BASE_DIR to discover and load dx-core, dx-aem, and dx-automation plugins at runtime.

Skill Dispatch

Each pipeline invokes a specific skill command. The runner translates pipeline variables into the correct /dx-* invocation.

Graceful Shutdown

On timeout or error, the agent writes a summary of progress and posts partial results back to ADO before exiting.

Operations

Operations & Monitoring

Four skills for monitoring, diagnosing, evaluating, and testing the automation layer.

/auto-status

Live dashboard: DLQ depth, token budget remaining, rate limit counters, Lambda invocation metrics, and pipeline run history.

monitoring

/auto-doctor

Health check: file integrity, pipeline configuration, Lambda state, webhook connectivity, DynamoDB table status.

diagnosis

/auto-eval

Quality evaluation: runs agents against known test cases and scores output quality, accuracy, and adherence to standards.

evaluation

/auto-test

Dry-run against real data: executes the full agent flow without making changes, validates routing and pre-flight checks.

testing

KAI by Dragan Filipovic