Automation Infrastructure
Lambda handler architecture, API Gateway routing, and the complete pipeline variable reference for all 11 ADO pipelines.
Lambda Handler Architecture
Two AWS Lambda functions handle all webhook events from ADO, routing them through deduplication and rate limiting before dispatching pipelines.
From ADO event to pipeline dispatch -- every webhook is validated, deduplicated, and rate-limited.
WI Router (POST /wi)
Handles all workitem.updated events. Scans work item tags against configured
agents to determine which pipeline to queue. Single Lambda, single route, multiple agents.
- Validate webhook (Basic Auth + secret header)
- Parse payload — extract work item ID, tags, type
- Scan AGENTS array — match tag gate + work item type
- First matching agent wins (tags should be mutually exclusive)
- Deduplicate (DynamoDB, 1-hour TTL)
- Rate limit (DynamoDB atomic counter)
- Queue ADO pipeline with templateParameters
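The scan-and-match step above can be sketched as follows. This is a minimal illustration, not the actual wi-router.mjs code: the AGENTS entries, tag values, pipeline IDs, the helper names `parseTags` and `matchAgent`, and the case-insensitive comparison are all assumptions.

```javascript
// Hypothetical agent table. In the real handler the tag gates would come from
// TAG_GATE_* env vars and the pipeline IDs from ADO_*_PIPELINE_ID env vars.
const AGENTS = [
  { name: "DoR", wiType: "User Story", tagGate: "kai-dor", pipelineId: "101" },
  { name: "BugFix", wiType: "Bug", tagGate: "kai-bugfix", pipelineId: "103" },
];

// ADO stores work item tags as one "; "-separated string.
function parseTags(tagString) {
  return (tagString || "")
    .split(";")
    .map((t) => t.trim().toLowerCase())
    .filter(Boolean);
}

// Return the first agent whose work item type and tag gate both match, or null.
function matchAgent(workItem, agents) {
  const tags = new Set(parseTags(workItem.tags));
  return (
    agents.find(
      (a) => a.wiType === workItem.type && tags.has(a.tagGate.toLowerCase())
    ) || null
  );
}

const hit = matchAgent({ id: 42, type: "Bug", tags: "ui; KAI-BugFix" }, AGENTS);
console.log(hit ? hit.name : "no match"); // prints "BugFix"
```

Because `find` returns the first hit, array order decides the winner if two tag gates ever overlap, which is why the tags should stay mutually exclusive.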
PR Router (POST /pr-answer)
Handles git.pullrequest.comment-event, triggered when someone comments on a PR. The event is dispatched only if every gate below passes:
- Event type includes “comment”
- PR is active
- PR is mine (createdBy matches MY_IDENTITIES)
- Comment is NOT from me (avoid loop)
- Comment is NOT from bot (avoid loop)
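One way to express the gates above as a single function is sketched here. The field names on `event` (`eventType`, `prStatus`, `prCreatedBy`, `commentAuthor`) are simplified assumptions rather than the exact ADO payload shape, and `botIdentities` is a hypothetical parameter.

```javascript
// Evaluate the PR Router gates in order and report the first one that fails.
function shouldAnswer(event, myIdentities, botIdentities) {
  const mine = new Set(myIdentities.map((s) => s.toLowerCase()));
  const bots = new Set(botIdentities.map((s) => s.toLowerCase()));
  const prAuthor = event.prCreatedBy.toLowerCase();
  const commenter = event.commentAuthor.toLowerCase();

  if (!event.eventType.includes("comment")) return { ok: false, reason: "not a comment event" };
  if (event.prStatus !== "active") return { ok: false, reason: "PR not active" };
  if (!mine.has(prAuthor)) return { ok: false, reason: "PR not mine" };
  if (mine.has(commenter)) return { ok: false, reason: "own comment" };
  if (bots.has(commenter)) return { ok: false, reason: "bot comment" };
  return { ok: true, reason: "passed" };
}

const verdict = shouldAnswer(
  {
    eventType: "git.pullrequest.comment-event",
    prStatus: "active",
    prCreatedBy: "me@example.com",
    commentAuthor: "reviewer@example.com",
  },
  ["me@example.com"],
  ["bot@example.com"]
);
console.log(verdict.reason); // prints "passed"
```

Returning a reason alongside the verdict makes skipped events easy to log without changing the gate order.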
WI Router Agent Config
Seven agents routed by a single Lambda. Adding a new agent requires only environment variable changes.
| Agent | WI Type | Tag Gate Env Var | Pipeline ID Env Var |
|---|---|---|---|
| DoR | User Story | TAG_GATE_DOR | ADO_DOR_PIPELINE_ID |
| DoD | User Story | TAG_GATE_DOD | ADO_DOD_PIPELINE_ID |
| BugFix | Bug | TAG_GATE_BUGFIX | ADO_BUGFIX_PIPELINE_ID |
| QA | User Story | TAG_GATE_QA | ADO_QA_PIPELINE_ID |
| DevAgent | User Story | TAG_GATE_DEV | ADO_DEV_PIPELINE_ID |
| DOCAgent | User Story | TAG_GATE_DOC | ADO_DOC_PIPELINE_ID |
| Estimation | User Story | TAG_GATE_ESTIMATION | ADO_ESTIMATION_PIPELINE_ID |
Adding a New Agent
No new API Gateway route or ADO service hook needed. Just add TAG_GATE_<NAME> and
ADO_<NAME>_PIPELINE_ID env vars to the Lambda, add an entry to the AGENTS array
in wi-router.mjs, and deploy with lambda/deploy.sh wi-router.
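As an illustration of the naming convention, the sketch below derives an AGENTS entry from the two environment variables. The helper `agentFromEnv`, the entry shape, and the "Security" agent are hypothetical; the real array in wi-router.mjs may be structured differently.

```javascript
// Build one AGENTS entry from TAG_GATE_<NAME> and ADO_<NAME>_PIPELINE_ID.
// Returns null when either variable is missing, so an unconfigured agent
// is skipped instead of matching with an empty tag gate.
function agentFromEnv(name, wiType, env) {
  const key = name.toUpperCase();
  const tagGate = env[`TAG_GATE_${key}`];
  const pipelineId = env[`ADO_${key}_PIPELINE_ID`];
  if (!tagGate || !pipelineId) return null;
  return { name, wiType, tagGate, pipelineId };
}

// "Security" is a made-up agent used only for illustration.
const sampleEnv = {
  TAG_GATE_SECURITY: "kai-security",
  ADO_SECURITY_PIPELINE_ID: "207",
};
const entry = agentFromEnv("Security", "User Story", sampleEnv);
console.log(entry.tagGate, entry.pipelineId); // prints "kai-security 207"
```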
Design Decisions
Why tag-based routing and why two separate Lambdas.
Why Tag-Based Routing?
The original design used 6 API Gateway routes (/dor, /dod, etc.)
mapped to one Lambda. Adding a new agent required a new route, a new service hook, and a code change.
Tag-based routing eliminates this — the Lambda scans work item tags to find the matching agent. Adding a new agent only requires Lambda env vars. No new route, no new hook.
Why 2 Lambdas?
PR Router handles a fundamentally different event type (git.pullrequest.comment-event)
with different gates (identity checking, bot loop prevention) and different queue parameters.
Keeping it separate reduces complexity in both handlers.
Shared Libraries
Deployed in the same zip as each Lambda -- reusable utilities for event processing.
dedupe.js
DynamoDB conditional put for event deduplication. 1-hour TTL prevents duplicate webhook processing from ADO retry storms.
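A conditional put for deduplication might look like the sketch below, which only builds the PutItem parameters so it can be read without AWS credentials. The table name, the attribute names `eventKey` and `expiresAt`, and the helper names are assumptions, not taken from dedupe.js.

```javascript
// Build PutItem parameters in the low-level DynamoDB attribute-value format.
function buildDedupePut(eventKey, nowMs, ttlSeconds = 3600) {
  const expiresAt = Math.floor(nowMs / 1000) + ttlSeconds;
  return {
    TableName: "webhook-dedupe", // hypothetical table name
    Item: {
      eventKey: { S: eventKey },
      expiresAt: { N: String(expiresAt) }, // TTL attribute, epoch seconds
    },
    // The write succeeds only for the first delivery of a given key.
    ConditionExpression: "attribute_not_exists(eventKey)",
  };
}

// With the AWS SDK, a rejected conditional write surfaces as an error
// named ConditionalCheckFailedException, which here means "duplicate".
function isDuplicateError(err) {
  return Boolean(err) && err.name === "ConditionalCheckFailedException";
}

const put = buildDedupePut("wi-42-rev-7", 1700000000000);
console.log(put.Item.expiresAt.N); // prints "1700003600"
```

DynamoDB removes expired items in the background rather than at the exact expiry time, so a handler that needs a strict one-hour window would also compare `expiresAt` against the current time.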
rate-limiter.js
DynamoDB atomic counters for per-pipeline and per-identity rate limits. Prevents runaway execution (max 20-30 runs per day per agent type).
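The atomic-counter pattern can be sketched as an UpdateItem request that increments and enforces the cap in one call. Table name, key format, the `runCount` attribute, and the helper names are assumptions; the limit of 20 is just one value from the range quoted above.

```javascript
// UTC day bucket, e.g. "2024-01-15", so counters reset daily.
function dayKey(nowMs) {
  return new Date(nowMs).toISOString().slice(0, 10);
}

// Build UpdateItem parameters: ADD increments atomically, and the condition
// rejects the write once the counter has reached maxPerDay.
function buildRateLimitUpdate(scope, day, maxPerDay) {
  return {
    TableName: "webhook-rate-limit", // hypothetical table name
    Key: { counterKey: { S: `${scope}#${day}` } },
    UpdateExpression: "ADD #n :one",
    ConditionExpression: "attribute_not_exists(#n) OR #n < :max",
    ExpressionAttributeNames: { "#n": "runCount" },
    ExpressionAttributeValues: {
      ":one": { N: "1" },
      ":max": { N: String(maxPerDay) },
    },
    ReturnValues: "UPDATED_NEW",
  };
}

const update = buildRateLimitUpdate("pipeline:dor", dayKey(Date.UTC(2024, 0, 15, 12)), 20);
console.log(update.Key.counterKey.S); // prints "pipeline:dor#2024-01-15"
```

Doing the check and the increment in a single conditional update avoids the read-then-write race that two concurrent Lambda invocations would otherwise hit.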
retry.js
Exponential backoff with jitter (3 retries). Used for ADO pipeline queue API calls and DynamoDB operations.
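A minimal sketch of exponential backoff with full jitter follows. The base delay, the cap, and the reading of "3 retries" as three attempts after the initial call are assumptions; retry.js may use different values or a different jitter variant.

```javascript
// Full-jitter delay: a random value in [0, min(capMs, baseMs * 2^attempt)).
// `rand` is injectable so the delay can be made deterministic in tests.
function backoffDelay(attempt, baseMs = 200, capMs = 5000, rand = Math.random) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(rand() * ceiling);
}

// Call fn once, then retry up to `retries` more times, sleeping between
// attempts. The last error is rethrown when every attempt fails.
async function withRetry(
  fn,
  retries = 3,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))
) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < retries) await sleep(backoffDelay(attempt));
    }
  }
  throw lastError;
}

console.log(backoffDelay(3, 200, 5000, () => 0.5)); // prints 800
```

The jitter spreads retries from many concurrent invocations over time, so a burst of failures does not turn into a synchronized burst of retries.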
aws-sig.js
AWS Signature V4 for ADO pipeline queue API calls. Handles auth header generation for cross-service communication.
dlq.js
Sends failed events to SQS Dead Letter Queue for later investigation. Ensures no events are silently dropped.
ADO Service Hooks
Three Lambda-routed hooks, plus one Azure-native hook for SimpleAgent (no Lambda).
| Hook | Event | Filter | Agents |
|---|---|---|---|
| WI User Story | workitem.updated | workItemType: User Story | DoR, DoD, QA, DevAgent, DOCAgent, Estimation |
| WI Bug | workitem.updated | workItemType: Bug | BugFix |
| PR Answer | git.pullrequest.comment-event | — | PR Answer |
| SimpleAgent | workitem.commented | comment contains @kai-simple | SimpleAgent (no Lambda) |
PR Review
PR Review uses a build validation policy, not a service hook. It triggers automatically when a PR is created, running as part of the PR build validation.
SimpleAgent -- Azure-native trigger (no Lambda)
SimpleAgent is the one agent with no Lambda in its path. A comment containing
@kai-simple on a work item fires an ADO Service Hook
(event: work item commented on, filter: comment contains the token) that delivers to
an Incoming WebHook service connection, which the pipeline declares under
resources.webhooks. The same event drives both the first run and recovery —
Phase 0 (resume-check.sh) decides fresh-vs-resume. The WI Router’s
AGENTS array has no simple entry, so the Lambda never sees it.
The filter token is config-driven (dx-simple.recovery.trigger-token).
Azure-native vs Lambda router
When to skip the Lambda and trigger a pipeline directly from an ADO Service Hook.
Azure-native (SimpleAgent model)
Service Hook → Incoming WebHook service connection → pipeline resources.webhooks.
No AWS infra, nothing to deploy. Trade-off: you lose the Lambda’s dedupe, per-agent rate
limiting, monthly token-budget gating, and central tag-classification — and any loop-prevention
must live in the pipeline. One hook + one service connection per agent. Best for
human-initiated, low-volume agents.
Lambda router (the other 10)
One hook fans out to many pipelines via tag-classification, with dedupe (ADO retry storms), rate limiting, and the monthly token budget applied centrally. Best for high-volume autonomous agents. Adding an agent is just env vars — no new hook.
Which pipelines could adopt the Azure-native trigger?
ADO Service Hooks natively filter on tag, comment text (contains), work-item type, state/field change, and PR events — so any event-driven agent can be triggered this way.
- In use: SimpleAgent (@kai-simple comment).
- Already Lambda-free: PR Reviewer (build validation policy).
- Natural next candidate: PR Answerer via a @kai-answer comment keyword, but it would lose the Lambda’s cheap identity / loop / dedupe gates, which would move into the pipeline.
- Could migrate, with a trade-off: the tag-driven WI agents (DoR, DoD, QA, DevAgent, DOCAgent, Estimation, BugFix), each via a per-agent tag, @kai-... comment, or State-transition hook, but they’d lose dedupe, rate-limiting, and the token budget. Keep these on the Lambda router.
- Not event-triggered: DoD-Fixer (chained after the DoD check).
Pipeline Variable Reference
Complete reference for all ADO pipeline variables across 11 pipelines (10 CLI + 1 Eval).
Common Variables (all CLI pipelines)
| Variable | Secret | Description |
|---|---|---|
| ANTHROPIC_API_KEY | Yes | Claude API key for Claude Code CLI |
| ADO_ORG_URL | No | ADO org URL for ADO REST API access and cross-repo delegation |
Per-Pipeline Additional Variables
| Pipeline | Additional Variables | Total |
|---|---|---|
| DoR | DOR_WIKI_URL | 4 |
| PR Review | REVIEWER_IDENTITIES | 4 |
| PR Answer | MY_IDENTITIES | 4 |
| DoD | — | 3 |
| DoD Fix | targetRepo (param) | 4 |
| BugFix | targetRepo (param) | 4 |
| QA | AEM_AUTHOR_URL, AEM_PUBLISH_URL, AEM_USER, AEM_PASS | 7 |
| DevAgent | targetRepo (param), FIGMA_PERSONAL_ACCESS_TOKEN | 5 |
| DOCAgent | AEM_AUTHOR_URL, AEM_PUBLISH_URL, AEM_USER, AEM_PASS | 7 |
| Estimation | — | 3 |
| Eval | — | 0 |
Implicit Variables (set automatically)
DX_PIPELINE_MODE
Hardcoded ‘true’. Marks a pipeline run and gates automation-only behavior.
All CLI pipelines set this.
targetRepo (param)
Set by the KAI-HUB router for multi-repo fan-out. Empty → direct mode (checkout: self);
set → clone repos.json[targetRepo] and point TARGET_DIR at it.
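The two modes of targetRepo can be summarised with the sketch below. It only models the decision; the actual pipelines implement it in their own steps. The function name `resolveTarget`, the returned shape, the `repos/<alias>` directory, and the sample registry entry are assumptions.

```javascript
// Decide between direct mode and clone mode from the targetRepo parameter.
function resolveTarget(targetRepo, repos) {
  // Empty parameter: work in the pipeline's own repository (checkout: self).
  if (!targetRepo) return { mode: "direct", checkout: "self" };

  const repo = repos[targetRepo];
  if (!repo) throw new Error(`Unknown repo alias: ${targetRepo}`);

  // Set parameter: clone the registry entry and point TARGET_DIR at it.
  return {
    mode: "clone",
    cloneUrl: repo.cloneUrl,
    branch: repo.defaultBranch,
    targetDir: `repos/${targetRepo}`, // hypothetical TARGET_DIR layout
  };
}

// Hypothetical registry content in the shape described for repos.json.
const sampleRepos = {
  storefront: { cloneUrl: "https://example.invalid/storefront.git", defaultBranch: "main" },
};
console.log(resolveTarget("", sampleRepos).mode); // prints "direct"
console.log(resolveTarget("storefront", sampleRepos).targetDir); // prints "repos/storefront"
```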
SYSTEM_ACCESSTOKEN
Set from $(System.AccessToken). Used for plugin install and queuing worker runs.
Auto-rotates, no secrets to manage.
Hub Pipeline + Registries
The hub pipeline (ado-cli-hub.yml) reads two registries and queues an agent's single worker pipeline once per resolved repo.
Registries (AI/automation repo)
    .ai/automation/registries/repos.json   # alias -> {repoId, adoProject, cloneUrl, defaultBranch, platform, brand, role}
    .ai/automation/registries/agents.json  # tag -> {workerPipelineId, event, writes}

The hub parses the @kai-<agent> tag, runs /dx-discover-repos to resolve the
touched repos, and queues the agent’s one worker per repo (cross-project via each repo’s
adoProject, Basic PAT auth). Single-repo projects don’t use the hub — their dual-mode worker
fires directly via its own resources.webhooks.
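The per-repo fan-out can be sketched as building one run request per resolved alias against the ADO Pipelines "runs" REST endpoint. The helper names, the template parameter names (`targetRepo`, `workItemId`), the sample registry data, and the choice of api-version are assumptions; the sketch builds request objects and does not send them.

```javascript
// Build one "queue run" request for a worker pipeline in the repo's project.
// A PAT is sent as Basic auth with an empty user name.
function buildRunRequest(orgUrl, repo, workerPipelineId, pat, templateParameters) {
  const base = orgUrl.replace(/\/+$/, "");
  const project = encodeURIComponent(repo.adoProject);
  return {
    url: `${base}/${project}/_apis/pipelines/${workerPipelineId}/runs?api-version=7.1`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Basic ${Buffer.from(`:${pat}`).toString("base64")}`,
    },
    body: JSON.stringify({ templateParameters }),
  };
}

// One request per resolved repo alias, each carrying its own targetRepo.
function fanOut(orgUrl, aliases, repos, agent, pat, workItemId) {
  return aliases.map((alias) =>
    buildRunRequest(orgUrl, repos[alias], agent.workerPipelineId, pat, {
      targetRepo: alias,
      workItemId: String(workItemId),
    })
  );
}

// Hypothetical registry data for illustration.
const hubRepos = {
  web: { adoProject: "Web Platform" },
  api: { adoProject: "Services" },
};
const requests = fanOut("https://dev.azure.com/acme/", ["web", "api"], hubRepos, { workerPipelineId: 55 }, "dummy-pat", 1234);
console.log(requests.map((r) => r.url));
```

Each request object can be passed to `fetch(r.url, r)` on a runtime that provides it; keeping construction separate from sending makes the fan-out easy to unit test.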
Adding a Repo
Add the repo as an alias in repos.json — no per-pipeline map edits. There is no
CROSS_REPO_PIPELINE_MAP or SOURCE_REPO_NAME anymore; peer-to-peer delegation
was replaced by the central router. See dx-hub/shared/registry-format.md.
Deployment
Lambda deployment and API Gateway configuration.
Deploy Commands
    # Deploy both Lambdas
    cd .ai/automation
    lambda/deploy.sh all

    # Deploy individually
    lambda/deploy.sh wi-router
    lambda/deploy.sh pr-router
The deploy script reads infra.json for function names and source files, copies shared
libs from agents/lib/, creates a flat zip, and calls
aws lambda update-function-code.
API Gateway Routes
HTTP API v2 (API Gateway). Single API Gateway, 2 routes, 2 Lambda targets:
- POST /wi → WI Router
- POST /pr-answer → PR Router
DynamoDB Tables
Two tables shared by both Lambdas:
- Dedupe table — conditional put with 1-hour TTL
- Rate limit table — atomic counters per pipeline/identity