Automation Infrastructure

Lambda handler architecture, API Gateway routing, and the complete pipeline variable reference for all 11 ADO pipelines.

Architecture

Lambda Handler Architecture

Two AWS Lambda functions handle the webhook events that ADO sends through API Gateway, routing them through deduplication and rate limiting before dispatching pipelines.

Webhook Flow

From ADO event to pipeline dispatch -- every webhook is validated, deduplicated, and rate-limited.

  1. ADO Event: Service Hook fires
  2. API Gateway: POST /wi or POST /pr-answer
  3. Lambda: Auth + parse payload
  4. DynamoDB: Dedupe + rate limit
  5. Decision: run or skip
  6. ADO Pipeline: Queue agent run
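
As an illustration of the "Auth" stage, here is a minimal sketch of checking Basic Auth plus a shared secret header. The header name x-webhook-secret and the shape of the expected object are hypothetical, not taken from the real Lambda; header names are read in lowercase on the assumption that API Gateway HTTP API payloads lowercase them.

```javascript
// Sketch only: "x-webhook-secret" and the `expected` shape are hypothetical.

// Compare two strings without returning early on the first mismatch.
function constantTimeEqual(a, b) {
  const bufA = Buffer.from(String(a));
  const bufB = Buffer.from(String(b));
  let diff = bufA.length ^ bufB.length;
  const len = Math.max(bufA.length, bufB.length);
  for (let i = 0; i < len; i++) {
    diff |= (bufA[i] ?? 0) ^ (bufB[i] ?? 0);
  }
  return diff === 0;
}

// True only when the Basic Auth credentials AND the secret header both match.
function isAuthorized(headers, expected) {
  const auth = headers['authorization'] || '';
  if (!auth.startsWith('Basic ')) return false;
  const decoded = Buffer.from(auth.slice(6), 'base64').toString('utf8');
  const sep = decoded.indexOf(':');
  if (sep < 0) return false;
  const user = decoded.slice(0, sep);
  const pass = decoded.slice(sep + 1);
  const secret = headers['x-webhook-secret'] || '';
  // Evaluate all three comparisons before combining them.
  const userOk = constantTimeEqual(user, expected.user);
  const passOk = constantTimeEqual(pass, expected.pass);
  const secretOk = constantTimeEqual(secret, expected.secret);
  return userOk && passOk && secretOk;
}
```

A request that fails this check would be rejected before any payload parsing or DynamoDB access happens.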

WI Router (POST /wi)

Handles all workitem.updated events. Scans work item tags against configured agents to determine which pipeline to queue. Single Lambda, single route, multiple agents.

Processing flow:
  1. Validate webhook (Basic Auth + secret header)
  2. Parse payload — extract work item ID, tags, type
  3. Scan AGENTS array — match tag gate + work item type
  4. First matching agent wins (tags should be mutually exclusive)
  5. Deduplicate (DynamoDB, 1-hour TTL)
  6. Rate limit (DynamoDB atomic counter)
  7. Queue ADO pipeline with templateParameters
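
Steps 3 and 4 can be sketched as a small pure function. This is an illustration, not the code in wi-router.mjs: the tag values, pipeline IDs, and the case-insensitive comparison are assumptions, and the field names assume the work item fields arrive keyed as System.Tags (a single "; "-separated string) and System.WorkItemType.

```javascript
// Hypothetical agent entries; in the real Lambda the tag and pipeline ID
// would come from TAG_GATE_<NAME> and ADO_<NAME>_PIPELINE_ID env vars.
const AGENTS = [
  { name: 'DoR', wiType: 'User Story', tagGate: 'kai-dor', pipelineId: '101' },
  { name: 'BugFix', wiType: 'Bug', tagGate: 'kai-bugfix', pipelineId: '103' },
];

// Split the "; "-separated tag string into normalized tags.
function parseTags(raw) {
  return (raw || '')
    .split(';')
    .map((t) => t.trim().toLowerCase())
    .filter(Boolean);
}

// Return the first agent whose tag gate and work item type both match, else null.
function matchAgent(fields, agents) {
  const tags = parseTags(fields['System.Tags']);
  const wiType = fields['System.WorkItemType'];
  const hit = agents.find(
    (a) => a.wiType === wiType && tags.includes(a.tagGate.toLowerCase())
  );
  return hit || null;
}
```

Because the first match wins, array order only matters if two tag gates overlap, which is why the tags should be mutually exclusive.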

PR Router (POST /pr-answer)

Handles git.pullrequest.comment-event — triggered when someone comments on a PR.

Five gates:
  1. Event type includes “comment”
  2. PR is active
  3. PR is mine (createdBy matches MY_IDENTITIES)
  4. Comment is NOT from me (avoid loop)
  5. Comment is NOT from bot (avoid loop)
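
The five gates can be sketched as an ordered list of checks that stops at the first failure. The payload field paths (resource.pullRequest, resource.comment.author.uniqueName) and the botIdentities list are assumptions made for illustration; only MY_IDENTITIES is named in this document.

```javascript
// Sketch only: field paths and identity lists are assumptions.
function evaluatePrGates(event, myIdentities, botIdentities) {
  const pr = event.resource?.pullRequest ?? {};
  const author = (event.resource?.comment?.author?.uniqueName ?? '').toLowerCase();
  const creator = (pr.createdBy?.uniqueName ?? '').toLowerCase();
  const mine = myIdentities.map((id) => id.toLowerCase());
  const bots = botIdentities.map((id) => id.toLowerCase());

  // Each gate is [label, passed]; order mirrors the list above.
  const gates = [
    ['event type includes "comment"', (event.eventType ?? '').includes('comment')],
    ['PR is active', pr.status === 'active'],
    ['PR is mine', mine.includes(creator)],
    ['comment is not from me', !mine.includes(author)],
    ['comment is not from bot', !bots.includes(author)],
  ];
  const failed = gates.find(([, ok]) => !ok);
  return failed ? { run: false, reason: failed[0] } : { run: true, reason: null };
}
```

Returning the label of the failed gate makes skipped events easy to explain in logs.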

Agents

WI Router Agent Config

Seven agents routed by a single Lambda. Adding a new agent requires only environment variables and an AGENTS array entry, with no new route or service hook.

| Agent | WI Type | Tag Gate Env Var | Pipeline ID Env Var |
|---|---|---|---|
| DoR | User Story | TAG_GATE_DOR | ADO_DOR_PIPELINE_ID |
| DoD | User Story | TAG_GATE_DOD | ADO_DOD_PIPELINE_ID |
| BugFix | Bug | TAG_GATE_BUGFIX | ADO_BUGFIX_PIPELINE_ID |
| QA | User Story | TAG_GATE_QA | ADO_QA_PIPELINE_ID |
| DevAgent | User Story | TAG_GATE_DEV | ADO_DEV_PIPELINE_ID |
| DOCAgent | User Story | TAG_GATE_DOC | ADO_DOC_PIPELINE_ID |
| Estimation | User Story | TAG_GATE_ESTIMATION | ADO_ESTIMATION_PIPELINE_ID |

Adding a New Agent

No new API Gateway route or ADO service hook needed. Just add TAG_GATE_<NAME> and ADO_<NAME>_PIPELINE_ID env vars to the Lambda, add an entry to the AGENTS array in wi-router.mjs, and deploy with lambda/deploy.sh wi-router.
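
One way the AGENTS array could be derived from those env vars is sketched below. The entry shape (name, key, wiType) and the rule of skipping agents with missing variables are assumptions; only the TAG_GATE_<NAME> and ADO_<NAME>_PIPELINE_ID naming pattern comes from this page.

```javascript
// Hypothetical definitions; the real entries in wi-router.mjs may look different.
const AGENT_DEFS = [
  { name: 'DoR', key: 'DOR', wiType: 'User Story' },
  { name: 'BugFix', key: 'BUGFIX', wiType: 'Bug' },
  // A new agent would be one more line here, plus its two env vars.
  { name: 'Estimation', key: 'ESTIMATION', wiType: 'User Story' },
];

// Resolve each definition against the environment, skipping agents that are
// not fully configured so a missing variable cannot queue the wrong pipeline.
function buildAgents(env) {
  return AGENT_DEFS
    .map((def) => ({
      name: def.name,
      wiType: def.wiType,
      tagGate: env[`TAG_GATE_${def.key}`],
      pipelineId: env[`ADO_${def.key}_PIPELINE_ID`],
    }))
    .filter((agent) => agent.tagGate && agent.pipelineId);
}
```

In a Lambda this would typically be called once at cold start with process.env.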

Design

Design Decisions

Why tag-based routing and why two separate Lambdas.

Why Tag-Based Routing?

The original design used 6 API Gateway routes (/dor, /dod, etc.) mapped to one Lambda. Adding a new agent required a new route, a new service hook, and a code change.

Tag-based routing eliminates this: the Lambda scans work item tags to find the matching agent. Adding a new agent only requires Lambda env vars and an entry in the AGENTS array. No new route, no new hook.

Why 2 Lambdas?

PR Router handles a fundamentally different event type (git.pullrequest.comment-event) with different gates (identity checking, bot loop prevention) and different queue parameters. Keeping it separate reduces complexity in both handlers.

Libraries

Shared Libraries

Deployed in the same zip as each Lambda -- reusable utilities for event processing.

dedupe.js

DynamoDB conditional put for event deduplication. 1-hour TTL prevents duplicate webhook processing from ADO retry storms.
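
A minimal sketch of the conditional put, expressed as a function that builds DynamoDB PutItem parameters. The attribute names eventId and expiresAt are hypothetical, and the extra "expiresAt < :now" clause is an addition of this sketch: DynamoDB removes expired items lazily, so without it a stale record could still block a legitimate event.

```javascript
const DEDUPE_TTL_SECONDS = 60 * 60; // 1-hour TTL, as described above

// Build PutItem parameters that succeed only for the first delivery of an event.
// Attribute names ("eventId", "expiresAt") are hypothetical.
function buildDedupePut(tableName, eventId, nowMs = Date.now()) {
  const nowSec = Math.floor(nowMs / 1000);
  return {
    TableName: tableName,
    Item: {
      eventId: { S: eventId },
      // TTL attributes must be numbers holding epoch seconds.
      expiresAt: { N: String(nowSec + DEDUPE_TTL_SECONDS) },
    },
    // Accept the write if the event is new or its previous record has expired.
    ConditionExpression: 'attribute_not_exists(eventId) OR expiresAt < :now',
    ExpressionAttributeValues: { ':now': { N: String(nowSec) } },
  };
}

// A failed condition means "duplicate delivery", not an infrastructure error.
function isDuplicateError(err) {
  return Boolean(err) && err.name === 'ConditionalCheckFailedException';
}
```

The caller would pass these parameters to a PutItem call and treat isDuplicateError as a signal to skip the event quietly.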

rate-limiter.js

DynamoDB atomic counters for per-pipeline and per-identity rate limits. Prevents runaway execution (max 20-30 runs per day per agent type).
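
The atomic counter can be sketched as UpdateItem parameters that increment and enforce the cap in a single request. The key format, the attribute name runs, and the per-UTC-day window are assumptions for illustration.

```javascript
// Build UpdateItem parameters that atomically increment a daily counter and
// fail the condition once the cap is reached. Names here are hypothetical.
function buildRateLimitUpdate(tableName, scope, maxPerDay, now = new Date()) {
  const day = now.toISOString().slice(0, 10); // UTC date, YYYY-MM-DD
  return {
    TableName: tableName,
    Key: { counterId: { S: `${scope}#${day}` } },
    // ADD creates the attribute at 1 on first use, then increments it.
    UpdateExpression: 'ADD #runs :one',
    // Reject the increment when the counter has already hit the cap.
    ConditionExpression: 'attribute_not_exists(#runs) OR #runs < :max',
    ExpressionAttributeNames: { '#runs': 'runs' },
    ExpressionAttributeValues: {
      ':one': { N: '1' },
      ':max': { N: String(maxPerDay) },
    },
    ReturnValues: 'UPDATED_NEW',
  };
}
```

Because the check and the increment happen in one conditional write, two concurrent Lambda invocations cannot both slip past the limit.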

retry.js

Exponential backoff with jitter (3 retries). Used for ADO pipeline queue API calls and DynamoDB operations.
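
A minimal sketch of exponential backoff with full jitter and 3 retries. The base delay, the cap, and the function names are assumptions; the real retry.js may use different values or a different jitter strategy.

```javascript
// Full jitter: wait a random time in [0, min(cap, base * 2^attempt)).
function backoffDelay(attempt, baseMs = 200, capMs = 5000, random = Math.random) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Run `fn`, retrying up to `retries` times after the first failure.
// `sleep` is injectable so callers and tests can control waiting.
async function withRetry(
  fn,
  { retries = 3, sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)) } = {}
) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      lastError = err;
      // Only wait if another attempt will follow.
      if (attempt < retries) await sleep(backoffDelay(attempt));
    }
  }
  throw lastError;
}
```

With retries set to 3, the wrapped call runs at most four times before the last error is rethrown.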

aws-sig.js

AWS Signature V4 for ADO pipeline queue API calls. Handles auth header generation for cross-service communication.

dlq.js

Sends failed events to SQS Dead Letter Queue for later investigation. Ensures no events are silently dropped.

Hooks

ADO Service Hooks

Three Lambda-routed hooks, plus one Azure-native hook for SimpleAgent (no Lambda).

| Hook | Event | Filter | Agents |
|---|---|---|---|
| WI User Story | workitem.updated | workItemType: User Story | DoR, DoD, QA, DevAgent, DOCAgent, Estimation |
| WI Bug | workitem.updated | workItemType: Bug | BugFix |
| PR Answer | git.pullrequest.comment-event | | PR Answer |
| SimpleAgent | workitem.commented | comment contains @kai-simple | SimpleAgent (no Lambda) |

PR Review

PR Review uses a build validation policy, not a service hook. It triggers automatically when a PR is created, running as part of the PR build validation.

SimpleAgent -- Azure-native trigger (no Lambda)

SimpleAgent is the one agent with no Lambda in its path. A comment containing @kai-simple on a work item fires an ADO Service Hook (event: work item commented on, filter: comment contains the token) that delivers to an Incoming WebHook service connection, which the pipeline declares under resources.webhooks. The same event drives both the first run and recovery — Phase 0 (resume-check.sh) decides fresh-vs-resume. The WI Router’s AGENTS array has no simple entry, so the Lambda never sees it. The filter token is config-driven (dx-simple.recovery.trigger-token).

Trigger Choice

Azure-native vs Lambda router

When to skip the Lambda and trigger a pipeline directly from an ADO Service Hook.

Azure-native (SimpleAgent model)

Service Hook → Incoming WebHook service connection → pipeline resources.webhooks. No AWS infra, nothing to deploy. Trade-off: you lose the Lambda’s dedupe, per-agent rate limiting, monthly token-budget gating, and central tag-classification — and any loop-prevention must live in the pipeline. One hook + one service connection per agent. Best for human-initiated, low-volume agents.

Lambda router (the other 10)

One hook fans out to many pipelines via tag classification, with dedupe (against ADO retry storms), rate limiting, and the monthly token budget applied centrally. Best for high-volume autonomous agents. Adding an agent needs env vars and an AGENTS entry, but no new hook.

Which pipelines could adopt the Azure-native trigger?

ADO Service Hooks natively filter on tag, comment text (contains), work-item type, state/field change, and PR events — so any event-driven agent can be triggered this way.

  • In use: SimpleAgent (@kai-simple comment).
  • Already Lambda-free: PR Reviewer (build validation policy).
  • Natural next candidate: PR Answerer via a @kai-answer comment keyword — but it would lose the Lambda’s cheap identity / loop / dedupe gates, which would move into the pipeline.
  • Could migrate, with a trade-off: the tag-driven WI agents (DoR, DoD, QA, DevAgent, DOCAgent, Estimation, BugFix) — each via a per-agent tag, @kai-... comment, or State-transition hook — but they’d lose dedupe, rate-limiting, and the token budget. Keep these on the Lambda router.
  • Not event-triggered: DoD-Fixer (chained after the DoD check).

Variables

Pipeline Variable Reference

Complete reference for all ADO pipeline variables across 11 pipelines (10 CLI + 1 Eval).

Common Variables (all CLI pipelines)

| Variable | Secret | Description |
|---|---|---|
| ANTHROPIC_API_KEY | Yes | Claude API key for Claude Code CLI |
| ADO_ORG_URL | No | ADO org URL for ADO REST API access and cross-repo delegation |

Per-Pipeline Additional Variables

| Pipeline | Additional Variables | Total |
|---|---|---|
| DoR | DOR_WIKI_URL | 4 |
| PR Review | REVIEWER_IDENTITIES | 4 |
| PR Answer | MY_IDENTITIES | 4 |
| DoD | | 3 |
| DoD Fix | targetRepo (param) | 4 |
| BugFix | targetRepo (param) | 4 |
| QA | AEM_AUTHOR_URL, AEM_PUBLISH_URL, AEM_USER, AEM_PASS | 7 |
| DevAgent | targetRepo (param), FIGMA_PERSONAL_ACCESS_TOKEN | 5 |
| DOCAgent | AEM_AUTHOR_URL, AEM_PUBLISH_URL, AEM_USER, AEM_PASS | 7 |
| Estimation | | 3 |
| Eval | | 0 |

Implicit Variables (set automatically)

DX_PIPELINE_MODE

Hardcoded ‘true’. Marks a pipeline run and gates automation-only behavior. All CLI pipelines set this.

targetRepo (param)

Set by the KAI-HUB router for multi-repo fan-out. Empty → direct mode (checkout: self); set → clone repos.json[targetRepo] and point TARGET_DIR at it.

SYSTEM_ACCESSTOKEN

Set from $(System.AccessToken). Used for plugin install and queuing worker runs. Auto-rotates, no secrets to manage.

KAI-HUB

Hub Pipeline + Registries

The hub pipeline (ado-cli-hub.yml) reads two registries to fan out an agent's single worker pipeline, queuing one run per resolved repo.

Registries (AI/automation repo)

Registry files
.ai/automation/registries/repos.json    # alias -> {repoId, adoProject, cloneUrl, defaultBranch, platform, brand, role}
.ai/automation/registries/agents.json   # tag   -> {workerPipelineId, event, writes}

The hub parses the @kai-<agent> tag, runs /dx-discover-repos to resolve the touched repos, and queues the agent’s one worker per repo (cross-project via each repo’s adoProject, Basic PAT auth). Single-repo projects don’t use the hub — their dual-mode worker fires directly via its own resources.webhooks.
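
The fan-out can be sketched as a function that turns a tag, the two registries, and the resolved repo aliases into one queue request per repo. This is an illustration under several assumptions: agents.json is keyed by the full @kai-<agent> tag, the repo alias is passed as the targetRepo template parameter, and the run URL and api-version follow the ADO "Run Pipeline" REST endpoint. The function names are hypothetical.

```javascript
// Find the first tag that looks like @kai-<agent>, or null.
function extractAgentTag(tags) {
  return tags.find((t) => /^@kai-[a-z0-9-]+$/i.test(t)) || null;
}

// Build one "queue worker" request per resolved repo alias.
// Nothing is sent here; the caller decides how to execute the requests.
function planFanOut({ orgUrl, pat, tags, repoAliases, agents, repos }) {
  const tag = extractAgentTag(tags);
  const agent = tag ? agents[tag] : undefined;
  if (!agent) return [];
  // ADO PAT Basic auth uses an empty username and the PAT as password.
  const auth = 'Basic ' + Buffer.from(`:${pat}`).toString('base64');
  const base = orgUrl.replace(/\/$/, '');
  return repoAliases
    .filter((alias) => repos[alias]) // ignore aliases missing from repos.json
    .map((alias) => ({
      method: 'POST',
      // Cross-project: each repo's own adoProject goes into the URL.
      url:
        `${base}/${encodeURIComponent(repos[alias].adoProject)}` +
        `/_apis/pipelines/${agent.workerPipelineId}/runs?api-version=7.1`,
      headers: { Authorization: auth, 'Content-Type': 'application/json' },
      body: JSON.stringify({ templateParameters: { targetRepo: alias } }),
    }));
}
```

Keeping the planning step free of network calls makes it easy to log or dry-run the fan-out before any worker is queued.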

Adding a Repo

Add the repo as an alias in repos.json — no per-pipeline map edits. There is no CROSS_REPO_PIPELINE_MAP or SOURCE_REPO_NAME anymore; peer-to-peer delegation was replaced by the central router. See dx-hub/shared/registry-format.md.

Deploy

Deployment

Lambda deployment and API Gateway configuration.

Deploy Commands

Lambda Deployment
# Deploy both Lambdas
cd .ai/automation
lambda/deploy.sh all

# Deploy individually
lambda/deploy.sh wi-router
lambda/deploy.sh pr-router

The deploy script reads infra.json for function names and source files, copies shared libs from agents/lib/, creates a flat zip, and calls aws lambda update-function-code.

API Gateway Routes

HTTP API v2 (API Gateway). Single API Gateway, 2 routes, 2 Lambda targets:

  • POST /wi → WI Router
  • POST /pr-answer → PR Router

DynamoDB Tables

Two tables shared by both Lambdas:

  • Dedupe table — conditional put with 1-hour TTL
  • Rate limit table — atomic counters per pipeline/identity

KAI by Dragan Filipovic