Learning & Feedback

How agents accumulate knowledge over time -- from raw signal capture to rule promotion. Fix patterns, review issues, build errors, and execution metrics feed back into future runs.

Existing

Existing Feedback Mechanisms

Seven mechanisms already capture signals during execution. The per-session ones lose memory between runs; the rest rely on manual review or manual upkeep.

Self-Healing Loop

Scope: Per-session, per-step. dx-step-fix recovers from failures during execution (targeted fix + root cause escalation). Memory is lost after the session ends — each run starts fresh.

Review-Fix Loop

Scope: Per-session, per-story. dx-step-verify runs up to 3 review-fix cycles. Same issues may recur across stories because findings are not persisted.

Weekly Retrospective

Scope: Per-week, automation only. Written to S3 bundles. Human reviews retro and manually adjusts prompts/rules. Gap: only covers automation agents, not local CLI usage.

Evaluation Framework

Scope: Pre-deployment quality gate. Fixture pass/fail results. Human updates fixtures when prompts change. Gap: does not learn from production runs.

Decision Journal

Scope: Per-run, automation only. Every agent decision logged with reasoning to S3. Human reviews journals and manually adjusts policy. Gap: no automated pattern extraction.

Policy Gate

Scope: Global, static. pipeline-policy.yaml restricts what each agent role can do. Manually maintained — human updates after incidents.

Shared Rules

Scope: Project-wide, git-tracked. .ai/rules/ files read by dx skills and automation agents. Manual only — no automated discovery or suggestion.

What's Missing

No per-project pattern accumulation across stories. No automated rule promotion. No calibration data for planning estimates. No cross-session memory for local CLI usage.

Storage

Three Storage Tiers

Different knowledge types belong in different places -- from auto-loaded rules to raw machine data.

Promotion Flow

Raw signals accumulate, patterns emerge, and proven fixes get promoted to rules.

  1. Raw Signals: .ai/learning/raw/*.jsonl
  2. 3+ Matches: pattern detected
  3. Propose Rule: .claude/rules/learned-*.md
  4. Dev Confirms: rule written or declined
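The promotion flow above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: it assumes each raw record is a JSON object with a hypothetical `pattern` field, and the function names are invented for the example.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Optional

PROMOTION_THRESHOLD = 3  # the "3+ matches" step in the flow


def detect_patterns(raw_dir: Path, threshold: int = PROMOTION_THRESHOLD) -> list:
    """Return pattern keys seen at least `threshold` times across raw/*.jsonl."""
    counts = Counter()
    for log in sorted(raw_dir.glob("*.jsonl")):
        for line in log.read_text(encoding="utf-8").splitlines():
            if not line.strip():
                continue
            key = json.loads(line).get("pattern")  # hypothetical field name
            if key:
                counts[key] += 1
    return sorted(k for k, n in counts.items() if n >= threshold)


def propose_rule(pattern: str, rules_dir: Path, confirmed: bool) -> Optional[Path]:
    """Write learned-<topic>.md only if the developer confirmed; otherwise do nothing."""
    if not confirmed:
        return None
    rules_dir.mkdir(parents=True, exist_ok=True)
    target = rules_dir / f"learned-{pattern}.md"
    target.write_text(f"Learned rule: {pattern}\n", encoding="utf-8")
    return target
```

Keeping detection and writing in separate functions reflects the flow: a pattern can be detected and proposed, yet still be declined by the developer.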

Tier 1: .claude/rules/

Permanent Conventions (Auto-Loaded)

Knowledge that influences every Claude interaction. Import conventions, naming patterns, test fixture patterns. Populated after a pattern is seen 3+ times and the developer confirms.

Format: learned-<topic>.md

Tier 2: .ai/

Domain Knowledge (Skill-Readable)
  • .ai/rules/ — shared rules for dx skills and automation
  • .ai/learning/ — accumulated data (fix history, metrics)
  • .ai/project/ — AEM component gotchas, market quirks

Tier 3: .ai/learning/raw/

Raw Signals (gitignored)

Machine-readable append-only logs: fixes.jsonl, runs.jsonl, review-issues.jsonl. Per-developer, no repo noise. Input for aggregation into Tier 1 and Tier 2 files.
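As a rough sketch of what "append-only" means for these logs, the helper below writes one JSON object per line and never rewrites earlier lines. The `ts` field and the function names are assumptions made for the example; the real record schema is not specified here.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def append_signal(raw_dir: Path, log_name: str, record: dict) -> None:
    """Append one record as a single JSON line; existing lines are left untouched."""
    raw_dir.mkdir(parents=True, exist_ok=True)
    entry = {"ts": datetime.now(timezone.utc).isoformat(), **record}  # `ts` is hypothetical
    with (raw_dir / log_name).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry, sort_keys=True) + "\n")


def read_signals(raw_dir: Path, log_name: str) -> list:
    """Read all records back; a missing log yields an empty list."""
    path = raw_dir / log_name
    if not path.exists():
        return []
    lines = path.read_text(encoding="utf-8").splitlines()
    return [json.loads(line) for line in lines if line.strip()]
```

For example, `append_signal(Path(".ai/learning/raw"), "fixes.jsonl", {"error_type": "import"})` would add one line to fixes.jsonl.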

Capabilities

Seven Learning Capabilities

Each capability captures a specific signal type and feeds it back into future runs.

1. Fix Pattern Memory

Storage: fixes.md + fixes.jsonl
After every dx-step-fix cycle, capture error type, message pattern, fix applied, and result. Before attempting a fix, dx-step-fix checks fixes.md for known patterns. If a strategy has succeeded 3 or more times with no failures, it is tried first. fixes.md is regenerated from fixes.jsonl every 10 entries.
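The "try the proven strategy first" check could look something like the sketch below. The record fields (`error_type`, `fix`, `result`) are hypothetical names chosen to match the signals listed above, not a confirmed schema.

```python
from collections import defaultdict


def preferred_strategy(history: list, error_type: str, min_successes: int = 3):
    """Return a fix with >= min_successes successes and zero failures, or None."""
    stats = defaultdict(lambda: {"ok": 0, "fail": 0})
    for entry in history:
        if entry["error_type"] != error_type:
            continue
        bucket = "ok" if entry["result"] == "success" else "fail"
        stats[entry["fix"]][bucket] += 1
    proven = [
        (s["ok"], name)
        for name, s in stats.items()
        if s["ok"] >= min_successes and s["fail"] == 0
    ]
    # Prefer the fix with the most successes; ties fall back to name order.
    return max(proven)[1] if proven else None
```

When the function returns None, the skill would simply fall back to its normal fix behaviour.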

2. Review Issue Tracker

Storage: review-issues.md + review-issues.jsonl
After every dx-step-verify or dx-code-reviewer run, capture issue type, file/component, auto-fix success, and cycle count. If 5+ issues of the same type appear, suggest adding a convention rule. Security patterns are promoted immediately (no threshold). Feeds into plan generation for preventive steps.
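A minimal sketch of the threshold logic described above, assuming each captured issue carries a hypothetical `type` label and that security issues are identified by a fixed set of labels (both are illustrative assumptions):

```python
from collections import Counter

CONVENTION_THRESHOLD = 5                      # "5+ issues of the same type"
SECURITY_TYPES = {"secret", "vulnerability"}  # hypothetical labels


def rule_suggestions(issues: list) -> list:
    """Issue types that should be suggested as rules.

    Security types qualify on first sight; everything else needs
    CONVENTION_THRESHOLD occurrences.
    """
    counts = Counter(issue["type"] for issue in issues)
    return sorted(
        issue_type
        for issue_type, n in counts.items()
        if issue_type in SECURITY_TYPES or n >= CONVENTION_THRESHOLD
    )
```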

3. Build Error Catalog

Storage: patterns.md
After every dx-step-build run, capture generalized error message patterns, affected file types/modules, root cause categories, and applied fixes. Common errors get fixed faster via known pattern matching.
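One way to "generalize" an error message is to replace the parts that vary between occurrences (paths, quoted names, numbers) with placeholders, so that two instances of the same error map to the same pattern. The regexes below are a deliberately crude sketch under that assumption, not the catalog's actual rules.

```python
import re

# Order matters: paths are replaced before bare numbers so digits in paths vanish with them.
_REPLACEMENTS = [
    (re.compile(r"/?(?:[\w.-]+/)+[\w.-]+"), "<path>"),  # file paths containing a slash
    (re.compile(r"""(['"])[^'"]*\1"""), "<name>"),      # quoted identifiers
    (re.compile(r"\d+"), "<n>"),                        # line numbers, counts
]


def generalize(message: str) -> str:
    """Collapse the variable parts of a build error into placeholders."""
    for pattern, placeholder in _REPLACEMENTS:
        message = pattern.sub(placeholder, message)
    return message
```

With this, `src/components/hero/Hero.js:42: Cannot find module 'lodash'` and the same error in another file on another line both reduce to one catalog key.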

4. Convention Discovery

Storage: conventions-discovered.md
After code review finds convention patterns, capture convention name, examples, and confidence level. Can be promoted to .ai/rules/ when confirmed. Reduces false positive review findings over time.

5. Execution Metrics

Storage: metrics.md + runs.jsonl
After every coordinator run completes, capture ticket, flow, phases, steps, fixes, heals, tokens, duration, and result. Enables trend analysis, token budget planning, phase bottleneck identification, and estimate calibration.
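As an illustration of the aggregation that would turn runs.jsonl into metrics.md, the sketch below computes a few summary figures. The field names `result`, `tokens`, and `duration_s` are assumptions; only the signal names in the paragraph above come from the design.

```python
from statistics import mean


def summarize_runs(runs: list) -> dict:
    """Aggregate run records into success rate and simple averages."""
    if not runs:
        return {"runs": 0, "success_rate": 0.0, "avg_tokens": 0.0, "avg_duration_s": 0.0}
    successes = sum(1 for r in runs if r["result"] == "success")
    return {
        "runs": len(runs),
        "success_rate": successes / len(runs),
        "avg_tokens": mean(r["tokens"] for r in runs),
        "avg_duration_s": mean(r["duration_s"] for r in runs),
    }
```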

6. Story Archetype Matching

After each successful story completion, capture story type, component area, step count/types, and patterns needed. During planning, check for similar past stories to reuse step structure and predict fix/healing needs.
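The design does not say how "similar" is measured. One simple option, shown here purely as an assumption, is Jaccard overlap of the patterns a story needed, restricted to stories of the same type. The field names `story_type` and `patterns` are hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by size of the union (0.0 for two empty sets)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similar_stories(new_story: dict, archive: list, min_score: float = 0.5) -> list:
    """Past stories of the same type, best pattern overlap first."""
    scored = []
    for past in archive:
        if past["story_type"] != new_story["story_type"]:
            continue
        score = jaccard(set(new_story["patterns"]), set(past["patterns"]))
        if score >= min_score:
            scored.append((score, past))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [past for _, past in scored]
```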

7. Weekly Retrospective

On demand (/dx-retro) or automated. Reads all .ai/learning/raw/*.jsonl files, aggregates success rates, common errors, and token trends. Identifies improving/degrading patterns. Suggests new rules and convention updates. Writes .ai/learning/retro/YYYY-WNN-summary.md.
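The YYYY-WNN-summary.md name can be derived from a date as sketched below. This assumes WNN is the ISO 8601 week number, which the design does not state explicitly; `retro_filename` is an illustrative name.

```python
from datetime import date


def retro_filename(day: date) -> str:
    """Build the retro summary filename for the ISO week containing `day`."""
    iso_year, iso_week, _weekday = day.isocalendar()
    return f"{iso_year}-W{iso_week:02d}-summary.md"
```

Using the ISO year rather than the calendar year matters around New Year: 3 January 2021 falls in ISO week 53 of 2020, so its retro would be filed as 2020-W53-summary.md.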

Tiers

Storage Tiers Hierarchy

Knowledge type determines storage location -- from auto-loaded rules to raw machine data.

Knowledge Type | Best Location | Why | Who Reads It
---------------|---------------|-----|-------------
Project conventions (coding patterns, naming rules) | .claude/rules/ | Auto-loaded on every Claude invocation | Every Claude interaction
Domain knowledge (component patterns, architecture) | .ai/rules/ | Read by skills via config. Shared across team. | Skills that read .ai/rules/*.md
Operational metrics (run stats, timing) | .ai/learning/ | Machine data, needs aggregation before useful | /dx-retro, /dx-learn
Fix recipes (error to fix mappings) | .claude/rules/ (once proven) | Becomes a rule: “when you see X, fix with Y” | dx-step-fix, dx-step-build
Project-specific knowledge (market quirks) | .ai/project/ | Same area as seed data | aem-file-resolver, dx skills

Skills

/dx-learn and /dx-retro

Two dedicated skills for aggregating and analyzing learning data.

/dx-learn -- Aggregate

Reads all .jsonl files in .ai/learning/raw/ and generates aggregated markdown summaries: fixes.md, review-issues.md, patterns.md, metrics.md, conventions-discovered.md. Run manually or triggered automatically every 10 entries when auto-aggregate is enabled.
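A small sketch of the two pieces involved here: deciding when the automatic trigger fires, and rendering a count-based summary. The design does not spell out either; the function names, the `key` parameter, and the output layout are assumptions for illustration.

```python
from collections import Counter


def should_aggregate(entry_count: int, auto_aggregate: bool = True, every: int = 10) -> bool:
    """True right after every `every`-th appended entry, when auto-aggregate is on."""
    return auto_aggregate and entry_count > 0 and entry_count % every == 0


def render_summary(title: str, entries: list, key: str) -> str:
    """Render a plain summary listing how often each value of `key` occurred."""
    counts = Counter(entry[key] for entry in entries)
    lines = [title, ""]
    for value, n in counts.most_common():
        lines.append(f"- {value}: {n}")
    return "\n".join(lines) + "\n"
```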

/dx-retro -- Retrospective

Generates weekly retrospective with run counts, success rates, top fix/review patterns, trend analysis, and suggested actions. Writes to .ai/learning/retro/YYYY-WNN-summary.md. Can be run on demand or scheduled via automation pipelines.

Implementation

Implementation Sketch

Which skill reads what and writes where -- the data flow of the learning system.

Skill | Reads Before Acting | Writes After Acting | Promotion Trigger
------|---------------------|---------------------|------------------
dx-step-fix | fixes.md for known patterns (3+ successes = try first) | Append to fixes.jsonl; regenerate fixes.md every 10 entries | 3+ successes, 0 failures
dx-step-verify | - | Append issues to review-issues.jsonl | 5+ same-type issues = suggest convention rule
dx-plan | metrics.md + conventions-discovered.md + fixes.md | - | -
dx-code-reviewer | - | Append issues; propose learned-convention-*.md | 5+ same-type across stories
aem-verify | - | Update component-gotchas.md | 2+ same component issues
*-all coordinators | - | Append run data to runs.jsonl | -

Security Patterns

Security-related patterns (secrets, vulnerabilities) are promoted immediately with no threshold. The developer confirmation requirement is waived for security — these become rules automatically.

Skills

Skill-Level Integration

Each skill decides where to read from and write to. No central learning system -- skills own their signals.

Skill | Before Acting | After Acting
------|---------------|-------------
dx-step-fix | Read fixes.md for known patterns | Append to fixes.jsonl
dx-step-verify | - | Append issues to review-issues.jsonl
dx-plan | Read metrics.md + conventions-discovered.md + fixes.md | -
dx-code-reviewer | - | Append issues; propose rule at 5+ occurrences
aem-verify | - | Update component-gotchas.md at 2+ occurrences
*-all coordinators | - | Append run data to runs.jsonl

Layout

Consumer Project Layout

How learning data is organized in a consumer repo.

Directory Structure

.ai/
  learning/
    fixes.md                    -- aggregated fix patterns
    review-issues.md            -- aggregated review findings
    patterns.md                 -- build error catalog
    conventions-discovered.md   -- discovered conventions
    metrics.md                  -- execution metrics summary
    retro/
      YYYY-WNN-summary.md       -- weekly retrospective
    raw/                        -- .gitignored
      runs.jsonl
      fixes.jsonl
      review-issues.jsonl

Key Design Decisions

  • Skills decide where to write — no central system
  • Threshold before promotion — 3+ for build fixes, 5+ for conventions
  • Developer confirmation for rules -- non-security rules are never written to .claude/rules/ without it
  • Security patterns promoted immediately -- no threshold and no confirmation step for secrets/vulnerabilities
  • Raw data is private — gitignored, per-developer
  • Summaries are shared — committable, team knowledge
  • Opt-in consumption — skills check if learning files exist but work without them
  • No breaking changes — learning is additive enhancement, not dependency
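The opt-in consumption rule above can be sketched as a loader that treats a missing learning file as "no prior knowledge" rather than an error. The function name and return convention are assumptions for the example.

```python
from pathlib import Path
from typing import Optional


def load_learning_file(project_root: Path, name: str) -> Optional[str]:
    """Return the contents of .ai/learning/<name>, or None if it does not exist.

    Callers are expected to carry on with default behaviour when None comes back,
    so learning data stays an enhancement rather than a dependency.
    """
    path = project_root / ".ai" / "learning" / name
    try:
        return path.read_text(encoding="utf-8")
    except FileNotFoundError:
        return None
```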
Config

Configuration

Learning is controlled via .ai/config.yaml with sensible defaults.

Config Options

.ai/config.yaml
learning:
  enabled: true                    # Master switch
  auto-aggregate: true             # Auto-regenerate summaries every 10 entries
  retention-days: 90               # Raw data retention (0 = forever)
  git-commit-summaries: true       # Include .md summaries in commits
  git-ignore-raw: true             # Gitignore .jsonl raw data
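A minimal sketch of how a skill might apply these options, assuming the YAML has already been parsed into a dict by whatever loader the project uses, and that raw records carry a hypothetical ISO 8601 `ts` field. The defaults mirror the file above; the function names are illustrative.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

DEFAULTS = {
    "enabled": True,
    "auto-aggregate": True,
    "retention-days": 90,
    "git-commit-summaries": True,
    "git-ignore-raw": True,
}


def effective_config(parsed: Optional[dict]) -> dict:
    """Overlay the `learning:` section of a parsed config onto the defaults."""
    learning = (parsed or {}).get("learning") or {}
    return {**DEFAULTS, **learning}


def prune(records: list, retention_days: int, now: datetime) -> list:
    """Drop records older than the retention window; 0 keeps everything."""
    if retention_days == 0:
        return list(records)
    cutoff = now - timedelta(days=retention_days)
    return [r for r in records if datetime.fromisoformat(r["ts"]) >= cutoff]
```

Passing `now` explicitly keeps the pruning step deterministic and easy to test.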
Migration Path

Incremental rollout from signal capture to active learning.

  • Patch: Signal capture (append-only)
  • Minor: /dx-learn and /dx-retro
  • Minor: Pattern consumption
  • Major: Active learning

KAI by Dragan Filipovic