Learning & Feedback
How agents accumulate knowledge over time -- from raw signal capture to rule promotion. Fix patterns, review issues, build errors, and execution metrics feed back into future runs.
Existing Feedback Mechanisms
Seven mechanisms already capture signals during execution. Most are per-session and lose memory between runs.
Self-Healing Loop
Scope: Per-session, per-step. dx-step-fix recovers from failures during execution (targeted fix + root cause escalation). Memory is lost after the session ends — each run starts fresh.
Review-Fix Loop
Scope: Per-session, per-story. dx-step-verify runs up to 3 review-fix cycles. Same issues may recur across stories because findings are not persisted.
Weekly Retrospective
Scope: Per-week, automation only. Written to S3 bundles. Human reviews retro and manually adjusts prompts/rules. Gap: only covers automation agents, not local CLI usage.
Evaluation Framework
Scope: Pre-deployment quality gate. Fixture pass/fail results. Human updates fixtures when prompts change. Gap: does not learn from production runs.
Decision Journal
Scope: Per-run, automation only. Every agent decision logged with reasoning to S3. Human reviews journals and manually adjusts policy. Gap: no automated pattern extraction.
Policy Gate
Scope: Global, static.
pipeline-policy.yaml restricts what each agent role can do.
Manually maintained — human updates after incidents.
Shared Rules
Scope: Project-wide, git-tracked.
.ai/rules/ files read by dx skills and automation agents.
Manual only — no automated discovery or suggestion.
What's Missing
No per-project pattern accumulation across stories. No automated rule promotion. No calibration data for planning estimates. No cross-session memory for local CLI usage.
Three Storage Tiers
Different knowledge types belong in different places -- from auto-loaded rules to raw machine data.
Raw signals accumulate, patterns emerge, and proven fixes get promoted to rules.
Tier 1: .claude/rules/
Knowledge that influences every Claude interaction. Import conventions, naming patterns, test fixture patterns. Populated after a pattern is seen 3+ times and the developer confirms.
Format: learned-<topic>.md
Tier 2: .ai/
- .ai/rules/ — shared rules for dx skills and automation
- .ai/learning/ — accumulated data (fix history, metrics)
- .ai/project/ — AEM component gotchas, market quirks
Tier 3: .ai/learning/raw/
Machine-readable append-only logs: fixes.jsonl, runs.jsonl,
review-issues.jsonl. Per-developer, no repo noise. Input for aggregation
into Tier 1 and Tier 2 files.
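A minimal sketch of what appending to these Tier 3 logs could look like. The helper name and record fields are illustrative assumptions, not a fixed schema; only the directory and file names come from the tiers above.

```python
import json
import pathlib
import datetime

RAW_DIR = pathlib.Path(".ai/learning/raw")  # Tier 3 location from the text

def append_signal(filename: str, record: dict) -> None:
    """Append one machine-readable record to an append-only JSONL log."""
    RAW_DIR.mkdir(parents=True, exist_ok=True)
    # Timestamp each record so later aggregation can compute trends
    record.setdefault("ts", datetime.datetime.now(datetime.timezone.utc).isoformat())
    with open(RAW_DIR / filename, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

# Example fix signal -- field names are hypothetical
append_signal("fixes.jsonl", {
    "error_type": "TypeError",
    "pattern": "cannot read properties of undefined",
    "fix": "add null guard before dereference",
    "result": "success",
})
```

Because the logs are append-only JSONL, concurrent skills can write without coordinating, and aggregation can stream line by line.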
Seven Learning Capabilities
Each capability captures a specific signal type and feeds it back into future runs.
1. Fix Pattern Memory
Storage: fixes.md + fixes.jsonl
After every dx-step-fix cycle, capture error type, message pattern, fix applied,
and result. Before attempting a fix, dx-step-fix checks fixes.md for known patterns.
If a strategy worked 3/3 times, try that first. Every 10 entries, fixes.md is regenerated.
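The "3/3 successes, try that first" lookup could be sketched as follows. The (error_type, pattern) key and field names are assumptions; the thresholds mirror the text.

```python
import json
from collections import defaultdict

def known_strategies(jsonl_lines, min_successes=3):
    """Return fixes to try first: 3+ successes and zero failures
    for a given (error_type, pattern) signature, per the text."""
    stats = defaultdict(lambda: {"success": 0, "failure": 0, "fix": None})
    for line in jsonl_lines:
        rec = json.loads(line)
        key = (rec["error_type"], rec["pattern"])
        stats[key][rec["result"]] += 1
        stats[key]["fix"] = rec["fix"]
    return {
        key: s["fix"]
        for key, s in stats.items()
        if s["success"] >= min_successes and s["failure"] == 0
    }

# Three successes for the same signature -> promoted to "try first"
entries = [
    json.dumps({"error_type": "ImportError", "pattern": "no module named",
                "fix": "add dependency to package.json", "result": "success"})
] * 3
print(known_strategies(entries))
```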
2. Review Issue Tracker
Storage: review-issues.md + review-issues.jsonl
After every dx-step-verify or dx-code-reviewer run, capture issue type, file/component,
auto-fix success, and cycle count. If 5+ issues of the same type appear, suggest adding a
convention rule. Security patterns are promoted immediately (no threshold). Feeds into plan
generation for preventive steps.
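The promotion rule -- 5+ occurrences for ordinary issues, no threshold for security -- can be sketched like this. The security type names are hypothetical; the thresholds come from the text.

```python
from collections import Counter

# Illustrative security issue types -- actual taxonomy is an assumption
SECURITY_TYPES = {"secret-exposure", "vulnerable-dependency"}

def promotion_candidates(issues, threshold=5):
    """Issue types worth suggesting as convention rules: security
    patterns immediately, everything else only at 5+ occurrences."""
    counts = Counter(issue["type"] for issue in issues)
    return {
        t for t, n in counts.items()
        if t in SECURITY_TYPES or n >= threshold
    }

issues = [{"type": "missing-aria-label"}] * 5 + [{"type": "secret-exposure"}]
print(sorted(promotion_candidates(issues)))
```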
3. Build Error Catalog
Storage: patterns.md
After every dx-step-build run, capture generalized error message patterns, affected file
types/modules, root cause categories, and applied fixes. Common errors get fixed faster
via known pattern matching.
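One way the "generalized error message patterns" could be produced is by normalizing run-specific details, so that similar errors collapse into a single catalog entry. The exact normalization rules below are an assumption.

```python
import re

def generalize(message: str) -> str:
    """Collapse run-specific details (paths, line numbers, quoted
    identifiers) so similar build errors share one catalog entry."""
    message = re.sub(r"(/[\w.-]+)+", "<path>", message)  # file paths
    message = re.sub(r"\b\d+\b", "<n>", message)         # line/col numbers
    message = re.sub(r"'[^']*'", "'<id>'", message)      # quoted identifiers
    return message

print(generalize("Cannot find module './utils/format' at /src/app/main.ts:42:7"))
# -> Cannot find module '<id>' at <path>:<n>:<n>
```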
4. Convention Discovery
Storage: conventions-discovered.md
After code review finds convention patterns, capture convention name, examples, and confidence
level. Can be promoted to .ai/rules/ when confirmed. Reduces false positive
review findings over time.
5. Execution Metrics
Storage: metrics.md + runs.jsonl
After every coordinator completes, capture ticket, flow, phases, steps, fixes, heals, tokens,
duration, and result. Enables trend analysis, token budget planning, phase bottleneck identification,
and estimate calibration.
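A sketch of turning runs.jsonl into the headline numbers a retro or calibration step would need. The record fields are illustrative assumptions.

```python
import json
from statistics import mean

def summarize_runs(jsonl_lines):
    """Aggregate runs.jsonl into success rate and mean token spend."""
    runs = [json.loads(line) for line in jsonl_lines]
    return {
        "runs": len(runs),
        "success_rate": sum(r["result"] == "success" for r in runs) / len(runs),
        "avg_tokens": mean(r["tokens"] for r in runs),
    }

lines = [
    json.dumps({"ticket": "DX-101", "result": "success", "tokens": 40_000}),
    json.dumps({"ticket": "DX-102", "result": "success", "tokens": 60_000}),
    json.dumps({"ticket": "DX-103", "result": "failure", "tokens": 90_000}),
]
print(summarize_runs(lines))
```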
6. Story Archetype Matching
After each successful story completion, capture story type, component area, step count/types, and patterns needed. During planning, check for similar past stories to reuse step structure and predict fix/healing needs.
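"Check for similar past stories" needs a similarity measure; Jaccard similarity over step-type sets is one simple candidate (the metric choice is an assumption, not specified by the text).

```python
def similarity(a: set, b: set) -> float:
    """Jaccard similarity over step types: |A ∩ B| / |A ∪ B|."""
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical archetypes captured from past stories
past = {
    "add-form-component": {"scaffold", "build", "unit-test", "a11y-check"},
    "fix-tracking-bug": {"reproduce", "build", "unit-test"},
}
new_story = {"scaffold", "build", "unit-test", "visual-test"}
best = max(past, key=lambda k: similarity(past[k], new_story))
print(best)  # the closest archetype to reuse step structure from
```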
7. Weekly Retrospective
On demand (/dx-retro) or automated. Reads all .ai/learning/raw/*.jsonl files,
aggregates success rates, common errors, and token trends. Identifies improving/degrading patterns.
Suggests new rules and convention updates. Writes .ai/learning/retro/YYYY-WNN-summary.md.
Storage Tiers Hierarchy
Knowledge type determines storage location -- from auto-loaded rules to raw machine data.
| Knowledge Type | Best Location | Why | Who Reads It |
|---|---|---|---|
| Project conventions (coding patterns, naming rules) | .claude/rules/ | Auto-loaded on every Claude invocation | Every Claude interaction |
| Domain knowledge (component patterns, architecture) | .ai/rules/ | Read by skills via config. Shared across team. | Skills that read .ai/rules/*.md |
| Operational metrics (run stats, timing) | .ai/learning/ | Machine data, needs aggregation before useful | /dx-retro, /dx-learn |
| Fix recipes (error to fix mappings) | .claude/rules/ (once proven) | Becomes a rule: “when you see X, fix with Y” | dx-step-fix, dx-step-build |
| Project-specific knowledge (market quirks) | .ai/project/ | Same area as seed data | aem-file-resolver, dx skills |
/dx-learn and /dx-retro
Two dedicated skills for aggregating and analyzing learning data.
/dx-learn -- Aggregate
Reads all .jsonl files in .ai/learning/raw/ and generates aggregated markdown
summaries: fixes.md, review-issues.md, patterns.md,
metrics.md, conventions-discovered.md. Run manually or triggered automatically
every 10 entries when auto-aggregate is enabled.
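A sketch of the aggregation step for one log: raw fixes.jsonl in, a markdown summary table out. The column layout of the generated fixes.md is an assumption.

```python
import json
from collections import Counter

def render_fixes_md(jsonl_lines) -> str:
    """Count successful applications of each (error type, fix) pair
    and render them as a markdown table, most common first."""
    counts = Counter()
    for line in jsonl_lines:
        rec = json.loads(line)
        counts[(rec["error_type"], rec["fix"])] += rec["result"] == "success"
    rows = "\n".join(
        f"| {etype} | {fix} | {n} |"
        for (etype, fix), n in counts.most_common()
    )
    return "| Error Type | Fix | Successes |\n|---|---|---|\n" + rows

demo = [json.dumps({"error_type": "TypeError",
                    "fix": "add null guard",
                    "result": "success"})] * 2
print(render_fixes_md(demo))
```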
/dx-retro -- Retrospective
Generates weekly retrospective with run counts, success rates, top fix/review patterns,
trend analysis, and suggested actions. Writes to .ai/learning/retro/YYYY-WNN-summary.md.
Can be run on demand or scheduled via automation pipelines.
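Resolving the YYYY-WNN retro filename from a date is straightforward with ISO week numbering (the text only shows the YYYY-WNN shape, so ISO weeks are an assumption):

```python
import datetime

def retro_path(day: datetime.date) -> str:
    """Build the weekly retro path for a given date using ISO weeks."""
    year, week, _ = day.isocalendar()
    return f".ai/learning/retro/{year}-W{week:02d}-summary.md"

print(retro_path(datetime.date(2025, 1, 6)))
# -> .ai/learning/retro/2025-W02-summary.md
```

ISO years matter near January 1: `isocalendar()` returns the ISO year, which can differ from the calendar year, so two retros never collide across a year boundary.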
Implementation Sketch
Which skill reads what and writes where -- the data flow of the learning system.
| Skill | Reads Before Acting | Writes After Acting | Promotion Trigger |
|---|---|---|---|
| dx-step-fix | fixes.md for known patterns (3+ successes = try first) | Append to fixes.jsonl; regenerate fixes.md every 10 entries | 3+ successes, 0 failures |
| dx-step-verify | — | Append issues to review-issues.jsonl | 5+ same-type issues = suggest convention rule |
| dx-plan | metrics.md + conventions-discovered.md + fixes.md | — | — |
| dx-code-reviewer | — | Append issues; propose learned-convention-*.md | 5+ same-type across stories |
| aem-verify | — | Update component-gotchas.md | 2+ same component issues |
| *-all coordinators | — | Append run data to runs.jsonl | — |
Security Patterns
Security-related patterns (secrets, vulnerabilities) are promoted immediately with no threshold. The developer confirmation requirement is waived for security — these become rules automatically.
Skill-Level Integration
Each skill decides where to read from and write to. No central learning system -- skills own their signals.
| Skill | Before Acting | After Acting |
|---|---|---|
| dx-step-fix | Read fixes.md for known patterns | Append to fixes.jsonl |
| dx-step-verify | — | Append issues to review-issues.jsonl |
| dx-plan | Read metrics.md + conventions-discovered.md + fixes.md | — |
| dx-code-reviewer | — | Append issues; propose rule at 5+ occurrences |
| aem-verify | — | Update component-gotchas.md at 2+ occurrences |
| *-all coordinators | — | Append run data to runs.jsonl |
/dx-learn
Reads all .jsonl files in .ai/learning/raw/, generates aggregated markdown
summaries: fixes.md, review-issues.md, patterns.md,
metrics.md, conventions-discovered.md.
/dx-retro
Generates weekly retrospective with run counts, success rates, top fix/review patterns, trend analysis, and suggested actions.
Consumer Project Layout
How learning data is organized in a consumer repo.
Directory Structure
.ai/
  learning/
    fixes.md                   -- aggregated fix patterns
    review-issues.md           -- aggregated review findings
    patterns.md                -- build error catalog
    conventions-discovered.md  -- discovered conventions
    metrics.md                 -- execution metrics summary
    retro/
      YYYY-WNN-summary.md      -- weekly retrospective
    raw/                       -- .gitignored
      runs.jsonl
      fixes.jsonl
      review-issues.jsonl

Key Design Decisions
- Skills decide where to write — no central system
- Threshold before promotion — 3+ for build fixes, 5+ for conventions
- Developer confirmation for rules — never auto-create .claude/rules/ files
- Security patterns promoted immediately — no threshold for secrets/vulnerabilities
- Raw data is private — gitignored, per-developer
- Summaries are shared — committable, team knowledge
- Opt-in consumption — skills check if learning files exist but work without them
- No breaking changes — learning is additive enhancement, not dependency
Configuration
Learning is controlled via .ai/config.yaml with sensible defaults.
Config Options
learning:
  enabled: true                 # Master switch
  auto-aggregate: true          # Auto-regenerate summaries every 10 entries
  retention-days: 90            # Raw data retention (0 = forever)
  git-commit-summaries: true    # Include .md summaries in commits
  git-ignore-raw: true          # Gitignore .jsonl raw data
Incremental rollout from signal capture to active learning.