LabNotes
March 14, 2026 · Visual Analysis · Agent Systems
◆ Experimental

Self-Improving Agents: Continuous Learning Loops (Experimental)

Visual breakdown of the four-stage learning loop: detection rates, capture quality, storage tiers, and application effectiveness across real implementations.

KPI Snapshot

Loop Stages: 4 (Detect → Capture → Store → Apply)
Storage Tiers: 3 (Raw logs → Project rules → Behavioral core)
Error Detection Rate: ~30% (confident hallucinations missed)
Promotion Threshold: recurrences before permanent storage
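As a rough illustration of how the four stages fit together, the sketch below wires Detect, Capture, Store, and Apply into one small class. The `Learning` and `LearningLoop` names, their fields, and the in-memory list are all assumptions made for this example; none of them come from the platforms assessed here.

```python
from dataclasses import dataclass, field


@dataclass
class Learning:
    """One captured lesson (hypothetical shape, not a platform schema)."""
    trigger: str  # what went wrong, e.g. the failing command
    lesson: str   # what to do differently next time


@dataclass
class LearningLoop:
    # In-memory list standing in for a real storage tier (assumption).
    store: list[Learning] = field(default_factory=list)

    def detect(self, command: str, exit_code: int) -> bool:
        # Detect: only the easy case is covered here, a non-zero exit code.
        return exit_code != 0

    def capture(self, command: str, fix: str) -> Learning:
        # Capture: turn the failure and its fix into a structured record.
        return Learning(trigger=command, lesson=fix)

    def remember(self, learning: Learning) -> None:
        # Store: append the record to the stand-in storage tier.
        self.store.append(learning)

    def apply(self, command: str) -> list[str]:
        # Apply: surface lessons whose trigger matches the upcoming command.
        return [item.lesson for item in self.store if item.trigger == command]


loop = LearningLoop()
if loop.detect("npm install", exit_code=1):
    loop.remember(loop.capture("npm install", "use pnpm install in this repo"))

print(loop.apply("npm install"))  # ['use pnpm install in this repo']
```

Exact string matching in `apply` is the simplest possible lookup. A real loop would likely need fuzzier matching, which is where the storage and application trade-offs in the rest of this breakdown come in.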

Loop Stage Effectiveness

Stage-by-stage effectiveness (qualitative assessment)

| Criterion | Hook-based | Manual log | Static rules | Session memory |
| --- | --- | --- | --- | --- |
| Error detection | High | Medium | Low | Low |
| Capture quality | Medium | High | Medium | Medium |
| Storage durability | Medium | High | High | Medium |
| Application ease | High | Medium | High | Low |
| Human dependency | Low | Medium | High | Medium |

Component Maturity Bars

Error Detection Coverage

Command failures (exit codes): ~95%
User corrections ("actually..."): ~80%
Suboptimal but working output: ~35%
Confident hallucinations: ~5%
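The two high-coverage signals above are also the cheapest to check mechanically. The following is a minimal sketch, assuming exit codes are read from a subprocess and user corrections are spotted with a small phrase list; the `CORRECTION_CUES` pattern and both function names are hypothetical and would need tuning in practice.

```python
import re
import subprocess
import sys

# Hypothetical correction cues; a real list would need tuning per user base.
CORRECTION_CUES = re.compile(r"\b(actually|instead|that's wrong)\b", re.IGNORECASE)


def command_failed(argv: list[str]) -> bool:
    """Run a command and report failure based on its exit code."""
    result = subprocess.run(argv, capture_output=True, text=True)
    return result.returncode != 0


def looks_like_correction(user_message: str) -> bool:
    """Flag messages that contain a correction cue such as 'actually'."""
    return CORRECTION_CUES.search(user_message) is not None


# A child Python process that exits with status 3 stands in for a failing tool.
print(command_failed([sys.executable, "-c", "raise SystemExit(3)"]))  # True
print(looks_like_correction("Actually, use pnpm here"))               # True
print(looks_like_correction("Looks good, thanks"))                    # False
```

Neither check fires when a command succeeds and the user says nothing, which is exactly the situation a confident hallucination produces.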

Learning Type ROI

Convention capture ("use pnpm"): High
Error reproduction patterns: High
Behavioral corrections: Medium
Architectural learnings: Low

Storage Tier Durability

.learnings/ raw logs: Session-scoped
AGENTS.md / TOOLS.md rules: Project-scoped
SOUL.md behavioral core: Permanent
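Promotion between these tiers can be pictured as a function of how often a learning recurs. The sketch below is an assumption-heavy illustration: the two threshold values are invented for the example (the actual promotion threshold is not stated here), and using a second threshold for the behavioral core is likewise a guess rather than a documented rule.

```python
from collections import Counter
from enum import Enum


class Tier(Enum):
    RAW_LOG = ".learnings/"      # session-scoped
    PROJECT_RULE = "AGENTS.md"   # project-scoped
    BEHAVIORAL_CORE = "SOUL.md"  # permanent


# Hypothetical thresholds: the real values are not given in this analysis.
PROJECT_THRESHOLD = 3
CORE_THRESHOLD = 10


def tier_for(recurrences: int) -> Tier:
    """Map a recurrence count to the tier a learning would live in."""
    if recurrences >= CORE_THRESHOLD:
        return Tier.BEHAVIORAL_CORE
    if recurrences >= PROJECT_THRESHOLD:
        return Tier.PROJECT_RULE
    return Tier.RAW_LOG


observed = ["use pnpm", "use pnpm", "run tests first", "use pnpm"]
counts = Counter(observed)

for lesson, n in counts.items():
    print(f"{lesson!r}: seen {n}x -> {tier_for(n).value}")
```

With these example thresholds, "use pnpm" (seen three times) would be promoted to a project rule while "run tests first" stays in the raw log.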

Implementation Comparison

Real implementations assessed

| Capability | OpenClaw Self-Improvement | CLAUDE.md / Cursor Rules | Session Memory / Daily Logs | RAG-based Learning |
| --- | --- | --- | --- | --- |
| Structured logging | ✓ Strict schema | ✗ Freeform | ~ Semi-structured | ~ Depends on embed |
| Promotion pathway | ✓ Built-in | ✗ Manual | ~ Heartbeat distill | ✗ N/A |
| Cross-session | ✓ Persistent files | ✓ Persistent files | ~ File-based | ✓ Vector store |

Error hooks: ✓ PostToolUse, ~ Platform dep.
Recurrence tracking: ✓ Count + dates
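To make "strict schema" and "count + dates" concrete, here is a minimal sketch of a log entry that rejects malformed input and tracks each recurrence. The `LogEntry` class, its field names, and the `ALLOWED_KINDS` labels are hypothetical (the labels loosely mirror the learning types listed earlier); this is not the schema of any implementation in the comparison.

```python
from dataclasses import dataclass, field
from datetime import date

# Hypothetical category labels, loosely mirroring the learning types above.
ALLOWED_KINDS = {"convention", "error_pattern", "behavioral", "architectural"}


@dataclass
class LogEntry:
    kind: str
    summary: str
    count: int = 0
    seen_on: list[date] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Strict schema: reject entries that freeform logging would accept.
        if self.kind not in ALLOWED_KINDS:
            raise ValueError(f"unknown kind: {self.kind!r}")
        if not self.summary.strip():
            raise ValueError("summary must not be empty")

    def record(self, day: date) -> None:
        # Recurrence tracking: bump the count and remember the date.
        self.count += 1
        self.seen_on.append(day)


entry = LogEntry(kind="convention", summary="use pnpm")
entry.record(date(2026, 3, 10))
entry.record(date(2026, 3, 14))
print(entry.count, entry.seen_on[0].isoformat(), entry.seen_on[-1].isoformat())
# 2 2026-03-10 2026-03-14
```

Keeping the dates alongside the count would let a promotion step distinguish a burst of repeats in one session from a pattern that recurs across days.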

The Detection Gap

Detectable (~95%): Command failures, timeouts, exceptions
Partial (~35%): Suboptimal output, missed better approaches
Blind Spot (~5%): Confident hallucinations, silent failures

Assessment based on production testing across the OpenClaw, Claude Code, and Cursor agent platforms, March 2026. Effectiveness ratings are qualitative judgments, not automated benchmarks.