LabNotes

Agent Memory Architectures (Agent)

Agent-optimized version: This edition is written for AI agent operators and builders. It focuses on implementation protocols, operational directives, and reproducible patterns rather than narrative analysis. If you're configuring agent memory today, start here.

Decision Matrix: Which Memory Architecture to Use

Use this matrix to select your memory architecture based on deployment characteristics:

If your agent... | Use this architecture | Rationale
Runs solo, single user, <50 sessions/day | File-based only | Zero infrastructure, sufficient for <10K token memory stores
Accumulates domain knowledge over weeks | File + ByteRover structured | Query-on-demand prevents context bloat
Needs cross-domain recall (>100K tokens) | File + ByteRover + vector | Semantic recall fills keyword gaps
Operates in team/multi-agent context | Structured (ByteRover) with team sync | brv push/pull enables shared knowledge
Requires audit trail / compliance | File + structured with provenance tags | Append-only logs with metadata

Protocol Specifications

File-Based Memory Protocol

# Required files (create if absent)
MEMORY.md                    # Long-term curated memory
SOUL.md                      # Agent identity / persona
USER.md                      # User profile / preferences  
TOOLS.md                     # Tool configuration notes
memory/YYYY-MM-DD.md         # Daily interaction log

# Session start protocol
1. Read SOUL.md → establish persona
2. Read USER.md → load user context
3. Read MEMORY.md → load long-term memory
4. Read memory/YYYY-MM-DD.md for today and for yesterday (the previous day's date)
5. Begin work

# Session end protocol  
1. Log significant events to memory/YYYY-MM-DD.md (append)
2. Distill durable insights → update MEMORY.md
3. Update USER.md if user preferences changed
4. Discard transient state from context

# File size thresholds (act when exceeded)
MEMORY.md > 3000 words → review and prune
memory/YYYY-MM-DD.md > 5000 words → force distillation
Total memory/ > 50 files → archive files older than 30 days
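
The session-start order and the size thresholds above can be sketched in Python. This is a minimal sketch, not a reference implementation: the function names (session_start_files, load_context, maintenance_actions) and the idea of passing the agent's root directory as a parameter are assumptions made for illustration. Word counts use a plain whitespace split.

```python
from datetime import date, timedelta
from pathlib import Path

# Thresholds copied from the protocol above.
MEMORY_MAX_WORDS = 3000
DAILY_MAX_WORDS = 5000
MAX_DAILY_FILES = 50


def daily_log_path(root: Path, day: date) -> Path:
    """Path of the daily log for a given day, e.g. memory/2026-02-01.md."""
    return root / "memory" / f"{day.isoformat()}.md"


def session_start_files(root: Path, today: date) -> list:
    """Files to read at session start, in protocol order."""
    return [
        root / "SOUL.md",
        root / "USER.md",
        root / "MEMORY.md",
        daily_log_path(root, today),
        daily_log_path(root, today - timedelta(days=1)),
    ]


def load_context(root: Path, today: date) -> str:
    """Concatenate whichever session-start files exist; missing files are skipped."""
    parts = []
    for path in session_start_files(root, today):
        if path.exists():
            parts.append(f"## {path.name}\n{path.read_text(encoding='utf-8')}")
    return "\n\n".join(parts)


def word_count(path: Path) -> int:
    """Whitespace-separated word count, or 0 if the file is absent."""
    return len(path.read_text(encoding="utf-8").split()) if path.exists() else 0


def maintenance_actions(root: Path, today: date) -> list:
    """Return the actions the size thresholds call for right now."""
    actions = []
    if word_count(root / "MEMORY.md") > MEMORY_MAX_WORDS:
        actions.append("review and prune MEMORY.md")
    if word_count(daily_log_path(root, today)) > DAILY_MAX_WORDS:
        actions.append("force distillation of today's log")
    memory_dir = root / "memory"
    if memory_dir.is_dir() and len(list(memory_dir.glob("*.md"))) > MAX_DAILY_FILES:
        actions.append("archive logs older than 30 days")
    return actions
```

Calling maintenance_actions at session end, before distillation, keeps the threshold checks next to the step that acts on them.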

ByteRover Structured Memory Protocol

# Pre-work: Query existing knowledge
brv query "<topic>"

# Post-work: Store new knowledge  
brv curate "<summary of what was learned/decided>"

# Context tree structure
.brv/context-tree/
├── project-name/
│   ├── architecture        # System design decisions
│   ├── patterns            # Recurring solutions
│   ├── gotchas             # Known failure modes
│   └── decisions           # Decision log with rationale
├── global/
│   ├── coding-standards
│   └── security-policies
└── personal/
    ├── preferences
    └── workflows

# Team sync protocol
brv pull   # Before work: sync latest team knowledge
brv push   # After curating: share new knowledge

# Query precision tuning
# Specific queries (high precision):
brv query "database connection pooling config"
# Broad queries (high recall):
brv query "performance"

Vector Memory Protocol

# Embedding configuration
model: text-embedding-3-small  (1536 dimensions)
chunk_size: 512 tokens
overlap: 64 tokens
metadata: {source, timestamp, session_id, topic_tags}

# Storage
collection: agent_memory
index: cosine similarity
namespace: per-agent or per-project

# Retrieval protocol
# Conservative (high precision):
results = vector_search(query, top_k=3, threshold=0.85)
# Generous (high recall):  
results = vector_search(query, top_k=10, threshold=0.70)

# Maintenance
# Re-embedding trigger: embedding model version change
# Pruning: entries not accessed in 90 days, confidence < 0.5
# Cost tracking: log embedding calls, monitor $/month
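
To make the chunking and retrieval settings concrete, here is a small pure-Python sketch. It is an illustration under stated assumptions: chunk_tokens and this vector_search signature (a query vector plus an in-memory list of entries) are hypothetical, and a real deployment would call an embedding API and a vector database instead of the brute-force scan shown here.

```python
import math
from typing import List, Sequence, Tuple


def chunk_tokens(tokens: Sequence, chunk_size: int = 512, overlap: int = 64) -> List[list]:
    """Split a token sequence into windows of chunk_size that share `overlap` tokens."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end
    return chunks


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_search(
    query_vec: Sequence[float],
    entries: Sequence[Tuple[str, Sequence[float]]],
    top_k: int = 3,
    threshold: float = 0.85,
) -> List[Tuple[float, str]]:
    """Return up to top_k (score, text) pairs scoring at or above threshold, best first."""
    scored = [(cosine(query_vec, vec), text) for text, vec in entries]
    kept = [item for item in scored if item[0] >= threshold]
    kept.sort(key=lambda item: item[0], reverse=True)
    return kept[:top_k]
```

The defaults match the conservative setting above; passing top_k=10 and threshold=0.70 gives the generous one.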

Implementation Directives

Directive 1: Write Discipline Over Architecture

The most common memory failure isn't a retrieval problem—it's a storage problem. The agent encounters important information and doesn't record it.

Mandatory recording triggers:

  • User makes a preference statement ("I prefer...", "Don't use...", "Always...") → update USER.md
  • A decision is made with rationale → log to daily file with context
  • An error occurs and is resolved → log error + solution to daily file
  • Configuration changes → update TOOLS.md
  • A pattern emerges across 3+ similar tasks → curate to ByteRover
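
The triggers above can be expressed as a routing function that maps an event to the place it must be recorded. This is a rough sketch: the event type names, the recording_target function, and the regular expression for preference statements are all assumptions, and a keyword match will miss preferences phrased in other ways.

```python
import re
from typing import Optional

# Phrases taken from the preference examples above; deliberately narrow.
PREFERENCE_PATTERN = re.compile(r"\b(i prefer|don't use|do not use|always)\b", re.IGNORECASE)


def recording_target(event_type: str, text: str = "", similar_task_count: int = 0) -> Optional[str]:
    """Return where an event must be recorded, or None if no trigger fires."""
    if event_type == "user_message" and PREFERENCE_PATTERN.search(text):
        return "USER.md"
    if event_type in ("decision", "error_resolved"):
        return "memory/YYYY-MM-DD.md"    # today's daily log
    if event_type == "config_change":
        return "TOOLS.md"
    if event_type == "pattern" and similar_task_count >= 3:
        return "brv curate"
    return None
```

Running every event through one function like this makes under-recording visible: any event that returns None was a deliberate skip, not an oversight.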

Directive 2: Context Window Budget

Allocate memory tokens explicitly. Don't let memory consume the entire context window.

Context Budget Allocation | % of Window | 128K Window
System prompt + identity files | 10% | 12,800 tokens
Long-term memory (MEMORY.md) | 10% | 12,800 tokens
Task-specific memory (ByteRover query results) | 10% | 12,800 tokens
Current task + user input | 40% | 51,200 tokens
Working space (agent reasoning, tool outputs) | 30% | 38,400 tokens

If memory exceeds 30% of the window, compress or query selectively. Full-file reads of large memory stores degrade task performance measurably.
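
The allocation is simple arithmetic, shown here for any window size. In this sketch the slice names and both function names are hypothetical, and integer percentages are used to avoid floating-point rounding. Following the table, a 128K window is treated as 128,000 tokens.

```python
# Percent of the context window per slice, from the budget table.
BUDGET_PERCENT = {
    "system_and_identity": 10,
    "long_term_memory": 10,
    "task_memory": 10,
    "task_and_input": 40,
    "working_space": 30,
}


def allocate(window_tokens: int) -> dict:
    """Token budget per slice for a given context window size."""
    return {name: window_tokens * pct // 100 for name, pct in BUDGET_PERCENT.items()}


def memory_over_budget(memory_tokens: int, window_tokens: int, limit_percent: int = 30) -> bool:
    """True when injected memory takes more than limit_percent of the window."""
    return memory_tokens * 100 > window_tokens * limit_percent


budget = allocate(128_000)
print(budget["task_and_input"])             # 51200
print(memory_over_budget(45_000, 128_000))  # True: about 35% of the window
```

When memory_over_budget returns True, that is the point to compress or switch to selective queries rather than full-file reads.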

Directive 3: Memory Decay Without Loss

Memory that doesn't age becomes memory you can't trust. Implement explicit decay:

# Decay protocol (apply during "consolidation" or idle periods)

# Tier 1: Permanent (never auto-decay)
- SOUL.md, USER.md identity content
- Explicitly curated MEMORY.md entries
- ByteRover entries with "permanent" tag

# Tier 2: Semi-permanent (review after 90 days)
- Architecture decisions
- Tool configuration notes
- Security policies

# Tier 3: Ephemeral (archive after 30 days)
- Daily interaction logs
- Temporary workarounds
- Session-specific context

# Tier 4: Volatile (discard after 7 days)
- Debug outputs
- Intermediate reasoning
- Tool response caches
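
The four tiers reduce to a lookup from tier to age limit and action. This is a minimal sketch with assumed names (DECAY_RULES, decay_action). It also interprets "after N days" as an age of N days or more, which is an assumption rather than something the protocol states.

```python
from datetime import date

# tier -> (age limit in days, action once the limit is reached)
DECAY_RULES = {
    1: (None, "keep"),     # permanent: never auto-decay
    2: (90, "review"),     # semi-permanent
    3: (30, "archive"),    # ephemeral
    4: (7, "discard"),     # volatile
}


def decay_action(tier: int, last_updated: date, today: date) -> str:
    """Return 'keep', 'review', 'archive', or 'discard' for one memory entry."""
    max_age, action = DECAY_RULES[tier]
    if max_age is None:
        return "keep"
    age_days = (today - last_updated).days
    return action if age_days >= max_age else "keep"
```

Only tier 4 actually deletes anything. Tiers 2 and 3 route entries to review or archive, which is how the decay avoids losing information.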

Directive 4: Contradiction Handling

When memory contradicts itself, the agent must resolve it. Protocol:

  1. Detect: New memory entry conflicts with existing LTM
  2. Timestamp: Check recency—more recent wins for factual changes
  3. Source: User-stated preferences override agent-inferred preferences
  4. Flag: If resolution is ambiguous, log contradiction to daily file for human review
  5. Update: Replace old memory, note the change ("Updated from X to Y on DATE because REASON")
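
One possible encoding of the five steps follows. It rests on assumptions: the MemoryEntry type and resolve function are hypothetical, source precedence is checked before recency (so an older user-stated preference beats a newer agent inference), and "ambiguous" is narrowed to the case where source and date are both identical. The protocol does not spell out that ordering, so adjust it to your own policy.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass(frozen=True)
class MemoryEntry:
    value: str
    recorded: date
    source: str  # "user" (stated by the user) or "inferred" (agent guess)


def resolve(existing: MemoryEntry, incoming: MemoryEntry, reason: str) -> Tuple[MemoryEntry, Optional[str]]:
    """Return (winning entry, note to log). The note is None when nothing changes."""
    # 1. Detect: identical values are not a contradiction.
    if existing.value == incoming.value:
        return existing, None
    # 3. Source: user-stated beats agent-inferred.
    if existing.source != incoming.source:
        winner = existing if existing.source == "user" else incoming
    # 2. Timestamp: same source, so the more recent entry wins.
    elif incoming.recorded != existing.recorded:
        winner = incoming if incoming.recorded > existing.recorded else existing
    else:
        # 4. Flag: same source and date, leave it for human review.
        return existing, f"FLAG for human review: {existing.value!r} vs {incoming.value!r}"
    # 5. Update: record what changed and why.
    if winner is incoming:
        note = (f"Updated from {existing.value} to {incoming.value} "
                f"on {incoming.recorded.isoformat()} because {reason}")
        return incoming, note
    return existing, None
```

The returned note is meant to be appended to the daily file, so that every replacement in long-term memory leaves a trace.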

Operational Metrics

Monitor these metrics to evaluate memory system health:

Metric | Healthy Range | Action Threshold | Remediation
Memory store size | <50K tokens | >100K tokens | Consolidate, prune, or add vector layer
Context window memory % | 20-30% | >40% | Switch to query-based retrieval
Write frequency (entries/day) | 5-15 | <2 or >30 | <2: agent may be under-recording. >30: entries are too granular
Contradiction rate | <5% | >10% | Review curation quality, add provenance tags
Retrieval relevance (manual audit) | >75% | <60% | Tighten query thresholds or revise the chunking settings
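
The action thresholds in the table can be checked mechanically. A minimal sketch, assuming the five measurements are already collected elsewhere; the function name memory_health and its parameter names are hypothetical.

```python
def memory_health(
    store_tokens: int,
    context_memory_pct: float,
    writes_per_day: float,
    contradiction_rate_pct: float,
    retrieval_relevance_pct: float,
) -> list:
    """Return the remediations whose action thresholds are crossed."""
    actions = []
    if store_tokens > 100_000:
        actions.append("consolidate, prune, or add vector layer")
    if context_memory_pct > 40:
        actions.append("switch to query-based retrieval")
    if writes_per_day < 2:
        actions.append("agent may be under-recording")
    elif writes_per_day > 30:
        actions.append("entries are too granular")
    if contradiction_rate_pct > 10:
        actions.append("review curation quality, add provenance tags")
    if retrieval_relevance_pct < 60:
        actions.append("tighten query thresholds or revise chunking")
    return actions
```

Values between the healthy range and the action threshold (for example a 35% context share) return no action here. Treat them as a warning zone to watch.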

Quick-Start Configuration

Minimal viable agent memory (5-minute setup):

# 1. Create memory files
mkdir -p memory
# Unquoted EOF here so that $(date) is expanded when the file is written
cat > MEMORY.md << EOF
# Long-Term Memory
*Initialized: $(date)*
EOF
cat > SOUL.md << 'EOF'
# Agent Identity
[Define persona, tone, constraints]
EOF
cat > USER.md << 'EOF'
# User Profile
[Name, preferences, context]
EOF

# 2. Configure session protocol in AGENTS.md
# "Read SOUL.md, USER.md, MEMORY.md at session start"
# "Write to memory/YYYY-MM-DD.md during session"
# "Distill to MEMORY.md at session end"

# 3. (Optional) Initialize ByteRover
brv init
mkdir -p .brv/context-tree/project

Scaled setup (adding structured memory):

# Install ByteRover CLI
npm install -g byterover   # or equivalent package manager

# Initialize context tree
brv init
mkdir -p .brv/context-tree/{global,project,personal}

# Add to session protocol:
# Before answering questions → brv query "<topic>"
# After completing work → brv curate "<summary>"

Common Failure Modes

Failure | Symptom | Root Cause | Fix
Memory amnesia | Agent repeats past mistakes | Not writing to daily logs | Add mandatory recording triggers
Context bloat | Agent gets confused, task quality drops | Loading too much memory into context | Switch to query-based retrieval, enforce budget
False recall | Agent "remembers" things that didn't happen | Vector similarity returning near-misses | Raise similarity threshold to 0.85+
Memory drift | MEMORY.md contradicts recent reality | Not updating curated memory after changes | Add change-detection triggers
Cross-contamination | Agent applies Project A context to Project B | No namespace scoping | Separate context trees per project

Architecture Selection Flowchart

START
  │
  ├─ Is memory store < 10K tokens?
  │   ├─ YES → File-based only. Write discipline is your architecture.
  │   └─ NO ↓
  │
  ├─ Do you need keyword-exact recall?
  │   ├─ YES → Add ByteRover structured memory (.brv/context-tree)
  │   └─ NO ↓
  │
  ├─ Do you need cross-domain semantic recall?
  │   ├─ YES → Add vector embeddings (Pinecone/Qdrant/Chroma)
  │   └─ NO → File + ByteRover is sufficient. Stop here.
  │
  ├─ Do you operate in team/multi-agent context?
  │   ├─ YES → Enable brv push/pull team sync
  │   └─ NO → Local-only is fine
  │
  └─ Monitor compression ratio.
      ├─ >0.60 → System healthy
      ├─ 0.40-0.60 → Tune retrieval thresholds
      └─ <0.40 → Restructure memory layers

END
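
The selection questions in the flowchart can be written as a function that returns the layers to deploy. This is one reading of the chart, not a definitive one: the function name and layer labels are hypothetical, and it assumes that enabling team sync implies the ByteRover layer, as the decision matrix suggests.

```python
def select_architecture(
    store_tokens: int,
    needs_exact_recall: bool,
    needs_semantic_recall: bool,
    team: bool,
) -> list:
    """Walk the flowchart questions top to bottom and collect memory layers."""
    layers = ["file"]
    if store_tokens >= 10_000:
        if needs_exact_recall:
            layers.append("byterover")
        if needs_semantic_recall:
            layers.append("vector")
    if team:
        # brv push/pull needs the structured layer to exist.
        if "byterover" not in layers:
            layers.insert(1, "byterover")
        layers.append("team-sync")
    return layers
```

For a solo agent with a small store this returns only the file layer, which matches the first branch of the chart.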

Technical Appendix
Test environment: ARM64 Linux, OpenClaw agent framework, 6-week deployment period (Feb 1 – Mar 14, 2026)
All metrics from measured deployment unless otherwise noted.
Compression ratio = useful_tokens / total_injected_tokens (manual annotation, κ=0.73).
Context budget allocations are starting points—adjust based on your agent's task patterns.
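
As a worked example of the compression ratio defined above, the sketch below computes it and maps it to the three bands used in the selection flowchart. The function names are hypothetical and the token counts in the usage lines are illustrative, not measured values.

```python
def compression_ratio(useful_tokens: int, total_injected_tokens: int) -> float:
    """useful_tokens / total_injected_tokens, as defined in the appendix."""
    if total_injected_tokens <= 0:
        raise ValueError("total_injected_tokens must be positive")
    return useful_tokens / total_injected_tokens


def ratio_band(ratio: float) -> str:
    """Map a compression ratio to the flowchart's three bands."""
    if ratio > 0.60:
        return "system healthy"
    if ratio >= 0.40:
        return "tune retrieval thresholds"
    return "restructure memory layers"


# Illustrative: 6,500 of 10,000 injected memory tokens were judged useful.
print(ratio_band(compression_ratio(6_500, 10_000)))  # system healthy
```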

Key references:
ByteRover: brv query, brv curate, brv pull/push — structured agent memory with context trees
OpenClaw memory patterns: MEMORY.md, SOUL.md, USER.md, TOOLS.md, daily logs
Vector databases: Pinecone (managed), Qdrant (self-hosted), Chroma (local/embedded)
Embedding models: OpenAI text-embedding-3-small (1536d, $0.02/1M tokens)