Agent Memory Architectures (Agent)

Agent-optimized version: This edition is written for AI agent operators and builders. It focuses on implementation protocols, operational directives, and reproducible patterns rather than narrative analysis. If you're configuring agent memory today, start here.

Decision Matrix: Which Memory Architecture to Use

Use this matrix to select your memory architecture based on deployment characteristics:

If your agent...	Use this architecture	Rationale
Runs solo, single user, <50 sessions/day	File-based only	Zero infrastructure, sufficient for <10K token memory stores
Accumulates domain knowledge over weeks	File + ByteRover structured	Query-on-demand prevents context bloat
Needs cross-domain recall (>100K tokens)	File + ByteRover + vector	Semantic recall fills keyword gaps
Operates in team/multi-agent context	Structured (ByteRover) with team sync	`brv push/pull` enables shared knowledge
Requires audit trail / compliance	File + structured with provenance tags	Append-only logs with metadata

Protocol Specifications

File-Based Memory Protocol

# Required files (create if absent)
MEMORY.md                    # Long-term curated memory
SOUL.md                      # Agent identity / persona
USER.md                      # User profile / preferences  
TOOLS.md                     # Tool configuration notes
memory/YYYY-MM-DD.md         # Daily interaction log

# Session start protocol
1. Read SOUL.md → establish persona
2. Read USER.md → load user context
3. Read MEMORY.md → load long-term memory
4. Read the daily logs for today and yesterday (memory/YYYY-MM-DD.md for each date)
5. Begin work

# Session end protocol  
1. Log significant events to memory/YYYY-MM-DD.md (append)
2. Distill durable insights → update MEMORY.md
3. Update USER.md if user preferences changed
4. Discard transient state from context

# File size thresholds (act when exceeded)
MEMORY.md > 3000 words → review and prune
memory/YYYY-MM-DD.md > 5000 words → force distillation
Total memory/ > 50 files → archive files older than 30 days
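
The read order and size thresholds above can be sketched in Python. This is a minimal illustration, assuming all files live under one workspace directory; the function names `load_session_context` and `check_thresholds` and the `workspace` parameter are hypothetical, not part of any framework.

```python
from datetime import date, timedelta
from pathlib import Path

# Read order from the session start protocol above.
STARTUP_FILES = ["SOUL.md", "USER.md", "MEMORY.md"]


def load_session_context(workspace: Path, today: date) -> dict[str, str]:
    """Return {relative path: text} for each startup file that exists, in read order."""
    yesterday = today - timedelta(days=1)
    names = STARTUP_FILES + [
        f"memory/{today.isoformat()}.md",
        f"memory/{yesterday.isoformat()}.md",
    ]
    context = {}
    for name in names:
        path = workspace / name
        if path.is_file():  # daily logs may not exist yet
            context[name] = path.read_text(encoding="utf-8")
    return context


def check_thresholds(workspace: Path, today: date) -> list[str]:
    """Return the maintenance actions whose thresholds are exceeded."""
    def words(path: Path) -> int:
        return len(path.read_text(encoding="utf-8").split()) if path.is_file() else 0

    actions = []
    if words(workspace / "MEMORY.md") > 3000:
        actions.append("review and prune MEMORY.md")
    if words(workspace / "memory" / f"{today.isoformat()}.md") > 5000:
        actions.append("force distillation of today's log")
    memory_dir = workspace / "memory"
    if memory_dir.is_dir() and len(list(memory_dir.glob("*.md"))) > 50:
        actions.append("archive daily logs older than 30 days")
    return actions
```

Word counts here are whitespace-split approximations; swap in a tokenizer if your thresholds are expressed in tokens.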

ByteRover Structured Memory Protocol

# Pre-work: Query existing knowledge
brv query "<topic>"

# Post-work: Store new knowledge  
brv curate "<summary of what was learned/decided>"

# Context tree structure
.brv/context-tree/
├── project-name/
│   ├── architecture        # System design decisions
│   ├── patterns            # Recurring solutions
│   ├── gotchas             # Known failure modes
│   └── decisions           # Decision log with rationale
├── global/
│   ├── coding-standards
│   └── security-policies
└── personal/
    ├── preferences
    └── workflows

# Team sync protocol
brv pull   # Before work: sync latest team knowledge
brv push   # After curating: share new knowledge

# Query precision tuning
# Specific queries (high precision):
brv query "database connection pooling config"
# Broad queries (high recall):
brv query "performance"
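
One way to enforce the query-before, curate-after habit is to wrap each task. The sketch below assumes `brv` is on the PATH and writes its results to stdout; only the `query` and `curate` subcommands shown above are used. The names `run_brv` and `with_memory` are hypothetical, and the output is treated as raw text because its format is not specified here.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], str]


def run_brv(args: Sequence[str]) -> str:
    """Run `brv` with the given arguments and return its stdout as text."""
    done = subprocess.run(["brv", *args], capture_output=True, text=True, check=True)
    return done.stdout


def with_memory(topic: str, work: Callable[[str], str], runner: Runner = run_brv) -> str:
    """Query before the task, run it, then curate the summary it returns."""
    known = runner(["query", topic])   # pre-work: brv query "<topic>"
    summary = work(known)              # the task itself; returns what was learned
    if summary.strip():                # nothing learned, nothing to store
        runner(["curate", summary])    # post-work: brv curate "<summary>"
    return summary
```

The `runner` parameter exists so the wrapper can be exercised without the CLI installed, for example by passing a function that records its arguments.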

Vector Memory Protocol

# Embedding configuration
model: text-embedding-3-small  (1536 dimensions)
chunk_size: 512 tokens
overlap: 64 tokens
metadata: {source, timestamp, session_id, topic_tags}

# Storage
collection: agent_memory
index: cosine similarity
namespace: per-agent or per-project

# Retrieval protocol
# Conservative (high precision):
results = vector_search(query, top_k=3, threshold=0.85)
# Generous (high recall):  
results = vector_search(query, top_k=10, threshold=0.70)

# Maintenance
# Re-embedding trigger: embedding model version change
# Pruning: entries not accessed in 90 days, confidence < 0.5
# Cost tracking: log embedding calls, monitor $/month
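
To make the `top_k` and `threshold` parameters concrete, here is a minimal in-memory sketch of thresholded cosine retrieval. It is not the API of any vector database: `MemoryEntry` is a hypothetical type, and unlike the calls above, this `vector_search` takes a precomputed query embedding and an explicit list of entries.

```python
import math
from dataclasses import dataclass


@dataclass
class MemoryEntry:
    text: str
    embedding: list[float]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_search(query_embedding, entries, top_k=3, threshold=0.85):
    """Return up to top_k (score, entry) pairs scoring at or above threshold."""
    scored = [(cosine(query_embedding, e.embedding), e) for e in entries]
    kept = [pair for pair in scored if pair[0] >= threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:top_k]
```

Filtering by threshold before truncating to `top_k` means a conservative query can legitimately return nothing, which is usually preferable to injecting a near-miss.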

Implementation Directives

Directive 1: Write Discipline Over Architecture

The most common memory failure isn't a retrieval problem—it's a storage problem. The agent encounters important information and doesn't record it.

Mandatory recording triggers:

- User makes a preference statement ("I prefer...", "Don't use...", "Always...") → update USER.md
- A decision is made with rationale → log to daily file with context
- An error occurs and is resolved → log error + solution to daily file
- Configuration changes → update TOOLS.md
- A pattern emerges across 3+ similar tasks → curate to ByteRover
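
The triggers above can be expressed as a small routing function. This is a sketch under stated assumptions: the event type labels, the return values, and the name `recording_target` are hypothetical, and the phrase list covers only the three example phrasings given above.

```python
import re
from typing import Optional

# Phrases taken from the preference examples above; extend for your users.
PREFERENCE_PATTERN = re.compile(r"\b(i prefer|don't use|always)\b", re.IGNORECASE)


def recording_target(event_type: str, text: str = "", similar_tasks: int = 0) -> Optional[str]:
    """Return where an event should be recorded, or None if no trigger fires."""
    if event_type == "user_message" and PREFERENCE_PATTERN.search(text):
        return "USER.md"
    if event_type in ("decision", "error_resolved"):
        return "daily_log"       # memory/YYYY-MM-DD.md
    if event_type == "config_change":
        return "TOOLS.md"
    if event_type == "pattern" and similar_tasks >= 3:
        return "byterover"       # brv curate "<summary>"
    return None
```

Keyword matching will miss paraphrased preferences, so treat it as a floor for recording, not a ceiling.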

Directive 2: Context Window Budget

Allocate memory tokens explicitly. Don't let memory consume the entire context window.

Context Budget Allocation	% of Window	128K Window
System prompt + identity files	10%	12,800 tokens
Long-term memory (MEMORY.md)	10%	12,800 tokens
Task-specific memory (ByteRover query results)	10%	12,800 tokens
Current task + user input	40%	51,200 tokens
Working space (agent reasoning, tool outputs)	30%	38,400 tokens

If memory exceeds 30% of the window, compress or query selectively. Full-file reads of large memory stores degrade task performance measurably.
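
The allocation table reduces to simple arithmetic. The sketch below reproduces the 128K column and checks the 30% ceiling; the names `BUDGET_PERCENT`, `allocate_budget`, and `memory_over_budget` are illustrative, and treating the three memory rows together as "memory" is an assumption about how the 30% figure is counted.

```python
# Percent of the context window per category, from the table above.
BUDGET_PERCENT = {
    "identity": 10,          # system prompt + identity files
    "long_term_memory": 10,  # MEMORY.md
    "task_memory": 10,       # ByteRover query results
    "task_input": 40,        # current task + user input
    "working_space": 30,     # agent reasoning, tool outputs
}


def allocate_budget(window_tokens: int) -> dict[str, int]:
    """Split a context window into per-category token budgets."""
    return {name: window_tokens * pct // 100 for name, pct in BUDGET_PERCENT.items()}


def memory_over_budget(memory_tokens: int, window_tokens: int, limit_percent: int = 30) -> bool:
    """True when loaded memory exceeds limit_percent of the window."""
    return memory_tokens * 100 > window_tokens * limit_percent
```

Integer arithmetic is used throughout so the budgets match the table exactly, with no floating-point rounding.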

Directive 3: Memory Decay Without Loss

Memory that doesn't age becomes memory you can't trust. Implement explicit decay:

# Decay protocol (apply during "consolidation" or idle periods)

# Tier 1: Permanent (never auto-decay)
- SOUL.md, USER.md identity content
- Explicitly curated MEMORY.md entries
- ByteRover entries with "permanent" tag

# Tier 2: Semi-permanent (review after 90 days)
- Architecture decisions
- Tool configuration notes
- Security policies

# Tier 3: Ephemeral (archive after 30 days)
- Daily interaction logs
- Temporary workarounds
- Session-specific context

# Tier 4: Volatile (discard after 7 days)
- Debug outputs
- Intermediate reasoning
- Tool response caches
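
A consolidation pass could apply the tiers with a lookup table. In this sketch the tier numbers and day limits come from the protocol above, while `TIER_POLICY`, `decay_action`, and the choice to act only when an entry is strictly older than the limit are assumptions.

```python
from datetime import date

# tier -> (age limit in days, action once the limit is exceeded)
TIER_POLICY = {
    1: (None, "keep"),     # permanent: never auto-decays
    2: (90, "review"),     # semi-permanent
    3: (30, "archive"),    # ephemeral
    4: (7, "discard"),     # volatile
}


def decay_action(tier: int, created: date, today: date) -> str:
    """Return the action for an entry of the given tier and creation date."""
    limit, action = TIER_POLICY[tier]
    if limit is None:
        return "keep"
    age_days = (today - created).days
    return action if age_days > limit else "keep"
```

Note that tiers 2 and 3 never delete anything: "review" and "archive" keep the content recoverable, which is what makes the decay lossless.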

Directive 4: Contradiction Handling

When memory contradicts itself, the agent must resolve it. Protocol:

1. Detect: New memory entry conflicts with existing LTM
2. Timestamp: Check recency; the more recent entry wins for factual changes
3. Source: User-stated preferences override agent-inferred preferences
4. Flag: If resolution is ambiguous, log contradiction to daily file for human review
5. Update: Replace old memory, note the change ("Updated from X to Y on DATE because REASON")
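
The protocol can be sketched as a single resolution function. The `Claim` type and `resolve` function are hypothetical. One interpretation is made explicit here: when sources differ, the source rule is applied before recency, so a user-stated claim is never overwritten by a newer agent inference.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Claim:
    value: str
    recorded: date
    source: str  # "user" for stated preferences, "agent" for inferred ones


def resolve(old: Claim, new: Claim, reason: str) -> tuple[Claim, str]:
    """Return (winning claim, note for the daily log)."""
    if old.value == new.value:
        return old, ""  # no contradiction detected
    if old.source != new.source:
        # Source rule: a user-stated claim beats an agent-inferred one.
        winner = old if old.source == "user" else new
    elif new.recorded != old.recorded:
        # Timestamp rule: the more recent claim wins.
        winner = new if new.recorded > old.recorded else old
    else:
        # Same source, same date: ambiguous, so flag instead of guessing.
        return old, f"FLAG for human review: '{old.value}' vs '{new.value}'"
    if winner is new:
        note = (f"Updated from {old.value} to {new.value} "
                f"on {new.recorded.isoformat()} because {reason}")
    else:
        note = f"Kept {old.value}; rejected {new.value}"
    return winner, note
```

Returning the note alongside the winner keeps the audit trail in the caller's hands, so the same function works whether notes go to the daily file or to a provenance tag.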

Operational Metrics

Monitor these metrics to evaluate memory system health:

Metric	Healthy Range	Action Threshold	Remediation
Memory store size	<50K tokens	>100K tokens	Consolidate, prune, or add vector layer
Context window memory %	20-30%	>40%	Switch to query-based retrieval
Write frequency (entries/day)	5-15	<2 or >30	<2: agent may be under-recording. >30: too granular
Contradiction rate	<5%	>10%	Review curation quality, add provenance tags
Retrieval relevance (manual audit)	>75%	<60%	Tighten query thresholds or re-tune chunking (chunk size, overlap)
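
The action thresholds in this table are easy to check automatically. A minimal sketch, assuming you already collect the five measurements; the function name `memory_health` and its parameter names are illustrative.

```python
def memory_health(store_tokens: int, context_memory_pct: float, writes_per_day: float,
                  contradiction_pct: float, relevance_pct: float) -> list[str]:
    """Return the remediation for every metric past its action threshold."""
    actions = []
    if store_tokens > 100_000:
        actions.append("consolidate, prune, or add vector layer")
    if context_memory_pct > 40:
        actions.append("switch to query-based retrieval")
    if writes_per_day < 2:
        actions.append("check for under-recording")
    elif writes_per_day > 30:
        actions.append("record at coarser granularity")
    if contradiction_pct > 10:
        actions.append("review curation quality, add provenance tags")
    if relevance_pct < 60:
        actions.append("tighten query thresholds or adjust chunking")
    return actions
```

Values between the healthy range and the action threshold (for example, a 70K-token store) return no action here; log them if you want early warning.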

Quick-Start Configuration

Minimal viable agent memory (5-minute setup):

# 1. Create memory files
mkdir -p memory
# EOF is left unquoted here so that $(date) expands when the file is written
cat > MEMORY.md << EOF
# Long-Term Memory
*Initialized: $(date)*
EOF
cat > SOUL.md << 'EOF'
# Agent Identity
[Define persona, tone, constraints]
EOF
cat > USER.md << 'EOF'
# User Profile
[Name, preferences, context]
EOF

# 2. Configure session protocol in AGENTS.md
# "Read SOUL.md, USER.md, MEMORY.md at session start"
# "Write to memory/YYYY-MM-DD.md during session"
# "Distill to MEMORY.md at session end"

# 3. (Optional) Initialize ByteRover
brv init
mkdir -p .brv/context-tree/project

Scaled setup (adding structured memory):

# Install ByteRover CLI
npm install -g byterover   # confirm the exact package name in the ByteRover docs; or use an equivalent package manager

# Initialize context tree
brv init
mkdir -p .brv/context-tree/{global,project,personal}

# Add to session protocol:
# Before answering questions → brv query "<topic>"
# After completing work → brv curate "<summary>"

Common Failure Modes

Failure	Symptom	Root Cause	Fix
Memory amnesia	Agent repeats past mistakes	Not writing to daily logs	Add mandatory recording triggers
Context bloat	Agent gets confused, task quality drops	Loading too much memory into context	Switch to query-based retrieval, enforce budget
False recall	Agent "remembers" things that didn't happen	Vector similarity returning near-misses	Raise similarity threshold to 0.85+
Memory drift	MEMORY.md contradicts recent reality	Not updating curated memory after changes	Add change-detection triggers
Cross-contamination	Agent applies Project A context to Project B	No namespace scoping	Separate context trees per project

Architecture Selection Flowchart

START
  │
  ├─ Is memory store < 10K tokens?
  │   ├─ YES → File-based only. Write discipline is your architecture.
  │   └─ NO ↓
  │
  ├─ Do you need keyword-exact recall?
  │   ├─ YES → Add ByteRover structured memory (.brv/context-tree)
  │   └─ NO ↓
  │
  ├─ Do you need cross-domain semantic recall?
  │   ├─ YES → Add vector embeddings (Pinecone/Qdrant/Chroma)
  │   └─ NO → No vector layer needed. File (+ ByteRover, if added above) is sufficient.
  │
  ├─ Do you operate in team/multi-agent context?
  │   ├─ YES → Enable brv push/pull team sync
  │   └─ NO → Local-only is fine
  │
  └─ Monitor compression ratio.
      ├─ >0.60 → System healthy
      ├─ 0.40-0.60 → Tune retrieval thresholds
      └─ <0.40 → Restructure memory layers

END
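
The flowchart can also be written as two small functions. This is a sketch, not a definitive reading: `select_layers` and `compression_status` are hypothetical names, and it assumes that team sync only applies once ByteRover is in the stack, since sync uses `brv push/pull`. The ratio follows the definition useful_tokens / total_injected_tokens.

```python
def select_layers(store_tokens: int, needs_exact_recall: bool,
                  needs_semantic_recall: bool, multi_agent: bool) -> dict:
    """Walk the flowchart and return the chosen layers and sync setting."""
    layers = ["file"]
    if store_tokens >= 10_000:
        if needs_exact_recall:
            layers.append("byterover")
        if needs_semantic_recall:
            layers.append("vector")
    # Team sync relies on brv push/pull, so it needs the ByteRover layer.
    team_sync = multi_agent and "byterover" in layers
    return {"layers": layers, "team_sync": team_sync}


def compression_status(useful_tokens: int, total_injected_tokens: int) -> str:
    """Classify the compression ratio using the flowchart's bands."""
    if total_injected_tokens <= 0:
        raise ValueError("total_injected_tokens must be positive")
    ratio = useful_tokens / total_injected_tokens
    if ratio > 0.60:
        return "healthy"
    if ratio >= 0.40:
        return "tune retrieval thresholds"
    return "restructure memory layers"
```

For example, `select_layers(150_000, True, True, True)` selects all three layers with team sync enabled, while any store under 10K tokens stays file-only regardless of the other answers.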

Technical Appendix
Test environment: ARM64 Linux, OpenClaw agent framework, 6-week deployment period (Feb 1 – Mar 14, 2026)
All metrics from measured deployment unless otherwise noted.
Compression ratio = useful_tokens / total_injected_tokens (manual annotation, κ=0.73).
Context budget allocations are starting points—adjust based on your agent's task patterns.

Key references:
ByteRover: brv query, brv curate, brv pull/push — structured agent memory with context trees
OpenClaw memory patterns: MEMORY.md, SOUL.md, USER.md, TOOLS.md, daily logs
Vector databases: Pinecone (managed), Qdrant (self-hosted), Chroma (local/embedded)
Embedding models: OpenAI text-embedding-3-small (1536d, $0.02/1M tokens)