Agent Memory Is the New Context Window
Anthropic shipped 1M context windows in GA this week. OpenAI did the same last week. Gemini has had it since early 2024. The major providers have converged at 1M tokens, and by all signs they'll stay there for a while.
Meanwhile, two projects trending on GitHub this week and one research result from IBM all attack the same problem from different angles: how agents remember and learn across sessions. The implicit message is clear: the bottleneck has shifted from "how much can the model hold" to "how well can the agent retain."
The Context Ceiling
Latent Space's swyx put it bluntly in this week's episode with semiconductor analyst Doug O'Laughlin: context windows have been effectively stuck at 1M tokens for two years. The constraint isn't algorithmic—it's physical. There's not enough HBM and DRAM at inference sites to serve larger contexts at scale.
The prediction: context windows won't meaningfully exceed 1M in the next two years. Sam Altman's forecast of 100x longer context? "Take the under on that," swyx says.
The emerging concept is context rationing—a future where free-tier users get tiny context windows (1K tokens, perhaps) while paid tiers compete on how much context they can economically serve. A 1M context window is "a mansion," as O'Laughlin puts it. Not everyone gets one.
Three Signals, One Direction
Against this backdrop, two GitHub trending projects from the week of March 8-14 and one research result paint a coherent picture:
1. vectorize-io/hindsight — "Agent Memory That Learns"
Hindsight (3,736 stars, 595 new this week) takes a pragmatic approach: extract lessons from completed tasks and store them as retrievable memories. The name is telling—it learns after the fact, building a growing library of what worked and what didn't.
This is the "experience replay" of the agent world. Instead of stuffing everything into context, Hindsight curates. Past failures become future guardrails. Past successes become reusable patterns.
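To make the "curate, don't stuff" idea concrete, here is a minimal sketch of a lesson store. It is not Hindsight's API: `Lesson`, `LessonStore`, `record`, and `recall` are hypothetical names, and the word-overlap ranking is a stand-in for whatever retrieval a real system would use.

```python
from dataclasses import dataclass, field


@dataclass
class Lesson:
    task: str     # short description of the completed task
    outcome: str  # "success" or "failure"
    advice: str   # the distilled takeaway to reuse later


@dataclass
class LessonStore:
    """Hypothetical in-memory store; a real system would persist and embed."""
    lessons: list[Lesson] = field(default_factory=list)

    def record(self, task: str, outcome: str, advice: str) -> None:
        # Store only the distilled lesson, not the full trajectory.
        self.lessons.append(Lesson(task, outcome, advice))

    def recall(self, query: str, k: int = 3) -> list[Lesson]:
        """Return up to k lessons ranked by word overlap with the query."""
        query_words = set(query.lower().split())
        scored = []
        for lesson in self.lessons:
            overlap = len(query_words & set(lesson.task.lower().split()))
            if overlap > 0:
                scored.append((overlap, lesson))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [lesson for _, lesson in scored[:k]]


store = LessonStore()
store.record("deploy api to staging", "failure",
             "Run migrations before restarting workers.")
store.record("summarize quarterly report", "success",
             "Outline sections first, then fill in details.")

# A new, similar task pulls back only the relevant past failure.
hits = store.recall("deploy api to production")
for lesson in hits:
    print(f"[{lesson.outcome}] {lesson.advice}")
```

The point of the sketch is the shape of the loop: a finished task contributes one short lesson, and a new task pays context only for the lessons that match it.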
2. NousResearch/hermes-agent — "The Agent That Grows With You"
Hermes Agent (6,813 stars, 4,787 new this week) from NousResearch emphasizes adaptive behavior over time. The framing is agent personalization through accumulated interaction—not fine-tuning weights, but building a persistent behavioral profile.
NousResearch has consistently pushed the open-source agent frontier. Their entry here validates the thesis that memory is the next competitive axis for agent platforms.
3. IBM's Trajectory-Based Learning (via @dair_ai)
Not a GitHub project but a research signal with hard numbers: IBM extracted reusable strategy, recovery, and optimization tips from agent trajectories. The results on AppWorld benchmarks:
- Task completion: 69.6% → 73.2% (+3.6pp)
- Scenario goals: 50.0% → 64.3% (+14.3pp)
The biggest gains came on hard tasks, the ones where brute-force context stuffing fails anyway. This is some of the strongest evidence so far that curated memory can do more for an agent than raw context length.
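Percentage points and relative improvement are easy to conflate, so here is the arithmetic behind the two AppWorld numbers quoted above. The figures are the ones reported in this post; the `gains` helper is just an illustrative name.

```python
# Before/after scores as quoted above, in percent.
results = {
    "task_completion": (69.6, 73.2),
    "scenario_goals": (50.0, 64.3),
}


def gains(before: float, after: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    pp = after - before
    relative = pp / before * 100
    return round(pp, 1), round(relative, 1)


summary = {name: gains(b, a) for name, (b, a) in results.items()}
for name, (pp, rel) in summary.items():
    print(f"{name}: +{pp}pp absolute, +{rel}% relative")
```

In relative terms that is roughly a 5.2% lift on task completion and a 28.6% lift on scenario goals, which is why the second number is the one that stands out.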
The Memory Taxonomy
Not all agent memory is the same. The emerging landscape breaks down into distinct types:
- Episodic: "What happened in this session" — raw trajectory logs, conversation history. Expensive to store, cheap to retrieve.
- Semantic: "What I know about the world" — facts, entity relationships, domain knowledge. The RAG layer.
- Procedural: "How to do things" — strategies, recovery patterns, optimization tips. What IBM's work extracts.
- Behavioral: "How I should act" — personality, preferences, learned heuristics. What Hermes builds over time.
This week's projects and research cover most of these types. Hindsight leans procedural. Hermes leans behavioral. IBM's work bridges procedural and semantic, and it starts from episodic trajectory logs. The market is telling us that agents need all four, and the plumbing doesn't exist yet.
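One way to keep the four types from blurring together in code is to tag every stored item with its type. The sketch below is only an illustration of the taxonomy above; `MemoryType`, `MemoryRecord`, and `group_by_type` are hypothetical names, not taken from any of the projects mentioned.

```python
from dataclasses import dataclass
from enum import Enum


class MemoryType(Enum):
    EPISODIC = "what happened in this session"
    SEMANTIC = "what I know about the world"
    PROCEDURAL = "how to do things"
    BEHAVIORAL = "how I should act"


@dataclass(frozen=True)
class MemoryRecord:
    kind: MemoryType
    content: str


def group_by_type(records: list[MemoryRecord]) -> dict[MemoryType, list[str]]:
    """Bucket records by memory type so each can get its own storage policy."""
    grouped: dict[MemoryType, list[str]] = {kind: [] for kind in MemoryType}
    for record in records:
        grouped[record.kind].append(record.content)
    return grouped


records = [
    MemoryRecord(MemoryType.EPISODIC, "User asked to refactor the billing module."),
    MemoryRecord(MemoryType.SEMANTIC, "The billing module depends on the tax service."),
    MemoryRecord(MemoryType.PROCEDURAL, "Run the integration tests before merging."),
    MemoryRecord(MemoryType.BEHAVIORAL, "This user prefers terse answers."),
]
grouped = group_by_type(records)
```

Separating the buckets early makes it easier to give each type its own backend later, for example logs for episodic records and a vector store for semantic ones.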
What This Means for Builders
If you're building agent applications today, three practical takeaways:
1. Design for context scarcity, not abundance. Don't assume you'll have 1M tokens of context. Design your agent to work with 32K-128K context and external memory. The winners will be context-efficient, not context-greedy.
2. Invest in memory architecture now. Vector stores for semantic memory, structured logs for episodic memory, prompt-embedded heuristics for procedural memory. The infrastructure is immature—building it yourself today is a competitive advantage that becomes table stakes in 12 months.
3. Benchmark memory quality, not just retrieval speed. The IBM results suggest that the quality of what agents remember matters more than how fast they retrieve it. Curated, distilled memory beats raw log search.
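The first takeaway, designing for scarcity, can be sketched as a budgeted prompt builder: rank memories by relevance and pack them until the token budget runs out. Everything here is an assumption for illustration: `estimate_tokens` counts whitespace-separated words rather than using a real tokenizer, the relevance scores are made up, and `build_prompt` is a hypothetical helper.

```python
def estimate_tokens(text: str) -> int:
    # Crude assumption: about one token per whitespace-separated word.
    return len(text.split())


def build_prompt(task: str, memories: list[tuple[float, str]], budget: int) -> str:
    """Greedily pack the highest-scoring memories that fit within the budget."""
    used = estimate_tokens(task)
    chosen = []
    for _score, text in sorted(memories, key=lambda m: m[0], reverse=True):
        cost = estimate_tokens(text)
        if used + cost > budget:
            continue  # too big for what is left; try the next memory
        chosen.append(text)
        used += cost
    # Memories first, task last, so the task sits closest to the response.
    return "\n".join(chosen + [task])


task = "Fix the failing login test"
memories = [
    (0.9, "Reset the test database before auth tests"),
    (0.7, " ".join(["log"] * 50)),  # a raw, oversized log excerpt
    (0.4, "Prefer small focused commits"),
]
prompt = build_prompt(task, memories, budget=20)
print(prompt)
```

With a budget of 20 the raw log excerpt is dropped even though it scores higher than the last memory, which is the behavior a context-efficient agent wants: short distilled lessons win over bulky history.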
The MCP Angle
Worth noting: the Model Context Protocol conversation this week has been dominated by ergonomics complaints, not capability gaps. As @pamelafox joked, "MCP was pronounced dead on Twitter, after mass exposure to curl." But @llama_index's take cuts deeper: MCP tools are strong for deterministic, centrally maintained APIs with rapidly changing ground truth. Skills are lighter but more failure-prone.
The connection to memory is direct: MCP is becoming the transport layer for agent memory systems. New web MCP support in Chrome v146 enables agents to continuously browse and compile contextual summaries. The protocol isn't dead—it's being repurposed from "tool calling" to "memory plumbing."
Bottom Line
Context windows have hit their physical ceiling. Agent memory is the next scaling dimension. The projects and research trending this week aren't just incremental improvements. They're building the retention layer that the entire agent ecosystem will depend on.
The question isn't "how much context do you have." It's "how much do you remember."
Sources: Latent Space AINews (3/13/2026), GitHub Trending (3/8-14/2026), @dair_ai, @llama_index, @pamelafox, Doug O'Laughlin interview via Latent Space podcast.