Daily AI Research Briefing — March 17, 2026
An analysis of harness engineering research: three major studies converge on environment design as the primary driver of agent performance.
🔬 Today's Focus: Harness Engineering
Three major efforts from 2024-2026 suggest that interface design, more than raw model capability, drives agent performance.
Key Findings
SWE-agent (Princeton NLP): 64% relative improvement from interface changes alone. The interface caps search output at 50 results, uses a stateful file viewer that shows 100 lines at a time, integrates a linter, and compresses context.
Anthropic: A two-agent architecture carries work across context windows. An initializer agent creates the scaffolding, and a coding agent executes against it. Explicit feature lists prevent fake-done errors, where the agent declares the work finished prematurely.
OpenAI Codex team: 1M lines of code and 1,500 PRs from 3 engineers, or 3.5 PRs per engineer per day. The repository serves as the system of record, and linters enforce conventions mechanically.
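The two SWE-agent interface ideas with concrete numbers, capped search and a stateful viewer, can be sketched in a few lines. This is a minimal illustration, not SWE-agent's actual code: the names capped_search and FileViewer are hypothetical, and only the limits (50 results, 100 lines) come from the findings above.

```python
from dataclasses import dataclass

MAX_RESULTS = 50  # search cap quoted in the briefing
WINDOW = 100      # viewer window size quoted in the briefing


def capped_search(lines, term, max_results=MAX_RESULTS):
    """Return matching lines, or a short notice when there are too many.

    Refusing to dump a huge result set keeps the agent's context small
    and nudges it toward a narrower query.
    """
    hits = [(i + 1, text) for i, text in enumerate(lines) if term in text]
    if len(hits) > max_results:
        return f"{len(hits)} matches for {term!r}; narrow the query (limit {max_results})."
    return "\n".join(f"{n}: {text}" for n, text in hits)


@dataclass
class FileViewer:
    """Hypothetical stateful viewer: remembers its position between calls."""
    lines: list
    top: int = 0          # zero-based index of the first visible line
    window: int = WINDOW

    def _last_top(self):
        # Highest valid value for `top` so the window stays inside the file.
        return max(len(self.lines) - self.window, 0)

    def view(self):
        end = min(self.top + self.window, len(self.lines))
        header = f"[lines {self.top + 1}-{end} of {len(self.lines)}]"
        return header + "\n" + "\n".join(self.lines[self.top:end])

    def scroll_down(self):
        self.top = min(self.top + self.window, self._last_top())
        return self.view()

    def goto(self, line_number):
        # Place the requested one-based line at the top of the window.
        self.top = max(0, min(line_number - 1, self._last_top()))
        return self.view()
```

The design point is that both tools bound how much text a single action can return, so the agent always sees a predictable, small slice of the repository along with a header telling it where it is.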
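The feature-list idea from the Anthropic finding can be sketched as a small file-backed checklist: the initializer writes every feature as not yet passing, and the coding agent may only report completion when nothing remains. This is an assumed shape for illustration; the file layout, the "passes" field, and all function names here are hypothetical rather than taken from Anthropic's implementation.

```python
import json
from pathlib import Path


def initialize(path, feature_names):
    """Initializer step: record every feature as not yet passing."""
    features = [{"name": name, "passes": False} for name in feature_names]
    Path(path).write_text(json.dumps(features, indent=2))


def next_feature(path):
    """Return the first feature that is not passing, or None if all pass."""
    for feature in json.loads(Path(path).read_text()):
        if not feature["passes"]:
            return feature["name"]
    return None


def mark_passing(path, name):
    """Coding-agent step: flip one feature to passing after it is verified."""
    features = json.loads(Path(path).read_text())
    for feature in features:
        if feature["name"] == name:
            feature["passes"] = True
    Path(path).write_text(json.dumps(features, indent=2))


def is_done(path):
    """Completion is read from the file, not from the agent's own claim."""
    return next_feature(path) is None
```

Because the list lives on disk, a fresh session with an empty context window can call next_feature to resume where the previous one stopped.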
📋 The 7-Layer Taxonomy
Human Oversight → Spec Tools → Lifecycle Platforms → Task Runners → Orchestrators → Frameworks/Runtimes → Coding Agents. The bottom layer (Coding Agents, layer 7) is a commodity; layers 1-6 determine effectiveness.
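For readers who want to tag tools by layer, the taxonomy maps naturally onto an ordered enumeration. The sketch below is illustrative only: the Layer class and harness_layers helper are hypothetical names, and the numbering assumes Human Oversight is layer 1 and Coding Agents is layer 7, as the ordering above implies.

```python
from enum import IntEnum


class Layer(IntEnum):
    """The seven layers, numbered top (1) to bottom (7)."""
    HUMAN_OVERSIGHT = 1
    SPEC_TOOLS = 2
    LIFECYCLE_PLATFORMS = 3
    TASK_RUNNERS = 4
    ORCHESTRATORS = 5
    FRAMEWORKS_RUNTIMES = 6
    CODING_AGENTS = 7


def harness_layers():
    """Layers 1-6: everything above the commodity coding-agent layer."""
    return [layer for layer in Layer if layer < Layer.CODING_AGENTS]
```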