Zed Editor Architecture / Chapter 1

GitHub Trending /

Ch.1: Foundations

# The Harness You Can't See > The agent harness — not the LLM — is the primary determinant of what an agent can actually do. Claw Code and DeerFlow prove this from opposite starting points. This series will show you why the scaffolding matters more than the brain, and how to choose between two fundamentally different philosophies of building it. --- Three years ago, building an "AI agent" meant wrapping an API call in a while loop and calling it a day. You sent a prompt, got a response, maybe looped a few times if you were feeling ambitious. The LLM was the product. Everything else was plumbing. Today, the plumbing has become the product. An agent is no longer a single model call — it's a system of systems: a runtime that manages turns, a permission layer that constrains actions, a memory system that persists knowledge across sessions, a tool registry that determines what the agent can reach, and an orchestration layer that decides when to delegate, when to retry, and when to stop. The model is a component. The harness is the architecture. This series is a comparative deep-dive into two open-source agent harnesses — **Claw Code** and **DeerFlow** — that sit at opposite ends of the design spectrum. Claw Code (`ultraworkers/claw-code`) is a minimalist Rust CLI agent with six built-in tools, a five-tier permission hierarchy, and an explicit philosophy that less is more. DeerFlow (`bytedance/deer-flow`) is a full-stack Python super-agent platform with twenty-one middleware layers, semantic memory, sub-agent delegation, and a "batteries included" philosophy backed by ByteDance. If you're here for a winner, you'll be disappointed. These projects are not competitors — they are divergent solutions to the same fundamental problem: *how to give an LLM structured access to the world without losing control.* The choice between them tells you more about your operational context — your team size, your risk tolerance, your deployment environment — than about the projects themselves. Here is the core argument of this series in one paragraph, which I will spend the remaining chapters proving: > **The harness is the invisible determinant of agent capability. Claw Code and DeerFlow represent two fundamentally different philosophies of agent infrastructure — one a minimalist, safety-first CLI harness, the other a full-stack, batteries-included platform. The choice between them reveals more about your operational context than about the projects themselves. Beneath the surface difference in scale, both projects face the same irreducible problems: context management, memory persistence, safe tool execution, and autonomous decision-making. How they solve these problems — and what they trade off in the process — is the subject of these fourteen chapters.** ## A Brief History of Agent Architecture The evolution of agent infrastructure has followed a recognizable pattern. In 2023, the dominant pattern was the single-prompt agent: a system prompt, a loop, and a prayer. Projects like AutoGPT demonstrated what was possible but also revealed the limits — context windows filled, loops went infinite, tools were called with no safety net. By 2024, the industry had converged on a basic agent loop pattern: user input → model call → tool execution → result injection → repeat. LangChain and similar frameworks codified this pattern. The question shifted from "can we build an agent loop?" to "how do we build one that doesn't burn money, leak data, or loop forever?" 2025 brought the realization that the agent loop itself was not enough. You needed: - **Context management**: the system prompt is not a dumping ground. Every token has a cost. - **Memory**: the agent should remember what it learned across sessions, not start from zero every time. - **Safety**: the agent needs guardrails, not just prompt instructions. - **Observability**: you can't debug black-box agent behavior. - **Extensibility**: the harness must grow with the agent's capabilities. Both Claw Code and DeerFlow are responses to these demands. They a

Chapter 1 of 5 12 min Article 9 min audio 3 min video Learning path

The Harness You Can't See

The agent harness — not the LLM — is the primary determinant of what an agent can actually do. Claw Code and DeerFlow prove this from opposite starting points. This series will show you why the scaffolding matters more than the brain, and how to choose between two fundamentally different philosophies of building it.

---

Three years ago, building an "AI agent" meant wrapping an API call in a while loop and calling it a day. You sent a prompt, got a response, maybe looped a few times if you were feeling ambitious. The LLM was the product. Everything else was plumbing.

Today, the plumbing has become the product. An agent is no longer a single model call — it's a system of systems: a runtime that manages turns, a permission layer that constrains actions, a memory system that persists knowledge across sessions, a tool registry that determines what the agent can reach, and an orchestration layer that decides when to delegate, when to retry, and when to stop. The model is a component. The harness is the architecture.

This series is a comparative deep-dive into two open-source agent harnesses — Claw Code and DeerFlow — that sit at opposite ends of the design spectrum. Claw Code (ultraworkers/claw-code) is a minimalist Rust CLI agent with six built-in tools, a five-tier permission hierarchy, and an explicit philosophy that less is more. DeerFlow (bytedance/deer-flow) is a full-stack Python super-agent platform with twenty-one middleware layers, semantic memory, sub-agent delegation, and a "batteries included" philosophy backed by ByteDance.

If you're here for a winner, you'll be disappointed. These projects are not competitors — they are divergent solutions to the same fundamental problem: *how to give an LLM structured access to the world without losing control.* The choice between them tells you more about your operational context — your team size, your risk tolerance, your deployment environment — than about the projects themselves.

Here is the core argument of this series in one paragraph, which I will spend the remaining chapters proving:

The harness is the invisible determinant of agent capability. Claw Code and DeerFlow represent two fundamentally different philosophies of agent infrastructure — one a minimalist, safety-first CLI harness, the other a full-stack, batteries-included platform. The choice between them reveals more about your operational context than about the projects themselves. Beneath the surface difference in scale, both projects face the same irreducible problems: context management, memory persistence, safe tool execution, and autonomous decision-making. How they solve these problems — and what they trade off in the process — is the subject of these fourteen chapters.

A Brief History of Agent Architecture

The evolution of agent infrastructure has followed a recognizable pattern. In 2023, the dominant pattern was the single-prompt agent: a system prompt, a loop, and a prayer. Projects like AutoGPT demonstrated what was possible but also revealed the limits — context windows filled, loops went infinite, tools were called with no safety net.

By 2024, the industry had converged on a basic agent loop pattern: user input → model call → tool execution → result injection → repeat. LangChain and similar frameworks codified this pattern. The question shifted from "can we build an agent loop?" to "how do we build one that doesn't burn money, leak data, or loop forever?"

2025 brought the realization that the agent loop itself was not enough. You needed:

  • Context management: the system prompt is not a dumping ground. Every token has a cost.
  • Memory: the agent should remember what it learned across sessions, not start from zero every time.
  • Safety: the agent needs guardrails, not just prompt instructions.
  • Observability: you can't debug black-box agent behavior.
  • Extensibility: the harness must grow with the agent's capabilities.

Both Claw Code and DeerFlow are responses to these demands. They arrived at different architectural answers — Claw through a rigorously minimal Rust runtime, DeerFlow through a sprawling Python middleware chain — but they are asking the same questions.

flowchart TD
    subgraph "The Agent Harness Stack"
        LLM[LLM Provider] --> Runtime[Agent Runtime / Loop]
        Runtime --> Context[Context Engineering]
        Runtime --> Memory[Memory System]
        Runtime --> Tools[Tool Registry]
        Runtime --> Safety[Safety & Permissions]
        Runtime --> Orchestration[Multi-Agent Coordination]
        Context --> SystemPrompt[System Prompt Assembly]
        Context --> Compression[Window Compression]
        Memory --> Persistence[Session / Fact Storage]
        Tools --> Builtin[Built-in Tools]
        Tools --> MCP[MCP / Extensions]
        Safety --> Guardrails[Guardrails / Audit]
        Safety --> Sandbox[Execution Isolation]
        Orchestration --> SubAgents[Sub-Agent Delegation]
    end

    style LLM fill:#e1d5e7,stroke:#333
    style Runtime fill:#dae8fc,stroke:#333
    style Context fill:#d5e8d4,stroke:#333
    style Memory fill:#d5e8d4,stroke:#333
    style Tools fill:#d5e8d4,stroke:#333
    style Safety fill:#d5e8d4,stroke:#333
    style Orchestration fill:#d5e8d4,stroke:#333

This diagram is the roadmap. Each component — context, memory, tools, safety, orchestration — will be examined through the lens of both projects. The architecture above is not Claw's or DeerFlow's specifically; it is the universal agent harness stack that both projects (and every serious agent platform) must implement. The differences lie in *how* they implement each layer and *what they trade off* in the process.

Why Two Harnesses?

The choice of case studies is deliberate. Claw Code and DeerFlow are not random samples — they are archetypes:

  • Claw Code represents the Unix philosophy applied to agent infrastructure: do one thing well, compose through protocols. It emerged from the UltraWorkers community, a loose collective of developers exploring autonomous software development. Its PHILOSOPHY.md states bluntly: "the code is evidence; the coordination system is the product lesson." The Rust implementation is clean, explicit, and disciplined. The six built-in tools — Bash, Read, Write, Edit, Glob, Grep — form a minimal but complete set of primitives.
  • DeerFlow represents the "batteries included" philosophy: ship everything, let the user compose. Built by ByteDance's Volcengine team, it started as a Deep Research framework and evolved into a super-agent platform with sub-agents, memory, sandboxes, skills, MCP support, and IM channel integrations. The architecture is layered, the middleware chain is 21 deep, and the feature list runs for pages.

These are not just different implementations — they are different *theories of the user*. Claw trusts the user to know what they need and provides sharp, safe tools. DeerFlow trusts the platform to anticipate what the user will need and provides pre-built solutions.

What This Series Is (and Isn't)

This series is a comparative architecture analysis. Each chapter examines one layer of the agent harness stack through both projects' implementations. The goal is not to declare a winner but to build a mental model of what an agent harness *must* do and how design choices ripple through the system.

The chapters ahead:

| Chapter | Topic | |---------|-------| | E01 | The design philosophies and organizational origins | | E02 | Core architecture: ConversationRuntime<C,T> vs LangGraph middleware chain | | E03 | Context engineering: system prompts, dynamic injection, prefix caching | | E04 | Context window compression: summary tags vs summarization middleware | | E05 | Autonomous decision-making: policy engines vs sub-agent delegation | | E06 | Memory: session persistence vs fact extraction | | E07 | Summary: the cognitive operating system in practice | | E08 | Tool systems: six primitives vs extensible registry | | E09 | Safety: permission hierarchy vs guardrail middleware | | E10 | Multi-agent: approval tokens vs sub-agent concurrency | | E11 | Extensibility: MCP protocol and ecosystem | | E12 | Summary: the decision tree for practitioners | | E13 | Blind spots: honest limitations of both | | E14 | Conclusion: the cognitive map |

A Note on Methodology

The analysis that follows is grounded in source code, not documentation. Every claim about Claw Code is traceable to specific Rust source files in rust/crates/runtime/src/conversation.rs for the agent loop, compact.rs for compression, permissions.rs for the authorization model. Every claim about DeerFlow is traceable to backend/packages/harness/deerflow/agents/lead_agent/agent.py for the factory, agents/middlewares/ for the chain, memory/updater.py for fact extraction.

I have read both codebases line by line across their critical subsystems. The digests run to thousands of words each and cover every component discussed in this series. When I report a design decision, I am reporting what the code *does*, not what the README *says*.

---

References:

  • Claw Code PHILOSOPHY.md — project intent and system-design framing
  • Claw Code concept.md — Russian-language concept document with architectural principles
  • DeerFlow README.md — project overview, feature set, architecture summary
  • DeerFlow backend/CLAUDE.md — technical architecture and development guidelines
  • DeerFlow backend/docs/ARCHITECTURE.md — system architecture reference

---

The most important part of an agent is invisible. You can't see the harness in the output — you only see the consequences when it fails. A safety boundary you never hit, a context window you never fill, a session you can resume without losing state — these are the signs of good harness design. In the chapters ahead, I'll show you how two very different projects achieve this invisibility, and why their approaches lead to dramatically different trade-offs. The paradox is this: both are right, and both are wrong. Which one works for you depends on what you're trying to build.