When Teams, When Subs, When to Mix Them

Agent teams are the default. Sub-agents are the override. Hybrid is not a power-user feature; it is the answer to one specific data-flow problem. The SKILL.md decision tree steers you away from the default in exactly two cases: when there is only one agent, and when the agents do not need to communicate. Getting the override wrong costs tokens without buying quality.

The single most consequential sentence in skills/harness/SKILL.md is short: *"에이전트 팀이 최우선 기본값이다."* (Agent teams are the highest-priority default.) It appears in Phase 2-1, immediately under the execution-mode comparison table. Most readers absorb it as marketing. It is not. It is a rule with operational consequences.

The rule does not say "agent teams are always correct." It says they are the *default*. The override exists, and the SKILL.md gives you a decision tree for using it. The decision is not about which mode is more powerful. It is about which mode matches the data flow your task actually has.

The argument of this chapter: the execution mode decision is a data-flow decision, and getting it right is the difference between a team that costs tokens and a team that produces insight.

Key Takeaways

  • Two agent-team constraints drive the decision tree: one team per session, no nested teams.
  • Sub-agents are the right mode when the cost of team coordination exceeds the value of cross-agent communication. Three scenarios fit.
  • Hybrid mode is structurally about reconciling parallel collection with consensus-building in adjacent phases.
  • A session can run more than one team, but only sequentially: delete the current team (TeamDelete), then create the next. The earlier team's artifacts persist in _workspace/.
  • The orchestrator's pattern is the bridge: agent-team mode calls TeamCreate/SendMessage/TaskCreate; sub-agent mode calls the Agent tool directly. Mixing them mid-phase requires explicit transition rules.

The decision tree, exactly

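The decision tree from agent-design-patterns.md is short enough to quote in full (translated here from the Korean):

Are there two or more agents?
├── Yes → Do the agents need to communicate with each other?
│         ├── Yes → Agent team (default)
│         │         Cross-validation, shared discoveries, and real-time feedback improve quality.
│         │
│         └── No → Sub-agents are also an option
│                  Generate-and-verify, expert pool, and similar cases that only need results passed along.
│
└── No (one) → Sub-agent
              A single agent needs no team setup.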

The question the tree asks is *"do the agents need to communicate?"* If yes, agent team. If no, sub-agent is permitted. The framing matters: communication need is a property of the *task*, not the *preference*. A user who wants team mode for a task where sub-agents suffice is paying coordination cost for no gain.
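The tree can be written down as a small function. This is a minimal sketch, not code from Harness: the `Mode` enum and `choose_mode` are hypothetical names, and the "sub-agents are also an option" branch is resolved to sub-agents here for simplicity, even though the tree only permits that choice rather than requiring it.

```python
from enum import Enum


class Mode(Enum):
    # Hypothetical labels for the two execution modes named in the text.
    AGENT_TEAM = "agent_team"
    SUB_AGENT = "sub_agent"


def choose_mode(agent_count: int, needs_communication: bool) -> Mode:
    """Illustrative encoding of the quoted decision tree."""
    if agent_count < 1:
        raise ValueError("at least one agent is required")
    if agent_count == 1:
        # Single agent: no team setup is needed.
        return Mode.SUB_AGENT
    if needs_communication:
        # Default branch: agents cross-check and share findings.
        return Mode.AGENT_TEAM
    # The tree says sub-agents are *permitted* here; this sketch takes
    # the permitted option rather than paying coordination cost.
    return Mode.SUB_AGENT
```

Note that the second argument describes the task, not the user's preference, which is the point the tree is making.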

For Fan-out/Fan-in, the answer to "do they need to communicate?" is almost always yes (Chapter 2). For Expert Pool, the answer is almost always no — the router classifies, calls one specialist, the specialist returns to the router, and that is the whole flow. The specialist does not need to consult the other specialists.

That is why Expert Pool is the canonical sub-agent pattern. The router does not need team coordination overhead; the specialist does not need to know about its peers.

The two hard constraints on team mode

The spec is explicit that agent-team mode has two structural constraints you cannot ignore:

1. One team per session. A session can have at most one active team. To switch teams, you TeamDelete the first, then TeamCreate the second. The first team's artifacts persist in _workspace/, so the second team can read them. But two teams cannot be active simultaneously.
2. No nested teams. A team member cannot create its own sub-team. If you want a two-level hierarchy, you flatten it: the second level is a team member with a supervisor role, not a nested team.

These constraints shape the execution-mode decision more than the README suggests. They explain why some workflows that look like candidates for team mode actually call for sub-agents:

  • Long-running agentic workflows that need to spawn additional specialists dynamically — these cannot use team mode (no nesting). They must use sub-agents with run_in_background: true, or build the dynamic logic into the orchestrator as a router.
  • Projects that need multiple parallel teams — these cannot all be in team mode at once. The factory's solution is Phase-by-Phase reconfiguration: Phase 1 runs in team A, TeamDelete, Phase 2 runs in team B, TeamDelete, Phase 3 runs in team C. Each team's work survives in _workspace/ and is Read by the next.
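The two constraints and the phase-by-phase workaround can be modeled in a few lines. The sketch below is a toy model only: `Session`, `TeamError`, `team_create`, `team_delete`, and `run_phases` are hypothetical Python names that stand in for the TeamCreate and TeamDelete tools, and a dictionary stands in for the _workspace/ directory.

```python
from typing import Dict, List, Optional, Tuple


class TeamError(RuntimeError):
    """Raised when a team-mode constraint would be violated."""


class Session:
    """Toy model of one session; not the real tool implementation."""

    def __init__(self) -> None:
        self.active_team: Optional[str] = None
        self.workspace: Dict[str, str] = {}  # stands in for _workspace/

    def team_create(self, name: str, caller_is_member: bool = False) -> None:
        if caller_is_member:
            # Constraint 2: no nested teams.
            raise TeamError("a team member cannot create a team")
        if self.active_team is not None:
            # Constraint 1: one active team per session.
            raise TeamError(f"delete {self.active_team!r} first")
        self.active_team = name

    def team_delete(self) -> None:
        # Only the team goes away; workspace artifacts are left in place.
        self.active_team = None


def run_phases(session: Session, phases: List[Tuple[str, str]]) -> None:
    """Run (team_name, artifact_name) phases one team at a time."""
    for team_name, artifact in phases:
        session.team_create(team_name)
        session.workspace[artifact] = f"written by {team_name}"
        session.team_delete()
```

Calling `run_phases(session, [("team-a", "01_a.md"), ("team-b", "02_b.md")])` leaves both artifacts in the workspace with no team active, which is the shape of the reconfiguration described above.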

Now take a research task. You start Phase 1 with a pile of URLs. Phase 2 fans out across them. Phase 3 must reconcile the parallel outputs into a single coherent finding. Which mode? The answer is hybrid: sub-agents for Phase 2 (independent collection, no cross-talk needed) and a team for Phase 3 (reconciliation requires debate, consensus, and challenge).

That is the use case the spec calls out for hybrid mode. It is not a power-user feature. It is the answer when adjacent phases have different data-flow shapes.

Hybrid mode, structurally

```mermaid
flowchart LR
    subgraph SESSION["One Claude Code Session"]
        direction TB
        P2["Phase 2<br/>Parallel Collection<br/>SUB-AGENTS<br/>(run_in_background: true)"]
        P3["Phase 3<br/>Consensus Integration<br/>AGENT TEAM<br/>(TeamCreate / SendMessage)"]
        P4["Phase 4<br/>Independent Verification<br/>SUB-AGENT<br/>(QA reads integrated output)"]
    end

    P2 -->|files in _workspace/02_*| P3
    P3 -->|files in _workspace/03_integrated.md| P4
    P4 -->|grading.json| DONE["Final Output"]

    classDef team fill:#fde68a,stroke:#b45309;
    classDef sub fill:#bae6fd,stroke:#0369a1;
    class P2,P4 sub;
    class P3 team;
```

The orchestrator-template.md §템플릿 C (Template C) specifies three common hybrid compositions:

1. Parallel collection (sub) → consensus integration (team): the case above. Phase 2 fans out with sub-agents; Phase 3 creates a team to reconcile.
2. Team generation → sub-agent verification: Phase 2 builds an artifact as a team; Phase 3 sends the artifact to a single QA sub-agent for independent verification.
3. Phase-by-phase team reconfiguration: Phase 1 runs in team A, Phase 2 in team B (after TeamDelete), Phase 3 in team C. The artifact chain persists through _workspace/.

The common element in all three is *data-flow shape*. The mode changes when the data flow changes. A team that needs cross-agent communication in one phase and isolation in the next is a hybrid. A team whose data flow is constant is a single mode.
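One way to see that the mode follows the data flow is to declare a mode per phase and derive the team-lifecycle calls from the plan. This is a sketch under two assumptions that are not taken from the spec: a team is deleted before any following phase starts, and consecutive team phases each get a fresh team. `Phase` and `boundary_calls` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Phase:
    name: str
    mode: str  # "team" or "sub"


def boundary_calls(phases: List[Phase]) -> List[str]:
    """List the TeamCreate/TeamDelete calls implied by a phase plan."""
    calls: List[str] = []
    team_active = False
    for phase in phases:
        if phase.mode not in ("team", "sub"):
            raise ValueError(f"unknown mode for {phase.name}: {phase.mode}")
        if team_active:
            # Assumption: the previous team is always torn down first.
            calls.append(f"TeamDelete before {phase.name}")
            team_active = False
        if phase.mode == "team":
            calls.append(f"TeamCreate for {phase.name}")
            team_active = True
    return calls


hybrid_plan = [
    Phase("collect", "sub"),
    Phase("integrate", "team"),
    Phase("verify", "sub"),
]
print(boundary_calls(hybrid_plan))
```

A plan whose phases all share one mode produces no create/delete pairs in the middle for sub-agents, while the all-team plan of composition 3 produces a delete and a create at every boundary.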

Sub-agents are not just "smaller teams"

A common misreading is that sub-agents are a watered-down version of agent teams. The spec is explicit that this is wrong. Sub-agents have a different data-flow profile:

| Property | Agent team | Sub-agent |
|----------|------------|-----------|
| Cross-agent communication | Yes (SendMessage) | No |
| Shared task list | Yes (TaskCreate/TaskUpdate) | No |
| Result return target | Each team member to the leader | Sub-agent to main only |
| Cost | Higher (more tokens) | Lower |
| Speed | Slower (coordination overhead) | Faster (no overhead) |
| Suitable for | Cross-domain discovery, peer review | Independent tasks, single review |

Sub-agents are *not* a lesser team. They are a different tool, suited to a different shape of work. The decision is not "which is better?" — it is "which matches my data flow?"

For the QA agent in Phase 4, sub-agent mode is right: one QA agent reads the integrated output, runs the assertions, and returns a grading.json. There is no benefit to peer review of the QA result within the same phase, because the QA is itself the verification. Team mode would add tokens without changing the answer.

For Phase 3 in the research hybrid, agent team is right: the reconcilers need to challenge each other's interpretations of conflicting data. A sub-agent could read the four research outputs and produce a reconciliation, but it would lack the cross-challenge that produces a better synthesis.

The cost of getting it wrong

The two wrong-mode choices are:

  • Team mode where sub-agents suffice. You pay the coordination cost (task list updates, message exchanges, idle notifications) for no discovery benefit. The team produces the same output a sub-agent would have, with 3-5x the token consumption.
  • Sub-agents where team mode is needed. You lose the cross-discovery loop. In research, this is the loss of "researcher A finds something that redirects researcher B's investigation." In code review, this is the loss of "security reviewer flags an issue that turns out to be a performance issue." The output looks finished but is missing the cross-domain signal.

The first wrong choice wastes money. The second wrong choice ships a defective output. Both are silent failures — they do not error, they just cost more or produce less.

A four-line mental test

For any new project, run this test against the dominant phase:

1. Does the dominant phase have *cross-domain discovery*? If yes, team mode.
2. Does the dominant phase have *independent outputs that get reconciled later*? If yes, sub-agents in this phase; team mode in the reconciliation phase.
3. Does the dominant phase have *a single deliverable from a single expert*? If yes, sub-agents.
4. Does the dominant phase have *producer-reviewer loops*? If yes, team mode (with retry cap from Chapter 2).

If the answers do not fit one mode, hybrid. If they do, single mode. The test takes a minute and saves tokens. It is also the test Harness's own decision tree encodes; the four-line form just makes it easier to apply at planning time.
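The four questions translate into a lookup plus one rule: if the "yes" answers point at more than one mode, the result is hybrid. The sketch below is illustrative; `Trait`, `MODES_BY_TRAIT`, and `mental_test` are hypothetical names, and question 2 is modeled as implying both modes because it assigns sub-agents to one phase and a team to the next.

```python
from enum import Enum
from typing import Dict, Set


class Trait(Enum):
    CROSS_DOMAIN_DISCOVERY = 1
    INDEPENDENT_OUTPUTS_RECONCILED_LATER = 2
    SINGLE_EXPERT_DELIVERABLE = 3
    PRODUCER_REVIEWER_LOOP = 4


# Modes implied by a "yes" to each question of the mental test.
MODES_BY_TRAIT: Dict[Trait, Set[str]] = {
    Trait.CROSS_DOMAIN_DISCOVERY: {"team"},
    Trait.INDEPENDENT_OUTPUTS_RECONCILED_LATER: {"sub", "team"},
    Trait.SINGLE_EXPERT_DELIVERABLE: {"sub"},
    Trait.PRODUCER_REVIEWER_LOOP: {"team"},
}


def mental_test(traits: Set[Trait]) -> str:
    """Return "team", "sub", or "hybrid" for the traits answered yes."""
    if not traits:
        raise ValueError("answer at least one question with yes")
    modes: Set[str] = set()
    for trait in traits:
        modes |= MODES_BY_TRAIT[trait]
    # More than one implied mode means the answers do not fit one mode.
    return "hybrid" if len(modes) > 1 else next(iter(modes))
```

For the research example earlier in the chapter, answering yes to question 2 alone already yields "hybrid", which matches the sub-agents-then-team split.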

What this changes about Chapter 2

The six patterns in Chapter 2 had a column called "Team-mode verdict." That column is now load-bearing. Each pattern's spec includes its mode coupling:

  • Fan-out/Fan-in: team mode required.
  • Expert Pool: sub-agents sufficient.
  • Pipeline: team mode limited.
  • Producer-Reviewer: team mode useful.
  • Supervisor: team mode useful.
  • Hierarchical Delegation: team mode limited (and flat at two levels).

The execution-mode decision is not independent of the pattern choice. They are coupled. Picking the pattern without picking the mode produces a broken factory output. The factory's discipline is to pick both in Phase 2, in a fixed order: mode first in Phase 2-1 (because it determines the tool set), then pattern in Phase 2-2 (because it determines the team composition).
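The coupling can be checked mechanically at planning time. The following is a hypothetical sketch: the verdict strings paraphrase the "Team-mode verdict" column as summarized above, and `TEAM_VERDICT` and `check_coupling` are illustrative names rather than anything Harness ships.

```python
from typing import Dict

# Paraphrased team-mode verdicts for the six Chapter 2 patterns.
TEAM_VERDICT: Dict[str, str] = {
    "fan-out/fan-in": "required",
    "expert-pool": "sub-agent sufficient",
    "pipeline": "limited",
    "producer-reviewer": "useful",
    "supervisor": "useful",
    "hierarchical-delegation": "limited",
}


def check_coupling(pattern: str, mode: str) -> str:
    """Flag pattern/mode pairs that contradict the verdict column."""
    if mode not in ("team", "sub"):
        raise ValueError(f"unknown mode: {mode}")
    verdict = TEAM_VERDICT[pattern]  # KeyError for an unknown pattern
    if verdict == "required" and mode != "team":
        return "mismatch: this pattern is listed as team mode only"
    if verdict == "sub-agent sufficient" and mode == "team":
        return "warning: sub-agents suffice; team mode adds coordination cost"
    return f"ok: team-mode verdict is '{verdict}'"


print(check_coupling("expert-pool", "team"))
```

Only the two clear-cut verdicts produce a mismatch or warning here; the "useful" and "limited" cases are left as judgment calls and simply reported.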

Why one mode per phase, not per project

A natural question is: why not let each *agent* pick its mode? The spec's answer is operational, not philosophical. The orchestrator template needs to know the mode in advance, because:

  • Team mode requires TeamCreate calls in the orchestrator.
  • Sub-agent mode requires Agent tool calls with run_in_background: true.
  • Hybrid mode requires TeamDelete and TeamCreate at phase boundaries.

If the mode is implicit in each agent's prompt, the orchestrator cannot know when to call which tool. By forcing the mode to be a phase-level property, the orchestrator's code is straightforward. By forcing the pattern to be a project-level property (Phase 2-2), the agent composition is stable across phases.
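Declaring the mode per phase means the orchestrator can resolve its tool set before anything runs. A minimal sketch, assuming a plan expressed as a phase-to-mode mapping: `TOOLS_BY_MODE` and `tools_per_phase` are hypothetical names, while the tool names themselves are the ones this chapter attributes to each mode.

```python
from typing import Dict, List

# Tools the chapter associates with each mode.
TOOLS_BY_MODE: Dict[str, List[str]] = {
    "team": ["TeamCreate", "SendMessage", "TaskCreate"],
    "sub": ["Agent"],
}


def tools_per_phase(plan: Dict[str, str]) -> Dict[str, List[str]]:
    """Resolve each phase's declared mode to the tools it will call."""
    resolved: Dict[str, List[str]] = {}
    for phase, mode in plan.items():
        if mode not in TOOLS_BY_MODE:
            # An undeclared or per-agent mode cannot be planned up front.
            raise ValueError(f"{phase}: unknown mode {mode!r}")
        resolved[phase] = list(TOOLS_BY_MODE[mode])
    return resolved


print(tools_per_phase({"phase2": "sub", "phase3": "team", "phase4": "sub"}))
```

If the mode lived inside each agent's prompt instead, there would be no plan to pass to a function like this, which is the operational argument the spec makes.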

The factory's claim is that this constraint is the cost of clarity. You give up per-agent mode flexibility. You gain predictable orchestrator behavior. The trade is worth it for production teams, where the orchestrator's reliability is more valuable than the flexibility.

---

The factory has a pipeline (Chapter 1), a design vocabulary (Chapter 2), and an execution rule (this chapter). What it has not yet built is a smoke test for the most common failure class in multi-agent systems: bugs that live between skills, not within them. Chapter 4 is about that gap — and the seven boundary-mismatch bugs the QA agent guide was built to defend against.