Fixing the Blind Orchestrator: How sessions_send Replaced a Broken Multi-Agent Workflow

I told Opus to delegate a code search to the coder agent. The coder ran the search, found results, and announced them directly to Telegram. Opus had no idea any of that happened. It sat there waiting until I manually typed “coder’s back” to wake it up.

That’s the moment I realized the orchestrator was blind to its own subagent’s output.

The Auto-Announce Problem

OpenClaw’s multi-agent setup has Opus (the main orchestrator) delegating file operations and code tasks to a cheaper coder subagent. The original approach used sessions_spawn with mode: "run", which fires off the subagent and returns immediately.

The subagent does its work and auto-announces results directly to the user’s chat surface. Simple enough in theory. Two problems killed it in practice.

First, Telegram caps a single message at 4,096 characters, and longer output gets cut off. A code search that returns 20 results with file paths and context easily blows past that. The user sees half the output. The other half vanishes.
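To put rough numbers on that, here is a small sketch of what a hard cut at the limit does to a 20-result search. The helper name, the fake file paths, and the 300 characters of context per hit are all made up for illustration; only the 4,096 figure comes from the text above.

```javascript
// Per-message character cap discussed above.
const TELEGRAM_LIMIT = 4096;

// Hypothetical helper: models a delivery path that cuts text at the limit.
function truncateToLimit(text, limit = TELEGRAM_LIMIT) {
  return {
    delivered: text.slice(0, limit),
    lostChars: Math.max(0, text.length - limit),
  };
}

// Made-up search output: 20 hits, each a file path plus 300 characters of context.
const hits = Array.from({ length: 20 }, (_, i) =>
  `src/auth/module${i}.js:${i + 1}\n` + "x".repeat(300)
);
const output = hits.join("\n\n");

const { delivered, lostChars } = truncateToLimit(output);
console.log(`total=${output.length} delivered=${delivered.length} lost=${lostChars}`);
```

With these invented sizes the output runs past 6,000 characters, so roughly a third of it never reaches the user.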

Second, and worse: auto-announce doesn’t trigger a new orchestrator turn. Opus spawns the coder, immediately continues its own turn (with nothing to do), and then sits idle. The coder finishes, dumps results to Telegram, and Opus never sees them. The orchestrator is structurally blind to the work it delegated.

The user has to send a manual message to wake Opus up and tell it “coder finished, here are the results.” That defeats the entire purpose of delegation.
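The blind spot is easier to see in a toy model. Nothing below is an OpenClaw API: the classes, the "accepted" return value, and the fake result string are stand-ins I invented to mirror the behavior described above, where the spawn returns immediately and the result lands in chat instead of in the orchestrator's context.

```javascript
// Toy model only: these classes are illustrative stand-ins, not OpenClaw APIs.
class ChatSurface {
  constructor() { this.messages = []; }
  post(from, text) { this.messages.push({ from, text }); }
}

class Orchestrator {
  constructor(chat) {
    this.chat = chat;
    this.context = [];      // everything the orchestrator has seen
    this.pendingJobs = [];  // subagent work that finishes after the turn ends
  }

  // Models a fire-and-forget spawn: queue the job and return immediately.
  spawnRun(agentId, work) {
    this.pendingJobs.push({ agentId, work });
    return { status: "accepted" }; // invented acknowledgement shape
  }

  runTurn(userMessage) {
    this.context.push({ role: "user", text: userMessage });
    const ack = this.spawnRun("coder", () => "3 matches in src/auth");
    this.context.push({ role: "tool", text: JSON.stringify(ack) });
    // The turn ends here with nothing left to do.
  }

  // Models the subagent finishing later and auto-announcing to chat.
  drainJobs() {
    for (const { agentId, work } of this.pendingJobs) {
      this.chat.post(agentId, work());
    }
    this.pendingJobs = [];
  }
}

const chat = new ChatSurface();
const opus = new Orchestrator(chat);
opus.runTurn("search the code for auth patterns");
opus.drainJobs();

// The result reached the chat surface but never entered the orchestrator's context.
const sawResult = opus.context.some((m) => m.text.includes("3 matches"));
console.log(chat.messages.length, sawResult); // prints: 1 false
```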

The Wrong Fix

My first attempt was Discord thread bindings. The idea: spawn the coder in a Discord thread, let it post results there, and have Opus monitor the thread.

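// First attempt: bind the coder to a Discord thread (abandoned, see the limitations below)
sessions_spawn({
  agentId: "coder",
  task: "description",  // placeholder for the actual task text
  mode: "session",
  thread: true          // thread binding; only works when the parent session is on Discord
})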

Problems stacked up fast. Thread bindings only work when the parent session is on Discord. If the user is on Telegram, threads don’t exist. It also required polling via sessions_history to check for results, adding complexity and latency. More moving parts, more failure modes, more code to maintain.
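For a sense of what the polling side costs, here is a rough simulation. The post doesn't show the real return shape of sessions_history, so fetchHistory below is an invented stand-in, and no real sleeping happens: the wait is computed from the attempt count and a made-up 2-second interval.

```javascript
// Hypothetical stand-in for a history lookup. The real sessions_history
// return shape is not shown in this post, so this one is invented.
function makeFakeHistory(readyAfterPolls) {
  let polls = 0;
  return function fetchHistory() {
    polls += 1;
    return polls >= readyAfterPolls
      ? [{ role: "assistant", text: "search results" }]
      : [];
  };
}

// Poll until a result shows up or the attempt budget runs out.
// Each attempt stands for one extra tool call; waitedMs is computed, not slept.
function pollForResult(fetchHistory, { maxAttempts, intervalMs }) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const messages = fetchHistory();
    if (messages.length > 0) {
      return {
        result: messages[0].text,
        attempts: attempt,
        waitedMs: (attempt - 1) * intervalMs,
      };
    }
  }
  return { result: null, attempts: maxAttempts, waitedMs: maxAttempts * intervalMs };
}

const outcome = pollForResult(makeFakeHistory(4), { maxAttempts: 10, intervalMs: 2000 });
console.log(outcome); // attempts: 4, waitedMs: 6000
```

Even in this best case the orchestrator burns four tool calls and still needs its own give-up logic, which is the extra surface area the paragraph above is complaining about.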

The fix was more complicated than the problem.

The Actual Solution

The answer was already in the API. sessions_send blocks the orchestrator’s turn until the subagent responds, then returns the result inline. No auto-announce. No polling. No manual wake-up.

sessions_send({
  agentId: "coder",
  message: "Search the code index for authentication patterns. Summarize findings.",
  timeoutSeconds: 120
})
// Result comes back inline in same turn
// Opus processes it immediately and continues

The flow is linear:

  1. User asks Opus something that requires code scanning
  2. Opus calls sessions_send with the coder agent and a timeout
  3. Opus’s turn blocks. It waits.
  4. Coder spawns, executes the task, returns its result
  5. Result flows back to Opus inline, in the same turn
  6. Opus processes the result, formats a response, replies to the user

No intermediate messages. No manual pokes. No truncation of the coder's output on its way back. The user sees one clean response from Opus that incorporates the coder’s findings.
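The steps above can be modeled in a few lines. This is a sketch of the semantics, not of OpenClaw's internals: sessions_send is a tool the agent calls, and the sendAndWait and fakeCoder functions here are hypothetical stand-ins that reproduce the "block until reply or timeout" behavior with a plain Promise.race.

```javascript
// Hypothetical model: `runSubagent` stands in for whatever executes the coder.
// Resolves with the subagent's reply, or rejects once the timeout elapses.
function sendAndWait(runSubagent, message, timeoutSeconds) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`timed out after ${timeoutSeconds}s`)),
      timeoutSeconds * 1000
    );
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([runSubagent(message), timeout]).finally(() =>
    clearTimeout(timer)
  );
}

// Fake coder that answers after a short delay.
const fakeCoder = (message) =>
  new Promise((resolve) => setTimeout(() => resolve(`done: ${message}`), 50));

async function orchestratorTurn() {
  // The turn pauses here; the reply is available on the very next line.
  const reply = await sendAndWait(fakeCoder, "git log -5", 2);
  return `Coder says: ${reply}`;
}

orchestratorTurn().then(console.log); // prints: Coder says: done: git log -5
```

The point of the model is the await: the orchestrator's next line already has the reply, so there is nothing to announce and nothing to poll.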

Why This Works Better Than sessions_spawn

The difference comes down to who owns the result.

With sessions_spawn, the subagent owns delivery. It decides where results go (auto-announce to the user’s chat surface), and the orchestrator is out of the loop. The orchestrator delegated work but can’t act on the output.

With sessions_send, the orchestrator owns delivery. Results come back to Opus, which can filter, summarize, combine with other data, and present a coherent response. The user never sees raw subagent output unless Opus decides to pass it through.

Behavior               | sessions_spawn                  | sessions_send
-----------------------|---------------------------------|------------------------------
Blocking               | Returns immediately             | Waits for response
Result delivery        | Direct to user (auto-announce)  | Back to orchestrator (inline)
Orchestrator awareness | Blind to results                | Full visibility
Truncation risk        | High (platform message limits)  | None (internal transfer)
Manual wake-up needed  | Yes                             | No
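Here is one way "the orchestrator owns delivery" could look once the result is back inline. The hit shape (file, line, snippet), the five-hit cap, and the composeReply name are assumptions for the sketch. In practice Opus does this kind of filtering in its own reasoning rather than in code, so treat this as an illustration of the idea.

```javascript
// Hypothetical post-processing of the coder's inline result before replying.
// The hit shape { file, line, snippet } is invented for this sketch.
function composeReply(hits, { maxHits = 5, limit = 4096 } = {}) {
  const shown = hits.slice(0, maxHits);
  const lines = shown.map((h) => `${h.file}:${h.line}  ${h.snippet}`);
  const omitted = hits.length - shown.length;
  if (omitted > 0) lines.push(`(+${omitted} more matches not shown)`);
  const reply = lines.join("\n");
  // Final guard: the orchestrator's own reply still has to fit the platform limit.
  return reply.length <= limit ? reply : reply.slice(0, limit - 1) + "…";
}

// Made-up search result with 20 hits.
const hits = Array.from({ length: 20 }, (_, i) => ({
  file: `src/auth/handler${i}.js`,
  line: 10 + i,
  snippet: "verifyToken(req.headers.authorization)",
}));

console.log(composeReply(hits));
```

The user gets five representative lines and a count of the rest, instead of whatever fraction of the raw dump happened to fit in one message.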

Verified in Production

A test run confirmed the behavior:

  • Opus spawned coder subagent
  • Coder executed a git log command in a project directory
  • Opus blocked for approximately 6 seconds while coder worked
  • Coder returned results
  • Opus processed them in the same turn
  • No auto-announce fired to Telegram
  • No manual intervention required

The blocking behavior is visible in logs as agent.wait entries. Opus genuinely pauses its inference until the coder responds or the timeout expires.

Timeout Guidelines

Different tasks need different timeouts. Too short and the coder gets killed mid-task. Too long and a failed task hangs the orchestrator.

What I’ve settled on after testing:

  • Simple commands (git status, version checks): 30 seconds
  • File scanning and grep operations: 60 seconds
  • Code search queries: 120 seconds
  • Complex refactors or multi-file operations: 300 seconds

The timeout is a ceiling, not a target. Most tasks complete well under their limit. The coder usually returns in 5 to 15 seconds for standard operations.
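If you want those ceilings in one place, a lookup table is enough. The category keys and function names below are labels I made up; the numbers are the ones from the list above, and the returned object matches the argument shape of the sessions_send call shown earlier.

```javascript
// Timeout ceilings from the list above, in seconds. Keys are made-up labels.
const TIMEOUT_SECONDS = new Map([
  ["simpleCommand", 30],
  ["fileScan", 60],
  ["codeSearch", 120],
  ["complexRefactor", 300],
]);

// Look up the ceiling for a task kind; unknown kinds fail loudly
// rather than silently picking a default.
function timeoutFor(taskKind) {
  const seconds = TIMEOUT_SECONDS.get(taskKind);
  if (seconds === undefined) {
    throw new Error(`unknown task kind: ${taskKind}`);
  }
  return seconds;
}

// Build the argument object in the same shape as the call shown earlier.
function buildSendArgs(taskKind, message) {
  return { agentId: "coder", message, timeoutSeconds: timeoutFor(taskKind) };
}

console.log(buildSendArgs("fileScan", "grep -rn 'TODO' src/"));
```

Throwing on an unknown kind is a design choice for the sketch: it forces a conscious decision about the ceiling for each new task type.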

What This Means for Multi-Agent Design

The broader lesson: delegation without feedback is just fire-and-forget. If your orchestrator can’t see what its subagents produce, it’s not orchestrating. It’s guessing.

sessions_spawn still has its place for truly independent tasks where the user wants direct output and the orchestrator doesn’t need to process it. But for any workflow where the orchestrator needs to act on results, combine outputs, or make decisions based on subagent work, sessions_send is the correct primitive.

The pattern is simple: spawn-and-await beats spawn-and-hope.

Key Takeaways

  1. Auto-announce delivery breaks orchestrator workflows because the orchestrator never sees the results. Use sessions_send for delegated tasks where the orchestrator needs the output.
  2. Blocking semantics are a feature, not a limitation. The orchestrator waiting for results is exactly what you want in a sequential workflow.
  3. Platform message limits (Telegram’s 4,096 characters, Discord’s 2,000) become irrelevant when results stay internal between agents.
  4. Set timeouts based on task complexity: 30 seconds for simple commands, 60 for file scans, 120 for code searches, 300 for complex operations.
  5. Thread bindings and polling add complexity without solving the core problem. The simpler API call was the right answer all along.