Squad Squad

Response Modes

⚠️ Experimental — Squad is alpha software. APIs, commands, and behavior may change between releases.

Try this to get concise answers:

Respond in terse mode — just the facts

Try this to speed up simple tasks:

Use lightweight mode for quick fixes

Squad automatically picks the right response mode based on complexity — from instant direct answers (2s) to full multi-agent parallel work (60s). You can override anytime.


The Four Modes

Not every request needs the full agent machinery. Squad automatically selects a response mode based on the complexity of your message.

ModeTimeWhat HappensWhen Used
Direct~2–3sCoordinator answers without spawning an agentStatus checks, factual questions
Lightweight~8–12sOne agent, minimal prompt — skips charter, history, and decisionsSmall fixes, quick follow-ups
Standard~25–35sFull agent spawn with charter, history, and decisionsNormal work requests
Full~40–60sMultiple agents spawn in parallel, each with full contextComplex multi-domain tasks

Direct

The coordinator handles it alone — no sub-agent is spawned. This isn’t a response mode in the SDK sense; the coordinator answers the question itself using context it already has (team roster, decisions, history).

> What port does the server run on?
> Where are we on the auth work?
> Who's on the team?

Fast answers from context the coordinator already has. If the coordinator responds with DIRECT:, no agent session is created.

Lightweight

One agent is spawned with a reduced prompt — skips loading charter, history, and decisions to save time.

> Fix the typo in the README
> Add that missing import
> Update the version number

Good for small, well-defined tasks where full context isn’t needed.

Standard

Full agent spawn. The agent reads its charter, history, and team decisions before working.

> Build the user profile API endpoint
> Refactor the auth middleware
> Write tests for the payment module

This is the default mode for most work.

Full

Multiple agents spawn in parallel, each with full context. A design review ceremony may trigger first.

> Team, build the dashboard
> Rebuild the authentication system
> Implement the search feature end-to-end

Used for complex tasks that span multiple domains (frontend, backend, testing).


How Modes Are Selected

The coordinator picks the mode automatically based on:

  • Complexity of the request
  • Number of domains involved
  • Whether context is needed (history, decisions, skills)

You don’t need to specify a mode. When uncertain, the coordinator biases toward upgrading — it’s better to spend a few extra seconds loading context than to miss something.


Tips

  • If a response feels slow for a simple question, it’s likely using Standard when Direct would suffice. This is rare — the coordinator is good at picking the right mode.
  • “Team, …” prompts typically trigger Full mode.
  • Direct-named agent prompts (“Kane, …”) typically trigger Standard mode.
  • Response times depend on the Copilot platform. The numbers above are approximate.

Sample Prompts

force lightweight mode for this quick fix

Explicitly requests a reduced-context spawn for a simple task.

what port does the API run on?

Quick factual question that triggers Direct mode with no agent spawn.

Kane, do a thorough analysis of the auth system

Requests Standard mode with full context load for complex work.

what response mode was used for that last task?

Checks which mode the coordinator selected for the previous request.

Team, rebuild the authentication system end-to-end

Multi-domain prompt that triggers Full mode with parallel agent spawns.