Response Modes
⚠️ Experimental — Squad is alpha software. APIs, commands, and behavior may change between releases.
Try this to get concise answers:
Respond in terse mode — just the facts
Try this to speed up simple tasks:
Use lightweight mode for quick fixes
Squad automatically picks the right response mode based on complexity — from instant direct answers (2s) to full multi-agent parallel work (60s). You can override anytime.
The Four Modes
Not every request needs the full agent machinery. Squad automatically selects a response mode based on the complexity of your message.
| Mode | Time | What Happens | When Used |
|---|---|---|---|
| Direct | ~2–3s | Coordinator answers without spawning an agent | Status checks, factual questions |
| Lightweight | ~8–12s | One agent, minimal prompt — skips charter, history, and decisions | Small fixes, quick follow-ups |
| Standard | ~25–35s | Full agent spawn with charter, history, and decisions | Normal work requests |
| Full | ~40–60s | Multiple agents spawn in parallel, each with full context | Complex multi-domain tasks |
Direct
The coordinator handles it alone — no sub-agent is spawned. This isn’t a response mode in the SDK sense; the coordinator answers the question itself using context it already has (team roster, decisions, history).
> What port does the server run on?
> Where are we on the auth work?
> Who's on the team?
Fast answers from context the coordinator already has. If the coordinator responds with DIRECT:, no agent session is created.
Lightweight
One agent is spawned with a reduced prompt — skips loading charter, history, and decisions to save time.
> Fix the typo in the README
> Add that missing import
> Update the version number
Good for small, well-defined tasks where full context isn’t needed.
Standard
Full agent spawn. The agent reads its charter, history, and team decisions before working.
> Build the user profile API endpoint
> Refactor the auth middleware
> Write tests for the payment module
This is the default mode for most work.
Full
Multiple agents spawn in parallel, each with full context. A design review ceremony may trigger first.
> Team, build the dashboard
> Rebuild the authentication system
> Implement the search feature end-to-end
Used for complex tasks that span multiple domains (frontend, backend, testing).
How Modes Are Selected
The coordinator picks the mode automatically based on:
- Complexity of the request
- Number of domains involved
- Whether context is needed (history, decisions, skills)
You don’t need to specify a mode. When uncertain, the coordinator biases toward upgrading — it’s better to spend a few extra seconds loading context than to miss something.
Tips
- If a response feels slow for a simple question, it’s likely using Standard when Direct would suffice. This is rare — the coordinator is good at picking the right mode.
- “Team, …” prompts typically trigger Full mode.
- Direct-named agent prompts (“Kane, …”) typically trigger Standard mode.
- Response times depend on the Copilot platform. The numbers above are approximate.
Sample Prompts
force lightweight mode for this quick fix
Explicitly requests a reduced-context spawn for a simple task.
what port does the API run on?
Quick factual question that triggers Direct mode with no agent spawn.
Kane, do a thorough analysis of the auth system
Requests Standard mode with full context load for complex work.
what response mode was used for that last task?
Checks which mode the coordinator selected for the previous request.
Team, rebuild the authentication system end-to-end
Multi-domain prompt that triggers Full mode with parallel agent spawns.