Response Modes
Try this to get concise answers:
Respond in terse mode — just the facts
Try this to speed up simple tasks:
Use lightweight mode for quick fixes
Squad automatically picks the right response mode based on complexity — from instant direct answers (2s) to full multi-agent parallel work (60s). You can override anytime.
The Four Modes
Not every request needs the full agent machinery. Squad automatically selects a response mode based on the complexity of your message.
The Four Modes
| Mode | Time | What Happens | When Used |
|---|---|---|---|
| Direct | ~2–3s | Coordinator answers from memory — no agent spawned | Status checks, factual questions |
| Lightweight | ~8–12s | One agent, minimal prompt — no charter/history/decisions reads | Small fixes, quick follow-ups |
| Standard | ~25–35s | Full agent spawn with charter, history, and decisions | Normal work requests |
| Full | ~40–60s | Multi-agent parallel spawn | Complex multi-domain tasks |
Direct
The coordinator handles it alone. No sub-agent is spawned.
> What port does the server run on?
> Where are we on the auth work?
> Who's on the team?
Fast answers from context the coordinator already has.
Lightweight
One agent is spawned with a reduced prompt — skips loading charter, history, and decisions to save time.
> Fix the typo in the README
> Add that missing import
> Update the version number
Good for small, well-defined tasks where full context isn’t needed.
Standard
Full agent spawn. The agent reads its charter, history, and team decisions before working.
> Build the user profile API endpoint
> Refactor the auth middleware
> Write tests for the payment module
This is the default mode for most work.
Full
Multiple agents spawn in parallel, each with full context. A design review ceremony may trigger first.
> Team, build the dashboard
> Rebuild the authentication system
> Implement the search feature end-to-end
Used for complex tasks that span multiple domains (frontend, backend, testing).
How Modes Are Selected
The coordinator picks the mode automatically based on:
- Complexity of the request
- Number of domains involved
- Whether context is needed (history, decisions, skills)
You don’t need to specify a mode. When uncertain, the coordinator biases toward upgrading — it’s better to spend a few extra seconds loading context than to miss something.
Tips
- If a response feels slow for a simple question, it’s likely using Standard when Direct would suffice. This is rare — the coordinator is good at picking the right mode.
- “Team, …” prompts typically trigger Full mode.
- Direct-named agent prompts (“Kane, …”) typically trigger Standard mode.
- Response times depend on the Copilot platform. The numbers above are approximate.
Sample Prompts
force lightweight mode for this quick fix
Explicitly requests a reduced-context spawn for a simple task.
what port does the API run on?
Quick factual question that triggers Direct mode with no agent spawn.
Kane, do a thorough analysis of the auth system
Requests Standard mode with full context load for complex work.
what response mode was used for that last task?
Checks which mode the coordinator selected for the previous request.
Team, rebuild the authentication system end-to-end
Multi-domain prompt that triggers Full mode with parallel agent spawns.