Response Modes

Try this to get concise answers:

Respond in terse mode — just the facts

Try this to speed up simple tasks:

Use lightweight mode for quick fixes

Squad automatically picks the right response mode based on complexity — from instant direct answers (2s) to full multi-agent parallel work (60s). You can override anytime.


The Four Modes

Not every request needs the full agent machinery. Squad automatically selects a response mode based on the complexity of your message.


The Four Modes

Mode Time What Happens When Used
Direct ~2–3s Coordinator answers from memory — no agent spawned Status checks, factual questions
Lightweight ~8–12s One agent, minimal prompt — no charter/history/decisions reads Small fixes, quick follow-ups
Standard ~25–35s Full agent spawn with charter, history, and decisions Normal work requests
Full ~40–60s Multi-agent parallel spawn Complex multi-domain tasks

Direct

The coordinator handles it alone. No sub-agent is spawned.

> What port does the server run on?
> Where are we on the auth work?
> Who's on the team?

Fast answers from context the coordinator already has.

Lightweight

One agent is spawned with a reduced prompt — skips loading charter, history, and decisions to save time.

> Fix the typo in the README
> Add that missing import
> Update the version number

Good for small, well-defined tasks where full context isn’t needed.

Standard

Full agent spawn. The agent reads its charter, history, and team decisions before working.

> Build the user profile API endpoint
> Refactor the auth middleware
> Write tests for the payment module

This is the default mode for most work.

Full

Multiple agents spawn in parallel, each with full context. A design review ceremony may trigger first.

> Team, build the dashboard
> Rebuild the authentication system
> Implement the search feature end-to-end

Used for complex tasks that span multiple domains (frontend, backend, testing).


How Modes Are Selected

The coordinator picks the mode automatically based on:

  • Complexity of the request
  • Number of domains involved
  • Whether context is needed (history, decisions, skills)

You don’t need to specify a mode. When uncertain, the coordinator biases toward upgrading — it’s better to spend a few extra seconds loading context than to miss something.


Tips

  • If a response feels slow for a simple question, it’s likely using Standard when Direct would suffice. This is rare — the coordinator is good at picking the right mode.
  • “Team, …” prompts typically trigger Full mode.
  • Direct-named agent prompts (“Kane, …”) typically trigger Standard mode.
  • Response times depend on the Copilot platform. The numbers above are approximate.

Sample Prompts

force lightweight mode for this quick fix

Explicitly requests a reduced-context spawn for a simple task.

what port does the API run on?

Quick factual question that triggers Direct mode with no agent spawn.

Kane, do a thorough analysis of the auth system

Requests Standard mode with full context load for complex work.

what response mode was used for that last task?

Checks which mode the coordinator selected for the previous request.

Team, rebuild the authentication system end-to-end

Multi-domain prompt that triggers Full mode with parallel agent spawns.