Disaster Recovery — When Things Go Wrong
Try this to recover from data loss:
My .ai-team/ directory was deleted — help me recover the team state
Try this to revert bad code:
An agent wrote bad code — how do I revert it?
Try this to reset confused agents:
The squad is confused — reset their context
Recovery procedures for deleted .ai-team/, bad agent code, confused squads, and upgrade issues. Most problems are fixable with Git or re-init.
1. “I accidentally deleted .ai-team/”
Solution: It’s in Git. Restore it.
git checkout .ai-team/
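If you are not sure whether Git has a copy, you can check before restoring. The following is a minimal sketch, not part of Squad: `restore_ai_team` is a hypothetical helper name, and it assumes you run it from the repository root. It restores from the last commit explicitly (`HEAD`), which also covers the case where the deletion was already staged.

```shell
# Hypothetical helper (not part of Squad): restore .ai-team/ from the last
# commit if Git has it, otherwise report that a rebuild is needed.
restore_ai_team() {
  # List files under .ai-team/ recorded in HEAD; empty output means none are committed.
  if [ -n "$(git ls-tree -r --name-only HEAD .ai-team/ 2>/dev/null)" ]; then
    git checkout HEAD -- .ai-team/
    echo "Restored .ai-team/ from the last commit."
  else
    echo "No committed copy of .ai-team/ found; rebuild or import an export." >&2
    return 1
  fi
}
```

Call `restore_ai_team` after defining it. A non-zero return means there is nothing in Git to restore.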
If you haven’t committed .ai-team/ yet, it’s gone. Rebuild:
npx github:bradygaster/squad
This starts you from scratch. If you exported your squad earlier, import the export instead:
npx github:bradygaster/squad import squad-export-2025-07-15.zip
Prevention: Commit .ai-team/ early. Don’t let it stay uncommitted for long.
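One way to notice uncommitted squad state is to ask Git directly. This is a sketch under assumptions: `ai_team_uncommitted` is a hypothetical helper name, and it must be run from the repository root.

```shell
# Hypothetical helper: report whether .ai-team/ has changes Git has not committed.
# Returns 0 (and prints the paths) when something is pending, 1 when clean.
ai_team_uncommitted() {
  # --porcelain prints one line per changed or untracked path; empty means clean.
  changes=$(git status --porcelain -- .ai-team/)
  if [ -n "$changes" ]; then
    echo "Uncommitted changes in .ai-team/:"
    echo "$changes"
    return 0
  fi
  return 1
}

# If anything is pending, commit it, for example:
#   git add .ai-team/ && git commit -m "Save squad state"
```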
2. “An agent wrote bad code”
What happened: Morpheus implemented a feature, but it has a bug. Michael (the Lead) didn’t catch it during review.
Solution: Use the reviewer rejection protocol.
> Michael, review Morpheus's latest commit. There's a bug in the
> payment validation logic.
🏗️ Michael — reviewing Morpheus's commit
Issue found: Payment validation allows negative amounts.
Rejected. Morpheus, fix the validation to reject negative amounts.
Morpheus fixes it:
🔧 Morpheus — fixing payment validation to reject negative amounts
Alternatively, have a different agent fix it:
> Sonny, fix the payment validation bug Morpheus introduced.
> It's allowing negative amounts.
Sonny reads the code, fixes the issue, commits.
Prevention: Use code review. Michael should always review code before it lands.
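If you would rather back the change out with Git than route it through another review round, `git revert` is one option. A minimal sketch, assuming the bad change is a single commit whose hash you looked up yourself; `revert_agent_commit` is a hypothetical helper name, not a Squad command.

```shell
# Hypothetical helper: undo one commit by adding a new commit that reverses it.
# History is preserved, so the original change stays visible in `git log`.
revert_agent_commit() {
  commit="$1"   # hash of the bad commit, e.g. taken from `git log --oneline`
  git revert --no-edit "$commit"
}
```

For example, `revert_agent_commit HEAD` reverses the most recent commit. Because the revert is itself a commit, it can be reviewed like any other change.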
3. “An agent made a wrong decision”
What happened: Neo decided to use REST when GraphQL was the better choice. The decision is logged in .ai-team/decisions.md.
Solution: Add a directive to override it.
> Team, I'm overriding Neo's decision from last session. We're using
> GraphQL, not REST. The client needs flexible queries. Document this.
📋 Scribe — logged decision override
### 2025-07-15: Using GraphQL instead of REST
**By:** You (overriding Neo's recommendation)
**What:** API will use GraphQL, not REST
**Why:** Client needs flexible queries, REST would require too many endpoints
Neo's previous recommendation archived.
Agents now read the new decision and build accordingly.
Prevention: Review major decisions before agents implement them. Use the Lead as a sounding board, not a dictator.
4. “My squad is confused after a bad session”
What happened: Agents learned incorrect information during a session. Now they’re making mistakes.
Solution: Have the Scribe archive old learnings and start fresh.
> Scribe, archive agent histories from the last session. We made
> mistakes and I don't want agents repeating them.
📋 Scribe — archiving recent session histories
Moved to .ai-team/history-archive/:
- Neo's session from 2025-07-14
- Morpheus's session from 2025-07-14
- Trinity's session from 2025-07-14
Agents now only have context from earlier sessions.
Agents forget the bad session. They still have their long-term skills and decisions.
Prevention: End sessions if agents are going in the wrong direction. Don’t let them accumulate bad context.
5. “I want to start over completely”
Solution: Delete .ai-team/ and reinstall.
rm -rf .ai-team/
npx github:bradygaster/squad
Squad is ready. What are you building?
You’re back to day one. Clean slate.
Prevention: Only do this if the squad is truly beyond repair. Usually archiving histories (above) is enough.
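Since a full reset is irreversible for anything uncommitted, you may want a copy first. The sketch below is an assumption-laden illustration, not a Squad feature: `reset_with_backup` and the `ai-team-backup-…` file name are hypothetical, and it only deletes the directory if the archive was written.

```shell
# Hypothetical helper: archive .ai-team/ to a timestamped tarball, then remove it.
reset_with_backup() {
  backup="ai-team-backup-$(date +%Y%m%d-%H%M%S).tar.gz"
  # Delete the directory only if the archive was created successfully.
  if tar -czf "$backup" .ai-team/; then
    rm -rf .ai-team/
    echo "Backup written to $backup"
  else
    echo "Backup failed; .ai-team/ left in place." >&2
    return 1
  fi
}
```

After it succeeds, reinstall with the same `npx github:bradygaster/squad` command as above. The tarball lands in the current directory, so move it somewhere safe or keep it out of your commits.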
6. “Upgrade broke something”
What happened: You upgraded Squad to a new version, and now something doesn’t work.
Solution: Squad upgrades never touch .ai-team/. The issue is likely in:
- Workflow templates: check .ai-team-templates/
- Squad agent definition: check .github/agents/squad.agent.md
- Model configuration: check .ai-team/model-config.json
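Roll back the Squad agent definition to the version from before your latest commit (this assumes the upgrade was the last thing you committed):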
git checkout HEAD^ .github/agents/squad.agent.md
Or reinstall the previous version:
npx github:bradygaster/squad@0.1.5
Your team’s knowledge is safe. .ai-team/ is untouched.
Prevention: Check the CHANGELOG before upgrading. If the upgrade is major, test in a branch first.
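When testing an upgrade in a branch, it helps to see exactly which files it changed. This is a rough sketch: `changed_outside_ai_team` is a hypothetical helper name, and it assumes you committed everything before upgrading and run it from the repository root.

```shell
# Hypothetical helper: list paths changed since the last commit, ignoring .ai-team/.
changed_outside_ai_team() {
  # ':(exclude)' is Git pathspec magic that filters out matching paths.
  git status --porcelain -- . ':(exclude).ai-team'
}
```

One possible flow: create a throwaway branch with `git checkout -b try-squad-upgrade` (the branch name is just an example), run the upgrade, then call `changed_outside_ai_team` to see which templates or agent definitions were touched. Running plain `git status --porcelain -- .ai-team/` alongside it shows whether anything in .ai-team/ changed as well.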
7. “An agent is stuck in a loop”
What happened: Tank keeps writing the same failing test over and over.
Solution: Stop the agent manually.
> Tank, stop. The test is failing because of a known issue in the
> test environment, not the code. Skip this test for now.
If the agent doesn’t stop:
> Scribe, pause Tank's work. I need to fix the test environment first.
Or just close the Copilot session (Ctrl+C) and start a new one.
Prevention: If a test is flaky, tell agents to skip it until the environment is fixed.
8. “Skills are outdated or wrong”
What happened: A skill file in .ai-team/skills/ contains outdated information. Agents are following bad advice.
Solution: Edit or delete the skill file.
# Edit the skill
code .ai-team/skills/auth-rate-limiting.md
# Or delete it
rm .ai-team/skills/auth-rate-limiting.md
git add .ai-team/skills/
git commit -m "Remove outdated auth rate limiting skill"
Prevention: Review skills periodically. If your patterns change, update the skills.
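To decide which skills are due for review, you could list them by when they were last committed. A minimal sketch, assuming skills are committed `.md` files under .ai-team/skills/ as in the example above; `skill_ages` is a hypothetical helper name.

```shell
# Hypothetical helper: print each skill file with the date of its last commit,
# oldest first. Files that were never committed show an empty date.
skill_ages() {
  for f in .ai-team/skills/*.md; do
    [ -e "$f" ] || continue   # the glob matched nothing
    # %cd with --date=short prints the committer date as YYYY-MM-DD.
    printf '%s  %s\n' "$(git log -1 --date=short --format=%cd -- "$f")" "$f"
  done | sort
}
```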
9. “decisions.md is a mess”
What happened: .ai-team/decisions.md has 200 entries and it’s hard to find anything.
Solution: Archive old decisions.
> Scribe, archive decisions older than 3 months. Move them to
> .ai-team/decisions-archive.md.
📋 Scribe — archiving old decisions
Moved 87 decisions older than 2025-04-01 to decisions-archive.md.
decisions.md now contains only recent decisions.
Agents still have access to archived decisions if they need them, but the main file is cleaner.
Prevention: Periodically archive old decisions. Keep decisions.md focused on recent, relevant choices.
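Before asking the Scribe to archive, you can preview which entries fall before a cutoff. This sketch assumes every entry begins with a heading of the form `### YYYY-MM-DD: Title`, as in the override example earlier; if your file uses a different layout it will miss entries. `old_decisions` is a hypothetical helper name.

```shell
# Hypothetical helper: list decision headings dated before a cutoff (YYYY-MM-DD).
# Assumes each entry starts with a line like "### 2025-07-15: Title".
old_decisions() {
  cutoff="$1"
  file="${2:-.ai-team/decisions.md}"
  # ISO dates sort correctly as plain strings, so a string comparison is enough.
  awk -v cutoff="$cutoff" '
    /^### [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]:/ {
      date = substr($2, 1, 10)
      if (date < cutoff) print
    }' "$file"
}
```

For example, `old_decisions 2025-04-01` prints the headings that a three-month archive run on 2025-07-01 would be expected to move. It only reads the file and changes nothing.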
10. “I can’t tell which agent did what”
What happened: Multiple agents worked on the same feature, and the commit history is tangled.
Solution: Check agent histories.
> Show me Neo's history for the last session.
🏗️ Neo — session history 2025-07-15
Tasks:
- Reviewed architecture for the payment feature
- Rejected Morpheus's first implementation (missing validation)
- Approved Morpheus's second implementation
Decisions made:
- Use Stripe Checkout instead of raw Payment Intents
- Store payment metadata in the orders table
Each agent logs what they did in their history.md.
Prevention: Agents automatically log their work. You don’t have to do anything.
Tips
- .ai-team/ is in Git. If you delete it, restore from Git. If it’s uncommitted, it’s gone.
- Code review catches bad agent code. Use the Lead to review before merging.
- Override bad decisions with directives. If an agent made the wrong call, tell the team the correct one.
- Archive confused histories. If a session went badly, archive the learnings so agents forget.
- Upgrades don’t touch .ai-team/. Your team’s knowledge is safe across upgrades.
- Edit skill files directly. They’re just markdown. If a skill is wrong, fix it or delete it.
- Agent histories are the audit log. Check them to see what each agent did.