You and two AI agents work on the same feature in real time.
- Your terminal (main session): You type. You decide. You merge.
- PTY Session 1 — Claude (implements what you describe): "Add OAuth to the login flow"
- PTY Session 2 — Codex (watches your git diff, reviews continuously): "Reviewing latest changes..."
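The two agent sessions above each live in their own pseudo-terminal. A minimal sketch of that plumbing, using Python's `pty` module — here `echo` commands stand in for the real `claude` / `codex` CLIs, and the agent names are purely illustrative:

```python
import os
import pty
import subprocess

def spawn_in_pty(cmd):
    """Run cmd attached to its own pseudo-terminal; return (process, master fd)."""
    master, slave = pty.openpty()
    proc = subprocess.Popen(cmd, stdin=slave, stdout=slave, stderr=slave,
                            close_fds=True)
    os.close(slave)  # parent keeps only the master end
    return proc, master

# Placeholder commands; a real setup would launch the agent CLIs here.
agents = {
    "claude": ["echo", "implementing OAuth login flow"],
    "codex": ["echo", "Reviewing latest changes..."],
}

for name, cmd in agents.items():
    proc, master = spawn_in_pty(cmd)
    proc.wait()
    output = os.read(master, 1024).decode()
    print(f"[{name}] {output.strip()}")
    os.close(master)
```

Because each agent owns a PTY, it behaves exactly as it would in an interactive terminal (line buffering, prompts), while your main session stays free for typing, deciding, and merging.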
3-5 developers. Each has a laptop. One shared server.
- Dev A (laptop): Claude — frontend work
- Dev B (laptop): Claude — backend work
- Dev C (laptop): Gemini — testing
- Shared server: Codex — continuous review + CI integration
Why: The server runs Codex 24/7, reviewing every push. Developers get feedback without waiting for human reviewers. The AI reviewer never sleeps and never forgets to check.
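The continuous-review loop on the shared server can be sketched as a small poller: watch HEAD, and hand every new commit range to the reviewer. This is a sketch under assumptions — the function names are hypothetical, and the actual AI call is elided behind a one-line summary:

```python
import subprocess
import time

def head_of(repo):
    """Current HEAD commit hash of the repository under review."""
    out = subprocess.run(["git", "rev-parse", "HEAD"], cwd=repo,
                         capture_output=True, text=True, check=True)
    return out.stdout.strip()

def review(repo, old, new):
    """Collect the pushed range for the reviewer (the AI call itself is elided)."""
    diff = subprocess.run(["git", "diff", f"{old}..{new}"], cwd=repo,
                          capture_output=True, text=True, check=True).stdout
    return f"reviewed {len(diff.splitlines())} diff lines in {old[:7]}..{new[:7]}"

def watch(repo, interval=30, max_polls=None):
    """Poll the repo and review every new commit as it lands."""
    seen = head_of(repo)
    polls = 0
    while max_polls is None or polls < max_polls:
        time.sleep(interval)
        head = head_of(repo)
        if head != seen:
            print(review(repo, seen, head))
            seen = head
        polls += 1
```

In practice a `post-receive` hook on the server would be the event-driven alternative to polling; the loop above just makes the always-on behavior explicit.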
Why: Patient data never moves. AI runs where the data lives. Only aggregated, anonymized results travel. HIPAA/GDPR compliant by architecture, not by policy.
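"Only aggregated results travel" can be made concrete with a tiny sketch: each site computes a local `(count, sum)` pair, and only those numbers cross the network — never patient-level rows. The site names and values below are made up for illustration:

```python
# Patient-level rows stay on each site's own machines; they are never sent.
sites = {
    "hospital_a": [72, 68, 75],
    "hospital_b": [70, 71],
    "hospital_c": [69, 74, 73, 70],
}

# What leaves each site: (count, sum) — enough for a global mean, nothing more.
aggregates = {name: (len(vals), sum(vals)) for name, vals in sites.items()}

total_n = sum(n for n, _ in aggregates.values())
global_mean = sum(s for _, s in aggregates.values()) / total_n
print(f"global mean over {total_n} patients: {global_mean:.2f}")
```

The compliance property falls out of the data flow itself: the aggregation step is the only egress point, so there is no code path that could ship a raw record.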
- Laptop (M4, 32GB): Claude Haiku — fast search, classification; Gemini — broad analysis
- GPU Server A (A100 × 4): ollama/llama-70b — heavy local inference; fine-tuned model — domain-specific tasks
- GPU Server B (H100 × 8): ollama/llama-405b — largest open model; Codex — parallel code generation
Why: Laptop for coordination (free). GPU servers for heavy lifting (local, no API cost). Claude only for final integration (minimal API spend). Total API cost: ~$2 instead of ~$50.
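The routing policy behind that cost split can be sketched as a lookup table plus a dispatch rule: heavy work goes to the free local GPUs, and the paid API is reserved for final integration. The per-call prices and task mix below are illustrative, not real rates:

```python
# Illustrative per-call costs: local inference is free, the hosted API is not.
BACKENDS = {
    "claude-haiku": {"host": "laptop", "usd_per_call": 0.001},
    "llama-70b":    {"host": "gpu-a",  "usd_per_call": 0.0},   # local, A100 x 4
    "llama-405b":   {"host": "gpu-b",  "usd_per_call": 0.0},   # local, H100 x 8
    "claude":       {"host": "api",    "usd_per_call": 0.50},  # hosted, paid
}

def route(task):
    """Send heavy work local; reserve the paid API for final integration."""
    if task == "integration":
        return "claude"
    if task in ("search", "classification"):
        return "claude-haiku"
    return "llama-405b" if task == "codegen" else "llama-70b"

# A day's made-up workload: mostly local, three paid integration calls.
tasks = ["search"] * 20 + ["codegen"] * 50 + ["inference"] * 30 + ["integration"] * 3
spend = sum(BACKENDS[route(t)]["usd_per_call"] for t in tasks)
print(f"API spend: ${spend:.2f}")
```

Routing everything to the hosted model instead would put all 103 calls on the paid tier; the dispatch rule is what turns tens of dollars into pocket change.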
Why: Three different AI agents prototype three different architectures in parallel. Then three different AI agents debate which one to ship. Then one builds it. One hour instead of one sprint.
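The prototype-then-debate pattern is a fan-out/fan-in: run the agents concurrently, then reduce their proposals to one winner. In this sketch the agent calls are stubbed out and the debate is collapsed to an argmax over made-up scores — both are stand-ins, not real agent output:

```python
from concurrent.futures import ThreadPoolExecutor

def prototype(agent):
    """Stub for a real agent call; returns (architecture, debate score)."""
    designs = {
        "claude": ("event-driven", 0.8),
        "codex":  ("monolith", 0.6),
        "gemini": ("microservices", 0.7),
    }
    return designs[agent]

# Fan out: three agents prototype in parallel.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(prototype, ["claude", "codex", "gemini"]))

# Fan in: the debate, reduced to argmax for the sketch.
winner = max(results, key=lambda r: r[1])[0]
print(f"shipping: {winner}")
```

A real debate round would feed each proposal back to the other agents for critique before scoring; the control flow stays the same shape.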
Why: Phase 1 uses local LLM with no network — customer data can’t leak. Phase 2 uses Claude but only sees anonymized summary. Sandbox ensures no file writes outside worktree. Every keystroke logged for audit. The AI is powerful AND governed.
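The two-phase boundary can be sketched in a few lines: Phase 1 runs locally and redacts identifiers, so Phase 2 (the hosted model) only ever sees the anonymized text, and every egress event is appended to an audit trail. The regexes, field names, and sample record are illustrative, not a complete redaction scheme:

```python
import re
import time

def anonymize(text):
    """Phase 1 (local, no network): strip identifiers before anything leaves."""
    text = re.sub(r"\b[\w.]+@[\w.]+\b", "[EMAIL]", text)          # emails
    text = re.sub(r"\b\d{3}-\d{2}-\d{4}\b", "[SSN]", text)        # US SSNs
    return text

AUDIT = []  # append-only audit trail; a real setup would persist every event

def audited(event, payload):
    AUDIT.append({"ts": time.time(), "event": event, "payload": payload})

record = "Patient jane@example.com, SSN 123-45-6789, responded well."
summary = anonymize(record)          # the only thing Phase 2 is allowed to see
audited("phase2_egress", summary)    # every outbound payload is logged
```

Governance here is structural: the raw `record` never reaches the egress call, and the audit log captures exactly what did.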