There's a pattern I kept seeing in my own workflow — and I kept seeing it in everyone else's.
You open ChatGPT, or Claude, or whatever your current favorite is. You paste in a function. You ask it to write tests. Then you ask it to also check for security issues. Then you ask it to update the README. Then, because you forgot, you scroll back up to add context from three messages ago, re-explain what the function does, and hope the model hasn't lost the thread.
That's the prompt engineering tax. And it compounds.
The problem isn't that these models are bad. They're extraordinary. The problem is the mental overhead you pay every time you switch contexts.
When you ask a single assistant to be your coder, your security reviewer, your docs writer, and your QA tester — in the same conversation — you're doing the coordination work. You're tracking state. You're deciding what to hand off and when. You're re-explaining context constantly because the model has no persistent memory of your project, your standards, or your preferences.
This is not a prompt engineering problem. You can't prompt your way out of it.
It's an architecture problem.
Single-model approaches are also brittle in a specific way: they hallucinate confidently in domains they're being stretched into. Ask a coding assistant to "also check for SQL injection vulnerabilities" and it will — it'll give you something that looks like a security review. Whether it would survive an actual audit is a different question. The model is performing security review, not doing it from a place of specialized, calibrated knowledge.
Think about how high-performing engineering teams actually work. You don't have one person who writes the code, reviews the code, writes the docs, runs the tests, and checks compliance. You have specialists. They have context within their domain. They have standards. They hand off to each other through defined interfaces.
That's what a properly orchestrated agent team looks like.
When a coding agent writes a function, it's not also context-switching to think about documentation standards or security CVEs. It writes code. Then a security agent — which has been trained, prompted, and calibrated specifically around threat modeling and vulnerability patterns — reviews it. Then a docs agent writes the documentation, with knowledge of what good technical documentation actually looks like.
These agents run in parallel or in sequence depending on the task. They share structured context, not conversation history. They have persistent memory of your project — your coding standards, your past decisions, your preferences — that carries across sessions without you re-explaining anything.
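The handoff described above can be sketched in a few lines. This is a hypothetical illustration, not A.L.I.C.E.'s actual API: the agent names, the `TaskContext` shape, and the pipeline function are all invented to show the idea of agents sharing structured state instead of a chat transcript.

```typescript
// Hypothetical sketch: specialists exchange a structured context object,
// not conversation history. All names here are illustrative assumptions.
interface TaskContext {
  task: string;                       // what the pipeline is working on
  artifacts: Record<string, string>;  // outputs keyed by producing agent
  standards: string[];                // persistent project conventions
}

type Agent = (ctx: TaskContext) => TaskContext;

// Each specialist reads the structured state and appends its own artifact.
const codingAgent: Agent = (ctx) => ({
  ...ctx,
  artifacts: { ...ctx.artifacts, code: `// implements: ${ctx.task}` },
});

const securityAgent: Agent = (ctx) => ({
  ...ctx,
  artifacts: { ...ctx.artifacts, securityReview: `reviewed: ${ctx.artifacts.code}` },
});

const docsAgent: Agent = (ctx) => ({
  ...ctx,
  artifacts: { ...ctx.artifacts, docs: `documents: ${ctx.task}` },
});

// Sequential handoff: each agent sees typed state, never raw chat logs.
function runPipeline(agents: Agent[], initial: TaskContext): TaskContext {
  return agents.reduce((ctx, agent) => agent(ctx), initial);
}

const result = runPipeline([codingAgent, securityAgent, docsAgent], {
  task: "parse config file",
  artifacts: {},
  standards: ["no any types", "tests required"],
});
```

The point of the shape: the security agent never sees the user's wording, only the code artifact; the docs agent never re-derives what the task was. The context object is the interface.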
The difference isn't just efficiency. It's quality. A specialist makes different tradeoffs than a generalist. It knows what it doesn't know.
Getting one agent to do something useful isn't hard anymore. Getting a team of agents to work together coherently — that's where most frameworks fall apart.
Orchestration means:

- routing each task to the specialist best suited to it
- deciding what runs in parallel and what runs in sequence
- passing structured context between agents at every handoff
- maintaining persistent memory of the project across sessions
Most people who've tried to build multi-agent systems with raw APIs know how quickly this gets complicated. You end up writing more orchestration glue than actual agent logic.
A.L.I.C.E. is our answer to this. It's an open-source agent orchestration platform that ships 28 specialist agents in a single command. You run npx @robbiesrobotics/alice-agents and you get a team: a coding agent, a security agent, a docs agent, a marketing agent, a research agent, and 23 more — each with its own domain knowledge, memory, and operating instructions.
The orchestrator (Olivia) handles task routing. The specialists handle execution. You handle the actual problem you're trying to solve.
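Task routing, at its simplest, is a matter of matching a request to a domain. The sketch below is an assumption about the shape of the problem, not A.L.I.C.E.'s internals — the domain names, keyword patterns, and fallback are all invented for illustration.

```typescript
// Hypothetical routing sketch: dispatch a request to a specialist by domain.
// Patterns and agent names are illustrative, not A.L.I.C.E.'s actual logic.
type Domain = "coding" | "security" | "docs" | "research";

const routes: Array<{ pattern: RegExp; agent: Domain }> = [
  { pattern: /vulnerab|injection|\bcve\b|audit/i, agent: "security" },
  { pattern: /readme|\bdocs?\b|document/i, agent: "docs" },
  { pattern: /implement|refactor|\bfix\b|write tests?/i, agent: "coding" },
];

function route(request: string): Domain {
  for (const { pattern, agent } of routes) {
    if (pattern.test(request)) return agent;
  }
  return "research"; // fallback specialist for open-ended questions
}
```

A real orchestrator would use a model to classify intent rather than regexes, but the division of labor is the same: routing is one job, execution is another, and neither leaks into the other.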
We didn't build A.L.I.C.E. because we needed more AI tools. We built it because the tools that existed made us do too much coordination work ourselves. The value of AI in a workflow should scale with the complexity of the problem, not plateau because you keep bumping into the limits of a single-context conversation.
The team model isn't a feature. It's a different way of thinking about what AI assistance should be.
If you're still managing AI through a single chat window, you're not using AI wrong — you're using the wrong architecture.
Specialists beat generalists at depth. Teams beat individuals at parallelism. Orchestration beats improvisation at reliability.
That's the thesis. We're building around it.