Building production AI agents — lessons learned

After building and operating Claire AI (an OSS personal executive assistant) and the ACE (Agentic-Centric Engineering) system, I’ve collected a few lessons that might help if you’re taking agentic systems from prototype to something people rely on daily.

1. Clear boundaries beat “one model does everything”

Agents that have a well-defined scope and clear tool boundaries tend to be more reliable and easier to debug. In ACE, we use domain packs and engines: each pack owns a slice of behavior (e.g. GSD execution, daily planning), and the orchestrator routes to them instead of asking a single model to do everything. That makes it easier to add new capabilities without breaking existing ones and to trace failures back to a specific component.
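
As a rough illustration of the routing idea, here is a minimal Python sketch. It is not ACE's actual code: the `Orchestrator` class, the `PackResult` type, the intent names, and both handler functions are hypothetical and exist only to show the shape of "one pack owns one slice of behavior".

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class PackResult:
    pack: str    # which pack handled the request, so failures can be traced
    output: str


class Orchestrator:
    """Routes each request to exactly one registered pack (hypothetical API)."""

    def __init__(self) -> None:
        self._packs: Dict[str, Callable[[str], str]] = {}

    def register(self, intent: str, handler: Callable[[str], str]) -> None:
        # Refuse overlapping ownership: each intent belongs to one pack.
        if intent in self._packs:
            raise ValueError(f"intent already owned by a pack: {intent}")
        self._packs[intent] = handler

    def route(self, intent: str, request: str) -> PackResult:
        handler = self._packs.get(intent)
        if handler is None:
            # Fail loudly instead of letting a general model improvise.
            raise KeyError(f"no pack registered for intent: {intent}")
        return PackResult(pack=intent, output=handler(request))


def daily_planning(request: str) -> str:
    # Stand-in for a pack that would call a model with its own tools.
    return f"plan drafted for: {request}"


def gsd_execution(request: str) -> str:
    return f"executing: {request}"


orchestrator = Orchestrator()
orchestrator.register("daily_planning", daily_planning)
orchestrator.register("gsd_execution", gsd_execution)

result = orchestrator.route("daily_planning", "Monday")
print(result.pack, "->", result.output)
```

Because every result carries the name of the pack that produced it, a bad output points at one component rather than at the system as a whole.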

2. Tool use and structured outputs are first-class

If your agent needs to call APIs, update state, or drive workflows, tool use and structured outputs (e.g. JSON schema) are not optional. We rely on tool definitions that the model can reason about, and we parse every response into a typed structure before passing it to the rest of the system. A response that fails to parse is treated as a failure rather than taken at face value, which reduces hallucinated “success” and makes it possible to retry or escalate when something fails.
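
To make the parse-then-trust step concrete, here is a small sketch using only the Python standard library. The `SendEmailResult` shape, its field names, and the `call_with_retry` helper are assumptions made up for this example, not part of Claire AI or ACE; a real system would more likely validate against a JSON schema.

```python
import json
from dataclasses import dataclass
from typing import Callable


class InvalidToolResponse(Exception):
    """Raised when model output does not match the expected shape."""


@dataclass(frozen=True)
class SendEmailResult:
    status: str        # "sent" or "failed"
    message_id: str


def parse_send_email(raw: str) -> SendEmailResult:
    """Turn raw model output into a typed result, or raise."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise InvalidToolResponse(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise InvalidToolResponse("expected a JSON object")
    status = data.get("status")
    message_id = data.get("message_id")
    if status not in ("sent", "failed"):
        raise InvalidToolResponse(f"unexpected status: {status!r}")
    if not isinstance(message_id, str) or not message_id:
        raise InvalidToolResponse("message_id must be a non-empty string")
    return SendEmailResult(status=status, message_id=message_id)


def call_with_retry(produce: Callable[[], str], max_attempts: int = 2) -> SendEmailResult:
    """Retry on unparseable output, then escalate instead of guessing."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return parse_send_email(produce())
        except InvalidToolResponse as exc:
            last_error = exc
    raise RuntimeError("escalating to a human") from last_error


# Simulated model outputs: a prose "success" first, then valid JSON.
responses = iter([
    "Email sent successfully!",
    '{"status": "sent", "message_id": "m-1"}',
])
parsed = call_with_retry(lambda: next(responses))
print(parsed)
```

The first simulated response reads like a success but carries no verifiable structure, so it is rejected and the call is retried.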

3. Humans in the loop where it matters

Not every step needs approval, but critical decisions (e.g. publishing, external actions, irreversible changes) should have a clear human-in-the-loop gate. Defining those gates up front and encoding them in the workflow (e.g. “confirm before deploy”) keeps the agent useful without giving it more authority than you intend.
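
A "confirm before deploy" gate can be as small as the following sketch. The set of gated action names, the `run_action` wrapper, and the approver callbacks are hypothetical; how approval is actually collected (chat message, ticket, CLI prompt) is left open.

```python
from typing import Callable

# Assumed list of actions that must never run without explicit approval.
GATED_ACTIONS = frozenset({"deploy", "publish", "delete"})


class ApprovalDenied(Exception):
    """Raised when a gated action is attempted without approval."""


def run_action(
    name: str,
    action: Callable[[], str],
    approve: Callable[[str], bool],
) -> str:
    """Run the action; gated actions only run if the approver says yes."""
    if name in GATED_ACTIONS and not approve(name):
        raise ApprovalDenied(f"{name} was not approved")
    return action()


def console_approver(name: str) -> bool:
    # One possible approver: ask a person on the terminal.
    return input(f"Confirm {name}? [y/N] ").strip().lower() == "y"


# Ungated actions run without asking anyone.
summary = run_action("summarize", lambda: "summary ready", approve=lambda _: False)

# A gated action runs only when the approver agrees.
deployed = run_action("deploy", lambda: "deployed", approve=lambda _: True)

print(summary, "|", deployed)
```

Keeping the gated set in one place makes the agent's authority easy to review, and passing the approver in as a function lets tests use a stub while production uses a real prompt such as `console_approver`.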

4. Observability from day one

Logging inputs, tool calls, and outcomes (even in a simple form) pays off quickly. When something goes wrong, you need to know what the agent saw, what it did, and what came back. We’ve found that a small set of structured events per “turn” is enough to debug most issues without building a full tracing stack on day one.
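
For a sense of how little is needed, here is a sketch of per-turn structured events written as JSON lines. The three event kinds and the field names are assumptions chosen for this example, not a description of our actual log format.

```python
import io
import json
import sys
import time
import uuid
from typing import Any, Dict, Optional, TextIO

# Assumed minimal vocabulary: what the agent saw, did, and got back.
EVENT_KINDS = ("input", "tool_call", "outcome")


def log_event(
    turn_id: str,
    kind: str,
    payload: Dict[str, Any],
    stream: Optional[TextIO] = None,
) -> Dict[str, Any]:
    """Write one JSON line per event; defaults to stdout."""
    if kind not in EVENT_KINDS:
        raise ValueError(f"unknown event kind: {kind}")
    event = {"ts": time.time(), "turn_id": turn_id, "kind": kind, "payload": payload}
    out = stream if stream is not None else sys.stdout
    # default=str keeps logging from crashing on non-JSON values.
    out.write(json.dumps(event, default=str) + "\n")
    return event


# One turn, logged into an in-memory buffer for the example.
buffer = io.StringIO()
turn_id = uuid.uuid4().hex
log_event(turn_id, "input", {"text": "deploy the docs site"}, buffer)
log_event(turn_id, "tool_call", {"tool": "deploy", "args": {"target": "docs"}}, buffer)
log_event(turn_id, "outcome", {"ok": False, "error": "approval denied"}, buffer)

# Reading the log back: every line is independently parseable.
events = [json.loads(line) for line in buffer.getvalue().splitlines()]
print([e["kind"] for e in events])
```

Filtering these lines by `turn_id` reconstructs a whole turn, which is often all the tracing needed early on.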

This is a short reflection from building Claire AI and ACE. More posts on agent architecture, orchestration, and production patterns will follow. If you’re building agentic systems, I’d be happy to compare notes.