Tool-use agent · production
Production LLM agent with prompt caching, tool routing, structured output, per-tenant billing meters, and a reviewer loop that catches bad calls before they ship. Vendor-agnostic abstraction over multiple model providers.
The problem
The team needed an LLM agent that handled real production traffic across multiple tenants. Earlier prototypes lacked structured output validation, had no per-tenant billing meters, and would happily call the wrong tool with the wrong arguments when prompts drifted. There was no reviewer loop to catch bad calls before they shipped customer-facing actions. Cost was unpredictable, and a single misbehaving tenant could starve the rest of capacity.
The approach
We built a tool-use agent with prompt caching, typed tool catalogs, structured-output schemas validated on every call, and a vendor-agnostic abstraction over the major model providers. A reviewer loop runs a smaller deterministic check on each tool call before it ships; failures route to a dead-letter queue with full trace context. Per-tenant billing meters track token usage, tool calls, and resolution outcomes. Provider swaps are a config change, not a rewrite.
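The vendor-agnostic abstraction can be sketched as a small provider interface behind a registry keyed by config. The names and shapes below are illustrative assumptions, not the production API: the point is that the agent only ever sees the interface, so swapping vendors is a registry lookup driven by configuration.

```typescript
// Illustrative provider interface; names are assumptions, not quadevs's real API.
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

interface ModelProvider {
  id: string;
  complete(prompt: string): Promise<{ text: string; toolCalls: ToolCall[] }>;
}

// The registry maps a config key to a concrete adapter. Swapping providers
// means changing the key in config, not rewriting agent code.
const registry = new Map<string, ModelProvider>();

function registerProvider(p: ModelProvider): void {
  registry.set(p.id, p);
}

function getProvider(configKey: string): ModelProvider {
  const p = registry.get(configKey);
  if (!p) throw new Error(`unknown provider: ${configKey}`);
  return p;
}

// A stub adapter standing in for a real vendor SDK.
registerProvider({
  id: "stub",
  async complete(prompt: string) {
    return { text: `echo: ${prompt}`, toolCalls: [] };
  },
});
```

In a setup like this, each real adapter wraps one vendor SDK and normalizes its response into the shared shape, so retries, tracing, and billing hooks live above the adapter layer.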
Stack and engineering choices
- TypeScript agent runtime
- Postgres trace store
- Vendor-agnostic model abstraction
- Typed tool catalogs
- Structured-output JSON schemas
- Reviewer loop with rollback
- Per-tenant billing meters
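The typed tool catalog and reviewer check from the list above can be sketched together: every proposed call is validated against a catalog entry before execution, and failures land in a dead-letter structure with a reason. The tool names, parameter shapes, and queue representation here are assumptions for illustration, not the production code.

```typescript
// Sketch of the reviewer stage; all names and shapes are illustrative.
type JsonType = "string" | "number" | "boolean";

interface ToolSpec {
  name: string;
  params: Record<string, JsonType>; // required parameter -> expected type
}

interface ProposedCall {
  tool: string;
  args: Record<string, unknown>;
}

// Hypothetical catalog entry: a refund tool with two required parameters.
const catalog = new Map<string, ToolSpec>([
  ["refund", { name: "refund", params: { orderId: "string", amountCents: "number" } }],
]);

// Stand-in for the dead-letter queue; production would persist trace context.
const deadLetter: { call: ProposedCall; reason: string }[] = [];

// Returns true if the call passes review; otherwise records it with a
// reason so an operator can inspect and replay it.
function review(call: ProposedCall): boolean {
  const spec = catalog.get(call.tool);
  if (!spec) {
    deadLetter.push({ call, reason: `unknown tool: ${call.tool}` });
    return false;
  }
  for (const [param, expected] of Object.entries(spec.params)) {
    if (typeof call.args[param] !== expected) {
      deadLetter.push({ call, reason: `bad argument: ${param}` });
      return false;
    }
  }
  return true;
}
```

A deterministic check like this is cheap enough to run on every call, which is what lets bad calls fail at the reviewer stage instead of in a customer-facing system.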
Outcome
The agent runs in production across multiple tenants with predictable latency and cost. Bad tool calls are caught at the reviewer stage instead of in customer-facing systems. Switching model providers is a config change. Per-tenant billing is accurate enough to invoice from.
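A per-tenant meter accurate enough to invoice from reduces to an accumulator keyed by tenant, summed per request. The fields and example rates below are illustrative assumptions, not quadevs's actual pricing or schema.

```typescript
// Minimal per-tenant usage meter; field names and rates are illustrative.
interface Usage {
  inputTokens: number;
  outputTokens: number;
  toolCalls: number;
}

const meters = new Map<string, Usage>();

// Accumulate one request's usage onto the tenant's running totals.
function record(tenantId: string, u: Usage): void {
  const cur = meters.get(tenantId) ?? { inputTokens: 0, outputTokens: 0, toolCalls: 0 };
  meters.set(tenantId, {
    inputTokens: cur.inputTokens + u.inputTokens,
    outputTokens: cur.outputTokens + u.outputTokens,
    toolCalls: cur.toolCalls + u.toolCalls,
  });
}

// Invoice total in cents under assumed example rates per 1K tokens.
function invoiceCents(tenantId: string, centsPerKIn = 1, centsPerKOut = 3): number {
  const u = meters.get(tenantId);
  if (!u) return 0;
  return Math.ceil((u.inputTokens * centsPerKIn + u.outputTokens * centsPerKOut) / 1000);
}
```

Keeping the meter keyed by tenant at the point of each model and tool call is also what makes per-tenant isolation observable: a runaway tenant shows up in its own meter before it affects anyone else.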
See more of quadevs's production AI agent work across other engagements with a similar shape.
Have a project that overlaps this work?
Send a one-paragraph brief. We reply within one business day.
hello@quadevs.com