Multi-agent workflows you can replay, audit, and trust.
Durable state. Deterministic replay. Tool budgets. The reliability tier most agent demos lack.
Agent Orchestrator runs multi-agent workflows the way payment systems run payments. Every step is journaled to Postgres, every tool call has a budget, every workflow can be paused, resumed, and replayed bit-for-bit. The Inspector UI shows the live agent graph and lets you step through past runs at the message level. TypeScript end-to-end, Postgres for state, Redis with BullMQ for the queue.
Why this exists
A single agent doing one task on one happy path is a demo. A graph of agents handing work to each other across hours, with retries, partial failures, and tool budgets, is a system. The leap from one to the other is where most agent projects stall.
The bigger frameworks (LangGraph, CrewAI, AutoGen) help you describe the graph but leave durability and replay as exercises for the reader. The smaller libraries solve the prompting layer and treat workflow concerns as out of scope. The result is that every team trying to ship an agent product writes the same Postgres journal, the same BullMQ wiring, and the same retry plumbing.
Agent Orchestrator is that infrastructure layer, written once. Workflows live in Postgres. Steps are events. Replay reads the events back deterministically. Tool budgets are enforced before the model is invoked. The Inspector UI is the missing observability tier.
What it does
Every feature below ships in the public repository today. Clone, configure, run.
Workflow as code
Workflows are TypeScript functions. Agents and tools are declared inline. The graph is whatever shape your code makes.
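As a rough illustration of "the graph is whatever shape your code makes", here is a minimal sketch of the research-and-summarise shape written as a plain function. The `Agent` type and the stub agents are hypothetical and are not the repository's API. They are synchronous and deterministic only to keep the example short.

```typescript
// Hypothetical type: none of these names come from the Agent Orchestrator repository.
type Agent<I, O> = (input: I) => O;

// Stub agents standing in for model-backed ones.
const planner: Agent<string, string[]> = (topic) => [
  `What is ${topic}?`,
  `Why does ${topic} matter?`,
];
const researcher: Agent<string, string> = (question) => `notes on "${question}"`;
const editor: Agent<string[], string> = (notes) => notes.join("\n");

// The workflow is an ordinary function; its control flow is the agent graph:
// one planner fans out to N researchers, whose notes fan in to one editor.
function researchAndSummarise(topic: string): string {
  const questions = planner(topic);
  const notes = questions.map(researcher);
  return editor(notes);
}

console.log(researchAndSummarise("battery storage 2026"));
```

Changing the graph means changing the function: a loop becomes a cycle, an `if` becomes a branch.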
Durable state
Every step writes a row to Postgres before it runs. After a crash, a workflow resumes from the last checkpoint, not the start.
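The journal-then-run pattern can be sketched as follows. This is a simplified model, assuming an in-memory array in place of the Postgres events table. The `StepEvent` shape and `runWorkflow` are hypothetical names, not the repository's schema or API.

```typescript
// In-memory stand-in for a Postgres events table; all names here are hypothetical.
type StepEvent =
  | { runId: string; step: number; kind: "started" }
  | { runId: string; step: number; kind: "completed"; output: string };

const journal: StepEvent[] = [];

type Step = (previous: string) => string;

// Journals "started" before each step executes and "completed" after it returns.
// Steps that already have a "completed" row are skipped, so a restart resumes
// from the last checkpoint instead of the start.
function runWorkflow(runId: string, steps: Step[], input: string): string {
  let value = input;
  steps.forEach((step, index) => {
    const done = journal.find(
      (e) => e.runId === runId && e.step === index && e.kind === "completed",
    );
    if (done && done.kind === "completed") {
      value = done.output; // checkpoint hit: reuse the stored output
      return;
    }
    journal.push({ runId, step: index, kind: "started" });
    value = step(value);
    journal.push({ runId, step: index, kind: "completed", output: value });
  });
  return value;
}

// Usage: the second step crashes once; the second attempt skips the first step.
let crashOnce = true;
const steps: Step[] = [
  (s) => s + " -> fetched",
  (s) => {
    if (crashOnce) {
      crashOnce = false;
      throw new Error("simulated crash");
    }
    return s + " -> summarised";
  },
];

try {
  runWorkflow("run-1", steps, "topic");
} catch {
  // the process "crashes" here; the journal survives
}
console.log(runWorkflow("run-1", steps, "topic")); // prints "topic -> fetched -> summarised"
```

A "started" row without a matching "completed" row marks the step that was in flight when the crash happened.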
Deterministic replay
Re-run any workflow from any checkpoint with the same inputs. Tool calls are replayed from the journal, not re-executed.
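Replaying tool calls from a journal, rather than re-executing them, can be sketched like this. The code assumes an in-memory array in place of the Postgres journal. `makeToolCaller` and the `ToolCall` shape are hypothetical names chosen for illustration.

```typescript
// Hypothetical shapes; an array stands in for the Postgres journal.
type ToolCall = { name: string; args: string; result: string };
type Tool = (args: string) => string;

// Returns a `call` function. In "live" mode it executes the tool and journals the result.
// In "replay" mode it returns journaled results in order and never executes the tool.
function makeToolCaller(mode: "live" | "replay", log: ToolCall[]) {
  let cursor = 0;
  return function call(name: string, tool: Tool, args: string): string {
    if (mode === "live") {
      const result = tool(args);
      log.push({ name, args, result });
      return result;
    }
    const entry = log[cursor++];
    if (!entry || entry.name !== name || entry.args !== args) {
      // The code under replay asked for something the original run never did.
      throw new Error(`replay diverged at call ${cursor - 1}: ${name}(${args})`);
    }
    return entry.result;
  };
}

// Usage: a non-deterministic tool returns a different value on every live call.
let counter = 0;
const fetchPrice: Tool = (symbol) => `${symbol}:${++counter}`;

const toolLog: ToolCall[] = [];
const live = makeToolCaller("live", toolLog);
const first = live("fetchPrice", fetchPrice, "ACME");

const replay = makeToolCaller("replay", toolLog);
const replayed = replay("fetchPrice", fetchPrice, "ACME");
console.log(first === replayed, counter); // prints "true 1": the tool ran only once
```

The divergence check is a design choice in this sketch: if the replayed code requests a call the journal does not contain, failing loudly is safer than silently executing the tool.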
Tool budgets
Per workflow, per agent, per tool. Caps on call counts and on spend (£). A budget breach raises a typed error before the model is called.
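One way to enforce a budget before the model is invoked is to reserve the call against every applicable scope first. The sketch below assumes spend is tracked in pence and that scopes are plain strings such as `tool:search`. `BudgetExceededError`, `BudgetLedger`, and `callModelWithBudget` are hypothetical names, not the repository's API.

```typescript
// All names below are hypothetical; they illustrate the idea, not the repository's API.
class BudgetExceededError extends Error {
  readonly scope: string;
  readonly limit: number;

  constructor(scope: string, kind: "calls" | "spend", limit: number) {
    super(`${kind} budget exceeded for ${scope} (limit ${limit})`);
    Object.setPrototypeOf(this, new.target.prototype); // keeps instanceof working on older targets
    this.name = "BudgetExceededError";
    this.scope = scope;
    this.limit = limit;
  }
}

type Budget = { maxCalls: number; maxSpendPence: number };
type Usage = { calls: number; spendPence: number };

class BudgetLedger {
  private readonly usage: Record<string, Usage> = {};
  private readonly budgets: Record<string, Budget>;

  constructor(budgets: Record<string, Budget>) {
    this.budgets = budgets;
  }

  // Reserves one call and its estimated cost against every scope, or throws.
  reserve(scopes: string[], estimatedPence: number): void {
    // Check every scope first so a rejected call leaves no partial usage behind.
    for (const scope of scopes) {
      const budget = this.budgets[scope];
      if (!budget) continue; // no budget configured for this scope
      const used = this.usage[scope] ?? { calls: 0, spendPence: 0 };
      if (used.calls + 1 > budget.maxCalls) {
        throw new BudgetExceededError(scope, "calls", budget.maxCalls);
      }
      if (used.spendPence + estimatedPence > budget.maxSpendPence) {
        throw new BudgetExceededError(scope, "spend", budget.maxSpendPence);
      }
    }
    for (const scope of scopes) {
      const used = this.usage[scope] ?? { calls: 0, spendPence: 0 };
      this.usage[scope] = { calls: used.calls + 1, spendPence: used.spendPence + estimatedPence };
    }
  }
}

// The reservation happens first, so a breach throws before the model is invoked.
function callModelWithBudget(
  ledger: BudgetLedger,
  scopes: string[],
  estimatedPence: number,
  callModel: () => string,
): string {
  ledger.reserve(scopes, estimatedPence);
  return callModel();
}

// Usage: the third call breaches the per-tool call cap and never reaches the model.
const ledger = new BudgetLedger({ "tool:search": { maxCalls: 2, maxSpendPence: 50 } });
const scopes = ["workflow:demo", "agent:researcher", "tool:search"];
let modelCalls = 0;
const fakeModel = () => { modelCalls++; return "ok"; };

callModelWithBudget(ledger, scopes, 10, fakeModel);
callModelWithBudget(ledger, scopes, 10, fakeModel);
try {
  callModelWithBudget(ledger, scopes, 10, fakeModel);
} catch (err) {
  if (err instanceof BudgetExceededError) console.log(err.scope, modelCalls);
}
```

Because the error is a distinct class carrying `scope` and `limit`, callers can branch on it without parsing message strings.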
Inspector UI
Next.js app that connects to the orchestrator. Live graph state, message timelines, replay controls. Built for engineers, not for marketing screenshots.
Drizzle schemas
Type-safe Postgres access via Drizzle ORM. Migrations live in the repo. The schema is small enough to read in one sitting.
BullMQ queues
Each agent step is a BullMQ job. Backoff, concurrency, and dead-letter queues come for free. Redis is the only extra dependency.
Stream-friendly
Workflow events stream to the Inspector via SSE. Tail any run in real time without polling.
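For readers unfamiliar with SSE, a workflow event on the wire is a small text frame. The sketch below formats events as frames and uses the standard `Last-Event-ID` reconnect header to resume a tail. The `WorkflowEvent` shape and both function names are assumptions for illustration; the repository's actual event schema may differ.

```typescript
// Hypothetical event shape; the repository's actual event schema may differ.
type WorkflowEvent = { seq: number; runId: string; type: string; payload: unknown };

// Formats one event as a Server-Sent Events frame: id, event name, JSON data, blank line.
// JSON.stringify without indentation emits a single line, so one data field is enough.
function toSseFrame(event: WorkflowEvent): string {
  const data = JSON.stringify({ runId: event.runId, payload: event.payload });
  return `id: ${event.seq}\nevent: ${event.type}\ndata: ${data}\n\n`;
}

// Events a reconnecting client has not seen yet, based on its Last-Event-ID header value.
// A missing or unparseable id means "send everything".
function eventsAfter(events: WorkflowEvent[], lastEventId: string | undefined): WorkflowEvent[] {
  const parsed = lastEventId === undefined ? NaN : Number(lastEventId);
  const last = Number.isFinite(parsed) ? parsed : -1;
  return events.filter((e) => e.seq > last);
}

// Usage: a client that already saw event 1 reconnects and receives only event 2.
const events: WorkflowEvent[] = [
  { seq: 1, runId: "run-1", type: "step.started", payload: { step: "plan" } },
  { seq: 2, runId: "run-1", type: "step.completed", payload: { step: "plan" } },
];
console.log(eventsAfter(events, "1").map(toSseFrame).join(""));
```

Using the journal sequence number as the SSE `id` is a design choice in this sketch: it lets the stream resume from the same ordering that replay uses.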
Tenancy isolation
Workflows are scoped to a tenant. Postgres row-level security policies enforce isolation at the database layer.
OpenTelemetry
Every workflow step emits a span. Traces nest agent calls under workflow calls and tool calls under agent calls.
Tech stack
Architecture, in one diagram
The whole system on a single screen. Every box maps to a real folder in the repo.
┌────────────────────────────────────────────────────────────────────┐
│                     Agent Orchestrator runtime                     │
│                                                                    │
│  ┌──────────────────┐   ┌───────────────────┐   ┌──────────────┐   │
│  │ Workflow runner  │ ◀ │ BullMQ step queue │ ◀ │ HTTP / tRPC  │   │
│  │ - state machine  │   │ - retries         │   │ - new run    │   │
│  │ - replay engine  │   │ - DLQ             │   │ - resume     │   │
│  └────────┬─────────┘   └────────┬──────────┘   └──────────────┘   │
│           │                      │                                 │
│           ▼                      ▼                                 │
│      ┌──────────────────────────────────────────────┐              │
│      │ Postgres (Drizzle)                           │              │
│      │ - workflows, runs, events, budgets, tools    │              │
│      │ - row-level security per tenant              │              │
│      └──────────────────────────────────────────────┘              │
│                 ▲                                                  │
│                 │                                                  │
│      ┌──────────────────────┐                                      │
│      │ Inspector UI         │  Next.js · tRPC · SSE event tail     │
│      │ live graph + replay  │                                      │
│      └──────────────────────┘                                      │
└────────────────────────────────────────────────────────────────────┘
Quick start
From clone to first workflow run in under five minutes.
git clone https://github.com/sarmakska/agent-orchestrator.git
cd agent-orchestrator

pnpm install
docker compose up -d       # Postgres + Redis on localhost
cp .env.example .env       # set DATABASE_URL, REDIS_URL, OPENAI_KEY

pnpm db:migrate            # Drizzle migrations
pnpm dev:runner            # workflow runner + queue worker
pnpm dev:inspector         # Inspector UI on :4000

pnpm orch run examples/research-and-summarise.ts \
  --input '{"topic":"battery storage 2026"}'

Where it fits
The patterns this repository was built around.
Research and write workflows
A planner agent breaks a topic into questions; researcher agents fetch and summarise; an editor stitches output. Replayable and audit-ready.
Internal ops automation
Multi-step workflows over Slack, Linear, Notion, GitHub. Tool budgets stop a runaway loop from rate-limiting your accounts.
Data extraction pipelines
Document-in, structured-out at scale. The orchestrator handles retries, partial failures, and idempotency keys so you do not.
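To show what an idempotency key buys you on a retry, here is a minimal sketch. It assumes a key derived from run id, step name, and input, and an in-memory map in place of a results table. `idempotencyKey` and `runOnce` are hypothetical helpers, not the repository's API.

```typescript
// Hypothetical helper: a stable key per (run, step, input), so a retry maps to the same key.
// JSON.stringify is sensitive to object key order, so inputs should be built consistently.
function idempotencyKey(runId: string, step: string, input: unknown): string {
  return `${runId}:${step}:${JSON.stringify(input)}`;
}

// In-memory stand-in for a results table keyed by idempotency key.
const results = new Map<string, string>();

// Runs the side effect at most once per key; a retry gets the stored result back.
function runOnce(key: string, effect: () => string): string {
  const stored = results.get(key);
  if (stored !== undefined) return stored;
  const result = effect();
  results.set(key, result);
  return result;
}

// Usage: the retry of the same step does not perform the extraction a second time.
let writes = 0;
const extract = () => {
  writes++;
  return "invoice-1: total 120.00";
};
const key = idempotencyKey("run-7", "extract", { doc: "invoice-1.pdf" });
runOnce(key, extract);
runOnce(key, extract); // simulated retry after a partial failure
console.log(writes); // prints 1
```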
Customer-facing agent flows
Flows where customers can pause and resume a long-running task. Durable state survives restarts; replay lets support reproduce any past run.
Related products
The wider Sarma Linux toolkit. Every project ships with the same opinions: open source, MIT, real depth, no marketing fluff.
Run agents like payments. Durable, replayable, audited.
Clone the repo, follow the four-step quick start, ship something real.