Open Source · MIT · TypeScript

Multi-agent workflows you can replay, audit, and trust.

Durable state. Deterministic replay. Tool budgets. The reliability tier most agent demos lack.

Agent Orchestrator runs multi-agent workflows the way payment systems run payments. Every step is journaled to Postgres, every tool call has a budget, every workflow can be paused, resumed, and replayed bit-for-bit. The Inspector UI shows the live agent graph and lets you step through past runs at the message level. TypeScript end-to-end, Postgres for state, Redis with BullMQ for the queue.


Durable state · Deterministic replay · Budgets per tool, per agent · Inspector with live graph UI · MIT licence

Why this exists

A single agent doing one task on one happy path is a demo. A graph of agents handing work to each other across hours, with retries, partial failures, and tool budgets, is a system. The leap from one to the other is where most agent projects stall.

The bigger frameworks (LangGraph, CrewAI, AutoGen) help you describe the graph but leave durability and replay as exercises for the reader. The smaller libraries solve the prompting layer and treat workflow concerns as out of scope. The result is that every team trying to ship an agent product writes the same Postgres journal, the same BullMQ wiring, and the same retry plumbing.

Agent Orchestrator is that infrastructure layer, written once. Workflows live in Postgres. Steps are events. Replay reads the events back deterministically. Tool budgets are enforced before the model is invoked. The Inspector UI is the missing observability tier.

What it does

Every feature below ships in the public repository today. Clone, configure, run.

Workflow as code

Workflows are TypeScript functions. Agents and tools are declared inline. The graph is whatever shape your code makes.
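
A minimal sketch of what "workflow as code" can look like. The `Tool` and `Agent` shapes, the stub agents, and `researchAndSummarise` are hypothetical names for illustration, not the orchestrator's actual API, and no model is called.

```typescript
// Hypothetical shapes for illustration; not the orchestrator's real API.
type Tool<I, O> = { name: string; run: (input: I) => Promise<O> };
type Agent<I, O> = {
  name: string;
  tools: ReadonlyArray<{ name: string }>;
  act: (input: I) => Promise<O>;
};

// A tool declared inline. A real one would call a search API.
const search: Tool<string, string[]> = {
  name: "search",
  run: async (query) => [`stub result for "${query}"`],
};

// Stub agents: a real agent would prompt a model inside `act`.
const planner: Agent<string, string[]> = {
  name: "planner",
  tools: [],
  act: async (topic) => [`What is ${topic}?`, `Who works on ${topic}?`],
};

const researcher: Agent<string, string> = {
  name: "researcher",
  tools: [search],
  act: async (question) => (await search.run(question)).join("; "),
};

// The workflow is an ordinary async function. The graph here is a fan-out
// from the planner to one researcher call per question, then a join.
async function researchAndSummarise(input: { topic: string }): Promise<string> {
  const questions = await planner.act(input.topic);
  const notes = await Promise.all(questions.map((q) => researcher.act(q)));
  return notes.join("\n");
}
```

Because the graph is plain control flow, loops, branches, and `Promise.all` fan-outs need no separate graph DSL.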

Durable state

Every step writes a row to Postgres before it runs. Crashes resume from the last checkpoint, not the start.
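
The journal-then-run pattern can be sketched as follows. An in-memory array stands in for the Postgres table, and `Journal`, `JournalRow`, and `runWorkflow` are hypothetical names, so treat this as an illustration of the idea rather than the repository's code.

```typescript
// In-memory stand-in for the Postgres journal; all names are hypothetical.
type JournalRow = {
  runId: string;
  step: string;
  status: "started" | "completed";
  output?: unknown;
};

class Journal {
  rows: JournalRow[] = [];

  append(row: JournalRow): void {
    this.rows.push(row);
  }

  completed(runId: string, step: string): JournalRow | undefined {
    return this.rows.find(
      (r) => r.runId === runId && r.step === step && r.status === "completed",
    );
  }
}

type Step = { name: string; run: () => unknown };

// Runs steps in order. Each step is journaled before it executes, and a
// step that already has a "completed" row is skipped on resume.
function runWorkflow(journal: Journal, runId: string, steps: Step[]): unknown[] {
  const outputs: unknown[] = [];
  for (const step of steps) {
    const done = journal.completed(runId, step.name);
    if (done) {
      outputs.push(done.output);
      continue;
    }
    journal.append({ runId, step: step.name, status: "started" });
    const output = step.run(); // a crash here leaves only the "started" row
    journal.append({ runId, step: step.name, status: "completed", output });
    outputs.push(output);
  }
  return outputs;
}
```

In this sketch a step that crashed mid-run executes again on resume, so steps with side effects would need to be idempotent.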

Deterministic replay

Re-run any workflow from any checkpoint with the same inputs. Tool calls are replayed from the journal, not re-executed.
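
One way to replay tool calls from a journal is to route every call through a gateway that records results in live mode and returns the recorded results in replay mode. The sketch below assumes calls happen in the same order on replay. `ToolGateway` and `ToolCallRecord` are hypothetical names.

```typescript
// Hypothetical sketch: the recorded-call journal is a plain array here.
type ToolFn = (args: unknown) => unknown;
type ToolCallRecord = { seq: number; tool: string; args: string; result: unknown };

class ToolGateway {
  private seq = 0;
  private journal: ToolCallRecord[];
  private mode: "live" | "replay";
  private tools: Record<string, ToolFn>;

  constructor(journal: ToolCallRecord[], mode: "live" | "replay", tools: Record<string, ToolFn>) {
    this.journal = journal;
    this.mode = mode;
    this.tools = tools;
  }

  call(tool: string, args: unknown): unknown {
    const seq = this.seq++;
    const key = JSON.stringify(args);
    if (this.mode === "replay") {
      const rec = this.journal[seq];
      // Divergence from the recorded run is an error, not a silent re-execution.
      if (!rec || rec.tool !== tool || rec.args !== key) {
        throw new Error(`replay diverged at call ${seq} (${tool})`);
      }
      return rec.result;
    }
    const fn = this.tools[tool];
    if (!fn) throw new Error(`unknown tool: ${tool}`);
    const result = fn(args);
    this.journal.push({ seq, tool, args: key, result });
    return result;
  }
}
```

Replay stays deterministic only if everything non-deterministic (tool results, model outputs, clocks, random values) passes through a recorder like this one.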

Tool budgets

Budgets apply per workflow, per agent, and per tool, as call-count limits and spend caps in £. A budget breach raises a typed error before the model is called.
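
A sketch of a pre-call budget check with a typed error. `BudgetExceededError`, `BudgetLedger`, and the scope strings are hypothetical names, and amounts are held in pence as an assumption to avoid floating-point drift.

```typescript
// Hypothetical names throughout; amounts are in pence to avoid float drift.
class BudgetExceededError extends Error {
  readonly scope: string;
  readonly limit: "calls" | "spend";

  constructor(scope: string, limit: "calls" | "spend") {
    super(`budget exceeded for ${scope}: ${limit}`);
    Object.setPrototypeOf(this, BudgetExceededError.prototype);
    this.name = "BudgetExceededError";
    this.scope = scope;
    this.limit = limit;
  }
}

type Budget = { maxCalls: number; maxSpendPence: number };
type Usage = { calls: number; spendPence: number };

class BudgetLedger {
  private usage = new Map<string, Usage>();
  private budgets: Map<string, Budget>;

  constructor(budgets: Map<string, Budget>) {
    this.budgets = budgets;
  }

  // Reserve one call and its estimated cost against every scope, or throw
  // before anything is charged. Call this before invoking the model or tool.
  reserve(scopes: string[], estimatedPence: number): void {
    for (const scope of scopes) {
      const budget = this.budgets.get(scope);
      if (!budget) continue; // no budget configured for this scope
      const used = this.usage.get(scope) ?? { calls: 0, spendPence: 0 };
      if (used.calls + 1 > budget.maxCalls) {
        throw new BudgetExceededError(scope, "calls");
      }
      if (used.spendPence + estimatedPence > budget.maxSpendPence) {
        throw new BudgetExceededError(scope, "spend");
      }
    }
    for (const scope of scopes) {
      const used = this.usage.get(scope) ?? { calls: 0, spendPence: 0 };
      this.usage.set(scope, {
        calls: used.calls + 1,
        spendPence: used.spendPence + estimatedPence,
      });
    }
  }
}
```

Checking every scope before charging any of them means a breach at the tool level does not consume workflow-level budget.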

Inspector UI

Next.js app that connects to the orchestrator. Live graph state, message timelines, replay controls. Built for engineers, not for marketing screenshots.

Drizzle schemas

Type-safe Postgres access via Drizzle ORM. Migrations live in the repo. The schema is small enough to read in one sitting.

BullMQ queues

Each agent step is a BullMQ job. Backoff, concurrency, and dead-letter queues come for free. Redis is the only extra dependency.
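
To make the backoff behaviour concrete, the sketch below computes the retry schedule implied by a set of job options. `RetryOptions` is a local type that mirrors the shape of BullMQ's `attempts` and `backoff` job options so the example runs without Redis. The values are illustrative, and the exponential formula is the one given in BullMQ's retry documentation, so check it against the version you run.

```typescript
// Local type mirroring the shape of BullMQ's `attempts` and `backoff` job
// options, so the sketch runs without Redis. Values are illustrative.
type RetryOptions = {
  attempts: number;
  backoff: { type: "fixed" | "exponential"; delay: number };
};

const stepJobOptions: RetryOptions = {
  attempts: 5,
  backoff: { type: "exponential", delay: 1000 },
};

// Delay in ms before each retry. For "exponential", retry n waits
// 2^(n-1) * delay; for "fixed", every retry waits `delay`.
function retryDelays(opts: RetryOptions): number[] {
  const retries = Math.max(opts.attempts - 1, 0);
  return Array.from({ length: retries }, (_, i) =>
    opts.backoff.type === "exponential"
      ? Math.pow(2, i) * opts.backoff.delay
      : opts.backoff.delay,
  );
}
```

With five attempts and a 1000 ms base delay, the four retries wait 1000, 2000, 4000, and 8000 ms.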

Stream-friendly

Workflow events stream to the Inspector via SSE. Tail any run in real time without polling.
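
On the wire, an SSE stream is a sequence of text frames. A minimal sketch of serialising one workflow event follows. The `WorkflowEvent` shape and the `step.completed` event name are assumptions; only the `id:` / `event:` / `data:` framing is standard.

```typescript
// Hypothetical event shape; only the wire format below is standard SSE.
type WorkflowEvent = {
  id: number;
  type: string;
  payload: Record<string, unknown>;
};

// Serialise one event as a Server-Sent Events frame: `id:` and `event:`
// fields, a single `data:` line, then a blank line to end the frame.
// JSON.stringify without indentation never emits raw newlines, so the
// payload always fits on one `data:` line.
function toSseFrame(event: WorkflowEvent): string {
  const data = JSON.stringify(event.payload);
  return `id: ${event.id}\nevent: ${event.type}\ndata: ${data}\n\n`;
}
```

Setting `id:` lets a browser's `EventSource` send `Last-Event-ID` when it reconnects, so a server can resume the tail from the journal instead of restarting it.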

Tenancy isolation

Workflows are scoped to a tenant. Postgres row-level security policies enforce isolation at the database layer.
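
A sketch of the kind of policy this implies, generated from TypeScript so it could sit in a migration script. The `tenant_id` column, the `app.tenant_id` setting, and the `tenant_isolation` policy name are assumptions, not the repository's schema.

```typescript
// Table, column, and setting names are assumptions for illustration.
// Identifiers are interpolated into SQL, so only plain names are accepted.
const IDENTIFIER = /^[a-z_][a-z0-9_]*$/;

function tenantPolicySql(table: string, settingName = "app.tenant_id"): string[] {
  if (!IDENTIFIER.test(table)) {
    throw new Error(`unsafe table name: ${table}`);
  }
  if (!/^[a-z_][a-z0-9_.]*$/.test(settingName)) {
    throw new Error(`unsafe setting name: ${settingName}`);
  }
  const predicate = `tenant_id = current_setting('${settingName}')::uuid`;
  return [
    `ALTER TABLE ${table} ENABLE ROW LEVEL SECURITY;`,
    // FORCE makes the policy apply to the table owner as well.
    `ALTER TABLE ${table} FORCE ROW LEVEL SECURITY;`,
    `CREATE POLICY tenant_isolation ON ${table} USING (${predicate}) WITH CHECK (${predicate});`,
  ];
}
```

Under this scheme each request would set the tenant inside its transaction, for example with `SELECT set_config('app.tenant_id', $1, true)`, before touching any tenant-scoped table.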

OpenTelemetry

Every workflow step emits a span. Traces nest agent calls under workflow calls and tool calls under agent calls.

Architecture, in one diagram

The whole system on a single screen. Every box maps to a real folder in the repo.

┌────────────────────────────────────────────────────────────────────┐
│                    Agent Orchestrator runtime                       │
│                                                                     │
│  ┌──────────────────┐    ┌───────────────────┐   ┌──────────────┐  │
│  │ Workflow runner   │ ◀ │ BullMQ step queue │ ◀ │ HTTP / tRPC   │  │
│  │  - state machine  │    │  - retries        │    │  - new run    │  │
│  │  - replay engine  │    │  - DLQ            │    │  - resume     │  │
│  └────────┬─────────┘    └────────┬──────────┘   └──────────────┘  │
│           │                        │                                │
│           ▼                        ▼                                │
│  ┌──────────────────────────────────────────────┐                  │
│  │  Postgres (Drizzle)                          │                  │
│  │   - workflows, runs, events, budgets, tools  │                  │
│  │   - row-level security per tenant            │                  │
│  └──────────────────────────────────────────────┘                  │
│           ▲                                                         │
│           │                                                         │
│  ┌──────────────────────┐                                          │
│  │  Inspector UI        │  Next.js · tRPC · SSE event tail         │
│  │  live graph + replay │                                          │
│  └──────────────────────┘                                          │
└────────────────────────────────────────────────────────────────────┘

Quick start

From clone to first workflow run in under five minutes.

01

git clone https://github.com/sarmakska/agent-orchestrator.git
cd agent-orchestrator

02

pnpm install
docker compose up -d  # Postgres + Redis on localhost
cp .env.example .env  # set DATABASE_URL, REDIS_URL, OPENAI_KEY

03

pnpm db:migrate     # Drizzle migrations
pnpm dev:runner     # workflow runner + queue worker
pnpm dev:inspector  # Inspector UI on :4000

04

pnpm orch run examples/research-and-summarise.ts \
   --input '{"topic":"battery storage 2026"}'

Where it fits

The patterns this repository was built around.

Research and write workflows

A planner agent breaks a topic into questions; researcher agents fetch and summarise; an editor stitches output. Replayable and audit-ready.

Internal ops automation

Multi-step workflows over Slack, Linear, Notion, GitHub. Tool budgets stop a runaway loop from rate-limiting your accounts.

Data extraction pipelines

Document-in, structured-out at scale. The orchestrator handles retries, partial failures, and idempotency keys so you do not.

Customer-facing agent flows

For products where customers can pause and resume a long-running task. Durable state survives restarts; replay lets support reproduce any past run.

Related products

The wider Sarma Linux toolkit. Every project ships with the same opinions: open source, MIT, real depth, no marketing fluff.

SarmaLink-AI

Multi-provider AI assistant with sub-50ms failover across 36 engines.

MCP Server Toolkit

Production-ready Model Context Protocol server starter, with plugins.

Voice Agent Starter

Sub-second real-time voice loop with WebRTC, barge-in, and pluggable STT/TTS.

AI Eval Runner

Evals as code. Datasets, scorers, traces, regressions, all in one CLI.

Local LLM Router

OpenAI-compatible proxy that routes between local Ollama and cloud LLMs.

StaffPortal

Open-source HR + ops platform built to replace three SaaS subscriptions.

RAG-over-PDF

A minimal, production-shaped RAG starter with cited streaming answers.

Receipt Scanner

Vision-OCR receipt scanning starter with Zod-typed JSON output.

Webhook-to-Email

A tiny, production-grade webhook receiver with HMAC and React Email.

Run agents like payments. Durable, replayable, audited.