WebAssembly sandbox · MIT · Built on wasmtime

sandboxd, three fences and no ambient authority

Run untrusted WebAssembly with three independent fences: fuel, wall-clock, memory. A deny-by-default host ABI on top of wasmtime. No WASI, no clock, no filesystem, no network, no environment until you explicitly grant it.

Typed SandboxError per failure mode so callers branch on why a run stopped (bill it, retry it, ban the module) without scraping strings. Six public items. One file for the host boundary.

At a glance

3 independent fences

1 audited host import

522 fuel for fib(30), every run

~10ms cold CLI invocation

MIT license

Fuel: deterministic, replayable, billable

Epoch interruption: catches time spent in host calls

ResourceLimiter: refuses growth past the cap

Allow-list: walked before instantiation

Why this exists

I wanted to run code I did not write, and did not trust, inside my own process, without giving it the process. The classic answers are a container or a virtual machine per call, but spinning one of those up to evaluate a few hundred instructions of someone's plugin is absurd overhead, and it still leaves you trusting a much bigger surface.

WebAssembly is the right shape for this. A guest cannot name an address it was not given, cannot call a function it was not handed, and runs on a runtime built for exactly this. What was missing for me was a small, auditable layer that turns wasmtime's primitives into three hard fences with a typed answer for why a run stopped.

sandboxd is that layer. Fuel metering, epoch interruption driven by a per-run watchdog, a ResourceLimiter for memory, and a linker that defines only the imports you opt into. The host boundary is one file you can read in a coffee break. The public API is six items.

Why this matters

A container per call is the wrong shape

Running untrusted code used to mean a container or a virtual machine per call, with all the cold-start cost and the much larger trust surface that brings. WebAssembly with three hard fences and zero ambient authority has a much smaller blast radius. sandboxd is the small, auditable layer that turns wasmtime primitives into a typed answer for why a run stopped.

The attacks, and how each one dies

Five hostile fixtures ship in the repo. Every row is exercised by an integration test in tests/sandbox.rs. Two of the rows are the same module stopped by two different fences; that redundancy is the whole design.

Fixture	Attack	How it is stopped	Error	Exit
infinite_loop.wat	spin forever on a back-edge loop	fuel runs out: executed instructions deduct from the budget until it reaches zero	FuelExhausted	2
infinite_loop.wat (huge fuel)	spin forever with fuel set so high it never empties	the epoch watchdog bumps the engine epoch after the deadline; the guest trips at its next loop check	Timeout	3
memory_bomb.wat	call memory.grow in a loop until the host is starved	the ResourceLimiter refuses growth at the cap; memory.grow returns -1; however the guest reacts, the run is reported as a cap breach	MemoryLimitExceeded	4
disallowed_import.wat	import env::secret, a capability that does not exist	rejected before instantiation, so no guest code runs; the error names the import	DisallowedImport	5
logger.wat (no grant)	import host::log without it being granted	same deny-by-default rejection; even the one known capability is off until you ask for it	DisallowedImport	5

Built-in guarantees

Each one is a property the design enforces, not a flag you have to remember to set.

Fuel-metered CPU bound

Guest instructions deduct from a fuel budget as they execute. When the budget hits zero the guest stops with SandboxError::FuelExhausted. Fuel is deterministic: the same module on the same inputs consumes the same fuel every run, which is what makes it a replayable quota and a credible billing unit.
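
To make the idea concrete, here is a minimal sketch of fuel metering over a toy instruction set. It is not wasmtime's implementation and not sandboxd's code: the Op and Outcome types and the run_metered function are invented for illustration, and the flat cost of one unit per instruction is an assumption of the sketch.

```rust
/// Toy instruction set, invented for this sketch (not WebAssembly).
#[derive(Clone, Copy)]
pub enum Op {
    /// Add a constant to the accumulator.
    Add(i64),
    /// Jump back to instruction 0 while the accumulator is below the bound.
    LoopWhileBelow(i64),
}

#[derive(Debug, PartialEq)]
pub enum Outcome {
    Finished { acc: i64, fuel_consumed: u64 },
    FuelExhausted,
}

/// Run `program`, charging one unit of fuel per executed instruction
/// (an assumption of this sketch, not a statement about wasmtime's costs).
pub fn run_metered(program: &[Op], fuel: u64) -> Outcome {
    let mut remaining = fuel;
    let mut acc = 0i64;
    let mut pc = 0usize;
    while pc < program.len() {
        // Charge before executing; an empty tank stops the guest.
        if remaining == 0 {
            return Outcome::FuelExhausted;
        }
        remaining -= 1;
        match program[pc] {
            Op::Add(n) => {
                acc = acc.wrapping_add(n);
                pc += 1;
            }
            Op::LoopWhileBelow(bound) => {
                pc = if acc < bound { 0 } else { pc + 1 };
            }
        }
    }
    Outcome::Finished { acc, fuel_consumed: fuel - remaining }
}
```

Because the meter depends only on the program and its inputs, two runs of the same program report the same fuel_consumed, and a loop that never exits ends in FuelExhausted no matter how long the host would have waited.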

Wall-clock fence via epoch interruption

A per-run watchdog thread sleeps until the configured deadline, bumps the engine epoch once, then exits. The guest trips at its next safe point with SandboxError::Timeout. This catches code that does not burn fuel predictably, including time spent inside host calls.
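
The watchdog pattern described above can be modelled with the standard library alone. This is a sketch under stated assumptions, not sandboxd's source: the epoch is a plain atomic counter standing in for the engine epoch, and Stop and run_with_deadline are hypothetical names.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

#[derive(Debug, PartialEq)]
pub enum Stop {
    Finished(u64),
    Timeout,
}

/// Do `steps` iterations of busy work (`None` means spin forever), or stop
/// at the first safe point reached after the deadline has passed.
pub fn run_with_deadline(steps: Option<u64>, deadline: Duration) -> Stop {
    // Stand-in for the engine epoch: 0 until the watchdog bumps it.
    let epoch = Arc::new(AtomicU64::new(0));

    // Watchdog: sleep until the deadline, bump the epoch once, then exit.
    // The thread is detached; after an early finish it simply wakes up,
    // bumps a counter nobody reads any more, and ends.
    let watchdog_epoch = Arc::clone(&epoch);
    thread::spawn(move || {
        thread::sleep(deadline);
        watchdog_epoch.fetch_add(1, Ordering::SeqCst);
    });

    let mut done = 0u64;
    loop {
        // Safe point: the worker only notices the deadline here.
        if epoch.load(Ordering::SeqCst) != 0 {
            return Stop::Timeout;
        }
        if Some(done) == steps {
            return Stop::Finished(done);
        }
        done = done.wrapping_add(1);
        // Keep the optimizer from deleting the busy work.
        std::hint::black_box(done);
    }
}
```

The worker is never pre-empted; it stops only when it next reaches a check. That is why the text above says the guest trips "at its next safe point" rather than at the instant the deadline passes.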

Memory cap via ResourceLimiter

A wasmtime ResourceLimiter refuses linear-memory and table growth past the configured cap. memory.grow returns -1 to the guest; however the guest reacts, the run is reported as SandboxError::MemoryLimitExceeded.
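
A rough model of that behaviour, written against the standard library only: CapLimiter and memory_grow are hypothetical names that imitate the shape of a limiter callback, not wasmtime's trait or sandboxd's code. The sketch covers linear memory only and assumes the standard 64 KiB WebAssembly page.

```rust
/// WebAssembly linear memory grows in 64 KiB pages.
const PAGE: usize = 65_536;

/// Hypothetical stand-in for a memory limiter with a breach flag.
pub struct CapLimiter {
    cap_bytes: usize,
    breached: bool,
}

impl CapLimiter {
    pub fn new(cap_bytes: usize) -> Self {
        CapLimiter { cap_bytes, breached: false }
    }

    /// Decide whether memory may grow to `desired` bytes. A refusal is
    /// remembered so the host can report it regardless of what the guest
    /// does with the -1 it gets back.
    pub fn memory_growing(&mut self, _current: usize, desired: usize) -> bool {
        if desired > self.cap_bytes {
            self.breached = true;
            false
        } else {
            true
        }
    }

    pub fn breached(&self) -> bool {
        self.breached
    }
}

/// Model of the guest-visible `memory.grow`: returns the previous size in
/// pages on success, or -1 when the limiter refuses.
pub fn memory_grow(limiter: &mut CapLimiter, pages: &mut usize, delta: usize) -> i32 {
    let desired_pages = pages.saturating_add(delta);
    let current_bytes = pages.saturating_mul(PAGE);
    let desired_bytes = desired_pages.saturating_mul(PAGE);
    if limiter.memory_growing(current_bytes, desired_bytes) {
        let previous = *pages;
        *pages = desired_pages;
        previous as i32
    } else {
        -1
    }
}
```

The breach flag is the important design point: a guest is free to ignore the -1 and carry on, so the host checks the flag after the run instead of trusting the guest to fail loudly.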

Deny-by-default host ABI

No WASI, no clock, no filesystem, no network, no environment. Every import is walked against an allow-list before instantiation. A module that imports anything ungranted is rejected with SandboxError::DisallowedImport naming the exact offending import, before any guest code runs.
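
The allow-list walk is simple enough to sketch in a few lines. The version below is illustrative only: it takes imports as plain (module, name) string pairs instead of a compiled module, and CheckError and check_imports are hypothetical names, not sandboxd's API.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq)]
pub enum CheckError {
    /// Carries the exact import that was not granted.
    DisallowedImport { module: String, name: String },
}

/// Walk every declared import and reject the first one that is not on the
/// allow-list. An empty allow-list therefore rejects every import.
pub fn check_imports<'a>(
    imports: impl IntoIterator<Item = (&'a str, &'a str)>,
    allowed: &HashSet<(&'a str, &'a str)>,
) -> Result<(), CheckError> {
    for (module, name) in imports {
        if !allowed.contains(&(module, name)) {
            return Err(CheckError::DisallowedImport {
                module: module.to_string(),
                name: name.to_string(),
            });
        }
    }
    Ok(())
}
```

Running this check before instantiation is what guarantees that no guest code, not even a start function, executes in a module that asks for something it was not given.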

One audited capability, host::log

Opt in with HostAbi::deny_all().allow_log() and receive a shared sink. The implementation reads a pointer and length, validates with checked_add, slices with get so out-of-range reads trap, and runs the bytes through from_utf8_lossy so bad UTF-8 never crashes the host. Small enough to audit in full.
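
The three checks named above (checked_add, get, from_utf8_lossy) fit in one small function. This is a sketch of the technique, not the project's source: guest memory is modelled as a byte slice, read_guest_str is a hypothetical name, and an out-of-range read returns None where a real host function would trap.

```rust
/// Read `len` bytes starting at `ptr` from guest memory.
/// Returns `None` if the range overflows or falls outside the memory;
/// a host function would turn that into a trap.
pub fn read_guest_str(memory: &[u8], ptr: u32, len: u32) -> Option<String> {
    let start = ptr as usize;
    // checked_add: the end offset must not wrap around.
    let end = start.checked_add(len as usize)?;
    // get: bounds-checked slicing, no panic on a bad range.
    let bytes = memory.get(start..end)?;
    // from_utf8_lossy: invalid sequences become U+FFFD instead of an error.
    Some(String::from_utf8_lossy(bytes).into_owned())
}
```

Every value in the function comes from the guest, so none of it is trusted: the pointer, the length, and the bytes themselves each pass through a check that cannot panic.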

Typed SandboxError per failure mode

FuelExhausted, Timeout, MemoryLimitExceeded, DisallowedImport, InvalidModule, ExportNotFound, Trap. Callers branch on why a run stopped (bill it, retry it, ban the module) without scraping strings. The CLI maps each variant to its own exit code.
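
Branching on a typed error might look like the sketch below. The variant names come from the list above, but the enum here is a local stand-in declared without payloads (the real type may carry data such as the offending import), and the Action policy is purely hypothetical. The exit codes are the four shown in the fixture table; codes for the other variants are not stated on this page, so the sketch returns None for them instead of guessing.

```rust
/// Local stand-in for the crate's error type; payloads omitted (assumption).
#[derive(Debug, Clone, PartialEq)]
pub enum SandboxError {
    FuelExhausted,
    Timeout,
    MemoryLimitExceeded,
    DisallowedImport,
    InvalidModule,
    ExportNotFound,
    Trap,
}

/// Hypothetical caller-side policy, for illustration only.
#[derive(Debug, PartialEq)]
pub enum Action {
    Bill,
    Retry,
    Ban,
    Report,
}

/// One possible mapping from "why it stopped" to "what to do next".
pub fn action_for(err: &SandboxError) -> Action {
    match err {
        SandboxError::FuelExhausted | SandboxError::MemoryLimitExceeded => Action::Bill,
        SandboxError::Timeout => Action::Retry,
        SandboxError::DisallowedImport | SandboxError::InvalidModule => Action::Ban,
        SandboxError::ExportNotFound | SandboxError::Trap => Action::Report,
    }
}

/// Exit codes as listed in the fixture table; `None` where the page gives none.
pub fn documented_exit_code(err: &SandboxError) -> Option<i32> {
    match err {
        SandboxError::FuelExhausted => Some(2),
        SandboxError::Timeout => Some(3),
        SandboxError::MemoryLimitExceeded => Some(4),
        SandboxError::DisallowedImport => Some(5),
        _ => None,
    }
}
```

The match in action_for is exhaustive, so adding a variant to the enum becomes a compile error at every call site that has not decided how to handle it.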

Fresh store per run

Fuel, the epoch deadline, the memory limiter, linear memory and globals are all per-store. One run cannot observe or influence another. Run isolation is the default, not an afterthought.

Determinism you can rely on

fib(30) returns I32(832040) and consumes exactly 522 fuel on every run, on every machine. That repeatability is what lets fuel double as a quota you can reproduce and a unit you can charge against.

Tiny public surface

Sandbox, Limits, HostAbi, SandboxError, Value, RunOutput. Six items. Nothing else to learn. The whole API fits on one screen and the host boundary is one file you can read in a coffee break.

Honest about non-goals

No defence against microarchitectural side channels. No protection from DoS within the limits. As sound as wasmtime, no more. Stated up front in the threat model so you can decide if this is the right tool before you adopt it.

Tech stack

Rust 1.80+ · wasmtime 45 · Cranelift · thiserror · clap · WebAssembly · WAT fixtures · cargo

A run, start to finish

Bytes in, typed answer out. Every branch ends in a named SandboxError variant.

sandboxd run pipeline: compile, allow-list, per-run store with limits, watchdog, typed outcomes per failure mode.

Quick start

CLI in two minutes, library in five.

git clone https://github.com/sarmakska/sandboxd.git
cd sandboxd
cargo build --release

# A pure module: add(2, 40)
./target/release/sandboxd fixtures/well_behaved.wat \
  --invoke add --arg 2 --arg 40
# result: I32(42)   (fuel consumed: 4, on stderr)

# Fuel kills an infinite loop
./target/release/sandboxd fixtures/infinite_loop.wat --fuel 1000000
# exit 2

# A short deadline kills the same loop when fuel is effectively unlimited
./target/release/sandboxd fixtures/infinite_loop.wat \
  --fuel 100000000000 --timeout-ms 100
# exit 3

# Deny-by-default in action
./target/release/sandboxd fixtures/logger.wat                # exit 5
./target/release/sandboxd fixtures/logger.wat --allow-log    # runs

// Library: embed in your service
use std::time::Duration;
use sandboxd::{Sandbox, Limits, Value};

let sandbox = Sandbox::deny_all()?;
let limits = Limits::new(1_000_000, Duration::from_millis(500), 1 << 20);
let out = sandbox.run(
    wasm_bytes, "add",
    &[Value::I32(2), Value::I32(40)], &limits,
)?;
assert_eq!(out.values, vec![Value::I32(42)]);

Use cases

What sandboxd fits well, and what it does not.

Plugin systems for SaaS

Let customers ship WebAssembly modules that extend your product. They cannot read the filesystem, open a socket, or exfiltrate secrets, because none of those imports exist. They run on a fuel budget you set per plan tier.

User-supplied formulas and rules

A spreadsheet engine, a pricing rule editor, a routing condition language. Compile the user expression to wasm, run it under a tight fuel and time budget, return the result. A runaway formula stops in milliseconds.

Extension scripts in editors and IDEs

Third-party extensions that compute, transform, or validate without touching the host. The host grants exactly the imports it audited and nothing else, so a malicious extension cannot reach beyond its sandbox.

Untrusted code review and CI gating

Run a candidate module on a representative input and observe the fuel consumed. Use that as a deterministic budget for production. The CLI prints fuel-consumed on every successful run.

Replayable usage metering

Bill plugins per million fuel units. Because fuel is deterministic, every customer can reproduce their bill on their own machine. No proprietary counters, no disputes.
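
The arithmetic behind per-million-fuel billing is a single multiplication and division. In the sketch below the function name, the micro-unit currency representation, and the price used in the example are all assumptions made for illustration; only the 522-fuel figure comes from this page.

```rust
/// Charge for `fuel` units at a price quoted per million fuel units.
/// Amounts are integer micro-units of currency (an assumption of this
/// sketch) so that every party computes exactly the same bill.
/// The result is rounded down to a whole micro-unit.
pub fn charge_micros(fuel: u64, price_micros_per_million_fuel: u64) -> u128 {
    // Widen to u128 so the product cannot overflow.
    (fuel as u128 * price_micros_per_million_fuel as u128) / 1_000_000
}
```

At a hypothetical price of 2,000,000 micro-units per million fuel, the 522-fuel fib(30) run costs charge_micros(522, 2_000_000) = 1044 micro-units, and anyone who replays the run arrives at the same number.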

Safe evaluation in chat and agent tools

An agent wants to execute a small piece of guest-supplied code. WebAssembly with three fences and zero ambient authority is the right shape for that, with a much smaller blast radius than a container or a subprocess.

What it is not for

sandboxd is not a replacement for a full Linux distribution. If your guest legitimately needs files, sockets, a clock, or processes, this is the wrong tool; WASI or a container is the right one. sandboxd is also not a defence against microarchitectural side channels, and is only as sound as the underlying wasmtime.

Open source · MIT

Read it. Audit it. Embed it.

MIT licensed. Six public items. One file for the host boundary. The roadmap is a monotonic clock, a seeded RNG, and per-run fuel and memory reported together. Everything else is deliberately out of scope.
