How sandboxd works

Three independent fences, one audited host import, and a fresh Store per run. Six tests prove the fences hold. The whole thing is small enough to read in one sitting.

TL;DR

Bytes in,
typed answer out.
Nothing in between.

Compile the module. Walk its imports against an allow-list. Build a fresh Store with fuel, an epoch deadline and a memory cap. Define only the granted host imports on the linker. Arm a watchdog. Instantiate. Call the export. Return RunOutput or a typed SandboxError.

No WASI. No clock the guest can read. No filesystem. No sockets. No environment. Six items on the public API. One file (src/host.rs) for the host boundary.

Two of the three fences are deliberately redundant on the most common attack, an infinite loop, because fuel and time fail in different ways and I want the sandbox to survive either.

Untrusted module: infinite_loop.wat
Limits: fuel 1e6, timeout 100ms, memory 1MiB

Step 1 · Compile
    wasmtime::Module::new(engine, bytes)
    [OK] valid
Step 2 · Imports
    walk against allow-list
    [OK] no ungranted imports
Step 3 · Store
    set_fuel(1e6), set_epoch_deadline(1)
    limiter(memory_max = 1MiB)
Step 4 · Linker
    define only granted imports (none here)
Step 5 · Watchdog
    spawn thread; sleep 100ms; bump epoch
Step 6 · Run
    linker.instantiate(store, &module)
    guest spins...
    Trap::OutOfFuel
Step 7 · Map
    → SandboxError::FuelExhausted
    CLI exit: 2

Core flow

The run loop

[Diagram] Sandbox::run internals: parse, allow-list, per-run store with fuel + epoch + memory limiter, watchdog, typed errors.
Subsystems

Each piece, deep-dived

wasmtime engine, configured once

Why it exists

wasmtime is the runtime that does the heavy lifting: compilation via Cranelift, linear-memory protection, trap handling. sandboxd is a thin policy layer on top of it.

How it actually works

A single wasmtime::Engine is built with consume_fuel and epoch_interruption both set to true. Both flags are mandatory: once consume_fuel is on, a Store starts with no fuel and has to be given a budget before any guest code can run, and epoch interruption is the only mechanism wasmtime exposes for asynchronous wall-clock cancellation. The wat feature is enabled so .wat fixtures parse without a separate step.

Fuel metering, the deterministic CPU fence

Why it exists

A deterministic instruction counter is the only CPU bound that gives the same answer on every machine. That is what makes fuel usable as a quota and a billing unit.

How it actually works

Limits::new takes a fuel budget. Store::set_fuel arms it before instantiation. Executed wasm instructions deduct from it: wasmtime charges most instructions one unit, while a few structural ones such as nop, drop, block and loop are free. When the budget runs out the guest traps with Trap::OutOfFuel, which map_runtime_error translates to SandboxError::FuelExhausted. RunOutput::fuel_consumed reports what was spent, which is the number to size the next budget from.
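
To make the quota idea concrete, here is a minimal std-only model of a fuel counter. It is a sketch, not wasmtime's implementation: run_metered, Stop and next_budget are hypothetical names, each abstract step is assumed to cost exactly one unit, and the headroom percentage is an arbitrary illustration of sizing a budget from an observed run.

```rust
/// Outcome of a metered run in this toy model (hypothetical type, not sandboxd's API).
#[derive(Debug, PartialEq)]
pub enum Stop {
    Finished { fuel_consumed: u64 },
    FuelExhausted,
}

/// Runs `steps` abstract instructions, charging one unit of fuel per step.
/// The result depends only on `budget` and `steps`, never on the machine.
pub fn run_metered(budget: u64, steps: u64) -> Stop {
    let mut remaining = budget;
    for _ in 0..steps {
        match remaining.checked_sub(1) {
            Some(left) => remaining = left,
            // Budget is spent and the guest still wants to run: stop it.
            None => return Stop::FuelExhausted,
        }
    }
    Stop::Finished { fuel_consumed: budget - remaining }
}

/// Sizes the next budget from an observed run plus a headroom percentage.
/// The headroom policy is an assumption for illustration only.
pub fn next_budget(observed: u64, headroom_pct: u64) -> u64 {
    observed.saturating_add(observed.saturating_mul(headroom_pct) / 100)
}
```

Because the counter is pure arithmetic, the same inputs always stop at the same point, which is the property that makes fuel usable for quotas and billing.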

The per-run watchdog and epoch interruption

Why it exists

Fuel says nothing about wall-clock time. A guest that calls a slow host function, or one the OS deschedules, can hold a thread while burning almost nothing. The watchdog catches that.

How it actually works

Sandbox::run spawns one thread per call. It sleeps until the configured deadline, calls engine.increment_epoch() once, then exits. A shared AtomicBool lets the main thread signal completion so the watchdog can exit early. The guest checks the epoch at safe points and traps with Trap::Interrupt, which becomes SandboxError::Timeout. One thread per run, no idle ticker, precise per-call deadline.
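
The watchdog pattern can be sketched with the standard library alone. In this model an AtomicU64 stands in for the engine's epoch counter and the "guest" is just a closure that polls it; arm_watchdog and run_with_deadline are hypothetical names, and the 1ms polling slice is an assumption about how early exit might be noticed, not a description of the real thread.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::{Duration, Instant};

/// Stand-in for the engine's epoch counter (assumption: the real one lives inside the engine).
pub type Epoch = Arc<AtomicU64>;

/// Spawns a one-shot watchdog: bump the epoch once at the deadline,
/// unless `done` is set first, in which case exit without touching it.
pub fn arm_watchdog(epoch: Epoch, done: Arc<AtomicBool>, timeout: Duration) -> JoinHandle<()> {
    thread::spawn(move || {
        let deadline = Instant::now() + timeout;
        loop {
            if done.load(Ordering::Acquire) {
                return; // the run finished first
            }
            let now = Instant::now();
            if now >= deadline {
                epoch.fetch_add(1, Ordering::Release); // models engine.increment_epoch()
                return;
            }
            // Sleep in short slices so an early completion is noticed promptly.
            thread::sleep((deadline - now).min(Duration::from_millis(1)));
        }
    })
}

/// Runs `guest` under the watchdog and reports whether the deadline fired.
pub fn run_with_deadline<F: FnOnce(&AtomicU64)>(timeout: Duration, guest: F) -> bool {
    let epoch: Epoch = Arc::new(AtomicU64::new(0));
    let done = Arc::new(AtomicBool::new(false));
    let watchdog = arm_watchdog(Arc::clone(&epoch), Arc::clone(&done), timeout);
    guest(&*epoch);
    done.store(true, Ordering::Release); // signal completion so the watchdog can exit early
    watchdog.join().expect("watchdog thread panicked");
    epoch.load(Ordering::Acquire) > 0
}
```

A closure that returns immediately leaves the epoch untouched, while one that spins until the counter changes is released when the deadline passes, mirroring the Timeout path.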

ResourceLimiter, the memory cap

Why it exists

Linear memory is the obvious denial-of-service vector. Without a cap a single memory.grow loop can starve the host.

How it actually works

A struct implementing wasmtime::ResourceLimiter is attached to the Store. memory_growing returns Ok(false) once the requested size exceeds the cap, which refuses the growth without trapping, and records the refusal in growth_was_denied. The guest sees memory.grow return -1. However the guest then reacts (panic, unreachable, retry), the wrapper reports SandboxError::MemoryLimitExceeded, so callers always see the real cause.
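
The decision logic of the cap can be modelled without wasmtime. The sketch below is an assumption-heavy stand-in: CapLimiter and memory_grow are hypothetical, the method returns a plain bool rather than implementing the real ResourceLimiter trait, and only the 64 KiB wasm page size and the "-1 on refusal" behaviour of memory.grow are taken as given.

```rust
/// Size of one WebAssembly linear-memory page in bytes.
pub const WASM_PAGE: usize = 64 * 1024;

/// Std-only model of the memory cap (hypothetical; the real hook is a trait method).
#[derive(Debug)]
pub struct CapLimiter {
    max_bytes: usize,
    growth_was_denied: bool,
}

impl CapLimiter {
    pub fn new(max_bytes: usize) -> Self {
        CapLimiter { max_bytes, growth_was_denied: false }
    }

    /// Allows growth up to the cap; refuses beyond it and remembers the refusal.
    pub fn memory_growing(&mut self, _current: usize, desired: usize) -> bool {
        if desired > self.max_bytes {
            self.growth_was_denied = true;
            return false;
        }
        true
    }

    /// Lets the wrapper report the real cause after the run ends.
    pub fn growth_was_denied(&self) -> bool {
        self.growth_was_denied
    }
}

/// Models `memory.grow`: returns the previous size in pages, or -1 when refused.
pub fn memory_grow(limiter: &mut CapLimiter, pages: &mut usize, delta: usize) -> i32 {
    // Saturating arithmetic so an absurd request is refused instead of overflowing.
    let desired = pages.saturating_add(delta).saturating_mul(WASM_PAGE);
    if !limiter.memory_growing(*pages * WASM_PAGE, desired) {
        return -1;
    }
    let previous = *pages as i32;
    *pages += delta;
    previous
}
```

Keeping the refusal in a flag is what lets the wrapper name the memory cap as the cause even if the guest turns the -1 into some unrelated trap.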

reject_disallowed_imports, the allow-list walk

Why it exists

A linker that fails to define an import would still let the parsed module sit in memory. Better to reject at the door, name the offending import, and never construct the store at all.

How it actually works

Before Store creation, every Module::imports() entry is checked against the allow-list. Today the only allowed pair is ("host", "log") when HostAbi::log_allowed() is true. Anything else returns SandboxError::DisallowedImport with the exact module and name. As belt and braces, if this check were ever bypassed, the linker only defines granted imports, so wasmtime instantiation would fail and map_instantiation_error translates that back into DisallowedImport too.
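
The walk itself is a few lines. This sketch assumes the imports have already been flattened into (module, name) string pairs and that the error carries owned Strings; the function signature and field types are illustrative guesses, not the crate's actual definitions.

```rust
/// Subset of the error type needed for this sketch (field types are assumed).
#[derive(Debug, PartialEq, Eq)]
pub enum SandboxError {
    DisallowedImport { module: String, name: String },
}

/// Rejects the first import that is not on the allow-list.
/// The only grantable pair is ("host", "log"), and only when `log_allowed` is true.
pub fn reject_disallowed_imports<'a, I>(imports: I, log_allowed: bool) -> Result<(), SandboxError>
where
    I: IntoIterator<Item = (&'a str, &'a str)>,
{
    for (module, name) in imports {
        let granted = log_allowed && module == "host" && name == "log";
        if !granted {
            // Name the offending import exactly so the caller can report it.
            return Err(SandboxError::DisallowedImport {
                module: module.to_string(),
                name: name.to_string(),
            });
        }
    }
    Ok(())
}
```

Writing the check as "granted or rejected" rather than "known-bad or accepted" keeps the default on the safe side when a new import name shows up.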

host::log, the audited capability

Why it exists

A sandbox with literally zero observable side effects is useful for pure computation only. Logging is the smallest, safest first import to ship, and it acts as the worked reference for any future capability.

How it actually works

Signature (param i32 i32): pointer and length into the guest's exported memory. The host reads memory.data(&caller), validates with ptr.checked_add(len), slices with .get(ptr..end) so any out-of-range read returns None and traps cleanly, and decodes with String::from_utf8_lossy so invalid bytes become replacement characters rather than aborting the host. The line is appended to an Arc<Mutex<Vec<String>>> sink that the embedder owns.
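
The bounds-checked read can be exercised in isolation by treating guest memory as a plain byte slice. In this sketch read_guest_line and host_log are hypothetical helper names, and the string-typed Err stands in for a trap; the checked_add, get and from_utf8_lossy calls are the ones named above.

```rust
use std::sync::{Arc, Mutex};

/// Reads `len` bytes at `ptr` from guest memory, or None when the range is invalid.
pub fn read_guest_line(memory: &[u8], ptr: i32, len: i32) -> Option<String> {
    // wasm i32 arguments are reinterpreted as unsigned offsets.
    let ptr = ptr as u32 as usize;
    let len = len as u32 as usize;
    let end = ptr.checked_add(len)?; // no wrap-around
    let bytes = memory.get(ptr..end)?; // no out-of-range slice
    // Invalid UTF-8 becomes U+FFFD instead of an error.
    Some(String::from_utf8_lossy(bytes).into_owned())
}

/// Appends the decoded line to the embedder-owned sink.
/// An Err here models the host function trapping (assumption for this sketch).
pub fn host_log(
    sink: &Arc<Mutex<Vec<String>>>,
    memory: &[u8],
    ptr: i32,
    len: i32,
) -> Result<(), &'static str> {
    let line = read_guest_line(memory, ptr, len).ok_or("out-of-range read")?;
    sink.lock().map_err(|_| "log sink poisoned")?.push(line);
    Ok(())
}
```

Every step that could go wrong returns None or Err rather than indexing directly, so a hostile pointer or length cannot panic the host.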

Technology choices

Why this, not that

wasmtime, not wasmer

Why we use it

wasmtime is the Bytecode Alliance reference runtime. Its fuel and epoch_interruption APIs are exactly the primitives this design needs, and Cranelift is among the most heavily audited wasm code generators in production. CVE history is short and disclosure is professional.

Why not the alternative

wasmer: a fine runtime, but with a less polished fuel and timeout story and a shorter security track record. WAVM: research-grade. Native wasm interpreters: too slow for non-trivial guests.

Rust, not C or Go

Why we use it

wasmtime is Rust, so the embedding is zero-overhead and the borrow checker keeps the host boundary honest. thiserror gives typed errors without boilerplate, and clap is the obvious CLI choice.

Why not the alternative

C: manual lifetime management on guest memory is exactly the bug class this is supposed to prevent. Go: viable, but no first-class wasmtime binding and the host-call FFI cost is real.

Per-run watchdog thread, not a global ticker

Why we use it

A per-run thread gives every call its own precise deadline and no idle thread between runs. The thread spawn cost is in the noise next to module compile.

Why not the alternative

A global ticker thread bumping the epoch every N milliseconds is simpler but gives you coarse shared timing and a thread that runs forever, neither of which I want.

Deny-by-default, not WASI with restrictions

Why we use it

Starting from nothing and adding one audited function means the allow-list is short enough to read in full and the default is the safe one. The allow-list grows only as fast as I can audit each addition.

Why not the alternative

wasmtime-wasi plus a capability filter. WASI's surface is large and its preview still moves, and grant-all-then-claw-back is exactly the deny-list posture that leaks. If you need files, WASI is the right tool, but it is a different project.

Typed SandboxError per failure mode

Why we use it

Callers can branch on why a run stopped (bill it, retry it, ban the module) without scraping strings. The CLI maps each variant to its own exit code, which makes scripting around it trivial.
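
A sketch of what branching on a typed error might look like. Only the four variants named on this page are modelled, so treat the enum as a subset. The exit code 2 for FuelExhausted comes from the run trace near the top of the page; every other code, the Action type and the policy in react are placeholders invented for illustration.

```rust
/// Subset of the failure modes named on this page (field types are assumed).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SandboxError {
    FuelExhausted,
    Timeout,
    MemoryLimitExceeded,
    DisallowedImport { module: String, name: String },
}

impl SandboxError {
    /// One exit code per variant. Only `2` is taken from the page; the rest are placeholders.
    pub fn exit_code(&self) -> i32 {
        match self {
            SandboxError::FuelExhausted => 2,
            SandboxError::Timeout => 3,               // placeholder value
            SandboxError::MemoryLimitExceeded => 4,   // placeholder value
            SandboxError::DisallowedImport { .. } => 5, // placeholder value
        }
    }
}

/// Hypothetical embedder reactions.
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    Bill,
    Retry,
    Ban,
}

/// Example policy: an arbitrary mapping chosen only to show exhaustive matching.
pub fn react(err: &SandboxError) -> Action {
    match err {
        SandboxError::FuelExhausted | SandboxError::MemoryLimitExceeded => Action::Bill,
        SandboxError::Timeout => Action::Retry,
        SandboxError::DisallowedImport { .. } => Action::Ban,
    }
}
```

Because both matches are exhaustive, adding a variant later is a compile error at every call site that has not decided how to react, which is the practical benefit over parsing a message string.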

Why not the alternative

An opaque error with a message string. Works once, becomes a parsing exercise the second time you need to react to a specific failure.

Fresh Store per run

Why we use it

Fuel, the epoch deadline, the limiter, linear memory and globals are all per-store. A fresh one per run gives bullet-proof isolation between calls with no cleanup logic.

Why not the alternative

Reusing a store across runs to save setup cost. The interactions between leftover globals, residual fuel and the epoch counter become a footgun fast. Module caching is a better optimisation if needed.

Numbers & escape-attempt tests

What you can measure

522 fuel for fib(30). Identical on every run, every machine.

~10ms cold CLI run. 100 invocations of fib(30) in 1.06s on M3 Pro.

~145ms wall time for a 100ms timeout. Extra is process spawn + compile, not deadline slack.

Escape-attempt tests in tests/sandbox.rs

fuel_exhaustion_terminates
Scenario: infinite_loop.wat with a long timeout, low fuel. Asserts FuelExhausted and that the timeout never fires.
Proves: fuel, not time, stops the run when fuel is the binding constraint.
epoch_timeout_terminates
Scenario: infinite_loop.wat with effectively unlimited fuel, 100ms timeout. Asserts Timeout.
Proves: the watchdog handles the case where fuel cannot stop the guest in time.
memory_cap_enforced
Scenario: memory_bomb.wat with a 4 MiB cap. Asserts MemoryLimitExceeded.
Proves: the ResourceLimiter refusal becomes a typed error regardless of how the guest reacts to memory.grow returning -1.
disallowed_import_rejected
Scenario: disallowed_import.wat imports env::secret. Asserts DisallowedImport { module: "env", name: "secret" }.
Proves: the rejection happens before any guest code runs and the error names the offending import.
log_import_denied_by_default
Scenario: logger.wat imports host::log without HostAbi::allow_log(). Asserts DisallowedImport.
Proves: even the one known capability is off until you ask for it.
allowed_import_works
Scenario: logger.wat with HostAbi::allow_log(). Calls run, then reads the sink.
Proves: the capability is functional when granted and the captured line matches what the guest wrote.
Roadmap

What is next

Monotonic clock capability

host::time_monotonic_ns returning a u64. Safe because it reveals nothing the guest cannot infer from its own behaviour, and useful for guest-side measurement.

Seeded RNG capability

host::random_fill taking a buffer pointer and length, seeded per-run from the embedder. Deterministic if the seed is fixed, which preserves replayability.
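
Since this capability is not built yet, the following is only a sketch of how a replayable random_fill could behave. It swaps in SplitMix64 as the generator, which is an assumption made for brevity and is not cryptographically secure; SeededRng and its method signatures are hypothetical.

```rust
/// Hypothetical per-run generator, seeded by the embedder.
/// Uses SplitMix64: deterministic and fast, but NOT cryptographically secure.
pub struct SeededRng {
    state: u64,
}

impl SeededRng {
    pub fn new(seed: u64) -> Self {
        SeededRng { state: seed }
    }

    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Fills `len` bytes of guest memory at `ptr`, or returns None (and writes
    /// nothing) when the range does not fit inside `memory`.
    pub fn random_fill(&mut self, memory: &mut [u8], ptr: u32, len: u32) -> Option<()> {
        let start = ptr as usize;
        let end = start.checked_add(len as usize)?;
        let buf = memory.get_mut(start..end)?;
        for chunk in buf.chunks_mut(8) {
            let bytes = self.next_u64().to_le_bytes();
            chunk.copy_from_slice(&bytes[..chunk.len()]);
        }
        Some(())
    }
}
```

Two runs started from the same seed write the same bytes, which is what keeps a recorded run replayable; the bounds check follows the same checked_add and get pattern as host::log.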

Per-run memory reported

RunOutput will return memory_high_water alongside fuel_consumed, so embedders can size limits from one observed run rather than guessing twice.

Optional precompiled modules

A cache layer for embedders that run the same module repeatedly. Cuts the dominant cost (Cranelift compile) on hot paths without touching the security boundary.

Not happening: WASI

If you need files, sockets or a wall clock the guest can read, WASI is the right tool and a different project. sandboxd stays small on purpose.

Not happening: a plugin manager

No package format, no network, no marketplace. The scope is: run these bytes under these limits and tell me what happened.

Ready to embed it?

cargo add sandboxd, Sandbox::deny_all(), run untrusted bytes. The host boundary is one file you can read in a coffee break.