Shipyard
A production-grade multi-tenant SaaS starter. Tenant isolation enforced by a repository chokepoint, permission-based RBAC, append-only audit log, token-bucket rate limits, and a billing scaffold with a real subscription state machine.
Abstract
Shipyard is an open-source, MIT-licensed multi-tenant SaaS starter for Next.js. The headline guarantee is that one tenant cannot read or write another tenant’s data, enforced by a single repository chokepoint that injects the tenant predicate into every scoped query and stamps it onto every scoped insert. On top of that sit permission-based RBAC, an append-only audit log, a token-bucket rate limiter with a pluggable store, and a billing scaffold that includes a real subscription state machine and a Stripe-shaped webhook signature check. The project ships with 29 tests across six isolated suites that prove each guarantee, runs against the built-in node:sqlite in development and tests, and is structured for a Postgres swap in production. This whitepaper documents the architecture, the technical decisions, the alternatives considered, and the reasons for picking Shipyard over a homegrown starter or a Clerk-plus-Stripe paste.
01 Executive Summary
Every B2B SaaS needs the same spine before it can ship product: tenant isolation, sessions, a role model, an audit trail, rate limits, and a billing scaffold that does not embarrass you on the first sales call. The pieces are well-understood. They are also each independently easy to get subtly wrong, and the failures are the kind that surface in production: a query missing its tenant predicate, a role check living in the client, a webhook reactivating a cancelled plan because nobody validated the transition.
Shipyard exists to write that spine once, with the isolation and authorisation guarantees pinned down by tests rather than by code review. The project is opinionated about the hard parts (one chokepoint for tenant data, server-side authorisation, append-only audit, validated state transitions) and deliberately empty everywhere your product lives (no UI kit, no ORM, no bundled payment SDK).
The whole spine is small enough to read in one sitting. The repository is roughly 230 lines, the RBAC guard is about 30, the limiter is about 60. Total pnpm test wall time is roughly one second including process start. Six isolated suites prove tenant isolation, RBAC, audit, rate limits, billing transitions and Stripe webhook verification.
02 Background & Motivation
I have started the same B2B SaaS three times. Each time I spent the first fortnight rebuilding the same unglamorous spine before I could touch the actual product. The cost is not the lines of code, it is the second-order risk of getting any of them subtly wrong in a way that does not show up until a customer notices. A scoped query that forgot its WHERE. A role comparison that drifted as a new role was added. A webhook handler that accepted a replayed event because the state machine was implicit in the if-statements.
The market has three nearby answers. One: glue Clerk or Auth0 onto Stripe and write the rest yourself. Two: clone a paid SaaS boilerplate. Three: copy your last project. Each works, but none of them gives you a tested guarantee of tenant isolation on commit one, because that guarantee is structural rather than something an auth provider can add.
Shipyard takes the fourth route. Write the spine once, define the chokepoint explicitly, prove it with tests, and document the design choices in prose. Make it run with zero external services so the install is fast and the guarantees prove themselves on any machine.
03 The Problem
The specific failure modes Shipyard is designed to remove:
- Cross-tenant reads and writes. A query path that does not include the tenant predicate is a data breach waiting for the right input. Application-level isolation has to be testable on commit one, not after Postgres is configured.
- Smuggled tenant ids. A request body that carries an `organisationId` from the client must not be trusted, even by a careful developer who only sometimes remembers to strip it.
- Role check drift. Asserting roles directly at the call site (`if (role === "admin")`) means every new capability invites a quiet inconsistency between routes.
- Client-trusted authorisation. A role read from a JWT and acted on in the browser is not authorisation, it is a UI hint with no enforcement.
- Sessions in plaintext. Storing the session token verbatim means a database leak hands an attacker live sessions.
- Audit gaps. Privileged actions without a tamper-resistant trail leave an incident review without an answer.
- Implicit state machines. A subscription model where transitions are scattered across handlers ends up accepting replays and out-of-order webhooks.
- Hand-rolled signature compare. Webhook signature checks that use `===` instead of constant-time comparison leak signature bytes through wall-clock variance.
04 Goals & Non-goals
Goals
- A single, narrow, auditable path for every tenant-scoped read and write.
- Server-side authorisation through permissions, with a fail-closed default.
- Append-only audit log written through the scoped repository.
- Token-bucket rate limits per (tenant, route group), with a pluggable store for multi-instance correctness.
- A real subscription state machine that rejects illegal transitions.
- A Stripe-shaped webhook signature check that is the real HMAC scheme, not a stub.
- Zero external services to run locally. `pnpm install` and `pnpm test` finish in seconds.
- Tests that prove each guarantee, with one fresh in-memory database per suite.
Non-goals
- An ORM. The whole isolation argument rests on one narrow path. A generated query builder hides the `WHERE` clause the guarantee depends on.
- A bundled payment SDK. The Stripe adapter is a seam. The webhook signature is real, the rest of the calls point you at the four-line pnpm install.
- A UI kit. A minimal settings dashboard proves the wiring, then gets out of the way.
- SQLite in production. The repository is built for the Postgres swap. SQLite is the dev and test layer.
- A single-tenant skeleton. The tenancy machinery is pure overhead if you are not multi-tenant.
05 Architecture
Request flow
Module map
| File | Responsibility |
|---|---|
| `src/db/schema.ts` | Table definitions and the `TENANT_SCOPED_TABLES` set |
| `src/db/repository.ts` | The chokepoint. Scoped and global helpers, predicate injection |
| `src/db/migrate.ts` | Schema migrations driven by the table descriptors |
| `src/lib/auth.ts` | Sessions (SHA-256 hashed), scrypt password hashes, signup, login |
| `src/lib/context.ts` | `resolveContext`: session → user → tenant → role |
| `src/lib/rbac.ts` | Permissions, roles, `requirePermission`, `guard` |
| `src/lib/audit.ts` | `recordAudit`, `listAudit` |
| `src/lib/rate-limit.ts` | Token bucket, injectable clock, `BucketStore` interface |
| `src/lib/billing/plans.ts` | Plan catalogue and per-metric budgets |
| `src/lib/billing/service.ts` | Subscription state machine and usage metering |
| `src/lib/billing/provider-fake.ts` | In-memory provider for tests and local dev |
| `src/lib/billing/provider-stripe.ts` | Stripe-shaped seam. Real HMAC verification, stubbed customer/subscription calls |
| `src/lib/http.ts` | `withGuard` wrapper for routes, cookie setter, error mapping |
| `tests/*` | Six suites, fresh in-memory DB each |
06 Key Technical Decisions
A repository chokepoint, not Postgres RLS as the primary guard
Row-level security is genuinely good, and Shipyard documents it as the production defence in depth on Postgres. It is not the primary guard for two reasons. The project has to run and prove itself with zero services, which rules out making Postgres a prerequisite. And an application-level guard fails loudly in a unit test on any database, whereas an RLS misconfiguration fails silently until production. Both, not one, was the only honest answer.
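As a rough illustration of what a chokepoint like this can look like, the sketch below builds scoped statements as plain data. It is not Shipyard's repository: `scopedSelect`, `scopedInsert`, `quoteIdent`, the `Statement` shape and the `projects` table are all hypothetical names chosen for the example. The two properties it shows come from the text: the tenant predicate is injected into every scoped query, and a caller-supplied `organisationId` can neither override the predicate nor survive an insert.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not Shipyard's API.
type Value = string | number;
type Row = Record<string, Value>;

interface Statement {
  sql: string;
  params: Value[];
}

// Stand-in for the TENANT_SCOPED_TABLES set; "projects" is an invented table.
const TENANT_SCOPED_TABLES = new Set(["projects", "audit_log"]);

function assertScoped(table: string): void {
  // Fail closed: a table that is not registered never reaches the builder.
  if (!TENANT_SCOPED_TABLES.has(table)) {
    throw new Error(`"${table}" is not a tenant-scoped table`);
  }
}

function quoteIdent(name: string): string {
  // Column names are interpolated, so only plain identifiers are accepted.
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(name)) {
    throw new Error(`Invalid column name: ${name}`);
  }
  return `"${name}"`;
}

function scopedSelect(table: string, tenantId: string, where: Row = {}): Statement {
  assertScoped(table);
  const clauses = ['"organisationId" = ?'];
  const params: Value[] = [tenantId];
  for (const [key, value] of Object.entries(where)) {
    // The caller can never override the tenant predicate.
    if (key === "organisationId") continue;
    clauses.push(`${quoteIdent(key)} = ?`);
    params.push(value);
  }
  return { sql: `SELECT * FROM "${table}" WHERE ${clauses.join(" AND ")}`, params };
}

function scopedInsert(table: string, tenantId: string, values: Row): Statement {
  assertScoped(table);
  // Stamping last means a smuggled organisationId is overwritten.
  const entries = Object.entries({ ...values, organisationId: tenantId });
  const columns = entries.map(([key]) => quoteIdent(key)).join(", ");
  const marks = entries.map(() => "?").join(", ");
  return {
    sql: `INSERT INTO "${table}" (${columns}) VALUES (${marks})`,
    params: entries.map(([, value]) => value),
  };
}
```

Returning the statement as data rather than executing it keeps the sketch free of any database dependency, and it also makes the predicate easy to assert on in a unit test.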
node:sqlite, not better-sqlite3 or an ORM
better-sqlite3 is excellent and it is also a compiled addon, which is exactly the thing that breaks in someone’s CI on a Tuesday. node:sqlite (built into Node and usable without a flag in current releases) has no native build step, so pnpm install is fast and pnpm test runs anywhere. An ORM was rejected for a different reason: the isolation argument rests on there being one narrow, auditable path to tenant data, and a hand-written repository of about 230 lines is something I can read top to bottom and reason about.
Permissions at the call site, roles as bundles
Asserting roles directly (if (role === "admin")) is shorter and rots. Every new capability forces a revisit of every role comparison and meaning drifts. Asserting a permission (requirePermission(role, "members:invite")) keeps routes readable and lets the role table grow without touching call sites. The guard throws on missing permission so a forgotten check fails by raising rather than by silently allowing.
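A minimal sketch of the permission pattern, assuming a small invented catalogue. Only `requirePermission`, `PERMISSIONS`, the `members:invite` permission and the `admin` and `viewer` roles appear in this document; the `owner` role, the other permission names and `ForbiddenError` are hypothetical.

```typescript
// Hypothetical catalogue: only "members:invite" is named in the text.
const PERMISSIONS = [
  "members:invite",
  "members:remove",
  "billing:manage",
  "audit:read",
] as const;

type Permission = (typeof PERMISSIONS)[number];
type Role = "owner" | "admin" | "viewer";

// Roles are bundles of permissions. Adding a capability touches this table,
// not the call sites.
const ROLE_PERMISSIONS: Record<Role, readonly Permission[]> = {
  owner: PERMISSIONS,
  admin: ["members:invite", "members:remove", "audit:read"],
  viewer: [],
};

class ForbiddenError extends Error {}

function requirePermission(role: Role | null | undefined, permission: Permission): void {
  // A missing role (no membership in the active tenant) fails closed.
  const granted = role ? ROLE_PERMISSIONS[role] : undefined;
  if (!granted || !granted.includes(permission)) {
    throw new ForbiddenError(`Missing permission: ${permission}`);
  }
}
```

Because `Permission` is derived from the tuple, a call such as `requirePermission(role, "members:invit")` is rejected by the compiler rather than discovered at runtime.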
Sessions hashed at rest
Opaque 32-byte tokens with only the SHA-256 hash stored. A database dump does not hand out live sessions because the stored hash cannot be presented as a cookie. The plaintext is the httpOnly cookie. Passwords are hashed with scrypt from node:crypto, with cost parameters embedded in the hash so they can be raised later without a migration.
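The scheme above can be sketched with `node:crypto` alone. The function names, the `scrypt$N$r$p$salt$hash` storage format and the cost values are assumptions made for this example; the document only states that session tokens are 32 opaque bytes stored as a SHA-256 hash and that scrypt cost parameters travel with the password hash.

```typescript
import { createHash, randomBytes, scryptSync, timingSafeEqual } from "node:crypto";

// Only the hash is persisted; the plaintext token goes into the httpOnly cookie.
function hashToken(token: string): string {
  return createHash("sha256").update(token).digest("hex");
}

function createSession(): { token: string; tokenHash: string } {
  const token = randomBytes(32).toString("base64url");
  return { token, tokenHash: hashToken(token) };
}

// Assumed storage format: scrypt$N$r$p$saltHex$hashHex
function hashPassword(password: string, N = 16384, r = 8, p = 1): string {
  const salt = randomBytes(16);
  const hash = scryptSync(password, salt, 64, { N, r, p });
  return ["scrypt", N, r, p, salt.toString("hex"), hash.toString("hex")].join("$");
}

function verifyPassword(password: string, stored: string): boolean {
  const [scheme, n, r, p, saltHex, hashHex] = stored.split("$");
  if (scheme !== "scrypt" || !n || !r || !p || !saltHex || !hashHex) return false;
  const expected = Buffer.from(hashHex, "hex");
  // The cost parameters are read from the stored value, so old hashes keep
  // verifying after the defaults are raised.
  const actual = scryptSync(password, Buffer.from(saltHex, "hex"), expected.length, {
    N: Number(n),
    r: Number(r),
    p: Number(p),
  });
  return timingSafeEqual(actual, expected);
}
```

On lookup, the server would hash the cookie value with `hashToken` and search for that hash, so the plaintext token never needs to be stored.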
Token bucket over a fixed window or sliding-window log
A fixed window allows up to double the intended rate across a window boundary. A sliding-window log needs a timestamp list per key. The bucket is two numbers, (tokens, lastRefill), which is also why it ports unchanged to a Redis Lua script. The auth route group has the tightest budget: a capacity of five tokens, refilled at one token every five seconds, which blunts credential stuffing.
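A minimal token-bucket sketch with an injectable clock, using the auth budget described above (capacity five, one token every five seconds). `BucketStore` is named in this document, but the synchronous `get`/`set` signatures, `MemoryBucketStore`, `Budget` and `tryTake` are assumptions for illustration; a Redis-backed store would be asynchronous in practice.

```typescript
interface Bucket {
  tokens: number;
  lastRefill: number; // milliseconds, as reported by the injected clock
}

// Assumed shape of the pluggable store.
interface BucketStore {
  get(key: string): Bucket | undefined;
  set(key: string, bucket: Bucket): void;
}

class MemoryBucketStore implements BucketStore {
  private readonly buckets = new Map<string, Bucket>();
  get(key: string): Bucket | undefined {
    return this.buckets.get(key);
  }
  set(key: string, bucket: Bucket): void {
    this.buckets.set(key, bucket);
  }
}

interface Budget {
  capacity: number;
  refillIntervalMs: number; // time to earn one token
}

function tryTake(store: BucketStore, key: string, budget: Budget, now: () => number): boolean {
  const t = now();
  // A key seen for the first time starts with a full bucket.
  const previous = store.get(key) ?? { tokens: budget.capacity, lastRefill: t };
  const elapsed = Math.max(0, t - previous.lastRefill);
  // Refill in proportion to elapsed time, never above the ceiling.
  const refilled = Math.min(budget.capacity, previous.tokens + elapsed / budget.refillIntervalMs);
  const allowed = refilled >= 1;
  store.set(key, { tokens: allowed ? refilled - 1 : refilled, lastRefill: t });
  return allowed;
}
```

Passing `now` as a function is what lets a test move time forward by assignment instead of waiting on timers.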
A real state machine for billing
Allowed transitions are explicit. canceled is terminal. An out-of-order or replayed webhook that tries an illegal move (for example reactivating a cancelled subscription) is rejected with a BillingError rather than silently applied. Webhook events are also checked against the stored providerSubscriptionId so an event for a different subscription cannot mutate this tenant’s record.
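The shape of such a state machine can be sketched as an allow list plus one guard function. Only the terminal `canceled` state, `BillingError` and `providerSubscriptionId` come from this document; the other status names, the specific transitions and `applyWebhook` are assumptions for the example.

```typescript
// Hypothetical statuses: the text only confirms that "canceled" is terminal.
type Status = "trialing" | "active" | "past_due" | "canceled";

// Every legal move is listed. Anything absent is illegal.
const ALLOWED_TRANSITIONS: Record<Status, readonly Status[]> = {
  trialing: ["active", "past_due", "canceled"],
  active: ["past_due", "canceled"],
  past_due: ["active", "canceled"],
  canceled: [], // terminal
};

class BillingError extends Error {}

interface Subscription {
  status: Status;
  providerSubscriptionId: string;
}

interface WebhookEvent {
  providerSubscriptionId: string;
  status: Status;
}

function applyWebhook(sub: Subscription, event: WebhookEvent): Subscription {
  // An event for a different subscription must not mutate this record.
  if (event.providerSubscriptionId !== sub.providerSubscriptionId) {
    throw new BillingError("Event does not belong to this subscription");
  }
  if (!ALLOWED_TRANSITIONS[sub.status].includes(event.status)) {
    throw new BillingError(`Illegal transition: ${sub.status} -> ${event.status}`);
  }
  return { ...sub, status: event.status };
}
```

In this sketch a replayed event that repeats the current status is also rejected, because a state is not listed as a successor of itself. Treating such replays as a silent no-op instead would be an equally defensible choice.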
A Stripe-shaped seam, not a Stripe SDK dependency
Bundling a payment SDK into a starter is the wrong default. The webhook signature check is the one piece you genuinely cannot fake, so it is implemented for real: HMAC-SHA256 over `{timestamp}.{payload}` with the webhook secret, compared in constant time via timingSafeEqual. The customer and subscription methods throw with a pointer to the wiki until you pnpm add stripe and fill them in.
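A hedged sketch of the signature check described above: HMAC-SHA256 over `{timestamp}.{payload}`, compared with `timingSafeEqual`. The `t=<unix seconds>,v1=<hex>` header layout follows the format Stripe documents for its signature header, but the function names, the single-`v1` simplification and the 300-second tolerance window are assumptions of this example rather than details confirmed for Shipyard.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

function signPayload(payload: string, secret: string, timestamp: number): string {
  return createHmac("sha256", secret).update(`${timestamp}.${payload}`).digest("hex");
}

function verifySignature(
  payload: string,
  header: string, // assumed layout: "t=<unix seconds>,v1=<hex signature>"
  secret: string,
  nowSeconds: number,
  toleranceSeconds = 300, // assumption: reject stale timestamps to limit replays
): boolean {
  const parts = new Map(
    header.split(",").map((part): [string, string] => {
      const i = part.indexOf("=");
      return [part.slice(0, i), part.slice(i + 1)];
    }),
  );
  const timestamp = Number(parts.get("t"));
  const v1 = parts.get("v1");
  if (!Number.isFinite(timestamp) || !v1) return false;
  if (Math.abs(nowSeconds - timestamp) > toleranceSeconds) return false;

  const expected = Buffer.from(signPayload(payload, secret, timestamp), "hex");
  const received = Buffer.from(v1, "hex");
  // timingSafeEqual throws on unequal lengths, so the length is checked first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

The payload must be the raw request body exactly as received. Re-serialising parsed JSON before verification changes the bytes and breaks the comparison.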
07 Alternatives Considered
Why this over a homegrown starter
The honest version of a homegrown starter takes a fortnight per project and produces a slightly different spine every time, because the design decisions are remade from cold. The result is a portfolio of slightly inconsistent starters, none of which has a tested isolation guarantee on commit one. Shipyard is the version where the design decisions are made once, written down, and pinned by tests.
Why this over Clerk plus Stripe pasted together
Clerk solves authentication and gives you a hosted UI for sign-in and organisations. Stripe solves billing. Neither solves tenant isolation in your database, because that is structural to your code rather than something a provider can do for you. The interesting questions are still yours: which tables carry organisationId, where the predicate is enforced, what stops a smuggled id from landing under the wrong tenant, what makes the audit log trustworthy. A Clerk-plus-Stripe paste leaves all of those open and bills you monthly for the bits it does cover. Shipyard answers them in the repository and is MIT-licensed.
Why this over a paid SaaS boilerplate
Paid boilerplates tend to optimise for surface area. Lots of pages, lots of integrations, lots of branding. The spine underneath is rarely the part they sell on. Shipyard does the opposite: only the spine, with the guarantees front and centre. If you want pages, bring your own design system. If you want integrations, the seams are explicit.
Why this over Postgres RLS alone
RLS is excellent for defence in depth and the right answer in production. It is not the right primary guard because it makes Postgres a prerequisite for the project to install and prove itself, and because RLS misconfiguration fails silently. Application-level isolation is what you test on commit one. RLS is what you add when you swap to Postgres.
08 Results & Performance
Test suite
Apple M3 Pro, Node v25.9.0. Real numbers from my machine, not estimates.
$ pnpm test
Test Files 6 passed (6)
Tests 29 passed (29)
Duration 468ms
$ /usr/bin/time -p pnpm test # whole command, including process start
real 1.04
What each suite proves
| Suite | What it proves |
|---|---|
| tenant-isolation | Cross-tenant reads return nothing; a smuggled tenant id is overwritten; cross-tenant updates change zero rows |
| rbac | A viewer is refused privileged actions; a user with no membership in the active tenant fails closed |
| audit | Signup and invitations write entries with the correct actor, tenant and metadata; entries are returned newest first |
| rate-limit | The bucket allows up to capacity, blocks past it, refills at the configured rate and never exceeds the ceiling |
| billing | Subscribe and webhook transitions are validated; illegal transitions are rejected; plan budgets stop usage overrun |
| stripe-webhook | A correctly signed payload is accepted and mapped; a tampered or unsigned payload is rejected |
Each suite gets its own fresh in-memory database, so there is no shared state to leak between cases. The whole run is hermetic and reproducible.
09 Lessons & Trade-offs
What worked
- Asserting the chokepoint with a test before any feature. The smuggled-id and cross-tenant-update cases catch the entire class of subtle isolation bugs in one suite.
- Skipping `organisationId` in the where loop. One line in the repository (`if (key === "organisationId") continue`) makes the predicate genuinely non-overridable rather than merely conventionally so.
- Permissions as a typed tuple. A new capability that is not in `PERMISSIONS` is a type error at the call site, so the wiring stays in sync.
- Injectable clock on the limiter. Tests advance the clock by hand, so refill behaviour is exact and deterministic with no `setTimeout`.
- Explicit allow list for state transitions. Reading the allowed moves as a list is much easier to review than reading the equivalent if-tree.
Trade-offs accepted
- SQLite in production is not supported. The repository is built for the Postgres swap. Shipping the SQLite layer to production is on the user.
- The Stripe adapter is a seam. Customer and subscription calls throw until you bring in the SDK. The webhook signature is the one piece I refused to stub.
- Single-instance rate limiting by default. The bucket store is in-memory. Behind several instances the effective limit multiplies until you wire the Redis store. The interface exists for exactly that.
- No UI kit. A minimal dashboard proves the wiring. Bring your own design system.
- Not a single-tenant skeleton. If you are not building multi-tenant, the tenancy machinery is overhead. Start elsewhere.
10 Conclusion
The hard parts of a B2B SaaS spine are the parts that fail subtly. Cross-tenant queries that look right. Role checks that drift. Webhooks that accept replays. Signature checks that leak through timing. Each of these is a one-line fix when you know the pattern, and a real outage when you do not. Shipyard’s contribution is to make the patterns explicit, narrow, and tested, so the spine you copy across projects is the same one each time and the guarantees travel with it.
What you build on top is your product. Bring your own UI kit, your own routes, your own data model. The spine gets out of the way as soon as you have it.
A Configuration
| Variable | Required | Default | Purpose |
|---|---|---|---|
| `SHIPYARD_DB_PATH` | For dev | in-memory | Path to the SQLite file. Omit for an ephemeral in-memory database (the default for tests) |
| `NODE_ENV` | No | development | Sets the session cookie’s secure flag when set to production |
| `BILLING_PROVIDER` | No | fake | Selects the billing provider. stripe wires up provider-stripe.ts |
| `STRIPE_SECRET_KEY` | If Stripe | (none) | Stripe secret. Used once you fill in the customer and subscription calls |
| `STRIPE_WEBHOOK_SECRET` | If Stripe | (none) | Webhook signing secret. The HMAC verification is already real and tested |
| `STRIPE_PRICE_PRO` | If Stripe | (none) | Stripe Price id for the Pro plan |
| `STRIPE_PRICE_SCALE` | If Stripe | (none) | Stripe Price id for the Scale plan |
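One way to read these variables is a small loader that validates them at startup. This is a sketch under assumptions, not Shipyard's code: `loadConfig`, the `Config` shape and the error messages are hypothetical, and `:memory:` is the standard SQLite name for an in-memory database. The variable names, defaults and the "If Stripe" requirement are taken from the table above.

```typescript
type Env = Record<string, string | undefined>;

interface StripeConfig {
  secretKey: string;
  webhookSecret: string;
  pricePro: string;
  priceScale: string;
}

interface Config {
  dbPath: string;
  secureCookies: boolean;
  billingProvider: "fake" | "stripe";
  stripe?: StripeConfig;
}

function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (!value) throw new Error(`${name} is required when BILLING_PROVIDER=stripe`);
  return value;
}

function loadConfig(env: Env): Config {
  const provider = env.BILLING_PROVIDER ?? "fake";
  if (provider !== "fake" && provider !== "stripe") {
    throw new Error(`Unknown BILLING_PROVIDER: ${provider}`);
  }
  const config: Config = {
    dbPath: env.SHIPYARD_DB_PATH ?? ":memory:",
    secureCookies: env.NODE_ENV === "production",
    billingProvider: provider,
  };
  if (provider === "stripe") {
    // Fail at startup rather than on the first billing call.
    config.stripe = {
      secretKey: requireVar(env, "STRIPE_SECRET_KEY"),
      webhookSecret: requireVar(env, "STRIPE_WEBHOOK_SECRET"),
      pricePro: requireVar(env, "STRIPE_PRICE_PRO"),
      priceScale: requireVar(env, "STRIPE_PRICE_SCALE"),
    };
  }
  return config;
}
```

Taking the environment as an argument, for example `loadConfig(process.env)`, keeps the function pure and easy to exercise in tests.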
B Production Checklist
- Swap SQLite for Postgres. Implement the same `Repository` interface against Postgres. The application code does not change.
- Enable RLS as defence in depth. Keep the repository as the application-level guard and add an RLS policy keyed on a session variable. Both, not one.
- Wire a Redis `BucketStore`. Implement `get`/`set` against Redis, ideally with refill-and-take in a small Lua script so concurrent requests across instances cannot both spend the last token.
- Set `BILLING_PROVIDER=stripe`. `pnpm add stripe`, fill in `createCustomer`, `createSubscription`, `cancelSubscription` in `provider-stripe.ts`. The webhook signature check is already real.
- Lock down audit-log mutation at the database. Revoke `UPDATE` and `DELETE` on `audit_log` for the application role, so the append-only property is enforced below the application as well.
- Pin Node to the version you tested. Reproducible builds matter, especially for a project that depends on built-in `node:sqlite`.
- Put the app behind a TLS-terminating proxy. Set `NODE_ENV=production` so the session cookie is marked `secure`.
- Add bounces and complaints monitoring to your email provider. Sign-up and invitation flows depend on it.