Open to my next full-time role

Hire Sarma, software engineer. Production AI, built end to end, in the UK.

I design and ship LLM infrastructure and platform-grade software end to end. Nineteen MIT-licensed open-source repositories spanning a coding agent runner, a multi-provider gateway, a Rust inference server, storage engines, consensus, and WebAssembly sandboxes. Eighty-seven long-form engineering essays. Ready to bring all of that into a team that is shipping something serious.

30-second summary
  • Senior software engineer, around 5 years of production experience, UK-based (Hemel Hempstead, Hertfordshire).
  • Focus: LLM infrastructure, AI engineering, platform engineering, agent orchestration, backend or full-stack.
  • Available now for permanent UK PAYE roles only. Not contract. Not consulting. Not Inside IR35.
  • Remote, hybrid or on-site. Open to relocation within commuting distance of Hemel Hempstead.
  • Email: projects@sarmalinux.com. LinkedIn: linkedin.com/in/sarmalinux.
19 open-source repositories
87 long-form engineering essays
36 engines on the LLM failover stack
~5 years of production engineering experience

One engineer. Nineteen public repositories. Zero hidden trade-offs. Read the code before you read the CV.

Capabilities, scored honestly

Five-point depth on the things I actually ship. Not maxed across the board, because that would not be true. Every row links to a real repository, a whitepaper, or a long-form essay you can read in ten minutes.

  • I shipped SarmaLink-AI across 36 engines and watched the failover save a client during the OpenAI outage on 28 May 2026. The gateway is OpenAI-compatible, intent-based plugin routing is gated behind a single env var, and Manus webhook persistence has real HMAC-SHA256 verification.

    Evidence: github.com/sarmakska/Sarmalink-ai, /products/sarmalink-ai whitepaper, /blog/sarmalink-ai-failover-deep-dive.

  • forge-infer is a minimal LLM inference server in Rust. Paged attention KV cache, continuous batching, speculative decoding, OpenAI-shaped HTTP surface. It does not reach vLLM throughput numbers; it is deliberately small enough to read in one sitting.

    Evidence: github.com/sarmakska/forge-infer, /products/forge-infer.

  • Agent Orchestrator is a TypeScript workflow engine on Postgres and BullMQ with deterministic replay and per-step tool budgets. The Inspector UI is a tRPC app that shows live graph state and lets you replay any message. Slipstream is the same idea applied to local coding agents with persistent memory.

    Evidence: github.com/sarmakska/agent-orchestrator, github.com/sarmakska/slipstream, /products/slipstream.

  • voice-agent-starter is a full-duplex loop on WebRTC plus mediasoup. Pluggable STT, LLM, and TTS adapters. Per-stage latency telemetry so you can see where the budget goes. The site chatbot, Martha, runs on the same primitives, with sentence-by-sentence streaming TTS off a VPS at voice.sarmalinux.com.

    Evidence: github.com/sarmakska/voice-agent-starter, /products/voice-agent-starter.

  • lsmdb is a hand-written LSM tree storage engine in Go with compaction, bloom filters, and a wire test against RocksDB. raftkv is a Raft KV store with snapshots and a fault injection harness that randomises partitions, slow links, and dropped messages. Both written to teach myself the internals, not to compete with FoundationDB.

    Evidence: github.com/sarmakska/lsmdb, github.com/sarmakska/raftkv.

  • sandboxd is a Rust WebAssembly sandbox with a deny-by-default host ABI, capability handles passed in from the host, and fuel metering so a runaway guest cannot tie up a worker. Designed to be embedded into agent runners and SaaS plug-in surfaces.

    Evidence: github.com/sarmakska/sandboxd, /products/sandboxd.

  • terraform-stack puts Vercel, Supabase, Cloudflare, and DigitalOcean modules in one repo. Reproducible from scratch in one command. k8s-ops-toolkit is the Helm chart for the Kubernetes side: ingress-nginx, cert-manager, kube-prometheus-stack, Loki, ServiceMonitor template included. I run my own production on this exact stack.

    Evidence: github.com/sarmakska/terraform-stack, github.com/sarmakska/k8s-ops-toolkit.

  • Every service I run reports traces and metrics. The orchestrator emits a span per step and per tool call. The voice agent emits a stage timing per turn. Dashboards are checked in next to the code as Grafana JSON. Runbooks live in the repo, not on a shared drive someone will lose access to.

    Evidence: /blog category observability, /products/agent-orchestrator how-it-works.

  • rag-over-pdf is a minimal retrieval stack that keeps citation spans through generation so the answer can be checked. ai-eval-runner is evals as code: Python 3.12, uv, Typer CLI, DuckDB result store, FastAPI plus HTMX viewer. Designed to live in the same repo as the prompts and run on every push.

    Evidence: github.com/sarmakska/rag-over-pdf, github.com/sarmakska/ai-eval-runner.

  • sarmalinux.com itself is the proof. Public marketing site, full studio CRM at admin.sarmalinux.com, client portal at client.sarmalinux.com with magic link auth, Telegram bridge, web push, NPS flow, storage dashboard, 87 long-form blog posts with live API widgets. All on Next.js 16, Tailwind v4, Supabase, Vercel.

    Evidence: this site, /blog, /admin, /client.

14 engines on the failover ladder behind every model call I ship.

~70% of the Disk IO budget reclaimed on Supabase page_views in one perf session, 2026-05-28.

1 day turnaround on a genuine enquiry. I read every one myself.

About me, in my own voice

I am a UK-based software engineer with around five years of production experience. The work I am drawn to sits at the intersection of LLM infrastructure and platform engineering: model orchestration, multi-provider gateways, durable workflow engines, low-level inference, storage engines, observability, IaC, and the boring-but-important plumbing that lets a small team ship reliably.

Over the last year I published nineteen MIT-licensed open-source repositories under the same engineering bar I would bring to a team. Typed end to end, tested, documented with whitepapers and architecture diagrams, deployable from a fresh clone. I learn well by writing a thing twice: once to get it working, then again to make it readable. The published repository is the second pass.

I think well in writing. The blog at /blog holds eighty-plus essays on LLM infrastructure, platform engineering, observability, and the modern indie-SaaS stack. Hiring managers tell me reading two or three of those is the fastest way to get a sense of how I think.

What I am looking for

Permanent, full-time employee roles
PAYE only. Not taking contract or consulting work.
Mid level to Senior engineer roles
Strong individual contributor. Comfortable owning systems end to end and learning fast inside a team.
AI engineering, platform, backend, full stack
AI infrastructure, model orchestration, multi-provider gateways, durable workflow systems, observability, IaC, modern web platforms.
United Kingdom only
Remote, hybrid, or on-site. Open to relocation within the UK for the right role.

I am open to startups raising their first or second round, scale-ups in their series-B-to-D phase, and established product companies with a mature engineering culture. Sectors I find most interesting: LLM infrastructure and tooling, developer platforms, systems and storage, fintech, healthtech, and anything with hard real-world constraints.

Selected ships, scored

Six headlines, twelve more below

Six hand-picked repositories that show variety: an AI gateway, a coding agent, an orchestration engine, a SaaS scaffold, a distributed system, an evals runner. Each is MIT-licensed, runs from a fresh clone, and is the kind of thing I would happily walk you through line by line.

Flagship

SarmaLink-AI

TypeScript

Multi-provider OpenAI-compatible gateway with 36-engine failover.

  • OpenAI-shaped /v1/chat/completions, /v1/embeddings, /v1/models surface.
  • HMAC-SHA256 verified Manus webhook persistence on Supabase.
  • Intent-based plugin auto-routing across 10 open-source plugins.
  • Per-engine circuit breakers, dynamic OpenRouter ladder, 24h model cache.
36 engines
failover ladder
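
To make the failover ladder and per-engine circuit breakers concrete, here is a minimal sketch of the general technique. It is illustrative Python, not code from the TypeScript repository: the names `Breaker` and `Ladder`, the failure threshold, and the cooldown are all assumptions made for the example.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Breaker:
    """Per-engine circuit breaker: opens after repeated failures, retries after a cooldown."""
    max_failures: int = 3        # hypothetical threshold
    cooldown_s: float = 30.0     # hypothetical cooldown
    failures: int = 0
    opened_at: Optional[float] = None

    def allows(self, now: float) -> bool:
        if self.opened_at is None:
            return True
        # Half-open: let an attempt through once the cooldown has elapsed.
        return now - self.opened_at >= self.cooldown_s

    def record(self, ok: bool, now: float) -> None:
        if ok:
            self.failures, self.opened_at = 0, None
            return
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = now  # (re)open and restart the cooldown


@dataclass
class Ladder:
    """Tries engines in priority order, skipping any whose breaker is open."""
    engines: list                               # [(name, callable), ...]
    breakers: dict = field(default_factory=dict)
    clock: Callable[[], float] = time.monotonic

    def complete(self, prompt: str):
        errors = []
        for name, call in self.engines:
            breaker = self.breakers.setdefault(name, Breaker())
            now = self.clock()
            if not breaker.allows(now):
                errors.append((name, "circuit open"))
                continue
            try:
                result = call(prompt)
            except Exception as exc:
                breaker.record(False, now)
                errors.append((name, repr(exc)))
                continue
            breaker.record(True, now)
            return name, result
        raise RuntimeError(f"all engines failed: {errors}")
```

Calling `Ladder(engines=[...]).complete(prompt)` returns the name of the first engine that answered along with its result, so a caller can log which rung of the ladder served the request.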
Flagship

slipstream

TypeScript

Token-efficient coding agent runner with persistent project memory.

  • Local dashboard, deterministic tool-budget enforcer, project-scoped memory.
  • Plugs into any OpenAI-compatible gateway, including SarmaLink-AI.
  • Per-task transcripts, replay, and diff-level cost attribution.
  • Runs on a developer laptop, no cloud lock-in.
local first
no cloud lock-in

agent-orchestrator

TypeScript

Durable multi-agent workflow engine with deterministic replay.

  • Postgres journal, BullMQ step queue, idempotent step handlers.
  • Deterministic replay from any journal offset.
  • Tool-budget enforcement at the step level.
  • tRPC Inspector UI for live graph state and per-message replay.
replayable
every step journalled
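
The idea behind deterministic replay from a journal offset can be sketched in a few lines. This is an illustrative Python sketch under stated assumptions, not the orchestrator's TypeScript code: `Journal` is an in-memory stand-in for a Postgres journal table, and `run_workflow` and `replay_upto` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Journal:
    """Append-only record of step results (in-memory stand-in for a journal table)."""
    entries: list = field(default_factory=list)  # [(step_name, result), ...]


def run_workflow(steps, journal, replay_upto=0):
    """Run steps in order; the first `replay_upto` steps are read back from the journal.

    Replayed steps never call their handler, so side effects are not repeated
    and the workflow state is rebuilt exactly as it was recorded.
    """
    state = {}
    for index, (name, handler) in enumerate(steps):
        if index < replay_upto:
            recorded_name, result = journal.entries[index]
            if recorded_name != name:
                raise RuntimeError(
                    f"journal mismatch at step {index}: {recorded_name!r} != {name!r}"
                )
        else:
            result = handler(state)
            # Drop any stale entries past this offset, then record the new result.
            del journal.entries[index:]
            journal.entries.append((name, result))
        state[name] = result
    return state
```

Running the same steps again with `replay_upto=1` rebuilds the first step's result from the journal and re-executes only the steps after it.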

shipyard

TypeScript

Studio SaaS scaffold with auth, billing, admin, and client portal.

  • Magic link auth, Stripe billing, role-based admin and client surfaces.
  • Next.js 16 App Router, Supabase, Tailwind v4.
  • Pre-wired Telegram bridge, web push, NPS flow.
  • Reproducible from one command via terraform-stack.
one command
fresh studio in minutes

raftkv

Go

Raft KV store with a fault injection harness.

  • Hand-written Raft: leader election, log replication, snapshots, membership.
  • Deterministic fault injection harness for partitions and slow links.
  • Wire-compatible client library with retry and read-your-writes.
  • Documented in a whitepaper plus a how-it-works explainer.
fault injected
partitions, drops, slow links
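
A deterministic fault injection harness usually comes down to one rule: every fault decision is drawn from a seeded random source, so a failing run can be reproduced from its seed. The sketch below shows that rule in Python rather than Go; `FaultyNetwork`, `Fault`, and the rates and delay range are assumptions for illustration, not the raftkv harness itself.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Fault:
    kind: str          # "deliver", "drop", or "delay"
    delay_ms: int = 0


class FaultyNetwork:
    """Decides the fate of each message from a seeded RNG so runs are reproducible."""

    def __init__(self, seed, drop_rate=0.1, slow_rate=0.2):
        self._rng = random.Random(seed)
        self.drop_rate = drop_rate
        self.slow_rate = slow_rate
        self.partitioned = set()  # frozenset({a, b}) pairs that cannot talk

    def partition(self, a, b):
        self.partitioned.add(frozenset((a, b)))

    def heal(self):
        self.partitioned.clear()

    def send(self, src, dst):
        # A partition drops traffic in both directions without consuming randomness.
        if frozenset((src, dst)) in self.partitioned:
            return Fault("drop")
        roll = self._rng.random()
        if roll < self.drop_rate:
            return Fault("drop")
        if roll < self.drop_rate + self.slow_rate:
            return Fault("delay", delay_ms=self._rng.randint(50, 500))
        return Fault("deliver")
```

Two `FaultyNetwork` instances built with the same seed and driven with the same sequence of calls return the same sequence of faults, which is what makes a failing schedule replayable.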

ai-eval-runner

Python

Evals as code, checked into the same repo as the prompts.

  • Python 3.12, uv, Typer CLI, runs on every push.
  • DuckDB result store, queryable from notebooks and CI.
  • FastAPI plus HTMX viewer for trend lines and regression catches.
  • Per-suite cost and latency budgets enforced in CI.
CI native
fails the build on regression
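
As a rough illustration of what enforcing per-suite budgets in CI can look like, the sketch below checks one suite result against a budget and an optional baseline. The type names, field names, and the 0.01 regression tolerance are assumptions for the example; they are not taken from ai-eval-runner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuiteResult:
    name: str
    pass_rate: float        # fraction of cases passed, 0.0 to 1.0
    cost_usd: float
    p95_latency_ms: float


@dataclass(frozen=True)
class Budget:
    min_pass_rate: float
    max_cost_usd: float
    max_p95_latency_ms: float


def check(result, budget, baseline_pass_rate=None, tolerance=0.01):
    """Return human-readable violations; an empty list means the suite is within budget."""
    problems = []
    if result.pass_rate < budget.min_pass_rate:
        problems.append(
            f"{result.name}: pass rate {result.pass_rate:.2f} below {budget.min_pass_rate:.2f}"
        )
    if result.cost_usd > budget.max_cost_usd:
        problems.append(
            f"{result.name}: cost ${result.cost_usd:.2f} over ${budget.max_cost_usd:.2f}"
        )
    if result.p95_latency_ms > budget.max_p95_latency_ms:
        problems.append(
            f"{result.name}: p95 {result.p95_latency_ms:.0f} ms over "
            f"{budget.max_p95_latency_ms:.0f} ms"
        )
    # Regression check: compare against the last accepted run, if one is known.
    if baseline_pass_rate is not None and result.pass_rate < baseline_pass_rate - tolerance:
        problems.append(
            f"{result.name}: pass rate regressed from {baseline_pass_rate:.2f}"
        )
    return problems
```

A CI entry point would print the returned problems and exit non-zero when the list is not empty, which is enough to fail the build.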

Recent ships, last 60 days

Straight from git log. Not picked, not curated, just what the repository says happened.

  1. 2026-05-31
    shipped slipstream plus 6 bespoke product trios and a site-wide refresh in one session
  2. 2026-05-31
    open-source directory grew from 12 to 19 repositories, rolled through home, chatbot, docs
  3. 2026-05-30
    portable bootstrap, one zip to take the studio to a fresh machine in minutes
  4. 2026-05-28
    10 new playbooks, home carousel, admin PWA fixes for iOS safe-area
  5. 2026-05-28
    auto-generated branded hero on every blog post and index card
  6. 2026-05-28
    cut page_views Disk IO budget burn by roughly seventy percent on Supabase
  7. 2026-05-28
    Telegram bridge, web push, NPS request flow, Supabase keep-alive cron
  8. 2026-05-13
    session replays and admin testimonials surface end to end

Why me over an agency

Three short comparisons. No fluff.

Accountability

You speak to the engineer

You get me, the engineer who shipped nineteen production repositories. Not whoever the agency happens to have free that week, behind a project manager who cannot answer the technical question.

Transparency

Plain about what I want

Permanent full-time PAYE roles only. No consulting, no contracting, no freelance, no agency subcontract. If you are hiring for a real seat on a real team, this page is for you.

Focus

One build at a time

The work you commission is the work that ships. Not something queued behind three other accounts at the agency. You get my full attention on your build, every week.

Roles I am open to

PAYE, full-time

Permanent employment only. No consulting, contracting, freelance, or agency subcontract until at least February 2030. Companies hiring for a real role on a real team are very welcome to reach out.

Senior or mid-level IC
PAYE, full-time

Individual contributor role on a real engineering team. AI infrastructure, platform, backend, or full stack. I am the person writing the code, not the manager of someone who is.

Specialist hire
PAYE, full-time

When the role is specifically LLM infrastructure, agent runners, multi-provider gateways, evals, or real-time voice systems. I have shipped production versions of all of these and the code is open source.

Founding engineer
PAYE, full-time

Early team, real funding, named CTO or technical founder I can talk to directly. Comfortable picking up across the stack and owning systems end to end.

Salary, level, and location are part of the conversation. London, Manchester, Edinburgh, Birmingham, Bristol, Cambridge, Leeds, Glasgow, or remote across the UK.

Where I work from, when I work, how I think

Where

United Kingdom. I work from a quiet home office on a mechanical keyboard and a pair of large monitors. I run my own production on the same Linux box. Happy on site in London for kick-off, design sessions, and milestones. Open to relocating within the UK for the right role.

When

UK working hours, async first. Deep work in long blocks before lunch, calls batched in the afternoon. I keep a written log of every working day so I can hand context back to my future self. I do not work weekends unless prod is on fire, and I never take a holiday without telling you first.

How

Strong opinions about code review. Every change deserves a description, a motivation, and the receipts. I read every line that lands in main. I prefer a small, boring, observable stack over a fashionable one, and I would rather ship the smaller thing twice than the bigger thing once.

Tech I work with

TypeScript, Python, Go, Rust, Next.js, Node.js, Fastify, tRPC, Postgres, Supabase, Drizzle, Redis, BullMQ, Docker, Kubernetes, Helm, Terraform, Vercel, Cloudflare, DigitalOcean, OpenTelemetry, Prometheus, Grafana, LLM gateways, RAG, Vector search, MCP

I am comfortable owning a system end to end: schema, services, API surface, deployment, monitoring, and the runbook a teammate will reach for at three in the morning. I am also comfortable being the new person on a team, and shutting up until I understand the codebase.

Writing

Eighty-seven long-form engineering essays

LLM infrastructure, platform engineering, storage and consensus, observability, and the indie-SaaS stack. Two or three posts is the fastest way to read how I think before a first call.

Read the blog

Frequently asked

The questions recruiters and hiring managers ask first.

Are you available for contract or consulting work?

No. I am open to permanent full-time PAYE roles only. I am not currently offering commercial consulting, contracting, freelance, or paid project work. Companies hiring for a real role are very welcome to get in touch.

Do you sign NDAs?

Yes. I sign mutual NDAs as a matter of course before sharing anything sensitive. I have my own template if you do not, and I am happy to sign yours after a quick read.

What is your stack?

TypeScript, Python, Go, and Rust. Next.js 16 App Router on the web, Fastify and FastAPI on services, Postgres on Supabase, Drizzle, Redis and BullMQ for queues. Docker, Kubernetes with Helm, Terraform for IaC. Vercel, Cloudflare, and DigitalOcean for hosting. OpenTelemetry, Prometheus, and Grafana for observability. Multi-provider gateway in front of every model call. Full stack visible at sarmalinux.com/technology.

Do you take agency or white-label work?

No. PAYE full-time only. I am not taking agency subcontract work, white-label engagements, or any form of paid project work at present.

Do you work remote, hybrid, or on-site?

Primarily remote from the United Kingdom. I am happy to come on-site in London for kick-off, design sessions, and key milestones, and I am open to hybrid for the right role. UK time zone, written-first communication, video calls scheduled in blocks.

What is your turnaround on a new enquiry?

Within one working day. I read every enquiry myself. If the fit is obvious I propose a thirty-minute call, otherwise I write back with the honest reason it is not a fit and a referral if I have one.

Are you available right now?

Yes, for the right permanent full-time role. Notice period and exact start date depend on the conversation. I read every genuine enquiry within a working day.

Why an in-house hire rather than an agency?

You get an engineer who already runs production on the stack he writes about, owns the open-source repositories he ships from, and treats engineering as a serious craft. ADRs, runbooks, observability, the lot. No hand-off, no account manager, no rate-card opacity. The person you interview is the person who writes the code.

Where in the UK are you based, and are you open to relocation?

I am based in the United Kingdom and open to permanent full-time roles across the UK: London, Manchester, Edinburgh, Birmingham, Bristol, Cambridge, Leeds, or Glasgow, whether remote, hybrid, or on-site. I am open to relocation within the UK for the right role.

What kind of role are you looking for?

Permanent, full-time employee roles (PAYE). Senior or mid-level individual-contributor positions across AI infrastructure, AI engineering, platform engineering, backend, and full-stack development. Not taking contract or consulting work.

What is your strongest area?

LLM infrastructure and platform engineering. I have built and shipped slipstream (a token-efficient coding agent runner with persistent memory and a live local dashboard), a multi-provider OpenAI-compatible gateway with 36-engine failover and intent-based plugin auto-routing, a durable multi-agent orchestrator with deterministic replay, a sub-second WebRTC voice agent loop, an evals-as-code runner, a minimal LLM inference server in Rust with paged KV-cache and continuous batching, an LSM-tree storage engine in Go, a Raft KV store with a fault-injection harness, a WebAssembly sandbox in Rust with a deny-by-default host ABI, and full IaC for Vercel plus Supabase plus Cloudflare plus DigitalOcean. All open source. All MIT licensed.

How many years of experience do you have?

Around five years of production software engineering experience.

Which technologies do you work with?

TypeScript, Python, Go, Rust, Next.js, Node.js, Fastify, tRPC, Postgres, Supabase, Drizzle, Redis, BullMQ, Docker, Kubernetes, Helm, Terraform, Vercel, Cloudflare, DigitalOcean, OpenTelemetry, Prometheus, Grafana, multi-provider LLM gateways, RAG, vector search, MCP.

Can I see your work before getting in touch?

Yes, every project is open source at github.com/sarmakska. Nineteen production-shaped repositories spanning LLM infrastructure, agent runners, voice loops, storage engines, consensus, and WebAssembly sandboxes. Whitepapers, architecture diagrams, and quick-start guides on each. Long-form engineering writing at sarmalinux.com/blog. Reading two or three of the posts is the fastest way to see how I think.

How do I get in touch about a role?

Email projects@sarmalinux.com with the company, the role, and one sentence on what you are building. I reply to every genuine enquiry. Recruiters welcome if the role is real and the company is named in the first message.

Got a role I would be a fit for?

Send a short note with the company, the role, and a sentence on what you are building. I reply to every genuine enquiry. Recruiters welcome if the role is real and the company is named in the first message.