Open Source · MIT · Helm + Bash

Next.js on Kubernetes, production-grade in five commands.

A Helm chart for the app and a bootstrap script for the platform. Ingress, TLS, autoscaling, metrics, logs, alerts. None of the YAML.

k8s-ops-toolkit is the platform layer most teams spend a week assembling, written down. The Helm chart deploys your Next.js app with deployment, service, ingress (TLS via cert-manager), HPA, PDB, and a Prometheus ServiceMonitor. The bootstrap script installs ingress-nginx, cert-manager, kube-prometheus-stack, and Loki + Promtail with sane defaults and pre-baked Grafana dashboards. Five commands, about eight minutes, no surprises.


5 commands · ~8 min to live · ~$70/mo platform on DO · TLS auto-renewing · MIT licence

Why this exists

Most teams running Next.js on Kubernetes solve the same five problems in the first month: TLS, autoscaling, metrics, logs, alerts. Each is a few hours; together they are a week of yak-shaving before anyone is comfortable pushing to production.

The bigger ecosystems (Argo, Crossplane, Backstage) solve much larger problems and bring much heavier machinery with them. The lighter starters skip observability entirely. The middle ground is what most teams actually need, and it is rarely packaged well.

k8s-ops-toolkit is that middle ground. A small Helm chart you can read in twenty minutes plus a bootstrap script for the platform stack. Pre-baked Grafana dashboards, working alert rules, sensible defaults. The week you would have spent, given back.

What it does

Every feature below ships in the public repository today. Clone, configure, run.

Helm chart for Next.js

Deployment, service, ingress with TLS, HPA, PDB, ServiceMonitor. Twenty minutes to read.

cert-manager built-in

Let's Encrypt issuer wired up. Automatic renewal. Certificates use Let's Encrypt's default 90-day lifetime, with alerts 14 days before expiry.
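
For orientation, a Let's Encrypt issuer in cert-manager is usually a ClusterIssuer resource along the lines of the one printed below. This is a sketch, not the manifest the bootstrap script applies: the issuer name letsencrypt-prod, the account-key secret name, and the render_cluster_issuer helper are assumptions made for illustration. The ingressClassName solver field needs a reasonably recent cert-manager release; older releases use class instead.

```shell
#!/usr/bin/env bash
# Sketch: print a Let's Encrypt ClusterIssuer for cert-manager.
# The issuer name, secret name, and this helper are illustrative assumptions,
# not taken from the repository.
render_cluster_issuer() {
  local email="$1"
  cat <<EOF
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    # Let's Encrypt production ACME endpoint.
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ${email}
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
      # HTTP-01 challenges are answered through the nginx ingress class.
      - http01:
          ingress:
            ingressClassName: nginx
EOF
}

# Print the manifest; pipe it to `kubectl apply -f -` to create the issuer.
render_cluster_issuer "you@example.com"
```

An Ingress then opts in with the cert-manager.io/cluster-issuer annotation set to the issuer's name.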

Prometheus + Grafana

kube-prometheus-stack with three pre-baked dashboards: Cluster, Ingress, Next.js app.

Loki + Promtail

Log aggregation that does not break the bank. Indexed by label, queryable from Grafana.

Alertmanager rules

CrashLoopBackOff, ingress 5xx spikes, p99 latency, cert expiry, disk pressure. Wire to Slack or PagerDuty.
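
To make two of these alerts concrete, the sketch below prints a PrometheusRule with a crash-loop alert and a certificate-expiry alert. The rule names, namespace, durations, and the render_alert_rules helper are assumptions, not the rules shipped in the repository. The metrics themselves come from kube-state-metrics and cert-manager.

```shell
#!/usr/bin/env bash
# Sketch: print a PrometheusRule with two example alerts.
# Names, namespace, and thresholds are illustrative assumptions.
render_alert_rules() {
  local days="${1:-14}"
  # Convert the warning window from days to seconds for the PromQL comparison.
  local seconds=$(( days * 24 * 3600 ))
  cat <<EOF
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-app-alerts
  namespace: monitoring
spec:
  groups:
    - name: example.rules
      rules:
        - alert: PodCrashLooping
          expr: kube_pod_container_status_waiting_reason{reason="CrashLoopBackOff"} == 1
          for: 10m
          labels:
            severity: critical
        - alert: CertificateExpiringSoon
          expr: certmanager_certificate_expiration_timestamp_seconds - time() < ${seconds}
          for: 1h
          labels:
            severity: warning
EOF
}

# Print rules that warn 14 days before a certificate expires.
render_alert_rules 14
```

With kube-prometheus-stack defaults, Prometheus typically loads only rules whose labels match its rule selector, so a rule like this may also need a release label matching the Helm release.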

HPA on CPU or RPS

Default CPU autoscaling. Optional pattern for scaling on requests-per-second from the ServiceMonitor.
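
Whichever metric drives it, the Horizontal Pod Autoscaler sizes the deployment with the same documented rule: desired replicas = ceil(current replicas × current metric ÷ target metric), clamped to the configured bounds. The sketch below reproduces that arithmetic. The helper name and the default bounds of 1 and 10 are assumptions, and it ignores the controller's tolerance band and stabilisation window.

```shell
#!/usr/bin/env bash
# Sketch of the HPA scaling rule:
#   desired = ceil(current_replicas * current_metric / target_metric)
# clamped to [min, max]. Helper name and default bounds are assumptions.
hpa_desired_replicas() {
  local current="$1" metric="$2" target="$3" min="${4:-1}" max="${5:-10}"
  # Integer ceiling division: (a + b - 1) / b.
  local desired=$(( (current * metric + target - 1) / target ))
  if (( desired < min )); then desired=$min; fi
  if (( desired > max )); then desired=$max; fi
  echo "$desired"
}

hpa_desired_replicas 4 90 60     # CPU: 4 pods at 90% vs 60% target, prints 6
hpa_desired_replicas 4 30 60     # CPU: 4 pods at 30% vs 60% target, prints 2
hpa_desired_replicas 3 250 100   # RPS: 3 pods at 250 rps each vs 100 target, prints 8
```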

ingress-nginx default

The ingress everyone runs. Documented annotations for body size, websocket, redirects.
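
As an illustration of those three cases, the sketch below prints a kubectl command that sets the corresponding ingress-nginx annotations. The annotation keys are standard ingress-nginx ones. The helper name, the ingress name my-app, and the default values are assumptions.

```shell
#!/usr/bin/env bash
# Sketch: print (not run) a kubectl command that sets common ingress-nginx
# annotations. The helper name and default values are illustrative assumptions.
ingress_annotate_cmd() {
  local ingress="$1" body_size="${2:-10m}" timeout="${3:-3600}"
  local prefix="nginx.ingress.kubernetes.io"
  printf 'kubectl annotate ingress %s --overwrite \\\n' "$ingress"
  # Largest request body nginx accepts before answering 413.
  printf '  %s/proxy-body-size=%s \\\n' "$prefix" "$body_size"
  # Long read/send timeouts keep idle websocket connections open.
  printf '  %s/proxy-read-timeout=%s \\\n' "$prefix" "$timeout"
  printf '  %s/proxy-send-timeout=%s \\\n' "$prefix" "$timeout"
  # Redirect HTTP to HTTPS regardless of whether the Ingress has a TLS section.
  printf '  %s/force-ssl-redirect=true\n' "$prefix"
}

ingress_annotate_cmd my-app
```

For an Ingress rendered by Helm, annotations applied by hand are likely to be reverted on the next upgrade, so in practice they belong in the chart's values.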

PDB for safe maintenance

Pod disruption budgets so cluster upgrades do not take you down.
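
A budget with minAvailable permits voluntary evictions only while the number of healthy pods stays above that floor. The sketch below shows the arithmetic. The helper name is an assumption, and the chart's actual PDB settings are not shown here.

```shell
#!/usr/bin/env bash
# Sketch: how many pods a node drain may evict at once under a PDB that
# uses minAvailable. The helper name is an illustrative assumption.
pdb_allowed_disruptions() {
  local healthy="$1" min_available="$2"
  local allowed=$(( healthy - min_available ))
  # A budget can never allow a negative number of evictions.
  if (( allowed < 0 )); then allowed=0; fi
  echo "$allowed"
}

pdb_allowed_disruptions 3 2   # prints 1: drains proceed one pod at a time
pdb_allowed_disruptions 1 1   # prints 0: a single replica blocks the drain
```

The second case is the common pitfall: with one replica and minAvailable set to 1, a node drain waits indefinitely, so a budget is most useful alongside at least two replicas.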

No service mesh, by design

Mesh complexity is rarely worth the cost for Next.js workloads. We deliberately do not bundle one.

Plain Helm, no operator

You can read the templates. You can copy them. You can fork them. No magic.

Architecture, in one diagram

The whole system on a single screen. Every box maps to a real folder in the repo.

┌──────────────────────────────────────────────────────────┐
│                   Internet                                │
└──────────────────────────┬───────────────────────────────┘
                           ▼
┌──────────────────────────────────────────────────────────┐
│  ingress-nginx (LoadBalancer)                             │
│   - cert-manager → Let's Encrypt → TLS                    │
└──────────────────────────┬───────────────────────────────┘
                           ▼
┌──────────────────────────────────────────────────────────┐
│  Next.js app (Helm chart)                                 │
│   - Deployment + Service + HPA + PDB                      │
│   - /api/health probes, /api/metrics scrape               │
└──────────────────────────┬───────────────────────────────┘
                           │
                           ▼
┌─────────────────────┐  ┌─────────────────────┐
│  Prometheus         │  │  Loki + Promtail    │
│   - ServiceMonitor  │  │   - log shipping    │
└──────────┬──────────┘  └──────────┬──────────┘
           ▼                         ▼
        ┌─────────────────────────────┐
        │  Grafana (dashboards)        │
        │  Alertmanager (Slack/PD)     │
        └─────────────────────────────┘

Quick start

From clone to first request in about eight minutes.

01

git clone https://github.com/sarmakska/k8s-ops-toolkit.git
cd k8s-ops-toolkit

02

./scripts/install.sh \
   --domain example.com \
   --email you@example.com \
   --slack-webhook https://hooks.slack.com/...

03

helm install my-app charts/nextjs-app \
   --set image.repository=ghcr.io/you/my-app \
   --set image.tag=v1.0.0 \
   --set ingress.host=app.example.com
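
The same three settings can live in a values file, which is easier to review and keep in version control than a growing list of --set flags. A minimal sketch follows. The keys mirror the flags in step 03, while the file name my-app-values.yaml and the write_values helper are assumptions.

```shell
#!/usr/bin/env bash
# Sketch: write the step 03 settings to a values file instead of --set flags.
# The file name and helper are illustrative assumptions.
write_values() {
  local file="$1"
  cat > "$file" <<'EOF'
image:
  repository: ghcr.io/you/my-app
  tag: v1.0.0
ingress:
  host: app.example.com
EOF
}

write_values my-app-values.yaml

# Then install, or upgrade an existing release, from the file:
#   helm upgrade --install my-app charts/nextjs-app -f my-app-values.yaml
```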

04

kubectl port-forward -n monitoring svc/grafana 3000:80
# Open http://localhost:3000 → Grafana dashboards: Cluster Overview, Ingress nginx, Next.js app
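
Once DNS points at the load balancer and the certificate has been issued, a short smoke check can confirm the app answers over TLS. The sketch below polls the /api/health path shown in the architecture diagram. The check_health helper, the retry count, and the delay are assumptions.

```shell
#!/usr/bin/env bash
# Sketch: poll the app's health endpoint over HTTPS until it responds.
# Helper name, retry count, and delay are illustrative assumptions.
check_health() {
  local host="$1" attempts="${2:-5}" delay="${3:-3}"
  local i
  for (( i = 1; i <= attempts; i++ )); do
    # -f fails on HTTP errors, -sS stays quiet but still reports failures.
    if curl -fsS --max-time 5 "https://${host}/api/health" >/dev/null; then
      echo "healthy after ${i} attempt(s)"
      return 0
    fi
    sleep "$delay"
  done
  echo "no healthy response from ${host}" >&2
  return 1
}

# Usage:
#   check_health app.example.com
```

A TLS failure here usually means the certificate is still being issued; `kubectl get certificate` shows whether it is ready.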

Where it fits

The patterns this repository was built around.

First production cluster

Greenfield team going from "we deploy to Vercel" to "we run our own k8s." Skip the week of yak-shaving.

Adding observability later

You already have apps running but no metrics or logs. The bootstrap script gets you instrumented in an afternoon.

Standardising deploys

Pin every Next.js deploy in your org to the same chart. Consistent probes, consistent autoscaling, consistent alerts.

Cost-controlled SaaS infra

A single $70/mo cluster on DigitalOcean hosting as many apps as its nodes have capacity for. Predictable bill, no surprise vendors.

Related products

The wider Sarma Linux toolkit. Every project ships with the same opinions: open source, MIT, real depth, no marketing fluff.

SarmaLink-AI

Multi-provider AI backend with sub-50ms failover across 36 engines.


MCP Server Toolkit

Production-ready Model Context Protocol server starter, with plugins.


Voice Agent Starter

Sub-second real-time voice loop with WebRTC, barge-in, and pluggable STT/TTS.


Agent Orchestrator

Deterministic-replay multi-agent workflows with durable state.


AI Eval Runner

Evals as code. Datasets, scorers, traces, regressions, all in one CLI.


Local LLM Router

OpenAI-compatible proxy that routes between local Ollama and cloud LLMs.


StaffPortal

Open-source HR + ops platform built to replace three SaaS subscriptions.


RAG-over-PDF

A minimal, production-shaped RAG starter with cited streaming answers.


Receipt Scanner

Vision-OCR receipt scanning starter with Zod-typed JSON output.


Webhook-to-Email

A tiny, production-grade webhook receiver with HMAC and React Email.


terraform-stack

Vercel + Supabase + Cloudflare + DigitalOcean as one Terraform repo.


Your platform stack, written down. Five commands. Eight minutes.