Open Source · MIT · Helm + Bash

Next.js on Kubernetes, production-grade in five commands.

A Helm chart for the app and a bootstrap script for the platform. Ingress, TLS, autoscaling, metrics, logs, alerts. None of the YAML.

k8s-ops-toolkit is the platform layer most teams spend a week assembling, written down. The Helm chart deploys your Next.js app with deployment, service, ingress (TLS via cert-manager), HPA, PDB, and a Prometheus ServiceMonitor. The bootstrap script installs ingress-nginx, cert-manager, kube-prometheus-stack, and Loki + Promtail with sane defaults and pre-baked Grafana dashboards. Five commands, about eight minutes, no surprises.

5 commands · ~8 min to live · ~$70/mo platform on DigitalOcean · auto-renewing TLS · MIT licence

Why this exists

Most teams running Next.js on Kubernetes solve the same five problems in the first month: TLS, autoscaling, metrics, logs, alerts. Each is a few hours; together they are a week of yak-shaving before anyone is comfortable pushing to production.

The bigger ecosystems (Argo, Crossplane, Backstage) solve much larger problems and bring much heavier machinery with them. The lighter starters skip observability entirely. The middle ground is what most teams actually need and rarely package well.

k8s-ops-toolkit is that middle ground. A small Helm chart you can read in twenty minutes plus a bootstrap script for the platform stack. Pre-baked Grafana dashboards, working alert rules, sensible defaults. The week you would have spent, given back.

What it does

Every feature below ships in the public repository today. Clone, configure, run.

Helm chart for Next.js

Deployment, service, ingress with TLS, HPA, PDB, ServiceMonitor. Twenty minutes to read.
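A minimal values sketch shows the shape of the chart's knobs. The key names here are illustrative, not a contract; the chart's actual values.yaml in the repo is authoritative.

```yaml
# Hypothetical values.yaml sketch -- key names are illustrative;
# check the chart's own values.yaml for the real schema.
image:
  repository: ghcr.io/you/my-app
  tag: v1.0.0
replicaCount: 2
ingress:
  host: app.example.com
  tls: true
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
podDisruptionBudget:
  minAvailable: 1
serviceMonitor:
  enabled: true
  path: /api/metrics
```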

cert-manager built-in

Let's Encrypt issuer wired up. Automatic renewal. Default 90-day cert with 14-day expiry alerts.
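The wired-up issuer is a standard cert-manager ClusterIssuer along these lines. The resource kind and ACME fields are cert-manager's public API; the issuer and secret names are assumptions, not necessarily what the repo uses.

```yaml
# Sketch of a cert-manager ClusterIssuer for Let's Encrypt.
# Names (letsencrypt-prod, the secret ref) are illustrative.
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: you@example.com
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
      - http01:
          ingress:
            class: nginx
```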

Prometheus + Grafana

kube-prometheus-stack with three pre-baked dashboards: Cluster, Ingress, Next.js app.

Loki + Promtail

Log aggregation that does not break the bank. Indexed by label, queryable from Grafana.

Alertmanager rules

CrashLoopBackOff, ingress 5xx spikes, p99 latency, cert expiry, disk pressure. Wire to Slack or PagerDuty.
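For a sense of what these rules look like, here is one of them sketched as a PrometheusRule. The resource kind and expression style match kube-prometheus-stack conventions; the thresholds and names are illustrative, and the repo's own rule files are the source of truth.

```yaml
# Illustrative sketch of a single alert rule; thresholds and
# names are assumptions, not copied from the repo.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: app-alerts
  namespace: monitoring
spec:
  groups:
    - name: app.rules
      rules:
        - alert: PodCrashLooping
          expr: increase(kube_pod_container_status_restarts_total[15m]) > 3
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "{{ $labels.pod }} is restarting repeatedly"
```

Alertmanager then routes anything with `severity: critical` to the Slack webhook or PagerDuty integration you configured at install time.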

HPA on CPU or RPS

Default CPU autoscaling. Optional pattern for scaling on requests-per-second from the ServiceMonitor.
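The default CPU path is a plain autoscaling/v2 HPA; the RPS pattern swaps in a custom metric. Note the RPS variant assumes an adapter (such as prometheus-adapter) exposing the metric to the custom metrics API, and the metric name below is a placeholder.

```yaml
# Sketch of the default CPU-based HPA; the commented block shows
# the optional RPS pattern. Metric name is hypothetical.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    # Optional: scale on requests-per-second instead. Requires an
    # adapter (e.g. prometheus-adapter) serving this custom metric.
    # - type: Pods
    #   pods:
    #     metric:
    #       name: http_requests_per_second
    #     target:
    #       type: AverageValue
    #       averageValue: "50"
```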

ingress-nginx default

The ingress everyone runs. Documented annotations for body size, websocket, redirects.
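The documented annotations are standard ingress-nginx ones; where exactly they live in the chart's values is an assumption here, but the annotation keys themselves are the real ingress-nginx names.

```yaml
# Common ingress-nginx annotations, sketched as chart values.
# The values.yaml key path is illustrative; the annotation keys are real.
ingress:
  annotations:
    nginx.ingress.kubernetes.io/proxy-body-size: "10m"        # larger uploads
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"    # long-lived websockets
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"          # force HTTPS
```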

PDB for safe maintenance

Pod disruption budgets so cluster upgrades do not take you down.
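A PDB for this kind of workload is a few lines. This sketch uses a standard policy/v1 resource; the label selector is assumed to match whatever labels the chart's Deployment applies.

```yaml
# Sketch of the PDB the chart renders; selector labels are assumed.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: my-app
```

With `minAvailable: 1`, a node drain during a cluster upgrade waits until at least one replica is running elsewhere before evicting the last pod.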

No service mesh, by design

Mesh complexity is rarely worth the cost for Next.js workloads. We deliberately do not bundle one.

Plain Helm, no operator

You can read the templates. You can copy them. You can fork them. No magic.

Tech stack

Kubernetes 1.31+ · Helm 3.16+ · ingress-nginx · cert-manager · Prometheus · Grafana · Loki · Promtail · Alertmanager · kube-prometheus-stack · Bash bootstrap script

Architecture, in one diagram

The whole system on a single screen. Every box maps to a real folder in the repo.

┌──────────────────────────────────────────────────────────┐
│                   Internet                                │
└──────────────────────────┬───────────────────────────────┘
                           ▼
┌──────────────────────────────────────────────────────────┐
│  ingress-nginx (LoadBalancer)                             │
│   - cert-manager → Let's Encrypt → TLS                    │
└──────────────────────────┬───────────────────────────────┘
                           ▼
┌──────────────────────────────────────────────────────────┐
│  Next.js app (Helm chart)                                 │
│   - Deployment + Service + HPA + PDB                      │
│   - /api/health probes, /api/metrics scrape               │
└──────────────────────────┬───────────────────────────────┘
                           │
                           ▼
┌─────────────────────┐  ┌─────────────────────┐
│  Prometheus         │  │  Loki + Promtail    │
│   - ServiceMonitor  │  │   - log shipping    │
└──────────┬──────────┘  └──────────┬──────────┘
           ▼                         ▼
        ┌─────────────────────────────┐
        │  Grafana (dashboards)        │
        │  Alertmanager (Slack/PD)     │
        └─────────────────────────────┘

Quick start

From clone to first request in about eight minutes.

01
git clone https://github.com/sarmakska/k8s-ops-toolkit.git
cd k8s-ops-toolkit
02
./scripts/install.sh \
   --domain example.com \
   --email you@example.com \
   --slack-webhook https://hooks.slack.com/...
03
helm install my-app charts/nextjs-app \
   --set image.repository=ghcr.io/you/my-app \
   --set image.tag=v1.0.0 \
   --set ingress.host=app.example.com
04
kubectl port-forward -n monitoring svc/grafana 3000:80
# Grafana → Cluster Overview, Ingress nginx, Next.js app

Where it fits

The patterns this repository was built around.

First production cluster

Greenfield team going from "we deploy to Vercel" to "we run our own k8s." Skip the week of yak-shaving.

Adding observability later

You already have apps running but no metrics or logs. The bootstrap script gets you instrumented in an afternoon.

Standardising deploys

Pin every Next.js deploy in your org to the same chart. Consistent probes, consistent autoscaling, consistent alerts.

Cost-controlled SaaS infra

A single $70/mo cluster on DigitalOcean hosting an arbitrary number of apps. Predictable bill, no surprise vendors.

Your platform stack, written down. Five commands. Eight minutes.

Clone the repo, follow the four-step quick start, ship something real.
