A production-grade MCP server, ready on day one.
MCP has become the default integration layer for agents. Most reference servers are still toys.
MCP Server Toolkit is the opinionated alternative: an MCP 1.0 compliant server with two transports, OAuth 2.1, schema validation, per-client rate limits, and OpenTelemetry spans already wired in. Write a tool once and reach it over stdio for a local agent and over streamable HTTP for a remote one, with the same handler code.
Why this exists
MCP is the cleanest standard the agent ecosystem has produced for tool use. The official SDKs are excellent. The reference servers are deliberately minimal. The moment you try to ship one into a real product, the same thousand lines of plumbing reappear: OAuth, telemetry, plugin loading, lifecycle, CI scaffolding, schema generation, rate limiting.
Every team I have worked with rebuilds the same stack: a FastAPI app, a plugin discovery loop, a way to gate sensitive tools behind OAuth scopes, and a way to fan tools out across stdio for desktop clients and streamable HTTP for hosted clients. Most of the time the implementation drifts and the tests come last.
MCP Server Toolkit is that work, done once. One server, one plugin contract, both transports, OAuth on the sensitive endpoints, OpenTelemetry across the lot, a container that runs as a non-root user with a healthcheck. Fork it, drop in your plugins, deploy it. The boring work is already done.
What is in the box
Everything below ships in the public repository today. Clone, configure, run.
Decorator-based registry
Mark a handler with @registry.tool and the server derives the input JSON Schema from your Python type hints. additionalProperties is false by default, so unexpected arguments are rejected.
Two transports, one code path
stdio (JSON-RPC 2.0) for desktop and IDE agents. Streamable HTTP at POST /mcp for remote clients. Both flow through protocol.dispatch, so a tool written once behaves identically.
OAuth 2.1 with JWKS validation
Bearer tokens validated against the issuer JWKS with issuer and audience checks. RS256, ES256, RS384, RS512 only. Keys cached and refetched on rotation without a restart.
API key mode
Set MCP_AUTH=api_key and clients send X-API-Key. Constant-time comparison so the key never leaks through timing. Pick none for stdio, api_key for a team server, oauth for multi-tenant.
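A minimal sketch of what constant-time comparison looks like in Python, using the standard library's hmac.compare_digest. The function name api_key_matches is illustrative and is not taken from the repository.

```python
import hmac
from typing import Optional


def api_key_matches(presented: Optional[str], expected: str) -> bool:
    """Return True only when the presented key equals the configured one."""
    if not presented or not expected:
        # A missing header or an unset server key never authenticates.
        return False
    # compare_digest's running time does not depend on where the inputs differ,
    # unlike ==, which can return as soon as it finds the first mismatch.
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```

An ordinary == would produce the same answers; the difference is only in what an attacker can learn from response timing.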
Per-client rate limit
Token bucket keyed by API key, bearer subject, or remote address. Configure with MCP_RATE_LIMIT_RPS and MCP_RATE_LIMIT_BURST. Exhausted clients receive 429.
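For readers who have not met a token bucket before, here is a small sketch of the idea under stated assumptions: the class names, the injectable clock, and the in-memory dictionary are all hypothetical, and the toolkit's own limiter may be organised differently. The rate and burst arguments play the roles of MCP_RATE_LIMIT_RPS and MCP_RATE_LIMIT_BURST.

```python
import time


class TokenBucket:
    """One bucket: refills at `rate` tokens per second, holds at most `burst`."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.clock = clock
        self.tokens = burst          # start full so a new client can burst
        self.updated = clock()

    def allow(self):
        now = self.clock()
        # Refill for the time elapsed since the last call, capped at capacity.
        self.tokens = min(self.burst, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False                 # the caller would answer with HTTP 429


class RateLimiter:
    """One bucket per client identity (API key, bearer subject, or address)."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate, self.burst, self.clock = rate, burst, clock
        self.buckets = {}

    def allow(self, client_id):
        bucket = self.buckets.get(client_id)
        if bucket is None:
            bucket = self.buckets[client_id] = TokenBucket(self.rate, self.burst, self.clock)
        return bucket.allow()
```

Each client drains only its own bucket, so one noisy caller cannot exhaust another's allowance.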
OpenTelemetry baked in
Every tools/call runs inside a tool.<name> span recording name, argument count, duration, and error type. Set MCP_OTEL_ENDPOINT and spans export over OTLP. Otherwise structlog JSON still flows to stderr.
Filesystem plugin (sandboxed)
Read, write, list, and search inside an allow-listed root pinned by MCP_FS_ROOT. Path traversal is rejected at the boundary. list_files declares an output schema, so its result surfaces as structuredContent.
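Rejecting traversal at the boundary usually comes down to resolving the requested path and checking that it still sits under the root. The sketch below shows that check with pathlib; resolve_in_root and SandboxError are hypothetical names, and the real plugin may do more.

```python
from pathlib import Path


class SandboxError(ValueError):
    """Raised when a requested path would escape the sandbox root."""


def resolve_in_root(root: Path, user_path: str) -> Path:
    root = root.resolve()
    # Joining then resolving collapses ".." segments and follows symlinks.
    # An absolute user_path replaces the root entirely, which the check
    # below also catches.
    candidate = (root / user_path).resolve()
    if candidate != root and root not in candidate.parents:
        raise SandboxError(f"path escapes sandbox root: {user_path!r}")
    return candidate
```

The comparison runs on resolved paths on both sides, so a symlink inside the root that points outside it is rejected as well.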
SarmaLink plugin
A working end-to-end example that wraps an external API as MCP tools. Calls into the SarmaLink-AI failover stack, so a calling agent borrows multi-provider routing without adopting the rest.
PKCE login command
mcp-toolkit login runs the OAuth 2.1 authorisation code flow with PKCE against a hosted server. Opens the browser, captures the redirect on a one-shot loopback listener, exchanges the code for a token.
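The PKCE part of that flow is small enough to show. This sketch generates a code verifier and derives the S256 challenge as RFC 7636 describes; the function names are illustrative and not the toolkit's own.

```python
import base64
import hashlib
import secrets


def make_verifier() -> str:
    # 64 random bytes encode to 86 URL-safe characters,
    # inside the 43 to 128 character range RFC 7636 allows.
    return secrets.token_urlsafe(64)


def challenge_for(verifier: str) -> str:
    # S256: base64url(SHA-256(verifier)) with the trailing '=' padding removed.
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The client sends the challenge with the authorisation request and the verifier with the token exchange, so an intercepted code is useless without the verifier.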
CLI: run, doctor, init, login
mcp-toolkit run starts a server. doctor prints registered tool counts and configuration. init scaffolds a plugin. login obtains a token.
Structured output validation
Declare an output_schema on a tool and the return value is validated on every call, then surfaced to the client as MCP structuredContent. Useful for tools whose downstream consumer is itself a program.
Reproducible container
Multi-stage uv build that runs as a non-root user with a built-in health check. Front it with TLS termination through your platform load balancer or Caddy on a VPS.
Architecture, in one diagram
Every box maps to a real module in src/mcp_toolkit. The protocol layer is the single source of truth, so a tool behaves identically over stdio and HTTP.
| Module | Role |
|---|---|
| server.py | Lifecycle: select transport, set up telemetry, import plugins. |
| protocol.py | MCP 1.0 JSON-RPC dispatch shared by both transports. |
| registry.py | Decorator-based tool registry: schema generation, validation, span wrapping. |
| transports/stdio.py | JSON-RPC 2.0 loop over stdin and stdout, one message per line. |
| transports/http.py | FastAPI app: POST /mcp, REST /tools, /health, auth, rate limiting. |
| auth/api_key.py | Constant-time API key comparison. |
| auth/oauth.py | OAuth 2.1 resource server: JWT validation against issuer JWKS. |
| auth/ratelimit.py | Per-client token bucket. |
| oauth_client.py | OAuth 2.1 PKCE flow for obtaining tokens. |
| telemetry.py | OpenTelemetry tracer provider and structlog configuration. |
| config.py | Settings from environment, MCP_ prefix. |
| cli.py | run, doctor, init, login commands. |
Quick start
Clone to first tool call in under five minutes. Commands taken straight from the README.
git clone https://github.com/sarmakska/mcp-server-toolkit.git
cd mcp-server-toolkit
uv sync
cp .env.example .env   # set MCP_AUTH, MCP_FS_ROOT, keys as needed
uv run mcp-toolkit run --transport stdio
uv run mcp-toolkit run --transport http --port 8000
curl -s localhost:8000/mcp -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'
uv run mcp-toolkit doctor   # prints registered tool count and config
Plugin authoring, in real code
Real snippets from the repo and the wiki. Every example below runs as-is once you have set the relevant environment variables.
Minimum plugin: type hints drive the schema
# src/mcp_toolkit/plugins/myplugin/handlers.py
from ...registry import registry

@registry.tool("search_docs", description="Search internal docs")
async def search_docs(query: str, limit: int = 10) -> dict:
    return {"results": [...]}

# query is required (no default), limit is optional.
# The generated schema sets additionalProperties: false,
# so unexpected arguments are rejected at the boundary.
Output schema, surfaced as structuredContent
@registry.tool(
    "get_weather",
    description="Current weather for a city",
    output_schema={
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "temperature_c": {"type": "number"},
            "conditions": {"type": "string"},
        },
        "required": ["city", "temperature_c", "conditions"],
    },
)
async def get_weather(city: str) -> dict:
    return {"city": city, "temperature_c": 14.0, "conditions": "partly cloudy"}
Handshake plus tools/list over stdio
printf '%s\n' \
  '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-06-18"}}' \
  '{"jsonrpc":"2.0","id":2,"method":"tools/list"}' \
  | uv run mcp-toolkit run --transport stdio
OAuth 2.1: bearer token to a hosted server
MCP_AUTH=oauth
MCP_OAUTH_ISSUER=https://your-id-provider.com
MCP_OAUTH_AUDIENCE=mcp-toolkit
# Optional: skip discovery
MCP_OAUTH_JWKS_URI=https://your-id-provider.com/.well-known/jwks.json

curl -H "Authorization: Bearer $TOKEN" \
  http://localhost:8000/mcp \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'
PKCE login flow for tokens
MCP_OAUTH_CLIENT_ID=your-public-client-id
MCP_OAUTH_REDIRECT_URI=http://127.0.0.1:8765/callback

uv run mcp-toolkit login \
  --issuer https://your-id-provider.com \
  --client-id your-public-client-id
# Prints the access token to send as Authorization: Bearer ...
Configuration, twelve-factor
Every setting reads from the environment with an MCP_ prefix. A .env file works in development; the same names work in production.
| Env var | Purpose | Default |
|---|---|---|
| MCP_TRANSPORT | stdio or http | stdio |
| MCP_HTTP_HOST / MCP_HTTP_PORT | HTTP bind address | 0.0.0.0:8000 |
| MCP_AUTH | none, api_key, or oauth | none |
| MCP_API_KEY | Key for api_key mode | unset |
| MCP_OAUTH_ISSUER / MCP_OAUTH_AUDIENCE | OAuth resource server checks | unset |
| MCP_OAUTH_JWKS_URI | Explicit JWKS endpoint (otherwise discovered) | unset |
| MCP_RATE_LIMIT_RPS / MCP_RATE_LIMIT_BURST | Per-client token bucket | 0 (off) |
| MCP_FS_ROOT | Filesystem plugin sandbox root | ~/mcp-data |
| MCP_OTEL_ENDPOINT | OTLP collector URL for spans | unset |
| MCP_SARMALINK_API_KEY | Key for the bundled sarmalink plugin | unset |
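To make the prefix convention concrete, here is a stdlib-only sketch that loads a few of the variables above into a typed settings object. The Settings fields cover only part of the table, and load_settings is a hypothetical helper; config.py in the repository may be built on a different library entirely.

```python
import os
from dataclasses import dataclass, fields
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    # Defaults mirror the table above.
    transport: str = "stdio"
    http_host: str = "0.0.0.0"
    http_port: int = 8000
    auth: str = "none"
    rate_limit_rps: float = 0.0


def load_settings(environ: Optional[Mapping[str, str]] = None, prefix: str = "MCP_") -> Settings:
    env = os.environ if environ is None else environ
    values = {}
    for f in fields(Settings):
        raw = env.get(prefix + f.name.upper())   # e.g. http_port -> MCP_HTTP_PORT
        if raw is not None:
            # Coerce the string to the type of the field's default value.
            values[f.name] = type(f.default)(raw)
    settings = Settings(**values)
    if settings.transport not in {"stdio", "http"}:
        raise ValueError(f"unsupported transport: {settings.transport!r}")
    if settings.auth not in {"none", "api_key", "oauth"}:
        raise ValueError(f"unsupported auth mode: {settings.auth!r}")
    return settings
```

Passing a plain dictionary instead of os.environ keeps the loader easy to test, and variables without the prefix are ignored.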
Where it fits
The patterns this repository was built around, and the ones it deliberately is not.
Internal tool gateway
Front a Postgres replica, an S3 bucket, and a GitHub PAT behind one MCP server. Tools are schema-validated, OAuth-scoped, traced. The agent talks to one endpoint instead of three SDKs.
Desktop and IDE plugins
Ship the same plugin code over stdio for IDE agents. No HTTP, no auth, no rate limits because the OS process boundary is the trust boundary.
Remote MCP for a team
Run as a public-facing service with API-key auth and rate limits. One server, many clients. Spans into your existing OTel collector.
Wrap an existing API
The bundled sarmalink plugin shows the pattern: take an HTTP API, expose it as MCP tools with typed inputs and outputs. Agents pick it up automatically.
When NOT to reach for it
A throwaway single-tool stdio script for one local agent is lighter to write directly against the reference SDK. This toolkit is for servers you intend to operate.
Not a managed service
You host and run the server. There is no SaaS dashboard, no per-call billing, no opaque vendor between you and the protocol.
Compared to the alternatives
Two honest comparisons. The official reference servers, and the in-house MCP plumbing most teams end up writing.
| Feature | MCP Server Toolkit | Reference SDK example | In-house build |
|---|---|---|---|
| MCP 1.0 protocol | Yes, three versions negotiated | Yes | Reimplemented |
| Both transports | stdio + HTTP, one code path | Pick one | Pick one |
| Schema from type hints | Yes | Manual | Manual |
| OAuth 2.1 + PKCE login | Built in | No | You write it |
| Rate limiting per client | Token bucket | No | You write it |
| OTel spans per tool call | Yes | No | You write it |
| Container, non-root, healthcheck | Yes | No | Hand-rolled |
| Self-hosted, MIT licensed | Yes | Yes | N/A |
Documentation, all in the wiki
Seven focused wiki pages. Each one answers a single operational question. No homepage marketing in between.
Frequently asked
Eight questions I have actually been asked while shipping this with other teams.
Why ship both stdio and HTTP?
They serve different agents. Desktop and IDE clients launch the server as a subprocess and exchange JSON-RPC over stdin and stdout. Hosted agents call POST /mcp over the network. Routing both through one protocol.dispatch is what guarantees a tool behaves identically in either deployment.
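The shape of that design, reduced to a sketch: one dispatch coroutine, and two adapters that only translate between their wire format and a dictionary. Everything here is illustrative (the tool list, the function names, the absence of auth), not the code in protocol.py.

```python
import asyncio
import json

# Hypothetical registry contents, standing in for real registered tools.
TOOLS = [{"name": "search_docs", "description": "Search internal docs"}]


async def dispatch(message: dict) -> dict:
    """Single entry point: every transport hands its parsed message here."""
    msg_id = message.get("id")
    if message.get("method") == "tools/list":
        return {"jsonrpc": "2.0", "id": msg_id, "result": {"tools": TOOLS}}
    return {"jsonrpc": "2.0", "id": msg_id,
            "error": {"code": -32601, "message": "method not found"}}


async def handle_stdio_line(line: str) -> str:
    # stdio adapter: one JSON message per line in, one line out.
    return json.dumps(await dispatch(json.loads(line)))


async def handle_http_body(body: bytes) -> dict:
    # HTTP adapter: the web framework would serialise the returned dict.
    return await dispatch(json.loads(body))
```

Because neither adapter contains tool logic, there is nowhere for the two deployments to diverge.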
How is the input schema generated?
The registry inspects type hints with typing.get_type_hints and maps each parameter to a JSON Schema fragment. str becomes string, int becomes integer, list[str] becomes an array of strings, X | None is treated as the inner type. Parameters without a default are required. additionalProperties is set to false.
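A compact sketch of that mapping, written from the description above rather than copied from registry.py. The helper names are hypothetical, and the real registry likely handles more types than the handful shown.

```python
import inspect
import types
import typing

_PRIMITIVES = {str: "string", int: "integer", float: "number", bool: "boolean"}
# typing.Optional[X] reports typing.Union; the X | None syntax (Python 3.10+)
# reports types.UnionType.
_UNIONS = (typing.Union,) + ((types.UnionType,) if hasattr(types, "UnionType") else ())


def _fragment(tp) -> dict:
    origin, args = typing.get_origin(tp), typing.get_args(tp)
    if origin in _UNIONS:
        inner = [a for a in args if a is not type(None)]
        # X | None is treated as the inner type X.
        return _fragment(inner[0]) if len(inner) == 1 else {}
    if origin is list:
        return {"type": "array", "items": _fragment(args[0])}
    if tp is list:
        return {"type": "array"}
    if tp is dict or origin is dict:
        return {"type": "object"}
    if tp in _PRIMITIVES:
        return {"type": _PRIMITIVES[tp]}
    return {}  # unknown annotation: accept any JSON value


def input_schema(func) -> dict:
    hints = typing.get_type_hints(func)
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = _fragment(hints.get(name, typing.Any))
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the caller must supply it
    return {"type": "object", "properties": properties,
            "required": required, "additionalProperties": False}
```

Applied to the search_docs handler from the plugin example, this yields a schema with query required, limit optional, and additionalProperties set to false.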
Do handlers have to be async?
Yes. The decorator rejects a sync function at registration time. Use httpx.AsyncClient and async database drivers. If you must call blocking code, wrap it in loop.run_in_executor. Log through structlog rather than print so stdout stays a clean JSON-RPC channel.
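Both halves of that answer fit in a few lines. The sketch below uses a stand-in decorator, require_async, to show the registration-time check, and wraps a deliberately blocking function with run_in_executor; neither name comes from the toolkit.

```python
import asyncio
import inspect
import time


def require_async(func):
    """Stand-in for the registration check: refuse plain def handlers."""
    if not inspect.iscoroutinefunction(func):
        raise TypeError(f"{func.__name__} must be declared with async def")
    return func


def slow_checksum(data: bytes) -> int:
    time.sleep(0.01)            # pretend this is a blocking library call
    return sum(data) % 256


@require_async
async def checksum(data: bytes) -> int:
    loop = asyncio.get_running_loop()
    # Run the blocking call on the default thread pool so the event loop
    # keeps serving other requests in the meantime.
    return await loop.run_in_executor(None, slow_checksum, data)
```

Decorating slow_checksum directly would raise TypeError at import time, which is the point: the mistake surfaces before the server accepts a single call.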
How do errors propagate?
Argument validation failures map to JSON-RPC -32602. Unknown tools or methods map to -32601. Handler exceptions return a tools/call result with isError set to true and the exception message attached, so the agent sees what went wrong rather than the session crashing.
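Those three outcomes can be sketched as one function. This is an illustration under assumptions: call_tool is a hypothetical name, and binding arguments against the handler signature stands in for the toolkit's JSON Schema validation.

```python
import asyncio
import inspect


def _error(msg_id, code, text):
    return {"jsonrpc": "2.0", "id": msg_id, "error": {"code": code, "message": text}}


async def call_tool(tools: dict, message: dict) -> dict:
    msg_id = message.get("id")
    params = message.get("params") or {}
    name = params.get("name")
    arguments = params.get("arguments") or {}

    handler = tools.get(name)
    if handler is None:
        return _error(msg_id, -32601, f"unknown tool: {name}")

    try:
        # Stand-in for schema validation: missing or unexpected
        # arguments fail to bind and never reach the handler.
        inspect.signature(handler).bind(**arguments)
    except TypeError as exc:
        return _error(msg_id, -32602, str(exc))

    try:
        value = await handler(**arguments)
        result = {"content": [{"type": "text", "text": str(value)}], "isError": False}
    except Exception as exc:
        # A failing handler is a normal result with isError set,
        # so the agent can read the message and the session survives.
        result = {"content": [{"type": "text", "text": str(exc)}], "isError": True}
    return {"jsonrpc": "2.0", "id": msg_id, "result": result}
```

The split matters to the calling agent: protocol errors mean the request itself was wrong, while isError means the tool ran and reported a failure it may be able to recover from.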
Is the OAuth flow really self-contained?
Yes. mcp-toolkit login discovers authorisation server metadata, generates a PKCE pair, opens the browser, captures the redirect on a one-shot loopback listener, exchanges the code for a token, and prints it. There is no third-party broker.
What happens when keys rotate upstream?
The JWKS validator caches keys and refetches on rotation, so a key roll at your identity provider does not need a server restart. The kid in the JWT header points the validator at the right key.
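The caching behaviour can be sketched without any JWT library. In this illustration the cache refetches only when it meets a kid it has not seen; JwksCache, kid_from_token, and the injected fetch function are hypothetical names, and signature verification is deliberately left out.

```python
import base64
import json


class JwksCache:
    """Caches keys by kid and refetches when an unknown kid appears."""

    def __init__(self, fetch_jwks):
        self._fetch = fetch_jwks   # callable returning {"keys": [...]}
        self._keys = {}

    def key_for(self, kid: str) -> dict:
        if kid not in self._keys:
            # An unknown kid usually means the provider rotated its keys.
            self._keys = {k["kid"]: k for k in self._fetch()["keys"] if "kid" in k}
        try:
            return self._keys[kid]
        except KeyError:
            raise LookupError(f"no key published for kid {kid!r}") from None


def kid_from_token(token: str) -> str:
    # The JWT header is the first dot-separated segment, base64url
    # encoded without padding, so restore the padding before decoding.
    header_b64 = token.split(".")[0]
    padded = header_b64 + "=" * (-len(header_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))["kid"]
```

A production validator would also limit how often an unknown kid can trigger a refetch, so forged tokens cannot be used to hammer the identity provider.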
Will this fight my Kubernetes setup?
No. The container runs as a non-root user, exposes a built-in health check, and reads every setting from the environment with an MCP_ prefix. The /health endpoint is always open, so it works as a liveness and readiness probe without any auth bypass.
Why Python rather than TypeScript?
Python is where the MCP reference SDK ships first and the tool-authoring ecosystem (Pydantic, jsonschema, OpenTelemetry, FastAPI) is mature. The protocol layer is small enough that a TypeScript twin would be a port, not an integration.
Related products
The rest of the Sarma Linux toolkit. Same opinions throughout: open source, MIT, real depth.
Stand up an MCP server in under five minutes.
Clone the repo, run the four-step quick start, register a plugin, ship.