slipstream
A coding-agent platform plugin that swaps whole-file reads for precise tools, keeps context alive across compaction, and stands up a live local dashboard so I can actually watch the agents work. The runner I ship with.
- main: running
- sp-shipper: step 4/7
- sp-reviewer: waiting
- + sp_symbol(retrieve.ts, retrieveSymbol)
- + sp_lines(server.ts, 40, 92)
- ~ sp_remember(decision)
- + sp_map()
- [x] orient with sp_map
- [x] slice retrieveSymbol
- [ ] wire PreCompact hook
- [ ] verify with /doctor
The live dashboard, mocked. The real one renders on 127.0.0.1 and refreshes over SSE.
Why this exists
A long coding-agent session usually dies one of two ways. The agent reads whole files until the context window fills and starts forgetting the start of its own plan. Or it does good work, the session ends, and every decision it made evaporates. I write small production sites on Cloudflare, Supabase, Vercel and Resend, and I lean on the agent in my IDE for the boring parts. Both of those failure modes were biting me every long day.
The first failure mode is whole-file context bleed. The agent opens a 1,200-line component to change one prop. The budget bleeds. Three prompts later it has paged out the convention we agreed on at the top. Slipstream answers this with a bundled MCP server: a compact sp_map of the project, and sp_symbol / sp_lines for surgical slices. One symbol in, not one file in.
The second failure mode is the compaction cliff. The window summarises, the durable facts go with the noise. I tried writing everything into a single hand-rolled notes file. It rotted within a day. Slipstream answers this with a structured memory store, a PreCompact hook that writes a session digest the instant before the trim, and a signal-ranked recall that reloads only the relevant subset on the next session.
Then I added the thing I actually wanted most: a window into the session. When you fire off a plan and a subagent and walk away, you should be able to glance at a tab and see which agent is on which step and where the budget is. That is the live local dashboard, and it is the headline feature.
Watch the agents work
Session start boots a small 127.0.0.1 server on a free port and prints the URL into chat. Four panels, themed in the SarmaLinux palette, refreshed over SSE.
Agents
Every agent and subagent, its status (running, waiting, done, failed) and the task it is on. Grouped so a subagent's work does not tangle with the main thread.
Activity
The per-agent stream of prompts, tool calls and results as they land. The append-only event log behind it makes replay free.
Token budget
A bar that fills as reads pull bytes into context. With sp_* tools on, the bar crawls. With whole-file reads, it lurches.
Plan + mind map
The current plan and a Mermaid mind map of the session's agents, redrawn as events arrive. Same renderer as /slipstream:mindmap.
Honest about what it is
It is a local observability dashboard for your session. It watches and visualises, it does not drive. Nothing leaves the machine, there is no telemetry, the bind is 127.0.0.1 only, and obvious secrets are pattern-redacted before they ever reach the log. Auto-open lives behind a setting in .claude/slipstream/dashboard.json, and SLIPSTREAM_DASHBOARD=0 disables it per session.
Before and after, real token numbers
Numbers from this repository on my machine (Apple Silicon, Node 25), using slipstream's own conservative 3.6 bytes-per-token estimate from src/context/budget.ts.
| Approach | Bytes into context | Approx tokens | Saving |
|---|---|---|---|
| Whole-file Read of src/map/retrieve.ts | 4,841 | ~1,345 | baseline |
| sp_symbol(retrieve.ts, retrieveSymbol) | 1,381 | ~384 | 71% fewer |
| Reading every file in src/ | 146,150 | ~40,597 | naive orient |
| sp_map index instead | 7,821 | ~2,173 | 5.4% of reading everything |
The dashboard's token-budget bar makes this visible while it happens. With the tools on, the bar crawls. With whole-file reads, it lurches. The discipline that prevents sp_symbol from ever returning the whole file lives in src/mcp/tools.ts where it cannot be bypassed.
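The arithmetic behind the table can be checked with a minimal sketch. It assumes the estimate is simply bytes divided by 3.6 and rounded to the nearest token; the function names are hypothetical and this is not the code in src/context/budget.ts.

```typescript
// Assumption: tokens = bytes / 3.6, rounded to the nearest whole token.
function estimateTokens(bytes: number): number {
  // (bytes * 10) / 36 is the same ratio, written to avoid float noise at .5 boundaries.
  return Math.round((bytes * 10) / 36);
}

// Percentage of the baseline that a smaller read avoids.
function savingPercent(baselineBytes: number, sliceBytes: number): number {
  return Math.round((1 - sliceBytes / baselineBytes) * 100);
}

const wholeFile = estimateTokens(4841);     // whole-file Read of src/map/retrieve.ts
const symbolSlice = estimateTokens(1381);   // sp_symbol slice of retrieveSymbol
const saving = savingPercent(4841, 1381);
console.log(wholeFile, symbolSlice, `${saving}% fewer`);
```

Run against the byte counts in the table, this gives 1,345 and 384 tokens and a 71% saving, which matches the rows above.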
The bundled MCP tools
Nine tools, served over stdio by dist/mcp/index.js. Every one returns the smallest correct thing.
- sp_map: The compact project map: every file, its exported symbols and a one-line purpose. No file contents. The agent orients with this before it reads anything.
- sp_symbol: Just that symbol's source slice, with its doc comment. Walks braces from the declaration line. A single call replaces opening the whole file.
- sp_lines: Exactly that line range, bounded. No surrounding context, no leakage. For when the slice you want is a block, not a symbol.
- sp_search: Ranked file locations for a query. Returns locations, not contents, so the agent decides what to slice next.
- sp_remember: Write a durable Markdown fact into the memory store under .claude/slipstream/memory/. Survives compaction. Reloaded on next session.
- sp_recall: Read memories back into the turn. With a query, ranks by signal. Without, returns the index. Capped under a ~1,200 token ceiling.
- sp_forget: Remove a stale fact. The MEMORY.md index regenerates so the durable view stays clean.
- sp_budget: The context-budget level (ok / warn / compact) and a conservative token estimate from bytes-into-context at 3.6 bytes per token.
- sp_mindmap: The project rendered as a themed Mermaid mind map, returned to chat or written to a self-contained HTML artifact.
Deep dive: MCP-Tools wiki → . Token-Efficiency →
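The sp_symbol description says it walks braces from the declaration line. A minimal sketch of that idea follows. The function name, the declaration regex and the doc-comment handling are assumptions for illustration, not the code in src/map/retrieve.ts, and braces inside strings or comments are deliberately not handled.

```typescript
// Hypothetical sketch of brace walking; not the plugin's implementation.
function sliceSymbol(source: string, symbol: string): string | null {
  const lines = source.split("\n");
  const escaped = symbol.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Only brace-bodied declarations are handled in this sketch.
  const decl = new RegExp("\\b(?:function|class|interface|enum)\\s+" + escaped + "\\b");

  let start = -1;
  for (let i = 0; i < lines.length; i++) {
    if (decl.test(lines[i])) { start = i; break; }
  }
  if (start === -1) return null;

  // Include a doc comment that ends on the line directly above the declaration.
  let first = start;
  if (start > 0 && /\*\/\s*$/.test(lines[start - 1])) {
    first = start - 1;
    while (first > 0 && !/^\s*\/\*\*/.test(lines[first])) first--;
    if (!/^\s*\/\*\*/.test(lines[first])) first = start; // not a doc comment after all
  }

  // Walk braces from the declaration line until the first opened block closes.
  let depth = 0;
  let opened = false;
  for (let i = start; i < lines.length; i++) {
    const line = lines[i];
    for (let j = 0; j < line.length; j++) {
      if (line[j] === "{") { depth++; opened = true; }
      else if (line[j] === "}") depth--;
    }
    if (opened && depth === 0) return lines.slice(first, i + 1).join("\n");
  }
  return null; // unbalanced braces
}
```

The slice stops at the line where the depth returns to zero, so nothing after the closing brace of the symbol reaches the context.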
Lossless compaction + smart recall
The two memory features that make a long session survivable. One catches the thread before it is trimmed. The other refuses to dump the whole store back on the next start.
Lossless compaction
The agent platform fires PreCompact just before it summarises and trims the conversation, which is exactly the moment the thread tends to blur. slipstream's hook reconstructs what happened from the dashboard event log, builds a structured digest (open task, decisions, files touched, next step) in src/memory/digest.ts, and writes it to the store as a durable fact.
On the next session start it is reloaded first, so a resumed session picks up where it left off rather than from a lossy summary. The hook is idempotent. If the session is resumed mid-compact, the digest is updated, not duplicated.
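As a rough illustration of the digest step, the sketch below folds a list of events into the four fields named above and writes the result under a per-session key, so a second write replaces the first. The event kinds, field names and the in-memory store are all assumptions; the real shapes in src/memory/digest.ts are not shown in this document.

```typescript
// Hypothetical event and digest shapes, for illustration only.
type SessionEvent =
  | { kind: "prompt"; text: string }
  | { kind: "decision"; text: string }
  | { kind: "file_touched"; path: string }
  | { kind: "next_step"; text: string };

interface Digest {
  sessionId: string;
  openTask: string;
  decisions: string[];
  filesTouched: string[];
  nextStep: string;
}

function buildDigest(sessionId: string, events: SessionEvent[]): Digest {
  const digest: Digest = { sessionId, openTask: "", decisions: [], filesTouched: [], nextStep: "" };
  for (const event of events) {
    switch (event.kind) {
      case "prompt": digest.openTask = event.text; break; // latest prompt wins
      case "decision": digest.decisions.push(event.text); break;
      case "file_touched":
        if (digest.filesTouched.indexOf(event.path) === -1) digest.filesTouched.push(event.path);
        break;
      case "next_step": digest.nextStep = event.text; break;
    }
  }
  return digest;
}

// Idempotent write: keyed by session id, so a repeated PreCompact updates in place.
function upsertDigest(store: Record<string, Digest>, digest: Digest): void {
  store[digest.sessionId] = digest;
}

const store: Record<string, Digest> = {};
const events: SessionEvent[] = [
  { kind: "prompt", text: "wire PreCompact hook" },
  { kind: "file_touched", path: "src/memory/digest.ts" },
  { kind: "decision", text: "digest is keyed by session" },
];
upsertDigest(store, buildDigest("s1", events));
upsertDigest(store, buildDigest("s1", events)); // resumed mid-compact: still one digest
console.log(Object.keys(store).length);
```

Keying the write by session is one simple way to get the "updated, not duplicated" behaviour; the plugin may achieve it differently.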
Smart recall, not load-everything
A naive memory layer dumps the whole store back into context every session, which costs more tokens the larger and more useful it grows. slipstream instead builds a task signal from the git branch, the files changed in the working tree and the last prompt, ranks memories against it (src/memory/recall.ts), and reloads only the relevant subset under a hard ~1,200 token ceiling, plus the MEMORY.md index for the rest.
With no signal it loads nothing and defers to the index, because loading arbitrary memories with no signal is the behaviour we are avoiding.
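The recall behaviour described above can be sketched as a score, sort and budget loop. The scoring here is plain term overlap, the token estimate uses character count as a stand-in for bytes, and all names are hypothetical; the ranking in src/memory/recall.ts may well be more refined.

```typescript
interface Memory { slug: string; body: string; }

const TOKEN_CEILING = 1200; // the ~1,200 token ceiling quoted above

// Character count approximates bytes for mostly-ASCII notes.
const estimateTokens = (text: string): number => Math.round((text.length * 10) / 36);

function recall(memories: Memory[], signal: string[]): Memory[] {
  const terms = signal.map((t) => t.toLowerCase()).filter((t) => t.length > 0);
  if (terms.length === 0) return []; // no signal: load nothing, defer to the index

  const scored = memories
    .map((memory) => {
      const haystack = memory.body.toLowerCase();
      const score = terms.filter((term) => haystack.indexOf(term) !== -1).length;
      return { memory, score };
    })
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score);

  const picked: Memory[] = [];
  let spent = 0;
  for (const { memory } of scored) {
    const cost = estimateTokens(memory.body);
    if (spent + cost > TOKEN_CEILING) continue; // skip anything that does not fit
    picked.push(memory);
    spent += cost;
  }
  return picked;
}
```

In this sketch the signal would be built from the branch name, changed file paths and the last prompt, split into terms before the call.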
Recall, diagrammed
Signal-ranked recall on session start. Empty signal short-circuits to nothing.
One line in the status bar
Context budget level, durable memory count, active skill, model. The formatting is a pure function, unit-tested in src/statusline, so the bar never lies about the helper underneath.
cp | ctx 12% ok | mem 4 | skill scoped-read | Opus 4.8
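A pure formatter of this kind might look like the sketch below. The field names are assumptions, and the leading `cp` tag is copied from the sample line as-is since its meaning is not documented here.

```typescript
interface StatusInput {
  prefix: string; // leading tag, taken verbatim from the sample line
  contextPercent: number;
  level: "ok" | "warn" | "compact";
  memoryCount: number;
  skill: string;
  model: string;
}

// Pure function: same input, same string, no I/O. Easy to pin in a unit test.
function formatStatusline(s: StatusInput): string {
  return [
    s.prefix,
    `ctx ${Math.round(s.contextPercent)}% ${s.level}`,
    `mem ${s.memoryCount}`,
    `skill ${s.skill}`,
    s.model,
  ].join(" | ");
}

console.log(formatStatusline({
  prefix: "cp",
  contextPercent: 12,
  level: "ok",
  memoryCount: 4,
  skill: "scoped-read",
  model: "Opus 4.8",
}));
```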
Terse output style
A bundled output style under output-styles/slipstream.md tuned for high-signal, low-token answers. Switch to it with /output-style slipstream to spend fewer tokens per turn without losing precision. Pairs with the statusline so the cost stays visible.
/output-style slipstream # answers go terse, code blocks lean, no fluff
Three shipped subagents
Lean, token-disciplined subagents under agents/, each using the MCP tools rather than whole-file reads. Delegate with the Task tool, for example "use sp-reviewer to check this before I push".
sp-shipper: Scaffold to deployed
Drives a small production site through the integration skills end to end: scaffold, wire auth, set up Supabase with row-level security, deploy to Cloudflare or Vercel, attach a domain, send the first transactional mail. Every shipping skill has a verification gate the agent must run; sp-shipper refuses to advance past a red one.
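The "refuses to advance past a red gate" rule reduces to a sequential runner that stops at the first failing check. The sketch below is a hypothetical shape for that loop; the step names and the synchronous boolean gate are assumptions, and real gates would likely run commands and be asynchronous.

```typescript
interface Step { name: string; gate: () => boolean; }
interface RunResult { completed: string[]; blockedAt: string | null; }

// Run steps in order; stop at the first gate that returns false.
function runSteps(steps: Step[]): RunResult {
  const completed: string[] = [];
  for (const step of steps) {
    if (!step.gate()) return { completed, blockedAt: step.name };
    completed.push(step.name);
  }
  return { completed, blockedAt: null };
}

const result = runSteps([
  { name: "scaffold", gate: () => true },
  { name: "wire-auth", gate: () => true },
  { name: "supabase-rls", gate: () => false }, // a red gate
  { name: "deploy", gate: () => true },        // never reached
]);
console.log(result.completed, result.blockedAt);
```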
sp-schema: Postgres + RLS
Designs and migrates a Supabase / Postgres schema with row-level security that denies by default and explicit policies per role. Uses sp_symbol and sp_lines instead of reading whole migration files, so it stays inside budget even on a real codebase.
sp-reviewer: Pre-push guardrail
Runs lint, build, the test suite and a secret scan, then delivers a clear FAIL verdict that blocks the push when something is off. Designed to be invoked as "use sp-reviewer to check this before I push", so the green light is a discrete, auditable step.
Slash commands
Nine commands under commands/. Each is a thin wrapper around the helper, audited in one file, unit-tested where shape matters.
| Command | What it does |
|---|---|
| /slipstream:doctor | End-to-end install verifier. 12+ PASS / FAIL checks: MCP server built and declared, every hook wired, memory store reachable, helper CLI built, statusline and output style present, manifest valid. |
| /slipstream:map | Build the compact project map once. Walks the tree, records exported symbols and a one-line purpose per file, writes the index the agent will read from. |
| /slipstream:remember | Save a durable decision to the memory store. One Markdown file per fact with frontmatter, plus a regenerated MEMORY.md index. |
| /slipstream:recall | Pull facts back into the turn. With a query, signal-ranks against the working tree. Without, returns the durable index for the agent to scan. |
| /slipstream:forget | Drop a stale fact by slug. Index regenerates so the durable view stays clean. Used when a decision is reversed. |
| /slipstream:status | One screen: current plan, context-budget level with recommendation, durable memory count, project map size. |
| /slipstream:mindmap | Render the project as a themed Mermaid mind map. Inline in chat or written to a self-contained HTML artifact under .claude/slipstream/dashboard/. |
| /slipstream:dashboard | Print the dashboard URL again. Useful after a reload. Starting is idempotent so the running server is reused. |
| /slipstream:validate | Run plugin-validate against the manifest. Fails loudly on a malformed skill or a missing hook declaration. |
Architecture, end to end
Hooks, helper, MCP server, map, memory, event log, dashboard. Same diagram as the README, themed for the site.
The five pillars
Scoped-read tools, durable memory, lossless compaction, the live dashboard, the statusline. Each one earns its place.
Precise tools, not whole-file reads
A bundled MCP server (src/mcp) exposes sp_map, sp_symbol, sp_lines and sp_search. The agent orients with the map and pulls one declaration or one line range. The discipline lives in src/mcp/tools.ts where it cannot be bypassed.
Persistent memory with lossless compaction
A file-based store under .claude/slipstream/memory/, one Markdown fact per file plus a regenerated MEMORY.md index. A PreCompact hook writes a structured session digest the instant before the window is trimmed.
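To make the "one Markdown fact per file plus a regenerated index" layout concrete, here is a minimal sketch of the two renderers. The frontmatter keys, the index heading and the sample facts are assumptions for illustration; the actual file format is not specified in this document.

```typescript
// Hypothetical fact shape; frontmatter keys are illustrative.
interface Fact { slug: string; title: string; created: string; body: string; }

// One fact becomes one Markdown file with a small frontmatter header.
function renderFact(fact: Fact): string {
  return [
    "---",
    `slug: ${fact.slug}`,
    `title: ${fact.title}`,
    `created: ${fact.created}`,
    "---",
    "",
    fact.body,
    "",
  ].join("\n");
}

// The index is rebuilt from scratch each time, so it cannot drift from the files.
function renderIndex(facts: Fact[]): string {
  const rows = facts
    .slice()
    .sort((a, b) => a.slug.localeCompare(b.slug))
    .map((f) => `- [${f.title}](${f.slug}.md)`);
  return ["# MEMORY", "", ...rows, ""].join("\n");
}
```

Regenerating the whole index after every remember or forget is cheap at this scale and avoids partial-update bugs.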
Lossless context across sessions
On session start, recall reloads the digest first, then signal-ranks the store against the git branch, the working tree and the last prompt, and loads only the relevant subset under a ~1,200 token ceiling.
Live local agent dashboard
Session start boots a 127.0.0.1 server and prints the URL into chat. Four panels: agents, the activity stream, the token-budget bar and a Mermaid mind map. The append-only event log doubles as replay.
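On the wire, a Server-Sent Events stream is just text frames separated by blank lines. The helper below sketches how one event could be framed before it is written to the response; the function name and event names are hypothetical, while the `id:`, `event:` and `data:` fields are the standard SSE ones.

```typescript
// Build one SSE frame. A browser EventSource would receive this as a named event.
function sseFrame(eventName: string, payload: unknown, id?: number): string {
  const lines: string[] = [];
  if (id !== undefined) lines.push(`id: ${id}`);
  lines.push(`event: ${eventName}`);
  // JSON.stringify without indentation yields a single line, so one data field is enough.
  lines.push(`data: ${JSON.stringify(payload)}`);
  return lines.join("\n") + "\n\n"; // the blank line terminates the frame
}

console.log(sseFrame("agent", { name: "sp-shipper", step: 4 }, 7));
```

Sending the log offset as the `id` would let a reconnecting tab resume from the last event it saw, which fits an append-only log, though whether the dashboard does this is not stated here.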
Budget and skill in the statusline
A one-line statusline renders the context-budget level, the durable memory count, the active skill and the model. The formatting is a pure function, unit-tested so the bar never lies.
What is in the box
Everything below is shipped today and covered by the test suite (88 tests across 11 files).
PreCompact session digest
A hook reconstructs the open task, decisions, files touched and next step from the dashboard event log, then writes one structured fact to memory before the window is trimmed. The next session reloads it first.
Signal-ranked recall
On session start, recall builds a task signal from the git branch, the files changed in the working tree and the last prompt, ranks memories against it and reloads only the relevant subset under a hard token ceiling. With no signal it loads nothing.
Hand-rolled MCP server
A small newline-delimited JSON-RPC stdio loop, no SDK dependency. The slice of the protocol in play (initialize, tools/list, tools/call) is small and stable. The request handler is a pure exported function so tests drive it without spawning a process.
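A sketch of what such a pure handler can look like, covering only the three methods named above. The single tool, the server name and the protocol version string are placeholders chosen for the example, not values taken from the plugin.

```typescript
interface RpcRequest {
  jsonrpc: "2.0";
  id?: number | string;
  method: string;
  params?: { name?: string; arguments?: Record<string, unknown> };
}
interface RpcResponse {
  jsonrpc: "2.0";
  id: number | string;
  result?: unknown;
  error?: { code: number; message: string };
}
interface Tool { description: string; run: (args: Record<string, unknown>) => string; }

// One hypothetical tool so the example is self-contained.
const tools: Record<string, Tool> = {
  sp_budget: { description: "Report a context-budget estimate", run: () => "ok" },
};

// Pure function: request in, response out. No process, no streams.
function handleRequest(req: RpcRequest): RpcResponse | null {
  if (req.id === undefined) return null; // notifications get no reply
  const id = req.id;
  switch (req.method) {
    case "initialize":
      return { jsonrpc: "2.0", id, result: {
        protocolVersion: "2024-11-05", // placeholder version for the sketch
        capabilities: { tools: {} },
        serverInfo: { name: "slipstream-sketch", version: "0.0.0" },
      } };
    case "tools/list":
      return { jsonrpc: "2.0", id, result: {
        tools: Object.keys(tools).map((name) => ({
          name,
          description: tools[name].description,
          inputSchema: { type: "object", properties: {} },
        })),
      } };
    case "tools/call": {
      const name = req.params?.name ?? "";
      if (!Object.prototype.hasOwnProperty.call(tools, name)) {
        return { jsonrpc: "2.0", id, error: { code: -32602, message: "Unknown tool: " + name } };
      }
      const text = tools[name].run(req.params?.arguments ?? {});
      return { jsonrpc: "2.0", id, result: { content: [{ type: "text", text }] } };
    }
    default:
      return { jsonrpc: "2.0", id, error: { code: -32601, message: "Method not found" } };
  }
}

// Newline-delimited framing: one JSON document per line in, one per line out.
function handleLine(line: string): string | null {
  const response = handleRequest(JSON.parse(line) as RpcRequest);
  return response === null ? null : JSON.stringify(response);
}
```

A stdio loop then only has to split stdin on newlines, pass each line to `handleLine`, and write any non-null result followed by a newline.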
Three shipped subagents
sp-shipper drives a site from scaffold to deployed across integration skills, refusing to advance past a red verification gate. sp-schema designs and migrates a Supabase schema with row-level security that denies by default. sp-reviewer is a pre-push guardrail with a hard FAIL verdict.
Terse output style
A bundled output style tuned for high-signal, low-token answers. Switch to it with /output-style slipstream to spend fewer tokens per turn without losing precision.
Local-only by construction
The dashboard binds 127.0.0.1 on a free port. Nothing leaves the machine. Obvious secrets are pattern-redacted before they reach the event log. No telemetry, no accounts, no hosted layer.
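Pattern redaction of this sort is usually a short list of regular expressions applied before the event is serialised. The patterns below are illustrative guesses at what "obvious secrets" might mean; the plugin's actual pattern list is not shown in this document.

```typescript
// Illustrative patterns only. Each pair is [pattern, replacement].
const SECRET_RULES: Array<[RegExp, string]> = [
  // Stripe-style secret keys
  [/\bsk_(?:live|test)_[A-Za-z0-9]{8,}\b/g, "[REDACTED]"],
  // JWT-shaped tokens: three base64url segments starting with "eyJ"
  [/\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/g, "[REDACTED]"],
  // NAME_KEY=value style assignments: keep the name, drop the value
  [/\b([A-Z0-9_]*(?:KEY|SECRET|TOKEN|PASSWORD)\s*=\s*)\S+/g, "$1[REDACTED]"],
];

function redact(text: string): string {
  let out = text;
  for (const [pattern, replacement] of SECRET_RULES) {
    out = out.replace(pattern, replacement);
  }
  return out;
}

console.log(redact("export SUPABASE_SERVICE_KEY=abc123 && deploy"));
```

A list like this only catches secrets that look like secrets, so it is a second line of defence rather than a guarantee.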
Append-only event log
Every lifecycle hook (SessionStart, UserPromptSubmit, PreToolUse, PostToolUse, SubagentStop, Stop, PreCompact) writes one JSON event to .claude/slipstream/dashboard/<session>.jsonl. State is a pure fold over the log, so replay is free.
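"State is a pure fold over the log" means the dashboard state can be rebuilt at any time by reducing the JSONL lines from the start. A minimal sketch, with an assumed event shape (the real event fields are not documented here):

```typescript
type Status = "running" | "waiting" | "done" | "failed";

// Hypothetical event shape; only the hook names come from the list above.
interface LogEvent { agent: string; hook: string; status?: Status; task?: string; }
interface AgentState { status: Status; task: string; events: number; }
type DashboardState = Record<string, AgentState>;

// Pure reducer: returns a new state, never mutates the old one.
function applyEvent(state: DashboardState, event: LogEvent): DashboardState {
  const prev: AgentState = Object.prototype.hasOwnProperty.call(state, event.agent)
    ? state[event.agent]
    : { status: "running", task: "", events: 0 };
  const next: AgentState = {
    status: event.status ?? prev.status,
    task: event.task ?? prev.task,
    events: prev.events + 1,
  };
  return { ...state, [event.agent]: next };
}

// Replay is the same fold, run over the whole file.
function replay(jsonl: string): DashboardState {
  return jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as LogEvent)
    .reduce(applyEvent, {} as DashboardState);
}

const log = [
  '{"agent":"main","hook":"SessionStart","status":"running","task":"ship site"}',
  '{"agent":"sp-shipper","hook":"PreToolUse","status":"running","task":"step 4/7"}',
  '{"agent":"sp-shipper","hook":"SubagentStop","status":"done"}',
].join("\n");
console.log(replay(log));
```

Because the reducer is pure, replaying the same file always yields the same state, which is what makes replay free.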
Doctor end to end
/slipstream:doctor checks the MCP server is built and declared, every hook is wired, the memory store is reachable, the helper CLI is built, the statusline, output style and subagents are present, and the plugin manifest is valid. PASS / FAIL per check.
Run it in any IDE
Two layers. The full plugin (skills, hooks, memory, lossless compaction, statusline, live dashboard) runs inside a coding-agent host that loads the plugin format. The MCP tools, the token-saving core, are standard Model Context Protocol and work in any MCP-capable editor.
In a plugin-capable agent host
You get the bundled MCP server, the slash commands, the 59 skills, every lifecycle hook, the live dashboard, the statusline, the terse output style and the three subagents. Node 20 or newer on your PATH.
# In your coding-agent host
/plugin marketplace add sarmakska/slipstream
/plugin install slipstream
# Then, in the project
/slipstream:map     # build the project map once
/slipstream:doctor  # verify the install end to end
/slipstream:status  # plan, budget, memory count, map
In Antigravity, Cursor, Windsurf and other MCP editors
These editors do not load the host's plugin format, so the skills, hooks, slash commands and dashboard are not available there. The nine sp_* tools are. Build the server once, then register the absolute path under your editor's mcpServers block.
# Any MCP-capable editor: Cursor, Windsurf, Antigravity, others
git clone https://github.com/sarmakska/slipstream
cd slipstream
pnpm install
pnpm build
# Register the server (paths vary by editor)
# Cursor: .cursor/mcp.json
# Windsurf: ~/.codeium/windsurf/mcp_config.json
# Antigravity: Settings -> MCP
{
  "mcpServers": {
    "slipstream": {
      "command": "node",
      "args": ["/absolute/path/to/slipstream/dist/mcp/index.js"]
    }
  }
}
Instead of whole-file reads
// What sp_map returns at orient time, JSON in chat
{
  "files": [
    { "path": "src/map/retrieve.ts",
      "purpose": "retrieve a symbol or line range from a file",
      "symbols": ["retrieveSymbol", "retrieveLines"] }
  ]
}
// What sp_symbol returns instead of opening the whole file
/** Walk braces from the declaration line to return one symbol slice. */
export function retrieveSymbol(file: string, symbol: string): string { /* ... */ }
How slipstream changes the shape of a session
Orient, pull, compact, reload. The hook writes a durable digest the instant before the trim, the next session reloads it first.
vs other context-saving approaches
Honest about the trade-offs. A hand-written instructions file is free but it rots. A summariser is automatic but it is lossy by design. Manual notes are precise but they evaporate the moment you stop typing them.
| Concern | slipstream | Hand-written instructions file | Summariser tool | Manual notes |
|---|---|---|---|---|
| Reads cost | Slice or symbol | Whole-file by default | Whole-file then summarise | Whole-file by default |
| Memory across sessions | Structured store + index | Single hand-written file | Lossy summary | Manual rewrite |
| Compaction safety | PreCompact digest, lossless | Loses what is not in the file | Lossy by design | Loses everything |
| Recall strategy | Signal-ranked, ~1,200 token cap | Always loaded, full file | Whatever the summariser kept | Whatever you remember to paste |
| Watching the agent | Live dashboard + replay | None | None | None |
| Statusline | Budget, mem, skill, model | None | None | None |
| Verification gates | 59 skills, each gated | None | None | Whatever you wrote |
| Data leaves the machine | Never | Never (it is just a file) | Depends on tool | Never |
| License | MIT | – | Varies | – |
Full comparison page in the wiki, with the design rationale behind each choice: Comparisons →
A guardrailed skill library
Fifty-nine skills under skills/, grouped by area. Each shipping skill carries a verification gate, a real check the agent must pass before advancing. The library targets the stack I actually ship on, not a universal scaffolder.
- frontend: tailwind, forms, router, dark-mode, responsive-layout, component-library
- backend: hono-api, zod-validation, error-handling, rate-limit, openapi
- supabase: init, schema, rls, auth, edge-function, storage, typegen
- cloudflare: worker, pages, d1, kv, r2, secrets
- vercel: link, env, preview, deploy
- resend: setup, domain, transactional, webhook
- auth: session, password-reset, oauth, rbac
- payments: stripe-setup, checkout, subscriptions, webhooks
- seo: meta-tags, open-graph, structured-data, sitemap
- analytics: plausible, web-vitals, events
- git: init-repo, feature-branch, conventional-commit, pull-request, release-tag
- memory + context: memory-capture, memory-recall, memory-prune, scoped-read, context-budget, compact-and-offload
Full catalogue: Skill-Catalogue wiki → . How the engine runs them → . Writing a skill →
/slipstream:doctor
A one-shot end-to-end install verifier. Fifteen checks, each PASS or FAIL with the exact reason. Run it after install, run it after upgrades, run it when something feels off.
The doctor walkthrough: Troubleshooting wiki →
Tech stack
Boring on purpose. The MCP path has zero runtime dependencies. The server path uses only node:http, no Express, no socket library. The event store is a JSONL file, not a database.
Frequently asked
More in the wiki FAQ →
Do I have to use the full plugin to get the token savings?
No. The MCP tools (sp_map, sp_symbol, sp_lines, sp_search, sp_remember, sp_recall, sp_forget, sp_budget, sp_mindmap) are standard Model Context Protocol. They register in any MCP-capable editor: Antigravity, Cursor, Windsurf, others. The skills, hooks, dashboard and lossless compaction need the host plugin layer; that is the trade-off.
Is the token budget accurate?
It is a conservative estimate from bytes-into-context at ~3.6 bytes per token, not the real internal counter. It is tuned to warn early and compact a little before it has to. The wording everywhere says "estimate". I would rather be honestly approximate and conservative than precise-looking and wrong.
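For a sense of how an estimate like this can map onto the ok / warn / compact levels, here is a small sketch. The thresholds, the window size parameter and the rounding are all assumptions made for the example; only the 3.6 bytes-per-token ratio and the three level names come from this page.

```typescript
type Level = "ok" | "warn" | "compact";

// Assumed thresholds: warn at 60% of the window, compact at 80%.
function budgetLevel(
  bytesInContext: number,
  windowTokens: number,
  warnAt = 0.6,
  compactAt = 0.8,
): { tokens: number; level: Level } {
  const tokens = Math.round((bytesInContext * 10) / 36); // bytes / 3.6
  const used = tokens / windowTokens;
  const level: Level = used >= compactAt ? "compact" : used >= warnAt ? "warn" : "ok";
  return { tokens, level };
}

console.log(budgetLevel(36_000, 200_000));   // small session
console.log(budgetLevel(648_000, 200_000));  // close to the window
```

Setting the thresholds below 100% is what lets an approximate estimate stay useful: it errs towards warning early.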
Does anything leave the machine?
No. The dashboard server binds 127.0.0.1 on a free port. The memory store is files in your project. The event log is a JSONL file. There is no telemetry, no accounts, no hosted layer. Obvious secrets are pattern-redacted before they reach the log, but treat redaction as belt-and-braces rather than a vault.
What happens during the long session?
Reads stay slice-sized because the agent is nudged to use sp_symbol and sp_lines. The PreToolUse hook warns before a large whole-file read. Just before the platform compacts, the PreCompact hook writes a structured digest of the open task, decisions and next step. The next session reloads that digest first and signal-ranks the rest.
Can the dashboard steer the agents?
No, by design. It observes, it does not control. It cannot pause a tool call or redirect a subagent. The honest framing is "a local observability dashboard for your session". If you want a control plane, that is a different product.
Why a hand-rolled MCP server instead of the SDK?
The slice of the protocol the agent loop drives is small and stable: initialize, tools/list, tools/call. A plugin that bundles a server should add as little as possible to the install. The cost is that I implement the framing myself; the benefit is zero runtime dependencies on the MCP path and a server I can audit in one file. The request handler is a pure exported function so tests drive it directly.
Does it work in Cursor, Windsurf, Antigravity?
The MCP tools do, fully. Build the server once, register the absolute path under your editor's mcpServers block, and the agent gains all nine sp_* tools. The full plugin layer (skills, hooks, dashboard, lossless compaction) needs a coding-agent host that loads the plugin format. Use the full host when you can; use the MCP path when you cannot.
Is it production-ready?
It is the runner I ship with. The 88 tests across 11 files cover dashboard event validity, the concurrency-safe append-only writer under 25 parallel writers, a real SSE server end to end, idempotent start, replay, the real MCP server spawned over stdio for tools/list and an sp_symbol call, the PreCompact digest building and reloading, signal-ranked recall returning only the relevant subset within budget, the pinned statusline string, and doctor running against both the real tree and a deliberately broken one.
Ready to ship a sane long session?
Star the repo, drop it into your agent host, build the map once and run doctor. The next long session stops bleeding tokens and keeps its thread.