One proxy. Claude Code, Goose, or any AI agent — Anthropic or OpenAI API format. Slash costs by 90% without changing how you work.
Any agent. Any API format. One binary. Real savings.
Claude Code, Goose, Aider, or anything that talks to an LLM API. One proxy gives every agent compression, caching, memory, smart routing, and cost tracking — automatically.
aismush-start
Install AISmush, start it, point your agent at localhost:1849. That's it. Setup guide →
Whether you use Claude Code, Goose, or any other agent, a heavy coding session burns $20-50 in API costs. Most tokens go to mechanical work — reading files, processing tool results, making simple edits — that doesn't need frontier-model pricing.
AISmush sits between your agent and the API. It doesn't care which agent you're using. It just makes every session cheaper.
One command scans your codebase, sends it to AI for deep analysis, and generates Claude Code agents customized to YOUR project — your patterns, your frameworks, your architecture.
Not generic templates. Agents that know your specific file structure, your naming conventions, your test framework, your build commands.
AISmush automatically detects what kind of work each turn requires and routes it to the cheapest model that can handle it.
Planning and architecture? Claude ($15/M). Reading files and making edits? DeepSeek ($0.27/M). That's a 55x cost difference on the turns that matter most.
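To make the routing idea concrete, here is a minimal sketch of tiered model selection. The keyword heuristic and price table are illustrative assumptions, not AISmush's actual classifier or internals:

```python
# Sketch of tiered routing: expensive model for planning, cheap model for
# mechanical work. The keyword list and prices are assumptions for illustration.
PRICE_PER_M = {"claude": 15.00, "deepseek": 0.27}  # $ per 1M input tokens

def pick_model(turn: str) -> str:
    """Route planning/architecture turns to Claude, mechanical turns to DeepSeek."""
    heavy = ("plan", "architect", "design", "strategy")
    if any(keyword in turn.lower() for keyword in heavy):
        return "claude"
    return "deepseek"  # file reads, edits, tool-result processing

def cost(model: str, tokens: int) -> float:
    """Dollar cost of `tokens` input tokens on `model`."""
    return PRICE_PER_M[model] * tokens / 1_000_000

print(pick_model("Plan the migration to the new schema"))  # claude
print(pick_model("Read src/main.rs and fix the typo"))     # deepseek
print(round(PRICE_PER_M["claude"] / PRICE_PER_M["deepseek"], 1))  # the ~55x gap
```

The point of the sketch: a million tokens of file-reading costs $0.27 instead of $15.00 when the router sends it to the cheap tier.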
The single biggest source of token savings. Older tool results in your conversation get replaced with compact structural summaries: just function signatures, type definitions, and imports.
Your last 4 messages stay fully intact. Only older code results get summarized. JSON, YAML, and error results are never touched.
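The rules above can be sketched in a few lines. This is an illustration of "structural summary" compression, not AISmush's real summarizer; the regex and message schema are assumptions:

```python
import re

# Keep only imports, function signatures, and class/type definitions from an
# old code tool result. AISmush's real summarizer is more sophisticated.
KEEP = re.compile(r"^\s*(import |from \S+ import |def |class |type \w+|struct \w+)")

def summarize_code(tool_result: str) -> str:
    kept = [line for line in tool_result.splitlines() if KEEP.match(line)]
    return "\n".join(kept) or "# (no structural lines)"

def compress_history(messages: list[dict], keep_last: int = 4) -> list[dict]:
    """Leave the newest `keep_last` messages intact; summarize older code
    results. Per the rules above, only kind == "code" is touched, so
    JSON, YAML, and error results pass through unchanged."""
    out = []
    for i, msg in enumerate(messages):
        is_old = i < len(messages) - keep_last
        if is_old and msg.get("kind") == "code":
            out.append({**msg, "content": summarize_code(msg["content"])})
        else:
            out.append(msg)
    return out
```

A 400-line file read collapses to a handful of signature lines once it scrolls out of the recent window, while the turns you are actively working with stay verbatim.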
Every developer's frustration: "I already told you this yesterday."
Other tools remember tool names. AISmush captures entire conversations — your questions, the AI's answers, the reasoning, the decisions. Searchable by meaning, not just keywords.
With embeddings enabled (--embeddings), search works by meaning: it finds "JWT validation" when you search "auth bug".

Claude handles 200K tokens. DeepSeek handles 64K. Long sessions blow past DeepSeek's limit, causing failures and lost work.
AISmush automatically manages the mismatch. Old tool results get trimmed, large contexts route to Claude, and your work is never blocked.
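A simplified sketch of that fallback logic, using the 200K/64K limits from the text; the trim-then-route behavior here is an assumed simplification of whatever AISmush actually does:

```python
# Context-limit handling: prefer the cheap model, fall back to the big
# window instead of failing. Limits are from the text; logic is illustrative.
CLAUDE_LIMIT = 200_000
DEEPSEEK_LIMIT = 64_000

def route_for_context(token_count: int) -> str:
    """Pick a provider whose context window fits the current session."""
    if token_count > CLAUDE_LIMIT:
        # Even Claude can't fit this; older tool results must be trimmed first.
        raise ValueError("context exceeds all providers; trim older tool results")
    if token_count > DEEPSEEK_LIMIT:
        return "claude"   # too big for DeepSeek's 64K window; route up
    return "deepseek"

print(route_for_context(30_000))   # deepseek
print(route_for_context(120_000))  # claude
```

The key behavior: a session crossing 64K tokens silently moves to the larger window rather than erroring out mid-task.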
See exactly what you're saving. Every request tracked: which provider, how many tokens, what it cost, what it would have cost on Claude alone.
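A per-request ledger in the spirit of that tracking might look like the following. The real dashboard's schema is unknown, so the fields and price table here are assumptions:

```python
from dataclasses import dataclass, field

# Illustrative cost ledger: record each request's actual cost next to the
# Claude-only baseline, so savings are just the running difference.
PRICE = {"claude": 15.00, "deepseek": 0.27}  # $ per 1M input tokens

@dataclass
class Ledger:
    rows: list = field(default_factory=list)

    def record(self, provider: str, tokens: int) -> None:
        actual = PRICE[provider] * tokens / 1e6
        baseline = PRICE["claude"] * tokens / 1e6  # what Claude alone would cost
        self.rows.append((provider, tokens, actual, baseline))

    def savings(self) -> float:
        return sum(baseline - actual for _, _, actual, baseline in self.rows)

ledger = Ledger()
ledger.record("deepseek", 500_000)  # routed cheap
ledger.record("claude", 50_000)     # routed expensive (baseline == actual)
print(f"saved ${ledger.savings():.2f}")
```

Because the baseline is computed per request, the dashboard can show savings honestly even when some turns genuinely need the expensive model.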
Ask Claude to make a plan, then say "run plan". AISmush analyzes every step, maps each one to the best specialized agent, figures out what can run in parallel, and executes the entire thing autonomously.
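The parallelism part of that pipeline can be sketched as dependency-wave scheduling. The step names, agent names, and "needs" field below are hypothetical; this shows the scheduling idea, not AISmush's internals:

```python
# Hypothetical plan: steps mapped to specialized agents, with dependencies.
plan = [
    {"id": "schema", "agent": "db-agent",   "needs": []},
    {"id": "api",    "agent": "api-agent",  "needs": ["schema"]},
    {"id": "ui",     "agent": "ui-agent",   "needs": ["schema"]},
    {"id": "tests",  "agent": "test-agent", "needs": ["api", "ui"]},
]

def parallel_waves(steps: list[dict]) -> list[list[str]]:
    """Group steps into waves; every step in a wave can run concurrently
    because all of its dependencies finished in earlier waves."""
    done, waves = set(), []
    pending = {s["id"]: s for s in steps}
    while pending:
        wave = [sid for sid, s in pending.items() if set(s["needs"]) <= done]
        if not wave:
            raise ValueError("cycle in plan dependencies")
        waves.append(sorted(wave))
        done.update(wave)
        for sid in wave:
            del pending[sid]
    return waves

print(parallel_waves(plan))  # [['schema'], ['api', 'ui'], ['tests']]
```

Here the api and ui steps share no dependency on each other, so they land in the same wave and run side by side.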
One command. ~15MB RAM. Won't slow your machine.
aismush --scan generates agents for your project.
aismush-start launches Claude Code. You save 90%.
Same proxy, same savings — choose the setup that fits your workflow.
Routes between Claude + DeepSeek. Max savings (~90%). Needs a free DeepSeek API key.
aismush-start
No DeepSeek needed. Still get compression, memory, agents, and tracking. Dashboard shows potential savings.
aismush-start --direct
Start AISmush, then point Goose at the proxy. All routing, compression, and memory features apply automatically.
aismush --daemon
ANTHROPIC_BASE_URL=http://localhost:1849 goose
One command on any platform. Works with or without a DeepSeek key.
Linux & macOS:
Windows (PowerShell):
Response-path SSE conversion for complete OpenAI mode compatibility with streaming agents.
Shared savings dashboard for engineering teams. Track ROI and usage across all developers.
First-class integrations for Aider, Continue.dev, Cursor, and other emerging AI coding agents.