One proxy. Claude Code, Goose, or any AI agent — Anthropic or OpenAI API format. Slash costs by 90% without changing how you work.
Any agent. Any API format. One binary. Real savings.
Claude Code, Goose, Aider, or anything that talks to an LLM API. One proxy gives every agent compression, caching, memory, smart routing, and cost tracking — automatically.
Install AISmush, start it, point your agent at localhost:1849. That's it. Setup guide →

aismush-start
Whether you use Claude Code, Goose, or any other agent, a heavy coding session burns $20-50 in API costs. Most tokens go to mechanical work — reading files, processing tool results, making simple edits — that doesn't need frontier-model pricing.
AISmush sits between your agent and the API. It doesn't care which agent you're using. It just makes every session cheaper.
One command scans your codebase, sends it to AI for deep analysis, and generates agents customized to YOUR project — your patterns, your frameworks, your architecture.
Not generic templates. Agents that know your specific file structure, your naming conventions, your test framework, your build commands.
aismush --scan now auto-installs 21 production-grade engineering workflow skills from Addy Osmani's agent-skills library alongside your project-specific agents.
Osmani is a well-known engineering leader at Google. These skills encode the entire professional development lifecycle — from first idea to production ship — as reusable, agent-ready workflows.
AISmush automatically detects what kind of work each turn requires and routes it to the cheapest model that can handle it.
Planning and architecture? Claude ($15/M). Reading files and making edits? DeepSeek ($0.27/M). That's a 55x cost difference on the turns that matter most.
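The routing idea can be sketched in a few lines. This is an illustration only, not AISmush's actual code: the heuristic keywords and function names are assumptions, while the per-million-token prices come from the figures above.

```python
# Illustrative sketch of cost-based routing (not AISmush's real implementation).
# Prices are per million input tokens, taken from the text above.
PRICE_PER_M = {"claude": 15.00, "deepseek": 0.27}

def classify_turn(user_message: str) -> str:
    """Crude heuristic: planning/architecture work goes to the frontier model."""
    heavy = ("plan", "architecture", "design")
    if any(word in user_message.lower() for word in heavy):
        return "claude"
    return "deepseek"  # file reads, edits, tool-result processing

def cost(model: str, tokens: int) -> float:
    """Dollar cost of sending `tokens` input tokens to `model`."""
    return PRICE_PER_M[model] * tokens / 1_000_000

print(classify_turn("Plan the migration to the new schema"))  # claude
print(classify_turn("Read src/main.rs and fix the typo"))     # deepseek
print(round(PRICE_PER_M["claude"] / PRICE_PER_M["deepseek"], 1))  # 55.6
```

The 55x figure falls directly out of the price table: $15.00 / $0.27 ≈ 55.6.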
The single biggest token saving: older tool results in your conversation get replaced with compact structural summaries containing just function signatures, type definitions, and imports.
Your last 4 messages stay fully intact. Only older code results get summarized. JSON, YAML, and error results are never touched.
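A toy version of that structural summarization, assuming a much simpler line filter than whatever AISmush actually uses, shows why the savings are large: bodies vanish, structure survives.

```python
# Toy structural summarizer: keep imports, function signatures, and class
# definitions; drop everything else. AISmush's real summarizer is assumed
# to be more sophisticated -- this only demonstrates the idea.
import re

def summarize_code(tool_result: str) -> str:
    keep = re.compile(r"^\s*(import |from |def |class )")
    lines = [l for l in tool_result.splitlines() if keep.match(l)]
    return "\n".join(lines)

old_result = """\
import json

def load_config(path):
    with open(path) as f:
        return json.load(f)

class Config:
    def __init__(self, data):
        self.data = data
"""
print(summarize_code(old_result))
```

The function bodies (the bulk of the tokens) disappear, while the model can still see what exists and how to call it.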
Every developer's frustration: "I already told you this yesterday."
Other tools remember tool names. AISmush captures entire conversations — your questions, the AI's answers, the reasoning, the decisions. Searchable by meaning, not just keywords.
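"Searchable by meaning" usually means embedding vectors compared by cosine similarity. The vectors below are made up purely to show why an auth-related query can match a JWT memory with zero shared keywords; a real system gets them from a learned embedding model.

```python
# Meaning-based recall over toy embedding vectors (hand-made for illustration;
# real embeddings come from a model, not from these hard-coded numbers).
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Pretend 3-dim embeddings: [auth-ness, ui-ness, build-ness]
memory = {
    "JWT validation fails on expired tokens": [0.9, 0.1, 0.0],
    "CSS grid layout breaks on Safari":       [0.0, 0.9, 0.1],
}
query = [0.8, 0.0, 0.1]  # pretend embedding of "auth bug"

best = max(memory, key=lambda text: cosine(query, memory[text]))
print(best)  # the JWT entry, despite sharing no keywords with the query
```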
Semantic search (--embeddings) finds "JWT validation" when you search "auth bug".

Claude handles 200K tokens. DeepSeek handles 64K. Long sessions blow past DeepSeek's limit, causing failures and lost work.
AISmush automatically manages the mismatch. Old tool results get trimmed, large contexts route to Claude, and your work is never blocked.
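The fallback logic can be sketched as a single decision, using the 200K/64K limits quoted above; the function name and the "prefer the cheap model" policy are assumptions about how such a router might look, not AISmush's actual code.

```python
# Sketch of limit-aware routing. Window sizes come from the text
# (Claude 200K, DeepSeek 64K); everything else is hypothetical.
LIMITS = {"claude": 200_000, "deepseek": 64_000}

def pick_provider(context_tokens: int, preferred: str = "deepseek") -> str:
    """Use the cheap model while the context fits; fall back to Claude."""
    if context_tokens <= LIMITS[preferred]:
        return preferred
    return "claude"

print(pick_provider(30_000))   # deepseek
print(pick_provider(120_000))  # claude: past DeepSeek's 64K window
```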
See exactly what you're saving. Every request tracked: which provider, how many tokens, what it cost, what it would have cost at full frontier-model pricing.
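A minimal cost ledger in the spirit of that dashboard might look like this. The record fields are assumptions; the prices are the per-million-token figures quoted earlier, and "frontier cost" is what the same tokens would have cost at Claude pricing.

```python
# Hypothetical cost ledger: actual spend vs. what full frontier-model
# pricing would have cost. Field names are illustrative assumptions.
PRICE_PER_M = {"claude": 15.00, "deepseek": 0.27}

def record(ledger, provider, tokens):
    actual = PRICE_PER_M[provider] * tokens / 1e6
    frontier = PRICE_PER_M["claude"] * tokens / 1e6  # Claude-equivalent cost
    ledger.append({"provider": provider, "tokens": tokens,
                   "cost": actual, "frontier_cost": frontier})

ledger = []
record(ledger, "deepseek", 500_000)  # routine turns on the cheap model
record(ledger, "claude", 50_000)     # planning turns on the frontier model
spent = sum(e["cost"] for e in ledger)
baseline = sum(e["frontier_cost"] for e in ledger)
print(f"spent ${spent:.2f} vs ${baseline:.2f} at frontier pricing "
      f"({1 - spent / baseline:.0%} saved)")
```

With this made-up traffic mix, the savings land right around the ~90% the page claims.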
Ask your agent to make a plan, then say "run plan". AISmush analyzes every step, maps each one to the best specialized agent, figures out what can run in parallel, and executes the entire thing autonomously.
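"Figures out what can run in parallel" is classically a dependency-graph problem: group steps into waves where every step's prerequisites are already done. A hypothetical sketch of that scheduling step, with made-up plan steps:

```python
# Hypothetical plan scheduler: group steps into parallel waves based on
# their dependencies (not AISmush's actual implementation).
def waves(deps):
    """deps maps step -> set of prerequisite steps. Returns a list of waves;
    all steps within a wave can run in parallel."""
    done, out = set(), []
    while len(done) < len(deps):
        ready = [s for s in deps if s not in done and deps[s] <= done]
        if not ready:
            raise ValueError("dependency cycle in plan")
        out.append(sorted(ready))
        done.update(ready)
    return out

plan = {
    "scan repo": set(),
    "write tests": {"scan repo"},
    "refactor auth": {"scan repo"},
    "run test suite": {"write tests", "refactor auth"},
}
print(waves(plan))
# [['scan repo'], ['refactor auth', 'write tests'], ['run test suite']]
```

Here the two middle steps share no dependency on each other, so they run in parallel; the test suite waits for both.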
One command. ~15MB RAM. Won't slow your machine.
aismush --scan generates project agents + 21 engineering skills.
Point any agent at localhost:1849 and start saving 90%.
Same proxy, same savings — choose the setup that fits your workflow.
Routes between Claude + DeepSeek. Max savings (~90%). Needs a free DeepSeek API key.
aismush-start
No DeepSeek needed. Still get compression, memory, agents, and tracking. Dashboard shows potential savings.
aismush-start --direct
Start AISmush, then point any agent at the proxy. All routing, compression, and memory features apply automatically.
aismush --daemon
ANTHROPIC_BASE_URL=http://localhost:1849 goose
One command on any platform. Works with or without a DeepSeek key.
Linux & macOS:
Windows (PowerShell):