AISmush Documentation

Everything you need to know about setting up and using AISmush.


Installation

Linux / macOS

curl -fsSL https://raw.githubusercontent.com/Skunk-Tech/aismush/main/install.sh | bash

Windows — Package Managers (Recommended)

# Scoop
scoop bucket add aismush https://github.com/Skunk-Tech/aismush
scoop install aismush

# winget
winget install SkunkTech.AISmush

Windows — PowerShell Script

irm https://raw.githubusercontent.com/Skunk-Tech/aismush/main/install.ps1 | iex

From Source

git clone https://github.com/Skunk-Tech/aismush.git
cd aismush
cargo build --release
mkdir -p ~/.local/bin
cp target/release/aismush ~/.local/bin/

Supports: Linux x86_64, macOS (Intel + Apple Silicon), Windows x86_64.

Provider Setup

After installing, run the interactive setup to configure your providers:

aismush --setup

This walks you through three provider types with connection testing for each:

1. DeepSeek (Smart Routing)

Routes mechanical tasks (tool results, file reads, simple edits) to DeepSeek at $0.27/M tokens instead of Claude's $3-15/M. Free tier available at platform.deepseek.com.

2. OpenRouter (290+ Models)

Single API key for GPT-4o, Llama, Mistral, Gemini, and hundreds more. Get a key at openrouter.ai/keys.

3. Local Models (Free)

AISmush auto-discovers local model servers running on known ports. Supported servers:

| Server | Default Port | Auto-Detected |
|---|---|---|
| Ollama | 11434 | Yes |
| LM Studio | 1234 | Yes |
| llama.cpp | 8080 | Yes |
| vLLM | 8000 | Yes |
| Jan | 1337 | Yes |
| text-generation-webui | 5000 | Yes |
| KoboldCpp | 5001 | Yes |
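
The auto-discovery described above can be pictured as a probe over the default ports in the table. This is a minimal sketch, not AISmush's actual detection logic: `KNOWN_SERVERS`, `port_open`, and `discover` are hypothetical names, and treating a successful TCP connect as "server is running" is an assumption.

```python
import socket

# Default ports taken from the table above.
KNOWN_SERVERS = {
    "Ollama": 11434,
    "LM Studio": 1234,
    "llama.cpp": 8080,
    "vLLM": 8000,
    "Jan": 1337,
    "text-generation-webui": 5000,
    "KoboldCpp": 5001,
}

def port_open(port, host="127.0.0.1", timeout=0.25):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def discover(probe=port_open):
    """Return {server name: port} for every known server that answers the probe."""
    return {name: port for name, port in KNOWN_SERVERS.items() if probe(port)}
```

Passing the probe in as a parameter keeps the lookup logic testable without any server running.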

You don't need all three. Any single provider works. Or use --direct mode with just Claude — you still get compression, memory, agents, and cost tracking.

CLI Commands

Running the Proxy

| Command | Description |
|---|---|
| aismush-start | Start proxy + launch Claude Code in one command (recommended) |
| aismush-start --direct | Claude-only mode: no DeepSeek key needed, full compression active |
| aismush | Start the proxy server only (use when running Claude Code separately) |
| aismush --direct | Start proxy in Claude-only mode |

Setup & Configuration

| Command | Description |
|---|---|
| aismush --setup | Interactive provider configuration; tests each connection before saving |
| aismush --providers | List all configured and auto-discovered providers with health status |
| aismush --config | Show current configuration (keys, ports, thresholds) |
| aismush --scan | Scan codebase and generate project-specific agents, skills, and CLAUDE.md |

Tools

| Command | Description |
|---|---|
| aismush --search "query" | Search past conversations by meaning (semantic search) |
| aismush --embeddings | Start with 90MB semantic search model loaded (opt-in for memory) |
| aismush --status | Check if proxy is running, show quick stats |
| aismush --version | Show version number |
| aismush --help | Show all available commands |

Maintenance

| Command | Description |
|---|---|
| aismush --upgrade | Download and install the latest version |
| aismush --uninstall | Remove AISmush completely (optionally delete data) |

Running Modes

Smart Routing (Default)

Routes each turn to the cheapest model that can handle it. Requires at least one secondary provider (DeepSeek, OpenRouter, or local model).

aismush-start

Local + Cloud

If you have Ollama or another local server running, AISmush auto-detects it and routes Free-tier tasks (tool results, file reads, simple edits) there. Cloud providers handle the rest.

# Start Ollama, then:
aismush-start

Direct Mode (Claude Only)

No secondary provider needed. You still get full compression (file caching, command patterns, structural summaries), memory, agents, and cost tracking.

aismush-start --direct

Supported Providers

| Provider | Tier | Pricing (per M tokens) | Use Case |
|---|---|---|---|
| Claude Opus | Ultra | $15 in / $75 out | Most complex reasoning |
| Claude Sonnet | Premium | $3 in / $15 out | Planning, debugging, architecture |
| Claude Haiku | Premium | $0.80 in / $4 out | Fast responses |
| DeepSeek | Mid | $0.27 in / $1.10 out | Code generation, tool processing |
| OpenRouter models | Varies | Varies by model | Access to 290+ models |
| Local models | Free | $0 / $0 | Tool results, file reads, simple edits |
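
To make the pricing concrete, here is a small worked calculation. The prices come from the table above; the token counts are made up for illustration, and `PRICES` and `turn_cost` are hypothetical names, not part of AISmush.

```python
# (input, output) price in USD per million tokens, from the table above.
PRICES = {
    "claude-opus": (15.00, 75.00),
    "claude-sonnet": (3.00, 15.00),
    "claude-haiku": (0.80, 4.00),
    "deepseek": (0.27, 1.10),
    "local": (0.00, 0.00),
}

def turn_cost(provider, input_tokens, output_tokens):
    """Cost in USD of one turn on the given provider."""
    price_in, price_out = PRICES[provider]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000

# Hypothetical turn: 50,000 input tokens and 2,000 output tokens.
for name in PRICES:
    print(f"{name:14s} ${turn_cost(name, 50_000, 2_000):.4f}")
```

Under these assumed token counts, the same turn costs $0.18 on Claude Sonnet and about $0.016 on DeepSeek.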

Smart Routing

AISmush uses multi-factor routing to pick the right provider for each turn:

Task Classification

| Task Type | Minimum Tier | How It's Detected |
|---|---|---|
| Planning / Architecture | Premium (Claude) | First messages, "plan"/"design"/"refactor" keywords |
| Debugging | Mid | 3+ recent errors, "fix"/"bug"/"debug" keywords |
| Code Generation | Mid | Mid-session with tool history |
| Tool Results | Free | Message is purely tool_result blocks |
| File Reads | Free | "read"/"show me" keywords |
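
The signals in the table can be sketched as a simple rule-based classifier. This is an illustration only: the function name, its parameters, the order in which rules are checked, and the plain substring matching are all assumptions, not AISmush's actual implementation.

```python
PLANNING_WORDS = ("plan", "design", "refactor")
DEBUG_WORDS = ("fix", "bug", "debug")
READ_WORDS = ("read", "show me")

def classify(text, is_first_message=False, recent_errors=0, only_tool_results=False):
    """Return (task type, minimum tier) using the signals from the table above."""
    lowered = text.lower()
    if only_tool_results:
        return ("tool_results", "free")
    if is_first_message or any(w in lowered for w in PLANNING_WORDS):
        return ("planning", "premium")
    if recent_errors >= 3 or any(w in lowered for w in DEBUG_WORDS):
        return ("debugging", "mid")
    if any(w in lowered for w in READ_WORDS):
        return ("file_read", "free")
    # Assumed default: anything else mid-session counts as code generation.
    return ("code_generation", "mid")
```

Substring matching is deliberately naive here (for example, "plan" also matches "explanation"); a real classifier would need word boundaries or better signals.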

Blast-Radius Analysis

AISmush parses your project's import graph to understand which files are critical. Editing a type definition that 12 other files import? That gets routed to Claude. Editing a leaf test file? Local model handles it free.

| Blast Radius Score | Routing Override |
|---|---|
| > 0.7 (high impact) | Force Premium (Claude) |
| 0.4 - 0.7 (moderate) | Force Mid (DeepSeek) |
| < 0.4 (low impact) | Allow Free (local model) |
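
The thresholds above map directly onto a small function. The score formula below (fraction of other files that directly import the target) is purely an assumption for illustration; the documentation does not say how AISmush computes the score. `blast_radius` and `routing_override` are hypothetical names.

```python
def blast_radius(target, imports):
    """Assumed score: fraction of the other files that directly import `target`.

    imports maps each file to the set of files it imports.
    """
    others = [f for f in imports if f != target]
    if not others:
        return 0.0
    dependents = sum(1 for f in others if target in imports[f])
    return dependents / len(others)

def routing_override(score):
    """Apply the thresholds from the table above."""
    if score > 0.7:
        return "premium"
    if score >= 0.4:
        return "mid"
    return "free"

# Tiny hypothetical project: two files import types.rs, nothing imports the test file.
project = {
    "types.rs": set(),
    "a.rs": {"types.rs"},
    "b.rs": {"types.rs"},
    "a_test.rs": {"a.rs"},
}
print(routing_override(blast_radius("types.rs", project)))
print(routing_override(blast_radius("a_test.rs", project)))
```

In this toy project the type definition scores 2/3 and is forced to the Mid tier, while the leaf test file scores 0 and is allowed on a free local model.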

Compression

AISmush compresses context in several layers. All compression is active in every mode, including Claude-only direct mode.

Layer 1: File Caching

Claude Code reads the same files repeatedly. AISmush caches file content hashes and replaces unchanged re-reads with a compact marker.
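
A hash-based read cache of this kind can be sketched in a few lines. The class name, the choice of SHA-256, and the marker text are assumptions for illustration; the documentation does not specify what AISmush uses.

```python
import hashlib

class FileReadCache:
    """Replace unchanged re-reads of a file with a short marker."""

    def __init__(self):
        self._seen = {}  # path -> hex digest of the content last passed through

    def filter(self, path, content):
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        if self._seen.get(path) == digest:
            # Same bytes as last time: send a compact marker instead.
            return f"[unchanged since last read: {path}]"
        self._seen[path] = digest
        return content
```

The first read and any read after an edit pass through in full; only byte-identical repeats are replaced.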

Layer 2: Command-Specific Patterns

CLI output from Bash tool results gets compressed with command-aware patterns:

| Command | What's Kept | What's Stripped | Savings |
|---|---|---|---|
| cargo test | Pass/fail summary, error details | Individual "ok" lines, build output | ~95% |
| cargo build | Errors, warnings, finish line | "Compiling" lines, download progress | ~90% |
| git status | Branch, file list by status | Hint text, section headers | ~80% |
| git diff | File names, hunks, changed lines | Headers, index lines | ~60% |
| git log | Short hash, message, date | Author, decorations, full hash | ~70% |
| npm/yarn | Errors, audit summary | Package details, progress | ~85% |
| docker | Names, status, errors | SHA digests, build progress | ~80% |
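
As an illustration of the `cargo test` row, a line filter along these lines would drop passing tests and build noise while keeping failures and the summary. The regular expressions and the function name are assumptions; AISmush's real patterns are not documented here.

```python
import re

# Lines like "test module::name ... ok" (passing tests).
OK_LINE = re.compile(r"^test .+ \.\.\. ok$")
# Build and download progress lines emitted before the tests run.
BUILD_LINE = re.compile(r"^\s*(Compiling|Downloading|Downloaded) ")

def compress_cargo_test(output):
    """Keep failures and summaries; strip passing-test and build lines."""
    kept = [
        line for line in output.splitlines()
        if line.strip() and not OK_LINE.match(line) and not BUILD_LINE.match(line)
    ]
    return "\n".join(kept)
```

The savings depend entirely on how many tests pass: a run with thousands of "ok" lines shrinks to a handful of lines, while a run that is mostly failures barely shrinks.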

Layer 3: Structural Summarization

Older tool results (beyond the last 4 messages) get replaced with structural summaries — just function signatures, type definitions, and imports. Recent work stays fully intact.
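
One rough way to picture this layer: keep only lines that look like imports, type definitions, or function signatures, and apply that to everything except the most recent results. The regular expression, the function names, and the simplification of counting tool results rather than messages are all assumptions made for this sketch.

```python
import re

# Lines that start a Rust- or Python-style import, type, or function definition.
SIGNATURE = re.compile(
    r"^\s*(use |import |from |pub |fn |struct |enum |trait |type |def |class )"
)

def structural_summary(source):
    """Keep only import, type, and signature lines."""
    return "\n".join(
        line.rstrip() for line in source.splitlines() if SIGNATURE.match(line)
    )

def summarize_old(tool_results, keep_recent=4):
    """Summarize every result except the last `keep_recent`, which stay intact."""
    cutoff = max(len(tool_results) - keep_recent, 0)
    return [
        structural_summary(result) if index < cutoff else result
        for index, result in enumerate(tool_results)
    ]
```

A production version would more likely use a real parser than a line regex, since signatures can span several lines.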

Layer 4: Content-Type Compression

Project Agents

aismush --scan

Scans your codebase, sends it to AI for deep analysis, and generates Claude Code agents customized to your project. Not generic templates — agents that know your file structure, naming conventions, test framework, and build commands.

Plan Orchestrator

Ask Claude to make a plan, then say "run plan". AISmush builds a dependency graph, maps each step to a specialized agent, and executes with maximum parallelism.

How It Works

  1. Create a plan using Claude (it uses EnterPlanMode naturally)
  2. Say "run plan" or "execute plan"
  3. AISmush parses the steps and builds a dependency graph
  4. Shows the execution plan and asks for confirmation
  5. Launches agents in parallel — steps unblock individually as their dependencies complete
  6. Runs verification (cargo test, etc.) after completion

DAG-based execution: if Step 3 depends only on Step 1, it starts the moment Step 1 finishes, without waiting for the unrelated Step 2. Steps are assumed independent unless their content explicitly indicates a dependency.
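
The scheduling rule can be shown with a small simulation that computes when each step could start, given its dependencies. This is a sketch of the general technique using Python's standard `graphlib`, not AISmush's scheduler; `start_times`, the durations, and the dependency map are hypothetical.

```python
from graphlib import TopologicalSorter

def start_times(durations, deps):
    """Earliest start time of each step with unlimited parallelism.

    durations: {step: seconds}; deps: {step: set of prerequisite steps}.
    Every prerequisite must also appear in durations.
    """
    graph = {step: deps.get(step, set()) for step in durations}
    start, finish = {}, {}
    # static_order() yields each step only after all of its prerequisites.
    for step in TopologicalSorter(graph).static_order():
        start[step] = max((finish[d] for d in deps.get(step, ())), default=0)
        finish[step] = start[step] + durations[step]
    return start

# Step 3 depends on Step 1 only; Step 2 is unrelated and slow.
print(start_times({1: 10, 2: 60, 3: 5}, {3: {1}}))
```

Here Step 3 starts at second 10, as soon as Step 1 finishes, even though Step 2 runs until second 60.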

Dashboard

Live at http://localhost:1849/dashboard while the proxy is running.

Configuration

AISmush reads config from (in priority order):

  1. Environment variables (highest priority)
  2. config.json or .deepseek-proxy.json in the current directory
  3. ~/.hybrid-proxy/config.json
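
The priority order above amounts to a layered merge in which later layers overwrite earlier ones. In this sketch the three sources are passed in already parsed; the `ENV_TO_KEY` mapping between environment variables and config keys is an assumption inferred from the names on this page, and `resolve` is a hypothetical helper.

```python
# Assumed mapping from environment variable to config-file key.
ENV_TO_KEY = {
    "DEEPSEEK_API_KEY": "apiKey",
    "OPENROUTER_API_KEY": "openrouterKey",
    "PROXY_PORT": "port",
}

def resolve(env, project_file, home_file):
    """Merge config sources; env beats the project file, which beats the home file."""
    merged = {"port": 1849}  # documented default port
    # Apply lowest priority first so higher-priority layers overwrite it.
    for layer in (home_file, project_file):
        merged.update(layer)
    for var, key in ENV_TO_KEY.items():
        if var in env:
            merged[key] = int(env[var]) if key == "port" else env[var]
    return merged
```

For example, `resolve({"PROXY_PORT": "9000"}, {"port": 2000}, {})` yields port 9000, because the environment variable wins over the project-level file.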

Config File Format

{
  "apiKey": "sk-your-deepseek-key",
  "openrouterKey": "sk-or-your-openrouter-key",
  "local": [
    {"name": "ollama", "url": "http://localhost:11434", "model": "qwen3:8b"}
  ],
  "routing": {
    "blastRadiusThreshold": 0.5,
    "preferLocal": true,
    "minTierForPlanning": "premium",
    "minTierForDebugging": "mid"
  },
  "port": 1849,
  "verbose": false
}

Environment Variables

| Variable | Default | Description |
|---|---|---|
| DEEPSEEK_API_KEY | (none) | DeepSeek API key for smart routing |
| OPENROUTER_API_KEY | (none) | OpenRouter API key for 290+ models |
| LOCAL_MODEL_URL | (none) | Local model server URL |
| LOCAL_MODEL_NAME | (none) | Local model name (e.g. qwen3:8b) |
| PROXY_PORT | 1849 | Port for the proxy server |
| FORCE_PROVIDER | (none) | Force all requests to a specific provider |
| PROXY_VERBOSE | false | Enable debug logging |
| AISMUSH_BLAST_THRESHOLD | 0.5 | Blast-radius score for tier escalation |
| AISMUSH_AUTO_DISCOVER | true | Auto-detect local model servers |
| AISMUSH_EMBEDDINGS | 0 | Load semantic search model on startup |

Claude's API key: You don't configure this — Claude Code sends its own authentication headers and AISmush passes them through transparently.

API Endpoints

Available while the proxy is running on localhost:1849:

| Endpoint | Method | Description |
|---|---|---|
| /dashboard | GET | Live HTML dashboard |
| /stats | GET | Aggregated statistics (JSON). Supports ?from=&to= Unix timestamps |
| /history | GET | Recent request log (JSON). Supports ?from=&to= date filtering |
| /health | GET | Health check |
| /memories | GET | All stored memories (JSON) |
| /memories/clear | POST | Delete all memories |
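
A script can query these endpoints with nothing beyond the standard library. The sketch below builds a `/stats` URL with the documented `from` and `to` parameters; the helper names are hypothetical, and the shape of the returned JSON is not documented here, so the result is returned as-is.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:1849"  # default proxy port

def stats_url(from_ts=None, to_ts=None):
    """Build the /stats URL, optionally bounded by Unix timestamps."""
    params = {
        key: int(value)
        for key, value in (("from", from_ts), ("to", to_ts))
        if value is not None
    }
    return f"{BASE_URL}/stats" + (f"?{urlencode(params)}" if params else "")

def fetch_stats(from_ts=None, to_ts=None):
    """Fetch and decode the statistics JSON; requires the proxy to be running."""
    with urlopen(stats_url(from_ts, to_ts), timeout=5) as response:
        return json.load(response)
```

With the proxy running, `fetch_stats(from_ts=1700000000, to_ts=1700003600)` would request statistics for that one-hour window.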

FAQ

Does this affect response quality?

For planning and complex reasoning, no: those always go to Claude. For mechanical tasks (reading files, processing tool results), routing picks the cheapest model that can handle the task. Structural summaries only replace older tool results (beyond the last 4 messages), so your active work stays intact.

Can I use this with just Claude (no DeepSeek/local models)?

Yes. Run aismush-start --direct. You still get file caching, command compression, structural summaries, memory, agents, and cost tracking. No extra API key needed.

Does Claude Code know it's being proxied?

No. It sends requests to localhost instead of api.anthropic.com, but the API format is identical. All Claude Code features work normally.

What if a provider goes down?

AISmush has automatic fallback chains. If your local model stops responding, it falls back to DeepSeek. If DeepSeek fails, it falls back to Claude. A request fails only if every provider in the chain is down at the same time.
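
A fallback chain like this is usually a loop that tries each provider in order and returns the first success. The sketch below shows that general pattern; `send_with_fallback` and the provider callables are hypothetical, not AISmush's API.

```python
def send_with_fallback(request, chain):
    """Try each (name, send) pair in order; return (name, response) on first success."""
    errors = []
    for name, send in chain:
        try:
            return name, send(request)
        except Exception as exc:  # any failure moves on to the next provider
            errors.append(f"{name}: {exc}")
    # Reached only when every provider in the chain failed.
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Ordering the chain from cheapest to most capable (local, then DeepSeek, then Claude) matches the fallback order described above.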

Is my data sent anywhere?

Your requests go to the same APIs you'd normally use (Anthropic, DeepSeek, OpenRouter, or your local server). The proxy runs locally. No third-party servers, no telemetry.

How much does the compression actually save?

It depends on your workflow. File caching saves 99% on repeated reads. Command compression saves 80-95% on CLI output. Structural summaries save 60-80% on old code. Combined, a typical session sees 30-60% total token reduction even in Claude-only mode.

Where is my data stored?

~/.hybrid-proxy/
  proxy.db       — SQLite database (requests, sessions, memories)
  config.json    — Your provider configuration
  instance_id    — Persistent machine fingerprint
  proxy.log      — Proxy log output

MIT Licensed

Created by Garret Acott / Skunk Tech