clawback.md

professional tokenmaxxing

clawback tunes how Claude Code (or any client) talks to Anthropic (or any provider) — same models, same plan, more quota, more work.

npx @zapgun/clawback quickstart

Requires Node 24+.


Why

We built clawback because we love using Claude but kept running into quota limits. Anthropic gives you plenty of knobs to tune your usage. We wanted a tool to manage those knobs for us, so we built clawback.

It works so well we decided to share it.

clawback is a transparent local gateway to frontier lab APIs like the Anthropic API. It sits between Claude Code and Anthropic, shows you exactly where your tokens go, and keeps your usage incredibly efficient. We turn a day's quota into a week of work.

Here's an actual efficiency gain captured while buiding clawback:

Chart from a real clawback session showing token efficiency over one night of work

How

“Show me the incentive and I'll show you the outcome.” — Charlie Munger

Anthropic prices cached prompts at a steep discount: reading from cache costs 12–20× less than writing to it. But the cache silently expires — after five idle minutes, at midnight when the date inside your prompt changes, whenever a hidden per-request token rotates. Every expiry means your next message pays a premium to rebuild context you already paid to store.

How you talk to Claude determines whether 5-minute caching is enough, when premium caching pays for itself, where cache breakpoints belong — and when to turn each of these optimzations off. clawback turns optimzations on when you need them, and turns then off when you don't, so that you can focus on what you want to accomplish, instead of on managing your quota.

You and your agents can read the code to see exactly how clawback works.


Optimizations

Apply each knob with one click in the dashboard. The defaults are sensible, and a built-in suggestions engine tells you when a knob would pay off — and when it won't.

The off switch

passthrough

Flip one switch and clawback stands down completely, becoming a transparent pipe. This gives you a clean baseline to measure optimzations against.

No more cold starts

keep-alive

Claude's prompt cache forgets you after idle time. Gentle pings every 1–4 minutes keep it warm between turns, so you stop paying to resend what you already sent.

Step away, come back warm

1h cache ttl

Stretches caching from 5 minutes to a full hour using Anthropic's extended tier. Coffee, standup, code review — pick up right where you left off.

Survive midnight

strip-ephemeral

Timestamps and rotating tokens silently change your prompt and break the cache. clawback normalizes them so a session that crosses midnight doesn't start from scratch.

Warm for less

extended cadence

With the 1-hour tier on, pings only need to fire every 15–45 minutes. Same warm cache, ~6–12× less spent keeping it that way.

Built for the train

mobile

Compresses outgoing traffic and collapses streams into single responses. Less radio-on time on hotspots, hotel wi-fi, and battery.

Runs that finish themselves

auto-continue

Hit a rate-limit wall at 3am? clawback resumes Claude the moment your cap clears — no babysitting, no lost overnight runs.


Quota Controls coming soon

Make sure your quota lands where you aim it, and you never run out of ammo.


Reporting and Validation

clawback ships with the rig we use for performance testing internally. The benchmark harness lives in the repo so you can A/B your own work and generate your own engineering reports:

$ npx @zapgun/clawback quickstart
  ▸ wrote ./CLAWBACK.md   ▸ wired ~/.claude/settings.json
  ▸ proxy on http://localhost:8080   ▸ launching claude…

$ open http://localhost:8080/_proxy/ui/         # watch your tokens live

$ git clone https://github.com/zapgun-ai/clawback && cd clawback   # A/B rig lives in the repo
$ .skills/ab/scripts/ab_block.sh --profile L2 --turns 200   # A/B vs doing nothing
			

Reports designed so that every number carries a 95% confidence interval. Run it against your own sessions and measure the benefit yourself.

Contact