Cache & Billing

Duplicate cache creation causes double-charging in one session

Symptom: within a single request, the cache is created twice and billed twice. If time-to-first-byte (TTFB) exceeds 30 seconds, suspect this issue first.

Cause: after load handling switches the request to a backup account, the secondary forward (the second forwarding hop) carries certain headers that cause misrouting, so the cache is created a second time.

Fix:

  1. Confirm the symptom: review the request chain for a single request that creates the cache twice and has a time-to-first-byte clearly above 30 seconds.
  2. Remove the ccs proxy so that requests no longer pass through the intermediate layer that triggers backup-account switching.
  3. If you must keep the forwarding chain, inspect every header passed through on the secondary forward and remove the one that causes misrouting.
  4. Send a fresh standalone request and verify that the cache is created only once.
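
Step 1 can be sketched as a small log check. This is a minimal illustration, not a tool that ships with any product: the LogEvent record shape, the "cache_create" and "first_byte" event names, and the function name are all hypothetical, so adapt them to whatever your request logs actually contain. Only the 30-second threshold comes from the symptom description above.

```python
from collections import defaultdict
from dataclasses import dataclass

TTFB_THRESHOLD_S = 30.0  # threshold taken from the symptom description


@dataclass
class LogEvent:
    # Hypothetical log record shape; real logs will differ.
    request_id: str
    kind: str          # assumed values: "cache_create" or "first_byte"
    elapsed_s: float   # seconds since the request started


def find_suspect_requests(events):
    """Return sorted request IDs that created the cache more than once
    and whose time-to-first-byte exceeded the threshold."""
    creations = defaultdict(int)
    ttfb = {}
    for ev in events:
        if ev.kind == "cache_create":
            creations[ev.request_id] += 1
        elif ev.kind == "first_byte":
            # Keep the earliest first-byte time seen for each request.
            ttfb[ev.request_id] = min(ev.elapsed_s, ttfb.get(ev.request_id, ev.elapsed_s))
    return sorted(
        rid
        for rid, count in creations.items()
        if count > 1 and ttfb.get(rid, 0.0) > TTFB_THRESHOLD_S
    )
```

A request that shows up in the result matches both halves of the symptom; a request with two cache creations but a fast first byte is deliberately left out and needs a separate look.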

Prevention: keep the request chain to a single, stable path; on secondary forwards, pass through only the headers that are strictly necessary.
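
One way to keep only the necessary headers on a secondary forward is an allowlist filter. The sketch below assumes headers are available as a plain name-to-value mapping. The header names in ALLOWED_HEADERS are illustrative guesses, not a list confirmed by this guide, so replace them with the headers your upstream actually requires.

```python
# Hypothetical allowlist; the real set depends on what your upstream needs.
ALLOWED_HEADERS = frozenset({"authorization", "content-type", "accept", "user-agent"})


def filter_forward_headers(headers, allowed=ALLOWED_HEADERS):
    """Return a new dict with only allowlisted headers (names compared case-insensitively)."""
    return {name: value for name, value in headers.items() if name.lower() in allowed}
```

An allowlist is used here rather than a blocklist because the header that causes misrouting is not named above; dropping everything that is not explicitly needed avoids having to identify it first.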

How to enable the 1-hour context cache?

For dedicated groups that support long caching, add the following entry to the env object in ~/.claude/settings.json:

json
"ENABLE_PROMPT_CACHING_1H": "1"
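
If you would rather not edit the file by hand, the change can be sketched as a small script. This is only an illustration: the function name enable_one_hour_cache is made up here, and the script assumes settings.json is a plain JSON object whose env value, if present, is also an object. Existing keys are left untouched.

```python
import json
from pathlib import Path


def enable_one_hour_cache(settings_path):
    """Set ENABLE_PROMPT_CACHING_1H to "1" under "env" and return the updated settings."""
    path = Path(settings_path).expanduser()
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    # Treat a missing or empty file as an empty settings object.
    settings = json.loads(text) if text.strip() else {}
    env = settings.setdefault("env", {})
    env["ENABLE_PROMPT_CACHING_1H"] = "1"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```

Calling enable_one_hour_cache("~/.claude/settings.json") would update the file named above; back it up first if it contains settings you care about.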

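Tradeoff: writing or rebuilding a 1-hour cache costs more than writing the default short cache. For high-frequency use, where requests arrive often enough to keep the short cache alive, the default short cache is usually the better choice. Enable the 1-hour cache only for long-chain tasks.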
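
The tradeoff can be made concrete with a rough cost comparison. Every number below is an assumption for illustration, not a price quoted by this guide: a short-cache lifetime of 5 minutes, a write multiplier of 1.25 for the short cache and 2.0 for the 1-hour cache (relative to the base input price), and a cache whose lifetime restarts on every hit. Cache-read costs are ignored because they apply to both options.

```python
def writes_needed(request_gaps_min, ttl_min):
    """Count cache writes: the first request writes the cache, and any gap
    longer than the TTL lets it expire and forces a rewrite."""
    return 1 + sum(1 for gap in request_gaps_min if gap > ttl_min)


def cache_write_cost(request_gaps_min, ttl_min, write_multiplier, cached_tokens):
    """Total write cost in units of the base per-token input price."""
    return writes_needed(request_gaps_min, ttl_min) * cached_tokens * write_multiplier


# Assumed values, for illustration only.
SHORT_TTL_MIN, SHORT_MULTIPLIER = 5, 1.25
LONG_TTL_MIN, LONG_MULTIPLIER = 60, 2.0

frequent_gaps = [1, 2, 1, 3]   # minutes between requests, high-frequency use
sparse_gaps = [10, 20, 15]     # minutes between requests, long-chain task

for label, gaps in (("frequent", frequent_gaps), ("sparse", sparse_gaps)):
    short = cache_write_cost(gaps, SHORT_TTL_MIN, SHORT_MULTIPLIER, 1000)
    long_ = cache_write_cost(gaps, LONG_TTL_MIN, LONG_MULTIPLIER, 1000)
    print(f"{label}: short cache = {short}, 1-hour cache = {long_}")
```

Under these assumptions the short cache is written once in the frequent case and is cheaper, while in the sparse case it expires between every request and is rewritten each time, so the 1-hour cache comes out ahead.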

How to check current token usage?

Type /cost in the Claude Code interactive UI to see the current session's token usage.