Cache & Billing
Duplicate cache creation causes double-charging in one session
Symptom: within the same request, the cache is created twice and billed twice. If time-to-first-byte exceeds 30 seconds, suspect this first.
Cause: after a backup-load account switch, the secondary forward passes along certain headers, which leads to misrouting and a second cache creation.
Fix:
- Confirm the symptom: review the request chain and check that a single request shows duplicate cache creation and a time-to-first-byte clearly above 30 seconds
- Remove the ccs proxy to avoid the intermediate layer that triggers backup-load account switching
- If you must keep the forwarding chain, inspect each header passed through on the secondary forward and remove the one causing misrouting
- Issue a fresh standalone request to verify
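The confirmation step can be sketched as a small check over request-chain records. This is only an illustration: the record layout (`request_id`, `cache_created`, `ttfb_s`) is hypothetical and not a format the proxy or Claude Code is known to emit, so adapt the field names to whatever your logs actually contain. The 30-second threshold comes from the symptom description.

```python
from collections import Counter

TTFB_THRESHOLD_S = 30.0  # threshold taken from the symptom description


def find_suspect_requests(records):
    """Return IDs of requests with two or more cache creations and a slow first byte.

    `records` is a hypothetical list of dicts, one per hop in the request chain:
    {"request_id": str, "cache_created": bool, "ttfb_s": float}
    """
    creations = Counter()
    worst_ttfb = {}
    for rec in records:
        rid = rec["request_id"]
        if rec["cache_created"]:
            creations[rid] += 1
        # Track the slowest time-to-first-byte seen for this request.
        worst_ttfb[rid] = max(worst_ttfb.get(rid, 0.0), rec["ttfb_s"])
    return sorted(
        rid
        for rid, count in creations.items()
        if count >= 2 and worst_ttfb[rid] > TTFB_THRESHOLD_S
    )


# Example with made-up records: req-a is created twice and is slow, req-b is normal.
sample = [
    {"request_id": "req-a", "cache_created": True, "ttfb_s": 4.2},
    {"request_id": "req-a", "cache_created": True, "ttfb_s": 41.0},
    {"request_id": "req-b", "cache_created": True, "ttfb_s": 2.1},
]
print(find_suspect_requests(sample))
```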
Prevention: keep the request chain to a single, stable path; on secondary forwards, pass through only the headers that are actually needed.
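One way to keep only necessary headers on a secondary forward is an allowlist filter. The sketch below assumes headers are available as a plain dict. The allowlist contents and the `x-account-id` header in the example are hypothetical, since the source does not name the header that causes misrouting.

```python
# Hypothetical allowlist; the headers your upstream actually needs may differ.
NECESSARY_HEADERS = frozenset({"authorization", "content-type", "accept"})


def filter_forward_headers(headers, allowed=NECESSARY_HEADERS):
    """Return a copy of `headers` with only allowlisted names (case-insensitive)."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower() in allowed
    }


# Example: the made-up routing header is dropped before the secondary forward.
incoming = {
    "Authorization": "Bearer <token>",
    "Content-Type": "application/json",
    "X-Account-Id": "backup-1",  # hypothetical header, for illustration only
}
forwarded = filter_forward_headers(incoming)
print(sorted(forwarded))
```

An allowlist is used here rather than a blocklist because the offending header is unknown until you inspect the chain; anything not explicitly needed is dropped by default.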
How to enable the 1-hour context cache?
For dedicated groups that support long caching, add to the env of ~/.claude/settings.json:
```json
"ENABLE_PROMPT_CACHING_1H": "1"
```
Tradeoff: rebuilding a 1-hour cache costs more, so for high-frequency use the default short cache is usually better. Enable it only for long-chain tasks.
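If you prefer to apply the setting by script rather than by hand, a minimal sketch might look like the following. It assumes the settings file is a JSON object with an optional top-level `env` object, as described above. The helper name `enable_one_hour_cache` is made up for this example, and the script leaves all other keys untouched.

```python
import json
from pathlib import Path


def enable_one_hour_cache(settings_path):
    """Set ENABLE_PROMPT_CACHING_1H to "1" under "env", preserving other settings."""
    path = Path(settings_path).expanduser()
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    settings = json.loads(text) if text.strip() else {}
    # Create the "env" object if it is missing, then add the key from the text above.
    env = settings.setdefault("env", {})
    env["ENABLE_PROMPT_CACHING_1H"] = "1"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings


# Usage (back up the file first, since it is rewritten in place):
# enable_one_hour_cache("~/.claude/settings.json")
```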
How to check current token usage?
Type /cost in the Claude Code interactive UI to see the current session's token usage.