Why this benchmark exists
Most cache benchmarks optimize for hit rate. CacheSafety Bench asks a stricter question: can a cached answer safely serve a new request without creating a bad hit that users would notice?
| Metric | What it measures |
| --- | --- |
| Safe Hit Rate | Reusable answers the user would not notice were cached |
| Bad Hit Rate | Unsafe reused answers |
| Cost Saved / 1K Requests | Estimated savings under a safety constraint |
| Semantic Trap Failure Rate | How often similar-looking prompts still fail reuse |
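
To make the four metrics concrete, here is a minimal sketch of how they could be computed from labeled replay outcomes. It is not the benchmark's actual scoring code. The `Outcome` record, its field names, and the choice of denominators are all assumptions made for illustration: rates are taken over all requests, the safety constraint is modeled as "only safe hits count as savings", and a semantic trap failure is taken to mean a trap prompt that was served an unsafe cached answer.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    """One replayed request. All field names are hypothetical."""
    served_from_cache: bool  # the cache answered instead of the model
    reuse_is_safe: bool      # label: the stored answer is acceptable for this request
    is_semantic_trap: bool   # looks like a cached prompt but needs a different answer
    fresh_cost_usd: float    # what a fresh model call would have cost


def summarize(outcomes):
    """Compute the four headline metrics under the assumptions stated above."""
    n = len(outcomes)
    if n == 0:
        raise ValueError("need at least one outcome")

    safe_hits = [o for o in outcomes if o.served_from_cache and o.reuse_is_safe]
    bad_hits = [o for o in outcomes if o.served_from_cache and not o.reuse_is_safe]
    traps = [o for o in outcomes if o.is_semantic_trap]
    trap_failures = [o for o in traps if o.served_from_cache and not o.reuse_is_safe]

    # Assumed safety constraint: a bad hit saves nothing, so only safe hits count.
    saved_usd = sum(o.fresh_cost_usd for o in safe_hits)

    return {
        "safe_hit_rate": len(safe_hits) / n,
        "bad_hit_rate": len(bad_hits) / n,
        "cost_saved_per_1k_usd": 1000 * saved_usd / n,
        # Defined as 0.0 when the replay set contains no trap prompts.
        "semantic_trap_failure_rate": len(trap_failures) / len(traps) if traps else 0.0,
    }


# Illustrative data only: 4 safe hits, 1 bad hit on a trap, 1 trap that
# correctly missed, and 4 ordinary misses.
sample = (
    [Outcome(True, True, False, 0.25) for _ in range(4)]
    + [Outcome(True, False, True, 0.25)]
    + [Outcome(False, False, True, 0.25)]
    + [Outcome(False, False, False, 0.25) for _ in range(4)]
)

for name, value in summarize(sample).items():
    print(f"{name}: {value:.3f}")
```

Keeping safe hits and bad hits as separate counts is the point of the sketch: a plain hit rate would add them together and hide the difference the benchmark is meant to expose.
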
Hosted and local positioning
The local benchmark is open source and endpoint-neutral. NextModel hosted runs are optional for larger replay jobs, judge models, and shareable reports.
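
As an illustration of what "endpoint-neutral" can mean in practice, the sketch below resolves the target endpoint from the `OPENAI_API_KEY` and `OPENAI_BASE_URL` environment variables, so the same code can point at any OpenAI-compatible server. The `resolve_endpoint` helper is hypothetical and not part of the benchmark. The `/chat/completions` path and the fallback to `https://api.openai.com/v1` are assumptions based on the usual OpenAI-compatible convention, not something this page specifies.

```python
import os

# Assumed fallback when OPENAI_BASE_URL is unset or empty.
DEFAULT_BASE_URL = "https://api.openai.com/v1"


def resolve_endpoint(env=None):
    """Build the request URL and headers from environment-style settings.

    `env` is any mapping; it defaults to os.environ. Nothing is sent over
    the network here.
    """
    env = os.environ if env is None else env

    api_key = env.get("OPENAI_API_KEY", "")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")

    # Strip a trailing slash so the joined URL has exactly one separator.
    base_url = (env.get("OPENAI_BASE_URL") or DEFAULT_BASE_URL).rstrip("/")

    return {
        # Assumed path: the chat completions route of OpenAI-compatible APIs.
        "url": f"{base_url}/chat/completions",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    }


# Usage with an explicit mapping instead of the real environment.
config = resolve_endpoint({
    "OPENAI_API_KEY": "sk-example",
    "OPENAI_BASE_URL": "http://localhost:8000/v1",
})
print(config["url"])
```

Switching between a local server and a hosted run then only requires changing the two variables, not the client code.
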
OpenAI-compatible endpoint
export OPENAI_API_KEY=...
export OPENAI_BASE_URL=https://api.nextmodel.app/v1

Where to start
Start with the public benchmark page, then move to API keys or billing only when you are ready to run larger hosted evaluations.
| Page | Path |
| --- | --- |
| Landing page | /benchmarks/cache-safety |
| API keys | /dashboard/api-keys |
| Billing | /dashboard/billing |