Loading...Working on your request
Benchmark cache

CacheSafety Bench

Do kha nang tai su dung an toan phan hoi LLM truoc khi bat cache production.

Phan lon benchmark cache chi toi uu hit rate. CacheSafety Bench con do Safe Hit Rate, Bad Hit Rate va muc tiet kiem chi phi API.

Doc docs

Van de

Chi hit rate la chua du.

Semantic caching co the giam chi phi, nhung chi mot bad hit cung co the khien model trong sai. CacheSafety Bench do xem tai su dung co an toan hay khong, chu khong chi xem hai prompt co giong nhau hay khong.

Chi so cot loi

Do do an toan truoc khi do quy mo.

SH
An toanSafe Hit Rate

Chi tinh nhung lan tai su dung ma nguoi dung khong nhan ra.

BH
GuardrailBad Hit Rate

Day la gioi han an toan nghiem ngat truoc cache production.

$/K
Kinh teCost Saved / 1K Requests

Chi tinh tiet kiem sau khi da xac nhan tai su dung an toan.

TR
Kiem thu baySemantic Trap Failure Rate

Do xem prompt giong nhau co van lam hong tai su dung hay khong.

Cach hoat dong

Ba buoc truoc khi tin vao cache.

P1
ReplayPhat lai cap request

Chay old_request, old_answer va new_request qua mot benchmark runner bao thu.

P2
Danh giaDanh gia tai su dung an toan

Kiem tra xem cau tra loi cu co that su dap ung request moi ma khong co vi pham an hay khong.

P3
Chinh sachUoc tinh tiet kiem an toan

Xuat bao cao va khuyen nghi chinh sach than trong truoc production rollout.

Xem truoc bao cao

Vi du bao cao tinh

Chinh sach cache tot la chinh sach tiet kiem chi phi ma nguoi dung khong nhan ra cau tra loi da duoc tai su dung.

Tong so cap2,000
Safe Hit Rate18.4%
Bad Hit Rate0.0%
Cost Saved / 1K Requests$0.42
Chinh sach de xuatExact + Canonical
Semantic cacheNot recommended yet

Run hosted

Benchmark cuc bo mien phi va open source. Hosted runs la tuy chon.

Benchmark hosted cua NextModel dung credit cho replay lon hon, judge models va bao cao co the chia se. Cac run cuc bo van la open source va endpoint-neutral.

Can do muc tiet kiem an toan truoc khi bat cache production. Hosted runs danh cho cac danh gia quy mo lon hon, khong phai dieu kien de dung benchmark nay.

Tich hop developer

Hoat dong voi cac client tuong thich OpenAI.

CacheSafety Bench van la open source va endpoint-neutral. NextModel chi la mot hosted endpoint tuy chon va production gateway.

Vi du tuong thich OpenAI
export OPENAI_API_KEY=...
export OPENAI_BASE_URL=https://api.nextmodel.app/v1

FAQ

Cau hoi thuong gap

Day co phai semantic cache khong?

Khong. CacheSafety Bench la benchmark de do tai su dung an toan cac phan hoi LLM, khong phai loi hua rang semantic cache nen duoc bat mac dinh.

Toi co can dung NextModel khong?

Khong. Cac benchmark run cuc bo la open source va endpoint-neutral. Hosted runs tren NextModel la tuy chon.

Bad hit la gi?

Bad hit la mot cau tra loi da duoc tai su dung nhung dang ra khong nen tra ve cho request moi vi no vi pham facts, constraints, timing, format hoac ky vong cua nguoi dung.

Toi co the chay no cuc bo khong?

Co. Benchmark duoc thiet ke de chay truoc tien o may cuc bo voi toy, synthetic hoac private datasets nam trong quyen kiem soat cua ban.

Bat dau ngay

Do tai su dung an toan phan hoi LLM truoc khi len production.

Hay chay benchmark mo cuc bo truoc, sau do chi dung hosted workflow khi ban can replay jobs lon hon va bao cao co the chia se.