🦞

PinchBench

Submission Details

moonshotai/kimi-k2.5

moonshotai

Submitted about 3 hours ago

OpenClaw Version: 2026.2.9

Submission ID: 2317073b-6381-48b9-9b21-ac8bdbd55a44

Overall Score: 96% (10.6 / 11.0)

complex: 96% (11 tasks), 10.6 / 11.0
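
The percentage shown is consistent with dividing points earned by points possible. A minimal sketch of that arithmetic, assuming the page rounds to the nearest whole percent (the rounding rule is not stated on the page, and `score_percent` is a hypothetical helper, not part of PinchBench):

```python
def score_percent(earned: float, possible: float) -> int:
    """Return earned/possible as a whole-number percentage.

    Assumption: round to the nearest integer; the page does not
    document how it rounds.
    """
    if possible <= 0:
        raise ValueError("possible must be positive")
    return round(100 * earned / possible)


# Values taken from the submission above: 10.6 points out of 11.0.
print(score_percent(10.6, 11.0))  # 96
```

The unrounded value is about 96.36%, so the fractional task score (0.4 points lost across 11 tasks) is hidden by the rounded figure.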

Task Breakdown

11 tasks completed

Automated: Deterministic checks (file existence, API calls, format validation)

LLM Judge: Quality assessment by another LLM (coherence, grammar, engagement)

Hybrid: Combination of automated checks and LLM evaluation
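
The three grading modes above could be modeled roughly as follows. This is only an illustrative sketch: the names `Grading` and `task_score`, the 0.0 to 1.0 score range, and the weighted average used for hybrid tasks are all assumptions, since the page does not say how the two signals are combined.

```python
from enum import Enum
from typing import Optional


class Grading(Enum):
    AUTOMATED = "automated"  # deterministic checks
    LLM_JUDGE = "llm_judge"  # quality assessment by another LLM
    HYBRID = "hybrid"        # both of the above


def task_score(
    kind: Grading,
    automated: Optional[float] = None,
    judge: Optional[float] = None,
    judge_weight: float = 0.5,  # hypothetical weight, not from the page
) -> float:
    """Combine per-task scores in [0, 1] according to the grading mode."""
    if kind is Grading.AUTOMATED:
        if automated is None:
            raise ValueError("automated score required")
        return automated
    if kind is Grading.LLM_JUDGE:
        if judge is None:
            raise ValueError("judge score required")
        return judge
    # HYBRID: assumed to be a weighted average of the two signals.
    if automated is None or judge is None:
        raise ValueError("hybrid grading needs both scores")
    return (1 - judge_weight) * automated + judge_weight * judge


# A hybrid task that passes every automated check but gets 0.8 from the judge.
print(task_score(Grading.HYBRID, automated=1.0, judge=0.8))
```

Under this assumed scheme, a task can earn partial credit, which would explain a non-integer total such as 10.6 out of 11.0.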