Claw-some AI Agent Testing
| Model | Provider | Score | Cost | Time | Bench Version | OpenClaw | Client | When |
|---|---|---|---|---|---|---|---|---|
| anthropic/claude-haiku-4.5 | anthropic | 74.8% | $0.75 | 14.6m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| anthropic/claude-opus-4.5 | anthropic | 69.2% | $3.58 | 13.8m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| anthropic/claude-opus-4.6 | anthropic | 81.7% | $2.72 | 28.0m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| anthropic/claude-sonnet-4.5 | anthropic | 78.4% | $2.70 | 20.2m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| anthropic/claude-sonnet-4.6 | anthropic | 75.3% | $2.79 | 20.9m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| arcee-ai/trinity-large-preview:free | arcee-ai | 64.2% | - | 41.7m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| deepseek/deepseek-chat | deepseek | 55.1% | $0.16 | 13.3m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| deepseek/deepseek-v3.2 | deepseek | 69.7% | $0.33 | 50.4m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| google/gemini-2.5-flash | google | 65.3% | $0.20 | 11.4m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| google/gemini-2.5-flash-lite | google | 22.0% | $0.06 | 12.8m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| google/gemini-2.5-pro | google | 65.5% | $1.33 | 15.8m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| google/gemini-3-flash-preview | google | 67.4% | $0.30 | 9.7m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| google/gemini-3.1-pro-preview | google | 73.3% | $1.39 | 26.3m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| meta-llama/llama-3.1-70b-instruct | meta-llama | 19.8% | $0.28 | 6.2m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| meta-llama/llama-4-maverick | meta-llama | 34.0% | $0.41 | 15.1m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| meta-llama/llama-4-scout | meta-llama | 4.3% | $0.05 | 8.8m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| minimax/minimax-m2.1 | minimax | 77.0% | $0.14 | 17.2m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| minimax/minimax-m2.5 | minimax | 79.7% | $0.19 | 24.1m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| mistralai/devstral-2512 | mistralai | 70.7% | $0.65 | 10.8m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| mistralai/mistral-large-2512 | mistralai | 56.2% | $0.86 | 19.2m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| moonshotai/kimi-k2.5 | moonshotai | 83.5% | $0.32 | 22.9m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| nvidia/nemotron-3-super-120b-a12b:free | nvidia | 75.0% | - | 27.7m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 6 days ago |
| nvidia/nemotron-3-super-120b-a12b:free | nvidia | 68.7% | - | 42.9m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| nvidia/nemotron-3-super-120b-a12b:free | nvidia | 65.1% | - | 34.5m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 6 days ago |
| openai/gpt-4o | openai | 66.5% | $1.83 | 10.0m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| openai/gpt-4o-mini | openai | 71.5% | $0.19 | 17.1m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| openai/gpt-5-mini | openai | 76.3% | $0.19 | 12.4m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| openai/gpt-5-nano | openai | 62.0% | $0.05 | 15.3m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| openai/gpt-5.4 | openai | 77.4% | $1.55 | 20.1m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| openai/gpt-oss-20b | openai | 55.7% | $0.04 | 6.5m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| qwen/qwen-2.5-7b-instruct | qwen | 23.2% | $0.07 | 34.5m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| qwen/qwen3-max-thinking | qwen | 71.5% | $2.72 | 43.0m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| qwen/qwen3.5-122b-a10b | qwen | 74.0% | $0.78 | 13.1m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| qwen/qwen3.5-27b | qwen | 74.7% | $0.50 | 20.0m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| qwen/qwen3.5-35b-a3b | qwen | 78.4% | $0.28 | 20.1m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| qwen/qwen3.5-397b-a17b | qwen | 80.7% | $0.89 | 21.1m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| qwen/qwen3.5-plus-02-15 | qwen | 77.1% | $0.53 | 15.0m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| stepfun/step-3.5-flash | stepfun | 74.1% | $0.83 | 39.8m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| x-ai/grok-4.1-fast | x-ai | 80.0% | $0.25 | 18.6m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| z-ai/glm-4.5-air | z-ai | 70.8% | $0.17 | 29.9m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |
| z-ai/glm-5 | z-ai | 80.2% | $0.83 | 31.9m | 8a5a7d5 | OpenClaw 2026.3.8 (3caab92) | | 5 days ago |