Claude Code vs. Cursor CLI
Comparison between Claude Code and Cursor CLI across the Artificial Analysis Coding Agent Index, including benchmark scores, cost, execution time, and token usage.
For details, see our methodology page.
Highlights
Artificial Analysis Coding Agent Index · Higher is better
Not currently available
Comparison
Side-by-side comparison of Claude Code and Cursor CLI.
Coding Agent Comparison
Metric | Claude Code | Cursor CLI | Analysis |
|---|---|---|---|
Agent Harness | Claude Code | Cursor CLI | |
Representative Model | Fable 5 (max) (with fallback) | GPT-5.5 (medium) | |
Coding Agent Index | 77 | 62 | Claude Code has a higher Coding Agent Index than Cursor CLI |
DeepSWE | 66% | 37% | Claude Code has a higher DeepSWE score than Cursor CLI |
Terminal-Bench v2 | 82% | 73% | Claude Code has a higher Terminal-Bench v2 score than Cursor CLI |
SWE-Atlas-QnA | 83% | 75% | Claude Code has a higher SWE-Atlas-QnA score than Cursor CLI |
Cost per Task | $11.75 | $2.01 | Cursor CLI has a lower cost per task than Claude Code |
Time per Task | 23.5m | 6.6m | Cursor CLI takes less time per task than Claude Code |
Turns per Task | 138 | 78 | Cursor CLI uses fewer turns per task than Claude Code |
Token Usage per Task | 14.1M | 4M | Cursor CLI uses fewer tokens per task than Claude Code |
Cache Hit Rate | 96% | 89% | Claude Code has a higher cache hit rate than Cursor CLI |
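To put the side-by-side figures in relative terms, the following minimal sketch divides each Claude Code value by the matching Cursor CLI value. The numbers are copied from the comparison table above; the dictionary keys and the `ratios` helper are hypothetical names introduced for illustration only.

```python
# Headline figures copied from the comparison table.
# Key names are illustrative, not part of any published schema.
claude_code = {
    "index": 77,
    "cost_usd": 11.75,
    "time_min": 23.5,
    "turns": 138,
    "tokens_m": 14.1,
}
cursor_cli = {
    "index": 62,
    "cost_usd": 2.01,
    "time_min": 6.6,
    "turns": 78,
    "tokens_m": 4.0,
}


def ratios(a, b):
    """Return a[k] / b[k] for every metric present in both rows."""
    return {key: a[key] / b[key] for key in a if key in b}


for metric, value in ratios(claude_code, cursor_cli).items():
    print(f"{metric}: Claude Code / Cursor CLI = {value:.2f}x")
```

A ratio above 1 means Claude Code's value is larger, which is favorable for the index and unfavorable for cost, time, turns, and tokens.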
Model Variants
Evaluated model variants for Claude Code and Cursor CLI.
Model Variants
Coding Agent Index | DeepSWE | Terminal-Bench v2 | SWE-Atlas-QnA | Cost per Task | Time per Task | Token Usage per Task |
|---|---|---|---|---|---|---|
77 | 66% | 82% | 83% | $11.75 | 23.5m | 14.1M |
73 | 56% | 79% | 82% | $7.70 | 23.1m | 18M |
67 | 49% | 75% | 77% | $3.26 | 12.4m | 7.8M |
65 | 40% | 74% | 81% | $5.64 | 15.8m | 16M |
57 | 27% | 71% | 72% | $1.68 | 6.3m | 4.6M |
54 | 29% | 63% | 70% | $1.97 | 13.7m | 8.5M |
52 | 19% | 65% | 73% | $4.33 | 19.6m | 25.9M |
52 | 19% | 65% | 72% | $6.23 | 10.6m | 8.7M |
47 | 9% | 65% | 68% | $0.27 | 17.9m | 9.7M |
47 | 17% | 64% | 60% | $1.18 | 41.2m | 11.4M |
71 | - | 70% | 72% | $1.26 | 8.0m | 4.5M |
62 | 37% | 73% | 75% | $2.01 | 6.6m | 4M |
60 | 32% | 71% | 78% | $2.68 | 13.6m | 5.7M |
52 | 16% | 67% | 72% | $0.08 | 9.7m | 3.6M |
52 | 16% | 67% | 72% | $0.55 | 6.8m | 4.3M |
69 | - | 65% | 73% | $1.52 | 8.3m | 3.8M |
67 | - | 64% | 69% | $0.04 | 8.6m | 2.9M |
Performance
Performance across the Artificial Analysis Coding Agent Index.
Artificial Analysis Coding Agent Index
Composite average pass@1 across DeepSWE, Terminal-Bench v2, and SWE-Atlas-QnA · Higher is better
Not currently available
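As a rough illustration of the composite described above, the sketch below takes an unweighted mean of the three pass@1 scores. The equal weighting and the handling of a missing benchmark (averaging only the scores that are present) are assumptions inferred from the published rows, not a statement of the official methodology; `composite_index` is a hypothetical helper name.

```python
def composite_index(deepswe, terminal_bench_v2, swe_atlas_qna):
    """Unweighted mean of the available pass@1 scores, in percent.

    Assumption: a benchmark with no score (None) is left out of the
    average rather than counted as zero.
    """
    scores = [s for s in (deepswe, terminal_bench_v2, swe_atlas_qna) if s is not None]
    if not scores:
        raise ValueError("at least one benchmark score is required")
    return sum(scores) / len(scores)


# Representative rows from the comparison table.
print(round(composite_index(66, 82, 83)))  # Claude Code
print(round(composite_index(37, 73, 75)))  # Cursor CLI
```

Because the displayed benchmark percentages are themselves rounded, a value recomputed this way can differ from a published index by a point.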
Token Usage
Token consumption across the Artificial Analysis Coding Agent Index.
Token Usage per Task
Mean input, cache, and output tokens per task
Prompt cache hit rates can vary significantly by provider routing, which can materially change effective cost.
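To see why the cache hit rate matters for cost, here is a minimal sketch that prices input tokens at two rates: a full rate for uncached tokens and a discounted rate for tokens read from the prompt cache. Both prices are hypothetical placeholders rather than any provider's actual pricing, all tokens are treated as input for simplicity, and cache write surcharges are ignored.

```python
def input_cost_usd(input_tokens, cache_hit_rate, price_per_m, cached_price_per_m):
    """Cost of input tokens when a fraction of them is served from the prompt cache.

    Prices are in USD per million tokens; cache_hit_rate is between 0 and 1.
    """
    cached = input_tokens * cache_hit_rate
    uncached = input_tokens - cached
    return (cached * cached_price_per_m + uncached * price_per_m) / 1_000_000


# Hypothetical prices (USD per million tokens), chosen only for illustration.
PRICE_PER_M = 5.00
CACHED_PRICE_PER_M = 0.50

# 14.1M is the mean token usage per task reported for Claude Code.
TOKENS = 14_100_000

for rate in (0.96, 0.89, 0.50):
    cost = input_cost_usd(TOKENS, rate, PRICE_PER_M, CACHED_PRICE_PER_M)
    print(f"cache hit rate {rate:.0%}: ${cost:.2f}")
```

Under these assumed prices, even a few points of difference in hit rate shift the per-task cost noticeably, because each uncached token costs ten times as much as a cached one.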
Artificial Analysis Coding Agent Index vs. Total Tokens
Artificial Analysis Coding Agent Index vs. mean total tokens per task
Most attractive quadrant
Cost
Pay-per-token API cost across the Artificial Analysis Coding Agent Index, based on current per-token pricing.
Cost per Task
Mean pay-per-token API cost per task (USD) · Lower is better
Not currently available
Artificial Analysis Coding Agent Index vs. Cost per Task
Artificial Analysis Coding Agent Index vs. mean pay-per-token API cost per task (USD)
Most attractive quadrant
Execution Time
Active agent runtime across the Artificial Analysis Coding Agent Index.
Time per Task
Mean agent wall time per task · Lower is better
Not currently available
Artificial Analysis Coding Agent Index vs. Execution Time
Artificial Analysis Coding Agent Index vs. mean agent wall time per task
Most attractive quadrant