Cursor’s Composer 2.5: third on the Coding Agent Index and ~10-60x lower cost than rivals

Composer 2.5 ranks third on the Artificial Analysis Coding Agent Index, behind only higher-effort variants of Claude Opus 4.7 and GPT-5.5, which cost ~10-60x more per task. This release places Composer among the leading coding agent models, which was not clearly the case for past releases.

Cursor has released Composer 2.5, the latest model in its Composer line. Composer 2.5 scored 62 on our Coding Agent Index, a 14-point gain over Composer 2 (48). This puts it in third place among the agents we tested, behind only Claude Opus 4.7 (max) in Claude Code (66) and GPT-5.5 (xhigh reasoning) in Codex (65). Those two agents cost $4.10 and $4.82 per task respectively, ~10x the cost of Composer 2.5 Fast ($0.44) and ~60x the cost of Composer 2.5 standard ($0.07).
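The ~10x and ~60x multiples can be reproduced from the four per-task costs quoted above. The following is a minimal Python sketch; the dictionary and the function name `cost_multiple` are illustrative choices, not part of any Artificial Analysis or Cursor tooling.

```python
# Per-task costs in USD, copied from the figures quoted in the text.
COST_PER_TASK_USD = {
    "Claude Opus 4.7 (max) in Claude Code": 4.10,
    "GPT-5.5 (xhigh) in Codex": 4.82,
    "Composer 2.5 Fast": 0.44,
    "Composer 2.5 standard": 0.07,
}


def cost_multiple(rival: str, baseline: str) -> float:
    """Return how many times more the rival costs per task than the baseline."""
    return COST_PER_TASK_USD[rival] / COST_PER_TASK_USD[baseline]


if __name__ == "__main__":
    rivals = ("Claude Opus 4.7 (max) in Claude Code", "GPT-5.5 (xhigh) in Codex")
    baselines = ("Composer 2.5 Fast", "Composer 2.5 standard")
    for baseline in baselines:
        for rival in rivals:
            ratio = cost_multiple(rival, baseline)
            print(f"{rival} vs {baseline}: {ratio:.1f}x")
```

With these inputs the ratios fall at roughly 9x to 11x against Fast and 59x to 69x against standard, which is what the rounded "~10x" and "~60x" figures summarize.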

Key Takeaways:

➤ Cheaper than every other agent scoring above 60 on the Index: At $0.07 (standard) and $0.44 (Fast) per task, Composer 2.5 sits on the cost-quality Pareto frontier. Medium-effort peers cost $1.24–$2.21 per task; higher-effort variants land 3-4 points above at $4.10–$4.82

➤ Major gains vs Composer 2, led by SWE-Bench-Pro-Hard-AA: +35 points on SWE-Bench-Pro-Hard-AA (12% → 47%), +2 points on Terminal-Bench v2 (64% → 66%), and +3 points on SWE-Atlas-QnA (69% → 72%). At 47%, Composer 2.5's score on SWE-Bench-Pro-Hard-AA is comparable to Claude Opus 4.7 (max) in Claude Code

➤ Among the fastest coding agents: Composer 2.5 Fast runs at an average wall time of 6.7 minutes per task, the third-fastest agent on the Index, behind only Claude Opus 4.7 (medium) in Claude Code (5.8m) and GPT-5.5 (medium) in Cursor CLI (6.2m)

➤ Fast mode enables better responsiveness at 6x pricing: Fast completes tasks ~30% faster than standard Composer 2.5 (6.7 vs 9.3 minutes per task), but at ~6x the cost per task ($0.44 vs $0.07). Token pricing is 6x higher for Fast: $3.00/$15.00 vs $0.50/$2.50 per million input/output tokens

Cursor with Composer 2.5 is the cheapest agent scoring above 60 on the Coding Agent Index at $0.07 (standard) and $0.44 (Fast) per task. Higher-effort variants — Claude Opus 4.7 (max) in Claude Code (66, $4.10) and GPT-5.5 (xhigh) in Codex (65, $4.82) — score above at ~10x (Fast) to ~60x (standard) the per-task cost.

Composer 2.5 improves on all benchmarks versus Composer 2, most notably on SWE-Bench-Pro-Hard-AA: +35 points there, compared with +2 points on Terminal-Bench v2 and +3 points on SWE-Atlas-QnA.

Cursor serves the same Composer 2.5 model in two variants. We measure Fast executing tasks ~30% faster than standard (6.7 vs 9.3 minutes per task) but at ~6x the cost per task, aligned with Cursor's 6x token pricing differential ($3.00/$15.00 vs $0.50/$2.50 per million input/output tokens).
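The speed and cost trade-off between the two variants can be checked with a few lines of arithmetic. This is a sketch using only the wall times, per-task costs, and token prices quoted above; the field names and the `tradeoff` helper are assumptions made for illustration.

```python
# Figures quoted in the text for the two Composer 2.5 variants.
STANDARD = {
    "minutes_per_task": 9.3,
    "usd_per_task": 0.07,
    "usd_per_m_input": 0.50,
    "usd_per_m_output": 2.50,
}
FAST = {
    "minutes_per_task": 6.7,
    "usd_per_task": 0.44,
    "usd_per_m_input": 3.00,
    "usd_per_m_output": 15.00,
}


def tradeoff(fast: dict, standard: dict) -> dict:
    """Compare Fast against standard on wall time, per-task cost, and token prices."""
    return {
        # Fraction of wall time saved per task by choosing Fast.
        "time_saved": 1 - fast["minutes_per_task"] / standard["minutes_per_task"],
        # Equivalent throughput speedup (tasks per unit of time).
        "speedup": standard["minutes_per_task"] / fast["minutes_per_task"],
        "task_cost_ratio": fast["usd_per_task"] / standard["usd_per_task"],
        "input_price_ratio": fast["usd_per_m_input"] / standard["usd_per_m_input"],
        "output_price_ratio": fast["usd_per_m_output"] / standard["usd_per_m_output"],
    }


if __name__ == "__main__":
    for name, value in tradeoff(FAST, STANDARD).items():
        print(f"{name}: {value:.2f}")
```

On these numbers Fast takes about 28% less wall time per task (roughly a 1.39x speedup), the per-task cost ratio is about 6.3x, and both token price ratios are exactly 6x, consistent with the rounded figures in the text.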

Model details:

➤ Base model: Continued training on Moonshot AI's open-weights Kimi K2.5, as with Composer 2, with Cursor reporting that ~85% of total compute came from its own additional training and reinforcement learning

➤ Pricing: $0.50/$2.50 per million input/output tokens for the standard variant; $3.00/$15.00 for the Fast variant (the default in Cursor)

➤ Available exclusively in Cursor: offered in both Cursor IDE and Cursor CLI; no externally accessible API is available
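To show how the listed token prices translate into a dollar figure, here is a small Python sketch. The token counts in the usage example are hypothetical, not measured values from the Index, and the calculation assumes simple linear per-token billing; any caching discounts or other billing details are not described in the source and are ignored.

```python
# USD per million (input, output) tokens, as listed in the pricing bullet.
PRICES_USD_PER_MILLION = {
    "standard": (0.50, 2.50),
    "fast": (3.00, 15.00),
}


def estimate_cost(input_tokens: int, output_tokens: int, variant: str = "fast") -> float:
    """Estimate cost in USD for a given token usage, assuming linear pricing."""
    input_price, output_price = PRICES_USD_PER_MILLION[variant]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical usage: 100,000 input tokens and 10,000 output tokens.
    for variant in ("standard", "fast"):
        cost = estimate_cost(100_000, 10_000, variant)
        print(f"{variant}: ${cost:.3f}")
```

Because both input and output prices differ by the same factor, the Fast estimate is 6x the standard estimate for any token mix under this assumption.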

See Artificial Analysis for further details and benchmarks: https://artificialanalysis.ai/agents/coding-agents
