Strategy & Ops Index
Measures performance on capabilities that matter most for office and operations work, including non-hallucinated data handling, agentic execution, business knowledge, and instruction following. Weights are derived from the relative frequency of those capabilities across the top tasks performed by office and administrative support workers.
The Artificial Analysis Strategy & Ops Index combines performance across benchmarks chosen for strategy, operations, and office administration. Weights follow how often each capability appears in tasks typical of office and administrative support roles (the largest occupational group in the U.S.), grouped by task rather than by job title.
Because the score is a composite, a model cannot rank highly through narrow specialization in one capability alone. It also provides a single score for tracking model performance across operations and administrative work.
Each capability sub-score is normalised to a 0-100 scale, then combined using the weights below. All underlying benchmarks are run independently by Artificial Analysis. See our Intelligence Benchmarking Methodology for how evaluations are conducted.
| Category | Weight | Evaluations |
|---|---|---|
| Business Knowledge | 30% | AA-Omniscience Business Accuracy |
| Agentic | 30% | GDPval-AA v2 |
| Quantitative and Scientific Reasoning | 25% | Crit-Pt, HLE |
| Non-Hallucination | 10% | AA-Omniscience Non-Hallucination |
| Instruction Following | 5% | IFBench |
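The weighting in the table can be sketched in a few lines of Python. This is a minimal illustration, not Artificial Analysis's implementation: the function name `strategy_ops_index` and the sample sub-scores are hypothetical, and the function takes category-level sub-scores as given, because the text does not say how Crit-Pt and HLE are combined within the Quantitative and Scientific Reasoning category.

```python
# Category weights copied from the table above, as fractions of 1.
WEIGHTS = {
    "Business Knowledge": 0.30,
    "Agentic": 0.30,
    "Quantitative and Scientific Reasoning": 0.25,
    "Non-Hallucination": 0.10,
    "Instruction Following": 0.05,
}


def strategy_ops_index(sub_scores: dict[str, float]) -> float:
    """Return the weighted average of category sub-scores on a 0-100 scale.

    Hypothetical helper: assumes each sub-score is already normalised to 0-100.
    """
    missing = set(WEIGHTS) - set(sub_scores)
    if missing:
        raise ValueError(f"missing sub-scores: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0 <= sub_scores[name] <= 100:
            raise ValueError(f"{name} must be in [0, 100], got {sub_scores[name]}")
    # The weights sum to 1, so the result stays on the 0-100 scale.
    return sum(WEIGHTS[name] * sub_scores[name] for name in WEIGHTS)


# Made-up sub-scores for illustration only; not real benchmark results.
example_scores = {
    "Business Knowledge": 60,
    "Agentic": 50,
    "Quantitative and Scientific Reasoning": 40,
    "Non-Hallucination": 70,
    "Instruction Following": 80,
}
example_index = strategy_ops_index(example_scores)
print(round(example_index, 1))
```

With these made-up inputs, the two 30% categories contribute 18 and 15 points, while the 80 in Instruction Following adds only 4, which shows how strongly the weights shape the composite.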
Score
[Charts: Strategy & Ops Index; Strategy & Ops Index: Capability Breakdown; Strategy & Ops Index: Business Knowledge]
Representative Workflows
Real-world workflows that exercise the capabilities the Strategy & Ops Index weights most heavily.
[Charts: Strategy & Ops Index vs. Release Date; Strategy & Ops Index vs. Output Speed; Pricing: Input and Output Prices; Strategy & Ops Index vs. Price; Strategy & Ops Index: Output Token Composition; Strategy & Ops Index: Cost Breakdown]
Frequently Asked Questions
The Strategy & Ops Index is a composite benchmark from Artificial Analysis that measures performance on capabilities that matter most for office and operations work, including non-hallucinated data handling, agentic execution, business knowledge, and instruction following. Weights are derived from the relative frequency of those capabilities across the top tasks performed by office and administrative support workers.
The Strategy & Ops Index is calculated as a weighted average of capability sub-scores, each normalised to a 0–100 scale. The sub-scores and their weights are: Business Knowledge (30%), Agentic (30%), Quantitative and Scientific Reasoning (25%), Non-Hallucination (10%), and Instruction Following (5%).
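Written as a formula, the calculation looks like this, where each $S$ is the 0–100 sub-score of the named category. The symbol names are illustrative and do not come from the published methodology.

$$
\text{Index} = 0.30\,S_{\text{BK}} + 0.30\,S_{\text{Agentic}} + 0.25\,S_{\text{QSR}} + 0.10\,S_{\text{NH}} + 0.05\,S_{\text{IF}}
$$

The weights sum to 1, so the index stays on the same 0–100 scale as its inputs.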
The Strategy & Ops Index includes AA-Omniscience Business Accuracy, GDPval-AA v2, Crit-Pt, HLE, AA-Omniscience Non-Hallucination, and IFBench.
Claude Fable 5 (Adaptive Reasoning, Max Effort, Opus 4.8 Fallback) currently has the highest Strategy & Ops Index score among models with published results, at 55.
A higher Strategy & Ops Index score indicates stronger overall performance across the benchmarks that make up the index. For a specific use case, individual benchmark results may be more informative than the composite score.