Amazon Bedrock: Models Intelligence, Performance & Price

Analysis of Amazon Bedrock's models across key metrics including quality, price, output speed, latency, context window & more. This analysis is intended to help you choose the best model provided by Amazon Bedrock for your use case. For more details, including our methodology, see our FAQs. Models analyzed: Llama 3.3 70B, Llama 3.1 405B Standard, Llama 3.1 405B Latency Optimized, Llama 3.2 90B (Vision), Llama 3.1 70B Standard, Llama 3.1 70B Latency Optimized, Llama 3.2 11B (Vision), Llama 3.1 8B, Llama 3.2 3B, Llama 3.2 1B, Claude 3.5 Sonnet (Oct), Claude 3.5 Sonnet (June), Claude 3 Opus, Claude 3.5 Haiku Standard, Claude 3.5 Haiku Latency Optimized, Claude 3 Haiku, Claude 3.7 Sonnet Thinking, Claude 3.7 Sonnet, Mistral Large 2 (Jul '24), Mixtral 8x7B, DeepSeek R1, Nova Pro, Nova Pro Latency Optimized, Nova Lite, Nova Micro, Command-R+, Command-R+ (Apr '24), Command-R (Mar '24), Command-R, Llama 3 70B, Llama 3 8B, Claude 3 Sonnet, Claude 2.1, Mistral Large (Feb '24), and Mistral 7B.

Amazon Model Comparison Summary

Intelligence: DeepSeek R1 and Claude 3.7 Sonnet Thinking are the highest quality models offered by Amazon, followed by Claude 3.5 Sonnet (Oct), Llama 3.3 70B & Llama 3.1 405B Standard.
Output Speed (tokens/s): Nova Micro (328 t/s) and Nova Lite (293 t/s) are the fastest models offered by Amazon, followed by Llama 3.2 11B (Vision), Llama 3.3 70B & Llama 3.1 70B Latency Optimized.
Latency (seconds): Claude 3.7 Sonnet Thinking (0.00s) and Llama 3.2 1B (0.31s) are the lowest latency models offered by Amazon, followed by Nova Micro, Llama 3 8B & Llama 3.2 3B.
Blended Price ($/M tokens): Nova Micro ($0.06) and Llama 3.2 1B ($0.10) are the cheapest models offered by Amazon, followed by Nova Lite, Llama 3.2 3B & Llama 3.2 11B (Vision).
Context Window Size: Nova Pro (300k) and Nova Pro Latency Optimized (300k) are the largest context window models offered by Amazon, followed by Nova Lite, Claude 3.5 Sonnet (Oct) & Claude 3.5 Sonnet (June).

Highlights

[Charts: Intelligence (Artificial Analysis Intelligence Index; higher is better), Speed (output tokens per second; higher is better), Price (USD per 1M tokens; lower is better)]

| API provider | Creator | Model | Context window | Intelligence Index | Blended price (USD per 1M tokens) | Output speed (tokens/s) | Latency (s) |
|---|---|---|---|---|---|---|---|
| Amazon Bedrock | DeepSeek | DeepSeek R1 | 128k | 60 | $2.36 | 83.6 | 0.45 |
| Amazon Bedrock | Anthropic | Claude 3.7 Sonnet Thinking | 200k | 57 | $6.00 | 62.4 | 0.00 |
| Amazon Bedrock | Anthropic | Claude 3.5 Sonnet (Oct) | 200k | 44 | $6.00 | 50.5 | 0.93 |
| Amazon Bedrock | Meta | Llama 3.3 70B | 128k | 41 | $0.71 | 141.7 | 0.59 |
| Amazon Bedrock Standard | Meta | Llama 3.1 405B Standard | 128k | 40 | $2.40 | 30.7 | 1.85 |
| Amazon Bedrock Latency Optimized | Meta | Llama 3.1 405B Latency Optimized | 128k | 40 | $3.00 | 65.5 | 0.74 |
| Amazon Bedrock | Amazon | Nova Pro | 300k | 37 | $1.40 | 106.4 | 0.34 |
| Amazon Bedrock Latency Optimized | Amazon | Nova Pro Latency Optimized | 300k | 37 | $1.75 | 125.3 | 0.64 |
| Amazon Bedrock | Mistral | Mistral Large 2 (Jul '24) | 128k | 37 | $3.00 | 33.5 | 0.45 |
| Amazon Bedrock Standard | Meta | Llama 3.1 70B Standard | 128k | 35 | $0.72 | 31.5 | 0.65 |
| Amazon Bedrock Latency Optimized | Meta | Llama 3.1 70B Latency Optimized | 128k | 35 | $0.90 | 138.6 | 0.33 |
| Amazon Bedrock | Anthropic | Claude 3 Opus | 200k | 35 | $30.00 | 24.2 | 1.23 |
| Amazon Bedrock Standard | Anthropic | Claude 3.5 Haiku Standard | 200k | 35 | $1.60 | 50.9 | 0.94 |
| Amazon Bedrock Latency Optimized | Anthropic | Claude 3.5 Haiku Latency Optimized | 200k | 35 | $2.00 | 100.3 | 0.51 |
| Amazon Bedrock | Meta | Llama 3.2 90B (Vision) | 128k | 33 | $0.72 | 56.4 | 0.37 |
| Amazon Bedrock | Amazon | Nova Lite | 300k | 33 | $0.10 | 292.8 | 0.33 |
| Amazon Bedrock | Amazon | Nova Micro | 130k | 28 | $0.06 | 328.5 | 0.31 |
| Amazon Bedrock | Anthropic | Claude 3 Sonnet | 200k | 28 | $6.00 | 61.2 | 0.75 |
| Amazon Bedrock | Meta | Llama 3 70B | 8k | 27 | $2.86 | 53.1 | 0.40 |
| Amazon Bedrock | Mistral | Mistral Large (Feb '24) | 33k | 26 | $6.00 | 43.2 | 0.39 |
| Amazon Bedrock | Anthropic | Claude 2.1 | 200k | 24 | $12.00 | 29.5 | 1.62 |
| Amazon Bedrock | Meta | Llama 3.1 8B | 128k | 24 | $0.22 | 90.1 | 0.37 |
| Amazon Bedrock | Cohere | Command-R+ | 128k | 21 | $6.00 | 47.9 | 0.50 |
| Amazon Bedrock | Meta | Llama 3 8B | 8k | 21 | $0.38 | 103.8 | 0.32 |
| Amazon Bedrock | Cohere | Command-R+ (Apr '24) | 128k | 20 | $6.00 | 47.0 | 0.50 |
| Amazon Bedrock | Meta | Llama 3.2 3B | 128k | 20 | $0.15 | 71.8 | 0.33 |
| Amazon Bedrock | Mistral | Mixtral 8x7B | 33k | 17 | $0.51 | 77.8 | 0.33 |
| Amazon Bedrock | Cohere | Command-R (Mar '24) | 128k | 15 | $0.75 | 109.2 | 0.34 |
| Amazon Bedrock | Mistral | Mistral 7B | 8k | 10 | $0.16 | 93.2 | 0.33 |
| Amazon Bedrock | Meta | Llama 3.2 1B | 128k | 10 | $0.10 | 117.0 | 0.31 |
| Amazon Bedrock | Meta | Llama 3.2 11B (Vision) | 128k | - | $0.16 | 143.9 | 0.33 |
| Amazon Bedrock | Anthropic | Claude 3.5 Sonnet (June) | 200k | - | $6.00 | 46.1 | 0.88 |
| Amazon Bedrock | Anthropic | Claude 3 Haiku | 200k | - | $0.50 | 108.0 | 1.03 |
| Amazon Bedrock | Anthropic | Claude 3.7 Sonnet | 200k | - | $6.00 | 47.1 | 0.86 |
| Amazon Bedrock | Cohere | Command-R | 128k | - | $0.75 | 108.4 | 0.34 |
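
The figures listed above can be used to shortlist models against a budget and a speed requirement. The sketch below is one hypothetical way to do that in Python: the `Row` type, the `shortlist` function, and the thresholds are illustrative assumptions, and only a handful of rows are copied from the listing. Models without an Intelligence Index value are skipped, since they cannot be ranked on that metric.

```python
from typing import List, NamedTuple, Optional


class Row(NamedTuple):
    """One model entry; field names are illustrative, values are copied from the listing."""
    model: str
    context_k: int                # context window, in thousands of tokens
    intelligence: Optional[int]   # Intelligence Index; None where no value is listed
    price_usd: float              # blended price, USD per 1M tokens
    tokens_per_s: float           # output speed
    latency_s: float              # time to first token, in seconds


# A small sample of rows, not the full set of 35 models.
ROWS = [
    Row("DeepSeek R1", 128, 60, 2.36, 83.6, 0.45),
    Row("Llama 3.3 70B", 128, 41, 0.71, 141.7, 0.59),
    Row("Nova Pro", 300, 37, 1.40, 106.4, 0.34),
    Row("Nova Lite", 300, 33, 0.10, 292.8, 0.33),
    Row("Nova Micro", 130, 28, 0.06, 328.5, 0.31),
    Row("Command-R", 128, None, 0.75, 108.4, 0.34),
]


def shortlist(rows: List[Row], max_price_usd: float, min_tokens_per_s: float) -> List[Row]:
    """Keep rows within the price and speed limits, ranked by Intelligence Index (highest first)."""
    eligible = [
        r for r in rows
        if r.intelligence is not None
        and r.price_usd <= max_price_usd
        and r.tokens_per_s >= min_tokens_per_s
    ]
    return sorted(eligible, key=lambda r: r.intelligence, reverse=True)


# Hypothetical requirement: at most $1.50 per 1M tokens and at least 100 tokens/s.
picks = shortlist(ROWS, max_price_usd=1.50, min_tokens_per_s=100.0)
print([r.model for r in picks])
```

Because the metrics are live, any such shortlist reflects only the snapshot it was built from.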

Key definitions

Context window: Maximum number of combined input & output tokens. Output tokens commonly have a significantly lower limit (varies by model).
Output Speed: Tokens per second received while the model is generating tokens (i.e., after the first chunk has been received from the API, for models which support streaming).
Latency: Time to first token received, in seconds, after the API request is sent. For models which do not support streaming, this represents the time to receive the completion.
Price: Price per token, represented as USD per million tokens. Price is a blend of Input & Output token prices (3:1 ratio).
Output Price: Price per token generated by the model (received from the API), represented as USD per million tokens.
Input Price: Price per token included in the request/message sent to the API, represented as USD per million tokens.
Time period: Metrics are 'live' and are based on the past 72 hours of measurements. Measurements are taken 8 times per day for single requests and 2 times per day for parallel requests.
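
The price, latency, and output speed definitions above can be sketched in Python as follows. This is a minimal illustration, not Artificial Analysis's measurement code. The function names and timestamp arguments are hypothetical. The 3:1 blend is assumed to mean three parts input price to one part output price. The source does not say whether tokens in the first chunk count toward output speed, so the sketch leaves that choice to the caller.

```python
def blended_price(input_price_usd: float, output_price_usd: float) -> float:
    """Blend input and output prices (USD per 1M tokens).

    Assumes the 3:1 ratio weights input three times as heavily as output.
    """
    return (3 * input_price_usd + output_price_usd) / 4


def time_to_first_token(request_sent_s: float, first_chunk_s: float) -> float:
    """Latency: seconds between sending the request and receiving the first chunk."""
    return first_chunk_s - request_sent_s


def output_speed(tokens_received: int, first_chunk_s: float, last_chunk_s: float) -> float:
    """Output speed: tokens per second over the window after the first chunk arrives.

    tokens_received is whatever token count the caller attributes to that window.
    """
    generation_time_s = last_chunk_s - first_chunk_s
    if generation_time_s <= 0:
        raise ValueError("last chunk must arrive after the first chunk")
    return tokens_received / generation_time_s


# Hypothetical figures for illustration only; they are not taken from the listing.
print(blended_price(1.00, 5.00))          # 2.0
print(time_to_first_token(10.0, 10.5))    # 0.5
print(output_speed(200, 10.5, 12.5))      # 100.0
```

Under this reading, a model with a low input price and a high output price gets a blended price much closer to its input price, which is worth keeping in mind for output-heavy workloads.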