Question 1

Where can I access DeepSeek V3.1 (Non-reasoning)?

Accepted Answer

DeepSeek V3.1 (Non-reasoning) is available through 9 API providers: Google Vertex, Amazon, CoreWeave, Novita, Together AI, Fireworks, DeepInfra (FP4), SambaNova, and Baseten (FP8). Each provider offers different performance characteristics and pricing.

Question 2

How many API providers offer DeepSeek V3.1 (Non-reasoning)?

Accepted Answer

DeepSeek V3.1 (Non-reasoning) is currently available through 9 API providers that we benchmark and track.

Question 3

Which provider is fastest for DeepSeek V3.1 (Non-reasoning)?

Accepted Answer

The fastest providers for DeepSeek V3.1 (Non-reasoning) by output speed are SambaNova (185.9 t/s), Baseten (FP8) (182.0 t/s), and Amazon (169.1 t/s). Output speed measures how quickly tokens are generated after the model starts responding.

Question 4

Which provider has the lowest latency for DeepSeek V3.1 (Non-reasoning)?

Accepted Answer

The providers with the lowest time to first token for DeepSeek V3.1 (Non-reasoning) are Baseten (FP8) (0.76s), Google Vertex (0.80s), and DeepInfra (FP4) (1.19s). Time to first token is the delay between sending a request and receiving the first output token, so lower values mean a quicker initial response.

Question 5

Which provider is cheapest for DeepSeek V3.1 (Non-reasoning)?

Accepted Answer

The most affordable providers for DeepSeek V3.1 (Non-reasoning) by blended price are DeepInfra (FP4) ($0.24 per 1M tokens), Novita ($0.34 per 1M tokens), and Baseten (FP8) ($0.42 per 1M tokens). Blended price is a weighted average of the cache hit, input, and output token prices in a 7:2:1 ratio.
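As a rough illustration of how a 7:2:1 blended price could be computed, here is a minimal sketch. The function name `blended_price` is hypothetical, and the cache hit price used in the example is an assumption: this page does not list cache hit prices, so the example sets it equal to Novita's input price (that is, no cache discount). The input and output prices are the Novita figures quoted in the answers on this page.

```python
def blended_price(cache_hit: float, input_price: float, output_price: float,
                  weights: tuple = (7, 2, 1)) -> float:
    """Weighted average price per 1M tokens for a cache hit/input/output ratio."""
    w_cache, w_in, w_out = weights
    total_weight = w_cache + w_in + w_out
    weighted_sum = w_cache * cache_hit + w_in * input_price + w_out * output_price
    return weighted_sum / total_weight


# Input ($0.27) and output ($1.00) prices are Novita's figures from this page.
# ASSUMPTION: the cache hit price is not listed here; it is set equal to the
# input price (no cache discount) purely for illustration.
novita = blended_price(cache_hit=0.27, input_price=0.27, output_price=1.00)
print(f"${novita:.2f} per 1M tokens")  # prints "$0.34 per 1M tokens"
```

Under that assumption the result rounds to the $0.34 quoted for Novita. Providers that discount cached tokens would come out lower than this formula gives with the undiscounted input price.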

Question 6

Which provider has the lowest input price for DeepSeek V3.1 (Non-reasoning)?

Accepted Answer

The providers with the lowest input token pricing for DeepSeek V3.1 (Non-reasoning) are DeepInfra (FP4) ($0.25 per 1M input tokens), Novita ($0.27 per 1M input tokens), and Baseten (FP8) ($0.50 per 1M input tokens).

Question 7

Which provider has the lowest output price for DeepSeek V3.1 (Non-reasoning)?

Accepted Answer

The providers with the lowest output token pricing for DeepSeek V3.1 (Non-reasoning) are DeepInfra (FP4) ($0.95 per 1M output tokens), Novita ($1.00 per 1M output tokens), and Baseten (FP8) ($1.50 per 1M output tokens).

Question 8

How much do prices vary across DeepSeek V3.1 (Non-reasoning) providers?

Accepted Answer

Blended prices for DeepSeek V3.1 (Non-reasoning) vary by up to 13.3x across providers. The most affordable is DeepInfra (FP4) at $0.24 per 1M tokens, and the most expensive is SambaNova at $3.15 per 1M tokens.

Question 9

How much does speed vary across DeepSeek V3.1 (Non-reasoning) providers?

Accepted Answer

Output speed for DeepSeek V3.1 (Non-reasoning) varies significantly across providers. SambaNova is the fastest at 185.9 t/s, which is 18.4x faster than DeepInfra (FP4) at 10.1 t/s.
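To make the speed gap concrete, the short sketch below recomputes the ratio from the two quoted figures and estimates how long each provider would take to stream a response. The 500-token response length and the helper name `generation_seconds` are assumptions chosen for illustration, and the estimate covers generation time only, not time to first token.

```python
FASTEST_TPS = 185.9  # SambaNova, output tokens per second (from this page)
SLOWEST_TPS = 10.1   # DeepInfra (FP4), output tokens per second (from this page)


def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Time spent generating tokens, excluding time to first token."""
    return output_tokens / tokens_per_second


speed_ratio = FASTEST_TPS / SLOWEST_TPS
print(f"{speed_ratio:.1f}x")  # prints "18.4x"

# ASSUMPTION: a 500-token response, picked only as an example size.
for name, tps in [("SambaNova", FASTEST_TPS), ("DeepInfra (FP4)", SLOWEST_TPS)]:
    print(f"{name}: {generation_seconds(500, tps):.1f}s")
```

For a 500-token response this works out to roughly 2.7 seconds versus about 49.5 seconds of generation time.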

Question 10

Which DeepSeek V3.1 (Non-reasoning) providers support JSON mode?

Accepted Answer

Seven of the 9 providers support JSON mode for DeepSeek V3.1 (Non-reasoning): Google Vertex, CoreWeave, Novita, Fireworks, DeepInfra (FP4), SambaNova, and Baseten (FP8).

Question 11

Which DeepSeek V3.1 (Non-reasoning) providers support function calling?

Accepted Answer

Eight of the 9 providers support function calling for DeepSeek V3.1 (Non-reasoning): Google Vertex, Amazon, CoreWeave, Novita, Fireworks, DeepInfra (FP4), SambaNova, and Baseten (FP8).

Question 12

Which is the best provider for DeepSeek V3.1 (Non-reasoning)?

Accepted Answer

The best provider for DeepSeek V3.1 (Non-reasoning) depends on your priorities: SambaNova offers the highest output speed, Baseten (FP8) has the lowest latency, and DeepInfra (FP4) provides the most competitive pricing.

Question 13

How do I choose a provider for DeepSeek V3.1 (Non-reasoning)?

Accepted Answer

When choosing a provider for DeepSeek V3.1 (Non-reasoning), consider: output speed (for throughput-intensive tasks), latency (for interactive applications requiring quick first responses), pricing (for cost-sensitive workloads), and API features like JSON mode or function calling.
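One way to apply these criteria is to filter providers by the features you need and then rank the remainder on a single metric. The sketch below is illustrative only: the `PROVIDERS` table and the `shortlist` helper are hypothetical names, the table covers just five of the nine providers, and it holds only figures quoted in the answers on this page, with `None` where no figure is quoted.

```python
# Figures quoted in the answers on this page; None means "not quoted here".
PROVIDERS = {
    "SambaNova":       {"json_mode": True,  "function_calling": True, "blended_price": 3.15, "output_tps": 185.9},
    "Baseten (FP8)":   {"json_mode": True,  "function_calling": True, "blended_price": 0.42, "output_tps": 182.0},
    "Amazon":          {"json_mode": False, "function_calling": True, "blended_price": None, "output_tps": 169.1},
    "Novita":          {"json_mode": True,  "function_calling": True, "blended_price": 0.34, "output_tps": None},
    "DeepInfra (FP4)": {"json_mode": True,  "function_calling": True, "blended_price": 0.24, "output_tps": 10.1},
}


def shortlist(required_features, metric, lower_is_better):
    """Keep providers with every required feature, ranked by one metric.

    Providers with no quoted value for the metric are skipped, not guessed.
    """
    rows = [
        (name, info[metric])
        for name, info in PROVIDERS.items()
        if all(info[feature] for feature in required_features)
        and info[metric] is not None
    ]
    return sorted(rows, key=lambda row: row[1], reverse=not lower_is_better)


# Fastest providers that support both JSON mode and function calling.
by_speed = shortlist(["json_mode", "function_calling"], "output_tps", lower_is_better=False)
# Cheapest providers (blended price) that support function calling.
by_price = shortlist(["function_calling"], "blended_price", lower_is_better=True)
print(by_speed)
print(by_price)
```

Ranking on one metric at a time keeps the trade-off visible: here SambaNova leads on speed but comes last on price, and DeepInfra (FP4) shows the reverse.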

Question 14

Does provider performance for DeepSeek V3.1 (Non-reasoning) change over time?

Accepted Answer

Yes, provider performance can vary over time due to infrastructure changes, load balancing, and updates. We continuously benchmark all providers and display historical performance trends in the "Over Time" charts.

Question 15

What are the overall capabilities of DeepSeek V3.1 (Non-reasoning)?

Accepted Answer

For information about the intelligence, capabilities, and modalities of DeepSeek V3.1 (Non-reasoning), and how it compares to other models, see the model overview page.



Provider	Context window	License	(unlabeled)	Output speed (t/s)	Time to first token (s)	End-to-end response time (s)	(unlabeled)
Google	164k	Open	--	138	0.80	4.43	--
Amazon Bedrock	128k	Open	--	172	1.45	4.35	--
CoreWeave	128k	Open	--	62	1.40	9.42	--
Novita	164k	Open	--	37	2.69	16.07	--
Together AI	131k	Open	--	--	--	--	--
Fireworks	164k	Open	--	--	--	--	--
DeepInfra	164k	Open	--	19	1.08	26.97	--
SambaNova	131k	Open	--	184	2.85	5.56	--
Baseten	164k	Open	--	182	0.76	3.51	--

DeepSeek V3.1 (Non-reasoning) API Provider Benchmarking & Analysis

Charts (not reproduced here): Pricing: Cache Hit, Input, and Output; Pricing: Blended Price; Pricing: Cache Discount; Output Speed vs. Price; Output Speed: DeepSeek V3.1 (Non-reasoning); Latency vs. Output Speed; Time to First Token: DeepSeek V3.1 Providers; End-to-End Response Time: DeepSeek V3.1 Providers.

