Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) Intelligence, Performance & Price Analysis
Model summary
Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) is above average in intelligence and well priced compared to other open weight non-reasoning models of similar size. The model supports text input, outputs text, and has a 128k-token context window with knowledge up to December 2023.
Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) scores 14 on the Artificial Analysis Intelligence Index, placing it above average among comparable models (averaging 13).
Pricing for Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) is $0.00 per 1M input tokens (competitively priced, average: $0.20) and $0.00 per 1M output tokens (competitively priced, average: $0.56).
| Attribute | Value |
|---|---|
| Reasoning | No. This page shows the non-reasoning version of this model. A reasoning variant may also exist. |
| Input modality | Supports: text |
| Output modality | Supports: text |
| Knowledge cutoff | Dec 1, 2023 |
| Context window | 128k tokens (~192 A4 pages in 12 pt Arial) |
| Total parameters | 49B |
| License | NVIDIA Open Model License Agreement |
| Model weights | Hugging Face |
Metrics are compared against models of the same class:
- Non-reasoning models → compared only with other non-reasoning models
- Reasoning models → compared across both reasoning and non-reasoning
- Open weights models → compared only with other open weights models of the same size class:
  - Tiny: ≤4B parameters
  - Small: 4B–40B parameters
  - Medium: 40B–150B parameters
  - Large: >150B parameters
- Proprietary models → compared across proprietary and open weights models of the same price range, using a blended 3:1 input/output price ratio:
  - <$0.15 per 1M tokens
  - $0.15–$1 per 1M tokens
  - >$1 per 1M tokens
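The grouping rules above can be sketched in a few lines of Python. This is an illustration only, not the site's own implementation: it assumes the "blended 3:1" price is a weighted average of three parts input price to one part output price, and that boundary values (for example exactly 40B parameters, or exactly $1) fall into the lower group, which the page does not specify. The function names are hypothetical.

```python
def blended_price(input_price: float, output_price: float,
                  input_weight: float = 3.0, output_weight: float = 1.0) -> float:
    """Weighted average price in USD per 1M tokens.

    Assumption: "3:1" means three parts input price to one part output price.
    """
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


def size_class(total_params_billions: float) -> str:
    """Map a total parameter count (in billions) to the size classes listed above.

    Assumption: a model sitting exactly on a boundary goes to the smaller class.
    """
    if total_params_billions <= 4:
        return "Tiny"
    if total_params_billions <= 40:
        return "Small"
    if total_params_billions <= 150:
        return "Medium"
    return "Large"


def price_band(blended: float) -> str:
    """Map a blended price (USD per 1M tokens) to the price ranges listed above.

    Assumption: exactly $1 falls into the middle band.
    """
    if blended < 0.15:
        return "<$0.15"
    if blended <= 1.0:
        return "$0.15-$1"
    return ">$1"


if __name__ == "__main__":
    # 49B total parameters, as stated in the table on this page.
    print(size_class(49))
    # Blend of the comparison-group average prices quoted on this page.
    print(round(blended_price(0.20, 0.56), 2))
```

With the page's figures, a 49B model lands in the Medium class, and the quoted group averages ($0.20 input, $0.56 output) blend to roughly $0.29 per 1M tokens under the stated weighting.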
Highlights
Intelligence
Artificial Analysis Intelligence Index
Artificial Analysis Intelligence Index by Open Weights / Proprietary
Intelligence Evaluations
Openness
Artificial Analysis Openness Index: Results
Intelligence Index Comparisons
Intelligence vs. Price
Intelligence Index Token Use & Cost
Output Tokens Used to Run Artificial Analysis Intelligence Index
Cost to Run Artificial Analysis Intelligence Index
Context Window
Pricing
Pricing now includes a “Cache Hit Price” alongside Input and Output pricing, with new blend ratios.
Pricing: Cache Hit, Input, and Output
Pricing Comparison of Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) API Providers
Speed
Measured by Output Speed (tokens per second)
Output Speed
Output Speed vs. Price
Latency
Measured by Time (seconds) to First Token
Latency: Time To First Answer Token
End-to-End Response Time
Seconds to output 500 tokens, calculated based on time to first token, 'thinking' time for reasoning models, and output speed
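As a rough sketch of how such a figure could be derived, the snippet below adds the three components named in the description. It assumes they are simply additive and that time to first token does not already include thinking time; the page does not spell out the exact formula. The function name and the example inputs are hypothetical, not measured values for this model.

```python
def end_to_end_seconds(ttft_s: float,
                       output_tokens_per_s: float,
                       thinking_s: float = 0.0,
                       n_tokens: int = 500) -> float:
    """Estimate the seconds needed to receive n_tokens of output.

    Assumption: total = time to first token + thinking time + generation time,
    where generation time is n_tokens divided by output speed.
    """
    if output_tokens_per_s <= 0:
        raise ValueError("output speed must be positive")
    return ttft_s + thinking_s + n_tokens / output_tokens_per_s


if __name__ == "__main__":
    # Hypothetical inputs: 0.5 s to first token, 100 tokens/s, no thinking time
    # (a non-reasoning model would have thinking_s = 0).
    print(end_to_end_seconds(ttft_s=0.5, output_tokens_per_s=100.0))
```

For a non-reasoning model the thinking term is zero, so the estimate reduces to latency plus generation time.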
Model Size (Open Weights Models Only)
Model Size: Total and Active Parameters
Frequently Asked Questions
Common questions about Llama 3.3 Nemotron Super 49B v1 (Non-reasoning)
Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) was released on March 18, 2025.
Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) was created by NVIDIA.
Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) scores 14 (estimated) on the Artificial Analysis Intelligence Index, placing it above average among other open weight non-reasoning models of similar size (median: 13).
No, Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) is not a reasoning model. It provides direct responses without extended chain-of-thought reasoning.
Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) supports text input.
Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) supports text output.
No, Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) does not support image input. It can only process text.
No, Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) is not multimodal. It only supports text input.
Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) has a context window of 128k tokens. This determines how much text and conversation history the model can process in a single request.
Yes, Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) is open weights. The model weights are publicly available and can be downloaded for self-hosting.
Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) has 49 billion parameters.
Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) is released under the NVIDIA Open Model License Agreement. This license allows commercial use.
Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) achieves a score of 14 on the Artificial Analysis Intelligence Index. This composite benchmark evaluates models across reasoning, knowledge, mathematics, and coding.
Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) has a knowledge cutoff of December 2023. The model's training data includes information up to this date.
Llama 3.3 Nemotron Super 49B v1 (Non-reasoning) is an open weights model that can be downloaded and self-hosted.