nvidia

NVIDIA: Llama 3.3 Nemotron Super 49B V1.5

nvidia/llama-3.3-nemotron-super-49b-v1.5

For $1, you can send approximately:
~286 messages

How do we get this number?

One message = ~7,000 input tokens + ~7,000 output tokens
Input cost per message: 7,000 x $0.10/M = $0.000700
Output cost per message: 7,000 x $0.40/M = $0.002800
Total cost per message: $0.003500
Messages for $1: 285.71
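
The arithmetic above can be reproduced with a short script. This is a minimal sketch, not the site's actual calculator: the function names `cost_per_message` and `messages_per_budget` are hypothetical, and the 7,000-token message size is the page's own assumption about a typical message. `Decimal` is used so the dollar amounts stay exact.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal("1000000")


def cost_per_message(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Dollar cost of one message, given per-million-token prices."""
    input_cost = Decimal(input_tokens) * Decimal(input_price_per_m) / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * Decimal(output_price_per_m) / TOKENS_PER_MILLION
    return input_cost + output_cost


def messages_per_budget(budget, per_message_cost):
    """How many messages a budget covers (fractional, before rounding)."""
    return Decimal(budget) / per_message_cost


# Figures taken from the page: 7,000 tokens each way, $0.10/M input, $0.40/M output.
per_message = cost_per_message(7000, 7000, "0.10", "0.40")
print(per_message)                                      # 0.0035
print(round(messages_per_budget("1", per_message), 2))  # 285.71
```

Longer or shorter messages change the result proportionally, so the ~286 figure is only as good as the 7,000-token assumption.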

Context window: 131k tokens
Max response: 0 tokens
Input price: $0.10 per million tokens
Output price: $0.40 per million tokens

Modalities

Input: text
Output: text

Description

Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model derived from Meta’s Llama-3.3-70B-Instruct with a 128K context. It’s post-trained for agentic workflows (RAG, tool calling) via SFT across math, code, science, and...