NVIDIA: Nemotron 3 Ultra
nvidia/nemotron-3-ultra-550b-a55b
For $1, you can send approximately:
~47.6 messages
How do we get this number?
One message = ~7,000 input tokens + ~7,000 output tokens
Input cost per message: 7,000 x $0.50/M = $0.003500
Output cost per message: 7,000 x $2.50/M = $0.017500
Total cost per message: $0.021000
Messages for $1: 47.62
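The arithmetic above can be reproduced with a short sketch. The function name and parameters here are illustrative, not part of any NVIDIA or provider API; the token counts and prices are the ones listed on this page.

```python
def messages_per_dollar(input_tokens, output_tokens,
                        input_price_per_m, output_price_per_m,
                        budget=1.0):
    """Estimate how many messages a budget covers (hypothetical helper).

    Prices are in dollars per million tokens, as quoted on this page.
    """
    input_cost = input_tokens * input_price_per_m / 1_000_000
    output_cost = output_tokens * output_price_per_m / 1_000_000
    return budget / (input_cost + output_cost)


# One message = ~7,000 input tokens + ~7,000 output tokens
count = messages_per_dollar(7_000, 7_000, 0.50, 2.50)
print(f"{count:.2f}")  # 47.62
```

Changing the token counts shows how sensitive the estimate is to message size: shorter exchanges stretch the same dollar considerably further.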
Context window: 1.0M tokens
Max response: 16k tokens
Input price: $0.50 per million tokens
Output price: $2.50 per million tokens
Modalities
Input: text
Output: text
Description
NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE). Built on a hybrid Transformer-Mamba mixture-of-experts architecture, it...