Blueprint by NVIDIA
LLM Inference with SGLang
Serve LLMs with SGLang on DGX Station (Qwen3-8B by default; Qwen3.6 MoE optional): prefix-cached multi-turn conversations, structured output, benchmarks, and inference-server guidance.
NVIDIA blueprint, Station, RadixAttention, Structured Output, Blackwell, DGX Station, Inference