NVIDIA catalog

Models, skills and blueprints for GPU jobs.

Browse NVIDIA workloads inside ICPX before creating a compute job.

Models: 146

Skills: 201

Blueprints: 72

nvidia

Active Speaker Detection

Detect and track speaker identities across video frames.

NVIDIA model, broadcast, localization, smpte, speaker detection

deepmind

alphafold2

Predicts the 3D structure of a protein from its amino acid sequence.

NVIDIA model, nim, Bionemo, Biology, protein folding

deepmind

alphafold2-multimer

Predicts the 3D structure of a protein from its amino acid sequence.

NVIDIA model, nim, Bionemo, Biology, protein folding

sqwh1lyrveic

AODT 1.2.1

AODT 1.2.1

sqwh1lyrveic

AODT 1.2.2

AODT 1.2.2

nvidia

Background Noise Removal

Removes unwanted noise from audio, improving speech intelligibility.

NVIDIA model, denoising, communications, nvidia ai for media, speech enhancement

nvidia

bevformer

Advanced transformer for multi-frame bird's-eye-view 3D perception in autonomous driving.

NVIDIA model, autonomous vehicles, bev, automotive, perception

baai

bge-m3

Embedding model for text retrieval tasks, excelling in dense, multi-vector, and sparse retrieval.

NVIDIA model, Embeddings, Retrieval Augmented Generation, Partner Endpoint, Download Available

mit

Boltz-2

Predict complex structures using Boltz-2.

NVIDIA model, nim, Bionemo, Biology, Protein Folding

nvidia

canary-1b-asr

Multi-lingual model supporting speech-to-text recognition and translation.

NVIDIA model, Automatic Speech Recognition, Automatic Speech Translation, NVIDIA NIM, NVIDIA Riva

resembleai

chatterbox-multilingual-tts

Natural and expressive voices in 23 languages. For voice agents and brand ambassadors.

NVIDIA model, DGX Spark, A100 SXM4 80GB, L40S, TTS

nvidia

conformer-ctc-asr

Automatic speech recognition model that transcribes Spanish speech to lowercase text with record-setting accuracy and performance.

NVIDIA model, ASR, streaming, Spanish, NVIDIA NIM

nvidia

cosmos-reason2-8b

Vision language model that excels in understanding the physical world using structured reasoning on videos or images.

NVIDIA model, video understanding, autonomous vehicles, industrial, Physical AI

nvidia

cosmos-transfer1-7b

Generates physics-aware video world states for physical AI development using text prompts and multiple spatial control inputs derived from real-world data or simulation.

NVIDIA model, Synthetic Data Generation, Autonomous Vehicles, Physical AI, robotics

nvidia

cosmos-transfer2.5-2b

Generates physics-aware video world states for physical AI development using text prompts and multiple spatial control inputs derived from real-world data or simulation.

NVIDIA model, Synthetic Data Generation, Autonomous Vehicles, Physical AI, robotics

nvidia

cosmos3-nano

Generates physics-aware videos from text prompts or an image prompt for physical AI development.

NVIDIA model, autonomous vehicles, Physical AI, robotics, text-to-world

nvidia

cosmos3-nano-reasoner

Vision language model that excels in understanding the physical world using structured reasoning on videos or images.

NVIDIA model, video understanding, autonomous vehicles, industrial, Physical AI

nvidia

cuopt

World-record accuracy and performance for complex route optimization.

NVIDIA model, Download Available, Route Optimization

deepseek-ai

deepseek-v4-flash

DeepSeek V4 Flash is a 284B MoE model with 1M-token context optimized for fast coding and agents.

NVIDIA model, B200, H200, H100 80GB HBM3, MoE

deepseek-ai

deepseek-v4-pro

DeepSeek V4 scales to 1M-token context windows with efficient MoE architecture for coding tasks.

NVIDIA model, Moe, reasoning, coding, agentic

mit

diffdock

Predicts the 3D structure of how a molecule interacts with a protein.

NVIDIA model, Chemistry, nim, BioNemo, Docking

google

diffusiongemma-26b-a4b-it

Diffusion-based 26B parameter LLM enabling parallel token generation for real-time text apps

NVIDIA model, chat, diffusion-llm, text-to-text, reasoning

abacusai

dracarys-llama-3.1-70b-instruct

Fine-tuned Llama 3.1 70B model for code generation, summarization, and multi-language tasks.

NVIDIA model, chat, Code Generation, Text-to-Text, Free Endpoint

meta

esm2-650m

Generates embeddings of proteins from their amino acid sequences.

NVIDIA model, nim, Protein Embedding, BioNemo, Biology

meta

esmfold

Predicts the 3D structure of a protein from its amino acid sequence.

NVIDIA model, biology, nim, Bionemo, protein folding

arc

evo2-40b

Evo 2 is a biological foundation model that is able to integrate information over long genomic sequences while retaining sensitivity to single-nucleotide changes.

NVIDIA model, DNA Generation, biology, nim, Bionemo

nvidia

eyecontact

Estimate gaze angles of a person in a video and redirect to make it frontal.

NVIDIA model, Download Available, telepresence, Nvidia Maxine, Digital Human

cadence

fidelity

Run computational fluid dynamics (CFD) simulations.

NVIDIA model, aerodynamics, cae, fluid-dynamics, simulation

ansys

fluent

Run computational fluid dynamics (CFD) simulations.

NVIDIA model, aerodynamics, cae, fluid-dynamics, simulation

black-forest-labs

FLUX.1-dev

FLUX.1 is a state-of-the-art suite of image generation models

NVIDIA model, Text-to-Image, Image Generation, Partner Endpoint, Download Available

black-forest-labs

FLUX.1-Kontext-dev

FLUX.1 Kontext is a multimodal model that enables in-context image generation and editing.

NVIDIA model, Text-to-Image, Image Generation, Partner Endpoint, Download Available

black-forest-labs

FLUX.1-schnell

FLUX.1-schnell is a distilled image generation model, producing high quality images at fast speeds

NVIDIA model, Together AI, Deepinfra, Text-to-Image, Image Generation

black-forest-labs

flux.2-klein-4b

FLUX.2-klein-4B is a distilled image generation and editing model, producing outputs at lightning speed.

NVIDIA model, image editing, Run-on-RTX, Text-to-Image, Image Generation

nvidia

fourcastnet

FourCastNet predicts global atmospheric dynamics of various weather / climate variables.

NVIDIA model, Download Available, Weather Simulation, AI Weather Prediction, climate science

google

gemma-2-2b-it

Advanced small language generative AI model for edge applications

NVIDIA model, chat, Chat, Text-to-Text, Language Generation

google

gemma-3n-e2b-it

An edge computing AI model which accepts text, audio and image input, ideal for resource-constrained environments

NVIDIA model, chat, language generation, speech recognition, Visual QA

google

gemma-3n-e4b-it

An edge computing AI model which accepts text, audio and image input, ideal for resource-constrained environments

NVIDIA model, chat, Together AI, Bitdeer, language generation

google

gemma-4-31b-it

Dense 31B model delivering frontier reasoning for coding, agentic workflows, and fine-tuning.

NVIDIA model, B200, H200, L40S, Together AI

nvidia

genmol

Fragment-Based Molecular Generation by Discrete Diffusion.

NVIDIA model, Chemistry, nim, BioNemo, Molecule Generation

nvidia

gliner-pii

GLiNER PII detects Personally Identifiable Information in text.

NVIDIA model, Free Endpoint, PII Detection, NVIDIA NIM

z-ai

glm-5.1

GLM-5.1 is a flagship LLM for agentic workflows, coding, and long-horizon reasoning tasks.

NVIDIA model, B200, H200, Together AI, Bitdeer

openai

gpt-oss-120b

Mixture of Experts (MoE) reasoning LLM (text-only) designed to fit within 80GB GPU.

NVIDIA model, Together AI, Eigen AI, CoreWeave, Deepinfra

openai

gpt-oss-20b

Smaller Mixture of Experts (MoE) text-only LLM for efficient AI reasoning and math

NVIDIA model, reasoning, text-to-text, chat, math

nvidia

ising-calibration-1-35b-a3b

Open VLM for quantum computer calibration chart understanding across a range of qubit modalities.

NVIDIA model, Quantum, reasoning, Vision Language Model, calibration

moonshotai

kimi-k2.6

1T multimodal MoE for long-horizon coding, agentic tool use, and image/video understanding.

NVIDIA model, B200, H200, H100 80GB HBM3, Together AI

nvidia

LipSync

Generative lip dubbing that syncs lips in a video to input audio.

NVIDIA model, broadcast, lipsync, localization, news

meta

llama-3.1-70b-instruct

Powers complex conversations with superior contextual understanding, reasoning and text generation.

NVIDIA model, B200, H100 NVL, A100 PG509 200, Together AI

meta

llama-3.1-8b-instruct

Advanced state-of-the-art model with language understanding, superior reasoning, and text generation.

NVIDIA model, B200, H100 NVL, A100 PG509 200, CoreWeave

nvidia

llama-3.1-nemoguard-8b-content-safety

Leading content safety model for enhancing the safety and moderation capabilities of LLMs

NVIDIA model, nemo guardrails, LLM safety, Safety and moderation, dialogue safety

nvidia

llama-3.1-nemoguard-8b-topic-control

Topic control model to keep conversations focused on approved topics, avoiding inappropriate content.

NVIDIA model, nemo guardrails, LLM safety, Safety and moderation, dialogue safety

nvidia

llama-3.1-nemotron-nano-8b-v1

Leading reasoning and agentic AI accuracy model for PC and edge.

NVIDIA model, H100 NVL, A100 SXM4 80GB, H200, advanced reasoning

nvidia

llama-3.1-nemotron-nano-vl-8b-v1

Multi-modal vision-language model that understands text and images and creates informative responses.

NVIDIA model, chat, doc intelligence, multiple image understanding, OCR

nvidia

llama-3.1-nemotron-safety-guard-8b-v3

Leading multilingual content safety model for enhancing the safety and moderation capabilities of LLMs

NVIDIA model, content moderation, llm safety, multilingual guard model, multilingual content safety

meta

llama-3.2-11b-vision-instruct

Cutting-edge vision-language model excelling in high-quality reasoning from images.

NVIDIA model, chat, Image-Text Retrieval, Visual QA, Image Captioning

meta

llama-3.2-1b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

NVIDIA model, B200, GH200 480GB, H100 NVL, Together AI

meta

llama-3.2-3b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

NVIDIA model, B200, GH200 480GB, H100 NVL, Together AI

meta

llama-3.2-90b-vision-instruct

Cutting-edge vision-language model excelling in high-quality reasoning from images.

NVIDIA model, chat, Together AI, Deepinfra, Image-Text Retrieval

meta

llama-3.3-70b-instruct

Advanced LLM for reasoning, math, general knowledge, and function calling

NVIDIA model, B200, H100 NVL, A100 PG509 200, Together AI

nvidia

llama-3.3-nemotron-super-49b-v1

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

NVIDIA model, B200, GH200 480GB, H100 NVL, advanced reasoning

nvidia

llama-3.3-nemotron-super-49b-v1.5

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

NVIDIA model, B200, GH200 480GB, H100 NVL, advanced reasoning

meta

llama-4-maverick-17b-128e-instruct

A general-purpose multimodal, multilingual MoE model with 128 experts and 17B active parameters.

NVIDIA model, chat, language generation, vision assistant, visual question answering

meta

llama-guard-4-12b

Multi-modal model that classifies the safety of input prompts as well as output responses.

NVIDIA model, LLM Multimodal Safety, Content Safety, Guardrail, Content Moderator

nvidia

llama-nemotron-embed-1b-v2

Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.

NVIDIA model, Download Available, Text-to-Embedding, Retrieval Augmented Generation, NeMo Retriever

nvidia

llama-nemotron-embed-vl-1b-v2

Multimodal question-answer retrieval representing user queries as text and documents as images.

NVIDIA model, nemo retriever, embedding, Partner Endpoint, Download Available

nvidia

llama-nemotron-rerank-1b-v2

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

NVIDIA model, nemo retriever, reranking, Download Available, Retrieval Augmented Generation

nvidia

llama-nemotron-rerank-vl-1b-v2

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

NVIDIA model, nemo retriever, reranking, Partner Endpoint, Download Available

nvidia

magpie-tts-multilingual

Natural and expressive voices in multiple languages. For voice agents and brand ambassadors.

NVIDIA model, TTS, NVIDIA NIM, NVIDIA Riva, multilingual

nvidia

magpie-tts-zeroshot

Expressive and engaging text-to-speech, generated from a short audio sample.

NVIDIA model, TTS, NVIDIA NIM, NVIDIA Riva, Text-to-Speech

nvidia

megatron-1b-nmt

Enable smooth global interactions in 36 languages.

NVIDIA model, Neural machine translation, NVIDIA NIM, Text Translation, Download Available

minimaxai

minimax-m2.7

MiniMax M2.7 is a 230B-parameter text-to-text AI model excelling in coding, reasoning, and office tasks.

NVIDIA model, B200, H200, H100 80GB HBM3, Together AI

minimaxai

minimax-m3

MiniMax M3 Preview is a multimodal MoE vision-language model with strong reasoning, coding, and tool-calling capabilities.

NVIDIA model, coding, text-to-text, reasoning, chat

mistralai

ministral-14b-instruct-2512

A general purpose VLM ideal for chat and instruction based use cases

NVIDIA model, L40S, chat, language generation, SLM

mistralai

mistral-large-3-675b-instruct-2512

A state-of-the-art general purpose MoE VLM ideal for chat, agentic and instruction based use cases.

NVIDIA model, chat, language generation, multimodal, agentic

mistralai

mistral-medium-3.5-128b

A high performing model for text generation, coding and agentic use cases

NVIDIA model, coding, reasoning, text, agentic

mistralai

mistral-nemotron

Built for agentic workflows, this model excels in coding, instruction following, and function calling

NVIDIA model, chat, language generation, instruction following, function calling

mistralai

mistral-small-4-119b-2603

Hybrid MoE model unifying instruct, reasoning, and coding with multimodal input and 256k context

NVIDIA model, chat, code generation, reasoning, image-to-text

mistralai

mixtral-8x7b-instruct-v0.1

An MOE LLM that follows instructions, completes requests, and generates creative text.

NVIDIA model, B200, L40, H100 NVL, Advanced Reasoning

nvidia

molmim

MolMIM performs controlled generation, finding molecules with the right properties.

NVIDIA model, Chemistry, nim, BioNemo, Molecule Generation

colabfold

msa-search

Generates a multiple sequence alignment from a query sequence and a protein sequence database search.

NVIDIA model, nim, Bionemo, Biology, Protein Folding

nvidia

nemoguard-jailbreak-detect

Industry leading jailbreak classification model for protection from adversarial attempts

NVIDIA model, nemo guardrails, llm security, NIM, Prompt Injection

nvidia

nemoretriever-ocr

Powerful OCR model for fast, accurate real-world image text extraction, layout, and structure analysis.

NVIDIA model, Table Extraction, nemo retriever, data ingestion, extraction

nvidia

nemoretriever-page-elements-v2

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

NVIDIA model, Object Detection, Chart Detection, nemo retriever, Table Detection

nvidia

nemoretriever-parse

Cutting-edge vision-language model excelling in retrieving text and metadata from images.

NVIDIA model, optical character recognition, nemo retriever, data ingestion, table extraction

nvidia

nemotron-3-content-safety

Multilingual, multimodal model for detecting unsafe and toxic content.

NVIDIA model, llm safety, safety and moderation, multilingual content safety, ai safety nemo guardrails

nvidia

nemotron-3-nano-30b-a3b

Open, efficient MoE model with 1M context, excelling in coding, reasoning, instruction following, tool calling, and more

NVIDIA model, chat, MoE, Reasoning, Long Context

nvidia

nemotron-3-nano-omni-30b-a3b-reasoning

Nemotron 3 Nano Omni is an omni-modal reasoning model that understands images, video, speech, and text.

NVIDIA model, Eigen AI, Bitdeer, Deepinfra, Lightning AI

nvidia

nemotron-3-super-120b-a12b

Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more

NVIDIA model, MoE, Reasoning, Chat, Long Context

nvidia

nemotron-3-ultra-550b-a55b

Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more

NVIDIA model, Agent, MoE, Frontier, Reasoning

nvidia

nemotron-3.5-content-safety

Multilingual, multimodal model for detecting unsafe and toxic content.

NVIDIA model, llm safety, safety and moderation, multilingual content safety, ai safety nemo guardrails

nvidia

nemotron-asr-streaming

Real-time speech recognition for English

NVIDIA model, Automatic Speech Recognition, NVIDIA NIM, NVIDIA Riva, Download Available

nvidia

nemotron-content-safety-reasoning-4b

A context‑aware safety model that applies reasoning to enforce domain‑specific policies.

NVIDIA model, NeMo Guardrails, Nemotron, reasoning, Safety and Moderation

nvidia

nemotron-graphic-elements-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

NVIDIA model, Object Detection, Chart Detection, nemo retriever, Table Detection

nvidia

nemotron-mini-4b-instruct

Optimized SLM for on-device inference and fine-tuned for roleplay, RAG and function calling

NVIDIA model, chat, Chat, Text-to-Text, Language Generation

nvidia

nemotron-nano-12b-v2-vl

Nemotron Nano 12B v2 VL enables multi-image and video understanding, along with visual Q&A and summarization capabilities.

NVIDIA model, chat, language generation, vision assistant, visual question answering

nvidia

nemotron-ocr-v1

Powerful OCR model for fast, accurate real-world image text extraction, layout, and structure analysis.

NVIDIA model, Table Extraction, nemo retriever, data ingestion, extraction

nvidia

nemotron-page-elements-v3

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

NVIDIA model, Object Detection, Chart Detection, Table Detection, data ingestion

nvidia

nemotron-parse

Cutting-edge vision-language model excelling in retrieving text and metadata from images.

NVIDIA model, Download Available, text and table extraction, document parsing, supported language - english

nvidia

nemotron-table-structure-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

NVIDIA model, Object Detection, Chart Detection, nemo retriever, Table Detection

nvidia

nemotron-voicechat

Nemotron 3 Voicechat

NVIDIA model, English, voice chat, NVIDIA NIM, Free Endpoint

nvidia

nv-embed-v1

Generates high-quality numerical embeddings from text inputs.

NVIDIA model, Non-Commercial Use Only, Text-to-Embedding, Retrieval Augmented Generation, Free Endpoint

nvidia

nv-embedcode-7b-v1

The NV-EmbedCode model is a 7B Mistral-based embedding model optimized for code retrieval, supporting text, code, and hybrid queries.

NVIDIA model, nemo retriever, Embedding, Free Endpoint, Retrieval Augmented Generation

nvidia

nv-embedqa-e5-v5

English text embedding model for question-answering retrieval.

NVIDIA model, Download Available, Embedding, run-on-rtx, Nemo retriever

nvidia

nv-yolox-page-elements-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

NVIDIA model, Object Detection, Data ingestion, Chart Detection, nemo retriever

nvidia

nvidia-nemotron-nano-9b-v2

High‑efficiency LLM with hybrid Transformer‑Mamba design, excelling in reasoning and agentic tasks.

NVIDIA model, chat, Together AI, Deepinfra, thinking budget

openfold

openfold2

Predicts the 3D structure of a protein from its amino acid sequence, multiple sequence alignments, and templates.

NVIDIA model, B200, DGX Spark, GH200 144G HBM3e, Biology

openfold

openfold3

OpenFold3 is a third-generation biomolecular foundation model that predicts the three-dimensional structures of molecular complexes (proteins, DNA, RNA, ligands)

NVIDIA model, B200, DGX Spark, GH200 144G HBM3e, Biology

baidu

paddleocr

Model for table extraction that receives an image as input, runs OCR on the image, and returns the text within the image and its bounding boxes.

NVIDIA model, B200, H100 NVL, A100 PG509 200, Optical Character Recognition

google

paligemma

Vision language model adept at comprehending text and visual inputs to produce informative responses

NVIDIA model, image, cv, Vision Assistant, vlm

nvidia

parakeet-1.1b-rnnt-multilingual-asr

High accuracy and optimized performance for transcription in 25 languages

NVIDIA model, Automatic Speech Recognition, NVIDIA NIM, NVIDIA Riva, Download Available

nvidia

parakeet-ctc-0.6b-asr

State-of-the-art accuracy and speed for English transcriptions.

NVIDIA model, ASR, Streaming, English, Batch

nvidia

parakeet-ctc-0.6b-es

Accurate and optimized Spanish-English transcriptions with punctuation and word timestamps.

NVIDIA model, ASR, Streaming, Spanish, NVIDIA NIM

nvidia

parakeet-ctc-0.6b-vi

Accurate and optimized Vietnamese-English transcriptions with punctuation and word timestamps.

NVIDIA model, ASR, Streaming, Vietnamese, NVIDIA NIM

nvidia

parakeet-ctc-0.6b-zh-cn

Record-setting accuracy and performance for Mandarin-English transcriptions.

NVIDIA model, ASR, Streaming, Mandarin, NVIDIA NIM

nvidia

parakeet-ctc-0.6b-zh-tw

Record-setting accuracy and performance for Mandarin Taiwanese English transcriptions.

NVIDIA model, ASR, Streaming, Taiwanese, NVIDIA NIM

nvidia

parakeet-ctc-1.1b-asr

Record-setting accuracy and performance for English transcription.

NVIDIA model, ASR, Streaming, English, batch

nvidia

parakeet-tdt-0.6b-v2

Accurate and optimized English transcriptions with punctuation and word timestamps

NVIDIA model, ASR, English, NVIDIA NIM, NVIDIA Riva

microsoft

phi-4-mini-instruct

Lightweight multilingual LLM powering AI applications in latency bound, memory/compute constrained environments

NVIDIA model, chat, CoreWeave, Chat, Text-to-Text

microsoft

phi-4-multimodal-instruct

Cutting-edge open multimodal model excelling in high-quality reasoning from image and audio inputs.

NVIDIA model, Speech Recognition, Visual QA, Language Generation, Chart and Table Understanding

ipd

proteinmpnn

ProteinMPNN is a deep learning model for predicting amino acid sequences for protein backbones.

NVIDIA model, biology, nim, BioNemo, Protein Generation

qwen

qwen-image

Qwen-Image is a text-to-image foundation model with advanced multilingual text rendering.

NVIDIA model, Together AI, Text-to-Image, Image Generation, Partner Endpoint

qwen

qwen-image-edit

Qwen-Image-Edit is an image editing model with multilingual text editing and strong subject consistency.

NVIDIA model, Text-to-Image, Image Generation, Partner Endpoint, Download Available

qwen

qwen3-next-80b-a3b-instruct

Qwen3-Next Instruct blends hybrid attention, sparse MoE, and stability boosts for ultra-long context AI.

NVIDIA model, B200, H200, H100 80GB HBM3, Together AI

qwen

qwen3.5-122b-a10b

122B MoE LLM (10B active) for coding, reasoning, multimodal chat. Agent-ready.

NVIDIA model, B200, GB200, chat, tool calling

qwen

qwen3.5-397b-a17b

Next-gen Qwen 3.5 VLM (400B MoE) brings advanced vision, chat, RAG, and agentic capabilities.

NVIDIA model, MoE, image-to-image, VLM, agentic

nvidia

Relighting

Re-illuminate people in video to match target lighting from a 360 HDRI environment map.

NVIDIA model, HDRI, remote contribution, lighting, nvidia ai for media

nvidia

rerank-qa-mistral-4b

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

NVIDIA model, Ranking, Retrieval Augmented Generation, Free Endpoint

ipd

rfdiffusion

A generative model of protein backbones for protein binder design.

NVIDIA model, A100 SXM4 80GB, L40S, A10G, biology

nvidia

riva-translate-1.6b

Enable smooth global interactions in 36 languages.

NVIDIA model, Neural machine translation, NVIDIA NIM, Text Translation, Download Available

nvidia

riva-translate-4b-instruct-v1_1

Translation model for 12 languages that supports few-shot example prompts.

NVIDIA model, nvidia nim, neural machine translation, Text Translation, Free Endpoint

sarvamai

sarvam-m

Multilingual, hybrid-reasoning model optimized for Indian-language tasks, programming, and mathematical reasoning.

NVIDIA model, coding, indic languages, hybrid, reasoning

bytedance

seed-oss-36b-instruct

ByteDance open-source LLM with long-context, reasoning, and agentic intelligence.

NVIDIA model, chat, thinking budget, reasoning, text-generation

siemens

simcenter-star-ccm+

Run computational fluid dynamics (CFD) simulations.

NVIDIA model, aerodynamics, cae, fluid-dynamics, simulation

upstage

solar-10.7b-instruct

Excels in NLP tasks, particularly in instruction-following, reasoning, and mathematics.

NVIDIA model, Non-Commercial Use Only, chat, Text-to-Text, Language Generation

nvidia

sparsedrive

End-to-end autonomous driving stack integrating perception, prediction, and planning with sparse scene representations for efficiency and safety.

NVIDIA model, autonomous vehicles, bev, av stack, automotive

cadence

spectre-x

Run large-scale electronics and chip design verification simulations

NVIDIA model, chip-design, electronic-design-automation, eda, semiconductor

stabilityai

stable-diffusion-3.5-large

Stable Diffusion 3.5 is a popular text-to-image generation model

NVIDIA model, Text-to-Image, Image Generation, Download Available

stepfun-ai

step-3.5-flash

200B open-source reasoning engine with sparse MoE powering frontier agentic AI.

NVIDIA model, chat, Agentic, Coding, Reasoning

stepfun-ai

step-3.7-flash

A sparse MoE multimodal reasoning model good for enterprise, agentic and coding tasks.

NVIDIA model, B200, H200, H100 80GB HBM3, chat

stockmark

stockmark-2-100b-instruct

Japanese-specialized large-language-model for enterprises to read and understand complex business documents.

NVIDIA model, B200, H100 NVL, A100 PG509 200, sovereign ai

nvidia

streampetr

StreamPETR offers efficient 3D object detection for autonomous driving by propagating sparse object queries temporally.

NVIDIA model, autonomous vehicles, bev, AV Stack, automotive

nvidia

Studio Voice

Enhance input speech recorded with low-quality microphones in noisy or reverberant environments, producing studio-quality speech.

NVIDIA model, broadcast, smpte, communications, mic quality

nvidia

synthetic-video-detector

NVIDIA Synthetic Video Detector is an AI-powered micro-service for detecting AI‑generated (synthetic) videos.

NVIDIA model, broadcast, media2, forensics, nvidia ai for media

0615409268808334

test_endpoint_20251218_133732_563_ouy_canary

For publishing test

microsoft

TRELLIS

Microsoft TRELLIS is a 3D AI model that generates high-quality 3D assets from text or image inputs.

NVIDIA model, text-to-3d, Run-on-RTX, image-to-3d, Download Available

nvidia

vista-3d

VISTA-3D is a specialized interactive foundation model for segmenting and annotating human anatomies.

NVIDIA model, Interactive Annotation, Image Segmentation, Non-Commercial Use Only, Download Available

openai

whisper-large-v3

Robust Speech Recognition via Large-Scale Weak Supervision.

NVIDIA model, B200, H100 80GB HBM3, ASR, AST


NVIDIA modelbroadcastlipsynclocalizationnews

meta

llama-3.1-70b-instruct

Powers complex conversations with superior contextual understanding, reasoning and text generation.

NVIDIA modelB200H100 NVLA100 PG509 200Together AI

meta

llama-3.1-8b-instruct

Advanced state-of-the-art model with language understanding, superior reasoning, and text generation.

NVIDIA modelB200H100 NVLA100 PG509 200CoreWeave

nvidia

llama-3.1-nemoguard-8b-content-safety

Leading content safety model for enhancing the safety and moderation capabilities of LLMs

NVIDIA modelnemo guardrailsLLM safetySafety and moderationdialogue safety

nvidia

llama-3.1-nemoguard-8b-topic-control

Topic control model to keep conversations focused on approved topics, avoiding inappropriate content.

NVIDIA modelnemo guardrailsLLM safetySafety and moderationdialogue safety

nvidia

llama-3.1-nemotron-nano-8b-v1

Leading reasoning and agentic AI accuracy model for PC and edge.

NVIDIA modelH100 NVLA100 SXM4 80GBH200advanced reasoning

nvidia

llama-3.1-nemotron-nano-vl-8b-v1

Multi-modal vision-language model that understands text and images and creates informative responses

NVIDIA modelchatdoc intelligencemultiple image understandingOCR

nvidia

llama-3.1-nemotron-safety-guard-8b-v3

Leading multilingual content safety model for enhancing the safety and moderation capabilities of LLMs

NVIDIA modelcontent moderationllm safetymultilingual guard modelmultilingual content safety

meta

llama-3.2-11b-vision-instruct

Cutting-edge vision-language model excelling in high-quality reasoning from images.

NVIDIA modelchatImage-Text RetrievalVisual QAImage Captioning

meta

llama-3.2-1b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

NVIDIA modelB200GH200 480GBH100 NVLTogether AI

meta

llama-3.2-3b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

NVIDIA modelB200GH200 480GBH100 NVLTogether AI

meta

llama-3.2-90b-vision-instruct

Cutting-edge vision-language model excelling in high-quality reasoning from images.

NVIDIA modelchatTogether AIDeepinfraImage-Text Retrieval

meta

llama-3.3-70b-instruct

Advanced LLM for reasoning, math, general knowledge, and function calling

NVIDIA modelB200H100 NVLA100 PG509 200Together AI

nvidia

llama-3.3-nemotron-super-49b-v1

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

NVIDIA modelB200GH200 480GBH100 NVLadvanced reasoning

nvidia

llama-3.3-nemotron-super-49b-v1.5

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

NVIDIA modelB200GH200 480GBH100 NVLadvanced reasoning

meta

llama-4-maverick-17b-128e-instruct

A general-purpose multimodal, multilingual 128-expert MoE model with 17B active parameters.

NVIDIA modelchatlanguage generationvision assistantvisual question answering

meta

llama-guard-4-12b

Multi-modal model to classify safety for input prompts as well as output responses.

NVIDIA modelLLM Multimodal SafetyContent SafetyGuardrailContent Moderator

nvidia

llama-nemotron-embed-1b-v2

Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.

NVIDIA modelDownload AvailableText-to-EmbeddingRetrieval Augmented GenerationNeMo Retriever

nvidia

llama-nemotron-embed-vl-1b-v2

Multimodal question-answer retrieval representing user queries as text and documents as images.

NVIDIA modelnemo retrieverembeddingPartner EndpointDownload Available

nvidia

llama-nemotron-rerank-1b-v2

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

NVIDIA modelnemo retrieverrerankingDownload AvailableRetrieval Augmented Generation

nvidia

llama-nemotron-rerank-vl-1b-v2

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

NVIDIA modelnemo retrieverrerankingPartner EndpointDownload Available

nvidia

magpie-tts-multilingual

Natural and expressive voices in multiple languages. For voice agents and brand ambassadors.

NVIDIA modelTTSNVIDIA NIMNVIDIA Rivamultilingual

nvidia

magpie-tts-zeroshot

Expressive and engaging text-to-speech, generated from a short audio sample.

NVIDIA modelTTSNVIDIA NIMNVIDIA RivaText-to-Speech

nvidia

megatron-1b-nmt

Enable smooth global interactions in 36 languages.

NVIDIA modelNeural machine translationNVIDIA NIMText TranslationDownload Available

minimaxai

minimax-m2.7

MiniMax M2.7 is a 230B-parameter text-to-text AI model excelling in coding, reasoning, and office tasks.

NVIDIA modelB200H200H100 80GB HBM3Together AI

minimaxai

minimax-m3

MiniMax M3 Preview is a multimodal MoE vision-language model with strong reasoning, coding, and tool-calling capabilities.

NVIDIA modelcodingtext-to-textreasoningchat

mistralai

ministral-14b-instruct-2512

A general purpose VLM ideal for chat and instruction based use cases

NVIDIA modelL40Schatlanguage generationSLM

mistralai

mistral-large-3-675b-instruct-2512

A state-of-the-art general purpose MoE VLM ideal for chat, agentic and instruction based use cases.

NVIDIA modelchatlanguage generationmultimodalagentic

mistralai

mistral-medium-3.5-128b

A high performing model for text generation, coding and agentic use cases

NVIDIA modelcodingreasoningtextagentic

mistralai

mistral-nemotron

Built for agentic workflows, this model excels in coding, instruction following, and function calling

NVIDIA modelchatlanguage generationinstruction followingfunction calling

mistralai

mistral-small-4-119b-2603

Hybrid MoE model unifying instruct, reasoning, and coding with multimodal input and 256k context

NVIDIA modelchatcode generationreasoningimage-to-text

mistralai

mixtral-8x7b-instruct-v0.1

An MoE LLM that follows instructions, completes requests, and generates creative text.

NVIDIA modelB200L40H100 NVLAdvanced Reasoning

nvidia

molmim

MolMIM performs controlled generation, finding molecules with the right properties.

NVIDIA modelChemistrynimBioNemoMolecule Generation

colabfold

msa-search

Generates a multiple sequence alignment from a query sequence and a protein sequence database search.

NVIDIA modelnimBionemoBiologyProtein Folding

nvidia

nemoguard-jailbreak-detect

Industry-leading jailbreak classification model for protection from adversarial attempts

NVIDIA modelnemo guardrailsllm securityNIMPrompt Injection

nvidia

nemoretriever-ocr

Powerful OCR model for fast, accurate real-world image text extraction, layout, and structure analysis.

NVIDIA modelTable Extractionnemo retrieverdata ingestionextraction

nvidia

nemoretriever-page-elements-v2

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

NVIDIA modelObject DetectionChart Detectionnemo retrieverTable Detection

nvidia

nemoretriever-parse

Cutting-edge vision-language model excelling in retrieving text and metadata from images.

NVIDIA modeloptical character recognitionnemo retrieverdata ingestiontable extraction

nvidia

nemotron-3-content-safety

Multilingual, multimodal model for detecting unsafe and toxic content.

NVIDIA modelllm safetysafety and moderationmultilingual content safetyai safety nemo guardrails

nvidia

nemotron-3-nano-30b-a3b

Open, efficient MoE model with 1M context, excelling in coding, reasoning, instruction following, tool calling, and more

NVIDIA modelchatMoEReasoningLong Context

nvidia

nemotron-3-nano-omni-30b-a3b-reasoning

Nemotron 3 Nano Omni is an omni-modal reasoning model that understands images, video, speech, text.

NVIDIA modelEigen AIBitdeerDeepinfraLightning AI

nvidia

nemotron-3-super-120b-a12b

Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more

NVIDIA modelMoEReasoningChatLong Context

nvidia

nemotron-3-ultra-550b-a55b

Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more

NVIDIA modelAgentMoEFrontierReasoning

nvidia

nemotron-3.5-content-safety

Multilingual, multimodal model for detecting unsafe and toxic content.

NVIDIA modelllm safetysafety and moderationmultilingual content safetyai safety nemo guardrails

nvidia

nemotron-asr-streaming

Real-time speech recognition for English

NVIDIA modelAutomatic Speech RecognitionNVIDIA NIMNVIDIA RivaDownload Available

nvidia

nemotron-content-safety-reasoning-4b

A context‑aware safety model that applies reasoning to enforce domain‑specific policies.

NVIDIA modelNeMo GuardrailsNemotronreasoningSafety and Moderation

nvidia

nemotron-graphic-elements-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

NVIDIA modelObject DetectionChart Detectionnemo retrieverTable Detection

nvidia

nemotron-mini-4b-instruct

Optimized SLM for on-device inference and fine-tuned for roleplay, RAG and function calling

NVIDIA modelchatChatText-to-TextLanguage Generation

nvidia

nemotron-nano-12b-v2-vl

Nemotron Nano 12B v2 VL enables multi-image and video understanding, along with visual Q&A and summarization capabilities.

NVIDIA modelchatlanguage generationvision assistantvisual question answering

nvidia

nemotron-ocr-v1

Powerful OCR model for fast, accurate real-world image text extraction, layout, and structure analysis.

NVIDIA modelTable Extractionnemo retrieverdata ingestionextraction

nvidia

nemotron-page-elements-v3

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

NVIDIA modelObject DetectionChart DetectionTable Detectiondata ingestion

nvidia

nemotron-parse

Cutting-edge vision-language model excelling in retrieving text and metadata from images.

NVIDIA modelDownload Availabletext and table extractiondocument parsingsupported language - english

nvidia

nemotron-table-structure-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

NVIDIA modelObject DetectionChart Detectionnemo retrieverTable Detection

nvidia

nemotron-voicechat

Nemotron 3 Voicechat

NVIDIA modelEnglishvoice chatNVIDIA NIMFree Endpoint

nvidia

nv-embed-v1

Generates high-quality numerical embeddings from text inputs.

NVIDIA modelNon-Commercial Use OnlyText-to-EmbeddingRetrieval Augmented GenerationFree Endpoint

nvidia

nv-embedcode-7b-v1

The NV-EmbedCode model is a 7B Mistral-based embedding model optimized for code retrieval, supporting text, code, and hybrid queries.

NVIDIA modelnemo retrieverEmbeddingFree EndpointRetrieval Augmented Generation

nvidia

nv-embedqa-e5-v5

English text embedding model for question-answering retrieval.

NVIDIA modelDownload AvailableEmbeddingrun-on-rtxNemo retriever

nvidia

nv-yolox-page-elements-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

NVIDIA modelObject DetectionData ingestionChart Detectionnemo retriever

nvidia

nvidia-nemotron-nano-9b-v2

High‑efficiency LLM with hybrid Transformer‑Mamba design, excelling in reasoning and agentic tasks.

NVIDIA modelchatTogether AIDeepinfrathinking budget

openfold

openfold2

Predicts the 3D structure of a protein from its amino acid sequence, multiple sequence alignments, and templates.

NVIDIA modelB200DGX SparkGH200 144G HBM3eBiology

openfold

openfold3

OpenFold3 is a third-generation biomolecular foundation model that predicts the three-dimensional structures of molecular complexes (proteins, DNA, RNA, ligands)

NVIDIA modelB200DGX SparkGH200 144G HBM3eBiology

baidu

paddleocr

Model for table extraction that receives an image as input, runs OCR on the image, and returns the text within the image and its bounding boxes.

NVIDIA modelB200H100 NVLA100 PG509 200Optical Character Recognition

google

paligemma

Vision language model adept at comprehending text and visual inputs to produce informative responses

NVIDIA modelimagecvVision Assistantvlm

nvidia

parakeet-1.1b-rnnt-multilingual-asr

High accuracy and optimized performance for transcription in 25 languages

NVIDIA modelAutomatic Speech RecognitionNVIDIA NIMNVIDIA RivaDownload Available

nvidia

parakeet-ctc-0.6b-asr

State-of-the-art accuracy and speed for English transcriptions.

NVIDIA modelASRStreamingEnglishBatch

nvidia

parakeet-ctc-0.6b-es

Accurate and optimized Spanish-English transcriptions with punctuation and word timestamps.

NVIDIA modelASRStreamingSpanishNVIDIA NIM

nvidia

parakeet-ctc-0.6b-vi

Accurate and optimized Vietnamese-English transcriptions with punctuation and word timestamps.

NVIDIA modelASRStreamingVietnameseNVIDIA NIM

nvidia

parakeet-ctc-0.6b-zh-cn

Record-setting accuracy and performance for Mandarin-English transcriptions.

NVIDIA modelASRStreamingMandarinNVIDIA NIM

nvidia

parakeet-ctc-0.6b-zh-tw

Record-setting accuracy and performance for Taiwanese Mandarin-English transcriptions.

NVIDIA modelASRStreamingTaiwaneseNVIDIA NIM

nvidia

parakeet-ctc-1.1b-asr

Record-setting accuracy and performance for English transcription.

NVIDIA modelASRStreamingEnglishbatch

nvidia

parakeet-tdt-0.6b-v2

Accurate and optimized English transcriptions with punctuation and word timestamps

NVIDIA modelASREnglishNVIDIA NIMNVIDIA Riva

microsoft

phi-4-mini-instruct

Lightweight multilingual LLM powering AI applications in latency-bound, memory- and compute-constrained environments

NVIDIA modelchatCoreWeaveChatText-to-Text

microsoft

phi-4-multimodal-instruct

Cutting-edge open multimodal model excelling in high-quality reasoning from image and audio inputs.

NVIDIA modelSpeech RecognitionVisual QALanguage GenerationChart and Table Understanding

ipd

proteinmpnn

ProteinMPNN is a deep learning model for predicting amino acid sequences for protein backbones.

NVIDIA modelbiologynimBioNemoProtein Generation

qwen

qwen-image

Qwen-Image is a text-to-image foundation model with advanced multilingual text rendering.

NVIDIA modelTogether AIText-to-ImageImage GenerationPartner Endpoint

qwen

qwen-image-edit

Qwen-Image-Edit is an image editing model with multilingual text editing and strong subject consistency.

NVIDIA modelText-to-ImageImage GenerationPartner EndpointDownload Available

qwen

qwen3-next-80b-a3b-instruct

Qwen3-Next Instruct blends hybrid attention, sparse MoE, and stability boosts for ultra-long context AI.

NVIDIA modelB200H200H100 80GB HBM3Together AI

qwen

qwen3.5-122b-a10b

122B MoE LLM (10B active) for coding, reasoning, multimodal chat. Agent-ready.

NVIDIA modelB200GB200chattool calling

qwen

qwen3.5-397b-a17b

Next-gen Qwen 3.5 VLM (400B MoE) brings advanced vision, chat, RAG, and agentic capabilities.

NVIDIA modelMoEimage-to-imageVLMagentic

nvidia

Relighting

Re-illuminate people in video to match target lighting from a 360 HDRI environment map.

NVIDIA modelHDRIremote contributionlightingnvidia ai for media

nvidia

rerank-qa-mistral-4b

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

NVIDIA modelRankingRetrieval Augmented GenerationFree Endpoint

ipd

rfdiffusion

A generative model of protein backbones for protein binder design.

NVIDIA modelA100 SXM4 80GBL40SA10Gbiology

nvidia

riva-translate-1.6b

Enable smooth global interactions in 36 languages.

NVIDIA modelNeural machine translationNVIDIA NIMText TranslationDownload Available

nvidia

riva-translate-4b-instruct-v1_1

Translation model for 12 languages with support for few-shot example prompts.

NVIDIA modelnvidia nimneural machine translationText TranslationFree Endpoint

sarvamai

sarvam-m

Multilingual, hybrid-reasoning model optimized for Indian-language tasks, programming, and mathematical reasoning.

NVIDIA modelcodingindic languageshybridreasoning

bytedance

seed-oss-36b-instruct

ByteDance open-source LLM with long-context, reasoning, and agentic intelligence.

NVIDIA modelchatthinking budgetreasoningtext-generation

siemens

simcenter-star-ccm+

Run computational fluid dynamics (CFD) simulations

NVIDIA modelaerodynamicscaefluid-dynamicssimulation

upstage

solar-10.7b-instruct

Excels in NLP tasks, particularly in instruction-following, reasoning, and mathematics.

NVIDIA modelNon-Commercial Use OnlychatText-to-TextLanguage Generation

nvidia

sparsedrive

End-to-end autonomous driving stack integrating perception, prediction, and planning with sparse scene representations for efficiency and safety.

NVIDIA modelautonomous vehiclesbevav stackautomotive

cadence

spectre-x

Run large-scale electronics and chip design verification simulations

NVIDIA modelchip-designelectronic-design-automationedasemiconductor

stabilityai

stable-diffusion-3.5-large

Stable Diffusion 3.5 is a popular text-to-image generation model

NVIDIA modelText-to-ImageImage GenerationDownload Available

stepfun-ai

step-3.5-flash

200B open-source reasoning engine with sparse MoE powering frontier agentic AI.

NVIDIA modelchatAgenticCodingReasoning

stepfun-ai

step-3.7-flash

A sparse MoE multimodal reasoning model good for enterprise, agentic and coding tasks.

NVIDIA modelB200H200H100 80GB HBM3chat

stockmark

stockmark-2-100b-instruct

Japanese-specialized large language model for enterprises to read and understand complex business documents.

NVIDIA modelB200H100 NVLA100 PG509 200sovereign ai

nvidia

streampetr

StreamPETR offers efficient 3D object detection for autonomous driving by propagating sparse object queries temporally.

NVIDIA modelautonomous vehiclesbevAV Stackautomotive

nvidia

Studio Voice

Enhance input speech recorded with low-quality microphones in noisy or reverberant environments, producing studio-quality speech.

NVIDIA modelbroadcastsmptecommunicationsmic quality

nvidia

synthetic-video-detector

NVIDIA Synthetic Video Detector is an AI-powered micro-service for detecting AI‑generated (synthetic) videos.

NVIDIA modelbroadcastmedia2forensicsnvidia ai for media

0615409268808334

test_endpoint_20251218_133732_563_ouy_canary

For publishing test

microsoft

TRELLIS

MSFT TRELLIS is a 3D AI model that generates high-quality 3D assets from text or image inputs.

NVIDIA modeltext-to-3dRun-on-RTXimage-to-3dDownload Available

nvidia

vista-3d

VISTA-3D is a specialized interactive foundation model for segmenting and annotating human anatomies.

NVIDIA modelInteractive AnnotationImage SegmentationNon-Commercial Use OnlyDownload Available

openai

whisper-large-v3

Robust Speech Recognition via Large-Scale Weak Supervision.

NVIDIA modelB200H100 80GB HBM3ASRAST

nvidia

accelerated-computing-cudf

Official NVIDIA-authored guidance for NVIDIA cuDF GPU DataFrames, pandas acceleration, dask-cuDF, ETL, joins, groupby, CSV/Parquet I/O, nullable semantics, and multi-GPU DataFrame workloads.

NVIDIA skillDeveloperData EngineerData ScientistcuDF

nvidia

aiq-deploy

Use when asked to install, deploy, run, validate, troubleshoot, or stop NVIDIA AI-Q Blueprint infrastructure.

NVIDIA skillDeveloperDevOps EngineerPlatform EngineerAI And Machine Learning

nvidia

aiq-research

Use when asked to run deep research or AI-Q research through a reachable NVIDIA AI-Q Blueprint backend.

NVIDIA skillDeveloperAI EngineerAI And Machine LearningNeMo Agent Toolkit

nvidia

cudaq-guide

CUDA-Q onboarding guide for installation, test programs, GPU simulation, QPU hardware, and quantum applications.

NVIDIA skillDeveloperQuantum ResearcherHpc DeveloperCuda Q

nvidia

cufolio

Use when a user asks to build, optimize, backtest, rebalance, or analyze a stock portfolio with Mean-CVaR, efficient frontiers, scenario generation, or NVIDIA cuOpt.

NVIDIA skillDeveloperAI EngineerData ScientistApplication Developer

nvidia

cuopt-developer

Modify, build, test, debug, and contribute to NVIDIA cuOpt (C++/CUDA, Python, server, CI). Use for solver internals, PRs, DCO, and code conventions.

NVIDIA skillDeveloperHpc DeveloperDeveloper ToolscuOpt

nvidia

cuopt-install

Install cuOpt for Python, C, or server via pip, conda, or Docker; verify the install. For building cuOpt from source, see cuopt-developer.

NVIDIA skillDeveloperDevOps EngineerApplication DeveloperAccelerated Computing

nvidia

cuopt-numerical-optimization-api-c

LP, MILP, and QP (beta) with cuOpt — C API only. Use when the user is embedding LP, MILP, or QP in C/C++.

NVIDIA skillDeveloperHpc DeveloperApplication DeveloperAccelerated Computing

nvidia

cuopt-numerical-optimization-api-cli

LP, MILP, and QP (beta) with cuOpt — CLI only (MPS files, cuopt_cli). Use when the user is solving LP, MILP, or QP from MPS via command line.

NVIDIA skillDeveloperApplication DeveloperAccelerated ComputingcuOpt

nvidia

cuopt-numerical-optimization-api-python

Solve LP, MILP, QP (beta) with cuOpt Python API — linear/quadratic objectives, integer variables, scheduling, portfolio, least squares.

NVIDIA skillDeveloperApplication DeveloperAccelerated ComputingcuOpt

nvidia

cuopt-numerical-optimization-formulation

LP, MILP, QP — concepts, problem-text parsing, and formulation patterns (parameters, constraints, decisions, objective). Concepts only; no API.

NVIDIA skillDeveloperData ScientistApplication DeveloperAccelerated Computing

nvidia

cuopt-routing-api-python

Vehicle routing (VRP, TSP, PDP) with cuOpt — Python API only. Use when the user is building or solving routing in Python.

NVIDIA skillDeveloperApplication DeveloperAccelerated ComputingcuOpt

nvidia

cuopt-routing-formulation

Vehicle routing (VRP, TSP, PDP) — problem types and data requirements. Domain concepts; no API or interface.

NVIDIA skillDeveloperData ScientistApplication DeveloperAccelerated Computing

nvidia

cuopt-server-api-python

cuOpt REST server — start server, endpoints, Python/curl client examples. Use when the user is deploying or calling the REST API.

NVIDIA skillDeveloperDevOps EngineerApplication DeveloperAccelerated Computing

nvidia

cuopt-server-common

cuOpt REST server — what it does and how requests flow. Domain concepts; no deploy or client code.

NVIDIA skillDeveloperDevOps EngineerApplication DeveloperSolutions Architect

nvidia

cuopt-skill-evolution

After solving a non-trivial problem, detect generalizable learnings and propose skill updates. Always active — applies to every interaction.

NVIDIA skillDeveloperDeveloper ToolscuOpt

nvidia

cuopt-user-rules

Base rules for end users calling NVIDIA cuOpt (routing/LP/MILP/QP/install/server). Not for cuOpt internals — use cuopt-developer for those.

NVIDIA skillDeveloperApplication DeveloperAccelerated ComputingcuOpt

nvidia

cupynumeric-hdf5

Read and write large cuPyNumeric arrays to HDF5 with Legate's parallel, distributed HDF5 I/O (legate.io.hdf5: to_file, from_file, from_file_batched). Use when a developer needs to save a cuPyNumeric array to an .h5/.hdf5 file, load an HDF5 dataset into a

NVIDIA skillDeveloperData ScientistHpc DevelopercuPyNumeric

nvidia

cupynumeric-install

Install and verify cuPyNumeric for Python — requirements, commands, verification. Source builds are out of scope.

NVIDIA skillDeveloperData ScientistHpc DevelopercuPyNumeric

nvidia

cupynumeric-migration-readiness

Pre-migration readiness assessor for porting NumPy to cuPyNumeric. Use BEFORE substantial porting work begins when the user asks whether code will scale on GPU, whether they should migrate to cuPyNumeric, which NumPy patterns transfer cleanly, what must b

NVIDIA skillDeveloperData ScientistHpc DevelopercuPyNumeric

nvidia

cupynumeric-parallel-data-load

Load a sharded, on-disk dataset (sharded .npy, Parquet/Arrow, raw binary, sharded HDF5, custom layouts) into a distributed cuPyNumeric ndarray via a manual partition + leaf @task launch with CPU/OMP/GPU variants. Use when no single-call loader fits, inclu

NVIDIA skillDeveloperData ScientistHpc DevelopercuPyNumeric

nvidia

dali-dynamic-mode

DALI imperative dynamic mode (`nvidia.dali.experimental.dynamic`, ndd): use when working on ndd code or migrating pipelines; skip pipeline-only tasks.

NVIDIA skillDeveloperAI EngineerMl EngineerAI And Machine Learning

nvidia

data-designer

Use when the user wants to create a dataset, generate synthetic data, or build a data generation pipeline.

NVIDIA skillDeveloperData EngineerAI EngineerMl Engineer

nvidia

deepstream-dev

NVIDIA DeepStream SDK 9.0 development with Python pyservicemaker API. Use when building video analytics pipelines, GStreamer-based video processing, TensorRT inference integration, object detection/tracking, or Kafka/message broker integration.

NVIDIA skillDeveloperAI EngineerApplication DeveloperAI And Machine Learning

nvidia

deepstream-import-vision-model

Use this skill to bring any vision model from HuggingFace or NVIDIA NGC into an NVIDIA DeepStream pipeline with end-to-end automation: ONNX download, SafeTensors export, TRT engine build, custom nvinfer bbox parser, multi-stream benchmark, and PDF report.

NVIDIA skillDeveloperAI EngineerMl EngineerApplication Developer

nvidia

dicom-metadata-extract

Used for extracting selected metadata from one DICOM file and flagging standard-tag PHI presence. Not for anonymization or clinical use.

NVIDIA skillDeveloperAI EngineerMl EngineerAI And Machine Learning

nvidia

dicom-series-preflight

Used for header-only preflight of one DICOM series folder before conversion or inference. Not for de-identification or clinical clearance.

NVIDIA skillDeveloperAI EngineerMl EngineerAI And Machine Learning

nvidia

dicom-series-to-volume

Used for converting one CT DICOM series folder to a HU NIfTI volume with affine evidence. Not for multi-frame DICOM or clinical use.

NVIDIA skillDeveloperAI EngineerMl EngineerAI And Machine Learning

nvidia

digital-health-clinical-asr-build

Stage 2 of the Clinical ASR Flywheel. Use when curating clinical terms, tagging IPA, and synthesizing a NeMo manifest. NOT for scoring (use /digital-health-clinical-asr-eval).

NVIDIA skillDeveloperAI EngineerData ScientistMl Engineer

nvidia

digital-health-clinical-asr-eval

Stage 3 of Clinical ASR Flywheel. Score a NeMo manifest, produce the five-section KER leaderboard (by-ipa_source diagnostic). Not for ASR auth (/riva-asr).

NVIDIA skillDeveloperAI EngineerData ScientistMl Engineer

nvidia

digital-health-clinical-asr-finetune

Stage 4 of the Clinical ASR Flywheel. Use when priority KER is above 0.3 to run stock NeMo SFT on Parakeet TDT v2 and offline cycle N+1 re-eval. NOT for generic word boosting (use /finetune-asr).

NVIDIA skillDeveloperAI EngineerData ScientistMl Engineer

nvidia

digital-health-clinical-asr-setup

Stage 1 of Clinical ASR Flywheel. Use when bootstrapping a cycle: NVCF+MW disclosure, NVIDIA_API_KEY check, deps install, TTS+ASR smoke test.

NVIDIA skillDeveloperAI EngineerMl EngineerNemotron for Digital Health

nvidia

dynamo-interconnect-check

Validate that a Dynamo deployment's NIXL/UCX/NCCL interconnect is ready for disaggregated serving over RDMA/NVLink. Use after recipe-runner brings a deployment up (especially disagg/multi-node) to confirm the KV transport is correct; use troubleshoot for

NVIDIA skillDeveloperAI EngineerDevOps EngineerPlatform Engineer

nvidia

dynamo-recipe-runner

Select, validate, patch, and deploy existing NVIDIA Dynamo Kubernetes recipes. Use for model/backend/GPU/deployment-mode recipe bring-up; use router-starter for router-only mode work and troubleshoot for broken deployments.

NVIDIA skillDeveloperAI EngineerDevOps EngineerPlatform Engineer

nvidia

dynamo-router-starter

Start or patch Dynamo router modes and run router endpoint smoke checks. Use for round-robin, KV-aware, least-loaded, or device-aware routing setup; use recipe-runner for recipe deployment and troubleshoot for failure diagnosis.

NVIDIA skillDeveloperAI EngineerDevOps EngineerPlatform Engineer

nvidia

dynamo-troubleshoot

Diagnose failed or unhealthy Dynamo deployments. Use when pods, model-cache jobs, PVCs, workers, frontend/router health, endpoints, or benchmark jobs fail; use recipe-runner/router-starter before this for normal bring-up.

NVIDIA skillDeveloperAI EngineerDevOps EngineerPlatform Engineer

nvidia

earth2studio-data-fetch

Fetch weather/climate data via Earth2Studio data sources for specific variables and times. Do NOT use for inference pipelines, model discovery, or installation.

NVIDIA skillDeveloperData ScientistResearch AcademicEarth2Studio

nvidia

earth2studio-deterministic-forecast

Build deterministic forecast scripts with Earth2Studio (model, data source, IO, inference). Do NOT use for ensemble, diagnostics, data-only fetch, or install.

NVIDIA skillDeveloperData ScientistResearch AcademicEarth2Studio

nvidia

earth2studio-discover

Find Earth2Studio models, data sources, and examples for a weather/climate use case. Do NOT use for writing inference code, downloading data, or installation.

NVIDIA skillDeveloperData ScientistResearch AcademicEarth2Studio

nvidia

earth2studio-install

Guide installing Earth2Studio via uv or pip, selecting model extras, and configuring the environment. Do NOT use for writing inference code, choosing models, or PhysicsNeMo questions.

NVIDIA skillDeveloperData ScientistResearch AcademicEarth2Studio

nvidia

holoscan-install-conda

Install Holoscan SDK v4.3+ via Conda in a CUDA 13 environment. Use for Conda installs; redirect CUDA 12 hosts to container/wheel.

NVIDIA skillDeveloperPlatform EngineerApplication DeveloperPhysical AI

nvidia

holoscan-install-container

Install Holoscan SDK via the NGC Docker container. Use for container-based installs; not for native apt/pip/Conda installs.

NVIDIA skillDeveloperDevOps EngineerPlatform EngineerApplication Developer

nvidia

holoscan-install-debian

Install Holoscan SDK natively on Ubuntu via apt. Use for C++ installs on Ubuntu; pair with /holoscan-install-wheel for Python.

NVIDIA skillDeveloperPlatform EngineerApplication DeveloperPhysical AI

nvidia

holoscan-install-source

Build Holoscan SDK from source via the in-tree ./run script. Use only when published packages don't meet the user's needs.

NVIDIA skill, Developer, Platform Engineer, Application Developer, Physical AI

nvidia

holoscan-install-wheel

Install Holoscan SDK Python wheel via pip into a venv. Use for Python installs; not for native C++/apt or Conda installs.

NVIDIA skill, Developer, Platform Engineer, Application Developer, Physical AI

nvidia

holoscan-setup

Guides Holoscan SDK installation: inspects the host, assesses platform compatibility, recommends an install method, and delegates to the matching install skill.

NVIDIA skill, Developer, Platform Engineer, Application Developer, Physical AI

nvidia

hsb-app

Discover and run Holoscan Sensor Bridge example applications on a connected devkit. Filters available apps by the user's platform, HSB software version, board type, and sensors. Supports timed execution, failure analysis, code-edit suggestions, and iterat

NVIDIA skill, Developer, Platform Engineer, Hands On Builder, Application Developer

nvidia

hsb-flash

Flash the FPGA on an HSB board connected to an NVIDIA devkit. Supports HSB Lattice boards (FPGA versions 2407, 2412, 2507, 2510) and Leopard Imaging VB1940 "all-in-one" cameras (FPGA versions 2507, 2510). Uses release-specific YAML manifests and board-typ

NVIDIA skill, Developer, Platform Engineer, Hands On Builder, Application Developer

nvidia

hsb-setup

Clone the latest NVIDIA Holoscan Sensor Bridge repo, ask which supported devkit is being used, configure the host per platform, build the correct demo container, run it, and verify HSB connectivity by pinging 192.168.0.2. Use for Holoscan Sensor Bridge se

NVIDIA skill, Developer, DevOps Engineer, Platform Engineer, Hands On Builder

nvidia

hsb-test

Execute QA test plans on Holoscan Sensor Bridge hardware. Reads a user-provided test document, filters tests by the user's setup, determines which tests can run automatically, executes them with pass/fail evaluation, and produces a structured test results

NVIDIA skill, Developer, Platform Engineer, Hands On Builder, Application Developer

nvidia

launch-nemo-rl

Playbook for launching, monitoring, stopping, and debugging NeMo-RL recipes on a Kubernetes cluster via the nrl-k8s CLI. Covers ephemeral vs long-lived RayCluster modes, iterating on runs, and debugging hung or failed training jobs.

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Ml Engineer

nvidia

mcore-create-issue

Investigate a failing GitHub Actions run or job and create a GitHub issue for the failure.

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Ml Engineer

nvidia

mcore-linting-and-formatting

Linting and formatting for Megatron-LM. Covers running autoformat.sh, tools (ruff, black, isort, pylint, mypy), and code style rules.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Megatron Core

nvidia

mcore-run-on-slurm

How to launch distributed Megatron-LM training jobs on a SLURM cluster. Covers a minimal sbatch skeleton, environment-variable setup for torch.distributed.run, CUDA_DEVICE_MAX_CONNECTIONS rules across hardware and parallelism modes, container conventions,

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

mcore-split-pr

Split a PR into multiple PRs to reduce the number of required CODEOWNERS reviewer groups.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Megatron Core

nvidia

mcore-testing

Test system for Megatron-LM. Covers test layout, recipe YAML structure, adding and running unit and functional tests, golden values, marker filters, and CI parity.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Megatron Core

nvidia

nemo-automodel-distributed-training

Guide for selecting and configuring distributed training strategies in NeMo AutoModel, including FSDP2, Megatron FSDP, DDP, and parallelism settings.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-automodel-launcher-config

Configure NeMo AutoModel job launches for interactive runs, Slurm clusters, and SkyPilot cloud execution.

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Ml Engineer

nvidia

nemo-automodel-model-onboarding

Guide for onboarding new model architectures into NeMo AutoModel, including architecture discovery, implementation patterns, registration, and validation.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, NeMo Framework

nvidia

nemo-automodel-recipe-development

Create and modify NeMo AutoModel training and evaluation recipes, including YAML structure, builders, and execution flow.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, NeMo Framework

nvidia

nemo-data-designer-plugin

Use when the user wants to create a dataset, generate synthetic data, or build a data generation pipeline.

NVIDIA skill, Developer, Data Engineer, AI Engineer, Ml Engineer

nvidia

nemo-evaluator-plugin

Use when working on the Evaluator plugin CLI, jobs, SDK-backed specs, metric types, or plugin-owned Evaluator skills.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

nemo-mbridge-mlm-bridge-training

Run Megatron-LM (MLM) and Megatron Bridge training with mock or real data. Covers correlation testing, available recipes, and multi-GPU examples.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-multi-node-slurm

Convert single-node scripts to multi-node Slurm sbatch jobs and debug common multi-node failures. Covers srun-native vs uv run torch.distributed approaches, container setup, NCCL timeouts, OOM sizing for MoE models, and interactive allocation.

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Ml Engineer

nvidia

nemo-mbridge-perf-activation-recompute

Validate and use selective and full activation recompute in Megatron Bridge to reduce GPU memory usage at the cost of extra compute.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-cpu-offloading

Validate and use CPU offloading in Megatron Bridge, including layer-level activation offloading and fractional optimizer state offloading with HybridDeviceOptimizer.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-cuda-graphs

Validate and use CUDA graph capture in Megatron Bridge, including local full-iteration graphs and Transformer Engine scoped graphs for attention, MLP, and MoE modules.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-expert-parallel-overlap

Validate and use MoE expert-parallel communication overlap in Megatron-Bridge, including overlap_moe_expert_parallel_comm, delay_wgrad_compute, and flex dispatcher backends such as DeepEP and HybridEP.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-hierarchical-context-parallel

Operational guide for enabling hierarchical context parallelism in Megatron-Bridge, including config knobs, code anchors, pitfalls, and verification.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-megatron-fsdp

Operational guide for enabling Megatron FSDP in Megatron-Bridge, including config knobs, code anchors, pitfalls, and verification.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-memory-tuning

Techniques for reducing peak GPU memory in Megatron Bridge — expandable segments, parallelism resizing, activation recompute, CPU offloading constraints, and common OOM fixes.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-moe-comm-overlap

MoE expert-parallel communication overlap in Megatron Bridge. Covers dispatch/combine overlap, flex dispatcher backends, and expert wgrad scheduling.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-moe-dispatcher-selection

Choose the right MoE token dispatcher (`alltoall`, DeepEP, or HybridEP) for the hardware, EP degree, and optimization stage. Summarizes patterns from DSV3, Qwen3, Qwen3-Next, and VLM bring-up work.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-moe-hardware-configs

Representative MoE training playbooks by hardware platform and model family. Summarizes rounded throughput bands, parallelism patterns, and common tuning stacks.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-moe-long-context

Long-context MoE training guidance for Megatron Bridge. Covers CP sizing, selective recompute, dispatcher choices, and practical patterns from DSV3, Qwen3, and Qwen3-Next long-context experiments.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-moe-optimization-workflow

Systematic workflow for MoE training optimization in Megatron Bridge, based on the Megatron-Core MoE paper. Covers the Three Walls framework, parallel folding, recompute strategy, dispatcher choice, and CUDA-graph bring-up.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-moe-vlm-training

Practical guidance for training MoE VLMs in Megatron Bridge. Compares FSDP and 3D-parallel approaches, using rounded lessons from Qwen3-VL, Qwen3-Next, and other multimodal experiments.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-parallelism-strategies

Operational guide for choosing and combining parallelism strategies in Megatron Bridge, including sizing rules, hardware topology mapping, and combined parallelism configuration.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-sequence-packing

Validate and use packed sequences and long-context training in Megatron-Bridge, distinguishing offline packed SFT for LLMs from in-batch packing for VLMs, and applying the right CP constraints.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-tp-dp-comm-overlap

Operational guide for enabling TP, DP, and PP communication overlap in Megatron-Bridge, including config knobs, code anchors, pitfalls, and verification.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-recipe-recommender

Recommend and customize Megatron Bridge recipes for a user's model, GPU count, and training goal. Indexes library recipes (pretrain/SFT/PEFT) and performance recipes.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-resiliency

Resiliency features in Megatron Bridge including fault tolerance, straggler detection, in-process restart, preemption, and re-run state machine.

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Ml Engineer

nvidia

nemo-retriever

Use when the user wants to search, query, extract, transcribe, describe, quote, filter, or aggregate across documents — PDFs, scanned forms / images (`.jpg` `.png` `.tiff`), Office (`.docx` `.pptx`), text (`.html` `.txt`), audio (`.mp3` `.wav` `.m4a`), or

NVIDIA skill, Developer, Data Engineer, AI Engineer, AI And Machine Learning

nvidia

nemo-rl-auto-research

Autonomous NeMo-RL research agent workflow for directed hypothesis testing and open-ended discovery. Guides agents through the full experiment lifecycle: understanding recipes and environments, wiring RL or NeMo-gym runs, launching reproducible baselines

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Research Academic

nvidia

nemo-rl-brev-etiquette

Brev instance operating guidance for NeMo-RL agents working in /home/ubuntu/RL with limited workspace disk, a larger /ephemeral volume, and optional /home/ubuntu/RL/.env secrets. Use when running nemo-rl-auto-research campaigns, experiments, training jobs

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Ml Engineer

nvidia

nemo-rl-docs

Documentation conventions for NeMo-RL. Covers docs/index.md updates and docstring format. Do NOT use for: bug fixes, test fixes, dependency bumps, refactoring, CI/CD changes, performance tuning, or any task that does not involve writing or updating docume

NVIDIA skill, Developer, AI Engineer, Ml Engineer, NeMo RL

nvidia

nemo-rl-session-memory

Manage durable working-session memory for coding agents. Use when a user asks to preserve or recover agent context across disconnects, VS Code restarts, long-running work, handoffs, or any session where important state should be written periodically under

NVIDIA skill, Developer, AI Engineer, Ml Engineer, NeMo RL

nvidia

nemoclaw-user-agent-skills

Describes the agent skills shipped with NemoClaw and how to access them by cloning the repository. Use when users ask about AI agent support, coding assistant integration, or the .agents/skills/ directory. Trigger keywords - nemoclaw agent skills, ai codi

NVIDIA skill, Developer, AI Engineer, Application Developer, NeMoClaw

nvidia

nemoclaw-user-configure-inference

Connects NemoClaw to a local inference server. Use when setting up Ollama, vLLM, TensorRT-LLM, NIM, or any OpenAI-compatible local model server with NemoClaw. Trigger keywords - nemoclaw local inference, ollama nemoclaw, vllm nemoclaw, local model server,

NVIDIA skill, Developer, AI Engineer, Platform Engineer, Application Developer

nvidia

nemoclaw-user-configure-security

Presents a risk framework for every configurable security control in NemoClaw. Use when evaluating security posture, reviewing sandbox security defaults, or assessing control trade-offs. Trigger keywords - nemoclaw security best practices, sandbox securit

NVIDIA skill, Developer, DevOps Engineer, Security Engineer, Platform Engineer

nvidia

nemoclaw-user-deploy-remote

Explains how to run NemoClaw on a remote GPU instance, including the deprecated Brev compatibility path and the preferred installer plus onboard flow. Use when deploying NemoClaw to a remote VM, onboarding a Brev instance, or migrating away from the legac

NVIDIA skill, Developer, DevOps Engineer, Platform Engineer, NeMoClaw

nvidia

nemoclaw-user-get-started

Installs NemoClaw, launches a sandbox, and runs the first agent prompt. Use when onboarding, installing, or launching a NemoClaw sandbox for the first time. Trigger keywords - nemoclaw quickstart, install nemoclaw openclaw sandbox, nemohermes quickstart,

NVIDIA skill, Developer, AI Engineer, Application Developer, NeMoClaw

nvidia

nemoclaw-user-manage-policy

Adds, removes, or modifies allowed endpoints in the sandbox policy. Use when customizing network policy, changing egress rules, or configuring sandbox endpoint access. Trigger keywords - customize nemoclaw network policy, sandbox egress policy configurati

NVIDIA skill, Developer, DevOps Engineer, Security Engineer, Platform Engineer

nvidia

nemoclaw-user-manage-sandboxes

Explains operational tasks after the quickstart: listing sandboxes, status and health checks, logs, diagnostics, port forwards, multiple sandboxes, credential reset, rebuilds, network presets, upgrades, and uninstall. Trigger keywords - manage nemoclaw sa

NVIDIA skill, Developer, DevOps Engineer, Platform Engineer, NeMoClaw

nvidia

nemoclaw-user-monitor-sandbox

Inspects sandbox health, traces agent behavior, and diagnoses problems. Use when monitoring a running sandbox, debugging agent issues, or checking sandbox logs. Trigger keywords - monitor nemoclaw sandbox, debug nemoclaw agent issues.

NVIDIA skill, Developer, DevOps Engineer, Platform Engineer, NeMoClaw

nvidia

nemoclaw-user-overview

Explains how OpenClaw, OpenShell, and NemoClaw form the ecosystem, NemoClaw's position in the stack, what NemoClaw adds beyond the community sandbox, and when to prefer NemoClaw versus integrating OpenShell and OpenClaw directly. Use when users ask about

NVIDIA skill, Developer, Application Developer, Solutions Architect, NeMoClaw

nvidia

nemoclaw-user-reference

Describes the NemoClaw integration layer and blueprint architecture and how they orchestrate compatible agent sandboxes. Use when looking up architecture, agent integration, plugin structure, or blueprint design. Trigger keywords - nemoclaw architecture,

NVIDIA skill, Developer, Platform Engineer, Application Developer, Solutions Architect

nvidia

nemotron-customize

Plan, configure, and chain repo-native Nemotron customization steps into single-step or multi-step pipelines: curation, translation, SFT/PEFT (AutoModel or Megatron-Bridge), pretraining/CPT, RL alignment (DPO/RLVR/GRPO/RLHF), BYOB/MCQ benchmarks, checkpoi

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

nemotron-policy-generator

Generates BYO custom safety policies for NVIDIA Nemotron content-safety guardrails — Nemotron-Content-Safety-Reasoning-4B (text) and multimodal Nemotron-3-Content-Safety. Produces a Markdown policy, JSON taxonomy, and drop-in inference prompts. Maps rough

NVIDIA skill, Developer, AI Engineer, Security Engineer, Solutions Architect

nvidia

nemotron-retrieval-recipes

Use when planning, debugging, tuning, evaluating, exporting, or deploying public Nemotron `embed`/`rerank` retrieval recipes.

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

nemotron-speech

Routes NVIDIA Nemotron Speech (Riva) NIM tasks — deploys, runs, and tests ASR, TTS, and NMT NIMs on build.nvidia.com or self-hosted.

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Application Developer

nvidia

nv-generate-ct-rflow

Used for generating synthetic CT volumes and masks with NV-Generate-CTMR rflow-ct. Not for production training data without review.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

nv-generate-mr

Used for generating synthetic body MRI volumes with NV-Generate-CTMR rflow-mr. Not for paired masks or production training data.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

nv-generate-mr-brain

Used for generating synthetic brain MRI volumes with NV-Generate-CTMR rflow-mr-brain. Not for production training data.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

nv-generate-mr-brain-finetune

Used for finetuning NV-Generate-CTMR MR-brain diffusion UNet from a NIfTI datalist. Not for clinical or production data approval.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

nv-generate-vae-finetune

Used for finetuning the NV-Generate-CTMR MAISI VAE from CT/MRI NIfTI datalists. Not for clinical or production data approval.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

nv-reason-cxr

Used for command-shape or live NV-Reason-CXR chest X-ray reasoning smoke tests. Not for diagnosis or clinical reporting.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

nv-segment-ct

Used for running NV-Segment-CT VISTA3D on CT NIfTI volumes and recording label-map evidence.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

nv-segment-ct-finetune

Used for smoke or dataset finetuning of NV-Segment-CT VISTA3D on CT NIfTI labels. Not for clinical validation.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

nv-segment-ctmr

Used for running NV-Segment-CTMR on CT or MRI NIfTI volumes and recording label-map evidence. Not for clinical interpretation.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

omniverse-cad-to-simready

Coordinate the end-to-end CAD/source-asset to SimReady workflow. Use for broad requests such as CAD to SimReady, source asset to simulation-ready USD, or prop packaging that require conversion, material/physics assignment, SimReady conformance, validation

NVIDIA skill, Developer, Application Developer, Simulation Engineer, Omniverse

nvidia

omniverse-realtime-viewer

Use as the top-level router for Omniverse Realtime Viewer USD app requests and focused viewer reference documents.

NVIDIA skill, Developer, Application Developer, Simulation Engineer, Omniverse

nvidia

omniverse-usd-performance-tuning

Top-level workflow skill for USD performance diagnosis and optimization. Use for slow loading, high memory, low FPS, or 'optimize my scene' requests; delegates auth/runtime setup to Phase 0 owners.

NVIDIA skill, Developer, Application Developer, Simulation Engineer, Omniverse

nvidia

physical-ai-defect-image-generation

Use when the user wants to orchestrate defect image generation, run associated setup, or handle outputs on OSMO. The Day 0 path handles cold-start with USD-to-ROI, image-edit augmentation, and AnomalyGen to create initial PCBA datasets. The Day 1 path per

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Physical AI

nvidia

physical-ai-infrastructure-setup-and-resilient-scaling

Use when the user wants to set up, scale, validate, or harden NVIDIA physical AI infrastructure for synthetic data generation workflows across local MicroK8s or Azure AKS, including Kubernetes clusters, inference endpoint deployment, OSMO deployment, work

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Platform Engineer

nvidia

physical-ai-neural-reconstruction

Router for NVIDIA NuRec/NRE: USDZ rendering, NCore conversion, 3DGS, gRPC sensor sim, PhysicalAI HF datasets. Do NOT use for SimReady or infra setup.

NVIDIA skill, Developer, AI Engineer, Simulation Engineer, Physical AI

nvidia

physical-ai-video-data-augmentation

Use when running video data augmentation and auto-labeling workflows on OSMO: flow selection, preflight, submit-time interpolation, monitoring, and output retrieval. Trigger keywords: video data augmentation, data enrichment, auto labeling, VDA demo, OSMO

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Physical AI

nvidia

physicsnemo-discover

Official NVIDIA-authored guidance for navigating PhysicsNeMo — pick the model, datapipe, or example for a SciML/AI4Science task (surrogates, forecasting, downscaling, physics-informed, inverse, generative). Points at existing files via live repo search; n

NVIDIA skill, Developer, Data Scientist, Research Academic, PhysicsNeMo

nvidia

rag-blueprint

NVIDIA RAG Blueprint — deploy, configure, troubleshoot, and manage. Handles any RAG action: deploy, install, start, enable, disable, toggle, change, configure, troubleshoot, debug, fix, shutdown, stop, or tear down any RAG feature or service (Agentic RAG,

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Platform Engineer

nvidia

rag-eval

Filesystem RAG benchmarks: corpus/, train.json, evaluate_rag.py (RAGAS quality). Not for prod monitoring, latency/throughput benchmarking (use rag-perf), or evals outside this repo layout.

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

rag-perf

Performance benchmarking for a deployed NVIDIA RAG Blueprint server: profiling pass + aiperf load test driven by a single YAML config. Not for accuracy / RAGAS scoring (use rag-eval) or for deploying / repairing services (use rag-blueprint).

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Ml Engineer

nvidia

skill-card-generator

Use only to generate or update a governance skill card for a specified existing agent skill directory. Do not use for explaining, listing, comparing, or discussing skill capabilities.

NVIDIA skill, Developer, Platform Engineer, Solutions Architect, Trustworthy AI

nvidia

tao-analyze-changenet-rca

Performs deep Root Cause Analysis (RCA) on NVIDIA TAO Visual ChangeNet classification experiments with image-evidence-driven investigation. Use when analyzing ChangeNet model failures, investigating poor recall / FAR / PASS-NO_PASS metrics, auditing visua

NVIDIA skill, AI Engineer, Data Scientist, Ml Engineer, Application Developer

nvidia

tao-analyze-gaps-visual-changenet

Performs gap analysis on NVIDIA TAO Visual ChangeNet (VCN) Classify experiments by invoking the data-services container (`tao_toolkit.data_services` from `versions.yaml`) directly via `docker run … gap_analysis vcn_aoi …` — picks the optimal decision thre

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-analyze-gaps-vlm-bcq

Extract false-positive and false-negative gaps from VLM binary-classification-question (BCQ, yes/no) predictions. Use after running VLM evaluation when you have a predictions JSON and need to identify failure cases for DEFT root cause analysis on a binary

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-convert-dataset-format

Run `tao-daft convert` to convert NVIDIA TAO DAFT datasets between supported formats. Do not use for non-DAFT data. Use when the user asks to convert a DAFT dataset, change DAFT format, change a TAO dataset format, or run `tao-daft convert`.

NVIDIA skill, Developer, Data Engineer, AI Engineer, Ml Engineer

nvidia

tao-finetune-clip

CLIP vision-language model for image-text retrieval, zero-shot classification, embedding extraction, ONNX export, and TensorRT deployment. Use when fine-tuning or training CLIP, running zero-shot classification, computing image embeddings, or deploying CL

NVIDIA skill, AI Engineer, Data Scientist, Ml Engineer, Application Developer

nvidia

tao-finetune-cosmos-embed

Cosmos-Embed1 video-text embedding for text-to-video retrieval, video-to-video search, semantic deduplication, and fine-tuning. Use when the user asks to "fine-tune Cosmos-Embed1", "run cosmos-embed inference", "export Cosmos-Embed1", "embed videos", or "

NVIDIA skill, AI Engineer, Data Scientist, Ml Engineer, Application Developer

nvidia

tao-finetune-cosmos-reason

Cosmos-Reason2-8B video QA supervised fine-tuning with FSDP parallelism. Use when training or evaluating video question-answering models, fine-tuning Cosmos-Reason2 with SFT, or working with Cosmos-RL. Trigger phrases include "fine-tune Cosmos-Reason", "C

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-finetune-huggingface-model

Fine-tune any HuggingFace CV / VLM / LLM model on local NVIDIA GPUs inside an NGC PyTorch container. Use when the user wants to fine-tune a HuggingFace model (full or LoRA), train a vision / VLM / LLM model end-to-end, generate a reproducible HF training

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-generate-image-grounding

Two-step image grounding pipeline: extracts referring expressions from (image, caption) pairs and grounds them to pixel-space bounding boxes via a VLM. Use when the user wants to ground captions to bboxes, generate phrase-grounded annotations, auto-label

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-generate-referring-expressions

Four-step image referring-expression pipeline: turns images plus KITTI bounding-box labels into region descriptions, scene captions, grounded referring expressions, and (optionally) verified expressions via VLM distillation. Use when the user wants to gen

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-generate-video-reasoning-annotations

Multi-step video annotation pipeline that turns raw videos into Chain-of-Thought training data — multi-level captions, structured descriptions, and QA pairs (MCQ, binary, open-ended) with reasoning traces, via VLM/LLM distillation. Use when the user wants

NVIDIA skill, TAO, Data Engineer, AI Engineer, Ml Engineer

nvidia

tao-launch-workflow

Shared launch intake for any TAO workflow or action. Use when the user wants to run TAO AutoML, train, evaluate, infer, export, generate TensorRT engines, or launch DEFT/workflow jobs on an execution platform.

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Ml Engineer

nvidia

tao-list-capabilities

Answer what the TAO Skill Bank plugin can do by generating the response from packaged application, data, model, AutoML, and platform manifests.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Application Developer

nvidia

tao-mine-aoi-images

Runs the DEFT embed-then-mine workflow for VCN AOI iterations — embeds the gap-analysis target parquet, embeds a source pool, and mines nearest-neighbour source images for downstream augmentation. Use as the immediate next step after `tao-route-visual-cha

NVIDIA skill, Developer, Data Engineer, AI Engineer, Data Scientist

nvidia

tao-port-huggingface-model

Integrate a HuggingFace Computer Vision model into the NVIDIA TAO Toolkit ecosystem (tao-core config, tao-pytorch trainer, tao-deploy TensorRT pipeline). Use when the user asks to "integrate a HuggingFace model into TAO", "add an HF model to TAO Toolkit",

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Application Developer

nvidia

tao-route-visual-changenet-samples

Routes the weakest VCN samples (output of `tao-analyze-gaps-visual-changenet`) into per-augmentation-module subsets — one parquet for k-NN mining, one for AnomalyGen (Cosmos SDG) — based on each module's label eligibility. Use as the immediate next step a

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-run-automl

Run AutoML / hyperparameter optimization (HPO) for NVIDIA TAO networks using AutoMLRunner. Handles algorithm selection (bayesian, hyperband, asha, bohb, llm, hybrid, autoresearch), WandB experiment tracking, job execution on any TAO SDK platform, result i

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-run-automl-deft-pipeline

Run the canonical NVIDIA AOI three-phase training pipeline — Phase 1 AutoML baseline (HPO), Phase 2 DEFT loop (RCA → SDG → mining → plain-train retrain), Phase 3 AutoML refinement on the DEFT-augmented dataset. This is the default entry point for any "run

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Application Developer

nvidia

tao-run-deft-aoi

Run the full DEFT AOI improvement loop for NVIDIA TAO VisualChangeNet / ChangeNet PCB inspection models: baseline evaluate, RCA, ingestion of customer-supplied pre-generated AnomalyGen images, k-NN mining, retraining, and deployment gating until FAR / rec

NVIDIA skill, AI Engineer, Ml Engineer, Application Developer, AI And Machine Learning

nvidia

tao-run-inference-service

Start, query, and stop a network-specific TAO inference microservice ({network_arch}-inference-microservice) by delegating container execution to the appropriate platform skill. Handles container image resolution, job-payload JSON construction, and the se

NVIDIA skill, AI Engineer, DevOps Engineer, Ml Engineer, Platform Engineer

nvidia

tao-run-on-brev

Brev managed GPU instances with Docker support. Use when running TAO training, evaluation, or inference on Brev GPU instances, managing Brev deployments, or dispatching TAO jobs through the Brev CLI. Trigger phrases include "run on Brev", "Brev GPU instan

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Ml Engineer

nvidia

tao-run-on-kubernetes

Kubernetes execution platform — submits TAO container jobs as single-pod k8s Jobs with NVIDIA GPU scheduling. Use when running on EKS / GKE / AKS / on-prem clusters with the NVIDIA GPU Operator installed, or when integrating TAO into an existing k8s-nativ

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Ml Engineer

nvidia

tao-run-on-lepton

DGX Cloud Lepton managed GPU compute platform with run/status/cancel interface. Use when submitting TAO jobs to DGX Cloud, dispatching training/eval/inference to Lepton GPU resources, or managing Lepton workspace deployments. Trigger phrases include "run

NVIDIA skill, AI Engineer, DevOps Engineer, Ml Engineer, Platform Engineer

nvidia

tao-run-on-local-docker

Local Docker execution for TAO SDK job containers using the host Docker daemon and NVIDIA GPU runtime. Use when running TAO jobs on the current machine or a directly attached Docker host. Trigger phrases include "run locally", "local Docker", "use my GPU"

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Ml Engineer

nvidia

tao-run-on-slurm

Remote SLURM GPU cluster execution over SSH with sbatch/srun, Pyxis/Enroot containers, and Lustre-backed results. Use when running TAO training/eval/inference jobs on an on-prem or DGX SLURM cluster. Trigger phrases include "run on SLURM", "submit sbatch"

NVIDIA skill, TAO, AI Engineer, DevOps Engineer, Ml Engineer

nvidia

tao-run-platform

TAO Execution SDK for submitting and monitoring GPU training jobs on supported platforms (Lepton, Brev, SLURM, local Docker, Kubernetes). Use when the user wants to run TAO jobs through the SDK, get job tracking, S3 I/O wrapping, multi-node distributed tr

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Ml Engineer

nvidia

tao-setup-nvidia-gpu-host

Host setup for TAO GPU backends. Checks and, after user approval, installs NVIDIA driver branch 580, CUDA Toolkit 13.0, and NVIDIA Container Toolkit 1.19.0 for Docker/local-Docker and Kubernetes GPU worker hosts. The `--check-only` path works on any Linux

NVIDIA skill, Developer, DevOps Engineer, Ml Engineer, Platform Engineer

nvidia

tao-train-action-recognition

Action recognition from video sequences. Supports RGB, optical flow, and joint (multi-stream) input types for classifying temporal actions in video clips. Use when training, evaluating, exporting, or running inference on a TAO action-recognition model. Tr

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-train-bevfusion

BEVFusion for multi-sensor 3D object detection. Fuses LiDAR point clouds and camera images in bird's-eye-view (BEV) space, used in autonomous driving for robust 3D perception. Use when training, evaluating, or running inference for a TAO BEVFusion model.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Application Developer

nvidia

tao-train-centerpose

CenterPose for keypoint / pose estimation. Detects object centers and regresses keypoint locations for 6-DoF object pose estimation. Use when training, evaluating, exporting, or running inference for a TAO CenterPose model. Trigger phrases include "train

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

tao-train-deformable-detr

Deformable DETR for 2D object detection. Uses deformable attention for efficient multi-scale feature processing, lighter than DINO with competitive accuracy. Use when training, evaluating, exporting, quantizing, or running inference for a TAO Deformable-D

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-train-depth-anything-v2

Monocular depth estimation using Metric Depth Anything v2 or Relative Depth Anything architectures. Predicts per-pixel depth from single RGB images. Use when training, evaluating, exporting, or running inference for a TAO monocular depth model. Trigger ph

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

tao-train-dino

DINO (DETR with Improved DeNoising Anchor Boxes) for 2D object detection. Transformer-based detector with denoising training, multi-scale features, and optional distillation support. Use when training, evaluating, exporting, distilling, quantizing, or run

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-train-fast-foundation-stereo

Real-time stereo depth estimation using FastFoundationStereo (FFS), the distilled bp2 commercial variant of FoundationStereo. Predicts disparity maps from stereo image pairs with ~10× lower latency than full FoundationStereo. Use when training, evaluating

NVIDIA skill, Developer, Robotics Developer, AI Engineer, Ml Engineer

nvidia

tao-train-foundation-stereo

Stereo depth estimation using FoundationStereo. Predicts disparity maps from stereo image pairs for 3D reconstruction. Use when training, evaluating, exporting, or running inference for a TAO FoundationStereo model. Trigger phrases include "train stereo d

NVIDIA skill, Developer, Robotics Developer, AI Engineer, Ml Engineer

nvidia

tao-train-grounding-dino

Grounding DINO for open-set object detection. Combines DINO-style detection with a BERT text encoder for language-guided detection — detects objects described by text prompts without a fixed class vocabulary. Use when training, evaluating, exporting, quan

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-train-image-classification

PyTorch-based TAO image classification. Supports a wide range of backbones (FAN, EfficientNet, ResNet, etc.) with distillation and quantization for deployment. Use when training, evaluating, distilling, quantizing, exporting, or running inference for a TA

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-train-mask-auto-encoder

Masked Auto-Encoder (MAE) for self-supervised pretraining and fine-tuning. Masks random patches and reconstructs them to learn visual representations; supports pretrain and finetune stages. Use when training, evaluating, exporting, or running inference fo

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-train-mask-auto-label

MAL (Mask Auto-Label) for weakly-supervised segmentation. Produces segmentation masks from minimal annotations (point or box annotations) using a ViT-MAE backbone. Use when training, evaluating, or running inference for a TAO MAL model. Trigger phrases in

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-train-mask-grounding-dino

Mask Grounding DINO for grounded instance segmentation. Extends Grounding DINO with a mask-prediction head for open-set segmentation guided by text prompts. Use when training, evaluating, exporting, quantizing, or running inference for a TAO Mask-Groundin

NVIDIA skill, AI Engineer, Data Scientist, Ml Engineer, Application Developer

nvidia

tao-train-mask2former

Mask2Former for universal image segmentation (panoptic, instance, and semantic). Transformer-based with masked attention for high-quality segmentation results. Use when training, evaluating, exporting, quantizing, or running inference for a TAO Mask2Forme

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Application Developer

nvidia

tao-train-metric-learning-recognition

Metric-learning recognition (ml-recog) for fine-grained visual recognition. Learns embeddings for retrieval-based matching (e.g., retail product recognition) using triplet / contrastive losses. Use when training, evaluating, exporting, or running inferenc

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-train-nvdinov2

NVDINOv2 for self-supervised visual representation learning. Trains vision transformers via self-distillation (teacher-student) without labels and produces general-purpose visual features. Use when training, distilling, exporting, or running inference for

NVIDIA skill, AI Engineer, Data Scientist, Ml Engineer, Application Developer

nvidia

tao-train-nvpanoptix3d

NVPanoptix3D for panoptic 3D scene reconstruction from posed RGB images. Produces 3D panoptic segmentation (semantic, instance, and panoptic masks) with occupancy completion. Built on a VGGT backbone with a Mask2Former-style head and 3D frustum reconstruc

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Application Developer

nvidia

tao-train-ocdnet

OCDNet for scene text detection. Detects arbitrary-oriented text regions in natural images using a differentiable binarization approach. Use when training, evaluating, exporting, pruning, quantizing, retraining, or running inference for a TAO OCDNet model

NVIDIA skill, AI Engineer, Data Scientist, Ml Engineer, Application Developer

nvidia

tao-train-ocrnet

OCRNet for scene text recognition. Recognizes text content from cropped text-region images and supports CTC and attention-based decoders. Use when training, evaluating, exporting, pruning, quantizing, retraining, or running inference for a TAO OCRNet mode

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-train-oneformer

OneFormer for universal image segmentation. Unifies panoptic, instance, and semantic segmentation with a single architecture using task-conditioned queries. Use when training, evaluating, exporting, quantizing, or running inference for a TAO OneFormer mod

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Application Developer

nvidia

tao-train-optical-inspection

Optical Inspection for defect detection using Siamese networks. Compares image pairs to detect manufacturing defects, anomalies, or quality issues. Use when training, evaluating, exporting, or running inference for a TAO Optical Inspection model on AOI /

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Application Developer

nvidia

tao-train-pointpillars

PointPillars for 3D object detection from LiDAR point clouds. Encodes point clouds into a pseudo-image via a pillar-based representation, then applies 2D detection — used in autonomous driving and robotics. Use when training, evaluating, exporting, prunin

NVIDIA skill, Developer, Robotics Developer, AI Engineer, Ml Engineer

nvidia

tao-train-pose-classification

Pose classification using ST-GCN (Spatial Temporal Graph Convolutional Network). Classifies skeleton sequences into action categories from pose-keypoint data. Use when training, evaluating, exporting, or running inference for a TAO pose-classification mod

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Application Developer

nvidia

tao-train-reid

Person re-identification (ReID). Learns discriminative embeddings to match the same person across different camera views, based on metric learning. Use when training, evaluating, exporting, or running inference for a TAO person re-identification model. Tr

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Application Developer

nvidia

tao-train-rtdetr

RT-DETR (Real-Time DEtection TRansformer) for 2D object detection. Designed for real-time inference with competitive accuracy and supports distillation and quantization for deployment optimization. Use when training, evaluating, distilling, quantizing, ex

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-train-segformer

SegFormer for semantic segmentation. Lightweight transformer-based architecture with hierarchical feature extraction, efficient for real-time segmentation tasks. Use when training, evaluating, exporting, quantizing, or running inference for a TAO SegForme

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

tao-train-single-step

Standard single-step train/eval/export workflow for any TAO model. Use when training a TAO model on a dataset without iterative data augmentation, AutoML, or DEFT loops. Trigger phrases include "single train run", "train then evaluate then export", "plain

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-train-sparse4d

Sparse4D for multi-camera temporal 3D object detection and tracking. Uses sparse queries with deformable attention across camera views and time for end-to-end 3D perception, with an instance bank for temporal tracking. Use when training, evaluating, expor

NVIDIA skill, Robotics Developer, AI Engineer, Ml Engineer, Application Developer

nvidia

tao-train-visual-changenet

Visual ChangeNet for binary image classification and segmentation in AOI defect detection. Use when training, evaluating, exporting, or running inference for PCB defect detection or visual inspection, comparing image pairs for PASS/NO_PASS classification,

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Application Developer

nvidia

tao-validate-dataset-format

Run `tao-daft validate` to check NVIDIA TAO DAFT datasets for structure, schema, and cross-reference errors. Do not use for non-DAFT formats. Use when the user asks to validate a DAFT dataset, check DAFT schema, validate a TAO dataset format, or run `tao-

NVIDIA skill, Developer, Data Engineer, AI Engineer, Data Scientist

nvidia

tilegym-adding-cutile-kernel

Add a new cuTile GPU kernel operator to TileGym. Covers dispatch registration in ops.py, cuTile backend implementation, __init__.py exports, test creation, and benchmark in tests/benchmark. Use when adding, creating, or implementing a new cuTile operator/

NVIDIA skill, Developer, Hpc Developer, Accelerated Computing, CUDA Tile

nvidia

tilegym-converting-cutile-to-julia

Converts cuTile Python GPU kernels (@ct.kernel) to cuTile.jl Julia equivalents. Handles kernel syntax translation, 0-indexed to 1-indexed conversion, broadcasting differences, memory layout (row-major to column-major), type system mapping, and launch API

NVIDIA skill, Developer, Application Developer, Hpc Developer, Accelerated Computing

nvidia

tilegym-converting-cutile-to-triton

Converts cuTile GPU kernels (@ct.kernel) to Triton (@triton.jit). Handles standard in-repo conversion, debugging (cudaErrorIllegalAddress, shape mismatch, numerical mismatch), and mapping cuTile idioms (ct.load/ct.store, ct.Constant, ct.launch) to Triton

NVIDIA skill, Developer, Application Developer, Hpc Developer, Accelerated Computing

nvidia

tilegym-cutile-autotuning

Use when adding, modifying, optimizing, or debugging cuTile autotuning code. Trigger signals: `exhaustive_search` / `replace_hints` / `hints_fn` / `cuda.tile.tune` in code, `autotune` in filenames, or correctness/performance issues in autotuned cuTile ker

NVIDIA skill, Developer, Application Developer, Hpc Developer, Accelerated Computing

nvidia

tilegym-cutile-python

Expert cuTile programming assistant. Write high-performance GPU kernels using cuTile's tile-based programming model with proper validation and optimization. Supports deep agent orchestration for complex multi-kernel tasks.

NVIDIA skill, Developer, AI Engineer, Application Developer, Hpc Developer

nvidia

tilegym-improve-cutile-kernel-perf

Iteratively optimize cuTile kernel performance through systematic profiling, bottleneck analysis, IR comparison, and targeted tuning. Covers tile sizes, occupancy, autotune configs, TMA, latency hints, persistent scheduling, num_ctas, flush_to_zero, and I

NVIDIA skill, Developer, Application Developer, Hpc Developer, Accelerated Computing

nvidia

tilegym-monkey-patch-kernels-to-transformers

Integrate TileGym kernels into Hugging Face `transformers` models by replacing selected submodules and class implementations in the library and patching the init/forward/weight-loading methods of selected classes before models are instantiated. Used when the

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Application Developer

nvidia

vss-ask-video

Use this skill to ask the VSS agent's video_understanding tool a fresh visual question about a recorded clip. Not for prior tool output, search hits, or metadata-answerable questions.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Application Developer

nvidia

vss-deploy-dense-captioning

Use this skill when deploying standalone RT-VLM dense captioning or calling its REST API (uploads, captions, streams, chat-completions, Kafka). Not for VSS profile deploy or video-search ingestion.

NVIDIA skill, Video Search and Summarization (VSS), AI Engineer, DevOps Engineer, Platform Engineer

nvidia

vss-deploy-detection-tracking-2d

Use this skill when the user wants to deploy, run, debug, tear down, or call the REST API of the RTVI-CV 2D detection / tracking microservice. Trigger when the user says things like 'deploy rtvi-cv', 'start warehouse 2d', 'add a stream', 'check rtvi-cv he

NVIDIA skill, Video Search and Summarization (VSS), AI Engineer, DevOps Engineer, Platform Engineer

nvidia

vss-deploy-detection-tracking-3d

Deploy and operate the RTVI-CV-3D microservice as MV3DT (`MODE=mv3dt`): per-camera DeepStream perception plus BEV Fusion over calibrated cameras. Supports the bundled sample dataset, custom video files, and RTSP streams, and chains to `vss-generate-video-

NVIDIA skill, Video Search and Summarization (VSS), AI And Machine Learning, DevOps Engineer, Platform Engineer

nvidia

vss-deploy-profile

Use to select, configure, deploy, verify, debug, or tear down a VSS profile (base, search, lvs, warehouse, edge). Not for standalone microservices — use the vss-deploy-* skill.

NVIDIA skill, Video Search and Summarization (VSS), AI Engineer, DevOps Engineer, Platform Engineer

nvidia

vss-deploy-video-embedding

Use this skill when deploying, operating, or integrating the VSS 3.2 GA RT-Embed Video Embedding microservice. Covers Docker Compose bring-up, GPU and storage prerequisites, the `/v1` REST API (file uploads, text and video embeddings, live RTSP streams, h

NVIDIA skill, Video Search and Summarization (VSS), AI And Machine Learning, DevOps Engineer, Ml Engineer

nvidia

vss-generate-video-calibration

Use to run AutoMagicCalib on local MP4s, RTSP, or the bundled sample dataset, and to deploy vss-auto-calibration when needed. Do not use for non-AMC calibration or runtime analytics.

NVIDIA skill, Video Search and Summarization (VSS), AI Engineer, DevOps Engineer, Hands On Builder

nvidia

vss-generate-video-report

Use this skill when producing a VSS analysis report — Mode A per-clip VLM, Mode B incident-range via video-analytics. Not for standalone video summarization, real-time alerts, or ad-hoc Q&A.

NVIDIA skill, Video Search and Summarization (VSS), AI Engineer, Ml Engineer, Application Developer

nvidia

vss-manage-alerts

Use for VSS alert workflows — real-time monitoring, Alert-Bridge subscriptions, Slack notifications, incident queries, camera onboarding. Not for non-alert analytics.

NVIDIA skill, Video Search and Summarization (VSS), AI And Machine Learning, DevOps Engineer, Production Operator

nvidia

vss-manage-video-io-storage

Use to call the VIOS REST API (sensor list, timelines, clip extraction, snapshots, add/delete sensors and streams). Not for VLM inference or search.

NVIDIA skill, Developer, DevOps Engineer, Platform Engineer, Application Developer

nvidia

vss-query-analytics

Use this skill when reading video-analytics metrics, incidents, alerts, and sensor data via the VA-MCP server (port 9901). Not for live VLM or incident-range narrative reports.

NVIDIA skill, Video Search and Summarization (VSS), AI Engineer, Platform Engineer, Production Operator

nvidia

vss-search-archive

Use to run top-level VSS fusion search on archived video, or to ingest video files / RTSP streams for search. Do NOT use for ad-hoc visual Q&A (use vss-ask-video), live captioning (use vss-deploy-dense-captioning), or video summarization and reports (use

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Application Developer

nvidia

vss-setup-behavior-analytics

Use to deploy the vss-behavior-analytics service standalone (entrypoint, config-source, optional calibration). Not for the full warehouse deploy.

NVIDIA skill, Video Search and Summarization (VSS), AI And Machine Learning, DevOps Engineer, Platform Engineer

nvidia

vss-setup-video-analytics-api

Use to deploy the vss-video-analytics-api REST service standalone (config-source, data-log bind, Elasticsearch, optional Kafka). Not for full warehouse deploy.

NVIDIA skill, Video Search and Summarization (VSS), AI And Machine Learning, DevOps Engineer, Platform Engineer

nvidia

vss-summarize-video

Use to summarize a recorded video via the LVS summarization microservice (HITL-gated) with a VLM fallback. Not for report generation or live RTSP captioning.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Application Developer

nvidia

accelerated-computing-cudf

Official NVIDIA-authored guidance for NVIDIA cuDF GPU DataFrames, pandas acceleration, dask-cuDF, ETL, joins, groupby, CSV/Parquet I/O, nullable semantics, and multi-GPU DataFrame workloads.

NVIDIA skill, Developer, Data Engineer, Data Scientist, cuDF

nvidia

aiq-deploy

Use when asked to install, deploy, run, validate, troubleshoot, or stop NVIDIA AI-Q Blueprint infrastructure.

NVIDIA skill, Developer, DevOps Engineer, Platform Engineer, AI And Machine Learning

nvidia

aiq-research

Use when asked to run deep research or AI-Q research through a reachable NVIDIA AI-Q Blueprint backend.

NVIDIA skill, Developer, AI Engineer, AI And Machine Learning, NeMo Agent Toolkit

nvidia

cudaq-guide

CUDA-Q onboarding guide for installation, test programs, GPU simulation, QPU hardware, and quantum applications.

NVIDIA skill, Developer, Quantum Researcher, Hpc Developer, Cuda Q

nvidia

cufolio

Use when a user asks to build, optimize, backtest, rebalance, or analyze a stock portfolio with Mean-CVaR, efficient frontiers, scenario generation, or NVIDIA cuOpt.

NVIDIA skill, Developer, AI Engineer, Data Scientist, Application Developer

nvidia

cuopt-developer

Modify, build, test, debug, and contribute to NVIDIA cuOpt (C++/CUDA, Python, server, CI). Use for solver internals, PRs, DCO, and code conventions.

NVIDIA skill, Developer, Hpc Developer, Developer Tools, cuOpt

nvidia

cuopt-install

Install cuOpt for Python, C, or server via pip, conda, or Docker; verify the install. For building cuOpt from source, see cuopt-developer.

NVIDIA skill, Developer, DevOps Engineer, Application Developer, Accelerated Computing

nvidia

cuopt-numerical-optimization-api-c

LP, MILP, and QP (beta) with cuOpt — C API only. Use when the user is embedding LP, MILP, or QP in C/C++.

NVIDIA skill, Developer, Hpc Developer, Application Developer, Accelerated Computing

nvidia

cuopt-numerical-optimization-api-cli

LP, MILP, and QP (beta) with cuOpt — CLI only (MPS files, cuopt_cli). Use when the user is solving LP, MILP, or QP from MPS via command line.

NVIDIA skill, Developer, Application Developer, Accelerated Computing, cuOpt

nvidia

cuopt-numerical-optimization-api-python

Solve LP, MILP, QP (beta) with cuOpt Python API — linear/quadratic objectives, integer variables, scheduling, portfolio, least squares.

NVIDIA skill, Developer, Application Developer, Accelerated Computing, cuOpt

nvidia

cuopt-numerical-optimization-formulation

LP, MILP, QP — concepts, problem-text parsing, and formulation patterns (parameters, constraints, decisions, objective). Concepts only; no API.

NVIDIA skill, Developer, Data Scientist, Application Developer, Accelerated Computing

nvidia

cuopt-routing-api-python

Vehicle routing (VRP, TSP, PDP) with cuOpt — Python API only. Use when the user is building or solving routing in Python.

NVIDIA skill, Developer, Application Developer, Accelerated Computing, cuOpt

nvidia

cuopt-routing-formulation

Vehicle routing (VRP, TSP, PDP) — problem types and data requirements. Domain concepts; no API or interface.

NVIDIA skill, Developer, Data Scientist, Application Developer, Accelerated Computing

nvidia

cuopt-server-api-python

cuOpt REST server — start server, endpoints, Python/curl client examples. Use when the user is deploying or calling the REST API.

NVIDIA skill, Developer, DevOps Engineer, Application Developer, Accelerated Computing

nvidia

cuopt-server-common

cuOpt REST server — what it does and how requests flow. Domain concepts; no deploy or client code.

NVIDIA skill, Developer, DevOps Engineer, Application Developer, Solutions Architect

nvidia

cuopt-skill-evolution

After solving a non-trivial problem, detect generalizable learnings and propose skill updates. Always active — applies to every interaction.

NVIDIA skill, Developer, Developer Tools, cuOpt

nvidia

cuopt-user-rules

Base rules for end users calling NVIDIA cuOpt (routing/LP/MILP/QP/install/server). Not for cuOpt internals — use cuopt-developer for those.

NVIDIA skill, Developer, Application Developer, Accelerated Computing, cuOpt

nvidia

cupynumeric-hdf5

Read and write large cuPyNumeric arrays to HDF5 with Legate's parallel, distributed HDF5 I/O (legate.io.hdf5: to_file, from_file, from_file_batched). Use when a developer needs to save a cuPyNumeric array to an .h5/.hdf5 file, load an HDF5 dataset into a

NVIDIA skill, Developer, Data Scientist, Hpc Developer, cuPyNumeric

nvidia

cupynumeric-install

Install and verify cuPyNumeric for Python — requirements, commands, verification. Source builds are out of scope.

NVIDIA skill, Developer, Data Scientist, Hpc Developer, cuPyNumeric

nvidia

cupynumeric-migration-readiness

Pre-migration readiness assessor for porting NumPy to cuPyNumeric. Use BEFORE substantial porting work begins when the user asks whether code will scale on GPU, whether they should migrate to cuPyNumeric, which NumPy patterns transfer cleanly, what must b

NVIDIA skill, Developer, Data Scientist, Hpc Developer, cuPyNumeric

nvidia

cupynumeric-parallel-data-load

Load a sharded, on-disk dataset (sharded .npy, Parquet/Arrow, raw binary, sharded HDF5, custom layouts) into a distributed cuPyNumeric ndarray via a manual partition + leaf @task launch with CPU/OMP/GPU variants. Use when no single-call loader fits, inclu

NVIDIA skill, Developer, Data Scientist, Hpc Developer, cuPyNumeric

nvidia

dali-dynamic-mode

DALI imperative dynamic mode (`nvidia.dali.experimental.dynamic`, ndd): use when working on ndd code or migrating pipelines; skip pipeline-only tasks.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

data-designer

Use when the user wants to create a dataset, generate synthetic data, or build a data generation pipeline.

NVIDIA skill, Developer, Data Engineer, AI Engineer, Ml Engineer

nvidia

deepstream-dev

NVIDIA DeepStream SDK 9.0 development with Python pyservicemaker API. Use when building video analytics pipelines, GStreamer-based video processing, TensorRT inference integration, object detection/tracking, or Kafka/message broker integration.

NVIDIA skill, Developer, AI Engineer, Application Developer, AI And Machine Learning

nvidia

deepstream-import-vision-model

Use this skill to bring any vision model from HuggingFace or NVIDIA NGC into an NVIDIA DeepStream pipeline with end-to-end automation: ONNX download, SafeTensors export, TRT engine build, custom nvinfer bbox parser, multi-stream benchmark, and PDF report.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Application Developer

nvidia

dicom-metadata-extract

Use for extracting selected metadata from one DICOM file and flagging standard-tag PHI presence. Not for anonymization or clinical use.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

dicom-series-preflight

Use for header-only preflight of one DICOM series folder before conversion or inference. Not for de-identification or clinical clearance.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

dicom-series-to-volume

Use for converting one CT DICOM series folder to a HU NIfTI volume with affine evidence. Not for multi-frame DICOM or clinical use.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

digital-health-clinical-asr-build

Stage 2 of the Clinical ASR Flywheel. Use when curating clinical terms, tagging IPA, and synthesizing a NeMo manifest. NOT for scoring (use /digital-health-clinical-asr-eval).

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

digital-health-clinical-asr-eval

Stage 3 of the Clinical ASR Flywheel. Score a NeMo manifest and produce the five-section KER leaderboard (by-ipa_source diagnostic). Not for ASR auth (use /riva-asr).

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

digital-health-clinical-asr-finetune

Stage 4 of the Clinical ASR Flywheel. Use when priority KER is above 0.3 to run stock NeMo SFT on Parakeet TDT v2 and offline cycle N+1 re-eval. NOT for generic word boosting (use /finetune-asr).

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

digital-health-clinical-asr-setup

Stage 1 of the Clinical ASR Flywheel. Use when bootstrapping a cycle: NVCF+MW disclosure, NVIDIA_API_KEY check, deps install, TTS+ASR smoke test.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Nemotron for Digital Health

nvidia

dynamo-interconnect-check

Validate that a Dynamo deployment's NIXL/UCX/NCCL interconnect is ready for disaggregated serving over RDMA/NVLink. Use after recipe-runner brings a deployment up (especially disagg/multi-node) to confirm the KV transport is correct; use troubleshoot for

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Platform Engineer

nvidia

dynamo-recipe-runner

Select, validate, patch, and deploy existing NVIDIA Dynamo Kubernetes recipes. Use for model/backend/GPU/deployment-mode recipe bring-up; use router-starter for router-only mode work and troubleshoot for broken deployments.

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Platform Engineer

nvidia

dynamo-router-starter

Start or patch Dynamo router modes and run router endpoint smoke checks. Use for round-robin, KV-aware, least-loaded, or device-aware routing setup; use recipe-runner for recipe deployment and troubleshoot for failure diagnosis.

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Platform Engineer

nvidia

dynamo-troubleshoot

Diagnose failed or unhealthy Dynamo deployments. Use when pods, model-cache jobs, PVCs, workers, frontend/router health, endpoints, or benchmark jobs fail; use recipe-runner/router-starter before this for normal bring-up.

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Platform Engineer

nvidia

earth2studio-data-fetch

Fetch weather/climate data via Earth2Studio data sources for specific variables and times. Do NOT use for inference pipelines, model discovery, or installation.

NVIDIA skill, Developer, Data Scientist, Research Academic, Earth2Studio

nvidia

earth2studio-deterministic-forecast

Build deterministic forecast scripts with Earth2Studio (model, data source, IO, inference). Do NOT use for ensemble, diagnostics, data-only fetch, or install.

NVIDIA skill, Developer, Data Scientist, Research Academic, Earth2Studio

nvidia

earth2studio-discover

Find Earth2Studio models, data sources, and examples for a weather/climate use case. Do NOT use for writing inference code, downloading data, or installation.

NVIDIA skill, Developer, Data Scientist, Research Academic, Earth2Studio

nvidia

earth2studio-install

Guide installing Earth2Studio via uv or pip, selecting model extras, and configuring the environment. Do NOT use for writing inference code, choosing models, or PhysicsNeMo questions.

NVIDIA skill, Developer, Data Scientist, Research Academic, Earth2Studio

nvidia

holoscan-install-conda

Install Holoscan SDK v4.3+ via Conda in a CUDA 13 environment. Use for Conda installs; redirect CUDA 12 hosts to container/wheel.

NVIDIA skill, Developer, Platform Engineer, Application Developer, Physical AI

nvidia

holoscan-install-container

Install Holoscan SDK via the NGC Docker container. Use for container-based installs; not for native apt/pip/Conda installs.

NVIDIA skill, Developer, DevOps Engineer, Platform Engineer, Application Developer

nvidia

holoscan-install-debian

Install Holoscan SDK natively on Ubuntu via apt. Use for C++ installs on Ubuntu; pair with /holoscan-install-wheel for Python.

NVIDIA skill, Developer, Platform Engineer, Application Developer, Physical AI

nvidia

holoscan-install-source

Build Holoscan SDK from source via the in-tree ./run script. Use only when published packages don't meet the user's needs.

NVIDIA skill, Developer, Platform Engineer, Application Developer, Physical AI

nvidia

holoscan-install-wheel

Install Holoscan SDK Python wheel via pip into a venv. Use for Python installs; not for native C++/apt or Conda installs.

NVIDIA skill, Developer, Platform Engineer, Application Developer, Physical AI

nvidia

holoscan-setup

Guides Holoscan SDK installation: inspects the host, assesses platform compatibility, recommends an install method, and delegates to the matching install skill.

NVIDIA skill, Developer, Platform Engineer, Application Developer, Physical AI

nvidia

hsb-app

Discover and run Holoscan Sensor Bridge example applications on a connected devkit. Filters available apps by the user's platform, HSB software version, board type, and sensors. Supports timed execution, failure analysis, code-edit suggestions, and iterat

NVIDIA skill, Developer, Platform Engineer, Hands On Builder, Application Developer

nvidia

hsb-flash

Flash the FPGA on an HSB board connected to an NVIDIA devkit. Supports HSB Lattice boards (FPGA versions 2407, 2412, 2507, 2510) and Leopard Imaging VB1940 "all-in-one" cameras (FPGA versions 2507, 2510). Uses release-specific YAML manifests and board-typ

NVIDIA skill, Developer, Platform Engineer, Hands On Builder, Application Developer

nvidia

hsb-setup

Clone the latest NVIDIA Holoscan Sensor Bridge repo, ask which supported devkit is being used, configure the host per platform, build the correct demo container, run it, and verify HSB connectivity by pinging 192.168.0.2. Use for Holoscan Sensor Bridge se

NVIDIA skill, Developer, DevOps Engineer, Platform Engineer, Hands On Builder

nvidia

hsb-test

Execute QA test plans on Holoscan Sensor Bridge hardware. Reads a user-provided test document, filters tests by the user's setup, determines which tests can run automatically, executes them with pass/fail evaluation, and produces a structured test results

NVIDIA skill, Developer, Platform Engineer, Hands On Builder, Application Developer

nvidia

launch-nemo-rl

Playbook for launching, monitoring, stopping, and debugging NeMo-RL recipes on a Kubernetes cluster via the nrl-k8s CLI. Covers ephemeral vs long-lived RayCluster modes, iterating on runs, and debugging hung or failed training jobs.

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Ml Engineer

nvidia

mcore-create-issue

Investigate a failing GitHub Actions run or job and create a GitHub issue for the failure.

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Ml Engineer

nvidia

mcore-linting-and-formatting

Linting and formatting for Megatron-LM. Covers running autoformat.sh, tools (ruff, black, isort, pylint, mypy), and code style rules.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Megatron Core

nvidia

mcore-run-on-slurm

How to launch distributed Megatron-LM training jobs on a SLURM cluster. Covers a minimal sbatch skeleton, environment-variable setup for torch.distributed.run, CUDA_DEVICE_MAX_CONNECTIONS rules across hardware and parallelism modes, container conventions,

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

mcore-split-pr

Split a PR into multiple PRs to reduce the number of required CODEOWNERS reviewer groups.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Megatron Core

nvidia

mcore-testing

Test system for Megatron-LM. Covers test layout, recipe YAML structure, adding and running unit and functional tests, golden values, marker filters, and CI parity.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Megatron Core

nvidia

nemo-automodel-distributed-training

Guide for selecting and configuring distributed training strategies in NeMo AutoModel, including FSDP2, Megatron FSDP, DDP, and parallelism settings.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-automodel-launcher-config

Configure NeMo AutoModel job launches for interactive runs, Slurm clusters, and SkyPilot cloud execution.

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Ml Engineer

nvidia

nemo-automodel-model-onboarding

Guide for onboarding new model architectures into NeMo AutoModel, including architecture discovery, implementation patterns, registration, and validation.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, NeMo Framework

nvidia

nemo-automodel-recipe-development

Create and modify NeMo AutoModel training and evaluation recipes, including YAML structure, builders, and execution flow.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, NeMo Framework

nvidia

nemo-data-designer-plugin

Use when the user wants to create a dataset, generate synthetic data, or build a data generation pipeline.

NVIDIA skill, Developer, Data Engineer, AI Engineer, Ml Engineer

nvidia

nemo-evaluator-plugin

Use when working on the Evaluator plugin CLI, jobs, SDK-backed specs, metric types, or plugin-owned Evaluator skills.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

nemo-mbridge-mlm-bridge-training

Run Megatron-LM (MLM) and Megatron Bridge training with mock or real data. Covers correlation testing, available recipes, and multi-GPU examples.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-multi-node-slurm

Convert single-node scripts to multi-node Slurm sbatch jobs and debug common multi-node failures. Covers srun-native vs uv run torch.distributed approaches, container setup, NCCL timeouts, OOM sizing for MoE models, and interactive allocation.

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Ml Engineer

nvidia

nemo-mbridge-perf-activation-recompute

Validate and use selective and full activation recompute in Megatron Bridge to reduce GPU memory usage at the cost of extra compute.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-cpu-offloading

Validate and use CPU offloading in Megatron Bridge, including layer-level activation offloading and fractional optimizer state offloading with HybridDeviceOptimizer.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-cuda-graphs

Validate and use CUDA graph capture in Megatron Bridge, including local full-iteration graphs and Transformer Engine scoped graphs for attention, MLP, and MoE modules.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-expert-parallel-overlap

Validate and use MoE expert-parallel communication overlap in Megatron-Bridge, including overlap_moe_expert_parallel_comm, delay_wgrad_compute, and flex dispatcher backends such as DeepEP and HybridEP.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-hierarchical-context-parallel

Operational guide for enabling hierarchical context parallelism in Megatron-Bridge, including config knobs, code anchors, pitfalls, and verification.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-megatron-fsdp

Operational guide for enabling Megatron FSDP in Megatron-Bridge, including config knobs, code anchors, pitfalls, and verification.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-memory-tuning

Techniques for reducing peak GPU memory in Megatron Bridge — expandable segments, parallelism resizing, activation recompute, CPU offloading constraints, and common OOM fixes.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-moe-comm-overlap

MoE expert-parallel communication overlap in Megatron Bridge. Covers dispatch/combine overlap, flex dispatcher backends, and expert wgrad scheduling.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-moe-dispatcher-selection

Choose the right MoE token dispatcher (`alltoall`, DeepEP, or HybridEP) for the hardware, EP degree, and optimization stage. Summarizes patterns from DSV3, Qwen3, Qwen3-Next, and VLM bring-up work.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-moe-hardware-configs

Representative MoE training playbooks by hardware platform and model family. Summarizes rounded throughput bands, parallelism patterns, and common tuning stacks.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-moe-long-context

Long-context MoE training guidance for Megatron Bridge. Covers CP sizing, selective recompute, dispatcher choices, and practical patterns from DSV3, Qwen3, and Qwen3-Next long-context experiments.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-moe-optimization-workflow

Systematic workflow for MoE training optimization in Megatron Bridge, based on the Megatron-Core MoE paper. Covers the Three Walls framework, parallel folding, recompute strategy, dispatcher choice, and CUDA-graph bring-up.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-moe-vlm-training

Practical guidance for training MoE VLMs in Megatron Bridge. Compares FSDP and 3D-parallel approaches, using rounded lessons from Qwen3-VL, Qwen3-Next, and other multimodal experiments.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-parallelism-strategies

Operational guide for choosing and combining parallelism strategies in Megatron Bridge, including sizing rules, hardware topology mapping, and combined parallelism configuration.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-sequence-packing

Validate and use packed sequences and long-context training in Megatron-Bridge, distinguishing offline packed SFT for LLMs from in-batch packing for VLMs, and applying the right CP constraints.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-perf-tp-dp-comm-overlap

Operational guide for enabling TP, DP, and PP communication overlap in Megatron-Bridge, including config knobs, code anchors, pitfalls, and verification.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-recipe-recommender

Recommend and customize Megatron Bridge recipes for a user's model, GPU count, and training goal. Indexes library recipes (pretrain/SFT/PEFT) and performance recipes.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Hpc Developer

nvidia

nemo-mbridge-resiliency

Resiliency features in Megatron Bridge including fault tolerance, straggler detection, in-process restart, preemption, and re-run state machine.

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Ml Engineer

nvidia

nemo-retriever

Use when the user wants to search, query, extract, transcribe, describe, quote, filter, or aggregate across documents — PDFs, scanned forms / images (`.jpg` `.png` `.tiff`), Office (`.docx` `.pptx`), text (`.html` `.txt`), audio (`.mp3` `.wav` `.m4a`), or

NVIDIA skill, Developer, Data Engineer, AI Engineer, AI And Machine Learning

nvidia

nemo-rl-auto-research

Autonomous NeMo-RL research agent workflow for directed hypothesis testing and open-ended discovery. Guides agents through the full experiment lifecycle: understanding recipes and environments, wiring RL or NeMo-gym runs, launching reproducible baselines

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Research Academic

nvidia

nemo-rl-brev-etiquette

Brev instance operating guidance for NeMo-RL agents working in /home/ubuntu/RL with limited workspace disk, a larger /ephemeral volume, and optional /home/ubuntu/RL/.env secrets. Use when running nemo-rl-auto-research campaigns, experiments, training jobs

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Ml Engineer

nvidia

nemo-rl-docs

Documentation conventions for NeMo-RL. Covers docs/index.md updates and docstring format. Do NOT use for: bug fixes, test fixes, dependency bumps, refactoring, CI/CD changes, performance tuning, or any task that does not involve writing or updating docume

NVIDIA skill, Developer, AI Engineer, Ml Engineer, NeMo RL

nvidia

nemo-rl-session-memory

Manage durable working-session memory for coding agents. Use when a user asks to preserve or recover agent context across disconnects, VS Code restarts, long-running work, handoffs, or any session where important state should be written periodically under

NVIDIA skill, Developer, AI Engineer, Ml Engineer, NeMo RL

nvidia

nemoclaw-user-agent-skills

Describes the agent skills shipped with NemoClaw and how to access them by cloning the repository. Use when users ask about AI agent support, coding assistant integration, or the .agents/skills/ directory. Trigger keywords - nemoclaw agent skills, ai codi

NVIDIA skill, Developer, AI Engineer, Application Developer, NeMoClaw

nvidia

nemoclaw-user-configure-inference

Connects NemoClaw to a local inference server. Use when setting up Ollama, vLLM, TensorRT-LLM, NIM, or any OpenAI-compatible local model server with NemoClaw. Trigger keywords - nemoclaw local inference, ollama nemoclaw, vllm nemoclaw, local model server,

NVIDIA skill, Developer, AI Engineer, Platform Engineer, Application Developer

nvidia

nemoclaw-user-configure-security

Presents a risk framework for every configurable security control in NemoClaw. Use when evaluating security posture, reviewing sandbox security defaults, or assessing control trade-offs. Trigger keywords - nemoclaw security best practices, sandbox securit

NVIDIA skill, Developer, DevOps Engineer, Security Engineer, Platform Engineer

nvidia

nemoclaw-user-deploy-remote

Explains how to run NemoClaw on a remote GPU instance, including the deprecated Brev compatibility path and the preferred installer plus onboard flow. Use when deploying NemoClaw to a remote VM, onboarding a Brev instance, or migrating away from the legac

NVIDIA skill, Developer, DevOps Engineer, Platform Engineer, NeMoClaw

nvidia

nemoclaw-user-get-started

Installs NemoClaw, launches a sandbox, and runs the first agent prompt. Use when onboarding, installing, or launching a NemoClaw sandbox for the first time. Trigger keywords - nemoclaw quickstart, install nemoclaw openclaw sandbox, nemohermes quickstart,

NVIDIA skill, Developer, AI Engineer, Application Developer, NeMoClaw

nvidia

nemoclaw-user-manage-policy

Adds, removes, or modifies allowed endpoints in the sandbox policy. Use when customizing network policy, changing egress rules, or configuring sandbox endpoint access. Trigger keywords - customize nemoclaw network policy, sandbox egress policy configurati

NVIDIA skill, Developer, DevOps Engineer, Security Engineer, Platform Engineer

nvidia

nemoclaw-user-manage-sandboxes

Explains operational tasks after the quickstart: listing sandboxes, status and health checks, logs, diagnostics, port forwards, multiple sandboxes, credential reset, rebuilds, network presets, upgrades, and uninstall. Trigger keywords - manage nemoclaw sa

NVIDIA skill, Developer, DevOps Engineer, Platform Engineer, NeMoClaw

nvidia

nemoclaw-user-monitor-sandbox

Inspects sandbox health, traces agent behavior, and diagnoses problems. Use when monitoring a running sandbox, debugging agent issues, or checking sandbox logs. Trigger keywords - monitor nemoclaw sandbox, debug nemoclaw agent issues.

NVIDIA skill, Developer, DevOps Engineer, Platform Engineer, NeMoClaw

nvidia

nemoclaw-user-overview

Explains how OpenClaw, OpenShell, and NemoClaw form the ecosystem, NemoClaw's position in the stack, what NemoClaw adds beyond the community sandbox, and when to prefer NemoClaw versus integrating OpenShell and OpenClaw directly. Use when users ask about

NVIDIA skill, Developer, Application Developer, Solutions Architect, NeMoClaw

nvidia

nemoclaw-user-reference

Describes the NemoClaw integration layer and blueprint architecture and how they orchestrate compatible agent sandboxes. Use when looking up architecture, agent integration, plugin structure, or blueprint design. Trigger keywords - nemoclaw architecture,

NVIDIA skill, Developer, Platform Engineer, Application Developer, Solutions Architect

nvidia

nemotron-customize

Plan, configure, and chain repo-native Nemotron customization steps into single-step or multi-step pipelines: curation, translation, SFT/PEFT (AutoModel or Megatron-Bridge), pretraining/CPT, RL alignment (DPO/RLVR/GRPO/RLHF), BYOB/MCQ benchmarks, checkpoi

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

nemotron-policy-generator

Generates BYO custom safety policies for NVIDIA Nemotron content-safety guardrails — Nemotron-Content-Safety-Reasoning-4B (text) and multimodal Nemotron-3-Content-Safety. Produces a Markdown policy, JSON taxonomy, and drop-in inference prompts. Maps rough

NVIDIA skill, Developer, AI Engineer, Security Engineer, Solutions Architect

nvidia

nemotron-retrieval-recipes

Use when planning, debugging, tuning, evaluating, exporting, or deploying public Nemotron `embed`/`rerank` retrieval recipes.

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

nemotron-speech

Routes NVIDIA Nemotron Speech (Riva) NIM tasks — deploys, runs, and tests ASR, TTS, and NMT NIMs on build.nvidia.com or self-hosted.

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Application Developer

nvidia

nv-generate-ct-rflow

Used for generating synthetic CT volumes and masks with NV-Generate-CTMR rflow-ct. Not for production training data without review.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

nv-generate-mr

Used for generating synthetic body MRI volumes with NV-Generate-CTMR rflow-mr. Not for paired masks or production training data.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

nv-generate-mr-brain

Used for generating synthetic brain MRI volumes with NV-Generate-CTMR rflow-mr-brain. Not for production training data.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

nv-generate-mr-brain-finetune

Used for finetuning NV-Generate-CTMR MR-brain diffusion UNet from a NIfTI datalist. Not for clinical or production data approval.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

nv-generate-vae-finetune

Used for finetuning the NV-Generate-CTMR MAISI VAE from CT/MRI NIfTI datalists. Not for clinical or production data approval.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

nv-reason-cxr

Used for command-shape or live NV-Reason-CXR chest X-ray reasoning smoke tests. Not for diagnosis or clinical reporting.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

nv-segment-ct

Used for running NV-Segment-CT VISTA3D on CT NIfTI volumes and recording label-map evidence.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

nv-segment-ct-finetune

Used for smoke or dataset finetuning of NV-Segment-CT VISTA3D on CT NIfTI labels. Not for clinical validation.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

nv-segment-ctmr

Used for running NV-Segment-CTMR on CT or MRI NIfTI volumes and recording label-map evidence. Not for clinical interpretation.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

omniverse-cad-to-simready

Coordinate the end-to-end CAD/source-asset to SimReady workflow. Use for broad requests such as CAD to SimReady, source asset to simulation-ready USD, or prop packaging that require conversion, material/physics assignment, SimReady conformance, validation

NVIDIA skill, Developer, Application Developer, Simulation Engineer, Omniverse

nvidia

omniverse-realtime-viewer

Use as the top-level router for Omniverse Realtime Viewer USD app requests and focused viewer reference documents.

NVIDIA skill, Developer, Application Developer, Simulation Engineer, Omniverse

nvidia

omniverse-usd-performance-tuning

Top-level workflow skill for USD performance diagnosis and optimization. Use for slow loading, high memory, low FPS, or 'optimize my scene' requests; delegates auth/runtime setup to Phase 0 owners.

NVIDIA skill, Developer, Application Developer, Simulation Engineer, Omniverse

nvidia

physical-ai-defect-image-generation

Use when the user wants to orchestrate defect image generation, run associated setup, or handle outputs on OSMO. The Day 0 path handles cold-start with USD-to-ROI, image-edit augmentation, and AnomalyGen to create initial PCBA datasets. The Day 1 path per

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Physical AI

nvidia

physical-ai-infrastructure-setup-and-resilient-scaling

Use when the user wants to set up, scale, validate, or harden NVIDIA physical AI infrastructure for synthetic data generation workflows across local MicroK8s or Azure AKS, including Kubernetes clusters, inference endpoint deployment, OSMO deployment, work

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Platform Engineer

nvidia

physical-ai-neural-reconstruction

Router for NVIDIA NuRec/NRE: USDZ rendering, NCore conversion, 3DGS, gRPC sensor sim, PhysicalAI HF datasets. Do NOT use for SimReady or infra setup.

NVIDIA skill, Developer, AI Engineer, Simulation Engineer, Physical AI

nvidia

physical-ai-video-data-augmentation

Use when running video data augmentation and auto-labeling workflows on OSMO: flow selection, preflight, submit-time interpolation, monitoring, and output retrieval. Trigger keywords: video data augmentation, data enrichment, auto labeling, VDA demo, OSMO

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Physical AI

nvidia

physicsnemo-discover

Official NVIDIA-authored guidance for navigating PhysicsNeMo — pick the model, datapipe, or example for a SciML/AI4Science task (surrogates, forecasting, downscaling, physics-informed, inverse, generative). Points at existing files via live repo search; n

NVIDIA skill, Developer, Data Scientist, Research Academic, PhysicsNeMo

nvidia

rag-blueprint

NVIDIA RAG Blueprint — deploy, configure, troubleshoot, and manage. Handles any RAG action: deploy, install, start, enable, disable, toggle, change, configure, troubleshoot, debug, fix, shutdown, stop, or tear down any RAG feature or service (Agentic RAG,

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Platform Engineer

nvidia

rag-eval

Filesystem RAG benchmarks: corpus/, train.json, evaluate_rag.py (RAGAS quality). Not for prod monitoring, latency/throughput benchmarking (use rag-perf), or evals outside this repo layout.

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

rag-perf

Performance benchmarking for a deployed NVIDIA RAG Blueprint server: profiling pass + aiperf load test driven by a single YAML config. Not for accuracy / RAGAS scoring (use rag-eval) or for deploying / repairing services (use rag-blueprint).

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Ml Engineer

nvidia

skill-card-generator

Use only to generate or update a governance skill card for a specified existing agent skill directory. Do not use for explaining, listing, comparing, or discussing skill capabilities.

NVIDIA skill, Developer, Platform Engineer, Solutions Architect, Trustworthy AI

nvidia

tao-analyze-changenet-rca

Performs deep Root Cause Analysis (RCA) on NVIDIA TAO Visual ChangeNet classification experiments with image-evidence-driven investigation. Use when analyzing ChangeNet model failures, investigating poor recall / FAR / PASS-NO_PASS metrics, auditing visua

NVIDIA skill, AI Engineer, Data Scientist, Ml Engineer, Application Developer

nvidia

tao-analyze-gaps-visual-changenet

Performs gap analysis on NVIDIA TAO Visual ChangeNet (VCN) Classify experiments by invoking the data-services container (`tao_toolkit.data_services` from `versions.yaml`) directly via `docker run … gap_analysis vcn_aoi …` — picks the optimal decision thre

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-analyze-gaps-vlm-bcq

Extract false-positive and false-negative gaps from VLM binary-classification-question (BCQ, yes/no) predictions. Use after running VLM evaluation when you have a predictions JSON and need to identify failure cases for DEFT root cause analysis on a binary

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-convert-dataset-format

Run `tao-daft convert` to convert NVIDIA TAO DAFT datasets between supported formats. Do not use for non-DAFT data. Use when the user asks to convert a DAFT dataset, change DAFT format, change a TAO dataset format, or run `tao-daft convert`.

NVIDIA skill, Developer, Data Engineer, AI Engineer, Ml Engineer

nvidia

tao-finetune-clip

CLIP vision-language model for image-text retrieval, zero-shot classification, embedding extraction, ONNX export, and TensorRT deployment. Use when fine-tuning or training CLIP, running zero-shot classification, computing image embeddings, or deploying CL

NVIDIA skill, AI Engineer, Data Scientist, Ml Engineer, Application Developer

nvidia

tao-finetune-cosmos-embed

Cosmos-Embed1 video-text embedding for text-to-video retrieval, video-to-video search, semantic deduplication, and fine-tuning. Use when the user asks to "fine-tune Cosmos-Embed1", "run cosmos-embed inference", "export Cosmos-Embed1", "embed videos", or "

NVIDIA skill, AI Engineer, Data Scientist, Ml Engineer, Application Developer

nvidia

tao-finetune-cosmos-reason

Cosmos-Reason2-8B video QA supervised fine-tuning with FSDP parallelism. Use when training or evaluating video question-answering models, fine-tuning Cosmos-Reason2 with SFT, or working with Cosmos-RL. Trigger phrases include "fine-tune Cosmos-Reason", "C

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-finetune-huggingface-model

Fine-tune any HuggingFace CV / VLM / LLM model on local NVIDIA GPUs inside an NGC PyTorch container. Use when the user wants to fine-tune a HuggingFace model (full or LoRA), train a vision / VLM / LLM model end-to-end, generate a reproducible HF training

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-generate-image-grounding

Two-step image grounding pipeline: extracts referring expressions from (image, caption) pairs and grounds them to pixel-space bounding boxes via a VLM. Use when the user wants to ground captions to bboxes, generate phrase-grounded annotations, auto-label

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-generate-referring-expressions

Four-step image referring-expression pipeline: turns images plus KITTI bounding-box labels into region descriptions, scene captions, grounded referring expressions, and (optionally) verified expressions via VLM distillation. Use when the user wants to gen

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-generate-video-reasoning-annotations

Multi-step video annotation pipeline that turns raw videos into Chain-of-Thought training data — multi-level captions, structured descriptions, and QA pairs (MCQ, binary, open-ended) with reasoning traces, via VLM/LLM distillation. Use when the user wants

NVIDIA skill, TAO, Data Engineer, AI Engineer, Ml Engineer

nvidia

tao-launch-workflow

Shared launch intake for any TAO workflow or action. Use when the user wants to run TAO AutoML, train, evaluate, infer, export, generate TensorRT engines, or launch DEFT/workflow jobs on an execution platform.

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Ml Engineer

nvidia

tao-list-capabilities

Answer what the TAO Skill Bank plugin can do by generating the response from packaged application, data, model, AutoML, and platform manifests.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Application Developer

nvidia

tao-mine-aoi-images

Runs the DEFT embed-then-mine workflow for VCN AOI iterations — embeds the gap-analysis target parquet, embeds a source pool, and mines nearest-neighbour source images for downstream augmentation. Use as the immediate next step after `tao-route-visual-cha

NVIDIA skill, Developer, Data Engineer, AI Engineer, Data Scientist

nvidia

tao-port-huggingface-model

Integrate a HuggingFace Computer Vision model into the NVIDIA TAO Toolkit ecosystem (tao-core config, tao-pytorch trainer, tao-deploy TensorRT pipeline). Use when the user asks to "integrate a HuggingFace model into TAO", "add an HF model to TAO Toolkit",

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Application Developer

nvidia

tao-route-visual-changenet-samples

Routes the weakest VCN samples (output of `tao-analyze-gaps-visual-changenet`) into per-augmentation-module subsets — one parquet for k-NN mining, one for AnomalyGen (Cosmos SDG) — based on each module's label eligibility. Use as the immediate next step a

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-run-automl

Run AutoML / hyperparameter optimization (HPO) for NVIDIA TAO networks using AutoMLRunner. Handles algorithm selection (bayesian, hyperband, asha, bohb, llm, hybrid, autoresearch), WandB experiment tracking, job execution on any TAO SDK platform, result i

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-run-automl-deft-pipeline

Run the canonical NVIDIA AOI three-phase training pipeline — Phase 1 AutoML baseline (HPO), Phase 2 DEFT loop (RCA → SDG → mining → plain-train retrain), Phase 3 AutoML refinement on the DEFT-augmented dataset. This is the default entry point for any "run

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Application Developer

nvidia

tao-run-deft-aoi

Run the full DEFT AOI improvement loop for NVIDIA TAO VisualChangeNet / ChangeNet PCB inspection models: baseline evaluate, RCA, ingestion of customer-supplied pre-generated AnomalyGen images, k-NN mining, retraining, and deployment gating until FAR / rec

NVIDIA skill, AI Engineer, Ml Engineer, Application Developer, AI And Machine Learning

nvidia

tao-run-inference-service

Start, query, and stop a network-specific TAO inference microservice ({network_arch}-inference-microservice) by delegating container execution to the appropriate platform skill. Handles container image resolution, job-payload JSON construction, and the se

NVIDIA skill, AI Engineer, DevOps Engineer, Ml Engineer, Platform Engineer

nvidia

tao-run-on-brev

Brev managed GPU instances with Docker support. Use when running TAO training, evaluation, or inference on Brev GPU instances, managing Brev deployments, or dispatching TAO jobs through the Brev CLI. Trigger phrases include "run on Brev", "Brev GPU instan

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Ml Engineer

nvidia

tao-run-on-kubernetes

Kubernetes execution platform — submits TAO container jobs as single-pod k8s Jobs with NVIDIA GPU scheduling. Use when running on EKS / GKE / AKS / on-prem clusters with the NVIDIA GPU Operator installed, or when integrating TAO into an existing k8s-nativ

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Ml Engineer

nvidia

tao-run-on-lepton

DGX Cloud Lepton managed GPU compute platform with run/status/cancel interface. Use when submitting TAO jobs to DGX Cloud, dispatching training/eval/inference to Lepton GPU resources, or managing Lepton workspace deployments. Trigger phrases include "run

NVIDIA skill, AI Engineer, DevOps Engineer, Ml Engineer, Platform Engineer

nvidia

tao-run-on-local-docker

Local Docker execution for TAO SDK job containers using the host Docker daemon and NVIDIA GPU runtime. Use when running TAO jobs on the current machine or a directly attached Docker host. Trigger phrases include "run locally", "local Docker", "use my GPU"

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Ml Engineer

nvidia

tao-run-on-slurm

Remote SLURM GPU cluster execution over SSH with sbatch/srun, Pyxis/Enroot containers, and Lustre-backed results. Use when running TAO training/eval/inference jobs on an on-prem or DGX SLURM cluster. Trigger phrases include "run on SLURM", "submit sbatch"

NVIDIA skill, TAO, AI Engineer, DevOps Engineer, Ml Engineer

nvidia

tao-run-platform

TAO Execution SDK for submitting and monitoring GPU training jobs on supported platforms (Lepton, Brev, SLURM, local Docker, Kubernetes). Use when the user wants to run TAO jobs through the SDK, get job tracking, S3 I/O wrapping, multi-node distributed tr

NVIDIA skill, Developer, AI Engineer, DevOps Engineer, Ml Engineer

nvidia

tao-setup-nvidia-gpu-host

Host setup for TAO GPU backends. Checks and, after user approval, installs NVIDIA driver branch 580, CUDA Toolkit 13.0, and NVIDIA Container Toolkit 1.19.0 for Docker/local-Docker and Kubernetes GPU worker hosts. The `--check-only` path works on any Linux

NVIDIA skill, Developer, DevOps Engineer, Ml Engineer, Platform Engineer

nvidia

tao-train-action-recognition

Action recognition from video sequences. Supports RGB, optical flow, and joint (multi-stream) input types for classifying temporal actions in video clips. Use when training, evaluating, exporting, or running inference on a TAO action-recognition model. Tr

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-train-bevfusion

BEVFusion for multi-sensor 3D object detection. Fuses LiDAR point clouds and camera images in bird's-eye-view (BEV) space, used in autonomous driving for robust 3D perception. Use when training, evaluating, or running inference for a TAO BEVFusion model.

NVIDIA skill, Developer, AI Engineer, Ml Engineer, Application Developer

nvidia

tao-train-centerpose

CenterPose for keypoint / pose estimation. Detects object centers and regresses keypoint locations for 6-DoF object pose estimation. Use when training, evaluating, exporting, or running inference for a TAO CenterPose model. Trigger phrases include "train

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

tao-train-deformable-detr

Deformable DETR for 2D object detection. Uses deformable attention for efficient multi-scale feature processing, lighter than DINO with competitive accuracy. Use when training, evaluating, exporting, quantizing, or running inference for a TAO Deformable-D

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-train-depth-anything-v2

Monocular depth estimation using Metric Depth Anything v2 or Relative Depth Anything architectures. Predicts per-pixel depth from single RGB images. Use when training, evaluating, exporting, or running inference for a TAO monocular depth model. Trigger ph

NVIDIA skill, Developer, AI Engineer, Ml Engineer, AI And Machine Learning

nvidia

tao-train-dino

DINO (DETR with Improved DeNoising Anchor Boxes) for 2D object detection. Transformer-based detector with denoising training, multi-scale features, and optional distillation support. Use when training, evaluating, exporting, distilling, quantizing, or run

NVIDIA skill, Developer, AI Engineer, Data Scientist, Ml Engineer

nvidia

tao-train-fast-foundation-stereo

Real-time stereo depth estimation using FastFoundationStereo (FFS), the distilled bp2 commercial variant of FoundationStereo. Predicts disparity maps from stereo image pairs with ~10× lower latency than full FoundationStereo. Use when training, evaluating

NVIDIA skill, Developer, Robotics Developer, AI Engineer, Ml Engineer

nvidia

tao-train-foundation-stereo

Stereo depth estimation using FoundationStereo. Predicts disparity maps from stereo image pairs for 3D reconstruction. Use when training, evaluating, exporting, or running inference for a TAO FoundationStereo model. Trigger phrases include "train stereo d

NVIDIA skillDeveloperRobotics DeveloperAI EngineerMl Engineer

nvidia

tao-train-grounding-dino

Grounding DINO for open-set object detection. Combines DINO-style detection with a BERT text encoder for language-guided detection — detects objects described by text prompts without a fixed class vocabulary. Use when training, evaluating, exporting, quan

NVIDIA skillDeveloperAI EngineerData ScientistMl Engineer

nvidia

tao-train-image-classification

PyTorch-based TAO image classification. Supports a wide range of backbones (FAN, EfficientNet, ResNet, etc.) with distillation and quantization for deployment. Use when training, evaluating, distilling, quantizing, exporting, or running inference for a TA

NVIDIA skillDeveloperAI EngineerData ScientistMl Engineer

nvidia

tao-train-mask-auto-encoder

Masked Auto-Encoder (MAE) for self-supervised pretraining and fine-tuning. Masks random patches and reconstructs them to learn visual representations; supports pretrain and finetune stages. Use when training, evaluating, exporting, or running inference fo

NVIDIA skillDeveloperAI EngineerData ScientistMl Engineer

nvidia

tao-train-mask-auto-label

MAL (Mask Auto-Label) for weakly-supervised segmentation. Produces segmentation masks from minimal annotations (point or box annotations) using a ViT-MAE backbone. Use when training, evaluating, or running inference for a TAO MAL model. Trigger phrases in

NVIDIA skillDeveloperAI EngineerData ScientistMl Engineer

nvidia

tao-train-mask-grounding-dino

Mask Grounding DINO for grounded instance segmentation. Extends Grounding DINO with a mask-prediction head for open-set segmentation guided by text prompts. Use when training, evaluating, exporting, quantizing, or running inference for a TAO Mask-Groundin

NVIDIA skillAI EngineerData ScientistMl EngineerApplication Developer

nvidia

tao-train-mask2former

Mask2Former for universal image segmentation (panoptic, instance, and semantic). Transformer-based with masked attention for high-quality segmentation results. Use when training, evaluating, exporting, quantizing, or running inference for a TAO Mask2Forme

NVIDIA skillDeveloperAI EngineerMl EngineerApplication Developer

nvidia

tao-train-metric-learning-recognition

Metric-learning recognition (ml-recog) for fine-grained visual recognition. Learns embeddings for retrieval-based matching (e.g., retail product recognition) using triplet / contrastive losses. Use when training, evaluating, exporting, or running inferenc

NVIDIA skillDeveloperAI EngineerData ScientistMl Engineer

nvidia

tao-train-nvdinov2

NVDINOv2 for self-supervised visual representation learning. Trains vision transformers via self-distillation (teacher-student) without labels and produces general-purpose visual features. Use when training, distilling, exporting, or running inference for

NVIDIA skillAI EngineerData ScientistMl EngineerApplication Developer

nvidia

tao-train-nvpanoptix3d

NVPanoptix3D for panoptic 3D scene reconstruction from posed RGB images. Produces 3D panoptic segmentation (semantic, instance, and panoptic masks) with occupancy completion. Built on a VGGT backbone with a Mask2Former-style head and 3D frustum reconstruc

NVIDIA skillDeveloperAI EngineerMl EngineerApplication Developer

nvidia

tao-train-ocdnet

OCDNet for scene text detection. Detects arbitrary-oriented text regions in natural images using a differentiable binarization approach. Use when training, evaluating, exporting, pruning, quantizing, retraining, or running inference for a TAO OCDNet model

NVIDIA skillAI EngineerData ScientistMl EngineerApplication Developer

nvidia

tao-train-ocrnet

OCRNet for scene text recognition. Recognizes text content from cropped text-region images and supports CTC and attention-based decoders. Use when training, evaluating, exporting, pruning, quantizing, retraining, or running inference for a TAO OCRNet mode

NVIDIA skillDeveloperAI EngineerData ScientistMl Engineer

nvidia

tao-train-oneformer

OneFormer for universal image segmentation. Unifies panoptic, instance, and semantic segmentation with a single architecture using task-conditioned queries. Use when training, evaluating, exporting, quantizing, or running inference for a TAO OneFormer mod

NVIDIA skillDeveloperAI EngineerMl EngineerApplication Developer

nvidia

tao-train-optical-inspection

Optical Inspection for defect detection using Siamese networks. Compares image pairs to detect manufacturing defects, anomalies, or quality issues. Use when training, evaluating, exporting, or running inference for a TAO Optical Inspection model on AOI /

NVIDIA skillDeveloperAI EngineerMl EngineerApplication Developer

nvidia

tao-train-pointpillars

PointPillars for 3D object detection from LiDAR point clouds. Encodes point clouds into a pseudo-image via a pillar-based representation, then applies 2D detection — used in autonomous driving and robotics. Use when training, evaluating, exporting, prunin

NVIDIA skillDeveloperRobotics DeveloperAI EngineerMl Engineer

nvidia

tao-train-pose-classification

Pose classification using ST-GCN (Spatial Temporal Graph Convolutional Network). Classifies skeleton sequences into action categories from pose-keypoint data. Use when training, evaluating, exporting, or running inference for a TAO pose-classification mod

NVIDIA skillDeveloperAI EngineerMl EngineerApplication Developer

nvidia

tao-train-reid

Person re-identification (ReID). Learns discriminative embeddings to match the same person across different camera views, based on metric learning. Use when training, evaluating, exporting, or running inference for a TAO person re-identification model. Tr

NVIDIA skillDeveloperAI EngineerMl EngineerApplication Developer

nvidia

tao-train-rtdetr

RT-DETR (Real-Time DEtection TRansformer) for 2D object detection. Designed for real-time inference with competitive accuracy and supports distillation and quantization for deployment optimization. Use when training, evaluating, distilling, quantizing, ex

NVIDIA skillDeveloperAI EngineerData ScientistMl Engineer

nvidia

tao-train-segformer

SegFormer for semantic segmentation. Lightweight transformer-based architecture with hierarchical feature extraction, efficient for real-time segmentation tasks. Use when training, evaluating, exporting, quantizing, or running inference for a TAO SegForme

NVIDIA skillDeveloperAI EngineerMl EngineerAI And Machine Learning

nvidia

tao-train-single-step

Standard single-step train/eval/export workflow for any TAO model. Use when training a TAO model on a dataset without iterative data augmentation, AutoML, or DEFT loops. Trigger phrases include "single train run", "train then evaluate then export", "plain

NVIDIA skillDeveloperAI EngineerData ScientistMl Engineer

nvidia

tao-train-sparse4d

Sparse4D for multi-camera temporal 3D object detection and tracking. Uses sparse queries with deformable attention across camera views and time for end-to-end 3D perception, with an instance bank for temporal tracking. Use when training, evaluating, expor

NVIDIA skillRobotics DeveloperAI EngineerMl EngineerApplication Developer

nvidia

tao-train-visual-changenet

Visual ChangeNet for binary image classification and segmentation in AOI defect detection. Use when training, evaluating, exporting, or running inference for PCB defect detection or visual inspection, comparing image pairs for PASS/NO_PASS classification,

NVIDIA skillDeveloperAI EngineerMl EngineerApplication Developer

nvidia

tao-validate-dataset-format

Run `tao-daft validate` to check NVIDIA TAO DAFT datasets for structure, schema, and cross-reference errors. Do not use for non-DAFT formats. Use when the user asks to validate a DAFT dataset, check DAFT schema, validate a TAO dataset format, or run `tao-

NVIDIA skillDeveloperData EngineerAI EngineerData Scientist

nvidia

tilegym-adding-cutile-kernel

Add a new cuTile GPU kernel operator to TileGym. Covers dispatch registration in ops.py, cuTile backend implementation, __init__.py exports, test creation, and benchmark in tests/benchmark. Use when adding, creating, or implementing a new cuTile operator/

NVIDIA skillDeveloperHpc DeveloperAccelerated ComputingCUDA Tile

nvidia

tilegym-converting-cutile-to-julia

Converts cuTile Python GPU kernels (@ct.kernel) to cuTile.jl Julia equivalents. Handles kernel syntax translation, 0-indexed to 1-indexed conversion, broadcasting differences, memory layout (row-major to column-major), type system mapping, and launch API

NVIDIA skillDeveloperApplication DeveloperHpc DeveloperAccelerated Computing

nvidia

tilegym-converting-cutile-to-triton

Converts cuTile GPU kernels (@ct.kernel) to Triton (@triton.jit). Handles standard in-repo conversion, debugging (cudaErrorIllegalAddress, shape mismatch, numerical mismatch), and mapping cuTile idioms (ct.load/ct.store, ct.Constant, ct.launch) to Triton

NVIDIA skillDeveloperApplication DeveloperHpc DeveloperAccelerated Computing

nvidia

tilegym-cutile-autotuning

Use when adding, modifying, optimizing, or debugging cuTile autotuning code. Trigger signals: `exhaustive_search` / `replace_hints` / `hints_fn` / `cuda.tile.tune` in code, `autotune` in filenames, or correctness/performance issues in autotuned cuTile ker

NVIDIA skillDeveloperApplication DeveloperHpc DeveloperAccelerated Computing

nvidia

tilegym-cutile-python

Expert cuTile programming assistant. Write high-performance GPU kernels using cuTile's tile-based programming model with proper validation and optimization. Supports deep agent orchestration for complex multi-kernel tasks.

NVIDIA skillDeveloperAI EngineerApplication DeveloperHpc Developer

nvidia

tilegym-improve-cutile-kernel-perf

Iteratively optimize cuTile kernel performance through systematic profiling, bottleneck analysis, IR comparison, and targeted tuning. Covers tile sizes, occupancy, autotune configs, TMA, latency hints, persistent scheduling, num_ctas, flush_to_zero, and I

NVIDIA skillDeveloperApplication DeveloperHpc DeveloperAccelerated Computing

nvidia

tilegym-monkey-patch-kernels-to-transformers

Integrate TileGym kernels into Hugging Face `transformers` models by replacing the library's submodule(s) and certain class(es)' implementations, and patching certain class(es)' init/forward/load weight methods prior to instantiating models. Used when the

NVIDIA skillDeveloperAI EngineerMl EngineerApplication Developer

nvidia

vss-ask-video

Use this skill to ask the VSS agent's video_understanding tool a fresh visual question about a recorded clip. Not for prior tool output, search hits, or metadata-answerable questions.

NVIDIA skillDeveloperAI EngineerMl EngineerApplication Developer

nvidia

vss-deploy-dense-captioning

Use this skill when deploying standalone RT-VLM dense captioning or calling its REST API (uploads, captions, streams, chat-completions, Kafka). Not for VSS profile deploy or video-search ingestion.

NVIDIA skillVideo Search and Summarization (VSS)AI EngineerDevOps EngineerPlatform Engineer

nvidia

vss-deploy-detection-tracking-2d

Use this skill when the user wants to deploy, run, debug, tear down, or call the REST API of the RTVI-CV 2D detection / tracking microservice. Trigger when the user says things like 'deploy rtvi-cv', 'start warehouse 2d', 'add a stream', 'check rtvi-cv he

NVIDIA skillVideo Search and Summarization (VSS)AI EngineerDevOps EngineerPlatform Engineer

nvidia

vss-deploy-detection-tracking-3d

Deploy and operate the RTVI-CV-3D microservice as MV3DT (`MODE=mv3dt`): per-camera DeepStream perception plus BEV Fusion over calibrated cameras. Supports the bundled sample dataset, custom video files, and RTSP streams, and chains to `vss-generate-video-

NVIDIA skillVideo Search and Summarization (VSS)AI And Machine LearningDevOps EngineerPlatform Engineer

nvidia

vss-deploy-profile

Use to select, configure, deploy, verify, debug, or tear down a VSS profile (base, search, lvs, warehouse, edge). Not for standalone microservices — use the vss-deploy-* skill.

NVIDIA skillVideo Search and Summarization (VSS)AI EngineerDevOps EngineerPlatform Engineer

nvidia

vss-deploy-video-embedding

Use this skill when deploying, operating, or integrating the VSS 3.2 GA RT-Embed Video Embedding microservice. Covers Docker Compose bring-up, GPU and storage prerequisites, the `/v1` REST API (file uploads, text and video embeddings, live RTSP streams, h

NVIDIA skillVideo Search and Summarization (VSS)AI And Machine LearningDevOps EngineerMl Engineer

nvidia

vss-generate-video-calibration

Use to run AutoMagicCalib on local MP4s, RTSP, or the bundled sample dataset, and to deploy vss-auto-calibration when needed. Do not use for non-AMC calibration or runtime analytics.

NVIDIA skillVideo Search and Summarization (VSS)AI EngineerDevOps EngineerHands On Builder

nvidia

vss-generate-video-report

Use this skill when producing a VSS analysis report — Mode A per-clip VLM, Mode B incident-range via video-analytics. Not for standalone video summarization, real-time alerts, or ad-hoc Q&A.

NVIDIA skillVideo Search and Summarization (VSS)AI EngineerMl EngineerApplication Developer

nvidia

vss-manage-alerts

Use for VSS alert workflows — real-time monitoring, Alert-Bridge subscriptions, Slack notifications, incident queries, camera onboarding. Not for non-alert analytics.

NVIDIA skillVideo Search and Summarization (VSS)AI And Machine LearningDevOps EngineerProduction Operator

nvidia

vss-manage-video-io-storage

Use to call the VIOS REST API (sensor list, timelines, clip extraction, snapshots, add/delete sensors and streams). Not for VLM inference or search.

NVIDIA skillDeveloperDevOps EngineerPlatform EngineerApplication Developer

nvidia

vss-query-analytics

Use this skill when reading video-analytics metrics, incidents, alerts, and sensor data via the VA-MCP server (port 9901). Not for live VLM or incident-range narrative reports.

NVIDIA skillVideo Search and Summarization (VSS)AI EngineerPlatform EngineerProduction Operator

nvidia

vss-search-archive

Use to run top-level VSS fusion search on archived video, or to ingest video files / RTSP streams for search. Do NOT use for ad-hoc visual Q&A (use vss-ask-video), live captioning (use vss-deploy-dense-captioning), or video summarization and reports (use

NVIDIA skillDeveloperAI EngineerMl EngineerApplication Developer

nvidia

vss-setup-behavior-analytics

Use to deploy the vss-behavior-analytics service standalone (entrypoint, config-source, optional calibration). Not for the full warehouse deploy.

NVIDIA skillVideo Search and Summarization (VSS)AI And Machine LearningDevOps EngineerPlatform Engineer

nvidia

vss-setup-video-analytics-api

Use to deploy the vss-video-analytics-api REST service standalone (config-source, data-log bind, Elasticsearch, optional Kafka). Not for full warehouse deploy.

NVIDIA skillVideo Search and Summarization (VSS)AI And Machine LearningDevOps EngineerPlatform Engineer

nvidia

vss-summarize-video

Use to summarize a recorded video via the LVS summarization microservice (HITL-gated) with a VLM fallback. Not for report generation or live RTSP captioning.

NVIDIA skillDeveloperAI EngineerMl EngineerApplication Developer

nvidia

Build a Video Search and Summarization (VSS) Agent

Run the VSS Blueprint on your Spark

NVIDIA blueprintDGXSpark

nvidia

Build and Deploy a Multi-Agent Chatbot

Deploy a multi-agent chatbot system and chat with agents on your Spark

NVIDIA blueprintDGXAgentsSpark

nvidia

CLI Coding Agent

Build local CLI coding agents with Ollama

NVIDIA blueprintCodingOllamaClaude CodeOpenCode

nvidia

Comfy UI

Install and use Comfy UI to generate images

NVIDIA blueprintDGXSpark

nvidia

Connect Multiple DGX Spark through a Switch

Set up a cluster of DGX Spark devices connected through a switch

NVIDIA blueprintDGXSpark

nvidia

Connect Three DGX Spark in a Ring Topology

Connect and set up three DGX Spark devices in a ring topology

NVIDIA blueprintDGXSpark

nvidia

Connect Two Sparks

Connect two Spark devices and set them up for inference and fine-tuning

NVIDIA blueprintDGXSpark

nvidia

CUDA-X Data Science

Install and use NVIDIA cuML and NVIDIA cuDF to accelerate UMAP, HDBSCAN, pandas and more with zero code changes

NVIDIA blueprintpandasdimensionality reductiondata analyticsDGX

nvidia

cuTile Kernels

Run cuTile kernel benchmarks, FMHA implementation, and LLM inference on DGX Spark and B300

NVIDIA blueprintFMHACross-PlatformDeepSeekDocker

nvidia

DGX Dashboard

Monitor your DGX system and launch JupyterLab

NVIDIA blueprintDGXSpark

nvidia

DGX Station AI Skills for Coding Agents

Give your coding agent (Claude Code, Codex, Gemini CLI, Cursor) DGX Station expertise via an AGENTS.md and on-demand Agent Skills

NVIDIA blueprintvLLMAI AgentsBlackwellDGX Station

nvidia

Fine-tune with NeMo

Use NVIDIA NeMo to fine-tune models locally

NVIDIA blueprintDGXSpark

nvidia

Fine-tune with PyTorch

Use PyTorch to fine-tune models locally

NVIDIA blueprintDGXSpark

nvidia

FLUX.1 Dreambooth LoRA Fine-tuning

Fine-tune FLUX.1-dev 12B model using Dreambooth LoRA for custom image generation

NVIDIA blueprintImage GenerationComfyUIDGXLoRA

nvidia

How to Build a Multi-GPU AI PC - A Practical Guide

Many people explore local generative AI for privacy and to avoid token limits, but newer models require significant memory and compute—leading some to adopt multi-GPU setups.

NVIDIA blueprintComfyUILlama.cppRTX

nvidia

How to Fine-Tune an LLM on NVIDIA GPUs With Unsloth

Fine-tune popular AI models faster in Unsloth with NVIDIA RTX AI PCs, RTX PRO workstations, and DGX Spark—plus explore the new Nemotron Nano 3 family of open models.

NVIDIA blueprintFine-TuningRTXLLMGPU

nvidia

How to Get Started With Large Language Models on NVIDIA RTX PCs

Learn about using LLMs locally on PCs and workstations with Ollama, AnythingLLM, and LM Studio.

NVIDIA blueprintLLMsOllamaRTXAnythingLLM

nvidia

How to Get Started With Visual Generative AI on NVIDIA RTX PCs

Learn how to run advanced image and video generation locally with ComfyUI and LTX-2 on RTX PCs.

NVIDIA blueprintGen AIComfyUILTX-2RTX

nvidia

Image & Video Generation with ComfyUI

Generate images and videos with FLUX, Wan 2.1, HunyuanVideo, and Cosmos on DGX Station

NVIDIA blueprintStationImage GenerationComfyUIDocker

nvidia

Install and Use Isaac Sim and Isaac Lab

Build Isaac Sim and Isaac Lab from source for Spark

NVIDIA blueprintDGXSpark

nvidia

Isaac GR00T N1.6 Fine-Tuning

Fine-tune and benchmark NVIDIA's GR00T N1.6 robotics foundation model on DGX Station

NVIDIA blueprintStationFine-TuningIsaac GR00TBlackwell

nvidia

Live VLM WebUI

Real-time Vision Language Model interaction with webcam streaming

NVIDIA blueprintVision AIDGXVLMSpark

nvidia

LLaMA Factory

Install and fine-tune models with LLaMA Factory

NVIDIA blueprintDGXSpark

nvidia

LLM Inference with SGLang

Serve LLMs with SGLang on DGX Station (Qwen3-8B default; Qwen3.6 MoE optional)—prefix-cached multi-turn, structured output, benchmarks, and inference-server guidance

NVIDIA blueprintStationRadixAttentionStructured OutputBlackwell

nvidia

LM Studio on DGX Spark

Deploy LM Studio and serve LLMs on a Spark device; use LM Link to access models remotely.

NVIDIA blueprintInferencellmsterLM StudioLM Link

nvidia

Local Coding Agent

Run local CLI coding agents with Ollama on DGX Station (NVIDIA GB300) using glm-4.7-flash (fast) or unsloth/GLM-4.7-GGUF:Q8_0 (best quality)

NVIDIA blueprintStationCodingOllamaClaude Code

nvidia

Local Healthcare Agent on DGX Station

Run healthcare AI agents that analyze patient data and predict protein structures in an OpenShell sandbox on DGX Station

NVIDIA blueprintStationOpenFold3NemoClawNemotron

nvidia

MIG on DGX Station

Enable and configure Multi-Instance GPU (MIG) on DGX Station with GB300 Ultra (B300 GPUs)

NVIDIA blueprintStationSystem ConfigurationDGX StationMIG

nvidia

Multi-modal Inference

Set up multi-modal inference with TensorRT

NVIDIA blueprintDGXSpark

nvidia

Nanochat on Dual-Spark

Set up Nanochat on Dual-Spark

NVIDIA blueprintDGXSpark

nvidia

Nanochat Training

Train a small ChatGPT-style LLM (nanochat) with tokenizer, pretraining, midtraining, and SFT on DGX Station with GB300 Ultra

NVIDIA blueprintTrainingnanochatPyTorchDGX Station

nvidia

NCCL for Two Sparks

Install and test NCCL on two Sparks

NVIDIA blueprintDGXSpark

nvidia

Nemotron-3-Nano with llama.cpp

Run Nemotron-3-Nano-30B model using llama.cpp on DGX Spark

NVIDIA blueprintNemotronInferenceLLMllama.cpp

nvidia

NIM on Spark

Deploy a NIM on Spark

NVIDIA blueprintDGXSpark

nvidia

NVFP4 Pretraining with Megatron Bridge

Pretrain Llama 3.1 8B with NVFP4 mixed precision on DGX Station using Megatron Bridge

NVIDIA blueprintTrainingNVFP4Megatron BridgeStation

nvidia

NVFP4 Quantization

Quantize a model to NVFP4 to run on DGX Station using TensorRT Model Optimizer

NVIDIA blueprintStationDGX

nvidia

NVFP4 Quantization

Quantize a model to NVFP4 to run on Spark using TensorRT Model Optimizer

NVIDIA blueprintDGXSpark

nvidia

NVIDIA Video Generation Guide

Learn how to create high-resolution videos using LTX-2 in ComfyUI, accelerated on RTX, and take control of visual generative AI.

NVIDIA blueprintComfyUILTX-2RTX

nvidia

Open WebUI with Ollama

Install Open WebUI and use Ollama to chat with models on your Spark

NVIDIA blueprintDGXSpark

nvidia

OpenClaw 🦞

Run OpenClaw locally on DGX Spark with a vLLM-served local model

NVIDIA blueprintDGXSparkLocal LLMAI Agent

nvidia

Optimized JAX

Optimize JAX to run on Spark

NVIDIA blueprintDGXSpark

nvidia

Portfolio Optimization

GPU-accelerated portfolio optimization using cuOpt and cuML

NVIDIA blueprintData ScienceRAPIDSFinancial Services

nvidia

Profiler-Driven Kernel Optimization for Fine-Tuning

Use torch.profiler to find training bottlenecks, then write custom Triton kernels to optimize LLaMA 8B fine-tuning

NVIDIA blueprintTrainingFine-TuningPerformance OptimizationKernel Development

nvidia

RAG Application in AI Workbench

Install and use AI Workbench to clone and run a reproducible RAG application

NVIDIA blueprintDGXSpark

nvidia

Register DGX Spark to Brev

Link your DGX Spark to Brev for remote access and shared environments

NVIDIA blueprintDGX SparkBrevSpark

nvidia

Register DGX Station to Brev

Link your DGX Station to Brev for remote access and sharing

NVIDIA blueprintStationDGX StationBrev

nvidia

Run Hermes Agent with Local Models

Install and run the Hermes self-improving AI agent on DGX Spark.

NVIDIA blueprintNous ResearchLLMAI AgentSpark

nvidia

Run models with llama.cpp on DGX Spark

Build llama.cpp with CUDA and serve models via an OpenAI-compatible API

NVIDIA blueprintDGX SparkInferenceLLMllama.cpp

nvidia

Run NemoClaw with a Local LLM

Build your first local AI assistant on DGX Station using NemoClaw in a secure sandbox, with optional Telegram.

NVIDIA blueprintStationTelegramAgentic WorkflowNemoClaw

nvidia

Run NemoClaw with a Local LLM

Build your first local AI assistant on DGX Spark using NemoClaw and Ollama in a secure sandbox, with optional Telegram.

NVIDIA blueprintTelegramDGX SparkAgentic WorkflowNemoClaw

nvidia

Run OpenClaw For Free On NVIDIA RTX GPUs & DGX Spark

Learn how to set up and host the popular AI agent using local inference apps optimized for RTX.

NVIDIA blueprintDGX SparkOpenClawRTX

nvidia

Secure Long Running AI Agents with OpenShell on DGX Station

Run OpenClaw with local models in an NVIDIA OpenShell sandbox on DGX Station

NVIDIA blueprintStationDGX StationOpenShellSecurity

nvidia

Secure Long Running AI Agents with OpenShell on DGX Spark

Run OpenClaw with local models in an NVIDIA OpenShell sandbox on DGX Spark

NVIDIA blueprintDGXOpenShellSparkSecurity

nvidia

Set Up Local Network Access

NVIDIA Sync helps set up and configure SSH access

NVIDIA blueprintDGXSpark

nvidia

Set up Tailscale on Your Spark

Use Tailscale to connect to your Spark on your home network no matter where you are

NVIDIA blueprintDGXSpark

nvidia

SGLang for Inference

Install and use SGLang on DGX Spark

NVIDIA blueprintDGXSpark

nvidia

Single-cell RNA Sequencing

An end-to-end GPU-powered workflow for scRNA-seq using RAPIDS

NVIDIA blueprintdata science

nvidia

Spark & Reachy Photo Booth

AI-augmented photo booth using the DGX Spark and Reachy Mini.

NVIDIA blueprintgenerative-aiagentsdockerSpark

nvidia

Spark & Reachy Photo Booth

AI-augmented photo booth using the DGX Spark and Reachy Mini.

NVIDIA blueprintgenerative-aiagentsdockerSpark

nvidia

Speculative Decoding

Learn how to set up speculative decoding for fast inference on Spark

NVIDIA blueprintDGXSpark

nvidia

Text to Knowledge Graph

Transform unstructured text into interactive knowledge graphs with LLM inference and graph visualization

NVIDIA blueprintGraphRAGKnowledge GraphsNLPDGX

nvidia

Text to Knowledge Graph on DGX Station

Transform unstructured text into interactive knowledge graphs with LLM inference and graph visualization

NVIDIA blueprintGraphRAGKnowledge GraphsNLPOllama

nvidia

Topic Modeling

Extract insights from massive text datasets using cuML's GPU-accelerated BERTopic

NVIDIA blueprintData ScienceNLPBERTopicMachine Learning

nvidia

TRT LLM for Inference

Install and use TensorRT-LLM on DGX Spark

NVIDIA blueprintDGXSpark

nvidia

Unsloth on DGX Spark

Optimized fine-tuning with Unsloth

NVIDIA blueprintDGXSpark

nvidia

Vibe Coding in VS Code

Use DGX Spark as a local or remote Vibe Coding assistant with Ollama and Continue

NVIDIA blueprintDGXVibeCodingSpark

nvidia

Vision-Language Model Fine-tuning

Fine-tune Vision-Language Models for image and video understanding tasks using Qwen2.5-VL and InternVL3

NVIDIA blueprintDGXImage UnderstandingVision-Language ModelsGRPO

nvidia

vLLM for Inference

Install and use vLLM on DGX Spark

NVIDIA blueprintDGXSpark

nvidia

vLLM for Inference

Install and use vLLM on DGX Station

NVIDIA blueprintStationvLLMInference

nvidia

vLLM for Inference

Install and use vLLM on NVIDIA RTX Pro 6000

NVIDIA blueprintvLLMInferenceRTX

nvidia

VS Code

Install and use VS Code locally or remotely

NVIDIA blueprintDGXSpark

nvidia

🦞 Set Up Example NemoClaw Agents 🦞

Ready-to-run application examples for your NemoClaw sandbox — policy, prompt, and personalization for each workflow

NVIDIA blueprintPersonal AssistantTelegramApplicationsDGX Spark

nvidia

Build a Video Search and Summarization (VSS) Agent

Run the VSS Blueprint on your Spark

NVIDIA blueprintDGXSpark

nvidia

Build and Deploy a Multi-Agent Chatbot

Deploy a multi-agent chatbot system and chat with agents on your Spark

NVIDIA blueprintDGXAgentsSpark

nvidia

CLI Coding Agent

Build local CLI coding agents with Ollama

NVIDIA blueprintCodingOllamaClaude CodeOpenCode

nvidia

Comfy UI

Install and use Comfy UI to generate images

NVIDIA blueprintDGXSpark

nvidia

Connect Multiple DGX Spark through a Switch

Set up a cluster of DGX Spark devices that are connected through Switch

NVIDIA blueprintDGXSpark

nvidia

Connect Three DGX Spark in a Ring Topology

Connect and set up three DGX Spark devices in a ring topology

NVIDIA blueprintDGXSpark

nvidia

Connect Two Sparks

Connect two Spark devices and setup them up for inference and fine-tuning

NVIDIA blueprintDGXSpark

nvidia

CUDA-X Data Science

Install and use NVIDIA cuML and NVIDIA cuDF to accelerate UMAP, HDBSCAN, pandas and more with zero code changes

NVIDIA blueprintpandasdimensionality reductiondata analyticsDGX

nvidia

cuTile Kernels

Run cuTile kernel benchmarks, FMHA implementation, and LLM inference on DGX Spark and B300

NVIDIA blueprintFMHACross-PlatformDeepSeekDocker

nvidia

DGX Dashboard

Monitor your DGX system and launch JupyterLab

NVIDIA blueprintDGXSpark

nvidia

DGX Station AI Skills for Coding Agents

Give your coding agent (Claude Code, Codex, Gemini CLI, Cursor) DGX Station expertise via an AGENTS.md and on-demand Agent Skills

NVIDIA blueprintvLLMAI AgentsBlackwellDGX Station

nvidia

Fine-tune with NeMo

Use NVIDIA NeMo to fine-tune models locally

NVIDIA blueprintDGXSpark

nvidia

Fine-tune with Pytorch

Use Pytorch to fine-tune models locally

NVIDIA blueprintDGXSpark

nvidia

FLUX.1 Dreambooth LoRA Fine-tuning

Fine-tune FLUX.1-dev 12B model using Dreambooth LoRA for custom image generation

NVIDIA blueprintImage GenerationComfyUIDGXLoRA

nvidia

How to Build a Multi-GPU AI PC - A Practical Guide

Many people explore local generative AI for privacy and to avoid token limits, but newer models require significant memory and compute—leading some to adopt multi-GPU setups.

NVIDIA blueprintComfyUILlama.cppRTX

nvidia

How to Fine-Tune an LLM on NVIDIA GPUs With Unsloth

Fine-tune popular AI models faster in Unsloth with NVIDIA RTX AI PCs, RTX PRO workstations, and DGX Spark—plus explore the new Nemotron Nano 3 family of open models.

NVIDIA blueprintFine-TuningRTXLLMGPU

nvidia

How to Get Started With Large Language Models on NVIDIA RTX PCs

Learn about using LLMs locally on PCs and workstations with Ollama, AnythingLLM, and LM Studio.

NVIDIA blueprintLLMsOllamaRTXAnythingLLM

nvidia

How to Get Started With Visual Generative AI on NVIDIA RTX PCs

Learn how to run advanced image and video generation locally with ComfyUI and LTX-2 on RTX PCs.

NVIDIA blueprintGen AIComfyUILTX-2RTX

nvidia

Image & Video Generation with ComfyUI

Generate images and videos with FLUX, Wan 2.1, HunyuanVideo, and Cosmos on DGX Station

NVIDIA blueprintStationImage GenerationComfyUIDocker

nvidia

Install and Use Isaac Sim and Isaac Lab

Build Isaac Sim and Isaac Lab from source for Spark

NVIDIA blueprintDGXSpark

nvidia

Isaac GR00T N1.6 Fine-Tuning

Fine-tune and benchmark NVIDIA's GR00T N1.6 robotics foundation model on DGX Station

NVIDIA blueprintStationFine-TuningIsaac GR00TBlackwell

nvidia

Live VLM WebUI

Real-time Vision Language Model interaction with webcam streaming

NVIDIA blueprintVision AIDGXVLMSpark

nvidia

LLaMA Factory

Install and fine-tune models with LLaMA Factory

NVIDIA blueprintDGXSpark

nvidia

LLM Inference with SGLang

Serve LLMs with SGLang on DGX Station (Qwen3-8B default; Qwen3.6 MoE optional)—prefix-cached multi-turn, structured output, benchmarks, and inference-server guidance

NVIDIA blueprintStationRadixAttentionStructured OutputBlackwell

nvidia

LM Studio on DGX Spark

Deploy LM Studio and serve LLMs on a Spark device; use LM Link to access models remotely.

NVIDIA blueprintInferencellmsterLM StudioLM Link

nvidia

Local Coding Agent

Run local CLI coding agents with Ollama on DGX Station (NVIDIA GB300) using glm-4.7-flash (fast) or unsloth/GLM-4.7-GGUF:Q8_0 (best quality)

NVIDIA blueprintStationCodingOllamaClaude Code

nvidia

Local Healthcare Agent on DGX Station

Run healthcare AI agents that analyze patient data and predict protein structures in an OpenShell sandbox on DGX Station

NVIDIA blueprintStationOpenFold3NemoClawNemotron

nvidia

MIG on DGX Station

Enable and configure Multi-Instance GPU (MIG) on DGX Station with GB300 Ultra (B300 GPUs)

NVIDIA blueprintStationSystem ConfigurationDGX StationMIG

nvidia

Multi-modal Inference

Setup multi-modal inference with TensorRT

NVIDIA blueprintDGXSpark

nvidia

Nanochat on Dual-Spark

Setup Nanochat on Dual-Spark

NVIDIA blueprintDGXSpark

nvidia

Nanochat Training

Train a small ChatGPT-style LLM (nanochat) with tokenizer, pretraining, midtraining, and SFT on DGX Station with GB300 Ultra

NVIDIA blueprintTrainingnanochatPyTorchDGX Station

nvidia

NCCL for Two Sparks

Install and test NCCL on two Sparks

NVIDIA blueprintDGXSpark

nvidia

Nemotron-3-Nano with llama.cpp

Run Nemotron-3-Nano-30B model using llama.cpp on DGX Spark

NVIDIA blueprintNemotronInferenceLLMllama.cpp

nvidia

NIM on Spark

Deploy a NIM on Spark

NVIDIA blueprintDGXSpark

nvidia

NVFP4 Pretraining with Megatron Bridge

Pretrain Llama 3.1 8B with NVFP4 mixed precision on DGX Station using Megatron Bridge

NVIDIA blueprintTrainingNVFP4Megatron BridgeStation

nvidia

NVFP4 Quantization

Quantize a model to NVFP4 to run on DGX Station using TensorRT Model Optimizer

NVIDIA blueprintStationDGX

nvidia

NVFP4 Quantization

Quantize a model to NVFP4 to run on Spark using TensorRT Model Optimizer

NVIDIA blueprintDGXSpark

nvidia

NVIDIA Video Generation Guide

Learn how to take control of visual generative AI and create high-resolution videos using LTX-2 in ComfyUI, accelerated on RTX.

NVIDIA blueprintComfyUILTX-2RTX

nvidia

Open WebUI with Ollama

Install Open WebUI and use Ollama to chat with models on your Spark

NVIDIA blueprintDGXSpark

nvidia

OpenClaw 🦞

Run OpenClaw locally on DGX Spark with a vLLM-served local model

NVIDIA blueprintDGXSparkLocal LLMAI Agent

nvidia

Optimized JAX

Optimize JAX to run on Spark

NVIDIA blueprintDGXSpark

nvidia

Portfolio Optimization

GPU-accelerated portfolio optimization using cuOpt and cuML

NVIDIA blueprintData ScienceRAPIDSFinancial Services

nvidia

Profiler-Driven Kernel Optimization for Fine-Tuning

Use torch.profiler to find training bottlenecks, then write custom Triton kernels to optimize LLaMA 8B fine-tuning

NVIDIA blueprintTrainingFine-TuningPerformance OptimizationKernel Development

nvidia

RAG Application in AI Workbench

Install and use AI Workbench to clone and run a reproducible RAG application

NVIDIA blueprintDGXSpark

nvidia

Register DGX Spark to Brev

Link your DGX Spark to Brev for remote access and shared environments

NVIDIA blueprintDGX SparkBrevSpark

nvidia

Register DGX Station to Brev

Link your DGX Station to Brev for remote access and sharing

NVIDIA blueprintStationDGX StationBrev

nvidia

Run Hermes Agent with Local Models

Install and run the Hermes self-improving AI agent on DGX Spark.

NVIDIA blueprintNous ResearchLLMAI AgentSpark

nvidia

Run models with llama.cpp on DGX Spark

Build llama.cpp with CUDA and serve models via an OpenAI-compatible API

NVIDIA blueprintDGX SparkInferenceLLMllama.cpp

nvidia

Run NemoClaw with a Local LLM

Build your first local AI assistant on DGX Station using NemoClaw in a secure sandbox, with optional Telegram.

NVIDIA blueprintStationTelegramAgentic WorkflowNemoClaw

nvidia

Run NemoClaw with a Local LLM

Build your first local AI assistant on DGX Spark using NemoClaw and Ollama in a secure sandbox, with optional Telegram.

NVIDIA blueprintTelegramDGX SparkAgentic WorkflowNemoClaw

nvidia

Run OpenClaw For Free On NVIDIA RTX GPUs & DGX Spark

Learn how to set up and host the popular AI agent using local inference apps optimized for RTX.

NVIDIA blueprintDGX SparkOpenClawRTX

nvidia

Secure Long Running AI Agents with OpenShell on DGX Station

Run OpenClaw with local models in an NVIDIA OpenShell sandbox on DGX Station

NVIDIA blueprintStationDGX StationOpenShellSecurity

nvidia

Secure Long Running AI Agents with OpenShell on DGX Spark

Run OpenClaw with local models in an NVIDIA OpenShell sandbox on DGX Spark

NVIDIA blueprintDGXOpenShellSparkSecurity

nvidia

Set Up Local Network Access

NVIDIA Sync helps set up and configure SSH access

NVIDIA blueprintDGXSpark

nvidia

Set up Tailscale on Your Spark

Use Tailscale to connect to your Spark on your home network no matter where you are

NVIDIA blueprintDGXSpark

nvidia

SGLang for Inference

Install and use SGLang on DGX Spark

NVIDIA blueprintDGXSpark

nvidia

Single-cell RNA Sequencing

An end-to-end GPU-powered workflow for scRNA-seq using RAPIDS

NVIDIA blueprintdata science

nvidia

Spark & Reachy Photo Booth

AI-augmented photo booth using the DGX Spark and Reachy Mini.

NVIDIA blueprintgenerative-aiagentsdockerSpark

nvidia

Speculative Decoding

Learn how to set up speculative decoding for fast inference on Spark

NVIDIA blueprintDGXSpark

nvidia

Text to Knowledge Graph

Transform unstructured text into interactive knowledge graphs with LLM inference and graph visualization

NVIDIA blueprintGraphRAGKnowledge GraphsNLPDGX

nvidia

Text to Knowledge Graph on DGX Station

Transform unstructured text into interactive knowledge graphs with LLM inference and graph visualization

NVIDIA blueprintGraphRAGKnowledge GraphsNLPOllama

nvidia

Topic Modeling

Extract insights from massive text datasets using cuML's GPU-accelerated BERTopic

NVIDIA blueprintData ScienceNLPBERTopicMachine Learning

nvidia

TRT LLM for Inference

Install and use TensorRT-LLM on DGX Spark

NVIDIA blueprintDGXSpark

nvidia

Unsloth on DGX Spark

Optimized fine-tuning with Unsloth

NVIDIA blueprintDGXSpark

nvidia

Vibe Coding in VS Code

Use DGX Spark as a local or remote Vibe Coding assistant with Ollama and Continue

NVIDIA blueprintDGXVibeCodingSpark

nvidia

Vision-Language Model Fine-tuning

Fine-tune Vision-Language Models for image and video understanding tasks using Qwen2.5-VL and InternVL3

NVIDIA blueprintDGXImage UnderstandingVision-Language ModelsGRPO

nvidia

vLLM for Inference

Install and use vLLM on DGX Spark

NVIDIA blueprintDGXSpark

nvidia

vLLM for Inference

Install and use vLLM on DGX Station

NVIDIA blueprintStationvLLMInference

nvidia

vLLM for Inference

Install and use vLLM on NVIDIA RTX Pro 6000

NVIDIA blueprintvLLMInferenceRTX

nvidia

VS Code

Install and use VS Code locally or remotely

NVIDIA blueprintDGXSpark

nvidia

🦞 Set Up Example NemoClaw Agents 🦞

Ready-to-run application examples for your NemoClaw sandbox — policy, prompt, and personalization for each workflow

NVIDIA blueprintPersonal AssistantTelegramApplicationsDGX Spark