Models, skills and blueprints for GPU jobs.

NVIDIA model, Synthetic Data Generation, Autonomous Vehicles, Physical AI, robotics

nvidia

cosmos-transfer2.5-2b

Generates physics-aware video world states for physical AI development using text prompts and multiple spatial control inputs derived from real-world data or simulation.

NVIDIA model, autonomous vehicles, Physical AI, robotics, text-to-world

nvidia

cosmos3-nano

Generates physics-aware videos from text prompts or an image prompt for physical AI development.

NVIDIA model, video understanding, autonomous vehicles, industrial, Physical AI

nvidia

cosmos3-nano-reasoner

Vision language model that excels in understanding the physical world using structured reasoning on videos or images.

NVIDIA model, Download Available, Route Optimization

nvidia

cuopt

World-record accuracy and performance for complex route optimization.

NVIDIA model, B200, H200, H100 80GB HBM3, MoE

deepseek-ai

deepseek-v4-flash

DeepSeek V4 Flash is a 284B MoE model with 1M-token context optimized for fast coding and agents.

NVIDIA model, MoE, reasoning, coding, agentic

deepseek-ai

deepseek-v4-pro

DeepSeek V4 scales to 1M-token context windows with efficient MoE architecture for coding tasks.

NVIDIA model, Chemistry, nim, BioNemo, Docking

mit

diffdock

Predicts the 3D structure of how a molecule interacts with a protein.

NVIDIA model, chat, diffusion-llm, text-to-text, reasoning

google

diffusiongemma-26b-a4b-it

Diffusion-based 26B parameter LLM enabling parallel token generation for real-time text apps

NVIDIA model, chat, Code Generation, Text-to-Text, Free Endpoint

abacusai

dracarys-llama-3.1-70b-instruct

Fine-tuned Llama 3.1 70B model for code generation, summarization, and multi-language tasks.

NVIDIA model, nim, Protein Embedding, BioNemo, Biology

meta

esm2-650m

Generates embeddings of proteins from their amino acid sequences.

NVIDIA model, biology, nim, Bionemo, protein folding

meta

esmfold

Predicts the 3D structure of a protein from its amino acid sequence.

NVIDIA model, DNA Generation, biology, nim, Bionemo

arc

evo2-40b

Evo 2 is a biological foundation model that is able to integrate information over long genomic sequences while retaining sensitivity to single-nucleotide changes.

NVIDIA model, Download Available, telepresence, Nvidia Maxine, Digital Human

nvidia

eyecontact

Estimate gaze angles of a person in a video and redirect to make it frontal.

NVIDIA model, aerodynamics, cae, fluid-dynamics, simulation

cadence

fidelity

Run computational fluid dynamics (CFD) simulations.

NVIDIA model, aerodynamics, cae, fluid-dynamics, simulation

ansys

fluent

Run computational fluid dynamics (CFD) simulations.

NVIDIA model, Text-to-Image, Image Generation, Partner Endpoint, Download Available

black-forest-labs

FLUX.1-dev

FLUX.1 is a state-of-the-art suite of image generation models

NVIDIA model, Text-to-Image, Image Generation, Partner Endpoint, Download Available

black-forest-labs

FLUX.1-Kontext-dev

FLUX.1 Kontext is a multimodal model that enables in-context image generation and editing.

NVIDIA model, Together AI, Deepinfra, Text-to-Image, Image Generation

black-forest-labs

FLUX.1-schnell

FLUX.1-schnell is a distilled image generation model, producing high quality images at fast speeds

NVIDIA model, image editing, Run-on-RTX, Text-to-Image, Image Generation

black-forest-labs

flux.2-klein-4b

FLUX.2-klein-4B is a distilled image generation and editing model, producing outputs at lightning speed

NVIDIA model, Download Available, Weather Simulation, AI Weather Prediction, climate science

nvidia

fourcastnet

FourCastNet predicts global atmospheric dynamics of various weather / climate variables.

NVIDIA model, chat, Text-to-Text, Language Generation

google

gemma-2-2b-it

Advanced small language generative AI model for edge applications

NVIDIA model, chat, language generation, speech recognition, Visual QA

google

gemma-3n-e2b-it

An edge computing AI model which accepts text, audio and image input, ideal for resource-constrained environments

NVIDIA model, chat, Together AI, Bitdeer, language generation

google

gemma-3n-e4b-it

An edge computing AI model which accepts text, audio and image input, ideal for resource-constrained environments

NVIDIA model, B200, H200, L40S, Together AI

google

gemma-4-31b-it

Dense 31B model delivering frontier reasoning for coding, agentic workflows, and fine-tuning.

nvidia

genmol

Fragment-Based Molecular Generation by Discrete Diffusion.

NVIDIA model, Free Endpoint, PII Detection, NVIDIA NIM

nvidia

gliner-pii

GLiNER PII detects Personally Identifiable Information in text.

NVIDIA model, B200, H200, Together AI, Bitdeer

z-ai

glm-5.1

GLM-5.1 is a flagship LLM for agentic workflows, coding, and long-horizon reasoning tasks.

NVIDIA model, Together AI, Eigen AI, CoreWeave, Deepinfra

openai

gpt-oss-120b

Mixture of Experts (MoE) reasoning LLM (text-only) designed to fit within an 80GB GPU.

NVIDIA model, reasoning, text-to-text, chat, math

openai

gpt-oss-20b

Smaller Mixture of Experts (MoE) text-only LLM for efficient AI reasoning and math

NVIDIA model, Quantum, reasoning, Vision Language Model, calibration

nvidia

ising-calibration-1-35b-a3b

Open VLM for quantum computer calibration chart understanding across a range of qubit modalities.

moonshotai

kimi-k2.6

1T multimodal MoE for long-horizon coding, agentic tool use, and image/video understanding.

NVIDIA model, broadcast, lipsync, localization, news

nvidia

LipSync

Generative lip dubbing that syncs lips in a video to input audio.

meta

llama-3.1-70b-instruct

Powers complex conversations with superior contextual understanding, reasoning and text generation.

NVIDIA model, B200, H100 NVL, A100 PG509 200, CoreWeave

meta

llama-3.1-8b-instruct

Advanced state-of-the-art model with language understanding, superior reasoning, and text generation.

NVIDIA model, nemo guardrails, LLM safety, Safety and moderation, dialogue safety

nvidia

llama-3.1-nemoguard-8b-content-safety

Leading content safety model for enhancing the safety and moderation capabilities of LLMs

NVIDIA model, nemo guardrails, LLM safety, Safety and moderation, dialogue safety

nvidia

llama-3.1-nemoguard-8b-topic-control

Topic control model to keep conversations focused on approved topics, avoiding inappropriate content.

NVIDIA model, H100 NVL, A100 SXM4 80GB, H200, advanced reasoning

nvidia

llama-3.1-nemotron-nano-8b-v1

Leading reasoning and agentic AI accuracy model for PC and edge.

NVIDIA model, chat, doc intelligence, multiple image understanding, OCR

nvidia

llama-3.1-nemotron-nano-vl-8b-v1

Multi-modal vision-language model that understands text and images and creates informative responses

NVIDIA model, content moderation, llm safety, multilingual guard model, multilingual content safety

nvidia

llama-3.1-nemotron-safety-guard-8b-v3

Leading multilingual content safety model for enhancing the safety and moderation capabilities of LLMs

NVIDIA model, chat, Image-Text Retrieval, Visual QA, Image Captioning

meta

llama-3.2-11b-vision-instruct

Cutting-edge vision-language model excelling in high-quality reasoning from images.

meta

llama-3.2-1b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

meta

llama-3.2-3b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

NVIDIA model, chat, Together AI, Deepinfra, Image-Text Retrieval

meta

llama-3.2-90b-vision-instruct

Cutting-edge vision-language model excelling in high-quality reasoning from images.

meta

llama-3.3-70b-instruct

Advanced LLM for reasoning, math, general knowledge, and function calling

NVIDIA model, B200, GH200 480GB, H100 NVL, advanced reasoning

nvidia

llama-3.3-nemotron-super-49b-v1

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

NVIDIA model, B200, GH200 480GB, H100 NVL, advanced reasoning

nvidia

llama-3.3-nemotron-super-49b-v1.5

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

NVIDIA model, chat, language generation, vision assistant, visual question answering

meta

llama-4-maverick-17b-128e-instruct

A general-purpose multimodal, multilingual MoE model with 128 experts and 17B active parameters.

NVIDIA model, LLM Multimodal Safety, Content Safety, Guardrail, Content Moderator

meta

llama-guard-4-12b

Multi-modal model to classify safety for input prompts as well as output responses.

NVIDIA model, Download Available, Text-to-Embedding, Retrieval Augmented Generation, NeMo Retriever

nvidia

llama-nemotron-embed-1b-v2

Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.

NVIDIA model, nemo retriever, embedding, Partner Endpoint, Download Available

nvidia

llama-nemotron-embed-vl-1b-v2

Multimodal question-answer retrieval representing user queries as text and documents as images.

NVIDIA model, nemo retriever, reranking, Download Available, Retrieval Augmented Generation

nvidia

llama-nemotron-rerank-1b-v2

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

NVIDIA model, nemo retriever, reranking, Partner Endpoint, Download Available

nvidia

llama-nemotron-rerank-vl-1b-v2

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

NVIDIA model, TTS, NVIDIA NIM, NVIDIA Riva, multilingual

nvidia

magpie-tts-multilingual

Natural and expressive voices in multiple languages. For voice agents and brand ambassadors.

NVIDIA model, TTS, NVIDIA NIM, NVIDIA Riva, Text-to-Speech

nvidia

magpie-tts-zeroshot

Expressive and engaging text-to-speech, generated from a short audio sample.

NVIDIA model, Neural machine translation, NVIDIA NIM, Text Translation, Download Available

nvidia

megatron-1b-nmt

Enable smooth global interactions in 36 languages.

minimaxai

minimax-m2.7

MiniMax M2.7 is a 230B-parameter text-to-text AI model excelling in coding, reasoning, and office tasks.

NVIDIA model, coding, text-to-text, reasoning, chat

minimaxai

minimax-m3

MiniMax M3 Preview is a multimodal MoE vision-language model with strong reasoning, coding, and tool-calling capabilities.

NVIDIA model, L40S, chat, language generation, SLM

mistralai

ministral-14b-instruct-2512

A general purpose VLM ideal for chat and instruction based use cases

NVIDIA model, chat, language generation, multimodal, agentic

mistralai

mistral-large-3-675b-instruct-2512

A state-of-the-art general purpose MoE VLM ideal for chat, agentic and instruction based use cases.

NVIDIA model, coding, reasoning, text, agentic

mistralai

mistral-medium-3.5-128b

A high performing model for text generation, coding and agentic use cases

NVIDIA model, chat, language generation, instruction following, function calling

mistralai

mistral-nemotron

Built for agentic workflows, this model excels in coding, instruction following, and function calling

NVIDIA model, chat, code generation, reasoning, image-to-text

mistralai

mistral-small-4-119b-2603

Hybrid MoE model unifying instruct, reasoning, and coding with multimodal input and 256k context

NVIDIA model, B200, L40, H100 NVL, Advanced Reasoning

mistralai

mixtral-8x7b-instruct-v0.1

An MoE LLM that follows instructions, completes requests, and generates creative text.

nvidia

molmim

MolMIM performs controlled generation, finding molecules with the right properties.

NVIDIA model, nim, Bionemo, Biology, Protein Folding

colabfold

msa-search

Generates a multiple sequence alignment from a query sequence and a protein sequence database search.

NVIDIA model, nemo guardrails, llm security, NIM, Prompt Injection

nvidia

nemoguard-jailbreak-detect

Industry leading jailbreak classification model for protection from adversarial attempts

NVIDIA model, Table Extraction, nemo retriever, data ingestion, extraction

nvidia

nemoretriever-ocr

Powerful OCR model for fast, accurate real-world image text extraction, layout, and structure analysis.

NVIDIA model, Object Detection, Chart Detection, nemo retriever, Table Detection

nvidia

nemoretriever-page-elements-v2

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

NVIDIA model, optical character recognition, nemo retriever, data ingestion, table extraction

nvidia

nemoretriever-parse

Cutting-edge vision-language model excelling in retrieving text and metadata from images.

NVIDIA model, llm safety, safety and moderation, multilingual content safety, ai safety nemo guardrails

nvidia

nemotron-3-content-safety

Multilingual, multimodal model for detecting unsafe and toxic content.

NVIDIA model, chat, MoE, Reasoning, Long Context

nvidia

nemotron-3-nano-30b-a3b

Open, efficient MoE model with 1M context, excelling in coding, reasoning, instruction following, tool calling, and more

NVIDIA model, Eigen AI, Bitdeer, Deepinfra, Lightning AI

nvidia

nemotron-3-nano-omni-30b-a3b-reasoning

Nemotron 3 Nano Omni is an omni-modal reasoning model that understands images, video, speech, text.

NVIDIA model, MoE, Reasoning, Chat, Long Context

nvidia

nemotron-3-super-120b-a12b

Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more

NVIDIA model, Agent, MoE, Frontier, Reasoning

nvidia

nemotron-3-ultra-550b-a55b

Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more

NVIDIA model, llm safety, safety and moderation, multilingual content safety, ai safety nemo guardrails

nvidia

nemotron-3.5-content-safety

Multilingual, multimodal model for detecting unsafe and toxic content.

NVIDIA model, Automatic Speech Recognition, NVIDIA NIM, NVIDIA Riva, Download Available

nvidia

nemotron-asr-streaming

Real-time speech recognition for English

NVIDIA model, NeMo Guardrails, Nemotron, reasoning, Safety and Moderation

nvidia

nemotron-content-safety-reasoning-4b

A context‑aware safety model that applies reasoning to enforce domain‑specific policies.

NVIDIA model, Object Detection, Chart Detection, nemo retriever, Table Detection

nvidia

nemotron-graphic-elements-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

NVIDIA model, chat, Text-to-Text, Language Generation

nvidia

nemotron-mini-4b-instruct

Optimized SLM for on-device inference and fine-tuned for roleplay, RAG and function calling

NVIDIA model, chat, language generation, vision assistant, visual question answering

nvidia

nemotron-nano-12b-v2-vl

Nemotron Nano 12B v2 VL enables multi-image and video understanding, along with visual Q&A and summarization capabilities.

NVIDIA model, Table Extraction, nemo retriever, data ingestion, extraction

nvidia

nemotron-ocr-v1

Powerful OCR model for fast, accurate real-world image text extraction, layout, and structure analysis.

NVIDIA model, Object Detection, Chart Detection, Table Detection, data ingestion

nvidia

nemotron-page-elements-v3

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

NVIDIA model, Download Available, text and table extraction, document parsing, supported language - english

nvidia

nemotron-parse

Cutting-edge vision-language model excelling in retrieving text and metadata from images.

NVIDIA model, Object Detection, Chart Detection, nemo retriever, Table Detection

nvidia

nemotron-table-structure-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

NVIDIA model, English, voice chat, NVIDIA NIM, Free Endpoint

nvidia

nemotron-voicechat

Nemotron 3 Voicechat

NVIDIA model, Non-Commercial Use Only, Text-to-Embedding, Retrieval Augmented Generation, Free Endpoint

nvidia

nv-embed-v1

Generates high-quality numerical embeddings from text inputs.

NVIDIA model, nemo retriever, Embedding, Free Endpoint, Retrieval Augmented Generation

nvidia

nv-embedcode-7b-v1

The NV-EmbedCode model is a 7B Mistral-based embedding model optimized for code retrieval, supporting text, code, and hybrid queries.

NVIDIA model, Download Available, Embedding, run-on-rtx, Nemo retriever

nvidia

nv-embedqa-e5-v5

English text embedding model for question-answering retrieval.

NVIDIA model, Object Detection, Data ingestion, Chart Detection, nemo retriever

nvidia

nv-yolox-page-elements-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

NVIDIA model, chat, Together AI, Deepinfra, thinking budget

nvidia

nvidia-nemotron-nano-9b-v2

High‑efficiency LLM with hybrid Transformer‑Mamba design, excelling in reasoning and agentic tasks.

openfold

openfold2

Predicts the 3D structure of a protein from its amino acid sequence, multiple sequence alignments, and templates.

openfold

openfold3

OpenFold3 is a third-generation biomolecular foundation model that predicts the three-dimensional structures of molecular complexes (proteins, DNA, RNA, ligands)

NVIDIA model, B200, H100 NVL, A100 PG509 200, Optical Character Recognition

baidu

paddleocr

Model for table extraction that receives an image as input, runs OCR on the image, and returns the text within the image and its bounding boxes.

NVIDIA model, image, cv, Vision Assistant, vlm

google

paligemma

Vision language model adept at comprehending text and visual inputs to produce informative responses

NVIDIA model, Automatic Speech Recognition, NVIDIA NIM, NVIDIA Riva, Download Available

nvidia

parakeet-1.1b-rnnt-multilingual-asr

High accuracy and optimized performance for transcription in 25 languages

NVIDIA model, ASR, Streaming, English, Batch

nvidia

parakeet-ctc-0.6b-asr

State-of-the-art accuracy and speed for English transcriptions.

NVIDIA model, ASR, Streaming, Spanish, NVIDIA NIM

nvidia

parakeet-ctc-0.6b-es

Accurate and optimized Spanish-English transcriptions with punctuation and word timestamps.

NVIDIA model, ASR, Streaming, Vietnamese, NVIDIA NIM

nvidia

parakeet-ctc-0.6b-vi

Accurate and optimized Vietnamese-English transcriptions with punctuation and word timestamps.

NVIDIA model, ASR, Streaming, Mandarin, NVIDIA NIM

nvidia

parakeet-ctc-0.6b-zh-cn

Record-setting accuracy and performance for Mandarin-English transcriptions.

NVIDIA model, ASR, Streaming, Taiwanese, NVIDIA NIM

nvidia

parakeet-ctc-0.6b-zh-tw

Record-setting accuracy and performance for Taiwanese Mandarin-English transcriptions.

NVIDIA model, ASR, Streaming, English, batch

nvidia

parakeet-ctc-1.1b-asr

Record-setting accuracy and performance for English transcription.

NVIDIA model, ASR, English, NVIDIA NIM, NVIDIA Riva

nvidia

parakeet-tdt-0.6b-v2

Accurate and optimized English transcriptions with punctuation and word timestamps

NVIDIA model, chat, CoreWeave, Text-to-Text

microsoft

phi-4-mini-instruct

Lightweight multilingual LLM powering AI applications in latency bound, memory/compute constrained environments

NVIDIA model, Speech Recognition, Visual QA, Language Generation, Chart and Table Understanding

microsoft

phi-4-multimodal-instruct

Cutting-edge open multimodal model excelling in high-quality reasoning from image and audio inputs.

NVIDIA model, biology, nim, BioNemo, Protein Generation

ipd

proteinmpnn

ProteinMPNN is a deep learning model for predicting amino acid sequences for protein backbones.

NVIDIA model, Together AI, Text-to-Image, Image Generation, Partner Endpoint

qwen

qwen-image

Qwen-Image is a text-to-image foundation model with advanced multilingual text rendering.

NVIDIA model, Text-to-Image, Image Generation, Partner Endpoint, Download Available

qwen

qwen-image-edit

Qwen-Image-Edit is an image editing model with multilingual text editing and strong subject consistency.

qwen

qwen3-next-80b-a3b-instruct

Qwen3-Next Instruct blends hybrid attention, sparse MoE, and stability boosts for ultra-long context AI.

NVIDIA model, B200, GB200, chat, tool calling

qwen

qwen3.5-122b-a10b

122B MoE LLM (10B active) for coding, reasoning, multimodal chat. Agent-ready.

NVIDIA model, MoE, image-to-image, VLM, agentic

qwen

qwen3.5-397b-a17b

Next-gen Qwen 3.5 VLM (400B MoE) brings advanced vision, chat, RAG, and agentic capabilities.

NVIDIA model, HDRI, remote contribution, lighting, nvidia ai for media

nvidia

Relighting

Re-illuminate people in video to match target lighting from a 360 HDRI environment map.

NVIDIA model, Ranking, Retrieval Augmented Generation, Free Endpoint

nvidia

rerank-qa-mistral-4b

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

NVIDIA model, A100 SXM4 80GB, L40S, A10G, biology

ipd

rfdiffusion

A generative model of protein backbones for protein binder design.

NVIDIA model, Neural machine translation, NVIDIA NIM, Text Translation, Download Available

nvidia

riva-translate-1.6b

Enable smooth global interactions in 36 languages.

NVIDIA model, nvidia nim, neural machine translation, Text Translation, Free Endpoint

nvidia

riva-translate-4b-instruct-v1_1

Translation model for 12 languages with support for few-shot example prompts.

NVIDIA model, coding, indic languages, hybrid, reasoning

sarvamai

sarvam-m

Multilingual, hybrid-reasoning model optimized for Indian-language tasks, programming, and mathematical reasoning.

NVIDIA model, chat, thinking budget, reasoning, text-generation

bytedance

seed-oss-36b-instruct

ByteDance open-source LLM with long-context, reasoning, and agentic intelligence.

NVIDIA model, aerodynamics, cae, fluid-dynamics, simulation

siemens

simcenter-star-ccm+

Run computational fluid dynamics (CFD) simulations.

NVIDIA model, Non-Commercial Use Only, chat, Text-to-Text, Language Generation

upstage

solar-10.7b-instruct

Excels in NLP tasks, particularly in instruction-following, reasoning, and mathematics.

NVIDIA model, autonomous vehicles, bev, av stack, automotive

nvidia

sparsedrive

End-to-end autonomous driving stack integrating perception, prediction, and planning with sparse scene representations for efficiency and safety.

NVIDIA model, chip-design, electronic-design-automation, eda, semiconductor

cadence

spectre-x

Run large-scale electronics and chip design verification simulations

NVIDIA model, Text-to-Image, Image Generation, Download Available

stabilityai

stable-diffusion-3.5-large

Stable Diffusion 3.5 is a popular text-to-image generation model

NVIDIA model, chat, Agentic, Coding, Reasoning

stepfun-ai

step-3.5-flash

200B open-source reasoning engine with sparse MoE powering frontier agentic AI.

NVIDIA model, B200, H200, H100 80GB HBM3, chat

stepfun-ai

step-3.7-flash

A sparse MoE multimodal reasoning model good for enterprise, agentic and coding tasks.

NVIDIA model, B200, H100 NVL, A100 PG509 200, sovereign ai

stockmark

stockmark-2-100b-instruct

Japanese-specialized large-language-model for enterprises to read and understand complex business documents.

NVIDIA model, autonomous vehicles, bev, AV Stack, automotive

nvidia

streampetr

StreamPETR offers efficient 3D object detection for autonomous driving by propagating sparse object queries temporally.

NVIDIA model, broadcast, smpte, communications, mic quality

nvidia

Studio Voice

Enhance input speech recorded with low-quality microphones in noisy or reverberant environments, producing studio-quality speech.

NVIDIA model, broadcast, media2, forensics, nvidia ai for media

nvidia

synthetic-video-detector

NVIDIA Synthetic Video Detector is an AI-powered micro-service for detecting AI‑generated (synthetic) videos.

NVIDIA model, text-to-3d, Run-on-RTX, image-to-3d, Download Available

microsoft

TRELLIS

MSFT TRELLIS is a 3D AI model that generates high-quality 3D assets from text or image inputs.

NVIDIA model, Interactive Annotation, Image Segmentation, Non-Commercial Use Only, Download Available

nvidia

vista-3d

VISTA-3D is a specialized interactive foundation model for segmenting and annotating human anatomies.

NVIDIA model, B200, H100 80GB HBM3, ASR, AST

openai

whisper-large-v3

Robust Speech Recognition via Large-Scale Weak Supervision.

NVIDIA model, broadcast, localization, smpte, speaker detection

nvidia

Active Speaker Detection

Detect and track speaker identities across video frames.

NVIDIA model, nim, Bionemo, Biology, protein folding

deepmind

alphafold2

Predicts the 3D structure of a protein from its amino acid sequence.

NVIDIA model, nim, Bionemo, Biology, protein folding

deepmind

alphafold2-multimer

Predicts the 3D structure of a protein from its amino acid sequence.

sqwh1lyrveic

AODT 1.2.1

sqwh1lyrveic

AODT 1.2.2

NVIDIA model, denoising, communications, nvidia ai for media, speech enhancement

nvidia

Background Noise Removal

Removes unwanted noise from audio, improving speech intelligibility.

NVIDIA model, autonomous vehicles, bev, automotive, perception

nvidia

bevformer

Advanced transformer for multi-frame bird's-eye-view 3D perception in autonomous driving.

NVIDIA model, Embeddings, Retrieval Augmented Generation, Partner Endpoint, Download Available

baai

bge-m3

Embedding model for text retrieval tasks, excelling in dense, multi-vector, and sparse retrieval.

NVIDIA model, nim, Bionemo, Biology, Protein Folding

mit

Boltz-2

Predict complex structures using Boltz-2.

NVIDIA model, Automatic Speech Recognition, Automatic Speech Translation, NVIDIA NIM, NVIDIA Riva

nvidia

canary-1b-asr

Multi-lingual model supporting speech-to-text recognition and translation.

NVIDIA model, DGX Spark, A100 SXM4 80GB, L40S, TTS

resembleai

chatterbox-multilingual-tts

Natural and expressive voices in 23 languages. For voice agents and brand ambassadors.

NVIDIA model, ASR, streaming, Spanish, NVIDIA NIM

nvidia

conformer-ctc-asr

Automatic speech recognition model that transcribes speech in lowercase Spanish with record-setting accuracy and performance

NVIDIA model, video understanding, autonomous vehicles, industrial, Physical AI

nvidia

cosmos-reason2-8b

Vision language model that excels in understanding the physical world using structured reasoning on videos or images.

NVIDIA model, Synthetic Data Generation, Autonomous Vehicles, Physical AI, robotics

nvidia

cosmos-transfer1-7b

Generates physics-aware video world states for physical AI development using text prompts and multiple spatial control inputs derived from real-world data or simulation.

NVIDIA modelSynthetic Data GenerationAutonomous VehiclesPhysical AIrobotics

nvidia

cosmos-transfer2.5-2b

Generates physics-aware video world states for physical AI development using text prompts and multiple spatial control inputs derived from real-world data or simulation.

NVIDIA modelautonomous vehiclesPhysical AIroboticstext-to-world

nvidia

cosmos3-nano

Generates physics-aware videos from text prompts or an image prompt for physical AI development.

NVIDIA modelvideo understandingautonomous vehiclesindustrialPhysical AI

nvidia

cosmos3-nano-reasoner

Vision language model that excels in understanding the physical world using structured reasoning on videos or images.

NVIDIA modelDownload AvailableRoute Optimization

nvidia

cuopt

World-record accuracy and performance for complex route optimization.

NVIDIA modelB200H200H100 80GB HBM3MoE

deepseek-ai

deepseek-v4-flash

DeepSeek V4 Flash is a 284B MoE model with 1M-token context optimized for fast coding and agents.

NVIDIA modelMoereasoningcodingagentic

deepseek-ai

deepseek-v4-pro

DeepSeek V4 scales to 1M-token context windows with efficient MoE architecture for coding tasks.

NVIDIA modelChemistrynimBioNemoDocking

mit

diffdock

Predicts the 3D structure of how a molecule interacts with a protein.

NVIDIA modelchatdiffusion-llmtext-to-textreasoning

google

diffusiongemma-26b-a4b-it

Diffusion-based 26B parameter LLM enabling parallel token generation for real-time text apps

NVIDIA modelchatCode GenerationText-to-TextFree Endpoint

abacusai

dracarys-llama-3.1-70b-instruct

Fine-tuned Llama 3.1 70B model for code generation, summarization, and multi-language tasks.

NVIDIA modelnimProtein EmbeddingBioNemoBiology

meta

esm2-650m

Generates embeddings of proteins from their amino acid sequences.

NVIDIA modelbiologynimBionemoprotein folding

meta

esmfold

Predicts the 3D structure of a protein from its amino acid sequence.

NVIDIA modelDNA GenerationbiologynimBionemo

arc

evo2-40b

Evo 2 is a biological foundation model that is able to integrate information over long genomic sequences while retaining sensitivity to single-nucleotide changes.

NVIDIA modelDownload AvailabletelepresenceNvidia MaxineDigital Human

nvidia

eyecontact

Estimate gaze angles of a person in a video and redirect to make it frontal.

NVIDIA modelaerodynamicscaefluid-dynamicssimulation

cadence

fidelity

Run computational-fluid dynamics (CFD) simulations

NVIDIA modelaerodynamicscaefluid-dynamicssimulation

ansys

fluent

Run computational-fluid dynamics (CFD) simulations

NVIDIA modelText-to-ImageImage GenerationPartner EndpointDownload Available

black-forest-labs

FLUX.1-dev

FLUX.1 is a state-of-the-art suite of image generation models

NVIDIA modelText-to-ImageImage GenerationPartner EndpointDownload Available

black-forest-labs

FLUX.1-Kontext-dev

FLUX.1 Kontext is a multimodal model that enables in-context image generation and editing.

NVIDIA modelTogether AIDeepinfraText-to-ImageImage Generation

black-forest-labs

FLUX.1-schnell

FLUX.1-schnell is a distilled image generation model, producing high quality images at fast speeds

NVIDIA modelimage editingRun-on-RTXText-to-ImageImage Generation

black-forest-labs

flux.2-klein-4b

FLUX.2-klein-4B is a distilled image generation and editing model, producing outputs at lighting speed

NVIDIA modelDownload AvailableWeather SimulationAI Weather Predictionclimate science

nvidia

fourcastnet

FourCastNet predicts global atmospheric dynamics of various weather / climate variables.

NVIDIA modelchatChatText-to-TextLanguage Generation

google

gemma-2-2b-it

Advanced small language generative AI model for edge applications

NVIDIA modelchatlanguage generationspeech recognitionVisual QA

google

gemma-3n-e2b-it

An edge computing AI model which accepts text, audio and image input, ideal for resource-constrained environments

NVIDIA modelchatTogether AIBitdeerlanguage generation

google

gemma-3n-e4b-it

An edge computing AI model which accepts text, audio and image input, ideal for resource-constrained environments

NVIDIA modelB200H200L40STogether AI

google

gemma-4-31b-it

Dense 31B model delivering frontier reasoning for coding, agentic workflows, and fine-tuning.

nvidia

genmol

Fragment-Based Molecular Generation by Discrete Diffusion.

NVIDIA modelFree EndpointPII DetectionNVIDIA NIM

nvidia

gliner-pii

GLiNER PII detects Personally Identifiable Information in text.

NVIDIA modelB200H200Together AIBitdeer

z-ai

glm-5.1

GLM-5.1 is a flagship LLM for agentic workflows, coding, and long-horizon reasoning tasks.

NVIDIA modelTogether AIEigen AICoreWeaveDeepinfra

openai

gpt-oss-120b

Mixture of Experts (MoE) reasoning LLM (text-only) designed to fit within 80GB GPU.

NVIDIA modelreasoningtext-to-textchatmath

openai

gpt-oss-20b

Smaller Mixture of Experts (MoE) text-only LLM for efficient AI reasoning and math

NVIDIA modelQuantumreasoningVision Language Modelcalibration

nvidia

ising-calibration-1-35b-a3b

Open VLM for quantum computer calibration chart understanding across a range of qubit modalities.

moonshotai

kimi-k2.6

1T multimodal MoE for long-horizon coding, agentic tool use, and image/video understanding.

NVIDIA modelbroadcastlipsynclocalizationnews

nvidia

LipSync

Generative lip dubbing that syncs lips in a video to input audio.

meta

llama-3.1-70b-instruct

Powers complex conversations with superior contextual understanding, reasoning and text generation.

NVIDIA modelB200H100 NVLA100 PG509 200CoreWeave

meta

llama-3.1-8b-instruct

Advanced state-of-the-art model with language understanding, superior reasoning, and text generation.

NVIDIA modelnemo guardrailsLLM safetySafety and moderationdialogue safety

nvidia

llama-3.1-nemoguard-8b-content-safety

Leading content safety model for enhancing the safety and moderation capabilities of LLMs

NVIDIA modelnemo guardrailsLLM safetySafety and moderationdialogue safety

nvidia

llama-3.1-nemoguard-8b-topic-control

Topic control model to keep conversations focused on approved topics, avoiding inappropriate content.

NVIDIA modelH100 NVLA100 SXM4 80GBH200advanced reasoning

nvidia

llama-3.1-nemotron-nano-8b-v1

Leading reasoning and agentic AI accuracy model for PC and edge.

NVIDIA modelchatdoc intelligencemultiple image understandingOCR

nvidia

llama-3.1-nemotron-nano-vl-8b-v1

Multi-modal vision-language model that understands text and images and generates informative responses.

NVIDIA modelcontent moderationllm safetymultilingual guard modelmultilingual content safety

nvidia

llama-3.1-nemotron-safety-guard-8b-v3

Leading multilingual content safety model for enhancing the safety and moderation capabilities of LLMs

NVIDIA modelchatImage-Text RetrievalVisual QAImage Captioning

meta

llama-3.2-11b-vision-instruct

Cutting-edge vision-language model excelling in high-quality reasoning from images.

meta

llama-3.2-1b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

meta

llama-3.2-3b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

NVIDIA modelchatTogether AIDeepinfraImage-Text Retrieval

meta

llama-3.2-90b-vision-instruct

Cutting-edge vision-language model excelling in high-quality reasoning from images.

meta

llama-3.3-70b-instruct

Advanced LLM for reasoning, math, general knowledge, and function calling

NVIDIA modelB200GH200 480GBH100 NVLadvanced reasoning

nvidia

llama-3.3-nemotron-super-49b-v1

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

NVIDIA modelB200GH200 480GBH100 NVLadvanced reasoning

nvidia

llama-3.3-nemotron-super-49b-v1.5

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

NVIDIA modelchatlanguage generationvision assistantvisual question answering

meta

llama-4-maverick-17b-128e-instruct

A general-purpose multimodal, multilingual MoE model with 128 experts and 17B active parameters.

NVIDIA modelLLM Multimodal SafetyContent SafetyGuardrailContent Moderator

meta

llama-guard-4-12b

Multi-modal model to classify safety for input prompts as well as output responses.

NVIDIA modelDownload AvailableText-to-EmbeddingRetrieval Augmented GenerationNeMo Retriever

nvidia

llama-nemotron-embed-1b-v2

Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.

NVIDIA modelnemo retrieverembeddingPartner EndpointDownload Available

nvidia

llama-nemotron-embed-vl-1b-v2

Multimodal question-answer retrieval representing user queries as text and documents as images.

NVIDIA modelnemo retrieverrerankingDownload AvailableRetrieval Augmented Generation

nvidia

llama-nemotron-rerank-1b-v2

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

NVIDIA modelnemo retrieverrerankingPartner EndpointDownload Available

nvidia

llama-nemotron-rerank-vl-1b-v2

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

NVIDIA modelTTSNVIDIA NIMNVIDIA Rivamultilingual

nvidia

magpie-tts-multilingual

Natural and expressive voices in multiple languages. For voice agents and brand ambassadors.

NVIDIA modelTTSNVIDIA NIMNVIDIA RivaText-to-Speech

nvidia

magpie-tts-zeroshot

Expressive and engaging text-to-speech, generated from a short audio sample.

NVIDIA modelNeural machine translationNVIDIA NIMText TranslationDownload Available

nvidia

megatron-1b-nmt

Enable smooth global interactions in 36 languages.

minimaxai

minimax-m2.7

MiniMax M2.7 is a 230B-parameter text-to-text AI model excelling in coding, reasoning, and office tasks.

NVIDIA modelcodingtext-to-textreasoningchat

minimaxai

minimax-m3

MiniMax M3 Preview is a multimodal MoE vision-language model with strong reasoning, coding, and tool-calling capabilities.

NVIDIA modelL40Schatlanguage generationSLM

mistralai

ministral-14b-instruct-2512

A general purpose VLM ideal for chat and instruction based use cases

NVIDIA modelchatlanguage generationmultimodalagentic

mistralai

mistral-large-3-675b-instruct-2512

A state-of-the-art general purpose MoE VLM ideal for chat, agentic and instruction based use cases.

NVIDIA modelcodingreasoningtextagentic

mistralai

mistral-medium-3.5-128b

A high performing model for text generation, coding and agentic use cases

NVIDIA modelchatlanguage generationinstruction followingfunction calling

mistralai

mistral-nemotron

Built for agentic workflows, this model excels in coding, instruction following, and function calling

NVIDIA modelchatcode generationreasoningimage-to-text

mistralai

mistral-small-4-119b-2603

Hybrid MoE model unifying instruct, reasoning, and coding with multimodal input and 256k context

NVIDIA modelB200L40H100 NVLAdvanced Reasoning

mistralai

mixtral-8x7b-instruct-v0.1

An MoE LLM that follows instructions, completes requests, and generates creative text.

nvidia

molmim

MolMIM performs controlled generation, finding molecules with the right properties.

NVIDIA modelnimBionemoBiologyProtein Folding

colabfold

msa-search

Generates a multiple sequence alignment from a query sequence and a protein sequence database search.

NVIDIA modelnemo guardrailsllm securityNIMPrompt Injection

nvidia

nemoguard-jailbreak-detect

Industry leading jailbreak classification model for protection from adversarial attempts

NVIDIA modelTable Extractionnemo retrieverdata ingestionextraction

nvidia

nemoretriever-ocr

Powerful OCR model for fast, accurate real-world image text extraction, layout, and structure analysis.

NVIDIA modelObject DetectionChart Detectionnemo retrieverTable Detection

nvidia

nemoretriever-page-elements-v2

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

NVIDIA modeloptical character recognitionnemo retrieverdata ingestiontable extraction

nvidia

nemoretriever-parse

Cutting-edge vision-language model excelling in retrieving text and metadata from images.

NVIDIA modelllm safetysafety and moderationmultilingual content safetyai safety nemo guardrails

nvidia

nemotron-3-content-safety

Multilingual, multimodal model for detecting unsafe and toxic content.

NVIDIA modelchatMoEReasoningLong Context

nvidia

nemotron-3-nano-30b-a3b

Open, efficient MoE model with 1M context, excelling in coding, reasoning, instruction following, tool calling, and more

NVIDIA modelEigen AIBitdeerDeepinfraLightning AI

nvidia

nemotron-3-nano-omni-30b-a3b-reasoning

Nemotron 3 Nano Omni is an omni-modal reasoning model that understands images, video, speech, and text.

NVIDIA modelMoEReasoningChatLong Context

nvidia

nemotron-3-super-120b-a12b

Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more

NVIDIA modelAgentMoEFrontierReasoning

nvidia

nemotron-3-ultra-550b-a55b

Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more

NVIDIA modelllm safetysafety and moderationmultilingual content safetyai safety nemo guardrails

nvidia

nemotron-3.5-content-safety

Multilingual, multimodal model for detecting unsafe and toxic content.

NVIDIA modelAutomatic Speech RecognitionNVIDIA NIMNVIDIA RivaDownload Available

nvidia

nemotron-asr-streaming

Real-time speech recognition for English

NVIDIA modelNeMo GuardrailsNemotronreasoningSafety and Moderation

nvidia

nemotron-content-safety-reasoning-4b

A context‑aware safety model that applies reasoning to enforce domain‑specific policies.

NVIDIA modelObject DetectionChart Detectionnemo retrieverTable Detection

nvidia

nemotron-graphic-elements-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

NVIDIA modelchatChatText-to-TextLanguage Generation

nvidia

nemotron-mini-4b-instruct

Optimized SLM for on-device inference and fine-tuned for roleplay, RAG and function calling

NVIDIA modelchatlanguage generationvision assistantvisual question answering

nvidia

nemotron-nano-12b-v2-vl

Nemotron Nano 12B v2 VL enables multi-image and video understanding, along with visual Q&A and summarization capabilities.

NVIDIA modelTable Extractionnemo retrieverdata ingestionextraction

nvidia

nemotron-ocr-v1

Powerful OCR model for fast, accurate real-world image text extraction, layout, and structure analysis.

NVIDIA modelObject DetectionChart DetectionTable Detectiondata ingestion

nvidia

nemotron-page-elements-v3

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

NVIDIA modelDownload Availabletext and table extractiondocument parsingsupported language - english

nvidia

nemotron-parse

Cutting-edge vision-language model excelling in retrieving text and metadata from images.

NVIDIA modelObject DetectionChart Detectionnemo retrieverTable Detection

nvidia

nemotron-table-structure-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

NVIDIA modelEnglishvoice chatNVIDIA NIMFree Endpoint

nvidia

nemotron-voicechat

Nemotron 3 Voicechat

NVIDIA modelNon-Commercial Use OnlyText-to-EmbeddingRetrieval Augmented GenerationFree Endpoint

nvidia

nv-embed-v1

Generates high-quality numerical embeddings from text inputs.

NVIDIA modelnemo retrieverEmbeddingFree EndpointRetrieval Augmented Generation

nvidia

nv-embedcode-7b-v1

The NV-EmbedCode model is a 7B Mistral-based embedding model optimized for code retrieval, supporting text, code, and hybrid queries.

NVIDIA modelDownload AvailableEmbeddingrun-on-rtxNemo retriever

nvidia

nv-embedqa-e5-v5

English text embedding model for question-answering retrieval.

NVIDIA modelObject DetectionData ingestionChart Detectionnemo retriever

nvidia

nv-yolox-page-elements-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

NVIDIA modelchatTogether AIDeepinfrathinking budget

nvidia

nvidia-nemotron-nano-9b-v2

High‑efficiency LLM with hybrid Transformer‑Mamba design, excelling in reasoning and agentic tasks.

openfold

openfold2

Predicts the 3D structure of a protein from its amino acid sequence, multiple sequence alignments, and templates.

openfold

openfold3

OpenFold3 is a third-generation biomolecular foundation model that predicts the three-dimensional structures of molecular complexes (proteins, DNA, RNA, ligands)

NVIDIA modelB200H100 NVLA100 PG509 200Optical Character Recognition

baidu

paddleocr

Model for table extraction that receives an image as input, runs OCR on the image, and returns the text within the image and its bounding boxes.

NVIDIA modelimagecvVision Assistantvlm

google

paligemma

Vision language model adept at comprehending text and visual inputs to produce informative responses

NVIDIA modelAutomatic Speech RecognitionNVIDIA NIMNVIDIA RivaDownload Available

nvidia

parakeet-1.1b-rnnt-multilingual-asr

High accuracy and optimized performance for transcription in 25 languages

NVIDIA modelASRStreamingEnglishBatch

nvidia

parakeet-ctc-0.6b-asr

State-of-the-art accuracy and speed for English transcriptions.

NVIDIA modelASRStreamingSpanishNVIDIA NIM

nvidia

parakeet-ctc-0.6b-es

Accurate and optimized Spanish-English transcriptions with punctuation and word timestamps.

NVIDIA modelASRStreamingVietnameseNVIDIA NIM

nvidia

parakeet-ctc-0.6b-vi

Accurate and optimized Vietnamese-English transcriptions with punctuation and word timestamps.

NVIDIA modelASRStreamingMandarinNVIDIA NIM

nvidia

parakeet-ctc-0.6b-zh-cn

Record-setting accuracy and performance for Mandarin-English transcriptions.

NVIDIA modelASRStreamingTaiwaneseNVIDIA NIM

nvidia

parakeet-ctc-0.6b-zh-tw

Record-setting accuracy and performance for Taiwanese Mandarin-English transcriptions.

NVIDIA modelASRStreamingEnglishbatch

nvidia

parakeet-ctc-1.1b-asr

Record-setting accuracy and performance for English transcription.

NVIDIA modelASREnglishNVIDIA NIMNVIDIA Riva

nvidia

parakeet-tdt-0.6b-v2

Accurate and optimized English transcriptions with punctuation and word timestamps

NVIDIA modelchatCoreWeaveChatText-to-Text

microsoft

phi-4-mini-instruct

Lightweight multilingual LLM powering AI applications in latency-bound, memory- and compute-constrained environments.

NVIDIA modelSpeech RecognitionVisual QALanguage GenerationChart and Table Understanding

microsoft

phi-4-multimodal-instruct

Cutting-edge open multimodal model excelling in high-quality reasoning from image and audio inputs.

NVIDIA modelbiologynimBioNemoProtein Generation

ipd

proteinmpnn

ProteinMPNN is a deep learning model for predicting amino acid sequences for protein backbones.

NVIDIA modelTogether AIText-to-ImageImage GenerationPartner Endpoint

qwen

qwen-image

Qwen-Image is a text-to-image foundation model with advanced multilingual text rendering.

NVIDIA modelText-to-ImageImage GenerationPartner EndpointDownload Available

qwen

qwen-image-edit

Qwen-Image-Edit is an image editing model with multilingual text editing and strong subject consistency.

qwen

qwen3-next-80b-a3b-instruct

Qwen3-Next Instruct blends hybrid attention, sparse MoE, and stability boosts for ultra-long context AI.

NVIDIA modelB200GB200chattool calling

qwen

qwen3.5-122b-a10b

122B MoE LLM (10B active) for coding, reasoning, multimodal chat. Agent-ready.

NVIDIA modelMoEimage-to-imageVLMagentic

qwen

qwen3.5-397b-a17b

Next-gen Qwen 3.5 VLM (400B MoE) brings advanced vision, chat, RAG, and agentic capabilities.

NVIDIA modelHDRIremote contributionlightingnvidia ai for media

nvidia

Relighting

Re-illuminate people in video to match target lighting from a 360 HDRI environment map.

NVIDIA modelRankingRetrieval Augmented GenerationFree Endpoint

nvidia

rerank-qa-mistral-4b

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

NVIDIA modelA100 SXM4 80GBL40SA10Gbiology

ipd

rfdiffusion

A generative model of protein backbones for protein binder design.

NVIDIA modelNeural machine translationNVIDIA NIMText TranslationDownload Available

nvidia

riva-translate-1.6b

Enable smooth global interactions in 36 languages.

NVIDIA modelnvidia nimneural machine translationText TranslationFree Endpoint

nvidia

riva-translate-4b-instruct-v1_1

Translation model for 12 languages that supports few-shot example prompts.

NVIDIA modelcodingindic languageshybridreasoning

sarvamai

sarvam-m

Multilingual, hybrid-reasoning model optimized for Indian-language tasks, programming, and mathematical reasoning.

NVIDIA modelchatthinking budgetreasoningtext-generation

bytedance

seed-oss-36b-instruct

ByteDance open-source LLM with long-context, reasoning, and agentic intelligence.

NVIDIA modelaerodynamicscaefluid-dynamicssimulation

siemens

simcenter-star-ccm+

Run computational fluid dynamics (CFD) simulations.

NVIDIA modelNon-Commercial Use OnlychatText-to-TextLanguage Generation

upstage

solar-10.7b-instruct

Excels in NLP tasks, particularly in instruction-following, reasoning, and mathematics.

NVIDIA modelautonomous vehiclesbevav stackautomotive

nvidia

sparsedrive

End-to-end autonomous driving stack integrating perception, prediction, and planning with sparse scene representations for efficiency and safety.

NVIDIA modelchip-designelectronic-design-automationedasemiconductor

cadence

spectre-x

Run large-scale electronics and chip design verification simulations

NVIDIA modelText-to-ImageImage GenerationDownload Available

stabilityai

stable-diffusion-3.5-large

Stable Diffusion 3.5 is a popular text-to-image generation model

NVIDIA modelchatAgenticCodingReasoning

stepfun-ai

step-3.5-flash

200B open-source reasoning engine with sparse MoE powering frontier agentic AI.

NVIDIA modelB200H200H100 80GB HBM3chat

stepfun-ai

step-3.7-flash

A sparse MoE multimodal reasoning model good for enterprise, agentic and coding tasks.

NVIDIA modelB200H100 NVLA100 PG509 200sovereign ai

stockmark

stockmark-2-100b-instruct

Japanese-specialized large language model for enterprises to read and understand complex business documents.

NVIDIA modelautonomous vehiclesbevAV Stackautomotive

nvidia

streampetr

StreamPETR offers efficient 3D object detection for autonomous driving by propagating sparse object queries temporally.

NVIDIA modelbroadcastsmptecommunicationsmic quality

nvidia

Studio Voice

Enhance input speech recorded with low-quality microphones in noisy or reverberant environments, producing studio-quality speech.

NVIDIA modelbroadcastmedia2forensicsnvidia ai for media

nvidia

synthetic-video-detector

NVIDIA Synthetic Video Detector is an AI-powered micro-service for detecting AI‑generated (synthetic) videos.

NVIDIA modeltext-to-3dRun-on-RTXimage-to-3dDownload Available

microsoft

TRELLIS

MSFT TRELLIS is a 3D AI model that generates high-quality 3D assets from text or image inputs.

NVIDIA modelInteractive AnnotationImage SegmentationNon-Commercial Use OnlyDownload Available

nvidia

vista-3d

VISTA-3D is a specialized interactive foundation model for segmenting and annotating human anatomy.