Model by nvidia

llama-3.1-nemotron-nano-vl-8b-v1

Multi-modal vision-language model that understands text and images and generates informative responses.

NVIDIA model, chat, doc intelligence, multiple image understanding, OCR, Download Available, Free Endpoint