NVIDIA's new AI model combines text, vision, and speech
NewsBytes | April 29, 2026 6:39 PM CST





NVIDIA has unveiled a new artificial intelligence (AI) model, the Nemotron 3 Nano Omni. The system combines text, vision, and speech capabilities into a single platform.

With around 30 billion parameters, the model uses a mixture-of-experts architecture to deliver extremely low latency while offering high flexibility and control.


Touted to be 9 times faster than its rivals
Innovative design


The Nemotron 3 Nano Omni model combines vision and audio encoders with NVIDIA's 30B-A3B hybrid mixture-of-experts (MoE) architecture.

This design removes the need for separate perception modules by integrating vision, audio, and language processing into a single model.

The result is improved efficiency at scale and up to nine times higher throughput than other open omni models currently available.


The new model can help improve agentic AI applications
Enhanced performance


The new model is expected to significantly improve the performance of agentic AI applications.

"To build useful agents, you can't wait seconds for a model to interpret a screen," said Gautier Cloix, CEO of H Company.

He added, "By building on Nemotron 3 Nano Omni, our agents can rapidly interpret full HD screen recordings, something that wasn't practical before."


The smaller size of the model makes it more versatile
Versatile integration


The smaller size of the Nemotron 3 Nano Omni model also allows it to run on higher-end consumer hardware and to execute efficiently in enterprise cloud deployments.

It is designed to work alongside proprietary cloud models or NVIDIA's own Nemotron open models, such as Nemotron 3 Super for high-frequency execution or Nemotron 3 Ultra for complex planning.


The Nemotron 3 Nano Omni is available on Hugging Face
User-friendly deployment


The new model can quickly understand documents, computer displays, voice activity, video, and more. This makes it an ideal interface for human-machine interaction.

NVIDIA has made the Nemotron 3 Nano Omni available on Hugging Face and OpenRouter, and on build.nvidia.com as an NVIDIA NIM microservice.

