NVIDIA Introduces AI for Multilingual Translation and Speech Services
NVIDIA has released new microservices that help developers add generative AI features such as machine translation, transcription, and text-to-speech to their apps. The microservices are built on a scalable, flexible, and modular architecture that is well suited to large language models.
NVIDIA, a chipmaker and AI powerhouse that ranked as the world’s third most valuable company at the time of writing, has announced a set of microservices to help developers add generative AI to their apps.
Developers can incorporate machine translation (MT) across 30 languages, transcription, and text-to-speech features with these microservices.
“Microservices” is a way of architecting applications. In the traditional monolithic approach, all functions live in a single, tightly coupled unit; microservice architecture instead splits an application into independently deployable modules.
The microservice approach lets teams work on different parts of an application in parallel, which speeds up development and allows updates to roll out one service at a time without affecting the whole app.
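The pattern can be sketched with a minimal, single-purpose service. The handler below is illustrative only: the route, field names, and tag-the-text "translation" logic are placeholders, not part of NVIDIA's stack.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# A single-purpose "translation" microservice: it owns one job and can be
# deployed, scaled, and updated independently of the rest of the app.
# The tag-the-text "translation" below is a placeholder, not a real model.
class TranslateHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
        result = {"translation": f"[{body['target_lang']}] {body['text']}"}
        payload = json.dumps(result).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):
        pass  # keep the demo output quiet

# Run the service on an OS-assigned free port, then call it as a client would.
server = ThreadingHTTPServer(("127.0.0.1", 0), TranslateHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

request = urllib.request.Request(
    f"http://127.0.0.1:{server.server_port}/translate",
    data=json.dumps({"text": "hello world", "target_lang": "es"}).encode(),
    headers={"Content-Type": "application/json"},
)
response = json.loads(urllib.request.urlopen(request).read())
print(response["translation"])  # [es] hello world
server.shutdown()
```

Because the service exposes only an HTTP contract, it can be redeployed or replicated without touching any other part of the application.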
“The microservices architecture is especially well-suited for developing generative AI applications because it is scalable, has better modularity, and is flexible,” NVIDIA wrote in a blog post in July 2024.
NVIDIA NIM is a set of accelerated inference microservices that let AI models run on NVIDIA GPUs anywhere: in the cloud, in a data centre, or on a workstation.
Medical care, data processing, and retrieval-augmented generation (RAG) are just some of the fields and use cases that NVIDIA’s microservices cover.
NVIDIA released dozens of “enterprise-grade generative AI microservices” in March 2024. NVIDIA released NVIDIA ACE generative AI microservices in June 2024, aiming to accelerate the development of digital humans.
NVIDIA Riva, part of the NVIDIA ACE suite, is a set of GPU-accelerated speech and translation microservices for automatic speech recognition, machine translation, and text-to-speech. NVIDIA offers a compelling use case: “transforming chatbots into engaging, expressive multilingual assistants and avatars.”
NVIDIA Expands Multilingual Voice Capabilities
NVIDIA’s latest announcement also covers several ways to build multilingual voice capabilities into apps such as customer service bots, interactive voice assistants, and multilingual content platforms.
NVIDIA’s latest blog post shows developers how to use the interactive speech and translation model interfaces in the NVIDIA API catalogue to run basic inference tasks directly in the browser.
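Beyond the browser interfaces, a hosted model like this is typically called programmatically. The sketch below only assembles such a request; the endpoint URL, model id, and payload fields are illustrative assumptions, not NVIDIA's documented schema, so consult the API catalogue for the real interface.

```python
import json
import urllib.request

# NOTE: the endpoint URL, model id, and payload fields below are
# illustrative placeholders, not NVIDIA's documented API; check the
# model's page in the NVIDIA API catalogue for the actual schema.
API_URL = "https://example.invalid/v1/translate"  # placeholder endpoint
API_KEY = "nvapi-..."  # placeholder key from the API catalogue

def build_translation_request(text: str, source_lang: str, target_lang: str):
    """Assemble an authenticated JSON request for a hosted translation model."""
    payload = {
        "model": "example-translation-model",  # hypothetical model id
        "text": text,
        "source_language": source_lang,
        "target_language": target_lang,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )

req = build_translation_request("Hello, world", "en", "de")
print(req.get_method(), req.full_url)  # POST https://example.invalid/v1/translate
```

Sending the request with `urllib.request.urlopen(req)` would then return the model's JSON response.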
Microservices are a good fit for large language models (LLMs), which power some of the most recent and important advances in machine translation. LLMs demand substantial compute, and a microservice architecture lets the resource-heavy components scale independently, with minimal impact on the rest of the system.
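That scaling argument can be illustrated with a minimal round-robin dispatcher: only the GPU-heavy translation service gets extra replicas, while the rest of the app is untouched. The replica URLs below are placeholders, not real deployments.

```python
from itertools import cycle

# Because microservices are independent, only the resource-heavy inference
# service needs replication; a dispatcher spreads requests across replicas.
# The replica URLs are illustrative placeholders.
mt_replicas = cycle([
    "http://mt-replica-0:8000",
    "http://mt-replica-1:8000",
    "http://mt-replica-2:8000",
])

def next_replica() -> str:
    """Round-robin dispatch across translation-service replicas."""
    return next(mt_replicas)

for _ in range(4):
    print(next_replica())  # cycles 0, 1, 2, then wraps back to 0
```

In production this role is usually played by a load balancer or an orchestrator such as Kubernetes, but the principle is the same: add replicas of one service, leave everything else alone.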
NVIDIA also continues to publish machine translation research; its most recent work is EMMeTT: Efficient Multimodal Machine Translation Training, a paper published on September 20, 2024.
It is intriguing that the company keeps investing in language technology, and language AI in particular, given that its core business of chipmaking is far removed from translation, transcription, and text-to-speech. The push pits NVIDIA against some big names in tech, including IBM, Microsoft’s Azure, and AWS, which calls itself “the most complete platform for microservices.”