Rebeca Moen
February 21, 2025 10:54
NVIDIA has added new multilingual capabilities to Riva ASR, integrating the Whisper and Canary models for offline speech recognition and automatic speech translation.
NVIDIA has advanced its automatic speech recognition (ASR) offering with enhanced capabilities in the Riva 2.18.0 containers and SDK. The release is part of NVIDIA’s ongoing effort to improve its GPU-accelerated speech and translation AI microservices, as detailed on the NVIDIA developer blog.
Integration of new models
The latest iteration of Riva adds support for the Parakeet architecture, which enables streaming multilingual ASR, and for the Whisper and Canary models, which handle offline ASR and automatic speech translation (AST). Whisper, developed by OpenAI, and Hugging Face ...
The Canary model further extends Riva’s capabilities by supporting offline ASR and AST across various language combinations, including translation into English, out of English, and between other language pairs. The model also supports language detection, which helps it meet a range of linguistic needs.
Selectively disabling NMT
One of the most notable features in this update is the ability to selectively disable neural machine translation (NMT) for parts of the input using the <dnt> SSML tag. Users can mark text segments that should not be translated, which gives them finer control over the translation output. A new DNT dictionary also lets users specify how certain words or phrases should be translated, allowing further customization of the translation process.
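As a rough illustration of how text might be prepared for this feature, the sketch below wraps a list of protected phrases in <dnt> tags before the text is sent for translation. It assumes the tag is written as a paired <dnt>...</dnt> element around the segment; the helper name wrap_dnt and its parameters are hypothetical and are not part of the Riva SDK. It only builds the tagged string and does not call any Riva API.

```python
import re


def wrap_dnt(text: str, protected_phrases: list[str]) -> str:
    """Wrap each protected phrase in <dnt>...</dnt> (hypothetical helper).

    Assumption: the translation service treats a paired <dnt> element as a
    segment to leave untranslated.
    """
    # Ignore empty entries, which would otherwise match at every position.
    phrases = [p for p in protected_phrases if p]
    if not phrases:
        return text
    # Try longer phrases first so "Riva Skills" wins over "Riva".
    ordered = sorted(phrases, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(p) for p in ordered))
    return pattern.sub(lambda m: f"<dnt>{m.group(0)}</dnt>", text)


tagged = wrap_dnt("Deploy NVIDIA Riva on your GPU server.", ["NVIDIA Riva"])
print(tagged)  # Deploy <dnt>NVIDIA Riva</dnt> on your GPU server.
```

Matching longer phrases first avoids tagging only part of a product name when one protected phrase contains another.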
Deployment and usage
Deploying these new features is simplified through the Riva Skills Quick Start resource folder, which includes the scripts and configuration files needed to set up a Riva server with Whisper and Canary. Users can choose either the Whisper or the Canary model according to their ASR requirements and use the provided scripts to optimize model deployment for their GPU architecture.
NVIDIA’s commitment to expanding the linguistic and functional scope of its ASR system is evident in the integration of these models and features. By supporting a wider range of languages and offering finer translation control, Riva continues to set industry standards for speech recognition and translation technology.
For more information about NVIDIA’s latest ASR developments, visit the NVIDIA developer blog.
Image source: Shutterstock