AssemblyAI has announced significant improvements to its automatic language detection (ALD) model, promising greater accuracy and support for a wider range of languages. According to AssemblyAI, these improvements are aimed at helping companies build more robust, multilingual applications.
Improved accuracy and expanded language support
The updated ALD model supports 17 languages, up from 7 previously, and includes additional languages such as Chinese, Finnish, and Hindi. AssemblyAI claims that the model delivers best-in-class accuracy in 15 of these 17 languages, outperforming four leading market vendors when benchmarked using the industry-standard FLEURS benchmark.
These improvements are expected to benefit a wide range of applications, including video captioning, meeting transcripts, and podcast processing. The improved accuracy and expanded language support will allow multilingual applications to work seamlessly without manual language selection.
Customizable confidence thresholds
In addition to improved accuracy and expanded language support, AssemblyAI has introduced customizable confidence thresholds. This feature lets developers set a minimum confidence level for language detection, ensuring that only high-certainty detections are processed. The threshold can be tuned per use case: a high threshold for critical applications such as customer service bots, or a low threshold for preliminary content classification.
For example, in a multilingual call center, a high confidence threshold ensures that calls are transcribed with the correct language model, preserving the accuracy of customer interaction records. Conversely, for less critical tasks such as initial content classification, a lower threshold captures a wider range of content, which can then guide further processing or manual review.
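The routing logic described above can be sketched in a few lines. This is a minimal illustration, not AssemblyAI's implementation: the field names `language_code` and `language_confidence` mirror the shape of AssemblyAI's detection results, but should be verified against the official documentation.

```python
# Sketch: routing a transcript based on detected-language confidence.
# The field names below (language_code, language_confidence) are assumptions
# modeled on AssemblyAI's response shape; check the official docs.

def route_by_confidence(detection: dict, threshold: float) -> str:
    """Return an action for a transcript given its language-detection result."""
    if detection["language_confidence"] >= threshold:
        # Confident enough: hand off to the matching language pipeline.
        return f"process:{detection['language_code']}"
    # Below threshold: hold for manual review instead of risking a bad transcript.
    return "flag_for_review"

# High threshold for a critical workflow (e.g., a customer service bot):
print(route_by_confidence({"language_code": "hi", "language_confidence": 0.97}, 0.9))  # process:hi
print(route_by_confidence({"language_code": "fi", "language_confidence": 0.55}, 0.9))  # flag_for_review

# Low threshold for preliminary content classification:
print(route_by_confidence({"language_code": "fi", "language_confidence": 0.55}, 0.4))  # process:fi
```

The same audio can thus be accepted or held back depending on how critical the downstream use is, which is exactly the trade-off the threshold exposes.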
Accuracy that speaks volumes
AssemblyAI has rigorously tested and validated the ALD model's performance. Benchmark results against four major market vendors demonstrate the model's technical strength and translate into practical benefits for applications:
- Single API: The Best tier supports 17 languages and the Nano tier supports 99, simplifying multilingual applications and reducing development time.
- Reliable transcripts: Industry-leading language detection accuracy minimizes troubleshooting.
- Market expansion: Consistent performance across languages lets you reach new markets quickly without extensive per-language tuning.
- Better user experience: High accuracy ensures a consistent, high-quality experience in every supported language.
Real-world use cases
These improvements are designed to be easily integrated into a variety of applications with just a few lines of code. Some practical use cases include:
- Global meeting transcription: Accurately document multilingual discussions without manual intervention.
- Customer service analytics: Accurate language classification enables analysis of cross-regional interactions, supporting reliable sentiment analysis and trend identification.
- Adaptive voice assistants: Improve natural language interactions with assistants that switch languages based on user input.
- Podcast transcription: Build platforms that accurately transcribe and index content in multiple languages, improving searchability and accessibility.
These scenarios highlight how improved accuracy, expanded language support, and customizable confidence thresholds can be combined into robust, scalable solutions for handling multilingual content.
Start today
To learn more about AssemblyAI’s ALD model, visit the official documentation. Developers can get a free API key from AssemblyAI and start building on the API today.
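As a starting point, a transcription request that enables automatic language detection with a confidence threshold might look like the sketch below. The endpoint shape and the `language_detection` and `language_confidence_threshold` field names follow AssemblyAI's public REST API as described in its documentation, but confirm them there before relying on this; the audio URL is a placeholder.

```python
# Sketch: building the JSON body for an AssemblyAI transcription request
# with automatic language detection enabled. Field names are taken from
# AssemblyAI's documented REST API; verify against the official docs.
import json

def build_transcript_request(audio_url: str, threshold: float = 0.8) -> str:
    payload = {
        "audio_url": audio_url,
        "language_detection": True,                  # let the ALD model pick the language
        "language_confidence_threshold": threshold,  # reject low-certainty detections
    }
    return json.dumps(payload)

# Placeholder URL for illustration only.
body = build_transcript_request("https://example.com/call.mp3", threshold=0.9)
print(body)
```

The resulting body would be sent with an `Authorization` header carrying your API key; no per-language configuration is needed, since the model selects the language itself.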
Image source: Shutterstock