NVIDIA has announced the launch of its new NVIDIA AI Foundry service and NVIDIA NIM™ inference microservices, aimed at bringing generative AI capabilities to enterprises worldwide. The initiative is built around Llama 3.1, Meta's openly available collection of models, introduced to put advanced AI tools in the hands of enterprises.
Customized AI solutions for businesses
With NVIDIA AI Foundry, companies and countries can now use Llama 3.1 together with NVIDIA technology to build custom ‘supermodels’ tailored to their specific industry requirements. These models can be trained on proprietary data as well as on synthetic data generated with Llama 3.1 405B and the NVIDIA Nemotron™ Reward model.
AI Foundry is powered by the NVIDIA DGX™ Cloud AI platform, co-engineered with leading public cloud providers to give enterprises scalable compute resources that can grow with their AI needs. The service aims to help enterprises and nations develop sovereign AI strategies and build custom large language models (LLMs) for domain-specific applications.
Major industry adoption
Accenture is the first company to use NVIDIA AI Foundry to build custom Llama 3.1 models for its clients. Companies such as Aramco, AT&T, and Uber are among the early adopters of the new NVIDIA NIM microservices for Llama 3.1, and NVIDIA reports strong interest across a range of industries.
“Meta’s publicly available Llama 3.1 model represents a pivotal moment in the adoption of generative AI in enterprises around the world,” said Jensen Huang, founder and CEO of NVIDIA. “Llama 3.1 opens the door for every enterprise and industry to build cutting-edge generative AI applications. NVIDIA AI Foundry has Llama 3.1 fully integrated and is ready to help enterprises build and deploy custom Llama supermodels.”
Enhanced AI capabilities
The NVIDIA NIM inference microservice for Llama 3.1 is available now for download and promises up to 2.5x higher throughput than running inference without NIM. Enterprises can also pair it with the new NVIDIA NeMo Retriever NIM microservice to build advanced retrieval pipelines for AI assistants and digital human avatars.
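NIM microservices expose an OpenAI-compatible HTTP API, so a deployed Llama 3.1 endpoint can be queried with a standard client library. The following is a minimal sketch only; the base URL, model identifier, and API key are assumptions that depend on where and how the microservice is deployed.

```python
# Minimal sketch of querying a Llama 3.1 NIM endpoint through its
# OpenAI-compatible API. The base URL, model name, and API key below are
# assumptions; substitute the values for your own deployment.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed: locally deployed NIM container
    api_key="not-needed-for-local",       # hosted endpoints require a real key
)

response = client.chat.completions.create(
    model="meta/llama-3.1-8b-instruct",   # assumed model identifier
    messages=[{"role": "user",
               "content": "Summarize retrieval-augmented generation in two sentences."}],
    max_tokens=128,
    temperature=0.2,
)
print(response.choices[0].message.content)
```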
Accenture is leading the way in developing custom Llama 3.1 models with NVIDIA AI Foundry, using its own AI Refinery™ framework. “The world’s leading companies have seen how generative AI is transforming every industry and are eager to deploy applications powered by custom models,” said Julie Sweet, Accenture president and CEO. “Accenture has been working with NVIDIA NIM inference microservices for our internal AI applications, and now with NVIDIA AI Foundry, we can help our clients rapidly build and deploy custom Llama 3.1 models to power innovative AI applications that align with their own business priorities.”
Comprehensive AI model service
NVIDIA AI Foundry provides an end-to-end service that includes model curation, synthetic data generation, fine-tuning, discovery, and evaluation. Companies can create domain-specific models using Llama 3.1 models and the NVIDIA NeMo platform, with the option to generate synthetic data to improve model accuracy.
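As a rough illustration of the synthetic data generation step, the sketch below prompts a Llama 3.1 endpoint (again through an OpenAI-compatible API) for candidate question-answer pairs and keeps only those that clear a quality threshold. The endpoint details are assumptions, and score_with_reward_model is a hypothetical stand-in for a reward-model call such as Nemotron, whose real interface is not shown here.

```python
# Illustrative sketch of generating synthetic Q&A pairs with a Llama 3.1
# endpoint and filtering them by a reward score. The endpoint details and
# score_with_reward_model() are hypothetical placeholders, not documented
# NVIDIA APIs.
import json
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed: locally deployed NIM endpoint
    api_key="not-needed-for-local",
)

def generate_pair(topic: str) -> dict:
    # Ask the model for a single question-answer pair as a JSON object.
    response = client.chat.completions.create(
        model="meta/llama-3.1-405b-instruct",  # assumed model identifier
        messages=[{
            "role": "user",
            "content": f"Write one question and answer about {topic} as JSON "
                       'with keys "question" and "answer". Return only the JSON.',
        }],
        temperature=0.7,
    )
    return json.loads(response.choices[0].message.content)

def score_with_reward_model(pair: dict) -> float:
    # Hypothetical placeholder: a real pipeline would call a reward model
    # (e.g. Nemotron) here and return its score. This stub accepts everything.
    return 1.0

# Keep only pairs the reward model rates above a chosen quality threshold.
dataset = []
for topic in ("invoice processing", "network troubleshooting"):
    pair = generate_pair(topic)
    if score_with_reward_model(pair) > 0.8:
        dataset.append(pair)
```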
NVIDIA and Meta have also collaborated on distillation recipes for Llama 3.1, enabling developers to build smaller, more customizable models that can run on a wider range of infrastructure, from AI workstations to laptops.
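The distillation recipes themselves are not reproduced in this article; as a generic sketch of the underlying technique, the example below shows a standard knowledge-distillation loss in PyTorch, in which a smaller student model learns to match the softened output distribution of a larger teacher. It illustrates the general idea only, not Meta's or NVIDIA's specific recipe.

```python
# Generic knowledge-distillation loss: the student is trained to match the
# teacher's softened token distribution (illustrative, not a vendor recipe).
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    # Soften both distributions with a temperature so the student also learns
    # from the teacher's relative preferences among non-top tokens.
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * temperature ** 2

# Example with random logits standing in for teacher and student outputs.
teacher = torch.randn(4, 32000)                       # (batch, vocab); vocab size is illustrative
student = torch.randn(4, 32000, requires_grad=True)
loss = distillation_loss(student, teacher)
loss.backward()
```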
Leading companies in healthcare, energy, financial services, retail, transportation, and telecommunications are already integrating the NVIDIA NIM microservices for Llama 3.1, a model collection trained on more than 16,000 NVIDIA H100 Tensor Core GPUs.
Future outlook
Production support for the Llama 3.1 NIM and NeMo Retriever NIM microservices is available through NVIDIA AI Enterprise. Additionally, members of the NVIDIA Developer Program will soon have free access to the NIM microservices for research, development, and testing.
For more information, visit the NVIDIA Newsroom.