NVIDIA’s EMBark revolutionizes training large-scale recommender systems.

Ted Hisokawa
November 21, 2024 02:40

NVIDIA introduces EMBark, which optimizes the embedding process to power deep learning recommendation models and significantly increases training efficiency for large-scale systems.

In an effort to increase the efficiency of large-scale recommender systems, NVIDIA introduced EMBark, a new approach that aims to optimize the embedding process of deep learning recommendation models. According to NVIDIA, recommender systems play a central role in the Internet industry, and training them efficiently is a critical task for many companies.

Challenges of training recommendation systems

Deep learning recommendation models (DLRMs) often incorporate billions of ID features and require robust training solutions. Recent GPU-based training frameworks, such as NVIDIA Merlin HugeCTR and TorchRec, have improved DLRM training by using GPU memory to hold large-scale ID feature embeddings. However, as the number of GPUs increases, the communication overhead of the embedding stage becomes a bottleneck, sometimes accounting for more than half of the total training overhead.

EMBark’s innovative approach

EMBark, presented at RecSys 2024, addresses these challenges with a flexible 3D sharding strategy and communication compression techniques, aiming to balance the load during training and reduce the time spent on embedding communication. The EMBark system includes three core components: embedding clusters, a flexible 3D sharding scheme, and a sharding planner.

Embedding clusters

Embedding clusters promote efficient training by grouping similar features and applying a compression strategy tailored to each group. EMBark categorizes clusters into data-parallel (DP), reduction-based (RB), and unique-based (UB) types, each suited to different training scenarios.
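The article does not describe how each cluster type compresses its traffic. As a rough illustration only, the sketch below shows one common way a "unique-based" scheme could work: duplicate feature IDs in a batch are collapsed before any lookup or exchange, and an inverse index restores the original order afterwards. The function name `deduplicate_ids` and the sample batch are hypothetical and are not taken from EMBark.

```python
from typing import Dict, List, Tuple


def deduplicate_ids(ids: List[int]) -> Tuple[List[int], List[int]]:
    """Collapse repeated IDs (hypothetical helper, not EMBark's API).

    Returns (unique_ids, inverse) such that unique_ids[inverse[k]] == ids[k].
    Only unique_ids would need to be looked up or sent between devices.
    """
    position: Dict[int, int] = {}
    unique_ids: List[int] = []
    inverse: List[int] = []
    for feature_id in ids:
        if feature_id not in position:
            # First time this ID appears: give it the next slot.
            position[feature_id] = len(unique_ids)
            unique_ids.append(feature_id)
        inverse.append(position[feature_id])
    return unique_ids, inverse


# Made-up batch with heavy repetition, as is typical for popular items.
batch_ids = [7, 3, 7, 7, 9, 3]
unique_ids, inverse = deduplicate_ids(batch_ids)

# Rebuild the original batch order from the compressed form.
restored = [unique_ids[k] for k in inverse]
```

In this toy batch, six IDs shrink to three unique ones, so the saving grows with how often IDs repeat within a batch.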

Flexible 3D sharding method

This scheme represents each shard as a 3D tuple, which allows precise control over how the workload is balanced across GPUs. That flexibility addresses the imbalance issues found in traditional sharding methods.
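The article does not say what the three tuple components are. The sketch below assumes they are a table index, a row-shard index, and a column-shard index, so that one embedding table can be cut along rows, columns, or both. The names `Shard`, `split_table`, and `row_range` are hypothetical and serve only to make the idea concrete.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Shard:
    """One shard as a 3D tuple (assumed meaning of the three fields)."""
    table: int      # which embedding table
    row_shard: int  # which slice of the table's rows
    col_shard: int  # which slice of the embedding dimension


def split_table(table: int, row_parts: int, col_parts: int) -> List[Shard]:
    """Enumerate every shard of one table for a given row/column split."""
    return [
        Shard(table, r, c)
        for r in range(row_parts)
        for c in range(col_parts)
    ]


def row_range(num_rows: int, row_parts: int, row_shard: int) -> range:
    """Rows owned by one row shard; leftover rows go to the earliest shards."""
    base, extra = divmod(num_rows, row_parts)
    start = row_shard * base + min(row_shard, extra)
    stop = start + base + (1 if row_shard < extra else 0)
    return range(start, stop)


# Table 0 split into 2 row shards and 2 column shards gives 4 shards.
shards = split_table(table=0, row_parts=2, col_parts=2)
```

Under this reading, a very large table can be split finely while a small one stays whole, which is what gives a planner room to even out per-GPU load.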

Sharding planner

The sharding planner uses a greedy search algorithm to choose a sharding strategy based on the hardware and embedding configuration, which in turn speeds up the training process.
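The planner's actual search procedure is not detailed in the article. To give a feel for what a greedy placement looks like, the sketch below uses a standard heuristic (largest cost first, always onto the least-loaded GPU). It is a generic stand-in, not EMBark's algorithm, and the shard names and cost values are invented for the example.

```python
import heapq
from typing import Dict, List, Tuple


def greedy_place(
    shard_costs: Dict[str, float], num_gpus: int
) -> Tuple[List[List[str]], List[float]]:
    """Assign shards to GPUs greedily (illustrative heuristic only).

    Shards are visited from most to least expensive, and each one goes to
    the GPU with the smallest accumulated cost so far.
    """
    # Min-heap of (current load, gpu index).
    heap: List[Tuple[float, int]] = [(0.0, gpu) for gpu in range(num_gpus)]
    heapq.heapify(heap)
    placement: List[List[str]] = [[] for _ in range(num_gpus)]

    ordered = sorted(shard_costs.items(), key=lambda kv: kv[1], reverse=True)
    for name, cost in ordered:
        load, gpu = heapq.heappop(heap)
        placement[gpu].append(name)
        heapq.heappush(heap, (load + cost, gpu))

    loads = [0.0] * num_gpus
    for load, gpu in heap:
        loads[gpu] = load
    return placement, loads


# Hypothetical per-shard costs (e.g. estimated time per step).
costs = {"a": 8.0, "b": 5.0, "c": 4.0, "d": 3.0, "e": 2.0}
placement, loads = greedy_place(costs, num_gpus=2)
```

A greedy pass like this is fast but does not guarantee the best possible balance in general; a real planner would also need a cost model that reflects the hardware and the embedding configuration.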

Performance and evaluation

The efficiency of EMBark was tested on NVIDIA DGX H100 nodes, demonstrating significant improvements in training throughput. Across a variety of DLRM models, EMBark achieves an average 1.5x increase in training speed, with some configurations being up to 1.77x faster than existing methods.

By optimizing the embedding process, EMBark significantly improves the training efficiency of large-scale recommender system models. For more detail on EMBark's performance, see the research paper.
