Alvin Lang
March 18, 2025 20:51
NVIDIA introduces NeMo Curator, a GPU-accelerated streaming pipeline for efficient video processing on DGX Cloud, streamlining AI model development and reducing costs.
According to NVIDIA, the rise of physical AI has greatly increased video content production, with a single autonomous vehicle generating more than 1 TB of video per day. To manage and utilize this data efficiently at scale, NVIDIA has launched NeMo Curator, a GPU-accelerated streaming pipeline available on NVIDIA DGX Cloud.
Challenges with traditional processing
Existing batch-processing systems have struggled to keep pace with this data growth, often leading to poor GPU utilization and higher costs. These systems accumulate large volumes of data before processing, introducing inefficiency and latency into AI model development.
GPU-Accelerated Streaming Solution
To address these challenges, NeMo Curator introduces a flexible streaming pipeline that uses GPU acceleration for large-scale video curation. The pipeline integrates auto-scaling and load-balancing techniques to optimize throughput across stages, maximizing hardware utilization and reducing total cost of ownership (TCO).
Optimized throughput and resource use
The streaming approach pipes intermediate data directly between stages, reducing latency and improving efficiency. By separating CPU-intensive tasks from GPU-intensive tasks, the system can better match the actual capacity of the available infrastructure, avoiding idle resources and ensuring balanced throughput.
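The idea of piping intermediate data between a CPU stage and a GPU stage can be sketched with two workers joined by a bounded queue. This is an illustrative stdlib sketch, not NeMo Curator's actual code; the stage names and the decode/embed stand-ins are assumptions.

```python
import threading
import queue

# Illustrative sketch (not NeMo Curator's code): a CPU-bound stage
# (e.g. video decoding) feeds a GPU-bound stage (e.g. embedding)
# through a bounded in-memory queue, so intermediate data never
# lands on disk and the bounded queue applies backpressure.

SENTINEL = None  # marks end of the stream

def cpu_stage(clips, out_q):
    """Simulated CPU-intensive stage: decode each clip, pass it on."""
    for clip in clips:
        decoded = f"decoded({clip})"   # stand-in for real decoding work
        out_q.put(decoded)             # blocks if the GPU stage falls behind
    out_q.put(SENTINEL)

def gpu_stage(in_q, results):
    """Simulated GPU-intensive stage: consume decoded clips as they arrive."""
    while True:
        item = in_q.get()
        if item is SENTINEL:
            break
        results.append(f"embedding({item})")

clips = [f"clip{i}" for i in range(4)]
q = queue.Queue(maxsize=2)             # bounded: limits in-flight data
results = []

producer = threading.Thread(target=cpu_stage, args=(clips, q))
consumer = threading.Thread(target=gpu_stage, args=(q, results))
producer.start(); consumer.start()
producer.join(); consumer.join()
print(results)  # four embeddings, in clip order
```

Because both stages run concurrently, the GPU-side worker stays busy as soon as the first decoded clip is available, instead of waiting for a whole batch to finish decoding.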
Architecture and implementation
The NeMo Curator pipeline, built on the Ray framework, is divided into several stages, from video decoding to embedding computation. Each stage uses a pool of Ray actors that manage input and output queues to maintain optimal throughput. The system dynamically adjusts actor pool sizes across stages, ensuring consistent flow and efficiency.
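The dynamic pool-sizing idea can be sketched as a simple feedback rule: grow a stage's worker pool when its input queue backs up, and shrink it when the queue drains. This is a plain-Python illustration of the load-balancing concept, not the Ray actor API, and the thresholds below are assumptions.

```python
# Illustrative autoscaling rule (assumed thresholds, not Ray's API):
# resize a stage's worker pool based on the depth of its input queue.

def target_pool_size(queue_depth, current, min_size=1, max_size=8):
    """Grow the pool when work backs up, shrink it when the queue drains."""
    if queue_depth > 2 * current and current < max_size:
        return current + 1          # stage is a bottleneck: add a worker
    if queue_depth == 0 and current > min_size:
        return current - 1          # stage is idle: free a worker for others
    return current                  # steady state: leave the pool alone

# A decode stage with 10 queued items and 3 workers should scale up;
# an embedding stage with an empty queue and 4 workers should scale down.
print(target_pool_size(queue_depth=10, current=3))  # 4
print(target_pool_size(queue_depth=0, current=4))   # 3
print(target_pool_size(queue_depth=5, current=4))   # 4 (steady state)
```

Running such a rule per stage keeps fast stages from hoarding workers while a slow stage starves, which is the balance the article describes.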
Performance and future prospects
Compared to existing batch processing, the streaming pipeline achieves a 1.8x speedup, processing one hour of video per GPU in about 195 seconds. The NeMo Curator pipeline showed an 89x performance improvement over the baseline and can process roughly 1 million hours of 720p video per day on 2,000 H100 GPUs. NVIDIA continues to work with early-access partners to refine the system and expand its capabilities.
For more insights, visit the NVIDIA blog.
Image Source: Shutterstock