Polars announced the release of a new GPU engine powered by RAPIDS cuDF, which significantly improves data processing speeds on NVIDIA GPUs. According to the NVIDIA Technical Blog, this advancement will allow data scientists to process hundreds of millions of data rows in seconds on a single machine.
Growing Data Challenges
Existing data processing libraries, such as pandas, are single-threaded and often become impractical when data sets exceed a few million rows. Distributed data processing systems can handle billions of rows, but they introduce complexity and overhead for smaller data sets. This leaves a gap in tools that can efficiently process tens to hundreds of millions of rows of data, which is commonly required for tasks such as model development, demand forecasting, and logistics in industries such as finance, retail, and manufacturing.
Polars, a fast-growing Python library designed for data scientists and engineers, aims to solve these challenges. It can seamlessly process hundreds of millions of rows on a single machine, using advanced query optimization to minimize unnecessary data movement and processing. Polars bridges the gap between single-threaded tools and complex distributed systems, providing a compelling solution for mid-scale data processing.
Bringing NVIDIA Accelerated Computing to Polars
Polars offers significant built-in acceleration over other CPU-only data manipulation tools by leveraging multi-threaded execution, advanced memory optimizations, and lazy evaluation. However, as data processing demands increase across industries, more performance is required. This is where accelerated computing becomes essential.
cuDF, part of the NVIDIA RAPIDS suite of CUDA-X libraries, is a GPU-accelerated DataFrame library that leverages the massive parallelism of GPUs to dramatically improve data processing performance. Working with NVIDIA, the Polars team has combined the speed of cuDF with the efficiency of Polars to achieve up to 13x performance improvements over CPU-based Polars. This integration allows users to maintain interactive experiences even as their data processing workloads scale to hundreds of millions or billions of rows.
The Polars GPU engine is built directly into the Polars Lazy API. Users can access GPU acceleration for their workflows by installing polars[gpu] via pip and passing engine="gpu" to the collect operation. This approach ensures efficient execution and minimal memory usage through the Polars query optimizer, maintains full compatibility with the Polars ecosystem of data visualization, I/O, and machine learning libraries, and requires no changes to existing Polars code.
    pip install polars[gpu] --extra-index-url=https://pypi.nvidia.com

    import polars as pl

    (transactions
     .group_by("CUST_ID")
     .agg(pl.col("AMOUNT").sum())
     .sort(by="AMOUNT", descending=True)
     .head()
     .collect(engine="gpu"))
Conclusion
The Polars GPU Engine, powered by RAPIDS cuDF, is now in open beta, giving data scientists and engineers a powerful tool for mid-scale data processing. Accelerating Polars workflows by up to 13x on NVIDIA GPUs, the engine efficiently processes datasets consisting of hundreds of millions of rows without the overhead of distributed systems. The Polars GPU Engine is fully integrated into the Polars API, making it easily accessible to all users.
Getting Started with the Polars GPU Engine
To learn more and get started with the Polars GPU engine, visit the NVIDIA Technical Blog.