Crypto Flexs
ADOPTION NEWS

IBM Research Announces Innovations to Accelerate Enterprise AI Training

By Crypto Flexs, September 23, 2024 (3 Mins Read)

Jack Anderson
23 Sep 2024 03:32

IBM Research has introduced a new data processing technique that accelerates AI model training and significantly improves efficiency by leveraging CPU resources.





IBM Research has unveiled an innovation aimed at scaling up the data processing pipeline for enterprise AI training. The advancement is designed to leverage the abundant capacity of CPUs to accelerate the creation of powerful AI models such as IBM’s Granite models.

Optimizing data preparation

Before training an AI model, a large amount of data needs to be prepared. This data often comes from various sources such as websites, PDFs, and news articles, and must go through several preprocessing steps. These steps include filtering out irrelevant HTML code, removing duplicates, and screening for abusive content. These tasks are essential, but they run on CPUs, so they are not constrained by the availability of GPUs.
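The preprocessing steps described above can be illustrated with a minimal Python sketch. It is not IBM's pipeline: the function names, the `BLOCKLIST` placeholder, and the use of exact-match deduplication are all assumptions chosen to keep the example short and dependent only on the standard library.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and skips the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._parts.append(data)

    def text(self):
        # Collapse the runs of whitespace left behind by removed markup.
        return re.sub(r"\s+", " ", " ".join(self._parts)).strip()


def strip_html(raw):
    """Step 1: filter out HTML code, keeping only the visible text."""
    parser = _TextExtractor()
    parser.feed(raw)
    parser.close()
    return parser.text()


# Hypothetical placeholder; a real screen would use a curated list or a classifier.
BLOCKLIST = {"badword"}


def is_acceptable(text):
    """Step 3: screen out documents containing any blocklisted word."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return not (words & BLOCKLIST)


def prepare(raw_docs):
    """Run all three steps; step 2 drops exact duplicates of earlier documents."""
    seen = set()
    kept = []
    for raw in raw_docs:
        text = strip_html(raw)
        if not text or text in seen or not is_acceptable(text):
            continue
        seen.add(text)
        kept.append(text)
    return kept
```

Calling `prepare(["<p>Hello</p>", "<div>Hello</div>"])` keeps a single copy of the text, because both inputs reduce to the same string once the markup is stripped. None of these steps involves the kind of numerical work that benefits from a GPU.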

Petros Zerfos, principal research scientist for IBM Research’s Watsonx data engineering, emphasized the importance of efficient data processing. “A lot of the time and effort that goes into training these models is spent preparing the data for those models,” Zerfos said. His team has been drawing on expertise from a variety of domains, including natural language processing, distributed computing, and storage systems, to develop ways to improve the efficiency of the data processing pipeline.

CPU capacity utilization

Many steps in the data processing pipeline involve “embarrassingly parallel” computations, where each document can be processed independently. This parallelism allows the work to be distributed across multiple CPUs, which can significantly speed up data preparation. However, some steps, such as removing duplicate documents, require access to the entire data set, so they cannot be parallelized as simply as the per-document steps.
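The contrast between the two kinds of steps can be sketched in Python. This is an illustrative example under stated assumptions, not the method IBM used: `clean`, `clean_all`, `fingerprint`, and `deduplicate` are hypothetical names, and the deduplication shown is simple exact matching on a SHA-256 hash.

```python
import hashlib
from concurrent.futures import ProcessPoolExecutor


def clean(doc):
    """Per-document step: needs no knowledge of any other document."""
    return " ".join(doc.split()).lower()


def fingerprint(doc):
    """Stable content hash used to recognise exact duplicates."""
    return hashlib.sha256(doc.encode("utf-8")).hexdigest()


def clean_all(docs, workers=1):
    """Embarrassingly parallel: each worker process cleans its own share."""
    if workers <= 1:
        return [clean(d) for d in docs]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # chunksize batches documents to reduce inter-process overhead.
        return list(pool.map(clean, docs, chunksize=64))


def deduplicate(docs):
    """Global step: the set of seen hashes must cover the whole data set."""
    seen = set()
    unique = []
    for doc in docs:
        key = fingerprint(doc)
        if key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique
```

In a script, `clean_all(docs, workers=8)` would normally be called under an `if __name__ == "__main__":` guard so that worker processes can start cleanly. The `deduplicate` function shows why that step differs: whether a document is kept depends on every document seen before it, so the shared `seen` state has to span the full data set.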

To accelerate IBM’s Granite model development, the team developed a process to rapidly provision and utilize tens of thousands of CPUs. This approach involved marshalling idle CPU capacity across IBM’s Cloud data center network to ensure high communication bandwidth between CPUs and data storage. Traditional object storage systems often underperform, leaving CPUs idle, so the team used IBM’s high-performance Storage Scale file system to efficiently cache active data.

Scaling AI training

Last year, IBM scaled up to 100,000 vCPUs on IBM Cloud to process 14 petabytes of raw data, generating 40 trillion tokens for AI model training. The team automated these data pipelines using Kubeflow on IBM Cloud. Their method proved to be 24 times faster than previous techniques at processing Common Crawl data.
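To put those figures in proportion, a rough back-of-the-envelope calculation helps. It assumes decimal petabytes and an even spread of the data across all vCPUs, neither of which the source states, so the results are only indicative.

```python
VCPUS = 100_000
RAW_BYTES = 14 * 10**15   # 14 PB, assuming decimal petabytes
TOKENS = 40 * 10**12      # 40 trillion tokens

# Assumes the raw data is spread evenly across every vCPU.
bytes_per_vcpu = RAW_BYTES // VCPUS
# Average amount of raw input behind each token that was produced.
bytes_per_token = RAW_BYTES // TOKENS

print(f"{bytes_per_vcpu // 10**9} GB of raw data per vCPU")  # 140
print(f"{bytes_per_token} raw bytes per token")              # 350
```

Under these assumptions, each vCPU would handle about 140 GB of raw data, and roughly 350 bytes of raw input stand behind each token produced.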

All of IBM’s open source Granite code and language models are trained using data prepared through these optimized pipelines. IBM has also made a significant contribution to the AI community by developing the Data Prep Kit, a toolkit hosted on GitHub that simplifies data preparation for large language model applications, supporting pretraining, fine-tuning, and retrieval-augmented generation (RAG) use cases. Built on distributed processing frameworks such as Spark and Ray, the kit allows developers to build scalable custom modules.

For more information, visit the official IBM Research blog.

Image source: Shutterstock

