Crypto Flexs
ADOPTION NEWS

Together AI, Kernel Collection Boosts NVIDIA H200 and H100 GPU Cluster Performance

By Crypto Flexs | September 6, 2024 | 3 Mins Read

Jorg Hiller
Sep 6, 2024 07:14

Together AI enhances NVIDIA H200 and H100 GPU clusters with the Together Kernel Collection to dramatically improve AI training and inference performance.





According to together.ai, Together AI has upgraded its GPU clusters with NVIDIA H200 Tensor Core GPUs. The upgrade ships with the Together Kernel Collection (TKC), a custom kernel stack designed to optimize common AI operations and improve performance for both training and inference.

Improved performance with TKC

The Together Kernel Collection (TKC) is designed to accelerate common AI operations. Compared to standard PyTorch implementations, TKC delivers up to a 24% speedup for commonly used training operators and up to a 75% speedup for FP8 inference operations. These gains reduce the GPU hours a workload needs, which lowers cost and shortens time to market.
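
To make the link between speedup and GPU time concrete, here is a minimal arithmetic sketch. It assumes that an "X% speedup" means the operation runs (1 + X/100) times faster, which is an interpretation and not something the announcement spells out. The function name is illustrative only.

```python
def time_saved_fraction(speedup_percent: float) -> float:
    """Fraction of runtime saved for a given percentage speedup.

    Assumption: an "X% speedup" means the work runs (1 + X/100) times
    faster, so new_runtime = old_runtime / (1 + X/100).
    """
    factor = 1.0 + speedup_percent / 100.0
    return 1.0 - 1.0 / factor


# The two "up to" figures quoted in the article.
training_saved = time_saved_fraction(24.0)
inference_saved = time_saved_fraction(75.0)

print(f"training operators: about {training_saved:.1%} less time")  # 19.4%
print(f"FP8 inference ops:  about {inference_saved:.1%} less time")  # 42.9%
```

These figures apply only to the accelerated operators themselves. End-to-end savings for a full training or inference job depend on how much of the total runtime those operators account for.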

Training and inference optimization

TKC’s optimized kernels include the multilayer perceptron (MLP) with SwiGLU activation, a building block that is essential for training large language models (LLMs) such as Llama-3. These kernels are reported to be 22-24% faster than standard implementations and up to 10% faster than the best existing baselines. For inference, Together AI has optimized an FP8 kernel stack that it says delivers more than a 75% speedup over the default PyTorch implementation.
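
For readers unfamiliar with the operation being optimized, the sketch below shows what a SwiGLU MLP computes, written as an unfused pure-Python reference. It is not TKC's kernel and says nothing about how TKC is implemented. The function and weight names are illustrative, and the toy weights are made up.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]  # row-major: matrix[row][col]


def silu(z: float) -> float:
    """SiLU (swish) activation: z * sigmoid(z)."""
    return z / (1.0 + math.exp(-z))


def matvec(weights: Matrix, x: Vector) -> Vector:
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def swiglu_mlp(x: Vector, w_gate: Matrix, w_up: Matrix, w_down: Matrix) -> Vector:
    """Unfused reference: down( silu(gate(x)) * up(x) )."""
    gate = [silu(g) for g in matvec(w_gate, x)]  # gated branch
    up = matvec(w_up, x)                         # linear branch
    hidden = [g * u for g, u in zip(gate, up)]   # elementwise product
    return matvec(w_down, hidden)                # project back down


# Toy example: 2 input features, hidden size 3, 2 output features.
x = [1.0, -2.0]
w_gate = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_up = [[0.5, 0.5], [1.0, -1.0], [2.0, 0.0]]
w_down = [[1.0, 1.0, 1.0], [1.0, 0.0, -1.0]]
print(swiglu_mlp(x, w_gate, w_up, w_down))
```

The reference runs three matrix products plus an activation and an elementwise multiply as separate steps. Optimized GPU kernels commonly fuse steps like these to cut memory traffic, though the article does not say which techniques TKC uses.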

Native PyTorch compatibility

TKC is fully integrated with PyTorch, allowing AI developers to seamlessly leverage optimizations within their existing frameworks. This integration simplifies the adoption of TKC, making it as easy as changing an import statement within PyTorch.
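
The "change an import" idea can be illustrated with a generic drop-in pattern: call sites depend on a name, and only the place that name is loaded from changes. The sketch below is purely hypothetical. The module name `hypothetical_fast_kernels` is invented for illustration and is not TKC's real import path, which the article does not give.

```python
import importlib
import math


def load_silu(module_name: str = "hypothetical_fast_kernels"):
    """Return `silu` from an optimized module if present, else a fallback.

    `hypothetical_fast_kernels` is a made-up module name used only to
    show the pattern; it is not a real package or TKC's actual API.
    """
    try:
        return importlib.import_module(module_name).silu
    except (ImportError, AttributeError):
        # Plain reference implementation: z * sigmoid(z).
        return lambda z: z / (1.0 + math.exp(-z))


# Call sites use `silu` and never need to know which version was loaded.
silu = load_silu()
print(silu(0.0))
```

Because the rest of the code only refers to `silu`, swapping implementations touches one line, which is the kind of low-friction adoption the article describes.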

Production level testing

Together AI says TKC undergoes rigorous testing to meet production-grade standards for performance and stability in real-world applications. All Together GPU clusters, whether H200 or H100, are TKC-ready out of the box.

NVIDIA H200: Faster performance and more memory

The NVIDIA H200 Tensor Core GPU, based on the Hopper architecture, is designed for high-performance AI and HPC workloads. According to NVIDIA, the H200 delivers 40 percent faster inference on Llama 2 13B and 90 percent faster inference on Llama 2 70B than its predecessor, the H100. The H200 features 141 GB of HBM3e memory and 4.8 TB/s of memory bandwidth, nearly double the capacity and 1.4x the bandwidth of the H100.
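
The "nearly double" and "1.4x" comparisons can be checked with a short calculation. The H200 figures come from the article. The H100 figures (80 GB and 3.35 TB/s for the SXM variant) are not in the article and are taken here as an assumption from NVIDIA's published specifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuMemory:
    name: str
    capacity_gb: float
    bandwidth_tb_s: float


h200 = GpuMemory("H200", 141.0, 4.8)       # figures quoted in the article
h100 = GpuMemory("H100 SXM", 80.0, 3.35)   # assumption: NVIDIA H100 SXM specs

capacity_ratio = h200.capacity_gb / h100.capacity_gb
bandwidth_ratio = h200.bandwidth_tb_s / h100.bandwidth_tb_s

print(f"capacity:  {capacity_ratio:.2f}x")   # 1.76x
print(f"bandwidth: {bandwidth_ratio:.2f}x")  # 1.43x
```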

High-performance interconnectivity

Together GPU Clusters use the SXM form factor for high-bandwidth data transfer and support fast GPU-to-GPU communication via NVIDIA’s NVLink and NVSwitch technologies. Combined with NVIDIA Quantum-2 3200 Gb/s InfiniBand networking, this setup is suited to large-scale AI training and HPC workloads.
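
As a rough sense of scale for the InfiniBand figure, the sketch below converts gigabits per second to gigabytes per second and estimates an ideal transfer time. It treats the quoted 3200 Gb/s as aggregate bandwidth available to a single transfer, which is an assumption, and it ignores protocol overhead, topology, and contention, so the result is a lower bound and not a measurement.

```python
def gbit_to_gbyte_per_s(gbit_per_s: float) -> float:
    """Convert gigabits per second to gigabytes per second (8 bits per byte)."""
    return gbit_per_s / 8.0


def ideal_transfer_seconds(payload_gb: float, link_gbit_per_s: float) -> float:
    """Lower-bound transfer time: payload size divided by raw link bandwidth."""
    return payload_gb / gbit_to_gbyte_per_s(link_gbit_per_s)


link_gbit_per_s = 3200.0   # figure quoted in the article
payload_gb = 141.0         # illustrative: one H200's full memory capacity

print(gbit_to_gbyte_per_s(link_gbit_per_s))                 # 400.0 GB/s
print(ideal_transfer_seconds(payload_gb, link_gbit_per_s))  # about 0.35 s
```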

Cost-effective infrastructure

Together AI says its infrastructure is designed to be up to 75% more cost-effective than cloud providers such as AWS. The company also offers flexible commitment options from one month to five years, so customers can match resources to each stage of the AI development lifecycle.

Reliability and Support

Together AI’s GPU clusters come with a 99.9% uptime SLA and are backed by rigorous acceptance testing. The company’s White Glove Service provides end-to-end support, from cluster setup to ongoing maintenance, to keep customers’ AI workloads running at peak performance.
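
A 99.9% uptime figure is easier to judge as permitted downtime. The sketch below does that conversion. How the SLA actually measures downtime (measurement window, exclusions for scheduled maintenance) is not stated in the article, so the 30-day and 365-day periods are assumptions.

```python
def allowed_downtime_minutes(uptime_percent: float, period_days: float) -> float:
    """Maximum downtime in minutes that still meets the uptime percentage."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1.0 - uptime_percent / 100.0)


per_month = allowed_downtime_minutes(99.9, 30)    # assumed 30-day month
per_year = allowed_downtime_minutes(99.9, 365)    # assumed 365-day year

print(f"per 30 days:  {per_month:.1f} minutes")       # 43.2 minutes
print(f"per 365 days: {per_year / 60:.2f} hours")     # 8.76 hours
```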

Flexible deployment options

Together AI offers multiple deployment options, including Slurm for high-performance workload management, Kubernetes for containerized AI workloads, and bare metal clusters running Ubuntu for direct access and ultimate flexibility. These options meet a variety of AI project needs, from large-scale training to production-level inference.

Together AI continues to support the entire AI lifecycle with high-performance NVIDIA H200 GPU clusters and the Together Kernel Collection. The platform is designed to optimize performance, reduce costs, and ensure stability, making it the ideal choice for accelerating AI development.

Image source: Shutterstock

