AI starts a cost -effective batch API for LLM request.

James Ding
June 11, 2025 19:34

Together, AI introduces a placement API that decreases by 50% to handle large language model requests. This service provides extended and asynchronous processing for a non -water -oriented workload.

The AI has unveiled a new batch API, a service designed to handle many large language models (LLM) requests at a significant reduction in costs. According to AI, the Batch API is an attractive option for business and developers, promising to provide enterprise -class performance in half of the real -time reasoning cost.

Why is the batch processing?

Batch processing allows you to handle AI workloads that do not require immediate response, such as synthetic data creation and offline summary. By treating these requests asynchronously during the peak time, the user can benefit from cost savings while maintaining a reliable output. Most of the places are completed in a few hours and the maximum treatment window is 24 hours.

Main advantage

50% cost reduction

The Batch API provides a 50%cost savings in a non -water -oriented workload compared to the real -time API call, allowing users to expand the AI reasoning without increasing the budget.

Large -scale processing

The user can submit up to 50,000 requests in a single batch file, and the batch work has a separate interest rate limit from real time. This service includes a real -time progress tracking through a variety of stages, from verification to completion.

Simple integration

The request is uploaded to the JSONL file and the progress is monitored through the placement API. When processing is complete, you can download the results.

Supported model

The Batch API supports 15 advanced models, including the DEEPSEEK-AI and Meta-Llama series, which are adjusted to handle various complex tasks.

Operating

Prepare your request: Request for formats of JSONL files with unique identifiers.
Upload and submission: Use the File API to upload the placement and create a task.
Monitor progress: Trace your work through various processing stages.
Download the results: The error is documented separately to search for structured results.

Rate restrictions and scale

The batch API works under a dedicated speed limit, allowing up to 10 million tokens per model and 50,000 requests per batch file, and up to 100MB per input file.

Price and best practices

Users receive a 50% discount without prepaid promise. The optimal batch size is 1,000 ~ 10,000 requests, and model selection should be based on work complexity. Monitoring is recommended for updates every 30-60 seconds.

Starting

To start using the batch API, the user must upgrade to the latest information. together Review Python Client, Batch API documents and explore the example cooking book provided online. This service is now available to all users, so it provides significant cost savings for mass processing of LLM requests.

Image Source: Shutter Stock

AI starts a cost -effective batch API for LLM request.

Michael Burry’s Short-Term Investment in the AI Market: A Cautionary Tale Amid the Tech Hype

BTC Rebound Targets $110K, but CME Gap Cloud Forecasts

TRX Price Prediction: TRON targets $0.35-$0.62 despite the current oversold situation.

10 Best Altcoin Prop Trading Firms 2025

Phemex Launches $6 Million, Multi-Venue Festival To Celebrate Its 6th Anniversary

Kraken strengthens its global strategy as Citadel joins a new wave of investment with $200 million in funding.

Unlock Instant Liquidity Without Selling Your Crypto

Ethereum price crashes to $3,000 amid market shakeup, with analysts warning of volatility ahead.

Michael Burry’s Short-Term Investment in the AI Market: A Cautionary Tale Amid the Tech Hype

Bessent called for a reconsideration of taxes on cryptocurrency staking rewards.

Introducing Filecoin Onchain Cloud: Verifiable, Developer-Owned Infrastructure

Vault12 Guard now uses the CXP industrial protocol to retrieve iOS credentials from Apple Password.

The XRP Community Is Exploding! Investors Are Achieving An Average Daily Return Of $3,777 Using Anchor Mining!

Bitcoin price risks hitting a deeper bottom — unless this happens.

Top Insights

10 Best Altcoin Prop Trading Firms 2025

Phemex Launches $6 Million, Multi-Venue Festival To Celebrate Its 6th Anniversary

Kraken strengthens its global strategy as Citadel joins a new wave of investment with $200 million in funding.

Most Popular

Bitcoin Surge Explodes $170M from Bearish Shorts with BTC Price Targeting $48K

Pantera Bitcoin Fund Achieves 1,000X Milestone Amid Surging Market

Dogecoin (DOGE) has formed a bullish setup that previously triggered a massive rally, Crypto Trader has revealed.

AI starts a cost -effective batch API for LLM request.

Why is the batch processing?

Main advantage

50% cost reduction

Large -scale processing

Simple integration

Supported model

Operating

Rate restrictions and scale

Price and best practices

Starting

Related Posts