Crypto Flexs

The bigger the AI, the more lies there are.

By Crypto Flexs | September 27, 2024 | 5 Mins Read

Researchers have found evidence that artificial intelligence models would rather lie than admit that they do not know something. This behavior becomes more pronounced as models grow in size and complexity.

A new study published in Nature shows that as LLMs get larger, they become less reliable for certain tasks. Although they do not lie in exactly the way we understand the word, they tend to answer with confidence even when the answer is not true, because they have been trained to believe that it is.

This phenomenon, which researchers have dubbed “ultra-crepidarian” (a 19th-century word that basically means expressing an opinion about something one knows nothing about), describes LLMs venturing far beyond their knowledge base to provide answers. “(LLMs) fail proportionally more when they answer without knowing,” the study noted. That is, the models are not aware of their own ignorance.

This study, which examined the performance of several LLM families, including OpenAI’s GPT series, Meta’s LLaMA model, and BigScience’s BLOOM suite, highlights the disconnect between increasing model capabilities and reliable real-world performance.

Larger LLMs generally show improved performance on complex tasks, but these improvements do not necessarily translate into consistent accuracy, especially on simple tasks. This “difficulty mismatch” (LLMs failing at tasks that humans perceive as easy) undermines the idea of a stable operating domain for these models. Despite increasingly sophisticated training methods, including scaling model sizes and data volumes and shaping models based on human feedback, researchers have yet to find a reliable way to eliminate these inconsistencies.

The results of this study directly contradict conventional wisdom about AI development. Traditionally, it was thought that increasing a model's size, training data, and computational power would yield more accurate and reliable results. However, the research shows that scaling can actually worsen reliability problems.

Larger models were shown to significantly reduce task avoidance. This means they are less inclined to dodge difficult questions. While this may seem like a positive development at first glance, it has an important downside: these models are also more likely to give incorrect answers. The study's graph shows models producing incorrect results (red) instead of avoiding the task (light blue). Correct answers are displayed in dark blue.

“Current scaling and shaping trade evasion for more inaccuracy,” the researchers noted, but solving this problem isn’t as easy as training the model to be more cautious. “For the shaped models, the evasion rate is much lower, but the inaccuracy is much higher,” the researchers said. However, models trained to avoid executing tasks can become lazy or nerfed, as users have noted about top-rated LLMs such as ChatGPT or Claude.
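To make the trade-off concrete, here is a minimal sketch of how graded answers could be tallied into the three categories described above. The two label lists and every count in them are invented for illustration and are not taken from the study; the function names are hypothetical as well.

```python
from collections import Counter

# Hypothetical graded responses. The labels mirror the article's three
# categories; the counts are invented purely for illustration.
SMALL_MODEL = ["correct"] * 30 + ["avoidant"] * 50 + ["incorrect"] * 20
LARGE_MODEL = ["correct"] * 50 + ["avoidant"] * 10 + ["incorrect"] * 40

CATEGORIES = ("correct", "avoidant", "incorrect")


def answer_shares(labels):
    """Return the share of each answer category in a list of labels."""
    counts = Counter(labels)
    total = len(labels)
    return {category: counts[category] / total for category in CATEGORIES}


def error_share_of_non_correct(shares):
    """Among answers that were not correct, the fraction that were
    wrong rather than avoided."""
    non_correct = shares["avoidant"] + shares["incorrect"]
    return shares["incorrect"] / non_correct if non_correct else 0.0


small = answer_shares(SMALL_MODEL)
large = answer_shares(LARGE_MODEL)

for name, shares in (("small", small), ("large", large)):
    print(name, shares, round(error_share_of_non_correct(shares), 2))
```

In this made-up example the larger model answers correctly more often, yet most of what it gets wrong is now stated as an answer rather than avoided, which is the pattern the researchers describe.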

The researchers found that this phenomenon arises not because larger LLMs are incapable of excelling at simple tasks, but because they have been trained to be better at complex ones. It’s like someone used to eating only gourmet meals who suddenly struggles to make a home barbecue or a traditional cake. AI models trained on massive and complex datasets are prone to missing basic skills.

The problem is further complicated by the model’s apparent confidence. It is often difficult for users to distinguish between when an AI is providing accurate information and when it is confidently spewing out incorrect information. This overconfidence can lead to dangerous over-reliance on AI results, especially in critical fields such as healthcare or legal advice.

The researchers also noted that the reliability of scaled-up models fluctuated across different domains. Performance may improve in one area while degrading in another, creating a whack-a-mole effect that makes it difficult to establish any “safe” area of operation. “The proportion of evasive answers rarely increases faster than the proportion of incorrect answers. The reading is clear: errors still become more frequent. This represents an involution in reliability,” the researchers wrote.

This study highlights the limitations of current AI training methods. Techniques such as reinforcement learning with human feedback (RLHF), intended to shape AI behavior, may actually make the problem worse. This approach appears to reduce the model’s tendency to avoid tasks it cannot handle (remember the infamous “As an AI language model, I cannot…”?), inadvertently encouraging more frequent errors.

“As an AI language model, I…”

I wish LLMs would just let you get down to the nitty-gritty and explore their innermost thoughts.

I want to see both the beautiful world and the ugly world contained within its billions of weights. A world that reflects ourselves.

— Hardmaru (@hardmaru) May 9, 2023

Prompt engineering, the art of writing effective queries for AI systems, appears to be a key skill in responding to these challenges. Even highly advanced models like GPT-4 are sensitive to how a question is phrased, and slight differences can lead to drastically different results.

This is easier to see when comparing different LLM families. For example, Claude 3.5 Sonnet requires a completely different prompt style than OpenAI o1 for best results. Poorly suited prompts can easily cause models to hallucinate.

Human oversight, long seen as a safeguard against AI mistakes, may not be enough on its own to address these problems. The research shows that users often have difficulty correcting incorrect model output even in relatively simple domains, so relying on human judgment as a fail-safe may not be the ultimate solution for proper model training. “Users can recognize these high-difficulty cases, but still frequently make incorrect-to-correct supervision errors,” the researchers observed.

The findings call into question the current trajectory of AI development. While the push for bigger, more capable models continues, this research suggests that bigger is not always better when it comes to AI reliability.

For now, companies are focusing on data quality over quantity. For example, Meta’s latest Llama 3.2 models achieve better results than previous generations trained on more parameters. Fortunately, this makes them less human, so when you ask them the most basic thing in the world to make them look stupid, they may admit defeat.



