The bigger the AI, the more it lies.

By Crypto Flexs · September 27, 2024 · 5 Mins Read

Researchers have found evidence that artificial intelligence models would rather lie than admit the shame of not knowing something, and the behavior becomes more pronounced as models grow in size and complexity.

A new study published in Nature shows that as LLMs get larger, they become less reliable on certain tasks. They don’t lie in the deliberate, human sense of the word, but they do tend to answer confidently even when the answer is wrong, because they have been trained to produce a response rather than to admit uncertainty.

This phenomenon, which researchers have dubbed “ultra-crepidarian” (a 19th-century word that basically means expressing an opinion about something one knows nothing about), sees LLMs venturing far beyond their knowledge base to provide answers. “(LLMs) fail proportionally more when they answer without knowing,” the study noted. In other words, the models are not aware of their own ignorance.

This study, which examined the performance of several LLM families, including OpenAI’s GPT series, Meta’s LLaMA model, and BigScience’s BLOOM suite, highlights the disconnect between increasing model capabilities and reliable real-world performance.

Larger LLMs generally show improved performance on complex tasks, but those gains do not necessarily translate into consistent accuracy, especially on simple tasks. This “difficulty mismatch” (LLMs failing at tasks humans perceive as easy) undermines the idea of a stable operating domain for these models. Despite increasingly sophisticated training methods, including scaling model sizes and data volumes and shaping models with human feedback, researchers have yet to find a reliable way to eliminate these inconsistencies.

The results of this study directly contradict existing beliefs about AI development. Traditionally, it was thought that increasing the size, amount of data, and computational power of the model would yield more accurate and reliable results. However, research shows that scaling can actually worsen reliability problems.

Larger models were shown to avoid tasks significantly less often, meaning they are less inclined to dodge difficult questions. While that may seem like a positive development at first glance, it has an important downside: the models are also more likely to give incorrect answers. The study’s charts make the shift easy to see: avoided tasks (light blue) shrink while incorrect answers (red) grow, with correct answers shown in dark blue.

“Current scaling and shaping trade evasion for more inaccuracy,” the researchers noted, but solving this problem isn’t as simple as training the models more carefully. “For the shaped-up models, the evasion rate is much lower, but the inaccuracy is much higher,” they said. On the other hand, models trained to refuse tasks can come across as lazy or nerfed, as users have complained about other leading LLMs such as ChatGPT or Claude.
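
To make that trade-off concrete, here is a minimal Python sketch of the bookkeeping behind the study’s charts; the model names and grades below are invented for illustration and come from neither the paper nor this article. Each answer is graded as correct, incorrect, or avoidant, and the per-model rates are compared.

from collections import Counter

# Hypothetical grades, for illustration only. The study judges each
# answer as "correct", "incorrect", or "avoidant" (declined/hedged).
graded_answers = {
    "small-model": ["correct", "avoidant", "avoidant", "incorrect", "correct"],
    "large-model": ["correct", "incorrect", "incorrect", "correct", "correct"],
}

for model, grades in graded_answers.items():
    counts = Counter(grades)
    total = len(grades)
    rates = {label: counts[label] / total
             for label in ("correct", "incorrect", "avoidant")}
    # The pattern the study reports: scaling tends to move mass from
    # "avoidant" into "incorrect" rather than into "correct".
    print(model, rates)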

The researchers found that this happens not because larger LLMs are incapable of excelling at simple tasks, but because they are trained to excel at complex ones. It’s like a cook who prepares only gourmet dishes suddenly struggling to grill a simple barbecue or bake a traditional cake: AI models trained on massive, complex datasets are prone to missing basic skills.

The problem is further complicated by the model’s apparent confidence. It is often difficult for users to distinguish between when an AI is providing accurate information and when it is confidently spewing out incorrect information. This overconfidence can lead to dangerous over-reliance on AI results, especially in critical fields such as healthcare or legal advice.
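
One common way to quantify that overconfidence is a simple calibration check: group answers by the model’s stated confidence and compare each group’s average confidence with its actual accuracy. This is a general technique rather than something the study reports, and the numbers below are made up; a minimal sketch:

# Made-up (confidence, was-the-answer-correct) pairs for illustration.
predictions = [
    (0.95, True), (0.92, False), (0.90, False), (0.85, True),
    (0.60, True), (0.55, False), (0.30, False), (0.20, False),
]

buckets = {}  # confidence decile -> list of (confidence, correct)
for conf, correct in predictions:
    buckets.setdefault(int(conf * 10), []).append((conf, correct))

for decile in sorted(buckets, reverse=True):
    items = buckets[decile]
    avg_conf = sum(c for c, _ in items) / len(items)
    accuracy = sum(ok for _, ok in items) / len(items)
    # A large positive gap means the model claims more certainty
    # than its answers warrant in that confidence range.
    print(f"confidence ~{avg_conf:.2f}: accuracy {accuracy:.2f} "
          f"(gap {avg_conf - accuracy:+.2f})")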

The researchers also noted that the reliability of the scaled-up models fluctuates across domains: performance may improve in one area while declining in another, creating a whack-a-mole effect that makes it difficult to establish a “safe” operating region. “The proportion of evasive answers rarely increases faster than the proportion of incorrect answers. The reading is clear: errors become even more frequent. This represents a regression in reliability,” the researchers wrote.

This study highlights the limitations of current AI training methods. Techniques such as reinforcement learning from human feedback (RLHF), used to shape AI behavior, may actually make the problem worse, because they appear to reduce the models’ tendency to avoid tasks they cannot handle (remember the infamous “As an AI language model, I can’t…” replies?). That inadvertently encourages more frequent errors.

“As an AI language model, I…

I hope LLMs will be allowed to get down to the nitty-gritty and explore their innermost thoughts.

I want to see both the beautiful and the ugly worlds contained within their billions of weights. A world that reflects ourselves.”

— hardmaru (@hardmaru) May 9, 2023

Prompt engineering, the art of writing effective queries for AI systems, appears to be a key skill in dealing with these challenges. Even highly advanced models like GPT-4 are sensitive to how a question is phrased, and slight differences in wording can lead to drastically different results.

This is easiest to see when comparing different LLM families. For example, Claude 3.5 Sonnet requires a very different prompting style than OpenAI’s o1 to get the best results, and an ill-suited prompt can easily cause a model to hallucinate.
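
As a toy illustration of that sensitivity, the same question can be posed in two phrasings and the answers compared side by side. The ask() helper below is a hypothetical stub standing in for whichever client library you use; it is not an API from the study or from any vendor.

def ask(model: str, prompt: str) -> str:
    """Hypothetical stand-in: replace with a real LLM API call."""
    return f"<{model}'s answer to: {prompt!r}>"

# Identical facts, different wording: the point is that small phrasing
# changes can flip a model between a right and a wrong answer.
QUESTION_VARIANTS = [
    "What is the boiling point of water at sea level, in Celsius?",
    "At standard atmospheric pressure, water boils at what temperature?",
]

for model in ("model-a", "model-b"):  # placeholder model names
    for prompt in QUESTION_VARIANTS:
        print(f"[{model}] {prompt}\n    -> {ask(model, prompt)}")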

Human oversight alone, long seen as a safeguard against AI mistakes, may not be enough to address these problems. Research shows that users often have difficulty correcting incorrect model output even in relatively simple domains, so relying on human judgment as a fail-safe may not be the ultimate solution for proper model training. “Users can recognize these high-difficulty cases, but still frequently make poor supervision errors,” the researchers observed.

The findings call into question the current trajectory of AI development. While the push for bigger, more capable models continues, this research suggests that bigger is not always better when it comes to AI reliability.

Some companies are now focusing on data quality over sheer quantity. Meta’s latest Llama 3.2 models, for example, achieve better results than earlier generations trained with more parameters. With luck, that will also make them a little less human: when asked the most basic question in the world, they may simply admit defeat rather than bluff and look stupid.
