2025-08-13 Science & Technology

AI LLMs' "simulated reasoning" abilities are a "brittle mirage," researchers find

[ArsTechnica] In recent months, the AI industry has started moving toward so-called simulated reasoning models that use a "chain of thought" process to work through tricky problems in multiple logical steps. At the same time, recent research has cast doubt on whether those models have even a basic understanding of general logical concepts or an accurate grasp of their own "thought process."

Similar research shows that these "reasoning" models can often produce incoherent, logically unsound answers when questions include irrelevant clauses or deviate even slightly from common templates found in their training data.

In a recent pre-print paper, researchers from the University of Arizona summarize this existing work as "suggest[ing] that LLMs are not principled reasoners but rather sophisticated simulators of reasoning-like text." To pull on that thread, the researchers created a carefully controlled LLM environment in an attempt to measure just how well chain-of-thought reasoning works when presented with "out of domain" logical problems that don't match the specific logical patterns found in their training data.

The results suggest that the seemingly large performance leaps made by chain-of-thought models are "largely a brittle mirage" that "become[s] fragile and prone to failure even under moderate distribution shifts," the researchers write. "Rather than demonstrating a true understanding of text, CoT reasoning under task transformations appears to reflect a replication of patterns learned during training."

Posted by Elmerert Hupens2660 2025-08-13 00:00

#1 I'm beginning to think that AI is just bull.

Posted by Grom the Affective 2025-08-13 02:53

#2 AI brings a gargantuan productivity increase in fields related to esthetics. Commercial art, illustrations for ad campaigns, commercial jingles, etc. are AI's strengths.

But AI is unreliable when it comes to facts.

If thoroughly checking AI output for accuracy and suitability is easy, then AI is an appropriate tool with a disruptive impact.

If thoroughly checking AI output is hard, perhaps harder than creating an original work using other tools, AI will decrease productivity.

The standard SF trope that an AI would resemble an unemotional, fact-oriented, rational human mind has turned out not to be true.

AI is more like a simulation of an eccentric gifted artist's mind during a fever dream, a product of complex computational activity without consciousness.

Posted by Elmerert Hupens2660 2025-08-13 04:11

#3 "the same guy who shot the iconic Sydney Sweeney x American Eagle campaign just made a full Jack Daniels commercial using Higgsfield Seedance Pro

it used to cost 1 million, now $20 with AI"

Posted by Elmerert Hupens2660 2025-08-13 06:01

#4 LLMs are not what researchers call Artificial General Intelligence. Efforts toward that incorporate multiple agents with different computational inputs and algorithms: think of a self-directing robot with many sensors of different kinds, a shared internal network of inferences from those different modules, etc. In other words, how human brains / minds work.

LLMs and the generative approach to making new sentences etc. from patterns learned via LLMs, are not AGI. The AI label stuck because language is key for humans - key, but not the whole story by any means.

Posted by Robin Burk 2025-08-13 07:27

#5 For code:
Here’s how I[sic] use LLMs to help me write code

Posted by Skidmark 2025-08-13 08:17

#6 The one thing that no AI is capable of is answering the "Why" question.

There has to be self awareness to motivate an independent activity to accomplish a goal. Anyone know of any such thing?

That's what all the algorithms and programs do. The hardest thing in programming is deciding what you (the customer) want to accomplish and why. Lots of discussion about those.

Posted by alanc 2025-08-13 09:10
