A study by researchers from University College London (UCL) has found that leading AI models, including ChatGPT and Meta’s Llama, display irrational behaviour and make simple mistakes when solving classic logic puzzles designed to test human reasoning, raising concerns about the reasoning capabilities of current AI technologies.
The UCL team evaluated seven prominent large language models, among them ChatGPT, Meta’s Llama, Claude 2, and Google Bard (now called Gemini), and found that these models frequently gave irrational answers or made elementary errors on the puzzles.
The AIs were tested using 12 classic logic puzzles such as the Monty Hall Problem, the Linda Problem, the Wason Task, and the AIDS Task. Though humans also struggle with these puzzles, the AI models displayed irrational responses distinct from those typically shown by humans. Notably, some AI models even refused to answer certain logic questions, citing ethical concerns.
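For readers unfamiliar with these puzzles, the Monty Hall Problem is perhaps the best known: a contestant picks one of three doors, the host opens a different door hiding a goat, and the contestant may switch. The counterintuitive but correct answer is that switching wins two-thirds of the time, which is the kind of reasoning the study probed. The short Python sketch below is purely illustrative and not part of the study; the function names and trial count are my own choices for demonstration.

```python
import random

def monty_hall_trial(switch: bool) -> bool:
    """Simulate one round of the Monty Hall Problem.

    The contestant picks a door; the host, who knows where the car is,
    opens a different door hiding a goat; the contestant then either
    keeps the original choice or switches to the remaining closed door.
    Returns True if the contestant wins the car.
    """
    doors = [0, 1, 2]
    car = random.choice(doors)
    pick = random.choice(doors)
    # Host opens a door that is neither the contestant's pick nor the car.
    host_opens = random.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Switch to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != host_opens)
    return pick == car

if __name__ == "__main__":
    trials = 100_000
    stay_wins = sum(monty_hall_trial(switch=False) for _ in range(trials))
    switch_wins = sum(monty_hall_trial(switch=True) for _ in range(trials))
    # Staying wins roughly 1/3 of the time; switching wins roughly 2/3.
    print(f"stay:   {stay_wins / trials:.3f}")
    print(f"switch: {switch_wins / trials:.3f}")
```

Running the simulation shows staying winning about 33% of the time and switching about 67%, the answer that both people and, per the study, several AI models often fail to reach by reasoning alone.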
Meta’s Llama 2, for instance, refused to answer the Linda Problem, citing perceived “harmful gender stereotypes,” which hurt its performance. The best-performing model was ChatGPT-4, which answered correctly 69.2% of the time, while the worst was Meta’s Llama 2 7b, which gave incorrect answers 77.5% of the time.
These findings, published in Royal Society Open Science, indicate that current AI models do not yet possess human-like reasoning abilities and raise questions about their application in critical fields such as medicine and diplomacy.