
Six of the most advanced AI models failed research-level math tests: the best result turned out to be only 2%

Inna Vasilyuk
The most advanced AI solved only 2% of the tasks. Source: Freepik

Mathematicians developed new problems to test the reasoning skills of six of the most advanced artificial intelligence models. However, the models failed almost all of the tests.

Modern AI models have difficulty solving research-level math problems. Even the most advanced AI systems were able to solve only about 2% of the hundreds of problems they were given, LiveScience writes.

According to the Epoch AI research institute, these problems usually take mathematicians with doctoral degrees hours or days to solve. The most advanced AI models got less than 2% of them right.

A number of AI tests have been developed over the past decade, and in many cases AI models pass them easily, scientists say. For example, on the standard MMLU (Measuring Massive Multitask Language Understanding) test, modern AI models answer 98% of math problems correctly.

Most of these tests are aimed at checking the ability of artificial intelligence to perform high school and college-level math, writes Elliot Glazer, a mathematician at Epoch AI.

However, a new set of tests called FrontierMath is aimed at a higher level of reasoning. Epoch AI developed the questions with the help of math professors. According to the developers, the tests cover a wide range of subfields, from number theory to algebraic geometry.

The scientists' findings show that current artificial intelligence models are not capable of mathematical reasoning at the research level. However, as AI develops, these benchmark tests will provide a way to see whether the models' reasoning abilities are deepening.

"By regularly evaluating state-of-the-art models and collaborating with the AI research community, we aim to deepen our understanding of AI’s capabilities and limitations," the team of scientists said.

