
Six of the most advanced AI models failed research-level math tests: the best result turned out to be only 2%

Inna Vasilyuk | News | 22.11.2024 08:47

The most advanced AI solved only 2% of the tasks. Source: Freepik

Mathematicians developed new problems to test the reasoning skills of six of the most advanced artificial intelligence models. However, the models failed almost all of the tests.

Modern AI models struggle with research-level math problems. Even the most advanced AI systems were able to solve only 2% of the hundreds of problems they were given, LiveScience writes.

According to the Epoch AI research institute, these problems typically take mathematicians with doctoral degrees hours or days to solve, yet the most advanced AI models got less than 2% of them right.


A number of AI tests have been developed over the past decade, and in many cases, AI models easily pass these tests, scientists say. For example, in the standard MMLU (Measuring Massive Multitask Language Understanding) test, modern AI models answer 98% of math problems correctly.

Most of these tests are aimed at checking the ability of artificial intelligence to perform high school and college-level math, writes Elliot Glazer, a mathematician at Epoch AI.

However, a new set of tests called FrontierMath is aimed at a higher level of reasoning. Epoch AI developed the questions with the help of math professors. According to the developers, the tests cover a wide range of subfields, from number theory to algebraic geometry.

The scientists' findings show that current artificial intelligence models lack research-level mathematical reasoning. However, as AI develops, these benchmark tests will provide a way to see whether its reasoning abilities are deepening.

"By regularly evaluating state-of-the-art models and collaborating with the AI research community, we aim to deepen our understanding of AI’s capabilities and limitations," the team of scientists said.
