Google says its Gemini AI outperforms both GPT-4 and expert humans

Matthew Sparkes in New Scientist:

Three versions of Gemini have been created for different applications, called Nano, Pro and Ultra, which increase in size and capability. Google declined to answer questions on the size of Pro and Ultra, the number of parameters they include or the scale or source of their training data. But its smallest version, Nano, which is designed to run locally on smartphones, is actually two models: one for slower phones that has 1.8 billion parameters and one for more powerful devices that has 3.25 billion parameters. Comparing the capabilities of AI models is an inexact science, but GPT-4 is rumoured to include up to 1.7 trillion parameters and Meta’s Llama 2 has 70 billion.

The mid-range Pro version of Gemini beats some other models, such as OpenAI’s GPT-3.5, but the more powerful Ultra exceeds the capability of all existing AI models, Google claims. It scored 90 per cent on the industry-standard MMLU benchmark, where an “expert level” human is expected to achieve 89.8 per cent.

This is the first time an AI has beaten humans on the test, and the score is the highest of any existing model. The test involves a broad range of tricky questions on topics including logical fallacies, moral problems in everyday scenarios, medical issues, economics and geography.

More here.