Google Gemini Breaks 90% Mark on MMLU, Surpassing Expert Human Level

Category Technology

Monday, December 18, 2023, 00:38 UTC

tldr #

Google's Gemini Ultra has become the first large language model reported to score above 90% on Massive Multitask Language Understanding (MMLU), a multiple-choice benchmark covering 57 subjects. Its reported score of 90.04% edges past the estimated expert human accuracy of 89.8%. The result is widely framed as a step toward more general AI systems. At announcement, Google said Gemini Ultra would first go to select customers, developers, partners, and safety experts, with a broader rollout planned for early 2024.

content #

Google Gemini has reached a milestone in language understanding by scoring higher than the estimated expert human level on Massive Multitask Language Understanding (MMLU). For the first time, a large language model has been reported to pass the 90% mark on MMLU, a benchmark designed to be very difficult for AI systems. Gemini Ultra, developed by Google's AI research teams, scored 90.04%, just above the estimated expert human accuracy of 89.8%. The result also exceeds the previous best reported score, GPT-4's 86.4%.
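The gaps between these reported scores are small, so it helps to compute them explicitly. The sketch below uses only the figures quoted in this article; the dictionary keys and the `margin` helper are illustrative names, not part of any benchmark tooling.

```python
# Reported MMLU accuracies quoted in this article, in percent.
scores = {
    "Gemini Ultra": 90.04,
    "Human expert (estimated)": 89.8,
    "GPT-4": 86.4,
}


def margin(a: str, b: str) -> float:
    """Percentage-point gap between two reported scores, rounded to 2 places."""
    return round(scores[a] - scores[b], 2)


print(margin("Gemini Ultra", "Human expert (estimated)"))
print(margin("Gemini Ultra", "GPT-4"))
```

Gemini Ultra's lead over the expert estimate is about a quarter of a percentage point, far narrower than its lead over GPT-4.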

Gemini Ultra was tested on multiple-choice questions spanning 57 subjects

MMLU is a demanding benchmark meant to challenge even the most advanced AI systems. It consists of multiple-choice questions across 57 subjects, including mathematics, history, law, medicine, and computer science, at difficulty levels ranging from elementary to professional. Scoring well requires both broad knowledge and the ability to reason with it, not just fluency in language.
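MMLU questions are multiple choice with four options each, and a score such as 90.04% is simply the share of questions answered correctly. The following is a minimal sketch of that scoring, assuming answers are recorded as the letters A to D; the `accuracy` function and the five sample questions are made up for illustration and are not the official evaluation code.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Return the percentage of predictions that match the answer key."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("cannot score an empty benchmark")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return 100.0 * correct / len(answers)


# Made-up answer key and model outputs for five four-option questions.
answer_key = ["B", "D", "A", "C", "A"]
model_choices = ["B", "D", "A", "A", "A"]

print(f"{accuracy(model_choices, answer_key):.1f}%")  # 4 of 5 correct
```

Real evaluations differ mainly in how the model is prompted and how its chosen letter is extracted, which is why scores obtained under different prompting setups are not always directly comparable.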

Gemini Ultra was evaluated on MMLU's test set of roughly 14,000 multiple-choice questions. Google reported the 90.04% figure using a chain-of-thought prompting setup with 32 samples per question (CoT@32): the model generates several reasoning chains and takes the consensus answer when agreement is strong enough, falling back to a single greedy answer otherwise. Under standard 5-shot prompting, Google reported a lower score of 83.7% for Gemini Ultra, while GPT-4's 86.4% is a 5-shot result, so the headline comparison depends on the evaluation setup.

Gemini Ultra achieved higher accuracy than GPT-4

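The result is a notable step for general-purpose language models and is often discussed in the context of artificial general intelligence (AGI), loosely a machine that performs at least as well as a typical (median) human across a wide range of tasks. Some perspective is needed, though. The MMLU authors reported that unspecialized human test-takers averaged only about 34.5%, so the expert-level bar of 89.8% is already far above what an average person achieves, and a high score on exam-style questions does not by itself demonstrate general intelligence.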

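At announcement, Google said Gemini Ultra would first be made available to select customers, developers, partners, and safety experts for early experimentation and feedback, with a broader rollout to developers and enterprise customers planned for early 2024.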

The breakthrough is a big step forward towards artificial general intelligence

hashtags #

googlegemini mmlu ai nas agi
