Researchers find that Gemini can’t even beat GPT-3.5 Turbo

When Google announced the release of its Gemini models, there was a palpable sense of excitement in the air. Gemini was touted as a major competitor to OpenAI’s GPT, and Google promised that its Ultra model was even better than GPT-4. But was it true?
Researchers from Carnegie Mellon University and the AI software platform BerriAI decided to find out. The team ran the same tests using GPT-3.5 Turbo, GPT-4 Turbo, and Mistral AI’s new Mixtral 8x7B model and compared the results with Gemini Pro.
The results were surprising: While GPT-4 came out on top, Gemini Pro scored slightly lower than GPT-3.5 Turbo in multiple-choice questions, general-purpose reasoning, math reasoning, code generation, language translation, and acting as a web agent. Gemini Pro did manage to outperform the other models in word sorting, symbol manipulation, and translation, however. Its final score on the translation tests was lower than GPT-3.5 Turbo's simply because the model declined to complete some requests when its overzealous content moderation guardrails kicked in.
Google has disputed the researchers' figures and insists its own results show that Gemini Pro is on par with or better than GPT-3.5. While we can split the difference here and say that Gemini Pro and GPT-3.5 are pretty much the same, the key takeaway is that Gemini Pro, a brand-new model, doesn’t beat a model that has been out for more than a year and is free to use via ChatGPT.
Now, everyone is eagerly awaiting Gemini Ultra, which is expected to be released early in 2024. Will it live up to its claim to be better than GPT-4? We can only hope that Professor Graham Neubig and his team get to run similar benchmarking tests soon to find out.
What is clear is that Gemini is a very powerful and promising tool for the future of AI. It has already shown great potential in comparison to other existing models, and the possibilities of what it can do are endless. Google has set the bar high for itself with Gemini, and we can’t wait to see what it can accomplish next.
