Google's Gemini AI is better than ChatGPT at certain tasks, benchmarks suggest

Google's Gemini AI is better than OpenAI's ChatGPT at certain tasks, such as answering questions and generating creative content, according to benchmarks published by Google. Gemini is Google's next-generation AI model, which powers its Bard chatbot and other products. It is a large language model (LLM) similar to OpenAI's GPT-4, the model that powers ChatGPT.

In benchmarks shared Thursday, Google said Gemini achieved state-of-the-art performance on more than 40 natural language processing (NLP) tasks. It compared Gemini's performance with OpenAI's GPT-4, Anthropic's Claude and Meta's Llama. Gemini outperformed ChatGPT when evaluated against "specialized NLP tasks," such as solving math problems and answering medical questions, according to Google. However, Gemini lagged behind ChatGPT in a few areas, such as generating code, the company said.

Gemini performed best when using reasoning to solve problems. For example, it was better than all other AI models at solving math problems and answering questions about a given prompt, according to Google. Gemini also performed well on "value alignment," a test that measures how well an AI model aligns with a user's values and preferences. Google's AI model topped the list, followed by Anthropic's Claude.

Gemini was weakest at generating code, a test that measures how well an AI model can write programming code. OpenAI's GPT-4 topped the list by a wide margin, followed by Meta's Llama. Gemini ranked fifth on this test.

The benchmarks were created by Google and were reviewed by external researchers. A Google spokesperson told CNBC that the company used a peer-review process similar to that used in academia to test Gemini and other AI models.

Gemini is a key component of Google's AI strategy, powering Bard and other applications. It is considered the company's next-generation AI model, succeeding its PaLM 2 model. The Gemini model will also be used to power AI features on Google Maps, Google Cloud, Google Search and YouTube, the company said in December.

External developers are using Gemini too. DocuSign, for example, is using Gemini in its contract analysis tool. Several startups and venture capital firms are also integrating Google's AI model into their own applications.

The benchmarks come as the AI industry continues to grow rapidly, with companies such as Google, Microsoft, Meta, Nvidia and Anthropic investing billions of dollars in AI research and development. Gemini's results are the latest in a series of benchmark claims for AI models released in recent months, as AI companies race to show that their models outperform rivals.
