
At Google I/O, the AI is never wrong

This year, Google I/O 2025 had one focus: artificial intelligence.

We’ve covered all the biggest news to come out of the annual developer conference: a new AI video generation tool called Flow, a $250 AI Ultra subscription plan, new updates to Gemini, a virtual try-on shopping feature, and, crucially, the rollout of the AI Mode search tool to all users in the United States.

But across nearly two hours of Google executives talking about AI, one word we didn’t hear was “hallucination.”

Hallucinations remain one of the most stubborn and concerning problems with AI models. The term refers to the invented facts and inaccuracies that large language models “hallucinate” in their replies. And according to the big AI brands’ own metrics, hallucinations are getting worse, with some models hallucinating more than 40 percent of the time.

But if you were watching Google I/O 2025, you’d never know this problem exists. You’d think models like Gemini never hallucinate, and you’d certainly be surprised to see the warning appended to every Google AI Overview. (“AI responses may include mistakes.”)


The closest Google came to acknowledging the hallucination problem was in a segment on AI Mode and Gemini’s Deep Search capabilities. We were told the model would check its own work before delivering an answer, but without more detail, that sounds more like the blind leading the blind than genuine fact-checking.

For AI skeptics, Silicon Valley’s confidence in these tools seems divorced from the actual results. Real users notice when AI tools fail at simple tasks like counting, spell-checking, or answering questions like “Will water freeze at 27 degrees Fahrenheit?”

Google was eager to remind viewers that its newest AI model, Gemini 2.5 Pro, sits atop many AI leaderboards. But when it comes to truthfulness and the ability to answer simple questions, AI chatbots are graded on a curve.

Gemini 2.5 Pro is Google’s most intelligent AI model (according to Google), yet it scores just 52.9 percent on the SimpleQA benchmark. According to OpenAI’s research paper, SimpleQA is “a benchmark that evaluates the ability of language models to answer short, fact-seeking questions.” (Emphasis ours.)

A Google representative declined to discuss the SimpleQA benchmark or hallucinations in general, but did point us to Google’s official explainer on AI Mode and AI Overviews. Here’s what it says:

[AI Mode] uses a large language model to help answer queries, and it is possible that, in rare cases, it may sometimes confidently present information that is inaccurate, which is commonly known as “hallucination.” As with AI Overviews, in some cases this experiment may misinterpret web content or miss context, as can happen with any automated system in Search…

We also use novel approaches with the model’s reasoning capabilities to improve factuality. For example, in collaboration with Google DeepMind research teams, we use agentic reinforcement learning (RL) in our custom training to reward the model to generate statements it knows are more likely to be accurate (not hallucinated) and also supported by inputs.

Is Google wrong to stay optimistic? Hallucinations haven’t been proven unsolvable, after all. But it seems increasingly clear from the research that LLM hallucinations are not a solved problem right now.

That hasn’t stopped companies like Google and OpenAI from charging into the era of AI search, and that is likely to be an error-filled era, unless we’re the ones hallucinating.

