Researchers say generative AI's accuracy problems won't disappear anytime soon.

Generative AI chatbots are known for making plenty of mistakes. Hopefully, you didn't follow Google's AI suggestions to add glue to your pizza recipe or to eat a rock or two a day to stay healthy.
These errors are known as hallucinations: essentially, things the model makes up. Will the technology get better? Even the researchers who study AI aren't optimistic that it will happen soon.
That's one of the findings of the 2025 Presidential Panel on the Future of AI Research, a report released this month by the Association for the Advancement of Artificial Intelligence (AAAI). Along with convening the expert panel, the group surveyed more than 400 of the association's members.
This panel of academic and industry experts seems more skeptical than the hype you might hear from developers, who suggest human-level AI is only a few years (or months, depending on who you ask) away. Trustworthiness involves more than getting facts right and avoiding bizarre mistakes: if developers are going to produce a model that can match or exceed human intelligence, so-called artificial general intelligence, the reliability of AI tools will need to improve dramatically. And the researchers seem to think improvements at that scale are unlikely to happen soon.
"We tend to be a little cautious and not believe something until it actually works," said Vincent Conitzer, a professor of computer science at Carnegie Mellon University and one of the panelists.
AI has developed rapidly in recent years
The goal of the report is to support research into artificial intelligence that produces technology that helps people, AAAI president Francesca Rossi wrote in its introduction. Issues of trust and reliability are serious, not just in providing accurate information but in avoiding bias and ensuring that future AI doesn't cause severe unintended consequences. "We all need to work together to advance AI in a responsible way, to ensure that technological advancement supports human progress and aligns with human values," she wrote.
Conitzer said the pace of AI development, especially since OpenAI launched ChatGPT in 2022, has been remarkable. "It's astounding in some ways, and many of these technologies work much better than most of us thought they would," he said.
John Thickstun, an assistant professor of computer science at Cornell University, told me there are some corners of AI research where "the hype does have merit." That's especially true in math or science, where users can check a model's results.
"This technology is amazing," Thickstun said. "I've been working in this field for more than a decade, and it's shocked me how good it's gotten and how fast it's gotten good."
Experts say that despite these improvements, there are still major issues worthy of research and consideration.
Will chatbots start getting the facts straight?
Despite progress in making the information that comes out of generative AI models more trustworthy, more work is needed. A recent report from the Columbia Journalism Review found that chatbots were unlikely to decline to answer questions they couldn't answer accurately, were confident about the wrong information they provided, and made up (and provided fabricated links to) sources to back up those wrong assertions.
Improving reliability and accuracy may be "the largest area of AI research today," the AAAI report says.
The researchers point to three main ways to boost the accuracy of AI systems: fine-tuning, such as reinforcement learning from human feedback; retrieval-augmented generation, in which the system gathers specific documents and pulls its answer from them; and chain-of-thought, which breaks a question down into smaller steps that can be checked for hallucinations.
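To make one of those terms concrete, here is a minimal sketch of the retrieval-augmented generation idea in Python. Everything in it is an illustrative assumption rather than any vendor's API: the keyword-overlap scoring stands in for real embedding-based search, and the finished prompt would be handed to an actual language model.

```python
# A minimal retrieval-augmented generation (RAG) sketch. The keyword-overlap
# scoring is an illustrative stand-in: production systems typically retrieve
# with embeddings and send the assembled prompt to a real language model.

def score(query: str, doc: str) -> int:
    """Crude relevance score: how many query words appear in the document."""
    doc_words = set(doc.lower().split())
    return sum(1 for word in query.lower().split() if word in doc_words)

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents that best match the query."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, sources: list[str]) -> str:
    """Ground the model in retrieved text instead of its own recall."""
    context = "\n".join(f"- {s}" for s in sources)
    return (
        "Answer using ONLY the sources below. "
        "If they do not contain the answer, say so.\n"
        f"Sources:\n{context}\n\nQuestion: {query}"
    )

documents = [
    "AAAI surveyed more than 400 of its members for the report.",
    "Chain-of-thought prompting breaks a problem into smaller, checkable steps.",
    "Large language models can state wrong answers with high confidence.",
]

query = "How many members did AAAI survey?"
prompt = build_prompt(query, retrieve(query, documents))
print(prompt)  # In a real system, this prompt would be sent to the model.
```

The grounding is the point: because the model is told to answer only from the retrieved text, a wrong or missing source becomes a refusal rather than a confident fabrication.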
Will those techniques make your chatbot's responses more accurate anytime soon? Don't count on it: "Factuality is far from solved," the report says. About 60% of respondents expressed doubt that factuality or trustworthiness concerns will be resolved soon.
The generative AI industry has been optimistic that scaling up existing models will make them more accurate and cut down on hallucinations.
"I think that hope was always a bit too optimistic," Thickstun said. "Over the past few years, I haven't seen any evidence that truly accurate, highly factual language models are around the corner."
Conitzer said that despite the error-prone nature of large language models such as Anthropic's Claude or Meta's Llama, users can mistakenly believe they're more accurate because they deliver answers with confidence.
"If we see someone responding confidently or sounding confident, we assume that person knows what they're talking about," he said. "An AI system, though, may just claim to be very confident about something that's complete nonsense."
Lessons for AI users
Understanding the limitations of generative AI is vital to using it well. Thickstun's advice for users of models like ChatGPT and Google's Gemini is simple: "You have to check the results."
He said general-purpose large language models do a poor job of consistently retrieving factual information. If you ask one for something, you should probably follow up by looking up the answer in a search engine (and not relying on the AI summary of the search results). By the time you do that, you might have been better off just doing it that way in the first place.
Thickstun said he uses AI models most for automating tasks that he could do himself and can check for accuracy, such as formatting tables of information or writing code. "The broader principle is that I find these models most useful for automating work that you already know how to do," he said.
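As a deliberately toy illustration of that principle, the sketch below assumes a hypothetical ask_model() helper (stubbed here with a canned answer; in practice it would call a chatbot API), asks it to reformat a small CSV snippet, and then mechanically checks the result before trusting it.

```python
# A toy version of the "automate what you can check" workflow. ask_model() is
# a hypothetical stand-in for a chatbot API call, stubbed with a canned
# answer; the point is the verification step applied to its output.

import csv
import io

def ask_model(prompt: str) -> str:
    """Stand-in for a real model call; returns a canned markdown table."""
    return (
        "| name | score |\n"
        "| --- | --- |\n"
        "| Ada | 92 |\n"
        "| Grace | 88 |\n"
    )

raw_csv = "name,score\nAda,92\nGrace,88\n"
table = ask_model(f"Convert this CSV to a markdown table:\n{raw_csv}")

# Verification: every cell from the source CSV must survive the reformatting.
cells = [cell for row in csv.reader(io.StringIO(raw_csv)) for cell in row]
missing = [cell for cell in cells if cell not in table]
assert not missing, f"Model output dropped values: {missing}"
print(table)
```

The same pattern scales up: generated code can be run against tests you write yourself, and reformatted data can be diffed against its source.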
Is artificial general intelligence around the corner?
A top priority of the AI development industry is an explicit race to create what's often called artificial general intelligence, or AGI: a model that would generally be capable of matching or exceeding the human mind.
The report's survey found strong opinions about the race to AGI. Notably, more than three-quarters of respondents (76%) said scaling up current AI techniques, such as large language models, is unlikely to produce AGI. A significant majority of researchers doubt that the current march toward AGI will work.
An equally large majority believe that systems capable of artificial general intelligence should be publicly owned if they're developed by private entities (82%). That aligns with concerns about the ethics and potential downsides of creating systems that can outperform humans. Most researchers (70%) said they oppose halting AGI research until safety and control systems are developed. "These answers seem to suggest a preference for continued exploration of the topic, within some safeguards," the report says.
The conversation around AGI is complicated, Thickstun said. In some sense, we've already created systems with a form of general intelligence. Large language models such as OpenAI's ChatGPT are capable of doing a variety of human activities, in contrast with older AI models that could do only one thing, such as play chess. The question is whether they can do many things consistently at a human level.
"I think we're very far away from this," Thickstun said.
He said those models lack a built-in concept of truth and the ability to handle truly open-ended creative tasks. "I don't see a path to making that work robustly in human environments with current technology," he said. "I think there is a lot of research progress needed to get there."
Conitzer said the definition of what exactly constitutes AGI is tricky: Often, people mean something that can do most tasks better than a human, but some say it's simply something capable of handling a range of tasks. "A stricter definition is something that would really make us completely redundant," he said.
While researchers are skeptical that AGI is right around the corner, Conitzer cautioned that AI researchers didn't necessarily expect the dramatic technological improvements of the past few years either.
"We didn't see coming how fast it's been happening recently, so you might wonder whether we're going to see it coming if it keeps getting faster and faster," he said.