Anthropic Launches the World’s First “Hybrid Reasoning” AI Model

The difference between conventional models and reasoning models is similar to the two types of thinking described by the Nobel Prize-winning economist Daniel Kahneman in his book Thinking, Fast and Slow: fast, instinctive System 1 thinking and slower, more deliberative System 2 thinking.
The kind of model that makes ChatGPT possible, called a large language model, or LLM, generates instantaneous responses to prompts by querying a large neural network. These outputs can be strikingly clever and coherent, but they may fail on questions that require step-by-step reasoning, including simple arithmetic.
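As a rough illustration of that single-pass behavior, here is a minimal sketch of a plain LLM query using Anthropic’s Python SDK; the model snapshot name is an assumption, and the client reads an API key from the ANTHROPIC_API_KEY environment variable:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# A single-pass, System 1 style query: the prompt goes in and the
# answer comes straight back, with no visible intermediate reasoning.
response = client.messages.create(
    model="claude-3-7-sonnet-20250219",  # assumed snapshot name
    max_tokens=512,
    messages=[{"role": "user", "content": "What is 17 * 24?"}],
)

print(response.content[0].text)
```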
An LLM can be made to mimic deliberative reasoning if it is instructed to come up with a plan that it must then follow. This trick is not always reliable, however, and models often struggle with problems that require extensive, careful planning. OpenAI, Google, and now Anthropic are all using a machine learning technique called reinforcement learning to get their latest models to learn to generate reasoning that points toward correct answers. This requires gathering additional training data from humans on how to solve specific problems.
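The plan-then-follow trick is purely a matter of prompting. Here is a hedged sketch, reusing the setup above; the prompt wording is illustrative, not Anthropic’s own technique:

```python
import anthropic

client = anthropic.Anthropic()

# Ask the model to write out a plan first, then follow it. This mimics
# deliberative reasoning without any change to the model itself.
plan_prompt = (
    "First, write a short numbered plan for solving the problem below. "
    "Then follow your plan step by step and state the final answer.\n\n"
    "Problem: A train departs at 9:40 and arrives at 14:05. "
    "How long is the journey?"
)

response = client.messages.create(
    model="claude-3-7-sonnet-20250219",  # assumed snapshot name
    max_tokens=1024,
    messages=[{"role": "user", "content": plan_prompt}],
)

print(response.content[0].text)
```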
Penn says Claude’s reasoning mode received additional training data on business applications, including writing and fixing code, using computers, and answering complex legal questions. “The things we made improvements on are … technical subjects or subjects that require long reasoning,” Penn says. “What we hear from our customers is a lot of interest in deploying the model into their actual workloads.”
Anthropic says Claude 3.7 is particularly good at solving coding problems that require step-by-step reasoning, outscoring OpenAI’s o1 on some benchmarks such as SWE-bench. The company today released a new tool called Claude Code, designed specifically for this kind of AI-assisted coding.
“The model is already good at coding,” Penn says. But “additional thinking would be helpful for situations that might require very complex planning, for example, when you’re looking at a company’s large code base.”
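For developers, the hybrid design shows up as a switch on the ordinary API call. The sketch below uses the `thinking` parameter documented for Claude 3.7 Sonnet to turn on the slower, deliberative pass; the model name, token budget, and prompt are illustrative:

```python
import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-3-7-sonnet-20250219",  # assumed snapshot name
    max_tokens=8192,  # must be larger than the thinking budget below
    # Enable extended thinking and cap how many tokens the model may
    # spend reasoning before it answers.
    thinking={"type": "enabled", "budget_tokens": 4096},
    messages=[{
        "role": "user",
        "content": "Propose a step-by-step refactoring plan for a large legacy codebase.",
    }],
)

# The response interleaves "thinking" blocks (the model's visible
# reasoning) with ordinary "text" blocks (the final answer).
for block in response.content:
    if block.type == "thinking":
        print("[reasoning]", block.thinking)
    elif block.type == "text":
        print("[answer]", block.text)
```

Leaving the `thinking` parameter off yields the fast, instantaneous behavior from the same model, which is what makes the design “hybrid.”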