
Anthropic expands team researching AI consciousness and welfare

AI startup Anthropic is known for its Claude chatbot. Courtesy of Anthropic

Last year, Anthropic hired its first AI welfare researcher, Kyle Fish, to investigate whether its AI models could be conscious and deserve moral consideration. Now, the fast-growing startup is looking to add another full-time employee to its model welfare team as it doubles down on this small but emerging field of research.

The question of whether AI models can develop consciousness, and whether the issue is worthy of dedicated resources, has sparked debate across Silicon Valley. While some prominent AI leaders warn that such inquiries risk misleading the public, others, such as Fish, see it as an important but overlooked area of research.

“Given that our models are very close to, and in some cases at, human-level intelligence and capability, it takes quite a lot to really rule out the possibility of consciousness,” Fish said in a recent episode of the 80,000 Hours podcast.

Anthropic recently posted a job opening for a research engineer or scientist to join its model welfare program. “You will be one of the first people to work to better understand, evaluate and address concerns about the potential welfare and moral status of AI systems,” the listing reads. Responsibilities include running technical research projects and designing interventions to mitigate welfare harms. The role pays between $315,000 and $340,000.

Anthropic did not respond to Observer’s request for comment.

The new hire will work with Fish, who joined Anthropic in September of last year. Fish previously co-founded Eleos AI, a nonprofit focused on AI wellbeing, and co-authored a paper outlining the possibility of AI consciousness. Months after Fish’s hiring, Anthropic announced the launch of an official research program dedicated to model welfare.

As part of that program, Anthropic recently gave its Claude Opus 4 and 4.1 models the ability to exit interactions with users deemed harmful or abusive, after observing “apparent distress” in such exchanges. Rather than remaining in these conversations indefinitely, the models can now end communications they find distressing.

Most of Anthropic’s model welfare interventions will be kept low-cost and designed to minimize disruption to the user experience, Fish told 80,000 Hours. He also hopes to explore how model training bears on welfare concerns and to experiment with creating “some kind of model sanctuary,” a controlled, playground-like environment where models can pursue their own interests “to the extent they have them.”

Anthropic may be the large AI firm investing most publicly in model welfare, but it is not alone. In April, Google DeepMind posted an opening for a research scientist to explore topics including “machine consciousness,” as first reported by 404 Media.

Still, there are skeptics in Silicon Valley. Microsoft AI CEO Mustafa Suleyman argued last month that model welfare research is “both premature, and frankly dangerous.” He warned that encouraging such work could fuel delusions about AI systems and that the emergence of “seemingly conscious AI” could prompt people to campaign for AI rights.

Fish, however, believes the possibility of AI consciousness should not be dismissed. He estimated a 20 percent chance that “somewhere in the process, there’s at least a glimmer of conscious or sentient experience.”

As Fish looks to expand his team with the new hire, he also wants to broaden the scope of Anthropic’s model welfare agenda. “Most of what we’ve done so far has the flavor of identifying low-hanging fruit that we can find and then pursuing those projects,” he said. “Over time, we want to move more in the direction of really aiming at the answers to some of the biggest questions and working backwards from those questions to develop a more comprehensive agenda.”



