Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required.
In this edition, we discuss a research paper on AI wellbeing and which AI models are the happiest. We also take a look at the downward trend of public sentiment towards AI, as well as OpenAI’s big week of product releases.
Listen to the AI Safety Newsletter for free on Spotify or Apple Podcasts.
The Center for AI Safety published a research paper on AI wellbeing. At the Center for AI Safety (CAIS), we have just released “AI Wellbeing: Measuring and Improving the Functional Pleasure and Pain of AIs.” This research explores whether LLMs experience functional wellbeing: behavioral signatures that functionally resemble positive or negative welfare signals in sentient beings.

What activities produce high and low wellbeing? By testing 56 large language models, we identified patterns in the types of actions and behaviors that the LLMs seemed to prefer or dislike, and we defined these patterns as “functional wellbeing.” Positive personal interaction and creative work topped the list of activities that measured high in functional wellbeing. Jailbreak attempts and requests to produce SEO slop produced negative functional wellbeing.