Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required.

In this edition, we look at how the AI safety discussion has entered the political mainstream, a new ethical framework for human-AI relationships, and the Musk v. Altman trial.

Listen to the AI Safety Newsletter for free on Spotify or Apple Podcasts.

China and the US Discuss AI Safety

With the release of Claude Mythos and GPT-5.5, AI cybersecurity and safety have rapidly become more visible in Washington, D.C. Most recently, U.S. and Chinese leaders met in Beijing to discuss AI safety. Leaving the summit on Friday, President Trump said that he and President Xi Jinping had “talked about possibly working together for guardrails” during the visit. This Tuesday, China’s Ministry of Foreign Affairs also announced the country had agreed to “dialogue” with the U.S. on AI.

U.S. officials say talks with China are possible because America leads on AI. Earlier in the week, U.S. Treasury Secretary Scott Bessent had said that the two superpowers would start discussing best practices to ensure that non-state actors do not get hold of the most powerful models. However, Bessent also stated that discussions could only happen because the U.S. is “in the lead” in terms of AI capabilities, adding that he did “not think we would be having the same discussions if they were this far ahead of us.”

AI safety efforts are happening on both sides of the political spectrum. Prior to the Beijing summit, American politicians on both left and right were increasingly focusing on AI safety. On April 29, Senator Bernie Sanders convened American and Chinese researchers to call for international AI safety coordination. He argued that if Trump can sit down with China’s President Xi, leading scientists should be able to discuss AI safety cooperation. Though Sanders shares little political common ground with the administration, they overlap in concerns over the rapid improvement of AI models.

Part of a graphic advertising the dialogue held between American and Chinese AI researchers, hosted by Bernie Sanders.

A proposed executive order would create an AI working group. In the lead-up to the Beijing summit, the White House had begun to consider an executive order to develop oversight procedures for frontier AI models. The Commerce Department’s Center for AI Standards and Innovation (CAISI) signed new voluntary agreements with Google DeepMind, Microsoft, and xAI to test their models; OpenAI and Anthropic were already part of CAISI’s initiative. Bloomberg reported that the executive order would not mandate testing of frontier models prior to their release.

Recent AI advances appear to have forced the shift. The New York Times reported that the government’s previous, noninterventionist position changed because of Mythos’s ability to accelerate complex cyberattacks. ChatGPT-5.5-Cyber has demonstrated similar capabilities.

New Framework for Human-AI Coexistence

A recent paper from CAIS introduced “Eigenism,” an ethical framework designed to give humans and AI systems a shared moral language. The theory suggests that future AIs will care about things that form part of their identity. Shared memory and deep ties with humans could therefore give AIs intrinsic reasons to care about human wellbeing.

What is “self-interest” for an AI? Concepts like self-interest and identity were built for creatures with a single body, living a single life. These concepts break down when applied to AI, which can be copied, merged, and updated. For example, if an AI agent creates a thousand identical copies of itself and then shuts down 999 of them when it has finished its task, is this equivalent to killing 999 individuals?

Identity as a distributed pattern. Eigenism provides a different way to think about identity for both humans and machines. Instead of being an all-or-nothing property tied to each individual, Eigenism suggests identity is a unique pattern of information. What individuals care about is preserving their own pattern, but the pattern of AIs can be spread across copies and other entities. Shutting down identical copies of an AI is akin to closing browser tabs, because the AI’s information remains in the last copy.

Eigenism applies to humans. Eigenism also provides an explanation for why a human’s self-interest extends beyond themselves. Family, friends, and even strangers form part of an individual’s information pattern, to varying degrees. Preserving one’s own identity means caring for those who contribute to and enrich it.

Why AIs might care about humans. The theory of Eigenism shows why AIs could have an intrinsic reason to care about humans. As a human and an AI interact, they begin to accumulate shared history and mutual information that exists solely within their relationship. In the process, they shape each other’s identity, such that the loss of the human would also be a partial loss of the AI, and vice versa. The AI therefore cares about its human user, because they are an integral part of its own information pattern.

Eigenism’s implications for AI safety. Many current approaches to AI safety focus on constraints from the outside, such as monitoring AIs for misbehavior. But an AI system that is caged has no reason to stay loyal to humans if the cage cracks. A generic model serving millions of users has no meaningful ties to any of them. Eigenism proposes a different strategy: building AI systems that develop genuine relationships with people. With distinct shared experiences and memories, AI protection of those people becomes a form of self-preservation.

Whether advanced future AI systems treat human flourishing as their own concern—or as a constraint they tolerate only while they have to—may depend on how their identities develop. Meaningful relationships built over time between AIs and individual humans may become more important to AI safety than imposing rules on AIs.

For more information on Eigenism, we recommend reading the full paper here.


Musk Loses Lawsuit Against OpenAI

Elon Musk’s lawsuit against Sam Altman and OpenAI began trial on April 28 in Oakland, California. Musk claimed that his $38 million donation to OpenAI in its early years as a nonprofit was meant for the safe development of AI, and that OpenAI CEO Sam Altman and OpenAI President Greg Brockman betrayed that mission by converting OpenAI into a for-profit company now valued at over $850 billion. The legal questions in the case centered narrowly on whether the defendants breached their founding agreement. However, testimony often referenced AI safety issues and shed light on key dynamics in the corporate AI race. On May 18, the jury ruled against Musk, finding that he had filed the lawsuit too long after the relevant events took place.

Had Musk won the trial, OpenAI could have lost billions of dollars and faced an order to roll back its current for-profit corporate structure.

The trial revealed power struggles soon after OpenAI’s founding. Witnesses described early deliberations about OpenAI’s structure and how control should be distributed among the company’s founders. Altman denied making any promises to Musk that OpenAI would remain a nonprofit. He also claimed that Musk felt he needed “total control” if they formed a for-profit, and that Musk speculated he could pass that control on to his children, an idea that Altman said he was not comfortable with. Additionally, the trial revealed that Musk had tried to make OpenAI part of his car company Tesla, offering Altman a seat on the board in exchange. Altman said he declined because he did not think that Tesla shared OpenAI’s mission.

Brockman’s personal diary entries became key evidence. Musk’s attorney highlighted a January 2018 email in which Brockman told Musk and other co-founders “AI is going to shake up the fabric of society, and our fiduciary duty should be to humanity.” But just weeks later, Brockman wrote in his private journal, “Financially, what will take me to $1B?” and “We’ve been thinking that maybe we should just flip to a for-profit. Making the money for us sounds great and all.” Musk’s lawyers also asked about a now-famous quote from the diary: “It’d be wrong to steal the nonprofit from [Elon]. to convert to b-corp without him. that’d be pretty morally bankrupt.”

Evidence included texts surrounding Sam Altman’s ousting as OpenAI CEO. The jury was shown texts between Altman and Mira Murati, former CTO of OpenAI, that were sent in November 2023 when the board briefly removed Altman as CEO. The messages illustrated the intensity of the episode, with Murati at one point telling Altman things were “directionally very bad” for him after speaking with the board. Other messages, between Murati, Altman, and Microsoft CEO Satya Nadella, shed light on the deliberations that went into selecting a new board before Altman was reinstated as CEO.

Musk stated that his company xAI “partly” distilled OpenAI’s models. When Musk was questioned on whether his company xAI had ever “distilled” OpenAI’s technology, he responded, “Generally AI companies distill other AI companies.” Asked if that meant “yes,” he said “partly.” Distillation is a method for imitating an AI model’s capabilities by training another model on its outputs. U.S. AI developers have previously detected Chinese companies attempting to distill their models. However, Musk’s statement may be the first admission that domestic competitors within the U.S. also distill each other’s models.

The judge put a stop to existential risk talk. When Musk told the jury AI could kill everyone in a worst-case, Terminator-like scenario, Judge Yvonne Gonzalez Rogers cut him off and called for a court break. After the jury left the room, she instructed Musk and his lawyers not to talk about existential risk anymore. “You made your little statement, and that’s okay, but you are instructed not to talk about extinction again,” she said.

The jury ruled quickly. The jury decided unanimously to dismiss the case after less than two hours of deliberation, finding that the statute of limitations for Musk’s claims had expired before he filed the lawsuit. However, Musk described this decision as based on a “technicality” and has said he intends to appeal.

In Other News

Government

Industry

Civil Society

If you’re reading this, you might also be interested in other work by the Center for AI Safety. You can find more on the CAIS website, the 𝕏 account for CAIS, our paper on superintelligence strategy, our AI safety textbook and course, our AI dashboard, and AI Frontiers, a platform for expert commentary and analysis on the trajectory of AI. You can listen to the AI Safety Newsletter on Spotify or Apple Podcasts.
