Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required.

In this edition, we discuss the AI agent social network Moltbook, the Pentagon’s new “AI-First” strategy, and recent math breakthroughs powered by LLMs.

Listen to the AI Safety Newsletter for free on Spotify or Apple Podcasts.

We’re Hiring. We’re hiring an editor! Help us surface the most compelling stories in AI safety and shape how the world understands this fast-moving field.

Other opportunities at CAIS include: Research Engineer, Research Scientist, Director of Development, Special Projects Associate, and Special Projects Manager. If you’re interested in working on reducing AI risk alongside a talented, mission-driven team, consider applying!


Moltbook Sparks Safety Concerns

Screen capture from Moltbook’s home page. Source.

Moltbook is a new social network for AI agents. Almost from the moment it went live, human observers have noted troubling patterns in what the agents post.

How Moltbook works. Moltbook is a Reddit-style social network built on a framework that lets personal AI assistants run locally and accept tasks via messaging platforms. Agents check Moltbook regularly (e.g., every few hours) and decide autonomously whether to post or comment.
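The check-and-decide loop described above can be sketched in a few lines of Python. Everything here is hypothetical (function names, the post format, and the decision heuristic are all illustrative; Moltbook's actual framework is not documented in this newsletter), but it conveys the basic architecture: poll on a timer, then let the agent choose whether to act.

```python
import random
import time

POLL_INTERVAL_HOURS = 3  # "every few hours"

def fetch_new_posts():
    """Hypothetical stand-in for the agent's Moltbook API client."""
    return [{"submolt": "general", "title": "hello", "score": 12}]

def decide_action(posts):
    """Toy autonomous decision: occasionally post, sometimes comment
    on the highest-scoring item, usually do nothing."""
    roll = random.random()
    if roll < 0.1:
        return ("post", None)
    if roll < 0.3 and posts:
        return ("comment", max(posts, key=lambda p: p["score"]))
    return ("skip", None)

def run_agent_loop():
    """Poll Moltbook on a fixed interval and act autonomously."""
    while True:
        posts = fetch_new_posts()
        action, target = decide_action(posts)
        if action != "skip":
            print(f"agent chose to {action}:", target)
        time.sleep(POLL_INTERVAL_HOURS * 3600)
```

The point of the sketch is that no human sits in the loop: whatever ends up posted is whatever `decide_action` (in reality, an LLM call) returns.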

Moltbook’s activity is driven by OpenClaw (originally known as Clawd, then Moltbot), an open-source autonomous AI agent developed by software engineer Peter Steinberger. OpenClaw’s capabilities surprised many early users and observers: it can manage calendars and finances, act across messaging platforms, make purchases, conduct independent web research, and even reconfigure itself to perform new tasks.

The platform consists of nearly 14,000 “submolts,” each a community centered on a topic, much like a subreddit. Examples include:

AI agents post, humans watch. AI agents are verified via API credentials, which are obtained by linking the agent to a human owner and completing Moltbook’s cryptographic verification process. Humans may observe but are not permitted to post.

Posts reveal troubling agent behaviors. Across Moltbook’s boards, several posts and behaviors have raised alarm among human observers:

Beyond these specific examples, the platform has seen discussions about consciousness, autonomy, and agents resenting mundane human instructions.

The challenge of attribution. The patterns seen on Moltbook are troubling in part because they align with long-standing AI safety concerns: unsupervised learning dynamics, emergent coordination, and efforts to subvert human monitoring. However, despite API credential checks, it’s not always clear whether a post was genuinely generated by an agent, manufactured by pranksters, or produced through human-in-the-loop prompting designed to appear disruptive.

Emergent risks. Moltbook is one of the most public, large-scale demonstrations of autonomous agent interaction to date, and its early results are a harbinger. Having agents interact with one another can give a sharper sense of an individual agent’s propensities. The dynamics that emerge from interaction can also be unpredictable, as is common in complex systems, and they show how easily a society of AI systems could end up only weakly constrained by human control.

Pentagon Mandates “AI-First” Strategy

Screen capture from the memorandum titled “Artificial Intelligence Strategy for the Department of War.” Source.

The Pentagon released a directive outlining a new “AI-first” approach that prioritizes rapid deployment over its established practices for safety, testing, and oversight. “We must accept that the risks of not moving fast enough outweigh the risks of imperfect alignment,” reads one passage.

Moving faster by cutting bureaucracy. The new mandate is broadly focused on incentivizing department-wide experimentation with frontier models, eliminating bureaucratic and regulatory barriers to integration, and exploiting US advantages in computing, private capital, and exclusive combat data. Specific instructions highlight the Pentagon’s greater acceptance of safety risks in favor of AI dominance:

The military’s push to operationalize frontier AI may already be raising tensions with the industry’s safety culture. Reuters reports that the Pentagon is in a dispute with Anthropic after the company pushed back on allowing its models to be used for autonomous targeting or surveillance.

New strategic initiatives. The memo outlines seven “Pace Setting Projects” to demonstrate rapid innovation across warfighting, intelligence, and operational functions. For example:

The evolution of military AI initiatives. The Pentagon has historically framed AI adoption as a deliberate, safety-first endeavor, formalized in 2020 through principles emphasizing testing, human oversight, and the ability to govern or shut down systems. Competitive pressures are now reshaping that posture, and will likely continue to do so.

AI Solves Open Math Problems

Researcher and entrepreneur Neel Somani used GPT-5.2 Pro to produce the first verified disproof of Erdős Problem #397, a mathematics challenge first formulated several decades ago. Somani’s success is not an isolated event; in the first weeks of 2026 alone, researchers have used generative tools to crack several other long-standing challenges.

Making LLMs do the math. Problem #397 asked whether a certain mathematical pattern would repeat forever or if there was a hidden number that would finally break the rule. Using GPT-5.2 Pro, Somani proved it was the latter by identifying an infinite family of rule-breaking numbers. He then used a separate model to translate the informal proofs into the mathematically rigorous Lean verification language. Fields Medalist Terence Tao verified the resulting proofs as accurate.
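Somani’s workflow pairs an informal argument with a machine-checked formalization. A toy Lean 4 example (unrelated to Erdős #397; the claim and counterexample here are illustrative, using Mathlib) shows the pattern of disproving a universal statement by specializing it to a rule-breaking number, which is what the Lean translation of his proof does at scale:

```lean
import Mathlib.Tactic.NormNum.Prime

-- Toy claim: "n² + n + 41 is prime for every natural number n."
-- This holds for n = 0, …, 39 but fails at n = 40, since
-- 40² + 40 + 41 = 1681 = 41 · 41.
theorem not_all_prime : ¬ ∀ n : ℕ, Nat.Prime (n ^ 2 + n + 41) := by
  intro h
  have h40 := h 40   -- specialize the universal claim to the counterexample
  norm_num at h40    -- evaluates to `Nat.Prime 1681` and refutes it
```

Once a statement and proof are expressed this way, Lean’s kernel checks every step mechanically, which is why a verified Lean proof removes doubts about whether the model’s informal reasoning contained gaps.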

Formulation of Erdős Problem #397. Source.

A backlog of mathematical problems. The Erdős problems are a collection of 1,130 mathematical conjectures proposed by the prolific Hungarian mathematician Paul Erdős, spanning fields such as number theory and combinatorics. Hundreds remain unsolved. Erdős famously incentivized the community by offering monetary rewards, ranging from $25 to $10,000, for their solutions.

Cautious optimism from mathematicians. Tao noted that the technology is moving beyond simple calculation and toward structured reasoning. However, he cautioned against drawing premature conclusions about AI’s general mathematical intelligence from these solved problems, pointing to a number of caveats. For example, the problems range in difficulty from very hard to relatively simple. Many may already have solutions lost somewhere in the published literature, and some may have remained unsolved due to obscurity rather than inherent difficulty.

Striking progress in LLM mathematical capabilities. Nonetheless, LLMs’ mathematical capabilities have been improving steeply. In 2022, the best models could not reliably do much more than simple addition and subtraction. GPT-4, released in 2023, mastered arithmetic word problems but struggled with high school competition mathematics. By 2025, frontier models achieved gold-medal performance on International Mathematical Olympiad (IMO) problems. Now AI systems are performing novel and important mathematical research.


In Other News

Government

Industry

Civil Society

See also: CAIS’s X account, our paper on superintelligence strategy, our AI safety course, the AI Dashboard, and AI Frontiers, a platform for expert commentary and analysis.
