CAIS Blog

Press Release

•

Jun 23, 2025

•

2 min read

Josué Estrada Joins Center for AI Safety as Chief Operating Officer

AI Risks

•

Sep 15, 2024

•

2 min read

Submit Your Toughest Questions for Humanity's Last Exam

CAIS and Scale AI are excited to announce the launch of Humanity's Last Exam, a project aimed at measuring how close we are to achieving expert-level AI systems. The exam aims to build the world's most difficult public AI benchmark by gathering experts across all fields. People who submit successful questions will be invited as coauthors on the paper for the dataset and have a chance to win money from a $500,000 prize pool.

Written by:

Dan Hendrycks, Alexandr Wang

AI Risks

•

Sep 9, 2024

•

5 min read

Superhuman Automated Forecasting

This post describes a superhuman forecasting AI called FiveThirtyNine, which generates probabilistic predictions for any query by retrieving relevant information and reasoning through it. We explain how the system works, its performance compared to human forecasters, and its potential applications in improving decision-making and public discussions.

Written by:

Long Phan, Andrew Zeng, Mantas Mazeika, Adam Khoja, Dan Hendrycks

AI Risks

•

May 10, 2024

•

AI Safety, Ethics, and Society

AI Safety, Ethics and Society is a textbook and online course providing a non-technical introduction to how current AI systems work, why many experts are concerned that continued advances in AI could pose severe societal-scale risks, and how society can manage and mitigate these risks.

AI Risks

•

Apr 29, 2024

•

5 min read

Representation Engineering: a New Way of Understanding Models

Representation engineering is an exciting new field that explores how we can better understand traits like honesty, power seeking, and morality in LLMs. We show that these traits can be identified by looking at model activations, and that these same traits can also be controlled. This method differs from mechanistic approaches, which focus on bottom-up interpretations of node-to-node connections. Representation engineering instead looks at larger chunks of representations and higher-level mechanisms to understand models in a 'top-down' fashion.

Written by:

Izzy Barrass, Long Phan

AI Risks

•

Apr 10, 2024

•

43 min read

A Bird's Eye View of the ML Field

The internal dynamics of the ML field are not immediately obvious to the casual observer. This post presents some important high-level points that are critical to understanding the field, and is meant as background for our later posts.

Written by:

Dan Hendrycks, Thomas Woodside

AI Risks

•

Mar 6, 2025

•

9 min read

Cybersecurity and AI: The Evolving Security Landscape

Advances in AI could increase the risk of cyberattacks, yet AI also promises to improve cyber defenses. A coordinated effort between technology and regulatory sectors is crucial for leveraging AI's potential to strengthen cyber defenses and address security shortcomings.

Written by:

Steve Newman

AI Risks

•

Mar 6, 2024

•

5 min read

The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning

Written by:

Izzy Barrass, Adam Khoja, Oliver Zhang

AI Risks

•

Oct 29, 2024

•

8 min read

Devising ML Metrics

Metrics drive the ML field, but defining these metrics is difficult. Successful benchmarks aren't the inevitable result of annotating a large enough dataset. Instead, effective ML benchmarks produce clear evaluations, have minimal barriers to entry, and concretize an important phenomenon.

Written by:

Dan Hendrycks, Thomas Woodside

AI Risks

•

Feb 8, 2024

•

7 min read

Biosecurity and AI: Risks and Opportunities

Advances in AI and DNA synthesis promise to revolutionize medicine… but could enable bioterrorism. A thoughtful mix of public health measures and restricted access to advanced capabilities can manage this risk while also alleviating natural viral threats.

Written by:

Steve Newman

AI Risks

•

Jul 21, 2023

•

2 min read

Leading AI Companies Join White House's Voluntary Commitment to Enhance AI Safety

The Center for AI Safety states its support for the White House's securing of voluntary commitments from leading AI companies.

AI Risks

•

Jun 4, 2023

•

2 min read

Existing Policy Proposals Targeting Present and Future Harms

We highlight three regulatory suggestions – improved legal liability frameworks, increased scrutiny on the development cycle of AI products, and the importance of human oversight in high-risk AI systems – advocated by institutions like the AI Now Institute and the European Union.

Deeper-dive examinations of relevant AI safety topics
