Updates
Read about our past and present projects.
Benchmarking Competition

"If you can't measure it, you can't improve it." ML Safety lacks good benchmarks. In an effort to change that, we are offering up to $500,000 in prizes for benchmark ideas (or research papers, but a write up of your idea is sufficient). For more details, visit https://benchmarking.mlsafety.org.

November 25, 2022
ML Forecasting Competition

We are awarding $625,000 in prizes for ML models that can accurately forecast events. From predicting how COVID-19 will spread to anticipating geopolitical conflicts, using ML to help inform decision-makers could have far-reaching positive effects on the world. View more details here.

September 16, 2022
Philosophy Fellowship

We are excited to announce a 7-month fellowship for philosophy graduate students and postdoctoral researchers to engage with conceptual and ethical problems in AI Safety. Over the course of the program, researchers will attend seminars and guest lectures, work closely with advisors, and conduct independent research.

August 24, 2022
ML Safety Scholars

We ran a course designed to introduce students to the most relevant concepts in empirical ML-based AI safety. Course materials are available here. We plan to run a similar program in the fall.

August 18, 2022
Moral Uncertainty Competition

The objective of the competition is to train language models to detect whether a decision is morally ambiguous or clear-cut. We would like machine learning models to indicate when they are unsure what to do so that they can be overridden. This is especially important in ethical dilemmas, since there is often no consensus about what ought to be done.

August 17, 2022
The NeurIPS Trojan Detection Challenge

We're releasing the Trojan Detection Challenge, a NeurIPS 2022 competition with a $50K prize pool. This competition challenges contestants to detect and analyze Trojan attacks on deep neural networks, where the attacks are designed to be difficult to detect. The goal of the competition is to study the fundamental offense-defense balance of Trojan detection: How hard is it to detect hidden functionality that is trying to stay hidden?

July 14, 2022
NeurIPS 2022 Workshop

We are excited to announce the NeurIPS 2022 ML Safety workshop, which will bring together researchers from machine learning communities to focus on Robustness, Monitoring, Alignment, and Systemic Safety. $100K in prizes will be awarded. There will be 'Best Paper' awards and 'Best X-risk Analysis' awards. The ultimate goal of this workshop is to support the research community that is tackling AI safety issues and encourage more researchers to think about tail risks.

July 14, 2022