The Preparedness project provides code for evaluating AI models, built on nanoeval and alcatraz. It currently includes the PaperBench eval, with SWELancer, MLE-bench, and SWE-bench releases coming soon.