In the quest to build the hardest benchmark possible, Scale AI and the Center for AI Safety (CAIS) have teamed up to create ...