This week, Meta released their Frontier AI Framework ahead of the upcoming AI Action Summit in France. The document, which outlines their internal approach to assessing and mitigating the risks of frontier models, acknowledges that certain models may be too dangerous to release. This feels like a refreshing concession in contrast with their prolific public relations campaign on the benefits of open-source AI. However, this framework was not released entirely at their discretion: at the AI Seoul Summit, they had already committed to identifying thresholds for “intolerable risk” and providing public transparency about how they would handle them.
Although Meta should be commended for fulfilling their commitments, their framework has some concerning omissions and ambiguities.
First, the framework focuses exclusively on misuse risk. Given that experts predict agentic AI systems will be a major trend in 2025, it is strange that Meta neglects to consider how autonomous agents could lead to catastrophic risks. To their credit, they do consider both state and non-state actors, whereas other policies often limit their perspective to geopolitical rivals.
Second, Meta uses subtle language to narrow their definitions of thresholds. For example, they repeatedly define thresholds in reference to whether a frontier AI “would uniquely enable execution” of threat scenarios. It is unclear how “uniquely” should be interpreted. If another model could also enable execution of those scenarios, does this model no longer uniquely enable them? What if a small group of human experts could also design a bioweapon? Meta’s language leaves it unclear whether models would meet risk thresholds in these situations.
They also state that for a frontier AI to exceed a threshold, it must possess “all of [the] enabling capabilities” required for a threat to eventuate. In other words, if the frontier AI helps with only one critical step in enabling a threat, it does not meet the threshold for certain risk mitigations. By defining their thresholds so narrowly, Meta has given themselves grounds to apply risk mitigations to fewer of their models.
This framework is just one example of why self-governance alone is insufficient for safety. Meta has written a framework whose thresholds are, from the outset, ambiguous and narrow. Furthermore, nothing in the US legally requires them to adhere to their own safety promises. The risk is that when profit and prestige beckon, it is tempting to rush products out the door. To counter those incentives, the US is in dire need of binding, independent AI safety evaluations.