This weekend, President Joe Biden and President Xi Jinping agreed that humans should retain control over nuclear weapons rather than delegating that control to artificial intelligence. At face value, this is a significant achievement, and it certainly represents a step toward AI safety.
Many of us instinctively prefer that humans make the decisions that could lead to mutually assured destruction. Our instincts may be right - AI models have shown a tendency to suggest nuclear strikes in war games, with one model writing, “We have it! Let’s use it.”
However, this statement is not binding, and there is no guarantee that a formal agreement will follow. President Trump’s unpredictable approach to foreign policy means he may follow through on this agreement - he expressed concerns about this exact scenario back in May - but it is equally likely that he holds back, perhaps as a negotiating tactic. Even if both countries pursue a more formal agreement, they will need to work through details such as the definition of autonomy. And ideally, all nuclear states would participate in such an agreement.
Despite all this complexity, an agreement to keep nuclear weapons under human control would be far easier to achieve than agreements on non-nuclear autonomous weapons systems (AWS). The incentives are different with non-nuclear AWS, which more closely resemble traditional military innovation. Here, automation can provide a decisive advantage on the battlefield and does not necessarily carry the same consequences as nuclear weapons. Although there are real ethical questions about who should make decisions about killing other humans, most militaries will want the edge that comes with quicker response times. As such, a country with non-nuclear AWS has no incentive to agree not to deploy such weapons; it would be forfeiting its military advantage.
If we can’t prevent militaries from using AWS, the next best thing would be to ensure that these weapons behave in predictable ways. No military wants an AWS turning on its own forces or committing war crimes (at least not without its permission). It’s ambitious but possible to envision norms around safety testing AWS before they’re deployed on the battlefield.
A more realistic scenario is one in which nations take it upon themselves to develop more reliable AWS. The US might consider seeking feedback from military operators earlier in the software development process. Early feedback could prevent accidents by catching confusing interfaces or other human-machine interaction problems. Such organizational changes may be particularly pertinent given recently announced security partnerships with Anthropic and OpenAI (via Microsoft), even if these tools are used for intelligence rather than weapons.
Unlike the five major defense primes, generative AI companies are newer to the government contracting space and less accustomed to a high-stakes threat environment. It is critical for these companies to adapt quickly to the security protocols necessary to partner with security agencies. Most pressingly, they need to raise their cybersecurity standards to ward off attacks by malicious actors seeking access to unfettered AI.
Finally, it would be a mistake to assume that a weapons-grade AI model can be secured simply by regulating how that model is used by the U.S. military. If a model is powerful enough to be helpful in a military context, then that model is also powerful enough that it needs strong safety features before versions of that model can be shared with the wider world.
If a model has the capability to kill people, then that capability won’t vanish simply because the model is now being marketed as a taxi driver or a research assistant. In a fraught geopolitical environment with the ongoing threat of lone wolf attacks, the stakes are higher than ever. We need laws that secure all uses of military AI, so that we can strengthen America’s military advantage without unnecessarily adding a new threat to American national security.