This week, the White House announced that it had secured “voluntary commitments” from seven leading A.I. companies to manage the risks posed by artificial intelligence.
Getting the companies — Amazon, Anthropic, Google, Inflection, Meta, Microsoft and OpenAI — to agree to anything is a step forward. They include bitter rivals with subtle but important differences in the ways they’re approaching A.I. research and development.
Meta, for example, is so eager to get its A.I. models into developers’ hands that it has open-sourced many of them, putting their code out into the open for anyone to use. Other labs, such as Anthropic, have taken a more cautious approach, releasing their technology in more limited ways.
But what do these commitments actually mean? And are they likely to change much about how A.I. companies operate, given that they aren’t backed by the force of law?
Given the potential stakes of A.I. regulation, the details matter. So let’s take a closer look at what’s being agreed to here and size up the potential impact.
Commitment 1: The companies commit to internal and external security testing of their A.I. systems before their release.
Each of these A.I. companies already does security testing — what is often called “red-teaming” — of its models before they’re released. On one level, this isn’t really a new commitment. And it’s a vague promise. It doesn’t come with many details about what kind of testing is required, or who will do the testing.
In a statement accompanying the commitments, the White House said only that testing of A.I. models “will be carried out in part by independent experts” and focus on A.I. risks “such as biosecurity and cybersecurity, as well as its broader societal effects.”
It’s a good idea to get A.I. companies to publicly commit to continue doing this kind of testing, and to encourage more transparency in the testing process. And there are some types of A.I. risk — such as the danger that A.I. models could be used to develop bioweapons — that government and military officials are probably better suited than companies to evaluate.
I’d love to see the A.I. industry agree on a standard battery of safety tests, such as the “autonomous replication” tests that the Alignment Research Center conducts on prerelease models from OpenAI and Anthropic. I’d also like to see the federal government fund these kinds of tests, which can be expensive and require engineers with significant technical expertise. Right now, many safety tests are funded and overseen by the companies, which raises obvious conflict-of-interest questions.
Commitment 2: The companies commit to sharing information across the industry and with governments, civil society and academia on managing A.I. risks.
This commitment is also a bit vague. Several of these companies already publish information about their A.I. models — typically in academic papers or corporate blog posts. A few of them, including OpenAI and Anthropic, also publish documents called “system cards,” which outline the steps they’ve taken to make those models safer.
But they have also held back information on occasion, citing safety concerns. When OpenAI released its latest A.I. model, GPT-4, this year, it broke with industry customs and chose not to disclose how much data it was trained on, or how big the model was (a measure known as its “parameter count”). It said it declined to release this information because of concerns about competition and safety. It also happens to be the kind of data that tech companies like to keep away from competitors.
Under these new commitments, will A.I. companies be compelled to make that kind of information public? What if doing so risks accelerating the A.I. arms race?
I suspect that the White House’s goal is less about forcing companies to disclose their parameter counts and more about encouraging them to trade information with one another about the risks that their models do (or don’t) pose.
But even that kind of information-sharing can be risky. If Google’s A.I. team prevented a new model from being used to engineer a deadly bioweapon during prerelease testing, should it share that information outside Google? Would that risk giving bad actors ideas about how they might get a less guarded model to perform the same task?
Commitment 3: The companies commit to investing in cybersecurity and insider-threat safeguards to protect proprietary and unreleased model weights.
This one is pretty straightforward, and uncontroversial among the A.I. insiders I’ve talked to. “Model weights” is a technical term for the mathematical instructions that give A.I. models the ability to function. Weights are what you’d want to steal if you were an agent of a foreign government (or a rival corporation) who wanted to build your own version of ChatGPT or another A.I. product. And it’s something A.I. companies have a vested interest in keeping tightly controlled.
There have already been well-publicized issues with model weights leaking. The weights for Meta’s original LLaMA language model, for example, were leaked on 4chan and other websites just days after the model was publicly released. Given the risks of more leaks — and the interest that other nations may have in stealing this technology from U.S. companies — asking A.I. companies to invest more in their own security feels like a no-brainer.
Commitment 4: The companies commit to facilitating third-party discovery and reporting of vulnerabilities in their A.I. systems.
I’m not really sure what this means. Every A.I. company has discovered vulnerabilities in its models after releasing them, usually because users try to do bad things with the models or circumvent their guardrails (a practice known as “jailbreaking”) in ways the companies hadn’t foreseen.
The White House’s commitment calls for companies to establish a “robust reporting mechanism” for these vulnerabilities, but it’s not clear what that might mean. An in-app feedback button, similar to the ones that allow Facebook and Twitter users to report rule-violating posts? A bug bounty program, like the one OpenAI started this year to reward users who find flaws in its systems? Something else? We’ll have to wait for more details.
Commitment 5: The companies commit to developing robust technical mechanisms to ensure that users know when content is A.I. generated, such as a watermarking system.
This is an interesting idea but leaves a lot of room for interpretation. So far, A.I. companies have struggled to devise tools that allow people to tell whether or not they’re looking at A.I.-generated content. There are good technical reasons for this, but it’s a real problem when people can pass off A.I.-generated work as their own. (Ask any high school teacher.) And many of the tools currently promoted as being able to detect A.I. outputs really can’t do so with any degree of accuracy.
I’m not optimistic that this problem is fully fixable. But I’m glad that companies are pledging to work on it.
Commitment 6: The companies commit to publicly reporting their A.I. systems’ capabilities, limitations, and areas of appropriate and inappropriate use.
Another sensible-sounding pledge with lots of wiggle room. How often will companies be required to report on their systems’ capabilities and limitations? How detailed will that information have to be? And given that many of the companies building A.I. systems have been surprised by their own systems’ capabilities after the fact, how well can they really be expected to describe them in advance?
Commitment 7: The companies commit to prioritizing research on the societal risks that A.I. systems can pose, including on avoiding harmful bias and discrimination and protecting privacy.
Committing to “prioritizing research” is about as fuzzy as a commitment gets. Still, I’m sure this commitment will be received well by many in the A.I. ethics crowd, who want A.I. companies to prioritize preventing near-term harms like bias and discrimination rather than worrying about doomsday scenarios, as the A.I. safety folks do.
If you’re confused by the difference between “A.I. ethics” and “A.I. safety,” just know that there are two warring factions within the A.I. research community, each of which thinks the other is focused on preventing the wrong kinds of harms.
Commitment 8: The companies commit to develop and deploy advanced A.I. systems to help address society’s greatest challenges.
I don’t think many people would argue that advanced A.I. should not be used to help address society’s greatest challenges. The White House lists “cancer prevention” and “mitigating climate change” as two of the areas where it would like A.I. companies to focus their efforts, and it will get no disagreement from me there.
What makes this goal somewhat complicated, though, is that in A.I. research, what starts off looking frivolous often turns out to have more serious implications. Some of the technology that went into DeepMind’s AlphaGo — an A.I. system that was trained to play the board game Go — turned out to be useful in predicting the three-dimensional structures of proteins, a major discovery that boosted basic scientific research.
Overall, the White House’s deal with A.I. companies seems more symbolic than substantive. There is no enforcement mechanism to make sure companies follow these commitments, and many of them reflect precautions that A.I. companies are already taking.
Still, it’s a reasonable first step. And agreeing to follow these rules shows that the A.I. companies have learned from the failures of earlier tech companies, which waited to engage with the government until they got into trouble. In Washington, at least where tech regulation is concerned, it pays to show up early.