Date: 2024-04-08

OpenAI has advocated an iterative deployment approach for artificial intelligence (AI) models in its submission to a consultation by the US Department of Commerce’s National Telecommunications and Information Administration (NTIA) on the risks, benefits, and potential policy approaches related to dual-use foundation models for which the model weights are widely available. These are open foundation models that can be fine-tuned by developers using widely available computing resources. Model weights are the numerical parameters within an AI model that help determine its output in response to inputs. These weights are adjusted as the model learns during training.

While OpenAI believes in the promise of the open AI ecosystem, it believes that its own iterative approach has helped it to study and mitigate the risks of its models, in ways that would be impossible if the weights had been released directly. The company explains that when it deployed GPT-2, it carried out a staged release of the model (gradual release of a family of models over time) to give people time to assess the properties of these models and discuss their societal implications. It also gave OpenAI time to evaluate the impacts of the release after each stage. Once satisfied with the absence of misuse, OpenAI released the full model weights. Similarly, for GPT-3, it released the model via the OpenAI API (an Application Programming Interface, which allows developers to build apps on existing technology).

Some context:

Released on February 21, NTIA’s consultation sought inputs on a range of topics including varying levels of openness of AI models, how to define the wide availability of model weights, the benefits and risks of open models versus those of closed models, and the role of the government in setting standards for the risks associated with these models.

The consultation notably did not define what an “open” AI model means. It acknowledged the challenge of defining “open” or “widely available” foundation models, and said that more information is needed to detail the relationship between openness and the wide availability of both model weights and open foundation models. However, it said that companies like Google and Meta have “fostered an ecosystem of increasingly ‘open’ advanced AI models,” which effectively implies that the authority treats the two companies’ open-source models as falling within its scope.

Risk assessment should be expected based on investments in the model:

OpenAI states that in the case of highly capable foundational models that require significant resources to create, the developers should assess their model’s potential to pose catastrophic risks, and, if the model’s risk level is found to be high, put appropriate mitigations in place before deploying or releasing it. It says that such models have been made with heavy investments and that as such, the cost of assessing the risks they pose would only be a fraction of the model’s overall development costs. Investing in the assessment of these models, it argues, “strikes an appropriate balance between risk management and innovation.” It emphasizes that highly capable AI models should be assessed for the risks they pose irrespective of whether the model’s weights are intended to be released widely or through an API.

However, the company finds that less resource-intensive foundational models tend not to pose catastrophic risks, even with likely advances in finetuning and model-modification techniques. Since these models have been made with lesser investment, assessing them for risks would “cost a substantial fraction of the budget of small training runs, which could lead to a chilling effect on innovation and competition.” As such, OpenAI argues that such risk assessment should not be expected from less resource-intensive models.

How risk assessment differs for open weight models:

OpenAI explains that it has created a “Preparedness Framework” that is used to evaluate the capabilities of its models in high-risk domains like cybersecurity, autonomous operation, individualized persuasion, and CBRN (Chemical, Biological, Radiological, and Nuclear) threats. Based on this framework, OpenAI’s models are placed into one of four risk categories: low, medium, high, or critical. Any model that falls in the high or critical risk category is not deployed by OpenAI. It says that such assessment tools are useful for evaluating the ex-ante risks of any type of model release, including open model weight releases.

It clarifies that there are specific factors to consider when assessing open-weight models:

Downstream modification: Open foundational models can be modified by developers who build on top of them. As such, the testing conditions for open-weight models should include how the model can be modified, especially by malicious actors.

Developers cannot rely on system-level safeguards: OpenAI says that developers of open foundational models will be unable to rely on system safeguards to reduce the risk of their model’s misuse, as safeguards can often be removed by a malicious downstream user who possesses the model weights. It explains that this is currently not a challenge, given that most models today do not pose severe risks. If such a model were to exist in the future, however, “then the path to reduce the risk of an open-weights release may rely on increasing the resilience of the external environment into which the model is released.”

OpenAI admits that the science of risk assessment is still nascent. It suggests that the government has an important role to play in helping the AI ecosystem mature its risk and capability evaluation practices. This can be done by “convening experts from the offensive cybersecurity, critical infrastructure, and AI worlds to agree on a set of priority AI cyber threat models, and build out rigorous and empirical testbeds for assessing them.”

Need for resilience to AI misuse:

The company says that the need for societal resilience has greater significance than any company’s model release decisions. While frontier AI capabilities are currently accessible to only a few players, this will change over time. OpenAI says that the US and other countries currently have the opportunity to invest in measures that will limit the consequences of AI misuse. This could include strengthening resilience to AI-accelerated cyberattacks and biological threats.

If a model is rigorously shown to pose severe risks to public safety or national security, its developer would also have to work toward building awareness of those risks prior to its release, such as by notifying infrastructure providers or limiting API deployment. “This mirrors the norm of ‘responsible disclosure’ from the cyber domain, where security researchers will temporarily embargo the release of vulnerabilities they find to give time for defenders to patch their systems,” OpenAI explains.

