Last summer, artificial intelligence powerhouse OpenAI promised the White House it would rigorously safety test new versions of its groundbreaking technology to make sure the AI wouldn’t inflict damage – like teaching users to build bioweapons or helping hackers develop new kinds of cyberattacks.
But this spring, some members of OpenAI’s safety team felt pressured to speed through a new testing protocol, designed to prevent the technology from causing catastrophic harm, to meet a May launch date set by OpenAI’s leaders, according to three people familiar with the matter who spoke on the condition of anonymity for fear of retaliation.
Even before testing began on the model, GPT-4 Omni, OpenAI invited employees to celebrate the product, which would power ChatGPT, with a party at one of the company’s San Francisco offices.
“They planned the launch after-party prior to knowing if it was safe to launch,” one of the people said, speaking on the condition of anonymity to discuss sensitive company information. “We basically failed at the process.”
The previously unreported incident sheds light on the changing culture at OpenAI, where company leaders such as CEO Sam Altman have been accused of prioritizing commercial interests over public safety – a stark departure from the company’s roots as an altruistic nonprofit. It also raises questions about the federal government’s reliance on self-policing by tech companies – through the White House pledge, as well as an executive order on AI signed in October – to protect the public from abuses of generative AI, which executives think has the potential to remake virtually every aspect of human society, from work to war.
Andrew Strait, a former ethics and policy researcher at Google DeepMind, now associate director at the Ada Lovelace Institute in London, said allowing companies to set their own standards for safety is inherently risky.
“We have no meaningful assurances that internal policies are being faithfully followed or supported by credible methods,” Strait said.
EMPLOYEES DISMAYED
President Biden has said that Congress needs to create new laws to protect the public from AI risks.
“President Biden has been clear with tech companies about the importance of ensuring that their products are safe, secure, and trustworthy before releasing them to the public,” said White House spokeswoman Robyn Patterson. “Leading companies have made voluntary commitments related to independent safety testing and public transparency, which he expects they will meet.”
OpenAI is one of more than a dozen companies that made voluntary commitments to the White House last year, a precursor to the AI executive order. Among the others are Anthropic, the company behind the Claude chatbot; Nvidia, the $3 trillion chips juggernaut; Palantir, the data analytics company that works with militaries and governments; Google DeepMind; and Facebook parent company Meta. The pledge requires them to safeguard increasingly capable AI models; the White House said it would remain in effect until similar regulation came into force.
OpenAI’s newest model, GPT-4o, was the company’s first big chance to apply the framework, which calls for the use of human evaluators, including post-Ph.D. professionals trained in biology, and third-party auditors, if risks are deemed sufficiently high. But testers compressed the evaluations into a single week, despite complaints from employees.
Though they expected the technology to pass the tests, many employees were dismayed to see OpenAI treat its vaunted new preparedness protocol as an afterthought. In June, several current and former OpenAI employees signed a cryptic open letter demanding that AI companies exempt their workers from confidentiality agreements, freeing them to warn regulators and the public about safety risks of the technology.
Meanwhile, former OpenAI executive Jan Leike resigned days after the GPT-4o launch, writing on X that “safety culture and processes have taken a backseat to shiny products.” And former OpenAI research engineer William Saunders, who resigned in February, said in a podcast interview he had noticed a pattern of “rushed and not very solid” safety work “in service of meeting the shipping date” for a new product.
A representative of OpenAI’s preparedness team, who spoke on the condition of anonymity to discuss sensitive company information, said the evaluations took place during a single week, which was sufficient to complete the tests, but acknowledged that the timing had been “squeezed.”
We “are rethinking our whole way of doing it,” the representative said. “This (was) just not the best way to do it.”
In a statement, OpenAI spokesperson Lindsey Held said the company “didn’t cut corners on our safety process, though we recognize the launch was stressful for our teams.” To comply with the White House commitments, the company “conducted extensive internal and external” tests and held back some multimedia features “initially to continue our safety work,” she added.
‘LET’S NOT DO IT AGAIN’
OpenAI announced the preparedness initiative as an attempt to bring scientific rigor to the study of catastrophic risks, which it defined as incidents “which could result in hundreds of billions of dollars in economic damage or lead to the severe harm or death of many individuals.”
The term has been popularized by an influential faction within the AI field who are concerned that trying to build machines as smart as humans might disempower or destroy humanity. Many AI researchers argue these existential risks are speculative and distract from more pressing harms.
“We aim to set a new high-water mark for quantitative, evidence-based work,” Altman posted on the social media platform X in October, announcing the company’s new team.
OpenAI has launched two new safety teams in the last year, which joined a long-standing division focused on concrete harms, like racial bias or misinformation.
The Superalignment team, announced in July, was dedicated to preventing existential risks from far-advanced AI systems. It has since been redistributed to other parts of the company.
Leike and OpenAI co-founder Ilya Sutskever, a former board member who voted to push out Altman as CEO in November before quickly recanting, led the team. Both resigned in May. Sutskever has been absent from the company since Altman’s reinstatement, but OpenAI did not announce his resignation until the day after the launch of GPT-4o.
According to the OpenAI representative, however, the preparedness team had the full support of top executives.
Realizing that the timing for testing GPT-4o would be tight, the representative said, he spoke with company leaders, including Chief Technology Officer Mira Murati, in April and they agreed to a “fallback plan.” If the evaluations turned up anything alarming, the company would launch an earlier iteration of GPT-4o that the team had already tested.
A few weeks prior to the launch date, the team began doing “dry runs,” planning to have “all systems go the moment we have the model,” the representative said. They scheduled human evaluators in different cities to be ready to run tests, a process that cost hundreds of thousands of dollars, according to the representative.
Prep work also involved warning OpenAI’s Safety Advisory Group – a newly created board of advisers who receive a scorecard of risks and advise leaders if changes are needed – that it would have limited time to analyze the results.
OpenAI’s Held said the company committed to allocating more time for the process in the future.
“I definitely don’t think we skirted on [the tests],” the representative said. But the process was intense, he acknowledged. “After that, we said, ‘Let’s not do it again.’ ”
Washington Post writer Razzan Nakhlawi contributed to this report.