AI Safety Newsletter #36: Voluntary…

May 30, 2024

Plus, a Senate AI Policy Roadmap, and Chapter 1: An Overview of Catastrophic Risks

2 Comments

May 30, 2024 (edited)

I wish the "Voluntary Commitments are Insufficient" section had a bit more nuance here. I basically agree with the central point that they should only be considered one among many defense mechanisms and binding legislation is going to be much stronger. I would even agree that current RSPs, even if perfectly adhered to, are likely insufficient to prevent bad outcomes.

Still, this is the state of the art for frontier lab self-governance (as far as I can tell). It's hard to make it look attractive or worthwhile to companies (or regulators!) when, even after they do it, you say "eh that didn't mean anything anyway." Especially if you then have a new ask that you want to promote as enthusiastically as RSPs were once promoted. It's frustrating and discouraging when the goalposts are always moving.

I have a fair amount of experience thinking about the role of voluntary commitments in other industries, and saying they can just be broken without serious repercussions is also an oversimplification. True, they aren't binding in the way that contracts are. But violating past voluntary commitments can carry a whole host of costs, which makes them far from trivial to abandon.

- It's really hard to abandon them without looking hypocritical and untrustworthy (to the public, to regulators, to employees)

- It opens up liability for deceptive advertising, if the safety practices were used to promote e.g. an AI product

- If the company is publicly traded, it can open the company up to liability for misleading shareholders.

- In large bureaucracies, lock-in effects make it easy to create new teams/procedures/practices/cultures and much harder to change them.

- In many, many industries, voluntary self-governance measures are themselves the first step to safety practices being codified in law or enforced by regulators.

I just wish the RSP-skeptical discourse were leaning away from "RSPs aren't enough" and toward "RSPs are great, but the work isn't done yet." That would help make sure the incentive gradient rewards steps in the right direction instead of trivializing them.



Chris

Jun 14, 2024

You make excellent points. I feel like there are parallels between the challenges that AI safety scientists and climate scientists face. Both groups contend with limited resources, including time and funding, which are crucial for convincing the leaders of a powerful, bureaucratically entrenched, and inversely incentivized system to enact necessary changes. Given how rapidly the existential threats are approaching, I can sympathize with the frustration and desperation of those working to shift priorities before it is too late, but unfortunately, we need these players to cooperate to move forward. The ball is in their court. Either way, it's a gamble at this point. We either slow down to accommodate their lack of urgency and risk running out of time, or we keep the early expectations above anything they feel they can accommodate and risk alienating them into deprioritizing it further. I agree RSPs are the first step and can lead to more, but how long will it take to get to step 2 if it took this long to even set up these guidelines? Like with climate change, we're already on borrowed time.
