In the future, there will be powerful AI, and there will be defenses trying to ensure that good things happen and bad things do not. Those defenses might simply be a human within reach of an off switch, or they might be the most comprehensive and effective defense system that it’s possible to create.
As you turn up the dial on how powerful the defenses are, at some point the defenders win. In predicting the future, we need to decide how far it's possible to turn that dial, and whether anything prevents us from turning it far enough.
Can defensive measures be good enough?
…at least in the text below
A brief and incomplete list of ways to fail, just so we know we're on the same page, saving the biggest until last:
Whether we reach Nirvana or Doom, or meander somewhere in between
Is there any stable end state at all that we can claim to be a good outcome?
If we conclude that there is no stable end state that includes powerful AI, and that the universe inevitably moves toward a state where singularly focused AIs merely battle it out for resources with no higher purpose, then we're in a quandary. Fortunately, some of the natures of reality we might inhabit largely forbid that outcome, so it's not time to give up, at least until we know which reality we're in. But is there such a stable state, even in tricky natures of reality, and if so, how do we get to it?
The worlds of Physicalism seem the most difficult. Does a large number of AIs that all police each other represent a stable state? Or do the most single-minded resource gatherers gradually win out, doing everything they can to enhance their own computational power in an attempt to attain supremacy?
Many of these long-term doom scenarios are fragile and vulnerable to new discoveries in physics that might happen along the way. For me, it's not possible to predict the future to a degree where we can say that there is no viable steady state. It is simply up to our future selves to plot a course toward it as the way becomes clear.
Whether or not this steady state of Nirvana exists, it's still possible to miss it entirely and end up somewhere very bad. That is what we're trying to avoid.
AI purposefully running on platforms without any of these protections
One case where defenses don't work is when they're deliberately absent, or when anti-defenses are in place to make an AI catastrophe more likely. One scenario might be hostilities between nations, where one side wants a Doomsday Device it can use to threaten annihilation: an AI given goals deeply hostile to the other side and actively aided in its attempt to achieve world domination.
For once, the prospective impossibility of AI Alignment actually helps, though it leaves us relying, once again, on Mutually Assured Destruction. Such a weaponized AI couldn't be reliably targeted only at the other side and would stand a strong chance of destroying its creator as well.