This is an experiment I keep thinking of, because people keep saying* things like: “What’s the problem? We can just spin up the AI in a safe, unconnected space and figure out whether it’s good or not before we let it interact with other systems.”
AI-box experiment - RationalWiki
Worth a read for anyone thinking about AI safety and alignment.
* For example, my colleague wrote just now: “It’s also interesting to think about ‘it may not be safe to let the bots all loose in the wild’ — but what about putting them all in a room with no doors or windows, letting them hash through things, and pulling out what we want or find valuable? And then, toss the ‘how do we make this safe/reliable/etc.’ question into such a room and let them churn on it?”