A vibe of acceleration
A lot of the vibe of early CFAR (say 2013 to 2015) was that of pushing our limits to become better, stronger, faster. How to get more done in a day, how to become superhumanly effective.
We were trying to save the world, and we were in a race against Unfriendly AI. If CFAR made some of the people in this small community that focused on the important problems 10% more effective and more productive, then we would be that much closer to winning. [ 1 ]
(This isn’t actually what CFAR was doing if you blur your eyes and look at the effects, instead of following the vibe or specific people’s narratives. What CFAR was actually doing was mostly community building and culture propagation. But this is what the vibe was.)
There was sort of a background assumption that augmenting the EA team, or the MIRI team, increasing their magnitude, was good and important and worthwhile.
A notable example that sticks out in my mind: I had a meeting with Val, in which I said that I wanted to test his Turbocharging Training methodology, because if it worked “we should teach it to all the EAs.” (My exact words, I think.)
This vibe wasn’t unique to CFAR. A lot of it came from LessWrong. And early EA as a whole had a lot of this.
I think that partly this was tied up with a relative optimism that was pervasive in that time period. There was a sense that the stakes were dire, but we were going to meet them with grim determination. And there was a kind of energy in the air, if not an endorsed belief, that we would become strong enough, we would solve the problems, and eventually we would win, leading into transhuman utopia.
Like, people talked about x-risk, and how we might all die, but the emotional narrative-feel of the social milieu was more optimistic: that we would rise to the occasion, and things would be awesome forever.
That shifted in 2016, with AlphaGo and some other stuff, when MIRI leadership’s timelines shortened considerably. There was a bit of “timelines fever”, and a sense of pessimism that has been growing since. [ 2 ]
I still have a lot of that vibe myself. I’m very interested in getting Stronger, and faster, and more effective. I certainly have an excitement about interventions to increase magnitude.
But, personally, I’m also much more wary of the appeal of that kind of thing and much less inclined to invest in magnitude-increasing interventions.
That sort of orientation makes sense for the narrative of running a race: “we need to get to Friendly AI before Unfriendly AI arrives.” But given the world, it seems to me that that sort of narrative frame is mostly a bad fit for the actual shape of the problem.
Our situation is that…
1) No one knows what to do, really. There are some research avenues that individual people find promising, but there’s no solution-machine that’s clearly working: no approach that has a complete map of the problem to be solved.
2) There’s much less of a clean and clear distinction between “team FAI” and “team AGI”. It’s less the case that “the world saving team” is distinct from the forces driving us towards doom.
A large fraction of the people motivated by concerns of existential safety work for the leading AGI labs, sometimes directly on capabilities, sometimes on approaches that are ambiguously safety or capabilities, depending on who you ask.
And some of the people who seemed most centrally in the “alignment progress” cluster, the people whom I would have been most unreservedly enthusiastic to boost, have produced results that seem to have been counterfactual to major hype-inducing capability advances. I don’t currently know that to be true, or (conditioning on it being true) know that it was net-harmful. But it definitely undercuts my unreserved enthusiasm for providing support for Paul. (My best guess is that it is still net-positive, and I still plan to seize opportunities I see to help him, if they arise, but less confidently than I would have 2 years ago.)
Going faster and finding ways to go faster is an exploit move. It makes sense when there are some systems (“solution machines”) that are working well, that are making progress, and we want them to work better, to make more progress. But there’s nothing like that currently making systematic progress on the problem.
We’re in an exploration phase, not an execution phase. The thing that the world needs is people who are stepping back and making sense of things, trying to understand the problem well enough to generate ideas that have any hope of working. [ 3 ] Helping the existing systems, heading in the direction that they’re heading, to go faster…is less obviously helpful.
The world has much much more traction on developing AGI than it does on developing FAI. There’s something like a machine that can just turn the crank on making progress towards AGI. There’s no equivalent machine that can take in resources and make progress on safety.
Because of that, it seems plausible that interventions that make people faster, that increase their magnitude instead of refining their direction, disproportionately benefit capabilities.
I’m not sure that that’s true. It could be that capabilities progress marches to the drumbeat of hardware progress, and everyone including the outright capabilities researchers moving faster relative to growth in compute is a net gain. It effectively gives humanity more OODA loops on the problems. Maybe increasing everyone’s productivity is good.
I’m not confident in either direction. I’m ambivalent about the sign of those sorts of interventions. And that uncertainty is enough reason for me to think that investing in tools to increase people’s magnitude is not a good bet.
Does this mean that I’m giving up on personal growth or helping people around me become better? Emphatically not.
But it does change what kinds of interventions I’m focusing on.
I’m conscious of differentially promoting the kinds of tech and the cultural memes that seem like they provide us more capacity for orienting, more spaciousness, more wisdom, more carefulness of thought. Methods that help us refine our direction, instead of increasing our magnitude.
A heuristic that I use for assessing practices and techniques that I’m considering investing in or spreading: “Would I feel good if this was adopted wholesale by DeepMind or OpenAI?”
Sometimes the answer is “yes”. DeepMind employees having better emotional processing skills, or having a habit of building lines of retreat, seems positive for the world. That would give the individuals and the culture more capacity to reflect, to notice subtle notes of discord, to have flexibility instead of the tunnel vision of defensiveness or fear.
These days, I’m aiming to develop and promote tools, practices, and memes, that seem good by that heuristic.
I’m more interested in finding ways to give people space to think, than I am in helping them be more productive. Space to think seems more robustly beneficial.
I’m writing this up in large part because it seems like many younger EAs are still acting in accordance with the operational assumption that “making EAs faster and more effective is obviously good.” Indeed, it seems so straightforward, that they don’t seriously question it. “EA is good, so EAs being more effective is good.”
If you, dear reader, are one of them, you might want to consider these questions over the coming weeks, and ask how you could distinguish between the world where your efforts are helping and the world where they’re making things worse.
I used to think that way. But I don’t anymore. It seems like “effectiveness” in the way that people typically mean it is of ambiguous sign, and actually what we’re bottlenecked on is wayfinding.
[ 1 ] – As a number of people noted at the time, the early CFAR workshop was non-trivially a productivity skills program. Certainly epistemology, calibration, and getting maps to reflect the territory were core to the techniques and ethos. But also a lot of the content was geared towards being more effective, not being blocked, setting habits, and getting stuff done, and only indirectly about figuring out what’s true. (Notable examples: TAPs, CoZE as exposure therapy, Aversion Factoring, Propagating Urges, GTD.) To a large extent, CFAR was about making participants go faster and hit harder. And there was a sense of enthusiasm
[ 2 ] – The high point of optimism was probably early 2015, when Elon Musk donated $10 million to the Future of Life Institute (“to the community” as Anna put it, at my CFAR workshop of that year). At that point I think people expected him to join the fight.
And then Elon founded OpenAI instead.
I think that this was the emotional turning point for some of the core leaders of the AI-risk cause, and that shift in emotional tenor leaked out into community culture.
[ 3 ] – To be clear, I’m not necessarily recommending stepping back from engagement with the world. Getting orientation usually depends on close, active contact with the territory. But it does mean that our goal should be less to affect the world, and more to just improve our own understanding enough that we can take action that reliably produces good results.