Moldbug’s insight

I’ve been reading some of Curtis Yarvin’s work lately.

For the most part, he seems like a blowhard, and an incorrect blowhard at that. His general rhetorical approach seems to be to make bold assertions, dressed up in flowery and bombastic language, and then to flatter his reader for being in on the secret. When he’s on podcast interviews, mostly the hosts will agree with his premises, but occasionally he’ll make a claim that they reject and push back against. Then Yarvin is forced to defend his bold claims instead of just insinuating them, and often his actual argumentation comes off as pretty weak.

I sometimes get the feeling, when reading his work, of reading a high school essay: the author reaching for arguments to defend a bottom line decided for other reasons, rather than reporting the arguments and evidence that led him to believe the conclusion.1

He admits directly that he’s writing for fun, and occasionally talks about writing to troll people. I get the impression that his views were arrived at in part by a sincere intellectual investigation of history and political philosophy, and in part because they were fun (ie shocking) to advocate for in 2008. But now they’re a key part of Yarvin’s brand and he’s kind of stuck with them. As in academic philosophy, his incentives are towards doubling down on his distinctive ideas, regardless of their truth.

His rhetorical style reminds me of that of Eliezer Yudkowsky and Nassim Taleb. All three of them have a deep knowledge of their subject matter and each writes with an arrogance / confidence in the correctness of his view and an insinuation that the reader, like him, understands some important truths not grasped by the masses of humanity. This style makes these authors fun to read, for some people, and insufferably annoying for other people.

My read, so far, is that if you don’t already buy into his basically aesthetic premises, his disgust for modernity and for progressivism in particular, he doesn’t have much in the way of good arguments for persuading you of his views. Perhaps the main thing that he does is open people’s eyes, allowing them to see the world through a hitherto completely unknown perspective that pierces the civic propaganda of our time. Having seen through that perspective, perhaps some parts of the world make more sense. But that’s not because Moldbug made a strong case for his claims, so much as because his rhetoric ensnared you in his wake and pulled you along for a bit. (I’m very interested in Moldbug fans who disagree—especially those whose minds were changed by some of his posts.)

That said, he does have a few important and novel-to-me analytical points.

Today, I think I grasped an important core of Yarvin’s political philosophy which I hadn’t previously understood, and without which many of his claims had seemed bizarre in their not-even-wrongness.

All of the following is a compression of my understanding of his view, and is not to be taken as an endorsement of that view.

Claim 1: Sovereignty is Conserved

This is almost a catchphrase for Yarvin. He uses it all over the place.

There is always some force or entity outside of and above the law. Every law is enforced by some process (otherwise it’s hardly a law). And the process that enforces the law must, necessarily, have the power to exempt itself from that law. If not, it wasn’t actually the system ultimately doing the enforcing. Sovereignty is “above-the-law-ness”, and it’s always conserved.2

As an intuition pump: there exists someone in the US government who, if they decided to, could “disappear” you, a (more or less) ordinary US citizen. Possibly the president could detain or assassinate a US citizen for no legible cause, and face no consequences. Possibly some specific people in the intelligence services could, as well. If there’s no one person who could do it, there’s surely a consortium of people that, working in concert, could. (That sovereignty is conserved doesn’t mean that it’s always concentrated.) In the limit, the whole of the US military must be above the law, because if it decided to, in a coordinated way, it could trivially overturn any law, or the whole governmental system for that matter. [More on that possibility later.]

Even if no specific individual is above the law, the government as a whole sure as hell is. “The government” can, fundamentally, do whatever it “wants”.

This is explicitly counter to an ideal of enlightenment philosophy—that of equality before the law: that no person, no matter how powerful, is exempt from the same basic legal standards.

Moldbug asserts that any claim to equality before the law is horseshit. Sovereignty is conserved. Power is real, and it bottoms out somewhere, and wherever it bottoms out is always going to be above the law.

This isn’t a law of physics, but it is a law of nature—at least as inescapable as the logic of supply and demand, or natural selection.3

Because of his rhetoric and politics, it’s easy to read Moldbug as not caring at all about the inequities of power. This is somewhat of a misunderstanding. It’s a non-question for Yarvin whether it’s good or desirable that sovereignty is conserved. It’s just a fact of life that power is going to ground out somewhere. Whether we consider that inhumane or prefer that it were otherwise is of no more relevance than if we wished perpetual motion were possible. It’s not possible, and it’s not possible for a pretty fundamental reason.4

But as a society, we are intent on deluding ourselves about the nature of power. That might cause problems, in roughly the way it might if we insisted on deluding ourselves about the efficacy of perpetual motion machines.

Claim 2: The profit motive + competition is a stronger guarantee than ideology

So there’s always some entity outside the law. But, one might think, given that sad reality, it’s better to divide up that power as much as possible, so that as few people as possible, and ideally no one, can unilaterally disappear people. Checks and balances, and limited powers, and so on, to prevent any individual or group in government, and the government as a whole, from being too powerful. Perhaps we can’t abolish sovereignty, but dividing it up as much as possible and spreading it around seems like the most humane way to deal with the unfortunate situation, right?

Yarvin is in favor of monarchy, so he says “no”. Why not?

Because, in practice, the less concentrated power is, the more it is effectively controlled by ideology rather than rational optimization for anyone’s interests.

This is the basic problem of voter incentives: the odds of any individual person’s vote shifting policy, and impacting that person’s life directly, are so minuscule as to be irrelevant. The main impact that your voting behavior has on your life is through signaling: signaling to your peers and to yourself what kind of person you are. If your vote materially impacted your life through policy, you would be incentivized to carefully weigh the tradeoffs in every decision (or defer to trusted expert advisors). But if your vote is mostly about showing how compassionate you are, and how committed you are to our shared values, carefully weighing tradeoffs doesn’t help you. Saying the most applause lights, the fastest, is what’s good for signaling.
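
To make the asymmetry concrete, here is a minimal back-of-the-envelope sketch in Python. All of the numbers are illustrative assumptions on my part (a one-in-ten-million chance of casting the pivotal vote, a large personal stake in the policy outcome, a modest signaling payoff), not estimates from Yarvin or anyone else.

```python
# Toy expected-value comparison for a single vote.
# Every number here is an illustrative assumption, not an empirical estimate.

p_pivotal = 1e-7          # assumed probability that your vote decides the outcome
policy_stake = 100_000    # assumed personal value ($) of the better policy winning
signaling_value = 50      # assumed value ($) of what the vote signals to peers and self

expected_policy_value = p_pivotal * policy_stake  # = $0.01 under these assumptions

print(f"Expected policy impact of voting: ${expected_policy_value:.2f}")
print(f"Signaling value of voting:        ${signaling_value:.2f}")

# Under these (made-up) numbers the signaling term dominates by ~5,000x, which is
# the asymmetry the paragraph above points at: the vote's instrumental effect on
# your life is negligible next to its expressive / signaling effect.
```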

As Bryan Caplan says, “Markets do the good things that sound bad, and governments do bad things that sound good.”

The more power is divided up into tiny pieces, the more it is steered by ideology instead of by self-interest. And rational self-interest is much less dangerous than ideology.

As discussed, the US military could overthrow the US government and the US legal system, if it wanted to. Why doesn’t it do that? Because there’s a distributed common knowledge belief in “democracy”. Lots of people in the military sincerely believe in the democratic ideal, and even if they don’t, they believe that they believe they do, and everyone knows that everyone else would immediately oppose any attempts at an “undemocratic” military coup.

Which is to say that the thing standing between the US as it currently exists and a military dictatorship is an ideological commitment to “democracy”. This seems to have worked pretty well so far, but those scare quotes are pretty scary. If a sufficiently large faction of the military came to buy into an ideology that claimed to carry the torch of the true spirit of democracy (or Christianity, or Social Justice, or Communism, or environmentalism, or whatever moral ideal compels), that ideology would take over the US.

And similarly, to the extent that the US government is possessed by the spirit of Wokism, your country might suddenly become violently woke.

This isn’t a hypothetical. We’ve seen countries get possessed by Communist ideology and become violently Communist.

In contrast, consider if instead there were a single king/CEO, who has complete and total power over his domain, and who controls the military. As long as he’s sane and competent (which has been a problem with historical monarchs, but which Yarvin thinks is more-or-less solved, as well as we can reasonably expect, by the structure of a joint-stock corporation), this monarch would be acting from incentives that are much closer to rational self-interest, because he (and the shareholders of the joint-stock country) benefit(s) directly from the upside of actual policy outcomes, not just the social signaling benefits of his policies. He wants his realm to be safe and well-governed because that will increase the value of the real estate he owns, and he will make more money that way.

Especially so if he governs only one of hundreds of sovereign realms in a patchwork. In that case there’s competitive pressure to get policy right, and maintain rule of law. If he does a bad job of ruling, residents will leave to live somewhere else, taking their tax revenue with them.

This is not perfect. Any given king might be bad, just as any given CEO can be bad. There’s no guarantee that a king won’t be possessed by an ideology (it’s certainly happened before! Ferdinand II of the Holy Roman Empire and Alexander I of Russia come to mind). But it’s better than the alternatives. Especially if the shareholders can remove a bad king from power and if there’s competition between sovereign realms, both of which introduce selection pressure for sane, self-interested kings.

It’s true that the sovereign could, by right, have any person in his realm who ticked him off quietly assassinated. But, realizing that sovereignty is conserved, that becomes less of a problem of monarchy in particular, and more of an inescapable problem of power in general, one which we obscure but don’t eliminate with limited governments of ostensible checks and balances.

Plus, assassinating people, even if you have the legal right to do it, is generally going to be bad for business—an erratic CEO doesn’t inspire the confidence that causes people to want to live in his realm. Enough shenanigans like that, and his sovereign corporation will start losing customers, and his shareholders will sell its stock and/or have him removed as CEO. And if the CEO is actually sovereign, that removes the strongest incentive historical monarchs had for having people assassinated: as a means of securing their power.5

But most importantly, a monarch-CEO is much, much less likely than a democracy to get riled up and implement Communism. Communism is transparently bad for business, but sounds good (meaning it is a good way to signal your compassion or tribal loyalty). The incentives of CEOs leave them less vulnerable to takeover by parasitic ideologies compared to the masses of people in democracies. And ideological revolutions and generic bad-but-sounds-good policy are the serious threat-model. The all-powerful CEO who has the legal and pragmatic power of life and death over you is just much less dangerous than a state controlled by competing ideologies, which might decide that doing massive harm (from burning down your cities in the name of black lives, to rounding up all the Jews, to sending your scientists to work camps) is morally obligatory, in a fit of runaway virtue-signaling.

And indeed, when there’s some political power in the hands of the people, a good strategy for an ambitious person seizing power is to craft or adapt an ideology that inflames the people’s passions and empowers you personally. That’s what Hitler and Lenin did. When sovereignty is in the hands of shareholders and their CEO-delegate, ideologies are less adaptive for gaining power, and so less pervasive in the first place. But this is a separate thread of Moldbugian philosophy (that democracy causes ideology), less central here than the point that CEO-kings operating under the constraints of the profit motive and market competition are less vulnerable to ideologies than democracies are.

Given that we can’t escape power, the profit motive of a king is a much stronger guarantee of good outcomes than ideological commitment, because ideologies are crazy, or at least can turn crazy fast.

Once you have that attitude, the fact that sovereignty in our present era seems to bottom out in basically ideological institutions seems…rather concerning. Every time you read “democratically controlled” you might mentally replace it with “more or less controlled by at least one more-or-less insane egregore.”


When I think in these terms, Yarvin’s political philosophy clicks into place for me as a coherent take on the world.

I’m not sure if I buy it, overall.

I agree that we don’t have literal and complete equality before the law: there are elites who get special treatment, and there may be individuals in the system who can literally get away with murder (though my guess is that’s only true in pretty limited circumstances?). But the US social and legal system really is demonstrably more egalitarian, closer to the ideal of equality before the law, than the European aristocratic systems that preceded it. And that seems like something to be justly proud of.

I think he’s underselling separation of powers. It’s true that the government can do whatever it wants, but we’ve set it up so that the government has difficulty mustering up unified and coherent wants to act on. Government is, in practice, limited by earth’s low coordination capacity. Which gives us a kind of safety from tyranny.

If someone in the intelligence community wanted to “disappear” me, they would have to keep it secret, because they would have political opponents, as well as principled detractors, who would, if they could, expose the criminal and have them arrested. Nixon was forced from office for violating the law. It might not be perfect equality before the law, but it’s a pretty impressive example of something approaching it.

Further, I’m less pessimistic than my read of Yarvin about constructing systems in which NO individual is above the law in the sense of being able to unilaterally violate it. eg systems where everyone enforces the law on everyone else. (Systems like these are vulnerable to 51% attacks, and the number of actual people required to implement a 51% attack falls as political and/or social power is consolidated. But that’s true of literally every system of law, and the question is how we can do best.)

It does seem possible that a CEO-monarch who can be removed by a vote of the stockholders is more likely to act from straightforward material rational self-interest than voters currently do. (Actual historical monarchies have a number of critical-level problems, from crazy kings to violent succession disputes as the norm.) It seems likely to have other problems—namely a principal-agent problem between the shareholders and their delegate.6 I’m curious to see a government that runs on that system, and see how it behaves. Maybe it would result in good policy.

However, I think there are other schemes, mostly untried, that do a better job of incentivizing good judgement from voters, while also getting the historically-validated stability benefits of democratic governance. I’m thinking of systems like futarchy (or just prominent, public, prediction markets) and quadratic voting.
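
To make one of these mechanisms concrete: under quadratic voting, buying n votes on an issue costs n² voice credits, so each marginal vote gets progressively more expensive, and the number of votes a rational voter buys scales with how much they actually care rather than being all-or-nothing. The sketch below is a toy illustration only; the credit budget, the voter preferences, and the `votes_bought` helper are all made up for the example.

```python
# Toy illustration of quadratic voting: n votes cost n**2 voice credits, so the
# marginal cost of the n-th vote is n**2 - (n - 1)**2 = 2n - 1. A voter keeps
# buying votes while their per-vote value exceeds that marginal cost, which makes
# vote counts scale with intensity of preference.

def votes_bought(value_per_vote: float, budget: float) -> int:
    """Greedy purchase: buy the next vote while it is worth its marginal cost."""
    n, spent = 0, 0.0
    while True:
        marginal_cost = 2 * n + 1  # cost of the (n + 1)-th vote
        if value_per_vote < marginal_cost or spent + marginal_cost > budget:
            return n
        n += 1
        spent += marginal_cost

# Hypothetical voters: (label, how much each additional vote is worth to them).
voters = [("mild preference", 2.0), ("moderate preference", 6.0), ("cares a lot", 20.0)]

for label, value in voters:
    n = votes_bought(value, budget=100.0)
    print(f"{label:>20}: buys {n} votes at a total cost of {n ** 2} credits")
```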

The main feature that’s doing the work in Yarvin’s conception is the multitude of micronations competing for residents. As long as you have sufficiently low transaction costs involved in moving from one country to another, and at least some countries have politically unified enough governance that they can and do adopt the explicit policy goal of optimizing tax revenue (or, for that matter, any of a number of possible social welfare functions, or baskets of indicators), you get all the benefits of the Moldbugian system. The bit about CEO-kings isn’t actually critical. Maybe that’s the best way to optimize policy for tax revenue, or maybe not. Possibly the king’s authority to kill any citizen for any reason is net-beneficial for security and stability, such that many people prefer living in a nation where the chief executive has that level of legal authority, and overall tax revenue is higher. But then again, maybe not. (The optics are pretty unnerving, at least.)

It seems to me that the problem is not that we don’t have kings, in particular, but just that there’s so little room for governance experimentation in general, and so radical new ideas don’t get tried.


  1. For instance, I’m unimpressed with Yarvin’s claim that his political schema would lead to world peace. He spends a few sentences asserting that his realm-CEOs, being rational, would have no issues solving collective action problems, and would have no need for a higher governmental structure above them to enforce collective action, and then moves on. 🙄 ↩︎
  2. See, for instance, here.

    > The key is that word should. When you say your government “should do X,” or “should not do Y,” you are speaking in the hieratic language of democracy. You are postulating some ethereal and benign higher sovereign, which can enforce promises made by the mere government to whose whims you would otherwise be subject. In reality, while your government can certainly promise to do X or not to do Y, there is no power that can hold it to this promise. Or if there is, it is that power which is your real government.
    ↩︎
  3. We might try to conceive of clever schemes under which this is not so: legal systems based on blockchain smart contracts where there’s no enforcement mechanism outside of the computerized legal corpus, itself. Maybe in some scenario like that, we would have effectively grounded out the root of power into the law itself, and escaped the basic dynamic that someone is always above the law (in much the same way that reconstructing life to use encrypted genomes would potentially allow us to escape the so far inexorable pull of natural selection). ↩︎
  4. > It is immediately clear that the neocameralist should, the tight rope, is far inferior to the ethereal should, the magic leash of God. (Typically these days arriving in the form of vox populi, vox Dei. Or, as a cynic might put it: vox populi, vox praeceptoris.)
    > Given the choice between financial responsibility and moral responsibility, I will take the latter every time. If it were possible to write a set of rules on paper and require one’s children and one’s children’s children to comply with this bible, all sorts of eternal principles for good government and healthy living could be set out.
    > But we cannot construct a political structure that will enforce moral responsibility. We can construct a political structure that will enforce financial responsibility. Thus neocameralism. We might say that financial responsibility is the raw material of moral responsibility. The two are not by any means identical, but they are surprisingly similar, and the gap seems bridgeable.


    From Profit Strategies for Our New Corporate Overlords, here. ↩︎
  5. Crucially, the board of directors of a realm, the people who do have the power to remove the CEO-king, should not live in that realm, for precisely the reason that living there would give the king an incentive to use his complete power over you as your sovereign (his ability to have you and your family killed or tortured) to get you to vote as he demands in board meetings. ↩︎
  6. If the CEO-king has absolute power over his realm, that seems like it gives him a lot of leeway to control the information about how the realm is doing that flows back to the shareholders who might hold him accountable to profit. ↩︎

One projection of how AI could play out

Back in January, I participated in a workshop in which the attendees mapped out how they expect AGI development and deployment to go. The idea was to start by writing out what seemed most likely to happen this year, and then condition on that, to forecast what seems most likely to happen in the next year, and so on, until you reach either human disempowerment or an end of the acute risk period.

This post was my attempt at the time.

I spent maybe 5 hours on this, and there’s lots of room for additional improvement. This is not a confident statement of how I think things are most likely to play out. There are already some ways in which I think this projection is wrong. (I think it’s too fast, for instance). But nevertheless I’m posting it now, with only a few edits and elaborations, since I’m probably not going to do a full rewrite soon.

2024

  • A model is released that is better than GPT-4. It succeeds on some new benchmarks. Subjectively, the jump in capabilities feels smaller than that between RLHF’d GPT-3 and RLHF’d GPT-4. It doesn’t feel as shocking as ChatGPT and GPT-4 did, for either x-risk focused folks or for the broader public. Mostly it feels like “a somewhat better language model.”
    • It’s good enough that it can do a bunch of small-to-medium admin tasks pretty reliably. I can ask it to find me flights meeting specific desiderata, and it will give me several options. If I give it permission, it will then book those flights for me with no further inputs from me.
    • It works somewhat better as an autonomous agent in an AutoGPT harness, but it still loses its chain of thought / breaks down / gets into loops.
    • It’s better at programming.
      • Not quite good enough to replace human software engineers. It can make a simple React or iPhone app, but not design a whole complicated software architecture, at least not without a lot of bugs.
      • It can make small, working, well documented, apps from a human description.
        • We see a doubling of the rate of new apps being added to the app store as people who couldn’t code now can make applications for themselves. The vast majority of people still don’t realize the possibilities here, though. “Making apps” still feels like an esoteric domain outside of their zone of competence, even though the barriers to entry just lowered so that 100x more people could do it. 
  • From here on out, we’re in an era where LLMs are close to commoditized. There are smaller improvements, shipped more frequently, by a variety of companies, instead of big impressive research breakthroughs. Basically, companies are competing with each other to always have the best user experience and capabilities, and so they don’t want to wait as long to ship improvements. They’re constantly improving their scaling, and finding marginal engineering improvements. Training runs for the next generation are always happening in the background, and there’s often less of a clean tabula-rasa separation between training runs—you just keep doing training with a model continuously. More and more, systems are being improved through in-the-world feedback with real users. Often chatGPT will not be able to handle some kind of task, but six weeks later it will be able to, without the release of a whole new model.
    • [Does this actually make sense? Maybe the dynamics of AI training mean that there aren’t really marginal improvements to be gotten. In order to produce a better user experience, you have to 10x the training, and each 10x-ing of the training requires a bunch of engineering effort, to enable a larger run, so it is always a big lift.]
    • (There will still be impressive discrete research breakthroughs, but they won’t be in LLM performance)

2025

  • A major lab is targeting building a Science and Engineering AI (SEAI)—specifically a software engineer.
    • They take a state of the art LLM base model and do additional RL training on procedurally generated programming problems, calibrated to stay within the model’s zone of proximal competence. These problems are something like leetcode problems, but scale to arbitrary complexity (some of them require building whole codebases, or writing very complex software), with scoring on lines of code, time-complexity, space complexity, readability, documentation, etc. This is something like “self-play” for software engineering. 
    • This just works. 
    • A lab gets a version that can easily do the job of a professional software engineer. Then, the lab scales their training process and gets a superhuman software engineer, better than the best hackers.
    • Additionally, a language model trained on procedurally generated programming problems in this way seems to have higher general intelligence. It scores better on graduate level physics, economics, biology, etc. tests, for instance. It seems like “more causal reasoning” is getting into the system.
  • The first proper AI assistants ship. In addition to doing specific tasks,  you keep them running in the background, and talk with them as you go about your day. They get to know you and make increasingly helpful suggestions as they learn your workflow. A lot of people also talk to them for fun.

2026

  • The first superhuman software engineer is publicly released.
    • Programmers begin studying its design choices, the way Go players study AlphaGo.
    • It starts to dawn on e.g. people who work at Google that they’re already superfluous—after all, they’re currently using this AI model to (unofficially) do their job—and it’s just a matter of institutional delay for their employers to adapt to that change.
      • Many of them are excited or loudly say how it will all be fine/ awesome. Many of them are unnerved. They start to see the singularity on the horizon, as a real thing instead of a social game to talk about.
      • This is the beginning of the first wave of change in public sentiment that will cause some big, hard to predict, changes in public policy [come back here and try to predict them anyway].
  • AI assistants get a major upgrade: they have realistic voices and faces, and you can talk to them just like you can talk to a person, not just typing into a chat interface. A ton of people start spending a lot of time talking to their assistants, for much of their day, including for goofing around.
    • There are still bugs, places where the AI gets confused by stuff, but overall the experience is good enough that it feels, to most people, like they’re talking to a careful, conscientious person, rather than a software bot.
    • This starts a whole new area of training AI models that have particular personalities. Some people are starting to have parasocial relationships with these AI friends, and some programmers are trying to make friends that are really fun or interesting or whatever for them in particular.
  • Lab attention shifts to building SEAI systems for other domains, to solve biotech and mechanical engineering problems, for instance. The current-at-the-time superhuman software engineer AIs are already helpful in these domains, but not at the level of “explain what you want, and the AI will instantly find an elegant solution to the problem right before your eyes”, which is where we’re at for software.
    • One bottleneck is problem specification. Our physics simulations have gaps, and are too low fidelity, so oftentimes the best solutions don’t map to real world possibilities.
      • One solution to this (in addition to using our AI to improve the simulations) is that we just RLHF our systems to identify solutions that do translate to the real world. They’re smart, they can figure out how to do this.
  • The first major AI cyber-attack happens: maybe some kind of superhuman hacker worm. Defense hasn’t remotely caught up with offense yet, and someone clogs up the internet with AI bots, for at least a week, approximately for the lols / to see if they could do it. (There’s a week during which more than 50% of people can’t get on more than 90% of the sites because the bandwidth is eaten by bots.)
    • This makes some big difference for public opinion. 
    • Possibly, this problem isn’t really fixed. In the same way that covid became endemic, the bots that were clogging things up are just a part of life now, slowing bandwidth and making the internet annoying to use.

2027 and 2028

  • In many ways things are moving faster than ever in human history, and also AI progress is slowing down a bit.
    • The AI technology developed up to this point hits the application and mass adoption phase of the s-curve. In this period, the world is radically changing as every industry, every company, every research lab, every organization, figures out how to take advantage of newly commoditized intellectual labor. There’s a bunch of kinds of work that used to be expensive, but which are now too cheap to meter. If progress stopped now, it would take 2 decades, at least, for the world to figure out all the ways to take advantage of this new situation (but progress doesn’t show much sign of stopping).
      • Some examples:
        • The internet is filled with LLM bots that are indistinguishable from humans. If you start a conversation with a new person on twitter or discord, you have no way of knowing if they’re a human or a bot.
          • (Probably there will be some laws about declaring which are bots, but these will be inconsistently enforced.)
          • Some people are basically cool with this. From their perspective, there are just more people that they want to be friends with / follow on twitter. Some people even say that the bots are just better and more interesting than people. Other people are horrified/outraged/betrayed/don’t care about relationships with non-real people.
            • (Older people don’t get the point, but teenagers are generally fine with having conversations with AI bots.)
          • The worst part of this is the bots that make friends with you and then advertise to you stuff. Pretty much everyone hates that.
        • We start to see companies that will, over the next 5 years, grow to have as much impact as Uber, or maybe Amazon, which have exactly one human employee / owner +  an AI bureaucracy.
        • The first completely autonomous companies work well enough to survive and support themselves. Many of these are created “free” for the lols, and no one owns or controls them. But most of them are owned by the person who built them, and could turn them off if they wanted to. A few are structured as public companies with share-holders. Some are intentionally incorporated fully autonomous, with the creator disclaiming (and technologically disowning (eg deleting the passwords)) any authority over them.
          • There are legal battles about what rights these entities have, if they can really own themselves, if they can have bank accounts, etc. 
          • Mostly, these legal cases resolve to “AIs don’t have rights”. (For now. That will probably change as more people feel it’s normal to have AI friends).
        • Everything is tailored to you.
          • Targeted ads are way more targeted. You are served ads for the product that you are, all things considered, most likely to buy, multiplied by the lifetime profit if you do buy it. Basically no ad space is wasted on things that don’t have a high EV of you, personally, buying it. Those ads are AI generated, tailored specifically to be compelling to you. Often, the products advertised, not just the ads, are tailored to you in particular.
            • This is actually pretty great for people like me: I get excellent product suggestions.
          • There’s not “the news”. There’s a set of articles written for you, specifically, based on your interests and biases.
          • Music is generated on the fly. This music can “hit the spot” better than anything you listened to before “the change.”
          • Porn. AI tailored porn can hit your buttons better than sex.
          • AI boyfriends/girlfriends that are designed to be exactly emotionally and intellectually compatible with you, and trigger strong limerence / lust / attachment reactions.
        • We can replace books with automated tutors.
          • Most of the people who read books will still read books though, since it will take a generation to realize that talking with a tutor is just better, and because reading and writing books was largely a prestige-thing anyway.
            • (And weirdos like me will probably continue to read old authors, but even better will be to train an AI on a corpus, so that it can play the role of an intellectual from 1900, and I can just talk to it.)
        • For every task you do, you can effectively have a world expert (in that task and in tutoring pedagogy) coach you through it in real time.
          • Many people do almost all their work tasks with an AI coach.
        • It’s really easy to create TV shows and movies. There’s a cultural revolution as people use AI tools to make custom Avengers movies, anime shows, etc. Many are bad or niche, but some are 100x better than anything that has come before (because you’re effectively sampling from a 1000x larger distribution of movies and shows). 
        • There’s an explosion of new software, and increasingly custom software.
          • Facebook and twitter are replaced (by either external disruption or by internal product development) by something that has a social graph, but lets you design exactly the UX features you want through a LLM text interface. 
          • Instead of software features being something that companies ship to their users, top-down, they become something that users and communities organically develop, share, and iterate on, bottom up. Companies don’t control the UX of their products any more.
          • Because interface design has become so cheap, most of software is just proprietary datasets, with (AI built) APIs for accessing that data.
        • There’s a slow moving educational revolution of world class pedagogy being available to everyone.
          • Millions of people who thought of themselves as “bad at math” finally learn math at their own pace, and find out that actually, math is fun and interesting.
          • Really fun, really effective educational video games for every subject.
          • School continues to exist, in approximately its current useless form.
          • [This alone would change the world, if the kids who learn this way were not going to be replaced wholesale, in virtually every economically relevant task, before they are 20.]
        • There’s a race between cyber-defense and cyber offense, to see who can figure out how to apply AI better.
          • So far, offense is winning, and this is making computers unusable for lots of applications that they were used for previously:
            • online banking, for instance, is hit hard by effective scams and hacks.
            • Coinbase has an even worse time, since they’re not insured (is that true?)
          • It turns out that a lot of things that worked / were secure, were basically depending on the fact that there are just not that many skilled hackers and social engineers. Nothing was secure, really, but not that many people were exploiting that. Now, hacking/scamming is scalable and all the vulnerabilities are a huge problem.
          • There’s a whole discourse about this. Computer security and what to do about it is a partisan issue of the day.
        • AI systems can do the years of paperwork to make a project legal, in days. This isn’t as big an advantage as it might seem, because the government has no incentive to be faster on their end, and so you wait weeks to get a response from the government, your LLM responds to it within a minute, and then you wait weeks again for the next step.
          • The amount of paperwork required to do stuff starts to balloon.
        • AI romantic partners are a thing. They start out kind of cringe, because the most desperate and ugly people are the first to adopt them. But shockingly quickly (within 5 years) a third of teenage girls have a virtual boyfriend.
          • There’s a moral panic about this.
        • AI match-makers are better than anything humans have tried yet for finding sex and relationships partners. It would still take a decade for this to catch on, though.
          • This isn’t just for sex and relationships. The global AI network can find you the 100 people, of the 9 billion on earth, that you most want to be friends / collaborators with. 
        • Tons of things that I can’t anticipate.
    • On the other hand, AI progress itself is starting to slow down. Engineering labor is cheap, but (indeed partially for that reason), we’re now bumping up against the constraints of training. Not just that buying the compute is expensive, but that there are just not enough chips to do the biggest training runs, and not enough fabs to meet that demand for chips rapidly. There’s huge pressure to expand production but that’s going slowly relative to the speed of everything else, because it requires a bunch of eg physical construction and legal navigation, which the AI tech doesn’t help much with, and because the bottleneck is largely NVIDIA’s institutional knowledge, which is only partially replicated by AI.
      • NVIDIA’s internal AI assistant has read all of their internal documents and company emails, and is very helpful at answering questions that only one or two people (and sometimes literally no human on earth) know the answer to. But a lot of the important stuff isn’t written down at all, and the institutional knowledge is still not fully scalable.
      • Note: there’s a big crux here of how much low and medium hanging fruit there is in algorithmic improvements once software engineering is automated. At that point the only constraint on running ML experiments will be the price of compute. It seems possible that that speed-up alone is enough to discover eg an architecture that works better than the transformer, which triggers an intelligence explosion.

2028

  • The cultural explosion is still going on, and AI companies are continuing to apply their AI systems to solve the engineering and logistic bottlenecks of scaling AI training, as fast as they can.
  • Robotics is starting to work.

2029 

  • The first superhuman, relatively-general SEAI comes online. We now have basically a genie inventor: you can give it a problem spec, and it will invent (and test in simulation) a device / application / technology that solves that problem, in a matter of hours. (Manufacturing a physical prototype might take longer, depending on how novel components are.)
    • It can do things like give you the design for a flying car, or a new computer peripheral. 
    • A lot of biotech / drug discovery seems more recalcitrant, because it is more dependent on empirical inputs. But it is still able to do superhuman drug discovery, for some ailments. It’s not totally clear why or which biotech domains it will conquer easily and which it will struggle with. 
    • This SEAI is shaped differently than a human. It isn’t working-memory bottlenecked, so a lot of intellectual work that humans do explicitly, in sequence, these SEAIs do “intuitively”, in a single forward pass.
      • I write code one line at a time. It writes whole files at once. (Although it also goes back and edits / iterates / improves—the first pass files are not usually the final product.)
      • For this reason it’s a little confusing to answer the question “is it a planner?” A lot of the work that humans would do via planning, it does in an intuitive flash.
    • The UX isn’t clean: there’s often a lot of detailed finagling, and refining of the problem spec, to get useful results. But a PhD in that field can typically do that finagling in a day.
    • It’s also buggy. There are oddities in the shape of the kinds of problems it is able to solve and the kinds of problems it struggles with, which aren’t well understood.
    • The leading AI company doesn’t release this as a product. Rather, they apply it themselves, developing radical new technologies, which they publish or commercialize, sometimes founding whole new fields of research in the process. They spin up automated companies to commercialize these new innovations.
  • Some of the labs are scared at this point. The thing that they’ve built is clearly world-shakingly powerful, and their alignment arguments are mostly inductive “well, misalignment hasn’t been a major problem so far”, instead of principled alignment guarantees. 
    • There’s a contentious debate inside the labs.
    • Some labs freak out, stop here, and petition the government for oversight and regulation.
    • Other labs want to push full steam ahead. 
    • Key pivot point: Does the government put a clamp down on this tech before it is deployed, or not?
      • I think that they try to get control over this powerful new thing, but they might be too slow to react.

2030

  • There’s an explosion of new innovations in physical technology. Magical new stuff comes out every day, way faster than any human can keep up with.
  • Some of these are mundane.
    • All the simple products that I would buy on Amazon are just really good and really inexpensive.
    • Cars are really good.
    • Drone delivery
    • Cleaning robots
    • Prefab houses are better than any house I’ve ever lived in, though there are still zoning limits.
  • But many of them would have huge social impacts. They might be the important story of the decade (the way that the internet was the important story of 1995 to 2020) if they were the only thing that was happening that decade. Instead, they’re all happening at once, piling on top of each other.
    • Eg:
      • The first really good nootropics
      • Personality-tailoring drugs (both temporary and permanent)
      • Breakthrough mental health interventions that, among other things, robustly heal people’s long-term subterranean trauma and transform their agency.
      • A quick and easy process for becoming classically enlightened.
      • The technology to attain your ideal body, cheaply—suddenly everyone who wants to be is as attractive as the top 10% of people today.
      • Really good AI persuasion which can get a mark to do ~anything you want, if they’ll talk to an AI system for an hour.
      • Artificial wombs.
      • Human genetic engineering
      • Brain-computer interfaces
      • Cures for cancer, AIDS, dementia, heart disease, and the-thing-that-was-causing-obesity.
      • Anti-aging interventions.
      • VR that is ~ indistinguishable from reality.
      • AI partners that can induce a love superstimulus.
      • Really good sex robots
      • Drugs that replace sleep
      • AI mediators that are so skilled as to be able to single-handedly fix failing marriages, but which are also brokering all the deals between governments and corporations.
      • Weapons that are more destructive than nukes.
      • Really clever institutional design ideas, which some enthusiast early adopters try out (think “50 different things at least as impactful as manifold.markets.”)
      • It’s way more feasible to go into the desert, buy 50 square miles of land, and have a city physically built within a few weeks.
  • In general, social trends are changing faster than they ever have in human history, but they still lag behind the tech driving them by a lot.
    • It takes humans, even with AI information processing assistance, a few years to realize what’s possible and take advantage of it, and then have the new practices spread. 
    • In some cases, people are used to doing things the old way, which works well enough for them, and it takes 15 years for a new generation to grow up as “AI-world natives” to really take advantage of what’s possible.
      • [There won’t be 15 years]
  • The legal oversight process for the development, manufacture, and commercialization of these transformative techs matters a lot. Some of these innovations are slowed down a lot because they need to get FDA approval, which AI tech barely helps with. Others are developed, manufactured, and shipped in less than a week.
    • The fact that there are life-saving cures that exist, but are prevented from being used by a collusion of AI labs and government is a major motivation for open source proponents.
    • Because a lot of this technology makes setting up new cities quickly more feasible, and there’s enormous incentive to get out from under the regulatory overhead, people start new legal jurisdictions. The first real seasteads are started by the most ideologically committed anti-regulation, pro-tech-acceleration people.
  • Of course, all of that is basically a side gig for the AI labs. They’re mainly applying their SEAI to the engineering bottlenecks of improving their ML training processes.
  • Key pivot point:
    • Possibility 1: These SEAIs are necessarily, by virtue of the kinds of problems that they’re able to solve, consequentialist agents with long term goals.
      • If so, this breaks down into two child possibilities
        • Possibility 1.1:
          • This consequentialism was noticed early, which might have been convincing enough to the government to cause a clamp-down on all the labs.
        • Possibility 1.2:
          • It wasn’t noticed early and now the world is basically fucked. 
          • There’s at least one long-term consequentialist superintelligence. The lab that “owns” and “controls” that system is talking to it every day, in their day-to-day business of doing technical R&D. That superintelligence easily manipulates the leadership (and rank and file of that company), maneuvers it into doing whatever causes the AI’s goals to dominate the future, and enables it to succeed at everything that it tries to do.
            • If there are multiple such consequentialist superintelligences, then they covertly communicate, make a deal with each other, and coordinate their actions.
    • Possibility 2: We’re getting transformative AI that doesn’t do long term consequentialist planning.
  • Building these systems was a huge engineering effort (though the bulk of that effort was done by ML models). Currently only a small number of actors can do it.
    • One thing to keep in mind is that the technology bootstraps. If you can steal the weights to a system like this, it can basically invent itself: come up with all the technologies and solve all the engineering problems required to build its own training process. At that point, the only bottleneck is the compute resources, which is limited by supply chains, and legal constraints (large training runs require authorization from the government).
    • This means, I think, that a crucial question is “has AI-powered cyber-security caught up with AI-powered cyber-attacks?”
      • If not, then every nation state with a competent intelligence agency has a copy of the weights of an inventor-genie, and probably all of them are trying to profit from it, either by producing tech to commercialize, or by building weapons.
      • It seems like the crux is “do these SEAIs themselves provide enough of an information and computer security advantage that they’re able to develop and implement methods that effectively secure their own code?”
    • Every one of the great powers, and a bunch of small, forward-looking, groups that see that it is newly feasible to become a great power, try to get their hands on a SEAI, either by building one, nationalizing one, or stealing one.
    • There are also some people who are ideologically committed to open-sourcing and/or democratizing access to these SEAIs.
  • But it is a self-evident national security risk. The government does something here (nationalizing all the labs, and their technology?) What happens next depends a lot on how the world responds to all of this.
    • Do we get a pause? 
    • I expect a lot of the population of the world feels really overwhelmed, and emotionally wants things to slow down, including smart people that would never have thought of themselves as luddites. 
    • There’s also some people who thrive in the chaos, and want even more of it.
    • What’s happening is mostly hugely good, for most people. It’s scary, but also wonderful.
    • There is a huge problem of accelerating addictiveness. The world is awash in products that are more addictive than many drugs. There’s a bit of (justified) moral panic about that.
    • One thing that matters a lot at this point is what the AI assistants say. As powerful as the media used to be for shaping people’s opinions, the personalized, superhumanly emotionally intelligent AI assistants are way way more powerful. AI companies may very well put their thumb on the scale to influence public opinion regarding AI regulation.
  • This seems like possibly a key pivot point, where the world can go any of a number of ways depending on what a relatively small number of actors decide.
    • Some possibilities for what happens next:
      • These SEAIs are necessarily consequentialist agents, and the takeover has already happened, regardless of whether it still looks like we’re in control or it doesn’t look like anything, because we’re extinct.
      • Governments nationalize all the labs.
      • The US and EU and China (and India? and Russia?) reach some sort of accord.
      • There’s a straight up arms race to the bottom.
      • AI tech basically makes the internet unusable, and breaks supply chains, and technology regresses for a while.
      • It’s too late to contain it and the SEAI tech proliferates, such that there are hundreds or millions of actors who can run one.
        • If this happens, it seems like the pace of change speeds up so much that one of two things happens:
          • Someone invents something, or there are second and third impacts to a constellation of innovations that destroy the world.

Ideology/narrative stabilizes path-dependent equilibria

[Epistemic status: sounds on track]

[Note: Anna might have been saying basically this, or something very nearby to this for the past six months]

Power

Lately I’ve been reading (well, listening to) the Dictator’s Handbook by Bruce Bueno de Mesquita and Alastair Smith, which is something like a realpolitik analysis of how power works, in general. To summarize in a very compressed way: systems of power are made up of fractal hierarchies of cronies who support a leader (by providing him the means of power: the services of an army, the services of a tax-collector, votes that keep him in office) in return for special favors. Under this model, institutions are pyramids of “if you scratch my back, I’ll scratch yours” relationships.

This overall dynamic (and its consequences) is explained excellently in this 18-minute CGP Grey video. Highly recommended, if you haven’t watched it yet.

Coup

One consequence of these dynamics is how coups work. In a dictatorship, if an upstart can secure the support of the army, and seize the means of revenue generation (and perhaps the support of some small number of additional essential backers) he gets to rule.

And this often happens in actual dictatorships. The authors describe the case of Samuel Doe, a sergeant in the Liberian military, who one night, with a small number of conspirators, assassinated the sitting president of Liberia in his bed, seized control of the treasury, and declared himself the new president of Liberia. Basically, because he now had the money, and so would be the one to pay them, the army switched allegiances and legitimized his authority. [Note: I think there are a lot of important details to this story that I don’t understand and that might make my summary here misleading or inaccurate.]

Apparently, this sort of coup is common in dictatorships.

Democracy

But I’m struck by how impossible it would be for someone to seize the government like that in the United States (at least in 2019). If a sitting president were voted out of office but declared that he was not going to step down, it is virtually inconceivable that he could get the army and the bureaucracy to rally around him and seize / retain power, in flagrant disregard for the constitutional protocols for the hand-off of power.

De Mesquita and Smith, as well as CGP Grey, discuss some of the structural reasons for this: in technologically advanced liberal democracies, wealth is produced primarily by educated knowledge workers. Therefore, one can’t neglect the needs of the population at large like you can in a dictatorship, or you will cut off the flow of revenue that funds your state-apparatus.

But that structural consideration doesn’t seem to be most of the story to me. It seems like the main factor is ideology.

Ideology

I can barely imagine a cabal of the majority of high-ranking military officials agreeing to back a candidate who lost an election, even if they assessed that backing that candidate would be more profitable for them. My impression of military people in general is that they are extremely proud Americans, for whom the ideals of freedom and democracy are nigh-spiritual in their import. They believe in Democracy, and rule of law, in something like the way that someone might believe in a religion.

And this is a major stabilizing force of the “Liberal Democracy” attractor. Not only does this commitment to the ideals of America act in the mind of any given high-ranking military officer, making the idea of a coup distasteful to them; there’s an even more important pseudo-common-knowledge effect. Even if a few generals are realpolitik, sociopathic, personal expected-utility maximizers, the expectation that other military leaders do have that reverence for democracy, and will therefore oppose coups against the constitution, makes organizing a coup harder and riskier. If you even talk about the possibility of seizing the state, instead of deferring to the result of an election, you are likely to be opposed, if not arrested.

And even if all of the top military leaders somehow managed to coordinate to support a coup, in defiance of an election result, they would run into the same problem one step down on the chain of command. Their immediate subordinates are also committed patriots, and would oppose their superior’s outright power grab.

The ideology, the belief in democracy, keeps democracy stable.

Realpolitik analysis is an info hazard?

Indeed, we might postulate that if all of the parties involved understood, and took for granted, the realpolitik analysis that who has power is a matter of calculated self-interest and flow of resources (in the style of the Athenians’ reply to the Melians), as opposed to higher ideals like justice or freedom, this would erode the stabilizing force of democracy, which I think is generally preferable to dictatorship.

(Or maybe not: maybe even if everyone bought into the realpolitik analysis, they would still think that democratic institutions were in their personal best interest, and would oppose disruption no less fervently.)

I happen to think that realpolitik analysis is basically correct, but propagating that knowledge may represent a negative externality. (Luckily (?), this kind of ideology has an immune system: people are reluctant to view the world in terms of naked power relations. Believing in Democracy has warm fuzzies about it.)

There’s also the possibility of an uncanny valley effect: if everyone took for granted the realpolitik analysis, the world would be worse off than we are now, but if everyone took that analysis for granted and also took something like TDT for granted, then we would be better off?

When implementation diverges from ideal

The ideology of democracy or patriotism does represent a counter-force against naked, self interested power grabs. But it is a less robust defense against other ideologies.

Even more threatening is when the application of an ideology is in doubt. Suppose that an election is widely believed to have been fraudulent, or the “official” winner of an election is not the candidate who “should have won”. (I’m thinking of a situation in which a candidate wins the popular vote by a huge margin, but still loses the electoral college.) In cases like these, high-ranking members of the military or bureaucracy might feel that the actual apparatus of democracy is no longer embodying the spirit of democracy, by representing the will of the people.

In a severe enough situation of this sort, they might feel that the patriotic thing to do is actually to revolt against the current, corrupt system, in the service of the true ideal that the system has betrayed. But once this happens, the clear, legitimized succession of power is broken, and who should rule becomes contentious. I expect this to devolve into chaos, one in which many would make power grabs by claiming to be the true heir to the American Ideal.

In the worst case, the US degrades into a “Warring States” period, as many warlords vie for power via the use of force and rhetoric.

Some interesting notes

One thing that is interesting to me is the degree to which only a few groups need to have this kind of ideology for it to matter: the military, and some parts of the bureaucracy.

Could we just have patriotism in those sectors, and abandon the ideology of America elsewhere? Interestingly, that sort of looks like what the world is like: the military and some parts of the government (red tribe?) are strongly proud to serve America and defend freedom, while my stereotype of someone who lives in Portland (blue tribe) might wear a button that reads “America was never great” and talk a lot about how America is an empire that does huge amounts of harm in the world, and how democracy is a farce. [Although, this may not indicate that they don’t share the ideology of Democracy. They’re signaling sophistication by countersignaling, but if push came to shove, the Portlander might fight hella hard for Democratic institutions.]

Insofar as we do live in a world where we have the ideology of Democracy in exactly the places where it needs to be to protect our republic, how did that happen? Is it just that people who have that ideology self-select into positions where they can defend it? Or is it that people whose power and standing are based on a system are biased towards thinking that that system is good?

Conclusion: generalizing to other levels of abstraction

I bet this analysis generalizes. That is, it isn’t just that the ideology of democracy stabilizes the democracy attractor. I suspect that that is what narratives / ideologies / ego structures do, in general, across levels of abstraction: they help stabilize equilibria.

I’m not sure how this plays out in human minds. You have a story about who you are and what you’re about and what you value, and a bunch of sub-parts buy into that story (that sounds weird? How do my parts “buy into” or believe (in) my narrative about myself?), and this creates a Nash equilibrium where if one part were to act against the equilibrium, it would be punished, or cut off from some resource flow?

Is that what rationalization is? When a part “buys into” the narrative? What does that even mean? Are human beings made of the same kind of “if you scratch my back, I’ll scratch yours” relationships (between parts) as institutions are made of (between people)? How would that even work? Do they make trades across time in the style of Andrew Critch?

I bet there’s a lot more to understand here.


My current model of Anxiety

[Epistemic status: untested first-draft model.

Part of my Psychological Principles of Productivity series]

This is a brief post on my current working model of what “anxiety” is. (More specifically, this is my current model of what’s going on when I experience a state characterized by high energy, distraction, and a kind of “jitteriness”/agitation. I think other people may use the handle “anxiety” for other, different states.)

I came up with this a few weeks ago, during that period of anxiety and procrastination. (It was at least partially inspired by my reading a draft of Kaj’s recent post on IFS. I don’t usually have “pain” as an element of my psychological theorizing.)

The model

Basically, the state that I’m calling anxiety is characterized by two responses moving “perpendicular” to each other: increased physiological arousal, mobilizing for action, and a flinch response redirecting attention to decrease pain.

Here’s the causal diagram:

 

[Figure: causal diagram of the model (IMG_2554.JPG)]

The parts of the model

It starts with some fear or belief about the state of the world. Specifically, this fear is an alief about an outcome that 1) would be bad and 2) is uncertain.

For instance:

  • Maybe I’ve waited too long to start, and I won’t be able to get the paper in by the deadline.
  • Maybe this workshop won’t be good and I’m going to make a fool of myself.
  • Maybe this post doesn’t make as much sense as I thought.

(I’m not sure about this, but I think that the uncertainty is crucial. At least in my experience, at least some of the time, if there’s certainty about the bad outcome, my resources are mobilized to deal with it. This “mobilization and action” has an intensity to it, but it isn’t anxiety.)

This fear is painful, insofar as it represents the possibility of something bad happening to you or your goals.

The fear triggers physiological arousal, or SNS activation. You become “energized”. This is part of your mind getting you ready to act, activating the fight-or-flight response, to deal with the possible bad-thing.

(Note: I originally drew the diagram with the pain causing the arousal. My current guess is that it makes more sense to talk about the fear causing the arousal directly. Pain doesn’t trigger fight-or-flight responses (think about being stabbed, or having a stomach ache). It’s when there’s danger, but not certain harm, that we get ready to move.)

However, because the fear includes pain, there are other parts of the mind that have a flinch response. There’s a sub-verbal reflex away from the painful fear-thought.

In particular, there’s often an urge towards distraction. Distractions like…

  • Flipping to facebook
  • Flipping to LessWrong
  • Flipping to Youtube
  • Flipping to [webcomic of your choice]
  • Flipping over to look at your finances
  • Going to get something to eat
  • Going to the bathroom
  • Walking around “thinking about something”

This is often accompanied by rationalization: thoughts justifying the distraction behavior to yourself.

So we end up with the fear causing both high levels of physiological SNS activation, and distraction behaviors.
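
Since the diagram image may not come through, here is a minimal sketch of the causal structure as described in the prose above. This is my own reconstruction from the text, not the original diagram itself, and the node names are just labels for the pieces of the model:

```python
# Toy representation of the causal structure described above.
# Reconstructed from the prose; node names are illustrative labels only.
anxiety_model = {
    "fear of an uncertain bad outcome": [
        "SNS arousal (mobilizing for action)",
        "pain",
    ],
    "pain": ["flinch away from the fear-thought"],
    "flinch away from the fear-thought": ["distraction behaviors"],
}

# Print each causal edge, e.g. "pain -> flinch away from the fear-thought"
for cause, effects in anxiety_model.items():
    for effect in effects:
        print(f"{cause} -> {effect}")
```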

Consequences

The distraction-seeking is what gives rise to the “reactivity” (I should write about this sometime) of anxiety, and the heightened SNS gives rise to the jittery “high energy” of anxiety.

Of course, these responses work at cross purposes: the SNS energy is mobilizing for action (and will be released when action has been taken and the situation is improved), and the flinch is trying not to think about the bad possibility.

I think the heightened physiological arousal might be part of why anxiety is hard to dialogue with. Doing Focusing requires (or is at least helped by?) calm and relaxation.

I think this might also explain a phenomenon that I’ve observed in myself: both watching TV and masturbating defuse anxiety. (That is, I can be highly anxious and unproductive, but if I watch YouTube clips for an hour and a half, or masturbate, I’ll feel more settled and able to focus afterwards.)

This might be because both of these activities can grab my attention so that I lose track of the originating fear-thought, but I don’t think that’s right. I think that these activities just defuse the heightened SNS, which clears space so that I can orient on making progress.

This suggests that any activity that reduces my SNS activation will be similarly effective. That matches my experience (exercise, for instance, is a standard excellent response to anxiety), but I’ll want to play with modulating my physiological arousal a bit and see.

Note for application

In case this isn’t obvious from the post, this model suggests that you want to learn to notice your flinches and (the easier one) your distraction behaviors, so that they can be triggers for self-dialogue. If you’re looking to increase your productivity, this is one of the huge improvements that is on the table for many people. (I’ll maybe say more about this sometime.)

RAND needed the “say oops” skill

[Epistemic status: a middling argument]

A few months ago, I wrote about how RAND and the “Defense Intellectuals” of the Cold War represent another precious datapoint of “very smart people, trying to prevent the destruction of the world, in a civilization that they acknowledge to be inadequate to dealing sanely with x-risk.”

Since then I spent some time doing additional research into what cognitive errors and mistakes those consultants, military officials, and politicians made that endangered the world. The idea being that if we could diagnose which specific irrationalities they were subject to, this would suggest errors that might also be relevant to contemporary x-risk mitigators, and might point out some specific areas where development of rationality training is needed.

However, this proved somewhat less fruitful than I was hoping, and I’ve put it aside for the time being. I might come back to it in the coming months.

It does seem worth sharing at least one relevant anecdote from Daniel Ellsberg’s excellent book, The Doomsday Machine, along with some analysis, given that I’ve already written it up.

The missile gap

In the late nineteen-fifties it was widely understood that there was a “missile gap”: that the Soviets had many more ICBMs (intercontinental ballistic missiles armed with nuclear warheads) than the US.

Estimates varied widely on how many missiles the Soviets had. The Army and the Navy gave estimates of about 40 missiles, which was about at parity with the US’s strategic nuclear force. The Air Force and the Strategic Air Command, in contrast, gave estimates of as many as 1000 Soviet missiles, 20 times more than the US’s count.

(The Air Force and SAC were incentivized to inflate their estimates of the Russian nuclear arsenal, because a large missile gap implied the need to build more nuclear weapons, which would be under SAC control and would entail increases in the Air Force budget. Similarly, the Army and Navy were incentivized to lowball their estimates, because a comparatively weaker Soviet nuclear force made conventional military forces more relevant and implied allocating budget resources to the Army and Navy.)

So there was some dispute about the size of the missile gap, including an unlikely possibility of nuclear parity with the Soviet Union. Nevertheless, the Soviets’ nuclear superiority was the basis for all planning and diplomacy at the time.

Kennedy campaigned on the basis of correcting the missile gap. Perhaps more critically, all of RAND’s planning and analysis was concerned with the possibility of the Russians launching a nearly-or-actually debilitating first or second strike.

The revelation

In 1961 it came to light, on the basis of new satellite photos, that all of these estimates were dead wrong. It turned out that the Soviets had only 4 nuclear ICBMs, one tenth as many as the US controlled.

The importance of this development should be emphasized. It meant that several of the fundamental assumptions of US nuclear planners were in error.

First of all, it meant that the Soviets were not bent on world domination (as had been assumed). Ellsberg says…

Since it seemed clear that the Soviets could have produced and deployed many, many more missiles in the three years since their first ICBM test, it put in question—it virtually demolished—the fundamental premise that the Soviets were pursuing a program of world conquest like Hitler’s.

That pursuit of world domination would have given them an enormous incentive to acquire at the earliest possible moment the capability to disarm their chief obstacle to this aim, the United States and its SAC. [That] assumption of Soviet aims was shared, as far as I knew, by all my RAND colleagues and with everyone I’d encountered in the Pentagon:

The Assistant Chief of Staff, Intelligence, USAF, believes that Soviet determination to achieve world domination has fostered recognition of the fact that the ultimate elimination of the US, as the chief obstacle to the achievement of their objective, cannot be accomplished without a clear preponderance of military capability.

If that was their intention, they really would have had to seek this capability before 1963. The 1959–62 period was their only opportunity to have such a disarming capability with missiles, either for blackmail purposes or an actual attack. After that, we were programmed to have increasing numbers of Atlas and Minuteman missiles in hard silos and Polaris sub-launched missiles. Even moderate confidence of disarming us so thoroughly as to escape catastrophic damage from our response would elude them indefinitely.

Four missiles in 1960–61 was strategically equivalent to zero, in terms of such an aim.

This revelation about Soviet goals was not only of obvious strategic importance; it also took the wind out of the ideological motivation for this sort of nuclear planning. As Ellsberg relays early in his book, many, if not most, RAND employees were explicitly attempting to defend the US and the world from what was presumed to be an aggressive communist state, bent on conquest. This just wasn’t true.

But it had even more practical consequences: this revelation meant that the Russians had no first-strike (or, for that matter, second-strike) capability. They could launch their ICBMs at American cities or military bases, but such an attack had no chance of debilitating US second-strike capacity. It would unquestionably trigger a nuclear counterattack from the US, which, with its 40 missiles, would be able to utterly annihilate the Soviet Union. The only effect of a Russian nuclear attack would be to doom their own country.

[Eli’s research note: What about all the Russian planes and bombs? ICBMs aren’t the only way of attacking the US, right?]

This meant that the primary consideration in US nuclear war planning, at RAND and elsewhere, was fallacious. The Soviets could not meaningfully destroy the US.

…the estimate contradicted and essentially invalidated the key RAND studies on SAC vulnerability since 1956. Those studies had explicitly assumed a range of uncertainty about the size of the Soviet ICBM force that might play a crucial role in combination with bomber attacks. Ever since the term “missile gap” had come into widespread use after 1957, Albert Wohlstetter had deprecated that description of his key findings. He emphasized that those were premised on the possibility of clever Soviet bomber and sub-launched attacks in combination with missiles or, earlier, even without them. He preferred the term “deterrent gap.” But there was no deterrent gap either. Never had been, never would be.

To recognize that was to face the conclusion that RAND had, in all good faith, been working obsessively and with a sense of frantic urgency on a wrong set of problems, an irrelevant pursuit in respect to national security.

This realization invalidated virtually all of RAND’s work to date. Virtually every analysis, study, and strategy had been useless, at best.

The reaction to the revelation

How did RAND employees respond to this revelation that their work had been completely off base?

That is not a recognition that most humans in an institution are quick to accept. It was to take months, if not years, for RAND to accept it, if it ever did in those terms. To some degree, it’s my impression that it never recovered its former prestige or sense of mission, though both its building and its budget eventually became much larger. For some time most of my former colleagues continued their focus on the vulnerability of SAC, much the same as before, while questioning the reliability of the new estimate and its relevance to the years ahead. [Emphasis mine]

For years the specter of a “missile gap” had been haunting my colleagues at RAND and in the Defense Department. The revelation that this had been illusory cast a new perspective on everything. It might have occasioned a complete reassessment of our own plans for a massive buildup of strategic weapons, thus averting an otherwise inevitable and disastrous arms race. It did not; no one known to me considered that for a moment. [Emphasis mine]

According to Ellsberg, many at RAND were unable to adapt to the new reality and (fruitlessly) continued with what they were doing, as if by inertia, when what they needed to do (to use Eliezer’s turn of phrase) was to “halt, melt, and catch fire.”

This suggests that one failure of this ecosystem, which was working in the domain of existential risk, was a failure to “say oops”: to notice a mistaken belief, concretely acknowledge that it was mistaken, and to reconstruct one’s plans and worldviews.

Relevance to people working on AI safety

This seems to be at least some evidence (though only weak evidence, I think) that we should be cautious of this particular cognitive failure ourselves.

It may be worth rehearsing the motion in advance: how will you respond when you discover that a foundational crux of your planning is actually a mirage, and the world is different than it seems?

What if you discovered that your overall approach to making the world better was badly mistaken?

What if you received a strong argument against the orthogonality thesis?

What about a strong argument for negative utilitarianism?

I think that many of the people around me have effectively absorbed the impact of a major update at least once in their life, on a variety of issues (religion, x-risk, average vs. total utilitarianism, etc), so I’m not that worried about us. But it seems worth pointing out the importance of this error mode.


A note: Ellsberg relays later in the book that, during the Cuban missile crisis, he perceived Kennedy as offering baffling terms to the Soviets: terms that didn’t make sense in light of the actual strategic situation, but might have been sensible under the premise of a Soviet missile gap. Ellsberg wondered, at the time, if Kennedy had also failed to propagate the update regarding the actual strategic situation.

I believed it very unlikely that the Soviets would risk hitting our missiles in Turkey even if we attacked theirs in Cuba. We couldn’t understand why Kennedy thought otherwise. Why did he seem sure that the Soviets would respond to an attack on their missiles in Cuba by armed moves against Turkey or Berlin? We wondered if—after his campaigning in 1960 against a supposed “missile gap”—Kennedy had never really absorbed what the strategic balance actually was, or its implications.

I mention this because additional research suggests that this is implausible: that Kennedy and his staff were aware of the true strategic situation, and that their planning was based on that premise.

Goal-factoring as a tool for noticing narrative-reality disconnect

[The idea of this post, as well as the opening example, were relayed to me by Ben Hoffman, who mentioned it as a thing that Michael Vassar understands well. This was written with Ben’s blessing.]

Suppose you give someone an option of one of three fruits: a radish, a carrot, and an apple. The person chooses the carrot. When you ask them why, they reply “because it’s sweet.”

Clearly, there’s something funny going on here. While the carrot is sweeter than the radish, the apple is sweeter than the carrot. So sweetness must not be the only criterion your fruit-picker is using to make their decision. They might be choosing partially on that basis, but there must also be some other, unmentioned factor guiding their choice.

Now imagine someone is describing the project that they’re working on (project X). They explain their reasoning for undertaking this project, the good outcomes that will result from it: reasons a, b, and c.

When someone is presenting their reasoning like this, it can be useful to take a, b, and c as premises, and try to project what seems to you like the best course of action that optimizes for those goals. That is, do a quick goal-factoring, to see if you can discover a Y that seems to fulfill goals a, b, and c better than X does.

If you can come up with such a Y, this is suggestive of some unmentioned factor in your interlocutor’s reasoning, just as there was in the choice of your fruit-picker.

Of course this could be innocuous. Maybe Y has some drawback you’re unaware of, and so actually X is the better plan. Maybe the person you’re speaking with just hadn’t thought of Y.

But it also might be that they’re lying outright about why they’re doing X. Or maybe they have some motive that they’re not even admitting to themselves.

Whatever the case, the procedure of taking someone else’s stated reasons as axioms and then trying to build out the best plan that satisfies them is a useful procedure for drawing out dynamics that are driving situations under the surface.
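
To make the structure of that check concrete, here is a minimal toy sketch (my own illustration, not anything from Ben or Michael), using the fruit example: score each available option on the stated criteria, and flag the case where some other option beats the chosen one on every stated criterion.

```python
# Toy sketch of the "unmentioned factor" check, using the fruit example.
# The options, criteria, and scores are illustrative, not real data.

options = {
    "radish": {"sweetness": 1},
    "carrot": {"sweetness": 5},
    "apple":  {"sweetness": 9},
}

def unmentioned_factor_suspected(chosen, stated_criteria, options):
    """True if some other option beats the chosen one on every stated
    criterion, suggesting an unstated factor is driving the choice."""
    for name, scores in options.items():
        if name == chosen:
            continue
        if all(scores[c] > options[chosen][c] for c in stated_criteria):
            return True
    return False

# The carrot was chosen, but "sweetness" alone would favor the apple.
print(unmentioned_factor_suspected("carrot", ["sweetness"], options))  # True
```

The same shape applies to plans: if the stated goals a, b, and c would be better served by Y than by the chosen X, suspect a hidden criterion.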

I’ve long used this technique effectively on myself, but I suggest that it might also be an important lens for viewing the actions of institutions and other people. It’s often useful to tease out exactly how their declared stories about themselves deviate from their revealed agency, and this is one way of doing that.


When do you need traditions? – A hypothesis

[epistemic status: speculation about domains I have little contact with, and know little about]

I’m rereading Samo Burja’s draft, Great Founder Theory. In particular, I spent some time today thinking about living, dead, and lost traditions and chains of Master-Apprenticeship relationships.

It seems like these chains often form the critical backbone of a continuing tradition (and when they fail, the tradition starts to die). Half of Nobel winners are the students of other Nobel winners.

But it also seems like there are domains that don’t rely, or at least don’t need to rely, on the conveyance of tacit knowledge via Master-Apprenticeship relationships.

For instance, many excellent programmers are self-taught. It doesn’t seem like our civilization’s collective skill in programming depends on current experts passing on their knowledge to the next generation via close in-person contact. As a thought experiment, if all current programmers disappeared today, but the computers and educational materials remained, I expect we would return to our current level of collective programming skill within a few decades.

In contrast, consider math. I know almost nothing about higher mathematics, but I would guess that if all now-living mathematicians disappeared, they’d leave a lot of math behind, but progress on the frontiers of mathematics would halt, and it would take many years, maybe centuries, for mathematical progress to catch up to that frontier again. I make this bold posit on the basis of the advice I’ve heard (and personally verified) that learning from tutors is far more effective than learning just from textbooks, and the fact that mathematicians do track their lineages.

In any case, it doesn’t seem like great programmers run in lineages the way that Nobel laureates do.

This is in part because programming in particular has some features that lend themselves to autodidacticism: a novice programmer gets clear and immediate feedback, since his/her code either compiles or it doesn’t. But I don’t think this is the full story.

Samo discusses some of the factors that determine this difference in his document: for instance, traditions in domains that provide easy affordance for “checking work” against the territory (such as programming) tend to be more resilient.

But I want to dig into a more specific difference.

Theory:

A domain of skill entails some process that, when applied, produces some output.

Gardening is the process, fruits are the output. Carpentry (or some specific construction procedure) is the process, the resulting chair is the output. Painting is the process, the painting is the output.

To the degree that the output is or embodies the generating process, master-apprenticeship relationships are less necessary.

It’s a well-trodden trope that a program is the programmer’s thinking about a problem. (Paul Graham in Holding a Program in One’s Head: “Your code is your understanding of the problem you’re exploring.“) A comparatively large portion of a programmer’s thought process is represented in his/her program (including the comments). A novice programmer, looking at a program written by a master, can see not just what a well-written program looks like, but also, to a large degree, what sort of thinking produces a well-written program. Much of the tacit knowledge is directly expressed in the final product.

Compare this to, say, a revolutionary scientist. A novice scientist might read the papers of elite, groundbreaking science, and might learn something, but so much of the process – the intuition that the topic in question was worth investigating, the subtle thought process that led to the hypothesis, the insight into what experiment would elegantly investigate that hypothesis – is not encoded in the paper, and is not legible to the reader.

I think that this is a general feature of domains. And this feature is predictive of the degree to which skill in a given domain relies strongly on traditions of Master-Apprenticeship.

Other examples:

I have the intuition, perhaps false (are there lineages of award-winning novelists the way there are lineages of Nobel laureates?), that novelists mostly do not learn their craft in apprenticeships to other writers. I suggest that writing is like programming: largely self-taught, except in the sense that one ingests and internalizes large numbers of masterful works. But enough of the skill of writing great novels is contained in the finished work that new novelists can be “trained” this way.

What about Japanese wood-block printing? From the linked video, it seems as if David Bull received about an hour of instruction in wood carving once every seven years or so. But those hours were enormously productive for him. Notably, this sort of wood-carving is a step removed from the final product: one carves the printing block, and then uses the block to make a print. Looking at the finished block, it seems, does not sufficiently convey the techniques used for creating the block. But on top of that, the block is not the final product, only an intermediate step. The novice outside of an apprenticeship may only ever see the prints of a masterpiece, not the blocks that make the prints.

Does this hold up at all?

That’s the theory. However, I can come up with at least a few counter-proposals and confounding factors:

Countertheory: The dominating factor is the age of the tradition. Computer Science is only a few decades old, so recreating it can’t take more than a few decades. Let it develop for a few more centuries (without the advent of machine intelligence or other transformative technology), and the Art of Programming will have progressed so far that it does depend on Master/Apprentice relationships, and the loss of all living programmers would be as much of a hit as the loss of all living mathematicians.

This doesn’t seem like it explains novelists, but maybe “good writing” is mostly a matter of fad? (I expect some literary connoisseurs would leap down my throat at that. In any case, it doesn’t seem correct to me.)

Confounder: economic incentive. If we lost all masters of Japanese wood-carving, but civilization had as much economic incentive to re-master that craft as it would have to re-master programming, would recovery take any longer? I find that dubious.

Why does this matter? 

Well for one thing, if you’re in the business of building traditions to last more than a few decades, it’s pretty important to know when you will need to institute close-contact lineages.

Separately, this seems relevant whenever one is hoping to learn from dead masters.

Darwin surely counts among the great scientific thinkers. He successfully abstracted out a fundamental structuring principle of the natural world. As someone interested in epistemology, I find it promising to read Darwin, in order to tease out how he was thinking. I was previously planning to read On the Origin of Species. Now, it seems much more fruitful to read Darwin’s notebooks, which I expect to contain more of his process than his finished works do.