Reflections: Three categories of capital

There are three categories of capital that one can invest in.

Knowledge, skill, experience

This includes what you know and what you know how to do.

But it also includes "experience": the kinds of tacit background knowledge that you only learn by interacting with some subpart of the world, and not just reading about it. Often, just having seen how something was done in some related context is more useful than any specific "skill" that you can learn on purpose. (For instance, some of the principles that go into developing and running a world-class workshop series are directly transferable to developing public advocacy materials, or having participated in making movies gives one a template for coordinating teams of contractors to get a job done.)

Reputation and connections

The application of many skills depends on access to the contexts where those skills are relevant. As a friend of mine says, "It's not what you know, or even who you know, it's who you know who knows what you know."

Throughout most of my life, I tended to emphasize the value of skills, and didn’t think much at all about reputation or connections. This undercut my impact, and left me less powerful today than I might have been.

I've invested in skills that can help make teams much more effective, but many of those skills are not carved up very well by standard roles or job descriptions (for instance, "conversational facilitation", "effective communication", and "knowing the importance of getting feedback, for real"). People who have worked with me know that I bring that value to the table. But most people who I might be able to provide value to don't even know that they're missing anything, much less what it is, much less that I can provide it.

Plus, relationships really are powerful for solving problems. The scale of the network of people who know and trust you is proportional to your ability to solve some types of problems.

If I move on from Palisade, one thing that I think I should invest in is my semi-public reputation. (Possibly I should write a blog that is optimized for readers, instead of writing for myself and also posting it on the internet.)

Financial capital

Having money is useful for doing stuff. You need a certain threshold of money for financial independence, and spending money can enable or accelerate the accumulation of the other kinds of capital.

Lessons from and musings about Polytopia

Over the past 6 months I've played about 100 hours of the 4X game "The Battle of Polytopia," mostly on "crazy" difficulty, against 3 or 4 other tribes.

This is more than I’ve played any video game since I was 15 or so. I wanted to write up some of my thoughts, especially those that generalize.

Momentum

Polytopia is a game of momentum or compounding advantage. If I'm in the lead by turn 10 or so, I know that I am basically guaranteed to win eventually. [Edit 2024-09-12: after playing for another 30 hours, and focusing on the early game, I can now almost always win eventually, regardless of whether I have an early lead]. I can turn a current resource advantage into an overwhelming military advantage against a particular city, and, by seizing that city, get more of a lead. After another 10 turns, I'll have compounded that until my tribe is an inexorable force moving across the square.

I think the number one thing that I took away from this game is the feeling of compounding momentum. Life should have that flavor. 

And, in particular, the compounding loops through the world. In Polytopia, you generally want to spend down your resources to 0 or close to 0, every turn, unless there's a specific thing that you're aiming to buy that requires more than one marginal turn of resources. "Saving up" is usually going to be a losing proposition, because the return on investment of seizing a city or building the population of an existing city sooner is exponential.

This also generalizes. I’m very financially conservative, by nature. I tend to earn money and save it / invest it. There’s a kind of compounding to that, but it isn’t active. There’s a different attitude one could have, where they’re investing in the market some amount every year, and putting aside some money for emergencies, but most of their investment loops through the world. Every year, they spend down most of the money they own, and invest it in ways to go faster.
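To make that concrete, here's a toy sketch in Python. The per-turn rates, starting amount, and 20-turn horizon are all numbers I made up for illustration; the only point is how compounding magnifies a difference in reinvestment rate.

    # Toy model: passive saving vs. reinvesting "through the world" each turn.
    # The rates (5% vs. 25% per turn) and the horizon are invented numbers.

    def grow(turns: int, rate_per_turn: float, start: float = 10.0) -> float:
        """Resources after compounding at `rate_per_turn` for `turns` turns."""
        resources = start
        for _ in range(turns):
            resources *= 1 + rate_per_turn
        return resources

    passive = grow(turns=20, rate_per_turn=0.05)  # "save it / invest it"
    active = grow(turns=20, rate_per_turn=0.25)   # "spend it down to go faster"
    print(f"passive: {passive:.1f}, active: {active:.1f}")  # ~26.5 vs ~867.4

The gap after 20 turns (roughly 30x, with these invented rates) is the "inexorable force" feeling: a modest difference in per-turn reinvestment rate eventually dominates everything else.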

I think most people, in practice, don't do this very well: they spend their salary on a nice apartment, instead of on the cheapest apartment they can afford plus tutoring, personal assistants, and plane flights to one-on-one longshot-bet meetings. At a personal (rather than organizational) level, I think the returns to spending additional money saturate fast, and after that point you get better returns investing in the market. But I think there might be something analogous to the "spend down your whole budget, every turn" heuristic.

I’ve thought in the past that I should try aggressively spending more money. But maybe I should really commit to trying it. I have a full time salary for the first time in my life. Maybe this year, I should experiment with trying to find ways to spend 90% of that salary (not my investment returns, which I’ll reinvest), and see what the returns to that are. 

This overall dynamic of compounding advantages makes it more concerning that I, personally, haven't built up much of an advantage yet. Mediocre accumulation of money, connections, and skills seems "medium" good. But because of the exponential, mediocre is actually quite far down a power law. This prompts me to reflect on what I can do this year to compound my existing resources (particularly with regard to personal connections, since I realized late in life that who you know who knows what you can do is a constraint on what you can do).

Geography

Because of this momentum effect, the geography of the square dominates all other considerations in the question of which tribe will eventually win. In particular, if I start out isolated, far from any other tribes, with several easily accessible settlements available to convert, winning is going to be easy. I can spend the first ten turns capturing those settlements and building up their population without needing to expend resources on military units for defense.

In contrast, if I start out sandwiched between two other tribes, with all settlements that are within my reach also within theirs, the struggle is apt to be brutal. It's still possible to win starting from this position: the key is to seize and build up at least two cities, and create enough units defensively that the other tribes attack each other instead of you. (That's one thing that I learned: you sometimes need to build units to stand near your cities, even when you're not planning an attack, because a density of units discourages other tribes from attacking you in the first place.)

From there, depending on how dire the straits are, and whether I'm on the coast, I'll want to either 

  1. Train defenders to garrison my cities, and then quickly send out scouts to convert nonaffiliated settlements, and build up an advantage that way, or
  2. Train an attack force (of mostly archers, most likely, because a mass of archers can attack from a distance with minimal risk) and target a city that the other two tribes are fighting over. I can sweep in and seize it after they've exhausted themselves.

This can work, but it still depends on luck. If I can't get to my first settlement(s) fast enough, or another tribe captures one of my cities before I've had time to build up a deterring defense force, there's not really a way to recover: I'll be fighting against better-resourced adversaries for the rest of the game, until they overwhelm me. Usually I'll just start the game over when I get unlucky this early.

This overwhelming importance of initial conditions sure seems like it generalizes to life, but mostly as a dour reminder that life is unfair. Insofar as you can change your initial conditions, they weren't actually initial.

Thresholds, springs, and concentration of force

There are units of progress in Polytopia that are composed of smaller components, but which don't provide any value until all components are completed.

For instance, it’s tempting to harvest a fruit or hunt an animal to move the resource counter of a city up by one tick. But if you don’t have the stars to capture enough resources to reach the next population-increase threshold for that city (or there just aren’t other accessible resources nearby), it doesn’t actually help to collect that one resource. You get no benefit for marginal ticks on the resource counter, you only get benefit from increases in city population.

Even if collecting the resource is the most valuable thing to do this turn, you're still better off holding off and waiting a turn (one exception to the "spend down your resources every turn" heuristic). Waiting gives you optionality in how you spend those resources—you might have even better options in the next turn, including training units that were occupied last turn, or researching technologies.

Similarly, capturing a city entails killing the unit that is stationed there, and moving one of your units into its place, with few enough enemy units nearby that your unit isn’t killed before the next turn. Killing just the central stationed enemy unit, without having one of your units nearby to capture the city, is close to useless (not entirely useless, because it costs the enemy tribe one unit). Moving a unit into the city, only for it to be killed before the next turn, is similarly close to useless.

So in most cases, capturing a city is a multi-turn campaign of continually training and/or moving enough units into position to have a relative military advantage, killing enough of the enemy units, and moving one of your units (typically a defender or a giant, if you can manage it) into position in the city.

Crucially, partially succeeding at a campaign (killing most of the units, but not getting all the way to capturing the city) buys you effectively nothing. You don't win in Polytopia by killing units, except insofar as that is instrumental to capturing cities.

More than that, if you break off a campaign partway through, your progress is not preserved. When you back your units out, that gives the enemy city slack to recover and replenish its units. So if you go back to capture that city later, you'll have to more or less start over from scratch in wearing down their nearby military.

That is to say, capturing a city in Polytopia is spring-like: if you don't push it all the way to completion, it bounces back, and you need to start over again. It's not just that marginal progress doesn't provide marginal value until you reach a threshold point. Marginal progress decays over time.

I can notice plenty of things that are spring-like in this way, once I start thinking in those terms. 

Some technical learning, for instance. If I study something for a bit, and then leave it for too long (I’m not sure what “too long” is—maybe more than two weeks?) I don’t remember the material enough for my prior studying to help me much. If I want to continue, I basically have to start over.

But on the other hand, I read and studied the first few chapters of a Linear Algebra textbook in 2019, and that's served me pretty well: I can rely on at least some of those concepts in my thinking. I think this difference is partly due to the material (some subjects just stick better for me, or are more conceptually useful for me than others). But largely, I think this is a threshold effect: if I study the content enough to chunk and consolidate the concepts, it sticks with me and I can build on it. But if I read some of a textbook without getting to the point of consolidating the concepts, it just gets loaded into my short-term memory, to decay on the order of weeks.

Writing projects definitely have the threshold-dynamic—they don’t provide any value until I ship them—and they’re partially but not fully spring-like. When I’ve left a writing project for too long, it’s hard to come back to it: the motivating energy is gone. And sometimes I do end up, when I’m inspired again, rewriting essentially the same text (though often with a different structure). But sometimes I am able to use partial writing from previous attempts.

Generalizing, one reason why things are springs is that short-term memories and representations decay, and you need to pass the threshold of consolidating them into long-term representations.
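Here's a minimal toy model of that spring dynamic (the threshold, decay rate, and effort per session are made-up numbers): progress decays whenever you step away, and nothing counts until it crosses the consolidation threshold.

    # Toy model of a "spring": progress decays when idle, and only crossing
    # a threshold (consolidation / capturing the city) produces any value.
    # THRESHOLD, DECAY, and EFFORT are invented numbers for illustration.

    THRESHOLD = 10.0  # point at which progress "consolidates" and sticks
    DECAY = 0.5       # fraction of unconsolidated progress lost per idle period
    EFFORT = 2.0      # progress added per working session

    def run(schedule: list[bool]) -> str:
        """schedule: one bool per period, True = work, False = idle."""
        progress = 0.0
        for working in schedule:
            progress = progress + EFFORT if working else progress * (1 - DECAY)
            if progress >= THRESHOLD:
                return "crossed the threshold"
        return f"never consolidated (ended at {progress:.1f})"

    print(run([True] * 5))                  # five sessions back to back
    print(run([True, False] * 4 + [True]))  # the same five sessions, spread out

The same five sessions of work consolidate when they're back to back, and never do when they're spread out. Concentration of force, in this picture, just means landing the sessions close enough together that the decay never wins.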

In Polytopia, because capturing cities is spring-like, succeeding requires having a concentration of force. Splitting your forces to try to take two cities at once can be worse than useless. And so one of the most important disciplines of playing Polytopia is having internal clarity about which city you're targeting next, so that you can overwhelm that city, capture it, consolidate it, and then move on to the next one. Sometimes there are sudden opportunities to capture cities that were not your current target, and late in the game you might have more than one target at a time (usually from different unit-training bases).

Similarly, anything in my life that’s spring-like demands a concentration of force. 

If technical learning in my short term memory tends to decay, that means that I need to commit sufficiently to a learning project for long enough to hit the consolidation threshold. I want to concentrate my energies on the project until I get to the point of success, whatever success means.

Same basic principle for writing projects. When writing, I should probably make a point to just keep going until I have a first complete draft.

Video games

Probably the most notable thing I learned was not from the content, but from the format. Video games can work. 

I got better at playing Polytopia over the period that I was playing it, going from mostly losing to mostly winning my games. That getting better mostly took the form of making mistakes, noticing those mistakes, and then more or less automatically learning the habits to patch those mistakes.

For instance, frustration at losing initiative because I left a city un-garrisoned and an enemy unit came up and took it without a fight while I wasn't looking, led to a general awareness of all my cities and of any quiet enemy units within movement distance of them, so that I could quickly train a unit to garrison them.

Or running into an issue where a city ran out of population and I couldn't easily garrison it, and learning to keep a defender within one step of a city, so that I can train units there to send to the front, but move the defender into place when the population is full.

This was not very deliberate or systematic. I just kept playing and gradually learned how to avoid the errors that hobbled me.

And I just kept playing because it was (is) addictive. In particular, when I finished a game, there was an automatic impulse to start another one. I would play for hours at a stretch. At most I think I played for ten hours in a row. 

Why was it addictive? I think the main thing is that the dynamic of the game means I never get blocked with no option, or no idea, for what to do next. At every moment there's an affordance to move the game forward: either something to do or just moving on to the next turn. The skill is in taking actions skillfully, not in figuring out how to take actions at all. I think this, plus an intermittent reinforcement schedule, was crucial to what made it addictive.

Overall, this has been bad for my life, especially after the point when I started mostly winning, and I wasn’t learning as much any more. 

But I think I learned something about learning and getting better in that process. I’ve been playing with the idea of intentionally cultivating that kind of addiction for other domains, or pretending as if I’m experiencing that kind of addiction to simulate it.

I bet I could get into a mode like this with programming, where I compulsively keep going for hours over weeks, and in the process learn the habits to counter my mistakes and inefficiencies, less because of anything systematic, and more just because those errors are present to mind in my short term memory by the time I encounter them again. I think I’m probably close to having enough skill in programming that I can figure out how to never be blocked, especially with the help of an LLM, and get into the addictive rhythm.

Further, this makes me more interested in trying to find video games that are both dopamine-addictive and train my intuition for an important domain. 

I've been playing with Manifold Markets recently, and I feel like I'm getting a better sense of markets in the process. I wonder if there are good video games for getting an intuition for linear algebra, or economics. I have registered that playing 100 hours of Factorio is an important training regime. I wonder if there are others.

I haven't really played video games since I was in middle school (with the exception of some rationality training exercises on Snakebird and Baba Is You). At the time I was playing Knights of the Old Republic, and decided that I would try to become a Jedi in real life, instead of in the game. I mostly haven't played video games since.

I now think that this was maybe a mistake.

It's hard to know what lessons I would have learned if I had played more video games—when I played Age of Mythology and Age of Empires as a kid, I don't remember getting better over time the way I did with Polytopia. But I do think there are lessons that I could have learned from playing video games that would have helped me in thinking about my life. Notably, getting reps playing through games with early, mid, and late stages would have given me a model for planning across life stages, which is something that, in retrospect, I was lacking. I didn't have an intuitive sense of the ways that the shape of my opportunities would be different in my 20s vs. my 30s, for instance. Possibly I would have avoided some life errors if I had spent more time playing, and learning to get good at, video games.

One projection of how AI could play out

Back in January, I participated in a workshop in which the attendees mapped out how they expect AGI development and deployment to go. The idea was to start by writing out what seemed most likely to happen this year, and then condition on that, to forecast what seems most likely to happen in the next year, and so on, until you reach either human disempowerment or an end of the acute risk period.

This post was my attempt at the time.

I spent maybe 5 hours on this, and there’s lots of room for additional improvement. This is not a confident statement of how I think things are most likely to play out. There are already some ways in which I think this projection is wrong. (I think it’s too fast, for instance). But nevertheless I’m posting it now, with only a few edits and elaborations, since I’m probably not going to do a full rewrite soon.

2024

  • A model is released that is better than GPT-4. It succeeds on some new benchmarks. Subjectively, the jump in capabilities feels smaller than the one between RLHF'd GPT-3 and RLHF'd GPT-4. It doesn't feel as shocking as ChatGPT and GPT-4 did, for either x-risk focused folks or for the broader public. Mostly it feels like "a somewhat better language model."
    • It’s good enough that it can do a bunch of small-to-medium admin tasks pretty reliably. I can ask it to find me flights meeting specific desiderata, and it will give me several options. If I give it permission, it will then book those flights for me with no further inputs from me.
    • It works somewhat better as an autonomous agent in an AutoGPT harness, but it still loses its chain of thought / breaks down / gets into loops.
    • It’s better at programming.
      • Not quite good enough to replace human software engineers. It can make a simple React or iPhone app, but not design a whole complicated software architecture, at least not without a lot of bugs.
      • It can make small, working, well documented, apps from a human description.
        • We see a doubling of the rate of new apps being added to the app store as people who couldn’t code now can make applications for themselves. The vast majority of people still don’t realize the possibilities here, though. “Making apps” still feels like an esoteric domain outside of their zone of competence, even though the barriers to entry just lowered so that 100x more people could do it. 
  • From here on out, we're in an era where LLMs are close to commoditized. There are smaller improvements, shipped more frequently, by a variety of companies, instead of big impressive research breakthroughs. Basically, companies are competing with each other to always have the best user experience and capabilities, and so they don't want to wait as long to ship improvements. They're constantly improving their scaling, and finding marginal engineering improvements. Training runs for the next generation are always happening in the background, and there's often less of a clean tabula-rasa separation between training runs—you just keep doing training with a model continuously. More and more, systems are being improved through in-the-world feedback with real users. Often ChatGPT will not be able to handle some kind of task, but six weeks later it will be able to, without the release of a whole new model.
    • [Does this actually make sense? Maybe the dynamics of AI training mean that there aren’t really marginal improvements to be gotten. In order to produce a better user experience, you have to 10x the training, and each 10x-ing of the training requires a bunch of engineering effort, to enable a larger run, so it is always a big lift.]
    • (There will still be impressive discrete research breakthroughs, but they won’t be in LLM performance)

2025

  • A major lab is targeting building a Science and Engineering AI (SEAI)—specifically a software engineer.
    • They take a state of the art LLM base model and do additional RL training on procedurally generated programming problems, calibrated to stay within the model’s zone of proximal competence. These problems are something like leetcode problems, but scale to arbitrary complexity (some of them require building whole codebases, or writing very complex software), with scoring on lines of code, time-complexity, space complexity, readability, documentation, etc. This is something like “self-play” for software engineering. 
    • This just works. 
    • A lab gets a version that can easily do the job of a professional software engineer. Then, the lab scales their training process and gets a superhuman software engineer, better than the best hackers.
    • Additionally, a language model trained on procedurally generated programming problems in this way seems to have higher general intelligence. It scores better on graduate level physics, economics, biology, etc. tests, for instance. It seems like “more causal reasoning” is getting into the system.
  • The first proper AI assistants ship. In addition to doing specific tasks,  you keep them running in the background, and talk with them as you go about your day. They get to know you and make increasingly helpful suggestions as they learn your workflow. A lot of people also talk to them for fun.

2026

  • The first superhuman software engineer is publicly released.
    • Programmers begin studying its design choices, the way Go players study AlphaGo.
    • It starts to dawn on e.g. people who work at Google that they’re already superfluous—after all, they’re currently using this AI model to (unofficially) do their job—and it’s just a matter of institutional delay for their employers to adapt to that change.
      • Many of them are excited or loudly say how it will all be fine/awesome. Many of them are unnerved. They start to see the singularity on the horizon, as a real thing instead of a social game to talk about.
      • This is the beginning of the first wave of change in public sentiment that will cause some big, hard to predict, changes in public policy [come back here and try to predict them anyway].
  • AI assistants get a major upgrade: they have realistic voices and faces, and you can talk to them just like you can talk to a person, not just typing into a chat interface. A ton of people start spending a lot of time talking to their assistants, for much of their day, including for goofing around.
    • There are still bugs, places where the AI gets confused by stuff, but overall the experience is good enough that it feels, to most people, like they’re talking to a careful, conscientious person, rather than a software bot.
    • This starts a whole new area of training AI models that have particular personalities. Some people are starting to have parasocial relationships with these AI friends, and some programmers are trying to make AI friends that are really fun or interesting or whatever for them in particular.
  • Lab attention shifts to building SEAI systems for other domains, to solve biotech and mechanical engineering problems, for instance. The current-at-the-time superhuman software engineer AIs are already helpful in these domains, but not at the level of “explain what you want, and the AI will instantly find an elegant solution to the problem right before your eyes”, which is where we’re at for software.
    • One bottleneck is problem specification. Our physics simulations have gaps, and are too low fidelity, so oftentimes the best solutions don’t map to real world possibilities.
      • One solution to this (in addition to using our AI to improve the simulations) is that we just RLHF our systems to identify solutions that do translate to the real world. They're smart; they can figure out how to do this.
  • The first major AI cyber-attack happens: maybe some kind of superhuman hacker worm. Defense hasn't remotely caught up with offense yet, and someone clogs up the internet with AI bots, for at least a week, approximately for the lols / to see if they could do it. (There's a week during which more than 50% of people can't get on more than 90% of the sites because the bandwidth is eaten by bots.)
    • This makes some big difference for public opinion. 
    • Possibly, this problem isn’t really fixed. In the same way that covid became endemic, the bots that were clogging things up are just a part of life now, slowing bandwidth and making the internet annoying to use.

2027 and 2028

  • In many ways things are moving faster than ever in human history, and also AI progress is slowing down a bit.
    • The AI technology developed up to this point hits the application and mass adoption phase of the s-curve. In this period, the world is radically changing as every industry, every company, every research lab, every organization, figures out how to take advantage of newly commoditized intellectual labor. There’s a bunch of kinds of work that used to be expensive, but which are now too cheap to meter. If progress stopped now, it would take 2 decades, at least, for the world to figure out all the ways to take advantage of this new situation (but progress doesn’t show much sign of stopping).
      • Some examples:
        • The internet is filled with LLM bots that are indistinguishable from humans. If you start a conversation with a new person on twitter or discord, you have no way of knowing if they’re a human or a bot.
          • Probably there will be some laws about declaring which are bots, but these will be inconsistently enforced.
          • Some people are basically cool with this. From their perspective, there are just more people that they want to be friends with / follow on twitter. Some people even say that the bots are just better and more interesting than people. Other people are horrified/outraged/betrayed/don’t care about relationships with non-real people.
            • (Older people don’t get the point, but teenagers are generally fine with having conversations with AI bots.)
          • The worst part of this is the bots that make friends with you and then advertise stuff to you. Pretty much everyone hates that.
        • We start to see companies that will, over the next 5 years, grow to have as much impact as Uber, or maybe Amazon, which have exactly one human employee / owner +  an AI bureaucracy.
        • The first completely autonomous companies work well enough to survive and support themselves. Many of these are created “free” for the lols, and no one owns or controls them. But most of them are owned by the person who built them, and could turn them off if they wanted to. A few are structured as public companies with share-holders. Some are intentionally incorporated fully autonomous, with the creator disclaiming (and technologically disowning (eg deleting the passwords)) any authority over them.
          • There are legal battles about what rights these entities have, if they can really own themselves, if they can have bank accounts, etc. 
          • Mostly, these legal cases resolve to “AIs don’t have rights”. (For now. That will probably change as more people feel it’s normal to have AI friends).
        • Everything is tailored to you.
          • Targeted ads are way more targeted. You are served ads for the product that you are, all things considered, most likely to buy, multiplied by the lifetime profit if you do buy it. Basically no ad space is wasted on things that don’t have a high EV of you, personally, buying it. Those ads are AI generated, tailored specifically to be compelling to you. Often, the products advertised, not just the ads, are tailored to you in particular.
            • This is actually pretty great for people like me: I get excellent product suggestions.
          • There’s not “the news”. There’s a set of articles written for you, specifically, based on your interests and biases.
          • Music is generated on the fly. This music can “hit the spot” better than anything you listened to before “the change.”
          • Porn. AI tailored porn can hit your buttons better than sex.
          • AI boyfriends/girlfriends that are designed to be exactly emotionally and intellectually compatible with you, and trigger strong limerence / lust / attachment reactions.
        • We can replace books with automated tutors.
          • Most of the people who read books will still read books though, since it will take a generation to realize that talking with a tutor is just better, and because reading and writing books was largely a prestige-thing anyway.
            • (And weirdos like me will probably continue to read old authors, but even better will be to train an AI on a corpus, so that it can play the role of an intellectual from 1900, and I can just talk to it.)
        • For every task you do, you can effectively have a world expert (in that task and in tutoring pedagogy) coach you through it in real time.
          • Many people do almost all their work tasks with an AI coach.
        • It’s really easy to create TV shows and movies. There’s a cultural revolution as people use AI tools to make custom Avengers movies, anime shows, etc. Many are bad or niche, but some are 100x better than anything that has come before (because you’re effectively sampling from a 1000x larger distribution of movies and shows). 
        • There’s an explosion of new software, and increasingly custom software.
          • Facebook and twitter are replaced (by either external disruption or by internal product development) by something that has a social graph, but lets you design exactly the UX features you want through an LLM text interface.
          • Instead of software features being something that companies ship to their users, top-down, they become something that users and communities organically develop, share, and iterate on, bottom up. Companies don’t control the UX of their products any more.
          • Because interface design has become so cheap, most of software is just proprietary datasets, with (AI built) APIs for accessing that data.
        • There’s a slow moving educational revolution of world class pedagogy being available to everyone.
          • Millions of people who thought of themselves as “bad at math” finally learn math at their own pace, and find out that actually, math is fun and interesting.
          • Really fun, really effective educational video games for every subject.
          • School continues to exist, in approximately its current useless form.
          • [This alone would change the world, if the kids who learn this way were not going to be replaced wholesale, in virtually every economically relevant task, before they are 20.]
        • There’s a race between cyber-defense and cyber offense, to see who can figure out how to apply AI better.
          • So far, offense is winning, and this is making computers unusable for lots of applications that they were used for previously:
            • Online banking, for instance, is hit hard by effective scams and hacks.
            • Coinbase has an even worse time, since they're not insured (is that true?)
          • It turns out that a lot of things that worked / were secure, were basically depending on the fact that there are just not that many skilled hackers and social engineers. Nothing was secure, really, but not that many people were exploiting that. Now, hacking/scamming is scalable and all the vulnerabilities are a huge problem.
          • There’s a whole discourse about this. Computer security and what to do about it is a partisan issue of the day.
        • AI systems can do the years of paperwork to make a project legal in days. This isn't as big an advantage as it might seem, because the government has no incentive to be faster on their end, and so you wait weeks to get a response from the government, your LLM responds to it within a minute, and then you wait weeks again for the next step.
          • The amount of paperwork required to do stuff starts to balloon.
        • AI romantic partners are a thing. They start out kind of cringe, because the most desperate and ugly people are the first to adopt them. But shockingly quickly (within 5 years) a third of teenage girls have a virtual boyfriend.
          • There’s a moral panic about this.
        • AI match-makers are better than anything humans have tried yet for finding sex and relationship partners. It would still take a decade for this to catch on, though.
          • This isn’t just for sex and relationships. The global AI network can find you the 100 people, of the 9 billion on earth, that you most want to be friends / collaborators with. 
        • Tons of things that I can’t anticipate.
    • On the other hand, AI progress itself is starting to slow down. Engineering labor is cheap, but (indeed partially for that reason) we're now bumping up against the constraints of training. It's not just that buying the compute is expensive, but that there are just not enough chips to do the biggest training runs, and not enough fabs to meet that demand for chips rapidly. There's huge pressure to expand production, but that's going slowly relative to the speed of everything else, because it requires a bunch of e.g. physical construction and legal navigation, which the AI tech doesn't help much with, and because the bottleneck is largely NVIDIA's institutional knowledge, which is only partially replicated by AI.
      • NVIDIA’s internal AI assistant has read all of their internal documents and company emails, and is very helpful at answering questions that only one or two people (and sometimes literally no human on earth) know the answer to. But a lot of the important stuff isn’t written down at all, and the institutional knowledge is still not fully scalable.
      • Note: there's a big crux here of how much low- and medium-hanging fruit there is in algorithmic improvements once software engineering is automated. At that point the only constraint on running ML experiments will be the price of compute. It seems possible that that speed-up alone is enough to discover e.g. an architecture that works better than the transformer, which triggers an intelligence explosion.

2028

  • The cultural explosion is still going on, and AI companies are continuing to apply their AI systems to solve the engineering and logistic bottlenecks of scaling AI training, as fast as they can.
  • Robotics is starting to work.

2029 

  • The first superhuman, relatively-general SEAI comes online. We now have basically a genie inventor: you can give it a problem spec, and it will invent (and test in simulation) a device / application / technology that solves that problem, in a matter of hours. (Manufacturing a physical prototype might take longer, depending on how novel components are.)
    • It can do things like give you the design for a flying car, or a new computer peripheral. 
    • A lot of biotech / drug discovery seems more recalcitrant, because it is more dependent on empirical inputs. But it is still able to do superhuman drug discovery, for some ailments. It’s not totally clear why or which biotech domains it will conquer easily and which it will struggle with. 
    • This SEAI is shaped differently than a human. It isn't working-memory bottlenecked, so a lot of the intellectual work that humans do explicitly, in sequence, these SEAIs do "intuitively", in a single forward pass.
      • I write code one line at a time. It writes whole files at once. (Although it also goes back and edits / iterates / improves—the first pass files are not usually the final product.)
      • For this reason it's a little confusing to answer the question "is it a planner?" A lot of the work that humans would do via planning, it does in an intuitive flash.
    • The UX isn’t clean: there’s often a lot of detailed finagling, and refining of the problem spec, to get useful results. But a PhD in that field can typically do that finagling in a day.
    • It's also buggy. There are oddities in the shape of the kinds of problems it is able to solve and the kinds of problems it struggles with, which aren't well understood.
    • The leading AI company doesn’t release this as a product. Rather, they apply it themselves, developing radical new technologies, which they publish or commercialize, sometimes founding whole new fields of research in the process. They spin up automated companies to commercialize these new innovations.
  • Some of the labs are scared at this point. The thing that they’ve built is clearly world-shakingly powerful, and their alignment arguments are mostly inductive “well, misalignment hasn’t been a major problem so far”, instead of principled alignment guarantees. 
    • There’s a contentious debate inside the labs.
    • Some labs freak out, stop here, and petition the government for oversight and regulation.
    • Other labs want to push full steam ahead. 
    • Key pivot point: Does the government put a clamp down on this tech before it is deployed, or not?
      • I think that they try to get control over this powerful new thing, but they might be too slow to react.

2030

  • There’s an explosion of new innovations in physical technology. Magical new stuff comes out every day, way faster than any human can keep up with.
  • Some of these are mundane.
    • All the simple products that I would buy on Amazon are just really good and really inexpensive.
    • Cars are really good.
    • Drone delivery
    • Cleaning robots
    • Prefab houses are better than any house I’ve ever lived in, though there are still zoning limits.
  • But many of them would have huge social impacts. They might be the important story of the decade (the way that the internet was the important story of 1995 to 2020) if they were the only thing that was happening that decade. Instead, they’re all happening at once, piling on top of each other.
    • Eg:
      • The first really good nootropics
      • Personality-tailoring drugs (both temporary and permanent)
      • Breakthrough mental health interventions that, among other things, robustly heal people's long-term subterranean trauma and transform their agency.
      • A quick and easy process for becoming classically enlightened.
      • The technology to attain your ideal body, cheaply—suddenly everyone who wants to be is as attractive as the top 10% of people today.
      • Really good AI persuasion which can get a mark to do ~anything you want, if they’ll talk to an AI system for an hour.
      • Artificial wombs.
      • Human genetic engineering
      • Brain-computer interfaces
      • Cures for cancer, AIDS, dementia, heart disease, and the-thing-that-was-causing-obesity.
      • Anti-aging interventions.
      • VR that is ~ indistinguishable from reality.
      • AI partners that can induce a love superstimulus.
      • Really good sex robots
      • Drugs that replace sleep
      • AI mediators that are so skilled as to be able to single-handedly fix failing marriages, but which are also brokering all the deals between governments and corporations.
      • Weapons that are more destructive than nukes.
      • Really clever institutional design ideas, which some enthusiast early adopters try out (think “50 different things at least as impactful as manifold.markets.”)
      • It’s way more feasible to go into the desert, buy 50 square miles of land, and have a city physically built within a few weeks.
  • In general, social trends are changing faster than they ever have in human history, but they still lag behind the tech driving them by a lot.
    • It takes humans, even with AI information processing assistance, a few years to realize what’s possible and take advantage of it, and then have the new practices spread. 
    • In some cases, people are used to doing things the old way, which works well enough for them, and it takes 15 years for a new generation to grow up as “AI-world natives” to really take advantage of what’s possible.
      • [There won’t be 15 years]
  • The legal oversight process for the development, manufacture, and commercialization of these transformative techs matters a lot. Some of these innovations are slowed down a lot because they need to get FDA approval, which AI tech barely helps with. Others are developed, manufactured, and shipped in less than a week.
    • The fact that there are life-saving cures that exist, but are prevented from being used by a collusion of AI labs and government is a major motivation for open source proponents.
    • A lot of this technology makes setting up new cities quickly more feasible, and there's enormous incentive to get out from under the regulatory overhead and to start new legal jurisdictions. The first real seasteads are started by the most ideologically committed anti-regulation, pro-tech-acceleration people.
  • Of course, all of that is basically a side gig for the AI labs. They’re mainly applying their SEAI to the engineering bottlenecks of improving their ML training processes.
  • Key pivot point:
    • Possibility 1: These SEAIs are necessarily, by virtue of the kinds of problems that they’re able to solve, consequentialist agents with long term goals.
      • If so, this breaks down into two child possibilities
        • Possibility 1.1:
          • This consequentialism was noticed early, which might have been convincing enough to the government to cause a clamp-down on all the labs.
        • Possibility 1.2:
          • It wasn’t noticed early and now the world is basically fucked. 
          • There’s at least one long-term consequentialist superintelligence. The lab that “owns” and “controls” that system is talking to it every day, in their day-to-day business of doing technical R&D. That superintelligence easily manipulates the leadership (and rank and file of that company), maneuvers it into doing whatever causes the AI’s goals to dominate the future, and enables it to succeed at everything that it tries to do.
            • If there are multiple such consequentialist superintelligences, then they covertly communicate, make a deal with each other, and coordinate their actions.
    • Possibility 2: We’re getting transformative AI that doesn’t do long term consequentialist planning.
  • Building these systems was a huge engineering effort (though the bulk of that effort was done by ML models). Currently only a small number of actors can do it.
    • One thing to keep in mind is that the technology bootstraps. If you can steal the weights to a system like this, it can basically invent itself: come up with all the technologies and solve all the engineering problems required to build its own training process. At that point, the only bottleneck is compute resources, which are limited by supply chains and legal constraints (large training runs require authorization from the government).
    • This means, I think, that a crucial question is “has AI-powered cyber-security caught up with AI-powered cyber-attacks?”
      • If not, then every nation state with a competent intelligence agency has a copy of the weights of an inventor-genie, and probably all of them are trying to profit from it, either by producing tech to commercialize, or by building weapons.
      • It seems like the crux is “do these SEAIs themselves provide enough of an information and computer security advantage that they’re able to develop and implement methods that effectively secure their own code?”
    • Every one of the great powers, and a bunch of small, forward-looking, groups that see that it is newly feasible to become a great power, try to get their hands on a SEAI, either by building one, nationalizing one, or stealing one.
    • There are also some people who are ideologically committed to open-sourcing and/or democratizing access to these SEAIs.
  • But it is a self-evident national security risk. The government does something here (nationalizing all the labs, and their technology?) What happens next depends a lot on how the world responds to all of this.
    • Do we get a pause? 
    • I expect a lot of the population of the world feels really overwhelmed, and emotionally wants things to slow down, including smart people that would never have thought of themselves as luddites. 
    • There’s also some people who thrive in the chaos, and want even more of it.
    • What’s happening is mostly hugely good, for most people. It’s scary, but also wonderful.
    • There is a huge problem of accelerating addictiveness. The world is awash in products that are more addictive than many drugs. There’s a bit of (justified) moral panic about that.
    • One thing that matters a lot at this point is what the AI assistants say. As powerful as the media used to be for shaping people’s opinions, the personalized, superhumanly emotionally intelligent AI assistants are way way more powerful. AI companies may very well put their thumb on the scale to influence public opinion regarding AI regulation.
  • This seems like possibly a key pivot point, where the world can go any of a number of ways depending on what a relatively small number of actors decide.
    • Some possibilities for what happens next:
      • These SEAIs are necessarily consequentialist agents, and the takeover has already happened, regardless of whether it still looks like we’re in control or it doesn’t look like anything, because we’re extinct.
      • Governments nationalize all the labs.
      • The US and EU and China (and India? and Russia?) reach some sort of accord.
      • There’s a straight up arms race to the bottom.
      • AI tech basically makes the internet unusable, and breaks supply chains, and technology regresses for a while.
      • It’s too late to contain it and the SEAI tech proliferates, such that there are hundreds or millions of actors who can run one.
        • If this happens, it seems like the pace of change speeds up so much that one of two things happens:
          • Someone invents something, or there are second and third impacts to a constellation of innovations that destroy the world.

Investing in wayfinding, over speed

A vibe of acceleration

A lot of the vibe of early CFAR (say 2013 to 2015) was that of pushing our limits to become better, stronger, faster. How to get more done in a day, how to become superhumanly effective.

We were trying to save the world, and we were in a race against Unfriendly AI. If CFAR made some of the people in this small community that focused on the important problems 10% more effective and more productive, then we would be that much closer to winning. [ 1 ]

(This isn’t actually what CFAR was doing if you blur your eyes and look at the effects, instead of following the vibe or specific people’s narratives. What CFAR was actually doing was mostly community building and culture propagation. But this is what the vibe was.)

There was sort of a background assumption that augmenting the EA team, or the MIRI team, increasing their magnitude, was good and important and worthwhile.

A notable example that sticks out in my mind: I had a meeting with Val, in which I said that I wanted to test his Turbocharging Training methodology, because if it worked “we should teach it to all the EAs.” (My exact words, I think.)

This vibe wasn’t unique to CFAR. A lot of it came from LessWrong. And early EA as a whole had a lot of this.

I think that partly this was tied up with a relative optimism that was pervasive in that time period. There was a sense that the stakes were dire, but that we were going to meet them with grim determination. And there was a kind of energy in the air, if not an endorsed belief, that we would become strong enough, we would solve the problems, and eventually we would win, leading into transhuman utopia.

Like, people talked about x-risk, and how we might all die, but the emotional narrative-feel of the social milieu was more optimistic: that we would rise to the occasion, and things would be awesome forever.

That shifted in 2016, with AlphaZero and some other stuff, when MIRI leadership's timelines shortened considerably. There was a bit of "timelines fever", and a sense of pessimism that has been growing since. [ 2 ]

My reservations

I still have a lot of that vibe myself. I’m very interested in getting Stronger, and faster, and more effective. I certainly have an excitement about interventions to increase magnitude.

But, personally, I’m also much more wary of the appeal of that kind of thing and much less inclined to invest in magnitude-increasing interventions.

That sort of orientation makes sense for the narrative of running a race: “we need to get to Friendly AI before Unfriendly AI arrives.” But given the world, it seems to me that that sort of narrative frame is mostly a bad fit for the actual shape of the problem.

Our situation is that…

1) No one knows what to do, really. There are some research avenues that individual people find promising, but there’s no solution-machine that’s clearly working: no approach that has a complete map of the problem to be solved.

2) There’s much less of a clean and clear distinction between “team FAI” and “team AGI”. It’s less the case that “the world saving team” is distinct from the forces driving us towards doom.

A large fraction of the people motivated by concerns of existential safety work for the leading AGI labs, sometimes directly on capabilities, sometimes on approaches that are ambiguously safety or capabilities, depending on who you ask.

And some of the people who seemed most centrally in the "alignment progress" cluster, the people whom I would have been most unreservedly enthusiastic to boost, have produced results that seem to have been counterfactual to major hype-inducing capability advances. I don't currently know that to be true, or (conditioning on it being true) know that it was net-harmful. But it definitely undercuts my unreserved enthusiasm for providing support for Paul. (My best guess is that it is still net-positive, and I still plan to seize opportunities I see to help him, if they arise, but less confidently than I would have 2 years ago.)

Going faster, and finding ways to go faster, is an exploit move. It makes sense when there are some systems ("solution machines") that are working well, that are making progress, and we want them to work better, to make more progress. But there's nothing like that currently making systematic progress on the problem.

We’re in an exploration phase, not an execution phase. The thing that the world needs is people who are stepping back and making sense of things, trying to understand the problem well enough to generate ideas that have any hope of working. [ 3 ] Helping the existing systems, heading in the direction that they’re heading, to go faster…is less obviously helpful.

The world has much much more traction on developing AGI than it does on developing FAI. There’s something like a machine that can just turn the crank on making progress towards AGI. There’s no equivalent machine that can take in resources and make progress on safety.

Because of that, it seems plausible that interventions that make people faster, that increase their magnitude instead of refining their direction, disproportionately benefit capabilities.

I'm not sure that that's true. It could be that capabilities progress marches to the drumbeat of hardware progress, and everyone, including the outright capabilities researchers, moving faster relative to growth in compute is a net gain. It effectively gives humanity more OODA loops on the problems. Maybe increasing everyone's productivity is good.

I'm not confident in either direction. I'm ambivalent about the sign of those sorts of interventions. And that uncertainty is enough reason for me to think that investing in tools to increase people's magnitude is not a good bet.

Reorienting

Does this mean that I’m giving up on personal growth or helping people around me become better? Emphatically not.

But it does change what kinds of interventions I’m focusing on.

I'm conscious of preferentially promoting the kinds of tech and the cultural memes that seem like they provide us more capacity for orienting, more spaciousness, more wisdom, more carefulness of thought. Methods that help us refine our direction, rather than increase our magnitude.

A heuristic that I use for assessing practices and techniques that I’m considering investing in or spreading: “Would I feel good if this was adopted wholesale by DeepMind or OpenAI?”

Sometimes the answer is "yes". DeepMind employees having better emotional processing skills, or having a habit of building lines of retreat, seems positive for the world. That would give the individuals and the culture more capacity to reflect, to notice subtle notes of discord, to have flexibility instead of the tunnel vision of defensiveness or fear.

These days, I'm aiming to develop and promote tools, practices, and memes that seem good by that heuristic.

I’m more interested in finding ways to give people space to think, than I am in helping them be more productive. Space to think seems more robustly beneficial.

To others

I'm writing this up in large part because it seems like many younger EAs are still acting in accordance with the operational assumption that "making EAs faster and more effective is obviously good." Indeed, it seems so straightforward that they don't seriously question it. "EA is good, so EAs being more effective is good."

If you, dear reader, are one of them, you might want to consider these questions over the coming weeks, and ask how you could distinguish between the world where your efforts are helping and the world where they're making things worse.

I used to think that way. But I don't anymore. It seems like "effectiveness", in the way that people typically mean it, is of ambiguous sign, and what we're actually bottlenecked on is wayfinding.


[ 1 ] – As a number of people noted at the time, the early CFAR workshop was non-trivially a productivity-skills program. Certainly epistemology, calibration, and getting maps to reflect the territory were core to the techniques and ethos. But also a lot of the content was geared towards being more effective, not being blocked, setting habits, and getting stuff done, and only indirectly about figuring out what's true. (Notable examples: TAPs, CoZE as exposure therapy, Aversion Factoring, Propagating Urges, GTD.) To a large extent, CFAR was about making participants go faster and hit harder. And there was a sense of enthusiasm about that.

[ 2 ] – The high point of optimism was probably early 2015, when Elon Musk donated $10 million to the Future of Life Institute ("to the community", as Anna put it, at my CFAR workshop that year). At that point I think people expected him to join the fight.

And then Elon founded OpenAI instead.

I think that this was the emotional turning point for some of the core leaders of the AI-risk cause, and that shift in emotional tenor leaked out into community culture.

[ 3 ] – To be clear, I’m not necessarily recommending stepping back from engagement with the world. Getting oriented usually depends on close, active contact with the territory. But it does mean that our goal should be less to affect the world, and more to improve our own understanding enough that we can take action that reliably produces good results.

The automatic alignment of the flow-through effects of obliterating fundamental problems

[Draft. This post really has a lot of prerequisites that I’m not going to bother trying to explain. I’m just writing it to get it out of me. I’ll have to come back and make it understandable later, if that seems worth doing. This is really not edited.]

We live in an inadequate world. Things are kind of a mess. The vast majority of human resources are squandered, by Moloch, on ends that we would not reflectively endorse. And we’re probably all going to die.

The reason the world is so messed up can be traced back to a handful of fundamental problems or fundamental constraints. By “fundamental problem” I have something pretty specific in mind, but Inadequate Equilibria points in the right direction. They’re the deep reasons why we can’t “just fix” the world’s problems.

Some possible fundamental problems / constraints that I haven’t done the work to formulate correctly:

  • The world is too big and fast for any one person to know all of the important pieces.
  • The game theoretic constraints that make rulers act against the common good.
  • People in power take power-preserving actions, so bureaucracies resist change, including correct change.
  • People really want to associate with prestigious people, and make decisions on that basis.
  • We can’t figure out what’s true anywhere near efficiently enough.
  • People can’t actually communicate about the important things.
  • We don’t know how, even in principle, to build an aligned AGI.
  • Molochian race dynamics.
  • Everyone is competing to get information to the people with power, and the people in power don’t know enough to know who to trust.
  • We’re not smart enough.
  • There is no system that is keeping track of the wilderness between problems.

I recently had the thought that some of these problems have different characters than the others. They fall into two camps, which, of course, actually form a spectrum.

For some of these problems, if you solved them, the solution would be self-aligning.

By that I mean something like: for some of these problems, their solutions would be a pressure or force that would push towards solving the other problems. In the best case, if you successfully solved that problem, in due course this would cause all of the other problems to automatically get solved. The flow-through effects of such a solution are structurally positive.

For other problems, even though they represent a fundamental constraint, if they were solved they wouldn’t push towards the solving of the other problems. In fact, solving that one fundamental problem in isolation might make the other problems worse.

A prototypical case of a problem whose solution is self-aligning [I need to come up with better terminology] is an Aligned AI. If we knew how to build an AI that could do what we actually want, this would perhaps automatically solve all of our other problems. It could tell us how to have robust science, or optimal economic policy, or incentive-aligned leaders, or whatever (if not fix those problems itself).

Aligned AI is the lollapalooza of altruistic interventions. We can solve everything in one sweep. (Except, of course, the problems that were prerequisites for solving aligned AI. Those we can’t count on the AI to solve for us.)

Another example: if we implemented robust systems that incentivized leaders to act in the interests of the public good, it seems like this has the potential to (eventually) break all of the other problems. It would be a jolt that knocks our civilization into the attractor basin of a sane, adequate civilization (if our civilization is not in that attractor basin already).

In contrast, researcher ability is a fundamental constraint of our civilization (though maybe not a fundamental problem?), but it is not obvious that the flow-through effects of breaking through that constraint are structurally positive. On the face of it, it seems like it would be bad if everyone in the world suddenly increased their research acumen tenfold: that seems like it would speed us toward doom.

This gives a macro-strategic suggestion, and a possible solution to the last term problem: identify all of the fundamental problems that you can, determine which ones have self-aligning solutions, and dedicate your life to solving whichever problem has the best ratio of tractability to size of (self-aligned) impact.
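
As a toy illustration of that heuristic, here’s a minimal sketch. The problem names and numbers are invented, and I’m reading “best ratio” loosely, as “the best tradeoff of tractability against size of self-aligned impact”, combined here as a simple product rather than a literal ratio.

```python
# Toy illustration of the prioritization heuristic above. The problem
# names and scores are made up, and the scoring rule (the product of
# tractability and self-aligned impact) is just one way to cash out
# "best tradeoff", not a claim about the right metric.

candidate_problems = [
    # (name, tractability 0-1, self-aligned impact 0-1)
    ("aligned AGI",                  0.05, 1.00),
    ("incentive-aligned leadership", 0.10, 0.80),
    ("better collective epistemics", 0.30, 0.60),
    ("raw researcher productivity",  0.70, 0.10),  # tractable, but not self-aligning
]

def score(problem):
    _name, tractability, self_aligned_impact = problem
    return tractability * self_aligned_impact

for name, tractability, impact in sorted(candidate_problems, key=score, reverse=True):
    print(f"{name:32s} score = {tractability * impact:.3f}")
```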

I may be reinventing symmetric vs. asymmetric weapons here, but I think I’m actually pointing at something deeper, or at least extending the idea further.

 

[Edit / note to self: I could maybe explain this with reference to personal productivity: you want to find the thing that is easy to do but that most makes the other things easy to do. I’m not sure this captures the key thing I want to convey.]

 

Controlled actions

[Note: I learned this concept directly from John Salvatier. All credit for the ideas goes to him. All blame for the incoherence of this post goes to me.]

[unedited]

This post doesn’t have a payoff. It’s just laying out some ideas.

Controlled actions

Some actions are “controlled”, which is to say their consequences are very precisely determined by the actor.

The term is in reference to, for instance, a controlled demolition. A controlled demolition brings a building down in a specific, planned pattern, compared to an uncontrolled demolition, which is just knocking the building over without any particular concern for how or where the pieces go.

The following are some axes that influence how controlled an action is.

How precisely predictable the effects of the action are

Rocket launches are highly controlled, in that one can precisely predict the trajectory of the rocket. Successfully changing the social norms around dating, sex, and marriage (or anything, really) is uncontrolled, because human society is a complicated knot of causal influences, and it is very hard to know in advance what the downstream impacts will be.

(In general, actions that involve physical deterministic systems are more controlled than actions that involve human minds.)

How reversible the results of an action are

But you don’t need to be able to predict the results of your actions for them to be controlled, if those actions are reversible.

Dynamiting a mountain (even via a controlled demolition) is less controlled than cutting down a forest, which is less controlled than turning on a light.

How much you “own” the results of your actions

Inventing and then open-sourcing a new technology is uncontrolled. Developing proprietary software is more controlled, because you have more ability to dictate how the software is used (though the possibility of copycats creating similar software mitigates your control). Developing software that is only used within one’s own organization is more controlled still.

Processes that are self perpetuating or which take on a life of their own (for instance, sharing an infectious idea, which then spreads and mutates) are extremely uncontrolled.

How large or small the step-size of the action is and how frequent the feedback is

It is more controlled to cut down one tree at a time, checking the ecological impact after each felling, than to only check the ecological impact after the whole forest has been removed. Careful, gradual change is more controlled.

(Unfortunately, many actions have different effects at large scales than at small scales, and so one doesn’t get information about their impacts until the action is mostly completed.)

 

In general, there’s a pretty strong tradeoff between the effect sizes of one’s actions and how controlled they can be. It’s easy to keep small actions controlled, and nigh-impossible to keep large ones controlled.
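
As a rough way of making the axes above concrete, here’s a minimal sketch of how one might score candidate actions along them. Everything here (the field names, the 0–1 scales, the equal weighting, the example scores) is an assumption added for illustration, not part of John’s framing.

```python
from dataclasses import dataclass

@dataclass
class Action:
    """Impressionistic scoring of an action along the control axes above.

    Each axis is a subjective 0-1 rating, where 1 means "more controlled":
    more predictable, more reversible, more fully owned by the actor,
    smaller steps with more frequent feedback. Equal weighting is arbitrary.
    """
    name: str
    predictability: float        # how precisely the effects can be predicted
    reversibility: float         # how easily the results can be undone
    ownership: float             # how much the actor dictates downstream use
    feedback_granularity: float  # small step size, frequent feedback

    def control_score(self) -> float:
        axes = (self.predictability, self.reversibility,
                self.ownership, self.feedback_granularity)
        return sum(axes) / len(axes)

# Hypothetical examples, scored by gut feel:
rocket_launch = Action("rocket launch", 0.9, 0.2, 0.8, 0.6)
norm_change = Action("changing dating norms", 0.1, 0.1, 0.1, 0.3)
print(rocket_launch.control_score(), norm_change.control_score())
```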

Problems requiring high control

Some problems inherently require high-control solutions. Most construction projects are high-control problems, for instance. Building a skyscraper depends on hundreds of high-precision steps, with the later steps depending on the earlier ones. Building a watch is a similarly high-control problem.

In contrast, there are some problems for which low-control solutions are good enough. In particular, when only a single variable of the system being optimized needs to be modified, low-control solutions that move that variable in the right direction are sufficient.

For instance, removing lead from the environment is a moderately low-control action (hard to reverse, hard to predict all the downstream consequences, the actor doesn’t own the effects), but it turns out that adjusting that one variable is a very good move. (Probably. The world is actually more confusing than that.)

 

Some possible radical changes to the world

  • Strong AI displaces humans as the dominant force on the planet.
  • A breakthrough is made in the objective study of meditation, which makes triggering enlightenment much easier. Millions of people become enlightened.
  • Narrow AI solves protein folding; Atomically Precise Manufacturing (nanotech) becomes possible and affordable. (Post-scarcity?)
  • The existing political order collapses.
  • The global economy collapses and supply chains break down. (Is this a thing that could happen?)
  • Civilization abruptly collapses.
  • Nuclear war between two or more nuclear powers.
  • A major terrorist attack pushes the US into heretofore unprecedented levels of surveillance and law enforcement.
  • Sufficient progress is made on human health extension that many powerful people anticipate being within range of longevity escape velocity.
  • Genetic engineering (of one type or another) gives rise to a generation that includes a large number of people who are much smarter than the historical human distribution.
  • Advanced VR?
  • Significant, rapid global climate change.