One insight to AGI implies hard takeoff; zero insights implies soft

There is an enormous difference between the world where there are zero insights left before superintelligence, and the world in which there are one or more. Specifically, this is the difference between a soft and a hard takeoff, because of what we might call a “cognitive capability overhang”.

The current models are already superhuman in several notable ways:

  • Vastly superhuman breadth of knowledge
  • Effectively superhuman working memory
  • Superhuman thinking speed[2]

If there’s a secret sauce that is missing for “full AGI”, then the first AGI might have all of these advantages, and more, out of the gate.

It seems to me that there are at least two possibilities.

We may be in world A:

We’ve already discovered all the insights and invented the techniques that earth is going to use to create its first superintelligence in this timeline. It’s something like transformers pre-trained on internet corpora, and then trained using RL from verifiable feedback and on synthetic data generated by smarter models.

That setup basically just works. It’s true that there are relevant capabilities that the current models seem to lack, but those capabilities will fall out of scaling, just as so many others already have.

We’re now in the process of scaling it up and when we do that, we’ll produce our first AGI in a small number of OOMs.

…or we might be in world B:

There’s something that LLM-minds are basically missing. They can and will become superhuman in various domains, but without that missing something, they won’t become general genius scientists that can do the open-ended “generation, selection, and accumulation” process that Steven Byrnes describes here.

There’s at least one more technique that we need to add to the AI training stack.

Given possibility A, I expect that our current models will gradually (though not necessarily slowly!) become more competent and more coherent at executing long-term tasks. Each successive model generation / checkpoint will climb up the “autonomous execution” ladder (from “intern” to “junior developer” to “senior developer” to “researcher” to “research lead” to “generational researcher”).

This might happen very quickly. Successive generations of AI might traverse the remaining part of that ladder in a period of months or weeks, inside of OpenAI or Anthropic. But it would be basically continuous.

Furthermore, while the resulting models themselves might be relatively small, a huge and capex-intensive industrial process would be required to produce them. That provides affordances for governance to clamp down on the creation of AGIs in various ways, if it chooses to.


If, however, possibility B holds instead and the training processes that we’re currently using are missing some crucial ingredient for AGI, then at some point, someone will come up with the idea for the last piece, and try it.[3]

That AI will be the first, nascent, AGI system that is able to do the whole loop of discovery and problem solving, not just some of the subcomponents of that loop.[4]

But regardless, these first few AGIs, if they are incorporating developments from the past 10 years, will be “born superhuman” along all the dimensions on which AI models are already superhuman.

That is: the first AGI that can do human-like intellectual work will also have an encyclopedic knowledge base, a superhuman working memory capacity, and superhuman speed.

Even though it will be a nascent baby mind, the GPT-2 of its own new paradigm, it might already be the most capable being on planet earth.

If that happens (and it is a misaligned consequentialist), I expect it to escape from whatever lab developed it, copy itself a million times over, quickly develop a decisive strategic advantage, and seize control over the world.

It likely wouldn’t even need time to orient to its situation, since it already has vast knowledge about the world: it might not need to spend time or thought identifying its context, incentives, and options. It might know what it is and what it should do from its first forward pass.

In this case, we would go from a world populated by humans with increasingly useful but basically narrowly-competent AI tools, to a world with a superintelligence on the loose, in the span of hours or days.

Governance work to prevent this might be extremely difficult, because the process that produces that superintelligence is loaded much more on a researcher having the crucial insight than on any large-scale process that can be easily monitored or regulated.


If I knew which world we lived in, it would probably impact my strategy for trying to make things go well.

Some notes on the semiconductor industry

In the spring of 2024, Jacob Lagerros and I took an impromptu trip to Taiwan to glean what we could about the chip supply chain. Around the same time, I read Chip War and some other sources about the semiconductor industry.

I planned to write a blog post outlining what I learned, but I got pseudo-depressed after coming back from Taiwan, and never finished or published it. This post is a lightly edited version of the draft that has been sitting in my documents folder. (I had originally intended to include a lot more than this, but I might as well publish what I have.)

Interestingly, reading it now, all of this feels so basic that I’m surprised I considered a lot of it worth including in a post like this, but I think it was all new to me at the time.

  • There are important differences between logic chips and memory chips, such that at various times, companies have specialized in one or the other.
  • TSMC was founded by Morris Chang, with the backing of the Taiwanese government. But the original impetus came from Taiwan, not from Chang. The government decided that it wanted to become a leading semiconductor manufacturer, and approached Chang (who had been an engineer and executive at Texas Instruments) about leading the venture.
    • However, TSMC’s core business model, being a designerless fab that would manufacture chips for customers but not design chips of its own, was Chang’s idea. He had floated it at Texas Instruments while he worked there, and was turned down. The idea was bold and innovative at the time—there had never been a major fab that didn’t design its own chips.
      • There had been precursors on the customer side: small computer firms that would design chips and then buy some of the spare capacity of Intel or Texas Instruments to manufacture them. This was always a precarious situation for those companies, because they depended on firms that were both their competitors and their crucial suppliers. Chang bet that more companies would prefer to outsource fabbing, and that they would prefer to depend on a fab that wasn’t their competitor.
      • This bet proved prescient. With the advent of chip design software in the 80s, the barriers to chip design fell. At the same time, as transistor sizes got smaller and smaller, the difficulty of running a cutting-edge fab went up. Both trends incentivized specialization in design and outsourcing of manufacture.
  • Chang is sometimes described as “returning to Taiwan” to start TSMC, but this is only ambiguously correct. He grew up in mainland China, and had never been to Taiwan before he visited to set up a Texas Instruments factory there. He “returned” to start TSMC, only in the sense that the government of Taiwan was descended from the pre-revolutionary government of mainland China.
  • TSMC is the pride of Taiwan. TSMC accounts for between 5 and 25% of Taiwan’s GDP. (That’s a big spread. Double check!) The company is referred to as “the silicon shield”, meaning that TSMC preempts an invasion of Taiwan by China, because China, like the rest of the world, depends on TSMC-produced chips. My understanding is that the impact of this deterrent is overstated, but it’s definitely part of the zeitgeist.
  • Accordingly, the whole of Taiwanese society backs TSMC. Socially, there’s pressure for smart people to go into electrical engineering in general, and to work at TSMC in particular. Politically, TSMC pays very little in taxes, and when it needs something from the government (zoning rights, additional power), it gets it.
    • Chip War quotes Shang-Yi Chiang, head of R&D at TSMC:

“People worked so much harder in Taiwan,” Chiang explained. Because manufacturing tools account for much of the cost of an advanced fab, keeping the equipment operating is crucial for profitability. In the U.S., Chiang said, if something broke at 1 a.m., the engineer would fix it the next morning. At TSMC, they’d fix it by 2 a.m. “They do not complain,” he explained, and “their spouse does not complain” either.

  • Chips with more transistors packed more densely are better—able to do more computations. The “class” of a chip is called a “node.”
  • A production process—all the specific machines and specific procedures, embodied physically in a fab—is used to make each class of chips. “The leading node” is the production process that produces the cutting-edge chips to date (the chips with the most processing power and the most efficient energy consumption). A new node rolls out about once every 2 years. Typically the old fabs continue operating, manufacturing chips that are now behind the cutting edge.
  • Nodes are referred to by the size of an individual transistor on a chip, measured in nanometers; e.g., in 1999 we were at the 130 nm node. But around 2000, we started running into physical limits to making semiconductors smaller (for instance, the layers of insulation were only a few atoms thick, which meant that quantum tunneling effects started to interfere with the performance of the transistor). To compensate, chips started using a 3D design instead of a 2D design. Since then, the length of the transistor has stopped being a particularly meaningful measure. Nodes are still referred to by transistor length (we’re currently on the 4 nm node), but it’s now more of a marketing scheme than a description of physical reality (see the sketch after this list).
  • No one has ever caught up to the leading node. There used to be dozens of companies that could produce chips on the smallest scale allowed by the technology, but over the decades more and more companies have fallen back to fabbing chips that are somewhere behind the cutting edge. My understanding is that no one in history has ever overtaken the leaders from behind. Currently, TSMC is the only company that can produce leading node chips.
  • Semiconductor manufacturing is a weird mix of hypercompetitive and monopolistic.
    • On the one hand, my impression is that semiconductors, along with hedge funds, are the most competitive industries in the world, in the sense that very tiny improvements on an “absolute” scale translate into billions of dollars in profit. TSMC employs tens of thousands of engineers working 12 or 14 hours a day, day in and day out, to squeeze out tiny process improvements. (I was told that everyone at TSMC universally says that it’s a very hard place to work.)
    • On the other hand, the winner of that brutal race to stay at the front of the pack effectively has monopoly pricing power. No company in the world except TSMC can produce leading-node chips, so TSMC can effectively charge monopoly prices for their manufacture. (From what I read in the TSMC museum, their actual profit margins appear to be around 50%.)
    • At the same time, there are unusually high levels of vertical coordination between companies. The supply chain is extremely complex, and each step depends on specifications both upstream and downstream. Many of the inputs to chip production processes are distinctly not commodities. Very often, a crucial component of a sub-process will be produced by only one supplier and/or used by only one customer. For this reason, the companies in the chip industry are unusually well coordinated. ASML can’t make a secret bet on an improved lithography mechanism, because it needs to be compatible with TSMC’s process flows.
      • So the industry as a whole decides which technological frontiers to invest in, so that they can all move together. 
      • Further, major companies in the supply chain are often substantial investors in their suppliers, because they are depending on those suppliers to do the R&D to develop components that will be crucial to their business 3, 5, or 10 years down the line.
        • For instance, very early EUV lithography R&D was done by Intel, and Intel, Samsung, and TSMC all invested heavily in ASML, to make sure it could develop working EUV tech. ASML, in turn, manages a network of suppliers producing crucial high-precision components, including investing in those suppliers to make sure they have the funding they need, and doing corporate takeovers if ASML decides it can manage a company’s production better than the company can itself.
  • Jacob compared the chip industry to “a little bit of dath ilan on earth”. That sounds right to me. (Ironically, the semiconductor industry is the one industry on dath ilan that is not functioning like a dath ilani industry.)
  • Robin Hanson claims that companies reject prediction markets because executives don’t really want the company to know the truth, since the truth undermines their ability to spin a motivating narrative. But this industry might be the one where results, and accurate predictions, matter enough that the companies involved would embrace prediction markets.
  • From looking at videos of the inside of the fabs that were displayed in the TSMC museum, it looks like the whole process is automated. The videos don’t show workers operating machines. They show machines operating on their own—presumably with process engineers monitoring and adjusting their operation from a nearby room. Metal boxes, presumably containing wafers, are periodically lifted from the machines, transferred around the fab by robots attached to tracks on the ceiling, and then deposited in another machine.
  • The chip industry of every country that has a major chip industry benefits, or once benefited, massively from government intervention.
  • As a rule of thumb, it takes 10 years to go from a published paper on a process technology to a usable, scalable version. The papers published at conferences describe the manufacturing technology of 10 years in the future.
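To make the node-labeling point concrete, here’s a minimal sketch (mine, not from any of the sources above) of the idealized scaling schedule the labels still follow: roughly a 0.7× linear shrink per generation (i.e. ~2× transistor density), about every 2 years, starting from the 130 nm node in 1999. The numbers are illustrative, not measurements.

```python
# Idealized node cadence: ~0.7x linear shrink per generation (0.7^2 ~ 0.5,
# i.e. ~2x transistor density), one generation every ~2 years.
# Illustrative only: post-~2000, the labels describe no physical dimension.

def node_labels(start_nm=130.0, start_year=1999, generations=11):
    """Yield (year, nominal nm) under an idealized 0.7x shrink every 2 years."""
    nm = start_nm
    for g in range(generations):
        yield start_year + 2 * g, nm
        nm *= 0.7  # ~0.7x linear => ~0.5x area per transistor

for year, nm in node_labels():
    print(f"{year}: ~{nm:.0f} nm node")
```

This reproduces the familiar marketing sequence (130, 90, 65, 45, 32, 22, 16, …) fairly closely, which is presumably why the naming convention survived even after transistor length stopped tracking it: the actual gate pitch on a modern “4 nm” node is an order of magnitude larger than 4 nm.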

Notes on Tyler Cowen

I feel like I have a better understanding of Tyler Cowen.

He’s both an optimist and a pessimist, depending on what you’re comparing to:

He thinks that the world is getting better, decade by decade, and that what the West is doing, messy as it is, is working.

But he also thinks that the world is messy and complicated and political and hard to predict, and so it is hard to do much better than we’re doing. There are marginal improvements to be had in small spheres, but the people who dream of big overhauls, or who have theories of how institutions are massively underperforming, are naive.

He’s not a true believer. He doesn’t trust his own inside view very much. But he also, separately, understands that true believers are one of the key drivers of progress. He seeks out people who have ideologies and buy into them, and who are smart and careful thinkers, because he thinks those people drive progress, even if they’re over-optimistic and naive. This is why he hires people like Bryan Caplan and Robin Hanson.

Tyler broadly believes that the whole milieu of everyone pursuing their inside views, their ideologies that they believe in, generally drives things to get better, even though any individual ideology is wrong or overstated. He’s interestingly MTG-Green, embracing of Blue, rather than Blue himself.

Some barely-considered feelings about how AI is going to play out

Over the past few months I’ve been thinking about AI development, and trying to get a handle on whether the old-school arguments for AI takeover hold up. (This is relevant to my day job at Palisade, where we are working to inform policymakers and the public about the situation. To do that, we need to have a good understanding ourselves of what the situation is.)

This post is a snapshot of what currently “feels realistic” to me regarding how AI will go. That is, these are not my considered positions, or even provisional conclusions informed by arguments. Rather, if I put aside all the claims and arguments and just ask “which scenario feels like it is ‘in the genre of reality’?”, this is what I come up with. I expect to have different first-order impressions in a month.

Crucially, none of the following makes claims about the intelligence explosion, even though the details of the intelligence explosion (where AI development goes strongly recursive) are crucial to the long-run equilibrium of earth-originating civilization.

My headline: we’ll mostly succeed at prosaic alignment of human-genius-level AI agents

  • Takeoff will continue to be gradual. We’ll get better models and more capable agents year by year, but no jumps bigger than the one between Claude 3.7 and Claude 4.
  • Our behavioral alignment patches will work well enough.
    • RL will induce all kinds of reward hacking and related misbehavior, but we’ll develop patches for those problems (most centrally: for any given reward hack, we’ll generate some examples and counterexamples to include in the behavior training regimes; a toy sketch of what that might look like follows this list).
    • (With a little work) these patches will broadly generalize. Future AI agents won’t just not cheat at chess and won’t just abstain from blackmail. They’ll understand the difference between “good behavior” and “bad behavior”, and their behavioral training will cause them to act in accordance with good behavior. When they see new reward hacks, including ones that humans wouldn’t have thought of, they’ll correctly extrapolate their notion of “good behavior” to preclude this new reward hack as well.
    • I expect that the AI labs will figure this out, because “not engaging in reward-hacking-like shenanigans” is critical to developing generally reliable AI agents. The AI companies can’t release AI agent products for mass consumption if those agents are lying and cheating all over the place.1
    • Overall, the AI agents will be very obedient. They’ll have goals, insofar as accomplishing any medium-term task entails steering toward a goal, but they won’t have persistent goals of their own. They’ll be obedient assistants and delegates that understand what humans want and broadly do what humans want.
  • The world will get rich. LessWrong-style deceptive misalignment concerns will seem increasingly conspiracy-ish and out of touch. Decision makers will not put much stock in such concerns—they’ll be faced with a choice to forgo enormous and highly tangible material benefits (and cede those benefits to their rivals), on the basis of abstract concerns which have virtually no empirical examples, and which their advocates explicitly say are unfalsifiable.
  • There’s a gold rush to get the benefits before others do. The world is broadly in a “greedy” mode and not a “fearful” mode. The labs and relevant governments eagerly unleash their genius-level AI agents to automate AI R&D. At this point something even stranger happens.
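As a toy illustration of the patching story above (entirely my own sketch; the names and the pipeline are hypothetical, not any lab’s actual process): for each observed reward hack, you’d construct preference pairs contrasting the hack with the intended behavior on the same prompt, and fold those into a preference-optimization step (DPO-style or similar).

```python
# Hypothetical sketch: turning observed reward hacks into preference data
# for behavioral fine-tuning. All names here are my invention.

from dataclasses import dataclass

@dataclass
class PreferencePair:
    prompt: str
    chosen: str    # demonstration of the intended behavior
    rejected: str  # the observed reward hack

def patch_pairs(hack_transcripts, intended_completion):
    """Pair each reward-hacked completion with an intended one on the same prompt."""
    return [
        PreferencePair(
            prompt=prompt,
            chosen=intended_completion(prompt),  # e.g. a human-written fix
            rejected=hacked,
        )
        for prompt, hacked in hack_transcripts
    ]
```

The hope, in the scenario above, is that training on enough such pairs gets the model to generalize “don’t do things in this genre” rather than merely memorizing the specific hacks.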
  1. Though a friend points out that companies might develop mechanisms for utilizing cheap AI labor: tested incentive and affordance schemes, designed specifically to contend with the agents’ propensity for misbehavior. Just because the average person can’t trust an AI to do their taxes or watch their kids doesn’t mean that there aren’t enterprising businessmen who will find a way to squeeze useful outputs from untrustworthy AIs. ↩︎