The “function” of government

[note: probably an obvious point to most people]

Sleep

When I was younger I was interested the the question “why do we sleep? What is the biological function of sleep?” This is a more mysterious than one might naively guess, for the past 150 years scientists have put forth many theories of the function of sleep. But for every one of those theories, some of the specific observed facts about the biology of sleep don’t fit well with it.

At some point I realized that the question “what is the function of sleep?” relies on a confused assumption that there’s one only one function, or rather that “sleep” is one thing, rather than many overlapping processes.

A more accurate historical accounting is something like the following…

Many eons ago there was some initial reason why it was adaptive to early animals to have an active mode and a different less active mode. That original reason for that less active mode might have been any of a number of thing: clearing of metabolic waste products, investments in cellular growth over cellular activity, whatever.

But once an organism has that division between an active mode and a relatively inactive proto-sleep mode, the later comes to include many additional functions. As the complexity of the organism increases and new biological functions evolve, some of those functions will be more compatible with the proto-sleep mode than with the active mode, and so those functions evolve to occur in that mode. Sleep is all the biological processes that happen together during during the relatively inactive period.

On might be tempted to ask what the original purpose of the inactive mode was, and declare that the true purpose of sleep. But that would be yielding to an unfounded essentialism. Just because it was first doesn’t mean that it is in any sense more important. It might very well be that the original biological function that sleep evolved around (like a perl around a grain of sand) has itself evolved away. That has no baring on an organism’s evident need to sleep.

Government

Similarly, I had previously been thinking of states as stationary bandits. States emerge from warlord using violence to extort wealth from productive peasants, and evolve into their modern form as power-conflicts between factions within the ruling classes rearrange the locuses of power. I think this is basically right as a (simplified historical accounting).

But reading a bit about economic history, I have new sense of it being kind like evolved subsystems.

Yes, the state starts out as a stationary bandit, but once it’s there, and and taken for granted as a part of life, it is (for better or for worse) a natural entity to enforce contract law, provide public goods, run a welfare state, stimulate aggregate demand, or run a central bank. There’s a path dependency by which the state evolves to take on these functions because at any given step of historical development, the state is the existing institution that can most easily be repurposed to solve a new problem, which both changes and entrenches the power of the state, much as each newly evolved function that synergizes with the rest of sleep reinforces sleep as a behavioral pattern.

The difference

But unlikely in the case of sleep the original nature of the thing is still relevant to it’s current form. All of the later functions of the state are still founded on force and the use of force. Doing solving problems with a state almost necessarily requires solving them via, at some point in the process, threatening someone with violence.

In principle, many, maybe all of those functions could be served by voluntary, non-coercive institutions, but since the state, given it’s power, is the default solution, many problems get “solved” via more violence and more coercion than was necessary.

That states have additional layers of functionality, some of which are arguably aligned with broader socitey, doesn’t make me notably more positive about states. Rather, it makes them seem more insidious. When there’s an entity around that has, by schelling agreement, the legitimate right to use force to extract value, it creates a temptation to co-opt and utilize that entity’s power for many an (arguably) good cause, in addition to outright corruption.

Reflecting on some regret about not trying to join and improve specific org(s)

I started a new job recently, which has prompted me to reflect on my work over the past few years, and how I could have done better.

Concretely, I regret not joining SERI MATS, and helping it succeed, when it was first getting started. 

I think this might have been a great fit for me: I had existing skills and experience that I think would have been helpful for them. The seasonal on-off schedule would have given me the flexibility to do and learn other things. It would have (I think) helped me get a better grounding in Machine Learning and technical alignment approaches.

And if I had joined with an eye towards agentically shaping the organization’s culture and priorities as it developed, I think I would have had a positive impact on the seed that has grown into the current alignment field . In particular, I think I might have had leverage to establish some cultural norms regarding how to think about the positive and negative impacts of one’s work.1 

I regarded MATS as the obvious thing to do. The nascent alignment field was bottlenecked on mentorship—a small number of people (arguably) had good taste for the kinds of research that was on track, but had limited bandwidth for research mentorship, so conveying that research taste was (and is?) a bottleneck for the whole ecosystem. A program aiming to unblock everything else to expand the capacity for research mentorship as much as possible seemed like the obvious straightforward thing to do.

I said as much in my post from early 2023:

There is now explicit infrastructure to teach and mentor these new people though, and that seems great. It had seemed for a while that the bottleneck for people coming to do good safety research was mentorship from people that already have some amount of traction on the problem. Someone noticed this and set up a system to make it as easy as possible for experienced alignment researchers to mentor as many junior researchers as they want to, without needing to do a bunch of assessment of candidates or to deal with logistics. Given the state of the world, this seems like an obvious thing to do.

I don’t know that this will actually work (especially if most of the existing researchers are themselves doing work that dodges the core problem), but it is absolutely the thing to try for making more excellent alignment researchers doing real work. And it might turn out that this is just a scalable way to build a healthy field.

In retrospect, I should have written those paragraphs and generated the next thought “I should actively go try to get involved in SERI MATS and see if I can help them.”

So why didn’t I?

Misapplied notion of counterfactual impact

I didn’t do this because I was operating on the model/assumption that, while this was important, they were doing it now, and were probably not in danger of failing at it. It was taken care of and so I didn’t need to do it.

I now think that was probably a mistake. Because I didn’t get involved, I don’t know one way or the other, but it seems plausible to me that I could have contributed to making the overall project substantially better: more effective and with better positive externalities. 

This isn’t because I’ve learned anything in particular about how SERI MATS missed the mark, but just getting more exposure to organizations and adjusting my prior that even if an organization is broadly working, and not in danger of collapse, it might be the case that I can personally make it much better with my efforts. In particular, I think it will sometimes be the case that there is room to substantially improve an organization in ways that don’t line up very neatly with the specific roles that they’re attempting to explicitly hire for, if you have strategic orientation and specific relevant experience.2

This realization is downstream with my interactions with Palisade over recent weeks. Also, Ronny made a comment a few years ago (paraphrased) that “you shouldn’t work for an organization unless you’re at least a little bit trying to reform it”. That stuck with me, and changed my concept of “working for an org”.

Possibly this difference in frame is also partially downstream of thinking a bit about shapley values through reading Planecrash and thinking about donation-matching for SFC. (I previously aimed to do things that, if I didn’t do them, wouldn’t happen. Now, I’ve continuous-ized that notion, and aim for, approximately, high shapley value).

Underestimating the value of “having a job”

Also, regarding SERI potentially being a good fit for me in particular, I think I have historically underestimated the value of having a job for structuring one’s life and supporting personal learning. I currently wish that I had more technical background in ML and alignment/control work, and I think I might have gotten more of that if I had been actively trying to develop in that direction while supporting MATS in a non-technical capacity, instead of trying to develop that background (inconsistently) independently.

Strategic misgivings

I didn’t invest heavily in any project over recent years because there wasn’t much that I straightforwardly believed in. As noted above, the idea-of-MATS was a possible exception to this—it seemed like the obvious thing to do given the constraints of the world. And I now think I should take “this seems like the obvious thing to do” as a much stronger indicator that I should get involved with a project, somehow, and figure out how to help, than I previously did.

But part of what held me back from doing that was misgivings about the degree to which MATS was acting as a feeder pool for the scaling labs. MATS is another project that doesn’t seem obviously robustly good to me (or “net-positive”, though I kind of think that’s the wrong frame). As with many projects, I felt reticent to put my full force behind it for that reason.

In retrospect, I think maybe I should have shown up and tried to solve the problem of “it seems like we’re doing plausible real harm, and that seems unethical” from the inside. I could have repeatedly and vocally drawn attention to it, raised it as a consideration in strategic and tactical planning, etc. Either I would have shaped the culture around this problem for the MATS staff sufficiently that I trusted the overall organism to optimize safely, or we would have bounced off of each other unproductively. And in that second case, we could part ways, and I could move on.

In general, it feels like a more obvious affordance to me, now, if I think something is promising, but I don’t trust it to have positive impacts, I just try non-disruptively making it better according to the standards that I think are important, and if that doesn’t work or doesn’t go well, parting ways with the org.

This all begs the question, “should I still try to work for SERI MATS and make it much better?”

My guess is that the opportunity is smaller now than it was a few years ago, because both the culture and processes of the org have more found an equilibrium that works. There’s less leverage to make an org much better when the org is figuring out how to do the thing it’s trying to do, compared to when it has reached product-market-fit, and is mostly finding ways to reproduce that product consistently and reliably.

That said, one common class of error is overestimating the degree to which an opportunity has passed. e.g. not buying Bitcoin in 2017, because you believe that you’ve already missed the big opportunity—it’s true in some sense, but you’re underestimating how much of the opportunity still remains. 

So, if I were still unattached, writing this essay would prompt me to reach out to Ryan, and say directly that I’m interested in exploring working for MATS, and try to get more contact with the territory, so that I can see for myself. As it is, I have a job which seems like it needs me more, and which I anticipate absorbing my attention for at least the next year.

  1. Note: of all the things I wrote here, this is the point that I am most uncertain of. It seems plausible to me that because of psychological dynamics akin to “It is difficult to get a man to understand something, when his salary depends on his not understanding it”, and classic EA-style psychological commitment to life narratives that impart meaning via impact, the cultural norms around how the ecosystem as a whole thinks about positive and negative impacts, were and are basically immovable. Or rather, I might have been able to make more-or-less performative hand-wringing fashionable, and possibly cause people to have less of an action-bias , but not actually produce norms that lead to more robustly positive outcomes.

    At least, I don’t have a handle on either how to approach these questions myself, or how to effectively intervene on the culture about them. And so I’m not clear on if I could have made things better in this way. But I could have made this my explicit goal and tried, and made some progress, or not. ↩︎
  2. A bit of context that is maybe important. I have not, applied for a job since I was 21, and was looking for an interim job during college. Every single job that I’ve gotten in my adult life has resulted from either, my just showing up and figuring out how I could be helpful, or someone I already know reaching out to me and asking me for help with a project.

    For me at least, “show up and figure out what is needed and make that happen” is a pretty straightforward pattern of action, but it might be foreign to other people who have a different conception of jobs that is more centered on specific roles, that you’re well-suited for, and doing a good job in those roles. ↩︎