Reflections: Three categories of capital

There are three categories of capital that one can invest in.

Knowledge, skill, experience

This includes what you know and what you know how to do.

But it also includes “experience”: the kind of tacit background that you only learn by interacting with some subpart of the world, and not just reading about it. Often, just having seen how something was done in some related context is more useful than any specific “skill” that you can learn on purpose. (For instance, some of the principles that go into developing and running a world-class workshop series are directly transferable to developing public advocacy materials, and having participated in making movies gives one a template for coordinating teams of contractors to get a job done.)

Reputation and connections

The application of many skills depends on access to the contexts where those skills are relevant. As a friend of mine says, “It’s not what you know, or even who you know, it’s who you know who knows what you know.”

Throughout most of my life, I tended to emphasize the value of skills, and didn’t think much at all about reputation or connections. This undercut my impact, and left me less powerful today than I might have been.

I’ve invested in skills that can help make teams much more effective, but many of those skills are not carved up very well by standard roles or job descriptions (for instance “conversational facilitation”, “effective communication”, and “knowing the importance of getting feedback, for real”). People who have worked with me know that I bring that value to the table. But most people who I might be able to provide value to don’t even know that they’re missing anything, much less what it is, much less that I can provide it.

Plus, relationships really are powerful for solving problems. Your ability to solve some types of problems scales with the size of the network of people who know and trust you.

If I move on from Palisade, one thing that I think I should invest in is my semi-public reputation. (Possibly I should write a blog that is optimized for readers, instead of writing for myself and also posting it on the internet.)

Financial capital

Having money is useful for doing stuff. You need a certain threshold of money for financial independence, and spending money can enable or accelerate the accumulation of the other kinds of capital.

Reflecting on some regret about not trying to join and improve specific org(s)

I started a new job recently, which has prompted me to reflect on my work over the past few years, and how I could have done better.

Concretely, I regret not joining SERI MATS, and helping it succeed, when it was first getting started. 

I think this might have been a great fit for me: I had existing skills and experience that I think would have been helpful for them. The seasonal on-off schedule would have given me the flexibility to do and learn other things. It would have (I think) helped me get a better grounding in Machine Learning and technical alignment approaches.

And if I had joined with an eye towards agentically shaping the organization’s culture and priorities as it developed, I think I would have had a positive impact on the seed that has grown into the current alignment field. In particular, I think I might have had leverage to establish some cultural norms regarding how to think about the positive and negative impacts of one’s work.1

I regarded MATS as the obvious thing to do. The nascent alignment field was bottlenecked on mentorship: a small number of people (arguably) had good taste for the kinds of research that were on track, but had limited bandwidth for research mentorship, so conveying that research taste was (and is?) a bottleneck for the whole ecosystem. A program aiming to unblock everything else to expand the capacity for research mentorship as much as possible seemed like the obvious, straightforward thing to do.

I said as much in my post from early 2023:

There is now explicit infrastructure to teach and mentor these new people though, and that seems great. It had seemed for a while that the bottleneck for people coming to do good safety research was mentorship from people that already have some amount of traction on the problem. Someone noticed this and set up a system to make it as easy as possible for experienced alignment researchers to mentor as many junior researchers as they want to, without needing to do a bunch of assessment of candidates or to deal with logistics. Given the state of the world, this seems like an obvious thing to do.

I don’t know that this will actually work (especially if most of the existing researchers are themselves doing work that dodges the core problem), but it is absolutely the thing to try for making more excellent alignment researchers doing real work. And it might turn out that this is just a scalable way to build a healthy field.

In retrospect, I should have written those paragraphs and generated the next thought “I should actively go try to get involved in SERI MATS and see if I can help them.”

So why didn’t I?

Misapplied notion of counterfactual impact

I didn’t do this because I was operating on the model/assumption that, while this was important, they were doing it now, and were probably not in danger of failing at it. It was taken care of and so I didn’t need to do it.

I now think that was probably a mistake. Because I didn’t get involved, I don’t know one way or the other, but it seems plausible to me that I could have contributed to making the overall project substantially better: more effective and with better positive externalities. 

This isn’t because I’ve learned anything in particular about how SERI MATS missed the mark. It comes from getting more exposure to organizations and updating towards the view that even if an organization is broadly working, and not in danger of collapse, I might personally be able to make it much better with my efforts. In particular, I think it will sometimes be the case that there is room to substantially improve an organization in ways that don’t line up very neatly with the specific roles that they’re attempting to explicitly hire for, if you have strategic orientation and specific relevant experience.2

This realization is downstream of my interactions with Palisade over recent weeks. Also, Ronny made a comment a few years ago (paraphrased) that “you shouldn’t work for an organization unless you’re at least a little bit trying to reform it”. That stuck with me, and changed my concept of “working for an org”.

Possibly this difference in frame is also partially downstream of thinking a bit about Shapley values through reading Planecrash and thinking about donation-matching for SFC. (I previously aimed to do things that, if I didn’t do them, wouldn’t happen. Now, I’ve continuous-ized that notion, and aim for, approximately, high Shapley value.)

Underestimating the value of “having a job”

Also, regarding SERI potentially being a good fit for me in particular, I think I have historically underestimated the value of having a job for structuring one’s life and supporting personal learning. I currently wish that I had more technical background in ML and alignment/control work, and I think I might have gotten more of that if I had been actively trying to develop in that direction while supporting MATS in a non-technical capacity, instead of trying to develop that background (inconsistently) independently.

Strategic misgivings

I didn’t invest heavily in any project over recent years because there wasn’t much that I straightforwardly believed in. As noted above, the idea-of-MATS was a possible exception to this—it seemed like the obvious thing to do given the constraints of the world. And I now think I should take “this seems like the obvious thing to do” as a much stronger indicator that I should get involved with a project, somehow, and figure out how to help, than I previously did.

But part of what held me back from doing that was misgivings about the degree to which MATS was acting as a feeder pool for the scaling labs. MATS is another project that doesn’t seem obviously robustly good to me (or “net-positive”, though I kind of think that’s the wrong frame). As with many projects, I felt reluctant to put my full force behind it for that reason.

In retrospect, I think maybe I should have shown up and tried to solve the problem of “it seems like we’re doing plausible real harm, and that seems unethical” from the inside. I could have repeatedly and vocally drawn attention to it, raised it as a consideration in strategic and tactical planning, etc. Either I would have shaped the culture around this problem for the MATS staff sufficiently that I trusted the overall organism to optimize safely, or we would have bounced off of each other unproductively. And in that second case, we could part ways, and I could move on.

In general, this now feels like a more obvious affordance to me: if I think something is promising, but I don’t trust it to have positive impacts, I can just try non-disruptively making it better according to the standards that I think are important, and part ways with the org if that doesn’t work or doesn’t go well.

This all raises the question, “should I still try to work for SERI MATS and make it much better?”

My guess is that the opportunity is smaller now than it was a few years ago, because both the culture and processes of the org have more fully settled into an equilibrium that works. There’s more leverage to make an org much better when the org is still figuring out how to do the thing it’s trying to do than when it has reached product-market fit and is mostly finding ways to reproduce that product consistently and reliably.

That said, one common class of error is overestimating the degree to which an opportunity has passed. For example, not buying Bitcoin in 2017 because you believe that you’ve already missed the big opportunity: that’s true in some sense, but you’re underestimating how much of the opportunity still remains.

So, if I were still unattached, writing this essay would prompt me to reach out to Ryan, and say directly that I’m interested in exploring working for MATS, and try to get more contact with the territory, so that I can see for myself. As it is, I have a job which seems like it needs me more, and which I anticipate absorbing my attention for at least the next year.

  1. Note: of all the things I wrote here, this is the point that I am most uncertain of. It seems plausible to me that because of psychological dynamics akin to “It is difficult to get a man to understand something, when his salary depends on his not understanding it”, and classic EA-style psychological commitment to life narratives that impart meaning via impact, the cultural norms around how the ecosystem as a whole thinks about positive and negative impacts were and are basically immovable. Or rather, I might have been able to make more-or-less performative hand-wringing fashionable, and possibly cause people to have less of an action-bias, but not actually produce norms that lead to more robustly positive outcomes.

    At least, I don’t have a handle on either how to approach these questions myself, or how to effectively intervene on the culture about them. And so I’m not clear on whether I could have made things better in this way. But I could have made this my explicit goal and tried, and made some progress, or not.
  2. A bit of context that is maybe important: I have not applied for a job since I was 21, when I was looking for an interim job during college. Every single job that I’ve gotten in my adult life has resulted from either my just showing up and figuring out how I could be helpful, or someone I already know reaching out to me and asking me for help with a project.

    For me at least, “show up and figure out what is needed and make that happen” is a pretty straightforward pattern of action, but it might be foreign to other people who have a different conception of jobs, one that is more centered on specific roles that you’re well-suited for, and on doing a good job in those roles.