MR VS HN

Last night, my earlier post on polymaths was featured on Marginal Revolution. What followed were some of the nicest comments I’ve ever seen on the internet:

Over email:

  • I came across your blog after seeing the recommendation in Tyler Cowen’s post. I accidentally read (almost) every post. I admire your writing style.
  • I’m very curious who you are and what you do. I love your blog—it’s brilliant—and would like to work with you in some professional capacity.
  • I enjoy the way you think and the way you write, so you have my email.

And from the comments:

  • Just in case you haven’t clicked on the link above labeled "Beware the Casual Polymath", it takes you to a blog called Applied Divinity Studies that is really terrific. Highly recommended!
  • “From Beware the Casual Polymath (excellent link btw):”

Around 9 hours later, it was reposted to Hacker News where it’s accrued 150 points and 75 comments. Some of them are mean, but the real surprise is that many of them accuse me of being a bot. In other words, I fail the Turing test.

Over email:

  • Can you tell me how you’re doing it? A retrieval-based language model, perhaps?

And from the comments:

  • I’m about 95% certain this article was written with GPT or some similar algorithm.
  • Probably machine generated.
  • Would a human write that crap or are you shouting at a computer generated thingie?
  • It’s absolute rubbish. This is an experiment.
  • This looks like an experiment. Please de-cloak and explain.
  • Assuming this isn’t some GPT-3 garbage, why does any of this matter?
  • This is an experiment (I think) along the lines of the Turing test.

I don’t spend enough time on Hacker News to know if this is a common reaction or if there’s something uniquely bot-like about my writing. Maybe the unintended consequence of GPT-3 is that it makes people deeply distrustful of each other.

Upon closer examination, 5 of the 8 comments are from the same user. In one sense, this is comforting: at least fewer people think so. But in another sense, it’s much more worrying. When it comes to negative reactions, I would rather have 8 mild ones than 1 strong one.

Bus Factor 1

A friend at a prominent tech company took last month off using their new covid policy. When he got back to work last week, he was horrified. Not to see that things had fallen apart in his absence, but to realize that they hadn’t. Overwhelmingly, life without him had continued as normal.

This is not a coincidence. A company dependent on a single person’s work has a single point of failure. In tech jargon, it has a low Bus Factor: the number of people who would have to get hit by a bus for the project to fail. The higher your Bus Factor, the greater your institutional resilience.

But it’s hard to reconcile this experience with faith in an impactful career. If a firing squad executes a prisoner, we can say that each shooter is sufficient but not necessary. The success of the project depends on the shooters in aggregate, but not on any person in particular.

It’s easy to understand why an organization might want this kind of redundancy. Sure, it’s wasteful and inefficient, but it prevents the greatest threat to any company: giving employees leverage.

The Misery of Redundancy

If you’re reading this, you’re probably well versed in sublinear returns to headcount. You know about the Mythical Man-Month, and understand that communication scales quadratically. You shudder at the thought of growing complexity.

But the untold story isn’t about efficiency, it’s about negative returns to engagement. One person on a project is personally responsible. Two people might manage to coordinate. Beyond that, you’re in the endless hellscape of team projects.

Remember the last time you were assigned partners in school? A team project is a game of chicken. Each participant signals their willingness to fail through procrastination, until finally at the last minute someone defects and does all the work. The more you care, the more you lose.

(At least those projects ended. As a friend once put it, the weirdest thing about working at large companies is that you’ll wake up one morning, have a new teammate randomly assigned to you, and proceed to see them 8 hours a day every day until one of you quits or dies.)

This is the world of high Bus Factor. The greater a project’s supposed “resilience”, the more each person becomes dispensable. Sufficient, but not necessary.

(Even if you’re working in a mission-driven company, your job is meaningless unless you’re personally responsible for the success of that mission. The US Government might be the single most important organization in the world. How do you think it feels to work as an entry-level bureaucrat?)

Have you ever wondered why you spend so much time writing design docs and detailed PRs? Or why you’re constantly stuck in meetings explaining what you do to various groups of people? Or why your project has a separate engineer, EM, PM, Designer and Tech Lead? Maybe these are best practices, maybe we’ve all settled on the right way of doing things.

But they’re also ways of making sure that if you get hit by a bus, no one has to care.

Self-Reinforcing Churn

Every large tech company runs performance reviews twice a year. Don’t ask me why, it’s the way things have always been done.

Simultaneously, the average tenure at these companies hovers at around 2 years.

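With reviews twice a year and an average tenure of around 2 years, the typical employee lasts about four review cycles. That means that at any given point, roughly one quarter of your teammates are in their final cycle and totally checked out.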

On the high end, there are people who know they’re leaving in the next year, and just have absolutely no reason to care. If you’re at the beginning of a new review cycle, you’re 6 months away from getting PIP’d, which means you’re 12 months away from getting fired.

Odds are, you’re going to lose the game of chicken. You might not have the most to lose, but unless you’re also quitting next cycle, you certainly don’t have the least.

For the employer, the upshot is that anyone can quit at any time. If you’re on a 12 person team, you might lose 3 people each half. The only viable response is to up the Bus Factor, layer on more redundancy, and limit the importance of each individual.

Of course, the irony is that it’s precisely this sense of replaceability that drives churn in the first place! The more employees are alienated, the more willing they are to leave as soon as their equity vests.

So this ends up being a vicious cycle where higher churn forces a higher Bus Factor, which causes alienation, which increases churn, and so on until you either die or become Oracle.

Escaping the Labyrinth

The flip side is that there’s still hope.

Since the cycle is self-reinforcing, we can break it in the middle and slay the ouroboros. There’s historical path dependence, but no fundamental reason we have to live this way.

The alternative to redundancy isn’t fragility, it’s personal responsibility. It’s keeping your Bus Factor low.

In this world, you can develop specific mastery instead of a broad assortment of skills. You can be a craftsman instead of a code monkey, understand systems deeply instead of cargo culting best practices.

Imagine working at a company where every single one of your coworkers gives a shit. Imagine knowing that your contributions are important, that the world is different because of your existence.

That’s Bus Factor 1.

As far as I can tell, this is basically how SpaceX works. It’s not a coincidence that they’re ranked #1 for both stress and sense of meaning. If you hear that employees are overworked and underpaid, don’t cry abuse, ask how they’re getting compensated instead.

FAQ

Why exactly does leverage even matter?
A priori, it shouldn’t. If all employees have leverage, there’s nothing to negotiate for. You could ask for more money, but everyone else could too, and eventually the company falls apart and you all lose.

But a posteriori, we can work backwards from behavior. If companies are willing to sacrifice efficiency, that is itself evidence of exploitation.

That sounds crazy. If engineers are actually underpaid, why don’t they just go somewhere else? Why doesn’t the market raise salaries?
I’ve screened resumes from entry level engineers who claim to have saved their companies millions of dollars. The crazy part is that I believe it! If you can make even the smallest change to a ranking algorithm, or improve the performance of a costly computation, or run an experiment that marginally increases click-through rate, the effects multiply out across a huge user base.

And yet, I don’t know a single entry level engineer getting paid commensurately.

First, there’s a lack of counterfactual impact. Sure, you saved the company a million dollars, but if you didn’t take the job someone else would, and they would have done the same work.

Your impact looks huge, but in a high Bus Factor world, your actual impact is just the delta between your ability and the next best engineer. You might be able to negotiate another $10k, but not two orders of magnitude more.

Second, you don’t own the means of production. Your skills are only worth millions of dollars in the particular context of this company. You can’t walk away and produce that value on your own. You depend on the company much more than it depends on you.

Hmm, maybe, but if firms have so much market power, why don’t they just keep Bus Factor low but set a strong precedent against adversarial negotiation?
Great question. Since companies are good at internal coordination, and employees are bad at collective bargaining, firms ought to be able to keep salaries low even without increasing the Bus Factor.

You could just wait for someone to speak up, immediately fire them, take whatever financial hit comes from their project’s failure, but set the precedent for any future dissidents that negotiation is unacceptable.

The problem is, firms don’t handle negotiations, managers do, and their incentives are just as screwed up as yours. A manager stands to lose their career if too many projects fail, but it costs them nothing to advocate for your promotion. So in a low Bus Factor world, the tendency is always to default to generosity.

Why doesn’t this happen at SpaceX? Remember, none of this is actually about salaries or promotions, it’s about sociopathically pursuing leverage as an alternative to fulfilment when you have no other choices. This is all conditional on not being treated like a human, and not having a meaningful job.

In a Bus Factor 1 world, you have too much personal responsibility to coast on momentum or free ride on a teammate’s accomplishments. More importantly, you might actually personally care.

Wouldn’t running a company as Bus Factor 1 invite risk?
Absolutely. SpaceX has blown up rockets. Tesla has missed deadlines. Neuralink is doing god knows what.

But management science, financial engineering and organizational design are filled with countless other methods for deferring risk.

Okay fine, I’m sold, where do I sign up?
Pretty much nowhere!

I’ve asked every friend, accepted every cold call from recruiters, and as far as I can tell, nearly no companies or even individual teams are run with Bus Factor 1. Again, maybe SpaceX, but I’m relaying that second hand.

If you find out, let me know. Or better yet, start it yourself and tell me!

Base Rates on Secession

Here’s the scenario painted by The Atlantic, NYT and others:

  • On election day, Trump appears to win
  • In the following days, mail-in ballots are counted, leading to a reversal
  • Trump does not accept the result
  • All hell breaks loose

But rather than ask “is this the most contentious election ever?”, it would be better to ask “how many elections have been similarly contentious, and not led to secession”.

(I use secession interchangeably with civil war, revolution and independence.)

In a year of increasingly dramatic reporting, I want to make the basic claim that historical data can generate useful intuitions about the future.

For example: Trump’s approval rating in California is 29%. By comparison, 5 past presidents have had similar national approval ratings: Truman (22%), Nixon (24%), G. W. Bush (25%), Carter (28%) and G. H. W. Bush (29%). Nixon was forced to resign, but in the other cases there was no remarkable outburst. So at first approximation, it seems unlikely that California will take any kind of momentous action.

Of course, we can always make claims about why this time is different. Perhaps there will be a big swing left if the election goes poorly. But whatever those claims are, they should be expressed as updates to the base rate.

To be clear, none of this should be taken too literally, but it at least provides a point of reference for more dramatic and speculative arguments.

US Presidential Elections
There have been 58 presidential elections in US history. Only the 1860 election led to secession.
Rate: 1/58, 1.7%

Civil War in Any Country
Stanford University Professor Stephen D. Krasner writes “There are some thirty ongoing civil wars”. Wikipedia lists 32 ongoing civil wars as of April 2020. There are 195 countries total, giving:
Rate: 32/195, 16%

New Sovereign States since 2000
From Wikipedia, there have been 24 new sovereign states since 2000. Looking at a 4 year period, that’s 6 out of 195 countries.
Rate: 6/195, 3%

Independence Referendums since 1900
From Wikipedia, there have been 114 Independence Referendums since 1900. Notably, many of these were the former colonies of France breaking apart in the May 1958 Crisis and the breakup of the Soviet Union in 1991. In a 4 year period we have 114/195/(120/4):
Rate: 1.9%

4 Year Time Span Since the Establishment of the 13 Colonies
Including the American Revolution, and counting since Georgia’s 1732 establishment:
Rate: 2/(288/4), 2.8%

Scenario | Rate
US Presidential Elections | 1.7%
Civil War in Any Country | 16%
New Sovereign States since 2000 | 3%
Independence Referendums since 1900 | 1.9%
4 Year Time Span Since the Establishment of the 13 Colonies | 2.8%
Simple Average | 5.08%
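For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes the rates from the counts quoted above. The variable names are my own, and the 195-country denominator and 4 year window are the same assumptions used in the text. Note that the 5.08% figure is the average of the rounded table values; averaging the unrounded rates gives roughly 5.2%.

```python
# Recompute the base rates from the counts quoted in the post.
COUNTRIES = 195      # total number of countries used in the post
WINDOW_YEARS = 4     # one presidential term

exact_rates = {
    "US Presidential Elections": 1 / 58,
    "Civil War in Any Country": 32 / COUNTRIES,
    "New Sovereign States since 2000": 6 / COUNTRIES,
    "Independence Referendums since 1900": 114 / COUNTRIES / (120 / WINDOW_YEARS),
    "4 Year Time Span Since the 13 Colonies": 2 / (288 / WINDOW_YEARS),
}

# Percentages exactly as printed in the table (already rounded by hand).
table_percentages = [1.7, 16, 3, 1.9, 2.8]

table_average = sum(table_percentages) / len(table_percentages)
exact_average = 100 * sum(exact_rates.values()) / len(exact_rates)

for name, rate in exact_rates.items():
    print(f"{name}: {rate:.2%}")
print(f"Average of table values: {table_average:.2f}%")
print(f"Average of unrounded rates: {exact_average:.2f}%")
```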

Many of these rates beg for further qualification. I’ve provided a non-comprehensive list of corrections in the appendices addressing sensitivity analysis, the United States as a non-generic country, and so on.

I’m not going to rehash all the object-level arguments clearly detailed elsewhere. Yes, Trump has said he won’t step down. Yes, there are logistical concerns with mail-in ballots. Yes, there is currently an even number of Supreme Court justices.

But it’s hard to reason your way from “things seem very bad” to “shit will actually hit the fan”.

Intuitions on the question appear totally split. On one hand, it’s easy to say “there are already mass protests and this would tip the scales.” On the other hand, your uncle says he’s moving to Canada every time a Republican is elected, Texas says it’s seceding every time a Democrat is elected, and neither ever happens. Why should this year be different?

In these scenarios (high drama, limited information, bi-modal outcomes), it’s useful to turn to base rates.

Appendix A: Temporal Framing

I counted Independence Referendums since 1900, but there’s no good reason to pick this date over any other.

Here are the rates we would get if we started counting at other dates. (Data and analysis at this Google Sheet)

There’s a good bit of variance depending on when we start counting, ranging from 1.1% if we count every listed referendum, to 4.1% if we start counting right before the 1991 collapse of the Soviet Union.

There isn’t a clear advantage to any one start date. If we start counting from 1810, we’re capturing all the available data, but you might argue that the conditions allowing for civil war have changed, and it makes more sense to only count post-WWII, or post Soviet Collapse.

Working with parameterized models, you can pick a Schelling point to reduce the possibility of intentional manipulation, but when possible, it’s better to do the sensitivity analysis.

I’m lucky in that there’s really only one parameter here, so it’s easy to communicate these results as a simple bar graph and get a good intuitive feel for how much our start date matters. In more complex models, this might not be possible.
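As a rough sketch of what that one-parameter sensitivity analysis looks like in code: the function below applies the same formula as the referendum rate (count / 195 / number of 4 year windows) for any start year. The function name and the list of years are hypothetical placeholders of my own, not the Wikipedia data from the sheet, so the printed numbers only illustrate the mechanics.

```python
def referendum_rate(years, start_year, end_year=2020, countries=195, window=4):
    """Referendums per country per `window`-year period, counting from start_year."""
    if start_year >= end_year:
        raise ValueError("start_year must be earlier than end_year")
    count = sum(1 for year in years if start_year <= year <= end_year)
    windows = (end_year - start_year) / window
    return count / countries / windows


# PLACEHOLDER years for illustration only; substitute the real referendum list.
placeholder_years = [1905, 1946, 1958, 1958, 1958, 1991, 1991, 1991, 1991, 2014]

for start in (1810, 1900, 1945, 1990):
    rate = referendum_rate(placeholder_years, start)
    print(f"Counting from {start}: {rate:.2%}")
```

With the real list, each start year gives one bar in the graph.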

Appendix B: American Uniqueness

Since the future is always in some sense unprecedented, it’s not always clear which statistics to use. This is aggravated by American uniqueness. Whatever you think of American exceptionalism as a doctrine, there’s no denying that America is at least odd. Is it in the class of all countries? Democratic countries? Wealthy democratic countries? The reference class we choose will have huge implications for identifying relevant historical data.

But base rates are still good background information. Claims to uniqueness should take the form of a bayesian update, not a total disregard for priors.

Many of these base rates scream for further qualification. The 32 countries listed as engaged in civil war are generally much less wealthy and democratic than the US. But since the US is literally the world’s wealthiest country, it’s not clear where we should draw the line.

Appendix C: Fragile States Index

The 5.08% simple average is heavily skewed by the 16% rate of Civil Wars in any country. An obvious objection to this point is that the US is not like those other countries. We’re richer, more democratic, or in some broader sense, more stable.

The Fragile States Index attempts to capture this idea, looking at corruption, political stability, economic inequality and more.

We can’t look at 2020 data since it will account for the existence of ongoing civil wars. Instead, we’ll take the oldest historical data from 2006, and join it with the list of countries with civil wars that started after 2006:

Country | FSI Rating, 2006 (lower is better)
Syria | 89
Mali | 75
Central African Republic | 98
Egypt | 90
Libya | 69
Ukraine | 73
Yemen | 97
Cameroon | 88
Mozambique | 75
United States (2020) | 38

We should still take into account the dangers listed in Appendix A, but since FSI is a sensible leading indicator, it feels reasonable to suggest that countries with low FSI are very unlikely to start a civil war, and that the base rate for the population of countries relevant to the US is actually 0%.
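To make the comparison concrete, here is a small sketch using only the ratings from the table above. The variable names are my own. It compares the 2020 US rating against 2006 ratings for the other countries, exactly as the table does, so treat it as a rough check rather than a like-for-like comparison.

```python
# 2006 FSI ratings for countries whose civil wars started after 2006 (from the table).
fsi_2006 = {
    "Syria": 89,
    "Mali": 75,
    "Central African Republic": 98,
    "Egypt": 90,
    "Libya": 69,
    "Ukraine": 73,
    "Yemen": 97,
    "Cameroon": 88,
    "Mozambique": 75,
}
us_fsi_2020 = 38  # US rating from the table (2020, not 2006)

# The most stable country on the list before its civil war began.
most_stable = min(fsi_2006, key=fsi_2006.get)
gap = fsi_2006[most_stable] - us_fsi_2020

# How many of these countries rated as well as or better than the US?
at_or_below_us = [name for name, score in fsi_2006.items() if score <= us_fsi_2020]

print(f"Lowest pre-war rating: {most_stable} ({fsi_2006[most_stable]})")
print(f"Gap to the US rating: {gap} points")
print(f"Countries at or below the US rating: {len(at_or_below_us)}")
```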

Appendix D: Predictive Theories

It’s tempting to claim a different reference class by drawing up a broad theory, such as: “No western democratic country has had a civil war since WWI”. Except that isn’t actually true. Manuel Azaña was elected Prime Minister of Spain in 1931, leading up to the 1932 Coup, 1934 Revolution and 1936 Civil War.

So okay, maybe the real temptation is to claim a different reference class by drawing up a broad theory that backtests well. The broader danger is in finding a theory that backtests well, but isn’t actually predictive, and falling into a kind of Garden of Forking Paths. As usual, there’s a relevant XKCD.

If you go on Wikipedia and read about societal collapse, you’ll find a near-perfect description of America today:

factors such as environmental change, depletion of resources, unsustainable complexity, decay of social cohesion, rising inequality, secular decline of cognitive abilities, loss of creativity, and bad luck.

And if you read Joseph Tainter’s foundational work The Collapse of Complex Societies, you’ll find more of the same, with chapter headings ranging from resource depletion, to catastrophes and social dysfunction.

But note that Tainter is not trying to predict future collapse, merely explain past ones:
Societies do encounter resource shortages, class interests do conflict, catastrophes do happen, and not uncommonly the response does not resolve such problems. A general explanation of collapse should be able to take what is best in these themes and incorporate it. It should provide a framework under which these explanatory themes can be subsumed.

As a result, he doesn’t look at any civilizations which did not collapse in order to determine whether or not these factors are causal. Reading Tainter does not tell us anything about the relative proportion of civilizations that experienced resource shortages, class conflict, etc, and did not fall apart.

Appendix E: Asymmetric Returns

I intended all this as a remedy to otherwise alarmist rhetoric, but don’t take a low base rate of secession to mean that we shouldn’t worry about it. A game of Russian Roulette with a 50-chamber cylinder would still be terrifying.

At first approximation, you might think a ~2% chance of collapse gives you an expected value of 98% of your current life, however you may choose to measure it.

But the real harm isn’t in losing your existing health, property, net worth or other assets, it’s the damage to your long term wellbeing. In the Russian Roulette example, you should obviously be willing to pay much more than 2% of your current assets to avoid playing. Instead, it should be something closer to 2% of your expected assets over the rest of your life, subject to an appropriate temporal discount.

This is further aggravated by diminishing returns from assets to wellbeing. A 2% chance of death is much worse than a 2% tax levied on all future income.
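Here is a minimal sketch of the difference between pricing the risk against current assets and pricing it against discounted lifetime assets. Every number below (assets, income, years remaining, discount rate) is a hypothetical placeholder of my own, and the calculation is risk-neutral, so it ignores the diminishing-returns point and should be read as a floor rather than an estimate.

```python
def present_value(annual_income, years, discount_rate):
    """Discounted value today of a constant income received at the end of each year."""
    return sum(annual_income / (1 + discount_rate) ** t for t in range(1, years + 1))


p_collapse = 0.02          # rough base rate from the post
current_assets = 100_000   # hypothetical
annual_income = 50_000     # hypothetical
years_remaining = 40       # hypothetical
discount_rate = 0.03       # hypothetical temporal discount

# Naive view: you only stand to lose what you hold today.
naive_price = p_collapse * current_assets

# Lifetime view: you also stand to lose your discounted future income.
lifetime_assets = current_assets + present_value(annual_income, years_remaining, discount_rate)
lifetime_price = p_collapse * lifetime_assets

print(f"2% of current assets:  {naive_price:,.0f}")
print(f"2% of lifetime assets: {lifetime_price:,.0f}")
```

Under almost any inputs where future income dwarfs current savings, the second number is several times the first.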

I’m not saying you should rush out today and buy guns, or even canned beans. Just that as silly as it feels to prepare for unlikely events, it is sometimes better to pay for insurance than to pick up pennies in front of a steamroller.

Appendix F: Simple Averages

If the base rates are already hand wavy, taking the simple average is a frenetic shake. We can easily create semi-redundant rates to increase the weight given to some statistics. US Presidential Elections and 4 Year Span of US History turn up slightly different results, but essentially encode the same information.

You could do sensitivity analysis as before, or report two different averages, with and without the outlier. I’m not sure what the proper adjustment is other than, again, to not take any of this at face value.
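For what it’s worth, the two averages are easy to compute from the rounded rates in the summary table. This is just a sketch with names of my own, and which rates count as outliers or as semi-redundant is a judgment call.

```python
# Rounded rates (in percent) from the summary table.
table_rates = {
    "US Presidential Elections": 1.7,
    "Civil War in Any Country": 16.0,
    "New Sovereign States since 2000": 3.0,
    "Independence Referendums since 1900": 1.9,
    "4 Year Time Span Since the 13 Colonies": 2.8,
}


def average_excluding(rates, excluded=()):
    """Simple average of the rates, skipping any scenario named in `excluded`."""
    kept = [value for name, value in rates.items() if name not in excluded]
    return sum(kept) / len(kept)


with_outlier = average_excluding(table_rates)
without_outlier = average_excluding(table_rates, excluded=("Civil War in Any Country",))

print(f"With outlier:    {with_outlier:.2f}%")
print(f"Without outlier: {without_outlier:.2f}%")
```

Dropping the civil war rate takes the average from 5.08% to about 2.35%, which shows how much weight a single number carries.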

In the service of that aim, it’s worth noting that models are always garbage in, garbage out. Using overly sophisticated analytical tools can create the illusion of precision where it doesn’t exist. So yes, it might be worth doing the sensitivity analysis in this case, but it risks lending too much credibility while adding too little information.

Appendix G: Secession, Independence, Civil War, Revolution

I’ve conflated these forms of conflict throughout the post. Partially because it’s messy (the Civil War started out as mere secession), but also because I think this is what readers actually care about.

If I told you the odds of Civil War were x%, and then we got a revolution or military coup instead, you would probably feel cheated.

Appendix H: Metaculus Predictions

Metaculus is similar to a prediction market and weighs user predictions by their historical correctness. It has a Brier score of 0.095 and a Log score of 0.120, indications that it has been broadly reliable. Of course, this is partially a function of question popularity, with the most popular questions receiving thousands of predictions.

Will the USA enter a second civil war before July 2021?:
1%, 251 predictions

Will at least one US state secede from the Union before 31 December, 2030?:
5%, 51 predictions
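For readers unfamiliar with the Brier score mentioned above: for yes/no questions it is the mean squared difference between the predicted probability and the outcome (1 if it happened, 0 if not). A perfect forecaster scores 0, and always guessing 50% scores 0.25. The sketch below shows the standard definition with made-up forecasts; I’m not claiming it matches exactly how Metaculus aggregates its own track record.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probabilities (0..1) and binary outcomes (0 or 1)."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(forecasts, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical forecasts and resolutions, for illustration only.
example_forecasts = [0.01, 0.05, 0.90, 0.60]
example_outcomes = [0, 0, 1, 0]

print(f"Example Brier score: {brier_score(example_forecasts, example_outcomes):.3f}")
print(f"Always guessing 50%: {brier_score([0.5] * 4, example_outcomes):.3f}")
```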

[EDIT 10/08/2020]

There’s a systematic bias against predicting the apocalypse.

If you predict the world will end and then it does, you receive no benefit. Equivalently, if you predict the world won’t end and then it does, you receive no harm.

This is less true for only mildly-catastrophic risks, but still applies. You’ll survive, but the institutions set up to reward you for correctness may not.

So the epistemic correction is to tilt a bit more heavily towards apocalyptic thinking, and the behavioral correction is to avoid shaming people who incorrectly predict apocalypse.

In other news:

None of this affects the base rates, but it sure is interesting.

[EDIT 10/13/2020]

Dan Coats, formerly the Director of National Intelligence, issued a call last month to establish a bipartisan election commission.

Again, not good evidence in any direction, but it’s a good reminder that some predictions are self-defeating. The more apocalypse-prophets provide warnings, the more likely they are to be wrong. Another good reason to not be overly harsh in our criticism.

It’s also a reminder that apocalypse is often averted only through heroic effort. Y2K seems like a joke in retrospect, but only because the U.S. spent $150B (inflation adjusted) to prevent it.

The lack of nuclear war is similarly the result of massive ongoing efforts. As is the less-catastrophic-than-possible pace of climate change, the relative lack of antibiotic resistance and so forth.