How Long Should You Take to Decide on a Career?

According to a recent survey from Rethink Priorities, when asked what best describes their current career, respondents replied:

  • Building flexible career capital and will decide later (18.7%)
  • Still deciding what to pursue (17.2%)

I was struck by how perfectly this aligns with the optimal solution to the Secretary Problem (SP): given N options, and subject to certain constraints, you should evaluate 37% of them before committing (those two survey responses add up to 35.9%). This is mostly coincidental, but it leads to a longer exploration of real-life applications of SP-like dynamics.

Of course, SP is just a toy model. For our purposes, its two most problematic assumptions are Binary Payoff (the evaluator’s goal is merely to maximize the probability of selecting the best candidate, so 2nd best is just as bad as the worst) and No Opportunity Cost (evaluation is treated as free, so there’s no cost to making a selection after seeing 100 candidates rather than after 10). I refer to an adjusted SP without these assumptions as the Modified Secretary Problem (MSP).

This post is in two parts:
1. Optimal Stopping Points for the MSP
2. Discussion

All code is available here as a Colab Notebook.

Optimal Stopping Points for the MSP

Setup: To correct for these disanalogous assumptions, we’ll make the following modifications:

  • Utility is proportional to the quality of the selected candidate. Unlike the regular SP, 2nd best is almost as good as 1st, and much better than worst.
  • Utility is proportional to the number of candidates remaining after selection. This gives an opportunity cost to evaluation, and introduces a kind of explore-exploit tradeoff.

Additionally, we’ll consider possible scenarios where:

  • Candidate quality is uniformly, normally or log-normally distributed
  • Utility is gained in one of four ways:
    • Purely based on whether or not you picked the best candidate (BINARY)
    • Directly proportional to the quality of the selected candidate (PROPORTIONAL)
    • Directly proportional to the quality of the selected candidate, multiplied by the number of remaining candidates (TIME)
    • At each time step in proportion to the quality of the candidate currently being evaluated, plus the quality of the selected candidate multiplied by the number of candidates remaining at the end (TIME_CONTINUOUS)
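The four payoff schemes can be sketched as small functions. This is one reading of the descriptions above, not the notebook's code: the function names, the 0-indexed `pick` position, and the choice to count the selected candidate's own evaluation step in TIME_CONTINUOUS are all assumptions made for illustration.

```python
def binary_payoff(qualities, pick):
    """1 if the selected candidate is the best in the pool, else 0."""
    return 1.0 if qualities[pick] == max(qualities) else 0.0


def proportional_payoff(qualities, pick):
    """Utility equals the quality of the selected candidate."""
    return qualities[pick]


def time_payoff(qualities, pick):
    """Quality of the selection times the number of candidates left after it."""
    remaining = len(qualities) - (pick + 1)
    return qualities[pick] * remaining


def time_continuous_payoff(qualities, pick):
    """Quality collected while evaluating (assumed to include the pick itself),
    plus the TIME payoff for the periods that remain."""
    while_evaluating = sum(qualities[: pick + 1])
    return while_evaluating + time_payoff(qualities, pick)
```

Under TIME and TIME_CONTINUOUS a late pick is multiplied by a small number of remaining periods, which is what gives evaluation an opportunity cost.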

Since BINARY payoff doesn’t depend on the quality distribution, we run a 3x3 matrix of scenarios, plus the initial base condition.

Approach: There is an analytical solution to the Secretary Problem described in Appendix A, but we’ll be focusing on a numerical solution (i.e. simulation). I start by replicating the original result to test the validity of the simulation approach, then find optimal stopping points for the 9 modified scenarios.

Pseudocode

  • Generate n=100 candidates with quality sampled from the scenario’s distribution
  • For each possible stopping point (S), run t=10,000 trials
  • Find the best_initial candidate from the first S candidates
  • Iterate through the remaining candidates in order, selecting the first candidate with quality better than the best_initial candidate

This is an implementation of the optimal strategy for the original SP described in Analysis of heuristic solutions to the best choice problem (Stein, Seale and Rapoport, 2003).
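A minimal sketch of the pseudocode above, using only the standard library. It is not the notebook's code: the function names are hypothetical, candidates are redrawn on every trial (an assumption, since the pseudocode does not say), and only the UNIFORM distribution with BINARY payoff is shown.

```python
import random


def cutoff_rule(qualities, stop):
    """Index chosen by the cutoff rule: skip `stop` candidates, then take
    the first one better than everything seen in that initial window."""
    best_initial = max(qualities[:stop], default=float("-inf"))
    for i in range(stop, len(qualities)):
        if qualities[i] > best_initial:
            return i
    return len(qualities) - 1  # nobody beat the bar: default to the last candidate


def success_rate(stop, n=100, trials=10_000, seed=0):
    """Fraction of trials in which the cutoff rule picks the single best candidate."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        qualities = [rng.random() for _ in range(n)]  # UNIFORM scenario
        pick = cutoff_rule(qualities, stop)
        wins += qualities[pick] == max(qualities)
    return wins / trials
```

Scanning every stopping point, for example with `max(range(1, 100), key=success_rate)`, reproduces the search described above, though in pure Python it takes a while at 10,000 trials per point.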

Results
First, we manage to replicate the original result, yielding a stopping point of 34%. This is close to the optimal solution of 37%, but clearly noisy.
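To gauge how much of the gap between 34% and 37% is noise, the simulation can be compared against the exact success probability of the cutoff rule. The sketch below uses the standard closed form P(r) = (r/n) · Σ_{i=r}^{n-1} 1/i for skipping the first r of n candidates; the function name is hypothetical and this is not taken from the notebook.

```python
def exact_success_probability(stop, n=100):
    """Exact chance that skipping `stop` candidates and then taking the first
    record-setter selects the overall best of n candidates."""
    if stop == 0:
        return 1 / n  # always take the first candidate
    return (stop / n) * sum(1 / i for i in range(stop, n))


best_stop = max(range(100), key=exact_success_probability)
gap = exact_success_probability(37) - exact_success_probability(34)
print(best_stop, round(exact_success_probability(best_stop), 4), round(gap, 4))
```

For n=100 the curve is very flat near its peak: the exact success probabilities at cutoffs of 34 and 37 differ by less than 0.1 percentage points, which is smaller than the sampling error of a 10,000-trial estimate (roughly 0.5 percentage points). A simulated optimum of 34 is therefore consistent with the analytical answer.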

Numerical solution (mine, left) compared to analytical solution from Stein et al., 2003

Next, we’ll run the MSP for the matrix of initial conditions.

Table of Results

(optimal stopping point, % of candidates evaluated)

DISTRIBUTION   BINARY   PROPORTIONAL   TIME   TIME_CONTINUOUS
UNIFORM        35       9              2      3
NORMAL         N/A      8              1      5
LOG_NORMAL     N/A      24             7      12

Charts

Sanity Checks
Let’s begin with a quick sanity check of the results. All scenarios remain characterized by a U-shaped curve. Stopping too early means setting the bar too low and settling too soon. But stopping too late means not having enough candidates left to evaluate, and risks going home empty-handed (technically, defaulting to the last candidate in the pool).

The time-sensitive conditions all result in substantially earlier stopping points than their non-temporal equivalents. Again, this makes sense. Taking into account opportunity costs will push the decision time earlier, providing more time to “exploit” the benefits of a good candidate selection.

It also makes sense that TIME_CONTINUOUS stops a bit after TIME. Since utility is gained during the evaluation process, the effect of opportunity costs is dulled.

Discussion

The Difficulty of Reasoning from Toy Models, and Ambiguity of Increased Realism
We’ve adapted the secretary problem to resolve a couple of key disanalogies with the real-life problem of selecting a career. These changes tend to push the optimal stopping point much earlier, with one scenario advocating a stopping point of just 1%. This admits two interpretations:

  • Object-level: Take this result literally, and apply it to major life decisions
  • Meta-Level: Since the solution is so highly subject to model parameters, don’t take any of this too literally

Note that the model parameters are not arbitrary, so this is different than just conducting sensitivity analysis and declaring the whole model unreliable. I genuinely feel that the modifications made to the original SP make the model more realistic.

Having said that, it is not necessarily true that a more realistic model will yield better results. Going from UNIFORM+PROPORTIONAL to NORMAL+PROPORTIONAL, you are arguably getting a more realistic model, but the stopping point goes from 7% to 19%. NORMAL+TIME_CONTINUOUS is perhaps even more realistic, but results in a subsequent drop of the stopping point from 19% down to 4%.

So the fact that this discussion is more nuanced than the original problem has some benefits, but doesn’t necessarily indicate that results are more applicable.

This is reminiscent of The Atlantic’s “The Curse of Econ 101”, which argues that economic reasoning, applied too naively, can be more misleading than useful. Econ 101 is an important step on the path to reasoning rigorously about difficult problems, but there’s no guarantee that taking the step will make your decisions better in the short run.

Intuitions are Arbitrarily Bad
Given the flaws of formal models, we might wish to retreat to a more intuitive stance. Abstractly, you could even extend this to a broader critique against rationalism, or against modernism, or against planning and so forth.

Tanner Greer disagrees. Although intuition, and its cousin cultural tradition, were helpful in the past, our current world is too bizarre. He goes on to conclude:

The trouble with our world is that it is changing… What traditions could their grandparents give them that might prepare them for this new world? By the time any new tradition might arise, the conditions that made it adaptive have already changed… This may be why the rationalist impulse wrests so strong a hold on the modern mind. The traditions are gone; custom is dying. In the search for happiness, rationalism is the only tool we have left.

Intuition is not quite the same as custom, but it’s related. Your intuitions might stem from an evolutionary background, or advice from your parents, or a broader set of cultural norms. But these are all maladapted to the current moment, and to your current circumstances.

At the extremes, intuition can easily veer into neurosis. It’s easy to feel “I can’t commit to a career path until I’ve seen more of them, I’m always afraid that there’s a better opportunity around the corner.” Or alternatively, “What I have now is good enough. I should be grateful for this opportunity, and not try too hard to improve my life.”

Formal models might not be right, but they can at least help disabuse us of even worse mental models.

Further Disanalogies and Alternative Strategies
So far, we’ve analyzed different optimal stopping points, but only for a single strategy (look at the first K candidates, identify the best_initial, then pick the first subsequent candidate better than best_initial). This is one reasonable approach, but it’s not the only one. From Stein et al. on alternative strategies:

The Cutoff Rule is the one we’re familiar with, and results in the highest peak, making it superior for the original SP. However, the Successive Non-Candidate Rule peaks earlier, making it potentially superior for scenarios that incorporate opportunity cost. Modifying the simulation code to incorporate this strategy is a promising avenue for future work.
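As a starting point for that modification, here is a minimal sketch of the Successive Non-Candidate Rule as I understand it from the heuristics literature: a "candidate" is anyone who is the best seen so far, and the rule selects the first candidate that appears after at least k consecutive non-candidates. The function name and the fallback to the last candidate are assumptions, chosen to mirror the cutoff-rule setup above.

```python
def successive_non_candidate_rule(qualities, k):
    """Pick the first best-so-far candidate that follows a run of at least
    k consecutive non-candidates; default to the last candidate."""
    best_so_far = float("-inf")
    run = 0  # consecutive applicants that failed to set a new record
    for i, q in enumerate(qualities):
        if q > best_so_far:
            if run >= k:
                return i
            best_so_far = q
            run = 0
        else:
            run += 1
    return len(qualities) - 1
```

Unlike the cutoff rule, the waiting period here is not fixed in advance: it adapts to how long it has been since the last record-setter, which is one plausible reason it can commit earlier.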

In real life, there are all sorts of other strategies we can imagine. The candidate pool is not just an ordered list of opportunities to step through linearly, it’s a huge and dynamic space of possibilities you can jump to in any order.

Job searches in particular are not nearly as blind. You can take a job, consider the particular features that would have improved it, and then seek out subsequent opportunities on the basis of that knowledge. You might also ask friends, read about other people’s careers, take some kind of career aptitude evaluation, and so on.

Additionally, the SP and MSP consider only relative knowledge. As the evaluator, all you know about a candidate is how it ranks compared to previous candidates. In real life, there is some capacity, albeit limited, for more absolute evaluations. A job in a coal mine would (probably) not just be worse than any job I’ve had before, it would be clearly and dramatically so.

Finally, job searches are highly path-dependent. You don’t just “try out” being a PhD student to see if you like it. You pursue that particular credential in the service of gaining access to specific further opportunities, some of which you might not even get to evaluate before taking on a massive commitment. Similarly, you don’t get to “try out” being a billionaire startup CEO until you spend years in other “jobs” on the path to get there.

Empirical Data from the Effective Altruism Community
According to a recent survey from Rethink Priorities, when asked what best describes their current career, Effective Altruists’ replies included:

  • Building flexible career capital and will decide later (18.7%)
  • Still deciding what to pursue (17.2%)

These percentages add up to 35.9%, which is surprisingly close to the optimal solution to the original SP (~36.8%), but very far from the solutions I propose here.

It’s worth acknowledging that this is not a uniform sample of people at random points in their career. The survey also notes that the mean age is just 30 (median 27). So the respondents are largely early-career, and precisely in the period of life where “exploration” takes precedence over “exploitation”.

There are two additional dynamics to consider.

First, many Effective Altruists view causes as having incredibly high variance, on a very skewed distribution. This results in a very high “Moral Value of Information”. Per Bykvist, Ord, and MacAskill in Moral Uncertainty:

it’s plausible that the most important problem really lies on the meta-level: that the greatest priority for humanity, now, is to work out what matters most, in order to be able to truly know what are the most important problems we face.

Analogously, career choice may involve various meta-strategies, including:

  • Spending (relatively) a lot of time evaluating different paths before committing
  • Working directly on cause prioritization
  • Building flexible career capital while we (collectively) make progress on identifying the most important problems

This tendency is further encouraged by the EA community’s appreciation for exponential growth curves. Rather than more linear views where it’s important to take advantage of known and proximate opportunities, the exponential view broadly encourages investment on the meta-level, or investments in the rate of growth itself.

Second, and in stark contrast, Effective Altruists might face an increased sense of urgency, and a need to begin doing direct work as soon as possible. As I argued earlier:

According to an Open Philanthropy estimate and AI Expert Surveys, there’s a 50% chance of transformative Artificial Intelligence emerging by around 2050… If you take this idea seriously, we should be obsessed with the short term to the exclusion of all other timescales.

So while a human lifespan (in the UK) is 81 years, with a retirement age of 65, timelines might be aggressively compressed if everything changes in 29 years. For the median EA at age 27, working life might only last until age 56.

Proleptic Career Choice
So far, we’ve assumed that the perceived quality of applicants is stable. In fact, the process of exploration may itself entail a shift in the evaluator’s desiderata. Perhaps working a job changes their beliefs about the quality of subsequent jobs, alters their personal circumstances, or even effects a deep transformation on the level of values. Consider:

  • Carol, an ambitious young Stanford grad, initially ascribes high value to Venture Capital and Entrepreneurship. After taking a job as an associate at an investment firm and seeing hundreds of failed startups, she becomes more hesitant to start a company herself.
  • Seeking financial stability, Peter initially places the greatest value in investment banking, followed by software engineering. After a stint in software, he’s earned enough money to retire, and now places more weight on non-financial aspects of future jobs, causing i-banking to fall in relative rank.
  • After moving to Chicago and experiencing frigid winters, Eve starts to value warmth more heavily and places higher value on future jobs in California and Florida.

Particularly savvy agents may actually take a job they don’t value, expecting it to change their values for the better:

  • A burgeoning Effective Altruist from London has no firsthand experience with direct aid, and can’t really relate to the plight of the very poor. Nevertheless, they take a job in global development, hoping that they’ll develop a better appreciation for the role once they’re already doing it.

Agnes Callard describes this internal tension at length:

One characteristic of someone motivated by these complex reasons… is some form of embarrassment or dissatisfaction with oneself. She is pained to admit, to herself or others, that she can “get herself” to listen to music only through those various stratagems. She sees her own motivational condition as in some way imperfectly responsive to the reasons that are out there. Nonetheless, her self-acknowledged rational imperfection does not amount to akrasia, wrongdoing, error, or, more generally, any form of irrationality. Something can be imperfect in virtue of being undeveloped or immature, as distinct from wrong or bad or erroneous. (There is something wrong with a lion that cannot run fast, but there is nothing wrong with a baby lion that cannot run fast.) When the good student of music actively tries to listen, she exhibits not irrationality but a distinctive form of rationality.

….Thus I will defend the view that you can act rationally even if your antecedent conception of the good for the sake of which you act is not quite on target—and you know that. In these cases, you do not demand that the end result of your agency match a preconceived schema, for you hope, eventually, to get more out of what you are doing than you can yet conceive of. I call this kind of rationality “proleptic.”

In some version of this view, career choice is not merely a matter of evaluation and selection, but of active exploration, information-seeking, and intentional self-modification.

It’s important to understand this as a dynamic process. Leaving behind the toy model, I’m suggesting that career choice takes part in a self-modulating cycle of:

  • Trying out a job
  • Updating your beliefs and values as a result
  • Imposing a new ranking function on the basis of those changes
  • Seeking out a next job on the basis of that novel ranking
  • …and so on

This is not merely path-dependence. It is a kind of profound illegibility. If the loss function is itself updating in real-time, all optimization techniques fail.

I can think of one avenue for salvation. Earlier, we discussed the case of an Effective Altruist trying to “change their values for the better.” Rather than “values” on the level of “care for animals” or “financial stability”, agents could be modeled as having “meta-values” on the level of, for example:

  • Taking on values that lead to long-term satisfaction.
  • Aligning emotional motivations with cognitive beliefs about what is right.
  • Better approximating a “correct” moral view.

If these meta-values, at least, were stable, the problem would be partially resolved.

See also
Robert Wiblin – How replaceable are the top candidates in large hiring rounds?
Stein, Seale and Rapoport, 2003 – Analysis of heuristic solutions to the best choice problem
Chapter 1 of Algorithms to Live By.
Robert Wiblin – The ‘secretary problem’ is too bad a match for real life to usefully inform our decisions — so please stop citing it

Varieties of Deterministic Experience

What determines the fate of our world? Depending on your views, there may be a few different macro narratives:

Memetic Determinism

There are ideas floating around with various evolutionary properties. Some of them are really good at embedding in our minds, some are really good at making their hosts spread them further. The best memes get stuck in your brain, compel you to attain a position of great prestige, credibility and power, then insist that you spread them as wide and far as possible.

There are also meme-complexes (Memeplexes) with symbiotic or parasitic relationships. For example, some people are compelled to become Venture Capitalists, but it’s useless without the corresponding meme that compels people to become startup founders.

From Nadia’s The tyranny of ideas:

Rather than viewing people as agents of change, I think of them as intermediaries, voice boxes for some persistent idea-virus that’s seized upon them and is speaking through their corporeal form. You might think of this as “great prophet theory”.

Ideas ride us into battle like warhorses. We can witness, participate in, and even lead these battles, but their true meaning eludes us. We don’t really know where ideas come from, nor how to control them.

See Also
Wikipedia – Memeplex
Dawkins – The Selfish Gene
Joe Carlsmith – The innocent gene
Bernard Beckett – Genesis
Creanza et al. – Cultural evolutionary theory

Financial / Economic Determinism

There are incentives which necessitate certain consequences. If it’s profitable, it will be built. From Scott’s Meditations on Moloch:

Just as you can look at an arid terrain and determine what shape a river will one day take by assuming water will obey gravity, so you can look at a civilization and determine what shape its institutions will one day take by assuming people will obey incentives… Just as the course of a river is latent in a terrain even before the first rain falls on it – so the existence of Caesar’s Palace was latent in neurobiology, economics, and regulatory regimes even before it existed. The entrepreneur who built it was just filling in the ghostly lines with real concrete.

More specific arguments can apply in local contexts. For example:

  • There are different dating apps.
  • Ironically, the ones that fail to create stable matchings will see more repeat use and popularity.
  • Thus, the dominant dating app will inevitably be designed to alienate.

More broadly, organizations may promote certain values, scientific institutions may promote the construction of certain types of knowledge, and platforms may promote certain kinds of content.

See Also
Wikipedia – Base and superstructure
Wikipedia – Economic Determinism
Applied Divinity Studies – How Substack Became Milquetoast

Game Theoretic Determinism

This perspective is frequently used in very long-term forecasting, and in reasoning about alien civilizations or superintelligent agents. For example, an argument might take the form:

  • Some civilizations will colonize the universe
  • Thus, the far future universe will be populated mainly with agents who are pro-expansion and pro-progress

Or more mundanely:

  • Some religious groups on earth promote high-fertility rates
  • Thus, the near future will be populated mainly with agents who are pro-population growth
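Both arguments rest on differential growth compounding over time. A minimal sketch of that arithmetic follows; the function name and every number in it (a 1% starting share, growth factors of 1.5 and 1.0 per generation) are hypothetical, and it assumes the trait is passed on perfectly, which is the premise doing most of the work in arguments of this shape.

```python
def share_of_fast_group(initial_share, fast_growth, slow_growth, generations):
    """Population share of the faster-growing group after some generations,
    assuming each group keeps a constant growth factor and no one switches."""
    fast = initial_share * fast_growth ** generations
    slow = (1 - initial_share) * slow_growth ** generations
    return fast / (fast + slow)


# Illustrative numbers only: a 1% minority growing 1.5x per generation
# against a majority that merely replaces itself.
for g in (0, 5, 10, 20):
    print(g, round(share_of_fast_group(0.01, 1.5, 1.0, g), 3))
```

Under these assumptions the minority ends up as the overwhelming majority; relax the perfect-inheritance assumption and the conclusion weakens accordingly.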

There are more nuanced and complex arguments of this general shape. In Liu Cixin’s Three Body Problem Trilogy:

  • Civilizations can’t trust each other
  • Technological progress can proceed exponentially, such that observing that a civilization is harmless from thousands of lightyears away is no guarantee that they’ll be harmless by the time you arrive
  • Thus, any civilization that learns about the existence of another civilization will immediately act forcefully to destroy it

See also:
Scott Alexander – The Hour I First Believed
Philip Trammell – Which World Gets Saved
Ben West – An Argument for Why the Future May Be Good

Physical Determinism

There are particles in the universe (including those in our brains) subject to physical laws which determine their behavior.

Discussion

This is a limited typology; there are likely many other views, both reasonable and not. Note that these views are not mutually exclusive. You could say that things are determined by physics at the level of individual particles, but by memes at the level of human behavior. Or that it’s all incentives, but that those incentives are modulated by facts about human psychology.

Note as well that these views should be neither horrifying nor necessarily comforting. To take another example, evolution does provide a guarantee that agents are well-adapted to some set of circumstances, but provides no guarantee that they remain well-adapted. Or in capitalism, there is a strong likelihood that a sufficiently profitable business will be created by someone, but no guarantee that profit proxies well for human values.

Finally, these views (perhaps with the exception of physical determinism) are not really absolute. You are perhaps a slave to memes as brain-parasites, but have some control over which memes you choose to take on. Though willpower may be subjugated to habit in general, occasional breakthroughs of willpower might allow you to, for example, delete Twitter, sign up for therapy and install a Chrome extension that blocks content recommendations.

Identifying more of these views, mapping their influence in various social spheres, and better reasoning through their consequences is a promising avenue for future work. In particular, I would love to see follow-up posts.

See Also
Katja Grace – What is going on in the world?
Yudkowsky – An Alien God
Yudkowsky – Inadequate Equilibria

Yes, X Causes Y

Holden wants to know: Does X cause Y? Whenever we try to figure it out, there’s an observational study, followed by maybe a couple more interesting studies that still don’t replicate, or have other horrendous drawbacks. To summarize the grand total of human knowledge of the subject: “it’s super unclear”.

It’s satire, but he’s also serious. Claiming inspiration from, among other sources, Scott Alexander’s “Much More Than You Wanted to Know” series, Holden concludes “the bottom line usually has the kind of frustrating ambiguity seen in this conclusion.”

This is too pessimistic.

First, on the meta level, “does X cause Y” debates tend to occur right on the boundary line of epistemic confidence. So by nature the discussions are designed to be highly contentious and uncertain. We no longer ask “does X cause Y” questions about matters superseded by modern physics (“Does phlogiston cause combustion? No.”), nor about matters settled by empirical data (“Do germs cause disease? Yes.”). But this progress goes underappreciated.

For people living on the cutting edge of science, the important questions will always have ambiguous answers.

Second, on the object level, Holden is overstating the ambiguity of Scott’s evidence reviews.

  • In Melatonin, Scott pretty definitively concludes that Melatonin is an effective hypnotic, and the right dosage is 0.3mg.
  • In Autism, Scott concludes with ~100% confidence that “genes that increase risk of autism are disproportionately also genes that increase intelligence”.
  • In College Admissions, Scott concludes that “There is strong evidence for more competition for places at top colleges now than 10, 50, or 100 years ago.”

(I’m cherry picking a bit, but the point stands. In a good chunk of cases, we can be reasonably confident that X causes Y. I’m taking Scott’s confidence at face value here, but I think that’s reasonable, at least with regards to responding to Holden. In another post, Holden agrees that Scott’s predictions have a good track record.)

Third, most of the time, we care more about making the right decisions than about having the right epistemics. By which I mean, it’s okay to not be certain, as long as the expected value works out. For example, in Face Masks, Scott remains uncertain about the efficacy, but points out that for high-risk situations, the annoyance of wearing a mask is probably a price worth paying for even a small likelihood of reducing Covid risk.


Finally, it’s worth taking a look at some really quick high level heuristics.

So look. Of course it sounds bad that 2/3rds of psychology studies don’t replicate. But that means that 1/3rd of them do!

It would be nice to know which third is which, but still, that’s not nothing.

Admittedly, it would be very bad if:

  • 1/3rd of studies replicate
  • 1/3rd show no effect
  • 1/3rd demonstrate the opposite effect

Though there is a bit of this in nutrition (“does X cause or prevent cancer? Maybe!”), it’s not universal. For example: I’m not 100% sure how well face masks prevent Covid, but I am pretty darn confident they don’t cause Covid.

This asymmetry matters a lot. And it matters for Holden’s more serious questions too. GiveWell isn’t totally sure if the parasite-killing drugs improve test scores, but they probably don’t make them worse.

So start with a reasonable prior (50% of all studies are wrong, or 66%, or 40% or whatever), update based on some object-level evidence, and multiply out to get an expected value estimate. As far as I can tell, this is an immense simplification, but it is basically what GiveWell does.
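The prior-update-multiply recipe can be written out in a few lines. This is a toy sketch, not GiveWell's method: the function names and every number (a 1/3 prior that a finding is real, the two likelihoods, the benefit and the cost) are made up for illustration, and it assumes the asymmetric case discussed above, where a wrong finding means no effect rather than harm.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' rule: probability the effect is real after seeing the evidence."""
    numerator = prior * p_evidence_if_true
    return numerator / (numerator + (1 - prior) * p_evidence_if_false)


def expected_value(p_real, benefit_if_real, cost):
    """Expected net value when a null result costs only the price of acting."""
    return p_real * benefit_if_real - cost


# Hypothetical inputs: 1/3 of findings replicate, and the object-level
# evidence is more likely to show up if the effect is real than if it is not.
p = posterior(prior=1 / 3, p_evidence_if_true=0.8, p_evidence_if_false=0.3)
print(round(p, 3), round(expected_value(p, benefit_if_real=10, cost=2), 2))
```

The point of the exercise is that the decision can come out clearly positive even when the posterior is nowhere near certainty.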

I’m not being critical of Holden. He knows all of this far far better than I do.

I’m just saying as a reader, when you encounter stuff like this, don’t throw your hands up in the air and conclude that since nutrition science is bad, we should all just eat pizza and whiskey for every meal. Don’t make the mistake of falling into epistemic nihilism.


To conclude:

  • Our discussions will tend to revolve around contentious issues, creating the illusion of an epistemic crisis.
  • Even on contentious issues, literature reviews are not useless.
  • The point is not to achieve certainty, the point is good decision making.
  • It might be possible to know some things some of the time.

Footnotes
[1] He clarifies this does not mean up to 40% of doctors’ opinions are wrong. Doctors are not relying on a single study. And again, I would point out that the bulk of doctors’ opinions are totally mundane and non-controversial.