Advice to pivot into AI Safety is likely miscalibrated

Aligning our advice about aligning AI

2025-09-28 — 2025-10-25

Wherein the AI‑safety career‑advice ecosystem is described, and its high failure tolerance and lack of mooring to ground truth or optimality are noted.

AI safety
catastrophe
economics
faster pussycat
innovation
machine learning
incentive mechanisms
institutions
networks
wonk

Assumed audience:

Mid career technical researchers considering moving into AI Safety research, career advisors in the EA/AI Safety space, AI Safety employers and grantmakers

tl;dr

AI career advice orgs, prominently 80,000 Hours, encourage career moves into AI safety roles, including mid‑career pivots. I analyse the quality of this advice from the private satisfaction, public-good, and counterfactual equilibrium perspectives, and learn the following things:

  1. Rational Failure: If you value personal, direct impact highly, it can be rational to attempt a pivot that will probably fail (e.g., \(\ll 50\%\) success chance).
  2. Opacity: To work out whether our advice is producing good outcomes, we would need data on both success rates and candidate-quality distributions; neither is currently available.
  3. Misalignment: The optimal success rate for the field (maximizing total impact) differs from the optimal rate for individuals (maximizing personal EV). The advice ecosystem appears calibrated to neither.
  4. Counterfactual impact: The counterfactual value of a pivot is much lower than the naïve estimate suggests; in highly contested roles you need not only to be the best candidate but to be better by a wide enough margin to justify all the effort of the other people you are displacing.
  5. Donations: If you donate at the moment and would pause donations while taking a career sabbatical, that is very likely a negative EV move in public goods terms. If the EV of a pivot is uncertain, donating the attempt costs (e.g., the sabbatical expenses you were willing to pay) provides a guaranteed positive counterfactual impact and doesn’t require all this fancy modelling.

The problem bites for mid-career professionals with high switching costs. It is likely less severe for early career professionals who have lower switching costs, or for people doing something other than switching jobs (e.g. if you are starting up a new organisation the model would be different). I propose some mitigations, both personal and institutional. I made an interactive widget to help individuals evaluate their own pivots.

Epistemic status

A solid back-of-the-envelope analysis.

The analysis here should not be astonishing; we know advice calibration is hard, and we know that the collective and private goals might be misaligned. Grinding through the details nonetheless reveals some less obvious implications; for example not only are the number of jobs and the number of applicants important, but your model of the distribution of talent ends up having a huge effect. The fact that the latter two factors — number of applicants and talent distribution — are inscrutable to the people who make the risky choice to do the career pivoting, is why I think that advice about careers is miscalibrated.

Figure 1: The totem pole of AI safety careers. Atop the column is the muse of alignment, inspiring the career advice orgs that amplify interest in AI safety roles. At the base sit recruiters, who screen and filter applicants. In the corner, the lion of ill‑advised unemployment devours those who miscalibrate their EV.

Here’s a napkin model of the AI-safety career-advice economy, or rather, three models of increasing complexity. They sketch how advice can sincerely recommend gambles that mostly fail, and why—without better data—we can’t tell whether that failure rate is healthy (leading to impact at low cost) or wasteful (potentially destroying happiness and even impact). In other words, it’s hard to know whether our altruism is “effective”.

In AI Safety in particular, there’s an extra credibility risk that’s idiosyncratic to this kind of system. AI Safety is, loosely speaking, about managing the risks of badly aligned mechanisms producing perverse outcomes. As such, it’s particularly incumbent on our field to avoid badly aligned mechanisms that produce perverse outcomes; otherwise we aren’t taking our own risk model seriously.

In order to keep things simple, we ignore as many complexities as possible.

1 Part A—Private career pivot decision

An uncertain career pivot is a gamble, so we model it the same way we model other gambles.

Note: Meet Alice

Alice is a senior software engineer in her mid-30s, making her a mid-career professional. She has been donating roughly 10% of her income to effective charities and now wonders whether to switch lanes entirely to achieve impact via technical AI safety work, in one of those AI Safety jobs she’s seen advertised. She has saved six months of runway funds to explore AI-safety roles — research engineering, governance, or technical coordination. Each month out of work costs her foregone income and reduced career prospects. Her question is simple: Is this pivot worth the costs?

To build the model, Alice needs to estimate four things:

1.0.1 The Stakes: What’s the upside?

  • Annual Surplus (\(\Delta u\)): This is the key number. It’s the difference in Alice’s total annual utility between the new AI safety role (\(u_1\)) and her current baseline (\(u_0\)). This surplus combines the change in her salary and her impact—indirectly via donations and directly by doing some fancy AI safety job.

    • \(u = w + \alpha(i+d)\), where \(w\) is wage, \(i\) is impact, \(d\) is donations, and \(\alpha\) is her personal weighting of impact versus consumption.
    • \(\Delta u := u_1 - u_0\).

1.0.2 The Costs: What does it cost to try?

  • Burn Rate (\(c\)): This is her net opportunity cost per year while on sabbatical (e.g., foregone pay, depleted savings), measured in k$/year.
  • Runway (\(\ell\)): The maximum time she’s willing to try, in years.

1.0.3 The Odds: What are her chances?

  • Application Rate (\(r\)): The number of distinct job opportunities she can apply for per year.
  • Success Probability (\(p\)): Her average probability of getting an offer from a single application. We assume applications are independent and identically distributed (i.i.d.).

The i.i.d. assumption (each job is independent) is likely optimistic. In reality, applications are correlated: if Alice is a good fit for one role, she’s likely a good fit for others (and vice-versa). We formalise this in the next section with candidate quality distributions, which capture the fact that you don’t know your “ranking” in the field, and that most people, by definition, are not at the top of it.

1.0.4 The “Timer”: How to value future gains?

  • Discount Rate (\(\rho\)): A continuous rate per year that captures her time preference. A higher \(\rho\) means she values immediate gains more (for example, if she expects short AGI timelines, \(\rho\) might be high).

1.1 Modeling the Sabbatical: The Decision Threshold

With these inputs, we can calculate the total expected value (EV) of her sabbatical gamble. The full derivation is in Appendix A, but here’s the result:

\[ \boxed{ \Delta \mathrm{EV}_\rho(p) =\frac{1-e^{-(r p+\rho)\ell}}{r p+\rho}\left(\frac{\Delta u r p}{\rho}-c\right). } \]

This formula looks complex, but its logic is simple. The entire decision hinges on the sign of the bracketed term: \[ \left(\frac{\Delta u r p}{\rho}-c\right) \] This is a direct comparison between the expected gain rate (the upside \(\Delta u\), multiplied by the success rate \(rp\), and adjusted for discounting \(1/\rho\)) and the burn rate (\(c\)). The prefactor scales that value according to the length of her runway and her discount rate.

The EV is positive if and only if the gain rate beats the burn rate. This means Alice’s decision boils down to a simple question: Is her per-application success probability, \(p\), high enough to make the gamble worthwhile?

We can find the exact break-even probability, \(p^*\), by setting the gain rate equal to the burn rate. This gives a much simpler formula for her decision threshold:

\[ \boxed{\,p^*=\frac{c\,\rho}{r\,\Delta u}\,}. \]

If Alice believes her actual \(p\) is greater than this \(p^*\), the pivot has a positive expected value. If \(p < p^*\), the expected value is negative and, on these terms at least, she should not take the sabbatical.
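To make the decision rule concrete, here is a minimal sketch in plain Python, with symbols as defined above:

```python
import math

def delta_ev(p, du, r, rho, c, ell):
    """Expected value of the sabbatical gamble, Delta EV_rho(p).

    du: annual surplus Delta u (k$/yr), r: applications/yr,
    rho: discount rate (/yr), c: burn rate (k$/yr), ell: runway (yr).
    """
    lam = r * p + rho
    prefactor = (1 - math.exp(-lam * ell)) / lam
    return prefactor * (du * r * p / rho - c)

def p_star(du, r, rho, c):
    """Break-even per-application success probability p* = c*rho/(r*du)."""
    return c * rho / (r * du)

# At p = p*, the bracketed gain-minus-burn term vanishes, so EV = 0;
# above the threshold the gamble is positive-EV, below it negative.
p = p_star(du=22, r=24, rho=1/3, c=50)
assert abs(delta_ev(p, 22, 24, 1/3, 50, ell=0.5)) < 1e-9
```

With Alice’s numbers from the worked example below, this gives \(p^*\approx 3.2\%\).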

1.2 What This Model Tells Alice

This simple threshold \(p^*\) gives us a clear way to think about her decision:

  1. The bar gets higher: The threshold \(p^*\) increases with higher costs (\(c\)) or shorter timelines or higher impatience (\(\rho\)). If her sabbatical is expensive or she’s in a hurry, she needs to be more confident of success.
  2. The bar gets lower: The threshold \(p^*\) decreases with more opportunities (\(r\)) or a higher upside (\(\Delta u\)). If the job offers a massive impact gain or she can apply to many roles, she can tolerate a lower chance of success on any single one.
  3. Runway doesn’t change the threshold: Notice that the runway length \(\ell\) isn’t in the \(p^*\) formula. A longer runway gives her more expected value (or loss) if she does take the gamble, but it doesn’t change the break-even probability itself.
  4. The results are fragile to uncertainty: This model is highly sensitive to her estimates. If she overestimates her potential impact (a high \(\Delta u\)) or underestimates her time preference (a low \(\rho\)), she’ll calculate a \(p^*\) that is artificially low, making the pivot look much safer than it is.1
  5. The key unknown: Even with a perfectly calculated \(p^*\), Alice still faces the hardest part: estimating her actual success probability, \(p\).

That \(p\) is, essentially, her chance of getting an offer. It depends not only on the number of jobs available but crucially on the number and quality of the other applicants.

All that said, this is a relatively “optimistic” model. If Alice attaches a high value to getting her hands dirty in AI safety work, she might be willing to accept a remarkably low \(p\); we’ll see that in the worked example. Hold that thought, though, because I’ll argue that this personal decision rule can be pretty bad at maximizing total impact.

If you are using these calculations for real, be aware that our heuristics are likely overestimating Alice’s chances. Job applications are not IID. The effective number of independent shots is lower than the raw application count, reducing effective \(r\) — if your skills don’t match the first job, they are also less likely to match the second, because the jobs may resemble each other.

1.3 Worked example

Let’s plug in some plausible representative numbers for Alice. She’s a successful software engineer taking home \(w_0=180\)k$/year, donating \(d_0=18\)k$/year post-tax, and having no on-the-job impact \(\mathcal{I}_0=0\) (i.e. no net harm, no net good). A target role offers \(w_1=120\), \(d_1=0\) and \(\mathcal{I}_1=100\). Set \(\alpha=1\), runway \(\ell=0.5\) years, application rate \(r=24\)/year, discount \(\rho=1/3\), burn \(c=50\). Then \(\Delta u = (120+0+100)-(180+18+0)=22\) and \[ p^*=\frac{c\rho}{r\Delta u} = \frac{50\cdot\frac{1}{3}}{24\cdot 22} \approx \boxed{3.16\%}. \] Over 6 months, the chance of at least one success at \(p^*\) is \(q^*=1-e^{-rp^*\ell}\approx \boxed{31.5\%}\). Her expected actual sabbatical length is \(\mathbb{E}[\tau]=\frac{1-e^{-rp^*\ell}}{rp^*}\approx \mathbf{0.416}\ \text{years (≈5.0 months)}\), and, conditional on success, it’s \(\mathbb{E}[\tau\mid \text{success}]\approx \mathbf{0.234}\ \text{years (≈2.8 months)}\). Under these assumptions, the sabbatical breaks even at a success probability of only about 3% per application: the job offers enough upside to compensate for a greater-than-even risk of failure.
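These numbers can be checked directly; a quick sketch, where the conditional sabbatical length uses the standard mean of an exponential truncated at the runway:

```python
import math

c, rho, r, du, ell = 50.0, 1/3, 24.0, 22.0, 0.5   # Alice's parameters

p_star = c * rho / (r * du)            # break-even probability
lam = r * p_star                       # offer arrival rate at p = p*
q_star = 1 - math.exp(-lam * ell)      # P(at least one offer in 6 months)
e_tau = q_star / lam                   # E[sabbatical length], years
# Mean of Exp(lam) truncated at ell, i.e. E[tau | success]:
e_tau_success = 1 / lam - ell * math.exp(-lam * ell) / q_star

print(f"p*={p_star:.2%}  q*={q_star:.1%}  "
      f"E[tau]={e_tau:.3f}yr  E[tau|success]={e_tau_success:.3f}yr")
```

This reproduces the \(3.16\%\), \(31.5\%\), \(0.416\)-year and \(0.234\)-year figures above.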

We can plot Alice’s expected value for a few values of the upside \(\Delta u\) to visualize the trade-offs.

If we want to play around with the assumptions, check out the interactive Pivot EV Calculator (source at danmackinlay/career_pivot_calculator).

2 Part B — Field-level model

tl;dr

In a world with a heavy-tailed distribution of candidate impact, the field benefits from many attempts because a few “hits” dominate. In light-tailed worlds, the same encouragement becomes destructive. We simply don’t know which world we’re in.

So far, this has been Alice’s private perspective. Let’s zoom out to the field level and consider: What if everyone followed Alice’s decision rule? Is the resulting number of applicants healthy for the field? What is the optimal number of people who should try to pivot?

2.1 From personal gambles to field strategy

Our goal is to move beyond Alice’s private break-even (\(p^*\)) and calculate the field’s welfare-maximizing applicant pool size (\(K^*\)). This \(K^*\) is how many “Alices” the field can afford to have roll the dice before the costs of failures outweigh the value of successes.

To analyze this, we must shift our model in three ways:

  1. Switch to a Public Ledger: From a field-level perspective, private wages and consumption are just transfers. They drop out of the analysis. What matters is the net production of public goods (i.e., impact).

  2. Distinguish Public vs. Private Costs: The costs are now different.

    • Private Cost (Part A): \(c\) included Alice’s full opportunity cost (foregone wages, etc.).
    • Public Cost (Part B): We now use \(\gamma\), which captures only the foregone public good during a sabbatical (e.g., \(\gamma = \mathcal {I}_0 + d_0 + \varepsilon\), or baseline impact + baseline donations + externalities).
  3. Move from Dynamic Search to Static Contest: Instead of one person’s dynamic search, we’ll use a static “snapshot” model of the entire field for one year. We assume there are \(N\) open roles and \(K\) total applicants.

Note: Reconciling the Models: From Poisson Search to a Static Contest

In Part A, Alice saw jobs arriving one-by-one (a Poisson process with rate \(r\)). In Part B, we are modeling an annual “contest” with \(K\) applicants competing for \(N\) jobs.

We can bridge these two views by setting \(N \approx r\). This treats the entire year’s worth of job opportunities as a single “batch” to be filled from the pool of \(K\) candidates who are “on the market” that year.

This is a standard simplification. It allows us to stop worrying about the timing of individual applications and focus on the quality of the matches, which is determined by the size of the applicant pool (\(K\)). We can then compare the total Present Value (PV) of the benefits (better hires) against the total PV of the costs (failed sabbaticals).


If \(N\) jobs are available annually (which we’ve already equated to Alice’s application rate \(r\)) and \(K\) total applicants are competing for them, a simple approximation for the per-application success probability is that it’s proportional to the ratio of jobs to applicants.

For the rest of this analysis, we’ll assume a simple mapping: \(p \approx N/K\). This allows us to plot both models on the same chart: as the field becomes more crowded (\(K\) increases), the individual chance of success (\(p\)) for any single application shrinks.

2.2 The Field-Level Model: Assumptions

Here is the minimally complicated version of our new model:

  • There are \(K\) total applicants and \(N\) open roles per year.
  • Each applicant \(k\) has a true, fixed potential impact \(J^{(k)}\) drawn i.i.d. from the talent distribution \(F\).
  • Employers perfectly observe \(J^{(k)}\) and hire the \(N\) best candidates. (This is a strong, optimistic assumption about hiring efficiency).
  • Applicants do not know their own \(J^{(k)}\), only the distribution \(F\).

The intuition is that the field benefits from a larger pool \(K\) because it increases the chance of finding high-impact candidates. But the field also pays a price for every failed applicant.

2.3 Benefits vs. Costs on the Public Ledger

Let’s define the two sides of the field’s welfare equation.

The Marginal Benefit (MV) of a Larger Pool

The benefit of a larger pool \(K\) is finding better candidates. We care about the marginal value of adding one more applicant to the pool, which we define as \(\mathrm{MV}_K\). This is the expected annual impact increase from widening the pool from \(K\) to \(K+1\). (Formally, \(\mathrm{MV}_K := \mathbb{E}[S_{N,K+1}] - \mathbb{E}[S_{N,K}]\), where \(S_{N,K}\) is the total impact of the top \(N\) hires from a pool of \(K\)).

The Marginal Cost (MC) of a Larger Pool

The cost is simpler. When \(K > N\), adding one more applicant adds, on average, one more failed pivot: there are only \(N\) roles, so each entrant beyond the \(N\)th must, in expectation, fail. This failed pivot costs the field the foregone public good during the sabbatical. We defined the social burn rate per year as \(\gamma\). To compare this to the annual benefit \(\mathrm{MV}_K\), we need the total present value of this foregone impact. We call this \(L_{\text{fail},\delta}\) (the PV of one failed attempt). (This cost is derived in Appendix B as \(L_{\text{fail},\delta}=\gamma\,\frac{1-e^{-\delta\ell}}{\delta}\).)

We do not model employer congestion from reviewing lots of applicants — on the rationale that it is empirically small because employers stop looking at candidates when they’re overwhelmed (J. Horton and Vasserman 2021).2 Note, however, that we have also claimed employers perfectly observe \(J^{(k)}\), which means we are being optimistic about the field’s ability to sort candidates. Maybe we could model a noisy search process?

2.4 Field-Level trade-offs

We can now find the optimal pool size \(K^*\). The total public welfare \(W (K)\) peaks when the marginal benefit of one more applicant equals the marginal cost.

As derived in Appendix B, the total welfare \(W(K)\) is maximized when the present value of the annual benefit stream from the marginal applicant (\(\mathrm{MV}_K / \delta\)) equals the total present value of their failure cost (\(L_{\text{fail},\delta}\)). \[ \frac{\mathrm{MV}_K}{\delta} = L_{\text{fail},\delta} \] Substituting the expression for \(L_{\text{fail},\delta}\) and cancelling the discount rate \(\delta\), we get a very clean threshold: \[ \boxed{\,\mathrm{MV}_K = \gamma\,(1-e^{-\delta\ell})\,}. \] This equation is the core of the field-level problem. The optimal pool size \(K^*\) is the point where the expected annual marginal benefit (\(\mathrm{MV}_K\)) drops to the level of the total foregone public good from one failed sabbatical attempt.
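A quick numeric sketch of this threshold condition. The marginal-value form \(\mathrm{MV}_K = \mu_{\text{imp}}\,N/(K+1)\) below is the exact expression for an exponential talent pool (a modelling assumption, matching the light-tailed family discussed next); the other parameters are the public-ledger values used in the plots (\(\gamma=18\), \(\delta=1/3\), \(\ell=0.5\), \(\mu_{\text{imp}}=100\), \(N=24\)):

```python
import math

def k_star(mv, gamma, delta, ell, k_max=100_000):
    """Smallest K whose marginal value MV_K falls below the per-failure
    cost gamma * (1 - e^(-delta * ell)); returns k_max if never reached."""
    threshold = gamma * (1 - math.exp(-delta * ell))
    for K in range(1, k_max):
        if mv(K) < threshold:
            return K
    return k_max

# Exponential talent pool: MV_K = mu_imp * N / (K + 1).
mu_imp, N = 100.0, 24                 # expected impact per hire; jobs/yr
gamma, delta, ell = 18.0, 1/3, 0.5    # foregone donations; discount; runway
print(k_star(lambda K: mu_imp * N / (K + 1), gamma, delta, ell))
```

This prints 868, matching the Exponential \(K^*\) reported with the combined plot below.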

2.5 The Importance of Tail Distributions

How quickly does \(\mathrm{MV}_K\) shrink? Extreme value theory tells us this depends entirely on the tail of the candidate-quality distribution, \(F\). The shape of the tail determines how quickly returns from widening the applicant pool diminish.

We consider two families (the specific formulas are in Appendix B):

  • Light tails (e.g., Exponential): In this world, candidates are variable, but the best is not transformatively better than average. Returns diminish quickly: the marginal value \(\mathrm{MV}_K\) shrinks hyperbolically (roughly as \(1/K\)).
  • Heavy tails (e.g., Fréchet): This captures the “unicorn” intuition. Returns diminish much more slowly. If the tail is heavy enough, \(\mathrm {MV}_K\) decays extremely slowly, justifying a very wide search.
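To see the difference concretely, here is a small sketch comparing decay rates. The exponential form is exact; the Fréchet form uses the large-\(K\) asymptotic \(B(K)\approx s\,K^{1/\alpha}C_N\) noted with the plots, with illustrative scale constants:

```python
def mv_exponential(K, N=24, mu=100.0):
    # Exact for an Exp(mu) pool: adding one applicant raises the
    # expected top-N sum by mu * N / (K + 1) -- hyperbolic decay.
    return mu * N / (K + 1)

def mv_frechet(K, alpha, sC=100.0):
    # Discrete difference of the asymptotic B(K) ~ sC * K**(1/alpha).
    return sC * ((K + 1) ** (1 / alpha) - K ** (1 / alpha))

# Doubling the pool halves the exponential MV_K, but only multiplies the
# Frechet MV_K by 2**(1/alpha - 1): a much gentler drop when alpha is small.
for K in (1_000, 2_000):
    print(f"K={K}: exp {mv_exponential(K):.4f}, "
          f"frechet(1.8) {mv_frechet(K, 1.8):.4f}")
```

The exponential marginal value falls by half each doubling; the heavy-tailed one retains roughly 73% of its value per doubling at \(\alpha=1.8\), which is why very wide funnels can stay worthwhile there.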

2.6 Implications for Optimal Pool Size

This difference in diminishing returns has a huge effect on the optimal pool size \(K^*\). (The full solutions for \(K^*\) are in Appendix B.)

With light tails, there’s a finite pool size after which turning up the hype (growing \(K\)) destroys net welfare. Every extra applicant burns \(L_{\text{fail},\delta}\) in foregone public impact while adding an \(\mathrm{MV}_K\) that shrinks rapidly.

With heavy tails, it’s different. As the tail gets heavier, \(K^*\) explodes. In very heavy-tailed worlds, very wide funnels can still be net positive. We may decide it’s worth, as a society, spending a lot of resources to find the few unicorns.

We set the expected impact per hire per year to \(\mu_{\text{imp}}=100\) (impact dollars/yr) to match Alice’s hypothetical target role; this is just for exposition.

We can, of course, plot this.

Figure: Total net welfare \(W(K)\) for each talent family, with \(\delta=0.333\)/yr and \(L_{\text{fail}}=8.29\) impact-$ (PV). All four curves are still rising at the plotted boundary \(K^*=800\): Exponential \(W^*\approx 25{,}870\); Fréchet \(\alpha=1.8\), \(W^*\approx 50{,}300\); \(\alpha=2.0\), \(W^*\approx 40{,}229\); \(\alpha=3.0\), \(W^*\approx 19{,}117\).
  • This plot shows total net welfare \(W(K)\) and marks the maximum \(K^*\) for each family, showing where total welfare peaks. The dashed line at \(K=N\) shows where failures begin: for \(K>N\), the \(K-N\) unsuccessful applicants each impose a public cost of \(L_{\text{fail},\delta}\). The markers show \(K^*=\arg\max W(K)\), the pool size beyond which widening further would reduce total impact.
  • Units: \(B(K)\) is in impact dollars per year and is converted to PV by multiplying by \(H_\delta=\frac{1}{\delta}\). The subtraction uses the discounted per-failure cost \(L_{\text{fail},\delta}=\gamma\,\frac{1-e^{-\delta\ell}}{\delta}\).
  • Fréchet curves use the large-\(K\) asymptotic \(B(K)\approx s\,K^{1/\alpha}C_N\) (with \(s=\mu_{\text{imp}}/\Gamma(1-1/\alpha)\)). We could work harder to get the exact \(B(K)\) for Fréchet, but the asymptotic is good enough to illustrate the qualitative behaviour.
  • We treat all future uncertainties about role duration, turnover, or project lifespan as already captured in the overall discount rate \(\delta\).

We can combine these perspectives to visualize the tension between private incentives and public welfare.

Figure: Private EV and field welfare plotted against the success probability \(p\). Private break-even \(p^* = 3.16\%\). Field optima: Exponential \(K^*=868\), \(p(K^*)\approx 2.76\%\); Fréchet \(\alpha=1.8\) and \(\alpha=2.0\) at the plotted boundary \(K^*=3200\), \(p(K^*)\approx 0.75\%\); \(\alpha=3.0\), \(K^*=1164\), \(p(K^*)\approx 2.06\%\).

This visualization combines the private and public views by assuming an illustrative mapping from pool size to success probability: \(p\approx \beta N/K\) (where \(\beta\) bundles screening efficiency; here \(\beta=1\)). The black curve (left axis) shows a candidate’s private expected value (EV) versus success probability \(p\). The coloured curves (right axis) show field welfare \(W(K)\). The private break-even point \(p^*\) (black dashed line) can fall far to the left of the field-optimal \(p(K^*)\) (coloured vertical lines). This gap is the region where individuals are rationally incentivized to enter even though, from the field’s perspective, the candidate pool is already saturated or oversaturated.

If your applicant pool does not have heavy tails, widening the funnel likely increases social loss.

3 Part C — Counterfactual Impact and Equilibrium

Part A modeled a “naive” applicant who evaluates their pivot based on the absolute impact (\(\mathcal {I}_1\)) of the role, ignoring pool dynamics. Part B analyzed the field-level optimum (\(K^*\)), showing how the marginal value (\(\mathrm{MV}_K\)) of an applicant decreases as the pool grows.

Now we tie those together. If Alice is sophisticated and understands these dynamics and aims to maximize her counterfactual impact, would she make the same choice?

This changes the game, introducing a feedback loop where individual incentives depend on the crowd size (\(K\)), and the candidate quality distribution \((F)\).

3.1 The Counterfactual Impact Model

If we assume applicants can’t know their quality relative to the pool ex-ante (formally: applicants are exchangeable), the expected counterfactual impact of the decision to apply is exactly \(\mathrm{MV}_K\) per year.

Alice should use this in her EV calculation. However, her initial EV formula (Part A) used the impact conditional on success, not the ex-ante expected impact of her decision in isolation.

Let \(\mathcal{I}_{CF}\) be the expected counterfactual impact conditional on success. Let \(q_K\) be the probability of success given the pool size \(K\); in the static model of Part B, with \(K+1\) applicants (the \(K\) incumbents plus our marginal entrant) competing for \(N\) slots, \(q_K = N/(K+1)\). If the attempt fails, the counterfactual impact is zero.

We can derive the relationship (see Appendix): \[ \mathrm{MV}_K = q_K \cdot \mathcal{I}_{CF}. \] Therefore, the impact, conditional on success, is: \[ \boxed{\mathcal{I}_{CF} = \frac{\mathrm{MV}_K}{q_K}.} \] We recalibrate the private decision by defining the counterfactual private surplus, \(\Delta u_{CF}\), replacing the naive absolute impact \(\mathcal{I}_1\) with the counterfactual estimate \(\mathcal{I}_{CF}\).

This changes the dynamics; previously the gamble’s value depended on the pool size only insofar as it affected the per-application success probability \(p\) and thus the overall success probability \(q_K\). Now the value of the upside also depends on the pool size \(K\). As \(K\) grows, \(\mathrm{MV}_K\) decreases, but \(q_K\) also decreases. The behavior of \(\mathcal{I}_{CF}\) depends on how these balance, which is determined by the tail of the impact distribution.

3.2 The Dynamics of Counterfactual Impact

The behavior of \(\mathcal{I}_{CF}\) leads to different implications depending on whether the recruitment pool is light-tailed or heavy-tailed.

3.2.1 Case 1: Light Tails

In a light-tailed recruitment pool (where applicants are relatively similar), the math shows (See Appendix) that the expected counterfactual impact conditional on success, \(\mathcal{I}_{CF}\), is constant and equal to the population average impact (\(\mu\)), regardless of how crowded the field is (\(K\)). \[ \mathcal{I}_{CF} = \mu \quad \text{(Light Tail)} \] Intuition: While a larger pool increases the quality of the very best hire, it also increases the quality of the person the hire displaces. In the stylized light-tailed model (Exponential distribution), these effects perfectly cancel out. More generally, in light-tailed talent pools, the gap between the hire and the displaced candidate doesn’t grow much as the pool gets larger.

Implication: If the average impact \(\mu\) is modest and the candidate skills are relatively evenly distributed, pivots involving significant pay cuts are likely to have negative EV for the average applicant, regardless of pool size.

3.2.2 Case 2: Heavy Tails

In a heavy-tailed model, “unicorns” hide in the recruiting pool. Here, \(\mathcal{I}_{CF}\) increases as the field gets more crowded (\(K\)), and — under certain assumptions — it can increase fast enough to offset the costs of sabbaticals, foregone donations, etc. For the Fréchet distribution with shape \(\alpha\), \(\mathcal{I}_{CF}\) grows proportionally to \(K^{1/\alpha}\). \[ \mathcal{I}_{CF} \propto K^{1/\alpha} \quad \text{(Heavy Tail)} \] Intuition: As \(K\) increases, the expected quality of the top candidates rises much faster than that of the candidates they displace. Success in a large pool is a strong signal that we are likely a high‑impact individual, and the gap between us and the displaced candidate is large.

Implication: In a heavy‑tailed world, the pivot can become highly attractive if the field is sufficiently crowded, even with significant pay cuts.
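Both cases can be checked numerically with the same stylized forms as before (exponential \(\mathrm{MV}_K\) exact, Fréchet via the large-\(K\) asymptotic with an illustrative scale, and \(q_K=N/(K+1)\) from the static contest):

```python
def i_cf(mv_K, N, K):
    # I_CF = MV_K / q_K with q_K = N / (K + 1) in the static contest.
    return mv_K * (K + 1) / N

N, mu = 24, 100.0
# Light tail: MV_K = mu*N/(K+1), so the (K+1)/N factors cancel and
# I_CF == mu for every pool size -- crowding doesn't change the upside.
light = [i_cf(mu * N / (K + 1), N, K) for K in (100, 1_000, 10_000)]

# Heavy tail (Frechet alpha=2, illustrative scale sC): I_CF ~ K**(1/2),
# so quadrupling the pool roughly doubles the conditional impact.
alpha, sC = 2.0, 100.0
heavy = [i_cf(sC * ((K + 1) ** (1 / alpha) - K ** (1 / alpha)), N, K)
         for K in (1_000, 4_000)]
```

The exact cancellation in the light-tailed case is what makes \(\mathcal{I}_{CF}=\mu\) independent of \(K\); in the heavy-tailed case the same ratio grows like \(K^{1/\alpha}\).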

3.3 Alice Revisited

With light‑tailed assumptions, \(\mathcal{I}_{CF}\) equals the population mean \(\mu\) and is too small to offset Alice’s pay cut and lost donations—her counterfactual surplus is negative regardless of \(K\). Under heavy‑tailed assumptions, \(\mathcal{I}_{CF}\) rises with \(K\); across a broad range of conditions, the pivot can become attractive despite large pay cuts (i.e. if Alice truly might be a unicorn). The sign and size of this effect hinge on the tail parameter and scale, which are currently unmeasured.

3.4 Visualizing Private Incentives vs. Public Welfare

We can now visualize the dynamics of private, public and counterfactual private valuations by assuming an illustrative mapping between pool size and success probability: \(p\approx \beta N/K\). This allows us to see how the incentives change as the field gets more crowded (moving left on the x-axis).

This visualization combines all three perspectives using Alice’s parameters. There are a lot of lines and assumptions wrapped up in this plot. The main takeaway: if we care about solving the problem, for many variants of this model — when trading off pivoting versus donating — we should probably donate. The only exception is if we believe the talent pool is very heavy-tailed (Fréchet with \(\alpha \leq 2\)), in which case, if we are one of those unicorns, we should probably pivot. Otherwise, donating is likely to have higher expected impact.

  • Left Axis (Private EV):

    • A (Black Solid): The naive applicant’s EV (Part A). It crosses zero at the naive break-even \(p^* \approx 3.16\%\).
    • C (Colored Dashed): The sophisticated applicant’s EV (Part C), using counterfactual impact \(\Delta u_{CF}(K)\). The point where these curves cross zero defines the equilibrium \(K_{eq}\).
  • Right Axis (Public Welfare):

    • B (Colored Solid): The field’s total welfare \(W(K)\) (Part B). The peak defines the social optimum \(K^*\).
  1. The Information Gap (A vs C): The Naive EV (Black) is significantly higher than the Counterfactual EV (Colored Dashed) across most of the range. Applicants relying on naive valuations of the impact of a career pivot (using personal impact change \(\Delta \mathcal{I}\) instead of counterfactual impact change \(\Delta \mathcal{I}_{CF}\)) will drastically overestimate their counterfactual impact and thus the expected value of the pivot.

  2. The Impact of Costs vs. Tails:

    • In light-tailed talent pools (Exponential, Purple; Fréchet \(\alpha=3.0\), Red), the Counterfactual EV is always negative. Alice’s 78k financial loss (the 60k pay cut plus 18k in foregone donations) dominates the expected impact. The equilibrium is minimal (\(K_{eq}=N\)), leading to Under-Entry relative to the optimum (\(K^*\)).
    • In heavy-tailed talent pools (Fréchet \(\alpha=2.0\), Green; \(\alpha=1.8\), Orange), the dynamics change dramatically.
  3. Complex Dynamics in Heavy Tails (The “Hump Shape”): For heavy tails (Green, Orange dashed lines), the Counterfactual EV is non-monotonic. It starts positive, increases as \(K\) grows (because \(\mathcal {I}_{CF}(K)\) increases rapidly), and eventually decreases as the success probability \(p (K)\) drops too low.

  4. The Structural Misalignment (B vs C): In heavy-tailed talent pools, the equilibrium \(K_{eq}\) is vastly larger than the optimum \(K^*\). The efficient search process (high \(r\)) means the private cost of trying is low, which incentivizes entry long past the social optimum. This leads to massive over-entry. (For example, in the \(\alpha=2.0\) case, \(K^*\) is around 30k, while \(K_{eq}\) is over 400k).

This visualization confirms the analysis: the system’s calibration is highly sensitive to the tail distribution and private costs. Depending on the parameters, the system can structurally incentivize either severe under-entry or massive over-entry, even when applicants are sophisticated.

3.5 Equilibrium vs. Optimum

This feedback mechanism—where incentives depend on \(K\)—creates a natural equilibrium. Applicants will enter until the EV for the marginal entrant is zero. This defines the equilibrium candidate pool size, \(K_{eq}\).

To analyze this, we must reintegrate the counterfactual surplus \(\Delta u_{CF}(K)\) into the dynamic search model (Part A). We assume the pool size \(K\) determines the surplus \(\Delta u_{CF}(K)\) and the per-application success probability \(p(K)\). The equilibrium \(K_{eq}\) occurs when the expected gain rate equals the burn rate (the bracketed term in the EV formula is zero): \[ \frac{\Delta u_{CF}(K)\,r p(K)}{\rho} = c. \] Does this equilibrium \(K_{eq}\) align with the socially optimal pool size \(K^*\) (Part B)?

Generally, no. Whether they align depends on how private costs (\(c\)), social costs (\(\gamma\)) and the efficiency of the job-search process (\(r\)) compare.

3.6 Search Efficiency

The equilibrium condition depends on the application rate \(r\). We can rewrite the equilibrium condition as: \[ \Delta u_{CF}(K) \cdot p(K) = \frac{c\rho}{r}. \] The left side is the expected counterfactual surplus per application attempt. The right side, \(\frac{c\rho}{r}\), represents the effective private cost hurdle per application attempt (scaled by the discount rate).

If the job search process is highly efficient (high \(r\)), the private cost hurdle is low. This encourages people to apply even when the expected counterfactual impact per application is small, because trying is cheap.

We can compare this private incentive to the social optimum. As derived in Appendix C.3, if the private cost hurdle (\(\frac{c\rho}{r}\)) is significantly lower than the social cost of failure (related to \(\gamma\)), the system structurally leads to Over-Entry (\(K_{eq} > K^*\)).

Let’s check Alice’s numbers: \(c=50, \rho=1/3, r=24\). The private cost hurdle is \(\frac{c\rho}{r} \approx \frac{50/3}{24} \approx 0.69k\). The social cost rate \(\gamma\) (foregone donations) is \(18k\).

Since \(0.69k \ll 18k\), the system strongly favours over-entry. The efficiency of the search process dramatically lowers the private barrier to entry compared to the social costs incurred.
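A throwaway check of the arithmetic above (units are thousands of dollars and years, following the main text):

```python
# Alice's numbers, as given in the main text.
c, rho, r = 50, 1 / 3, 24   # burn rate, discount rate, applications per year
gamma = 18                  # social burn rate (foregone donations)

hurdle = c * rho / r        # private cost hurdle per application, ~0.69k
ratio = gamma / hurdle      # how far the private bar sits below the social cost
```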

For example, in the heavy-tailed case (\(\alpha=2\)), we might find \(K^* \approx 11,600\), while \(K_{eq} \approx 178,000\).

4 Implications and Solutions

The analysis suggests the AI safety field may be oversubscribed. The core problem is misalignment: organizations influencing the funnel size don’t internalize the costs borne by unsuccessful applicants. This incentivizes maximizing application volume (a visible proxy) rather than welfare-maximizing matches—a classic setup for Goodhart’s Law.

A healthy field can rationally accept high individual failure rates if it measures and communicates the odds. If the field doesn’t measure them, the same logic becomes waste. The ethical burden shifts when the system knowingly asks people to take low-probability gambles without making that explicit.

4.1 For Individuals: Knowing the Game

For mid-career individuals, the decision is high-stakes. (For early-career individuals, costs \(c\) are lower, making the gamble more favourable, but the need to estimate \(p\) remains.)

  1. Calculate your threshold (\(p^*\)): Use the model in Part A (and the linked calculator). Without strong evidence that \(p > p^*\), a pivot involving significant unpaid time is likely EV-negative.
  2. Seek cheap signals: Seek personalized evidence of fit—such as applying to a few roles before leaving your current job—before committing significant resources.
  3. Use grants as signals: Organizations like Open Philanthropy offer career transition grants. These serve as information gates. If received, a grant lowers the private cost (\(c\)). If denied, it is a valuable calibration signal. If a major funder declines to underwrite the transition, candidates should update \(p\) downwards. (If you don’t get that Open Phil transition grant, don’t quit your current job.)
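A minimal version of the threshold calculation in step 1, using the break-even formula from Appendix A. The surplus \(\Delta u\) below is a hypothetical placeholder, not an estimate:

```python
def break_even_p(c, rho, r, delta_u):
    """Appendix A break-even probability: p* = c*rho / (r * delta_u)."""
    return c * rho / (r * delta_u)

# Alice-style costs (c=50k/yr, rho=1/3, r=24/yr) with a *hypothetical*
# annual surplus of delta_u = 30k/yr if the pivot succeeds:
p_star = break_even_p(c=50, rho=1 / 3, r=24, delta_u=30)  # ~2.3% per application
```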

4.2 For Organizations: Transparency and Feedback

Employers and advice organizations control the information flow. Unless they provide evidence-based estimates of success probabilities, their generic encouragement should be treated with scepticism.

  1. Publish stage-wise acceptance rates (Base Rates). Employers must publish historical data (applicants, interviews, offers) by track and seniority. This is the single most impactful intervention for anchoring \(p\).
  2. Provide informative feedback and rank. Employers should provide standardized feedback or an indication of relative rank (e.g., “top quartile”). This feedback is costly, but this cost must be weighed against the significant systemic waste currently externalized onto applicants and the long-term credibility of the field.
  3. Track advice calibration. Advice organizations should track and publish their forecast calibration (e.g., Brier scores) regarding candidate success. If an advice organization doesn’t track outcomes, its advice cannot be calibrated except by coincidence.

4.3 For the Field: Systemic Calibration

To optimize the funnel size, the field needs to measure costs and impact tails.

  1. Estimate applicant costs (\(c\ell\)). Advice organizations or funders should survey applicants (successful and unsuccessful) to estimate typical pivot costs.
  2. Track realized impact proxies. Employers should analyze historical cohorts to determine if widening the funnel is still yielding significantly better hires, or if returns are rapidly diminishing.
  3. Experiment with mechanism design. In capacity-constrained rounds, implementing soft caps—pausing applications after a certain number—can reduce applicant-side waste without significantly harming match quality (J. J. Horton et al. 2024).

5 Where next?

I’d like feedback from people deeper in the AI safety career ecosystem. I’d love to chat with people from 80,000 Hours, MATS, FHI, CHAI, Redwood Research, Anthropic, etc., about this. What is your model of the candidate impact distribution, the tail behaviour, and the costs? What have I got wrong? What have I missed? I’m open to the possibility that this is well understood and being actively managed behind the scenes, but I haven’t seen it laid out this way anywhere.

6 Further reading

Resources that complement the mechanism-design view of the AI safety career ecosystem:

7 Appendix A: Private Decision Model Derivations

We model the career pivot attempt as a continuous-time process during a sabbatical of maximum length \(\ell\).

Setup:

  • Job opportunities arrive as a Poisson process with rate \(r\).
  • The per-application success probability is \(p\) (i.i.d.).
  • The success process is a Poisson process with rate \(\lambda = rp\).
  • The time to the first success is \(T_1 \sim \mathrm{Exp}(\lambda)\).
  • The actual sabbatical duration is the stopping time \(\tau = \min\{T_1, \ell\}\).
  • The continuous discount rate is \(\rho>0\).
  • The annual utility surplus if the pivot succeeds is \(\Delta u\).
  • The burn rate during the sabbatical is \(c\).

7.1 Sabbatical Duration and Success Statistics

The probability of success within the runway is: \[ q = P(T_1 \le \ell) = 1 - e^{-\lambda \ell} = 1 - e^{-r p \ell}. \] We calculate the expected duration \(\mathbb{E}[\tau]\) using the survival function \(P(\tau > t)\). For \(t \in [0, \ell]\), \(\tau > t\) holds if and only if no success has occurred by time \(t\), so \(P(\tau > t) = P(T_1 > t) = e^{-\lambda t}\); for \(t>\ell\), \(P(\tau>t)=0\). \[ \mathbb{E}[\tau] = \int_0^\infty P(\tau > t)\,dt = \int_0^\ell e^{-\lambda t}\,dt = \frac{1 - e^{-\lambda \ell}}{\lambda}. \] The expected duration, conditional on success, \(\mathbb{E}[\tau\mid \text{success}] = \mathbb{E}[T_1 \mid T_1 \le \ell]\), is given by the truncated exponential distribution. The PDF of \(T_1\) conditional on \(T_1 \le \ell\) is \(f(t\mid T_1\le\ell) = \frac{\lambda e^{-\lambda t}}{1-e^{-\lambda\ell}}\) for \(t\in[0,\ell]\). Using integration by parts, we get: \[ \begin{aligned} \mathbb{E}[\tau\mid \text{success}] &= \frac{1}{1-e^{-\lambda\ell}} \int_0^\ell t \lambda e^{-\lambda t}\,dt \\ &= \frac{1}{1-e^{-\lambda\ell}}\left( \left[-t e^{-\lambda t}\right]_0^\ell + \int_0^\ell e^{-\lambda t}\,dt \right) \\ &= \frac{1}{1-e^{-\lambda\ell}}\left( -\ell e^{-\lambda\ell} + \frac{1-e^{-\lambda\ell}}{\lambda} \right) \\ &= \frac{1}{\lambda} - \frac{\ell e^{-\lambda\ell}}{1-e^{-\lambda\ell}}. \end{aligned} \]
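These closed forms are easy to sanity-check by simulation. A sketch with illustrative values of \(\lambda\) and \(\ell\) (not calibrated to anything in the text):

```python
import math
import random

random.seed(0)
lam, ell = 8.0, 0.5   # illustrative: lambda = r*p = 8/yr, runway of 6 months
n = 200_000

taus, successes = [], 0
for _ in range(n):
    t1 = random.expovariate(lam)   # time of first success, T1 ~ Exp(lam)
    taus.append(min(t1, ell))      # sabbatical ends at success or at the runway
    successes += t1 <= ell

q_hat, tau_hat = successes / n, sum(taus) / n
q_exact = 1 - math.exp(-lam * ell)             # closed form for q
tau_exact = (1 - math.exp(-lam * ell)) / lam   # closed form for E[tau]
```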

7.2 Derivation of the Expected Present Value (\(\Delta \mathrm{EV}_\rho(p)\))

The expected value of a pivot attempt equals the expected discounted benefit minus the expected discounted cost.

Expected Discounted Benefit (\(\mathbb{E}[B]\)): If the pivot succeeds at time \(T_1=t \le \ell\), the benefit is the present value (PV) of the stream \(\Delta u\) starting at \(t\): \(B(t) = \int_t^\infty \Delta u\,e^{-\rho (s-t)}e^{-\rho t}\,ds = \frac{\Delta u}{\rho}e^{-\rho t}\). We take the expectation over the time of success \(T_1\), up to the runway limit \(\ell\), using the density \(f_{T_1}(t) = \lambda e^{-\lambda t}\): \[ \begin{aligned} \mathbb{E}[B] &= \int_0^\ell B(t) f_{T_1}(t)\,dt = \int_0^\ell \frac{\Delta u}{\rho}e^{-\rho t} \lambda e^{-\lambda t}\,dt \\ &= \frac{\Delta u \lambda}{\rho} \int_0^\ell e^{-(\lambda+\rho)t}\,dt \\ &= \frac{\Delta u \lambda}{\rho(\lambda+\rho)} (1-e^{-(\lambda+\rho)\ell}). \end{aligned} \]

Expected Discounted Cost (\(\mathbb{E}[C]\)): The cost is incurred at rate \(c\) during the sabbatical \([0, \tau]\). We compute \(\mathbb{E}\left[\int_0^\tau c e^{-\rho t} dt\right]\). We swap expectation and integration (by Fubini’s theorem, since the integrand is positive): \[ \mathbb{E}[C] = c \int_0^\infty e^{-\rho t} \mathbb{E}[\mathbb{I}(t < \tau)] dt = c \int_0^\infty e^{-\rho t} P(\tau > t) dt. \] We use the survival function \(P(\tau > t) = e^{-\lambda t}\) for \(t \in [0, \ell]\): \[ \mathbb{E}[C] = c \int_0^\ell e^{-\rho t} e^{-\lambda t} dt = c \int_0^\ell e^{-(\lambda+\rho)t} dt = c \frac{1-e^{-(\lambda+\rho)\ell}}{\lambda+\rho}. \]

Total Expected Value: \[ \Delta \mathrm{EV}_\rho(p) = \mathbb{E}[B] - \mathbb{E}[C]. \] We factor out the common term \(\frac{1-e^{-(\lambda+\rho)\ell}}{\lambda+\rho}\) (the expected discounted duration) and substitute \(\lambda=rp\): \[ \boxed{ \Delta \mathrm{EV}_\rho(p) = \frac{1-e^{-(rp+\rho)\ell}}{rp+\rho} \left(\frac{\Delta u\,rp}{\rho} - c\right). } \]

7.3 Break-even Probability (\(p^*\))

The EV is zero whenever the term in brackets vanishes (since the prefactor is strictly positive). \[ \frac{\Delta u\,rp_\rho^*}{\rho} - c = 0 \implies \boxed{p_\rho^* = \frac{c\rho}{r\Delta u}}. \]
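A minimal implementation of the boxed formula, confirming that the EV changes sign exactly at \(p^*\). The surplus \(\Delta u\) is a hypothetical placeholder:

```python
import math

def delta_ev(p, r, rho, ell, du, c):
    """The boxed expected present value of a pivot attempt (Appendix A.2)."""
    lam = r * p
    prefactor = (1 - math.exp(-(lam + rho) * ell)) / (lam + rho)
    return prefactor * (du * lam / rho - c)

# Alice-style parameters; du (annual surplus if the pivot succeeds) is hypothetical.
r, rho, ell, du, c = 24, 1 / 3, 1.5, 30, 50
p_star = c * rho / (r * du)
below = delta_ev(p_star - 1e-4, r, rho, ell, du, c)  # just below threshold: negative
above = delta_ev(p_star + 1e-4, r, rho, ell, du, c)  # just above: positive
```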

8 Appendix B: Field-Level Model Derivations

We analyze the field-level optimum using a public ledger in impact dollars.

Setup:

  • \(K\) applicants, \(N\) seats. Impacts are \(J^{(k)} \sim F\) i.i.d.
  • Hires are the top \(N\) candidates: \(J_{(K)} \ge J_{(K-1)} \ge \dots\).
  • Total annual impact from the top \(N\): \(S_{N,K}:=J_{(K)}+J_{(K-1)}+\dots+J_{(K-N+1)}\).
  • Expected annual benefit: \(B(K) = \mathbb{E}[S_{N,K}]\).
  • Marginal value: \(\mathrm{MV}_K = B(K+1) - B(K)\).
  • Social discount rate: \(\delta\).
  • Social burn rate (foregone public impact): \(\gamma := \mathcal{I}_0 + d_0 + \varepsilon\).

We don’t model congestion costs. Generally, employers who’ve filled a given role can ignore excess applications, and there’s a lot of evidence that they do (J. Horton, Kerr, and Stanton 2017; J. Horton and Vasserman 2021; J. J. Horton et al. 2024).

8.1 Welfare Function and Optimality

Present-value horizon: \(H_\delta = \int_0^\infty e^{-\delta t}\,dt = 1/\delta\).

PV of a failed attempt: Assuming a failed attempt uses the full runway \(\ell\) (this simplifies calculating the marginal cost of an additional applicant): \[ L_{\text{fail},\delta} = \int_0^\ell \gamma e^{-\delta t}\,dt = \gamma\frac{1-e^{-\delta\ell}}{\delta}. \]

Total Welfare (\(W(K)\)): The total welfare \(W(K)\) is the present value of the benefits from the \(N\) hires minus the present value of the costs of all \((K-N)\) failures. \[ W(K) = B(K) \cdot H_\delta - \max\{K-N, 0\} \cdot L_{\text{fail},\delta}. \] The welfare-maximizing pool size \(K^*\) (for \(K>N\)) is where the marginal benefit equals the marginal cost. Adding one applicant produces exactly one expected failure in the pool, so the marginal cost is \(L_{\text{fail},\delta}\). \[ \mathrm{MV}_K \cdot H_\delta = L_{\text{fail},\delta}. \] Substituting the expressions and cancelling \(1/\delta\): \[ \boxed{\mathrm{MV}_K = \gamma (1-e^{-\delta\ell}).} \] This is the optimality condition used in the main text.

8.2 Distribution-Specific Results

We solve for \(K^*\) based on the behaviour of \(\mathrm{MV}_K\) for different distributions \(F\).

8.2.1 Exponential Distribution (Light Tail)

Let \(J \sim \mathrm{Exp}(\lambda)\) have mean \(1/\lambda\). The expected sum of the top \(N\) order statistics out of \(K\) draws has a known closed form, often derived using the Rényi representation of exponential spacings: \[ B(K)=\frac{N}{\lambda}\Bigl(1+H_K-H_N\Bigr), \] where \(H_K = \sum_{k=1}^K \frac{1}{k}\) is the \(K\)-th harmonic number.

Marginal Value: \[ \mathrm{MV}_K = B(K+1) - B(K) = \frac{N}{\lambda}(H_{K+1} - H_K) = \frac{N}{\lambda(K+1)}. \] Returns diminish hyperbolically (\(O(1/K)\)).

Optimal Pool Size \(K^*\): Set \(\mathrm{MV}_K\) equal to the marginal social cost \(\gamma (1-e^{-\delta\ell})\): \[ \boxed{K^* = \frac{N}{\lambda\gamma(1-e^{-\delta\ell})} - 1.} \]
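As a sketch (all parameter values illustrative, not calibrated), the closed form agrees with a brute-force search for where \(\mathrm{MV}_K\) crosses the marginal social cost:

```python
import math

# Illustrative, uncalibrated parameters: N seats, mean impact mu = 1/lambda,
# social burn rate gamma, social discount delta, runway ell.
N, mu, gamma, delta, ell = 100, 100.0, 18.0, 1 / 3, 1.5
lam = 1 / mu
threshold = gamma * (1 - math.exp(-delta * ell))  # marginal social cost

k_star = N / (lam * threshold) - 1                # boxed closed form

# Brute force: first K at which MV_K = N/(lam*(K+1)) falls below the threshold.
K = N
while N / (lam * (K + 1)) > threshold:
    K += 1
```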

8.2.2 Fréchet Distribution (Heavy Tail)

Let \(J \sim \text{Fréchet}(\alpha, s)\) have shape \(\alpha>1\) (necessary for a finite mean) and scale \(s\). We use asymptotic results from extreme value theory for large \(K\). The expected sum of the top \(N\) values scales as \(K^{1/\alpha}\): \[ B(K) \approx s\,K^{1/\alpha}\,C_N(\alpha), \] where \(C_N(\alpha)\) is a constant, independent of \(K\) and \(s\): \[ C_N(\alpha) := \sum_{k=1}^{N}\frac{\Gamma\bigl(k-\tfrac{1}{\alpha}\bigr)}{\Gamma(k)}. \]

Marginal Value: We approximate the marginal value by taking the derivative of the asymptotic expression: \[ \mathrm{MV}_K \approx \frac{d}{dK} B(K) = s C_N(\alpha) \frac{1}{\alpha} K^{\frac{1}{\alpha}-1}. \] Returns diminish as a power law (\(O(K^{-(1-1/\alpha)})\)), slower than the hyperbolic \(O(1/K)\) decay of the exponential case.

Optimal pool size \(K^*\): We set \(\mathrm{MV}_K\) equal to the marginal social cost \(\gamma (1-e^{-\delta\ell})\) and solve for \(K\): \[ \frac{s C_N(\alpha)}{\alpha} (K^*)^{\frac{1}{\alpha}-1} = \gamma (1-e^{-\delta\ell}). \] \[ \boxed{K^* = \left(\frac{s\,C_N(\alpha)}{\alpha\,\gamma\,(1-e^{-\delta\ell})}\right)^{\frac{\alpha}{\alpha-1}}.} \] As \(\alpha \downarrow 1\) (heavier tails), the exponent \(\frac{\alpha}{\alpha-1} \to \infty\) and \(K^*\) explodes.
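The boxed expression is straightforward to evaluate. The sketch below (illustrative, uncalibrated parameters, with the scale \(s\) normalized so the mean impact is 100, as in the plotting setup later) shows how sharply \(K^*\) grows as the tail gets heavier:

```python
import math

def c_n(alpha, N):
    """C_N(alpha) = sum_{k=1}^{N} Gamma(k - 1/alpha) / Gamma(k)."""
    return sum(math.gamma(k - 1 / alpha) / math.gamma(k) for k in range(1, N + 1))

def k_star_frechet(alpha, s, N, gamma_cost, delta, ell):
    """The boxed asymptotic K* for Frechet-distributed impacts."""
    threshold = gamma_cost * (1 - math.exp(-delta * ell))
    return (s * c_n(alpha, N) / (alpha * threshold)) ** (alpha / (alpha - 1))

# Illustrative parameters; s = 100 / Gamma(1 - 1/alpha) normalizes the mean to 100.
N, gamma_cost, delta, ell = 100, 18.0, 1 / 3, 1.5
ks = {a: k_star_frechet(a, 100.0 / math.gamma(1 - 1 / a), N, gamma_cost, delta, ell)
      for a in (2.0, 1.8)}   # heavier tail (smaller alpha) -> much larger K*
```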

8.3 Plotting Parameters (for \(W(K)\) curves)

To plot the total welfare curves \(W(K)\), we need the total benefit \(B(K)\) and must normalize the distributions so they have the same mean impact, \(\mu_{\text{imp}}\), for a fair comparison. We set \(\mu_{\text{imp}}=100\) impact-dollars/yr in the main text.

  • Exponential:
    • Mean: \(1/\lambda=\mu_{\text{imp}}\Rightarrow \lambda=1/\mu_{\text{imp}}\).
    • Total Benefit (exact): \(B(K)=\mu_{\text{imp}}\,N\,\Big(1+H_K-H_N\Big)\).
  • Fréchet (\(\alpha>1\)):
    • Mean: \(\mathbb{E}[J]=s\,\Gamma(1-1/\alpha)=\mu_{\text{imp}}\Rightarrow s=\mu_{\text{imp}}/\Gamma(1-1/\alpha)\).
    • Total Benefit (asymptotic): \(B(K)\approx s\,C_N(\alpha)\,K^{1/\alpha}\).
    • (Where \(C_N(\alpha)\) is defined above).


9 Appendix C: Counterfactual Impact and Equilibrium Derivations

9.1 Derivation of \(\mathcal{I}_{CF}\)

We relate the Marginal Value of entry (\(\mathrm{MV}_K\)) to the expected counterfactual impact conditional on success (\(\mathcal{I}_{CF}\)).

\(\mathrm{MV}_K\) is the expected increase in total field impact when an applicant joins the pool (moving from \(K\) to \(K+1\) applicants). Let \(S\) be the event that this entrant succeeds (is hired). We assume applicants are exchangeable, so the success probability \(q_K := P(S)\) is the same for every member of the pool.

By the law of total expectation: \[ \mathrm{MV}_K = \mathbb{E}[\text{Impact of Entry} \mid S] P(S) + \mathbb{E}[\text{Impact of Entry} \mid \neg S] P(\neg S). \] If an applicant enters and fails (\(\neg S\)), their counterfactual impact is 0. The expected impact, conditional on success, is \(\mathcal{I}_{CF}\).

Therefore: \[ \mathrm{MV}_K = \mathcal{I}_{CF} \cdot q_K. \] We assume exchangeable applicants are competing for \(N\) slots in a pool of \(K+1\) and \(q_K = N/(K+1)\). \[ \mathcal{I}_{CF} = \frac{\mathrm{MV}_K}{q_K} = \mathrm{MV}_K \cdot \frac{K+1}{N}. \]

9.2 \(\mathcal{I}_{CF}\) for distributions

Exponential (light-tailed): The mean impact is \(\mu\). See Appendix B for \(\mathrm{MV}_K = N\mu/(K+1)\). \[ \mathcal{I}_{CF} = \frac{N\mu/(K+1)}{N/(K+1)} = \mu. \] We find the expected counterfactual impact, conditional on success, is constant.
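A paired Monte Carlo check of this constancy result, with illustrative parameters. The paired estimator scores the same simulated pool with and without the marginal entrant, which keeps the variance of the marginal-value estimate manageable:

```python
import random

random.seed(1)
mu, N, K = 1.0, 5, 50   # illustrative: mean impact, seats, incumbent pool size
n = 50_000

mv_sum = 0.0
for _ in range(n):
    pool = [random.expovariate(1 / mu) for _ in range(K + 1)]
    with_entrant = sum(sorted(pool, reverse=True)[:N])    # top-N sum, K+1 draws
    without = sum(sorted(pool[:K], reverse=True)[:N])     # same draws, minus one
    mv_sum += with_entrant - without                      # paired difference

mv_hat = mv_sum / n                # estimates MV_K = N*mu/(K+1)
i_cf_hat = mv_hat * (K + 1) / N    # should be close to mu, independent of K
```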

Fréchet (heavy tail):

Appendix B shows \(\mathrm{MV}_K \propto K^{1/\alpha-1}\) (valid asymptotically for large K). \(q_K \propto 1/K\). \[ \mathcal{I}_{CF} = \frac{\mathrm{MV}_K}{q_K} \propto \frac{K^{1/\alpha-1}}{1/K} = K^{1/\alpha}. \] The expected counterfactual impact conditional on success grows with the pool size \(K\).

9.3 Equilibrium Condition and Misalignment

The equilibrium \(K_{eq}\) occurs when the private EV — computed using the counterfactual surplus \(\Delta u_{CF}(K)\) — is zero. That happens when the bracketed term in the EV formula (Part A) is zero: \[ \frac{\Delta u_{CF}(K)\,r p(K)}{\rho} - c = 0 \implies \Delta u_{CF}(K) \cdot p(K) = \frac{c\rho}{r}. \] The RHS, \(\frac{c\rho}{r}\), is the effective private cost hurdle per application attempt.

We compare it to the social optimality condition \(K^*\), defined in Part B. For simplicity, we approximate the social cost of failure, \(L_{\text{fail},\delta} \approx \gamma/\delta\), assuming large \(\ell\), and set \(\delta=\rho\). The optimality condition \(\mathrm{MV}_K \cdot H_\delta = L_{\text{fail},\delta}\) becomes: \[ \mathrm{MV}_K = \gamma. \]

Analyzing Over/Under Entry:

To illustrate the misalignment, consider a simplified case where private financial losses (pay cuts) are negligible, so \(\Delta u_{CF}(K) \approx \mathcal{I}_{CF}(K)\). Also, assume that the per-application success rate \(p(K)\) approximates the overall success probability \(q_K\).

In this case, \(\Delta u_{CF}(K) \cdot p(K) \approx \mathcal{I}_{CF}(K) \cdot q_K = \mathrm{MV}_K\).

The private equilibrium condition simplifies to: \(\mathrm{MV}_K = \frac{c\rho}{r}\). The social optimum condition remains: \(\mathrm{MV}_K = \gamma\).

Since \(\mathrm{MV}_K\) is decreasing in \(K\), \(K_{eq} > K^*\) (Over-Entry) occurs if the private threshold is lower than the social threshold: \[ \frac{c\rho}{r} < \gamma. \] This happens when the private cost hurdle per attempt is lower than the social cost rate of failure. As the main text shows using Alice’s parameters (0.69k vs 18k), the inequality often holds strongly, indicating a structural tendency toward over-entry even when agents use sophisticated counterfactual reasoning.
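In the simplified exponential case both thresholds have closed forms, so the over-entry gap can be read off directly. A sketch with illustrative parameters (the seat count \(N\) in particular is made up):

```python
# Exponential (light-tailed) case, where MV_K = N*mu/(K+1) gives closed forms.
# All parameters illustrative; the seat count N in particular is hypothetical.
N, mu = 100, 100.0          # seats, mean impact ($k/yr)
c, rho, r = 50, 1 / 3, 24   # Alice's private parameters
gamma = 18.0                # social burn rate ($k/yr)

k_eq = N * mu / (c * rho / r) - 1   # private entry stops when MV_K = c*rho/r
k_opt = N * mu / gamma - 1          # social optimum when MV_K = gamma
over_entry = k_eq / k_opt           # roughly 26x over-entry
```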

10 References

Arulampalam. 2000. “Is Unemployment Really Scarring? Effects of Unemployment Experiences on Wages.”
Caron, Teh, and Murphy. 2014. “Bayesian Nonparametric Plackett–Luce Models for the Analysis of Preferences for College Degree Programmes.” The Annals of Applied Statistics.
Earnest, Allen, and Landis. 2011. “Mechanisms Linking Realistic Job Previews with Turnover: A Meta‐Analytic Path Analysis.” Personnel Psychology.
Hill, Yin, Stein, et al. 2025. “The Pivot Penalty in Research.” Nature.
Horton, John, Kerr, and Stanton. 2017. “Digital Labor Markets and Global Talent Flows.” Working Paper. Working Paper Series.
Horton, John J, Sloan, Vasserman, et al. 2024. “Reducing Congestion in Labor Markets: A Case Study in Simple Market Design.”
Horton, John, and Vasserman. 2021. “Job-Seekers Send Too Many Applications: Experimental Evidence and a Partial Solution.”
Schmidt, Frank L., and Hunter. 1998. “The Validity and Utility of Selection Methods in Personnel Psychology: Practical and Theoretical Implications of 85 Years of Research Findings.” Psychological Bulletin.
Schmidt, F. L., Oh, and Shaffer. 2016. “The Validity and Utility of Selection Methods in Personnel Psychology: Practical and Theoretical Implications of 100 Years of Research.”
Skaperdas. 1996. “Contest Success Functions.” Economic Theory.

Footnotes

  1. TODO: come back and plot the sensitivity later.↩︎

  2. I’m making this super clear because when I get LLMs to review this piece they’re inevitably keen to push me to add employer congestion costs, but the literature is pretty clear it’s not a first-order effect in practice.↩︎