The technical interview isn’t a measurement instrument
Fayner Brack is right that we’re rejecting the wrong engineers. The harder question is who in the organization benefits from the current system continuing to be broken.
Fayner Brack published a piece last week arguing that technical interviews systematically reject the wrong engineers. She’s spent fifteen years watching brilliant engineers fail because they didn’t solve a problem the way the interviewer expected, while mediocre engineers get hired because they practiced the right LeetCode patterns. She notes that the “cost of a bad hire” statistic everyone cites traces back to no original source. And she proposes the obviously correct move: measure the interview itself, not just the candidate.
She’s right about all of it. I want to extend the argument in one direction the piece gestures at but doesn’t quite land on, because I think it’s the part that explains why the broken system is so durable.
The technical interview, as practiced in most engineering organizations, isn’t functioning as a measurement instrument. It’s functioning as a sorting ritual that performs organizational risk-management work for hiring managers. Better measurement would disrupt the risk-management function, which is why proposals like Brack’s keep getting nodded at and not adopted.
What the interview is doing instead
Consider the manager hiring a senior engineer. They run the standard loop: coding screen, system design, behavioral, hiring manager round. The candidate gets through. Six months later, the candidate is underperforming. The manager has to explain this to their director.
Notice what the interview gave the manager. It gave them an audit trail. “The candidate passed the bar” is a complete answer to “how did this person end up on the team.” The interview didn’t predict job performance. It produced documentation of consensus, which is what the manager needs when the hire goes wrong.
This is risk transfer, not measurement. The interview converts the manager’s individual judgment into a multi-person collective judgment, then ratifies that collective judgment as “the bar.” When outcomes are bad, the bar takes the blame, not the manager. The institution wears the failure on the manager’s behalf.
A better-measured interview process would weaken this function. If the interview were predictive in the way managers claim to want it to be, the manager would be on the hook for the prediction. If the interview produced calibrated probabilities of job success rather than yes/no consensus, the manager would have to defend the probability they accepted. Most managers do not want this. The current system is uncomfortable, but it externalizes their hiring risk in a way they’re not eager to give up.
This is the dynamic I’ve described elsewhere as vibes-based evaluation: an organization runs on feel rather than criteria, because feel is harder to be wrong about. The technical interview is the most ritualized version of this pattern in engineering. It produces decisions without producing accountability for the decisions.
Why “the bar” survives every reform attempt
Every few years a thoughtful engineer publishes a piece like Brack’s. The argument is correct. The proposed reforms are reasonable. And the industry barely moves.
The standard explanation is institutional inertia. Companies are slow, hiring processes are entrenched, interviewers don’t want to change what they’re comfortable with. All true, and all incomplete.
The deeper reason is that “the bar” is doing a job the engineering organization needs done, and changing it would require changing what the organization holds the hiring manager accountable for. Right now, hiring managers are accountable for “did the candidate clear the bar.” If you ask managers to be accountable for “did your hiring decisions produce engineers who shipped well over the next eighteen months,” you’re asking for an entirely different organizational structure. One that has retroactive review of hiring outcomes, that tracks hire-by-hire performance over time, that creates feedback loops between calibration meetings and shipped work.
Most engineering organizations do not have any of this. The hiring decision is made, the candidate joins or doesn’t, and the loop closes. There is no measurement of whether the decision was right because there is no infrastructure for that measurement to happen, and there is no infrastructure because no manager wants their hiring track record to become legible.
Brack’s framework for measuring interview quality holds up. It will struggle to be adopted not because it’s wrong but because it threatens the system the interview is serving.
The AI-era version is worse
Whatever was already wrong with technical interviews as predictors of job performance has gotten sharper now that AI tools are sitting in the production workflow.
LeetCode screens were never great predictors of engineering performance. They tested for a specific kind of pattern-matching that correlated weakly with the broader skill of building and maintaining systems. The correlation was weak before AI; with AI in the workflow, the test is measuring something the job doesn’t require in the form the interview presumes.
The engineer who solves a binary tree problem in fifteen minutes on a whiteboard is demonstrating something. What that something predicts about their ability to use Claude or Copilot effectively, to verify AI-generated code, to maintain mental models of systems where AI wrote most of the lines, is anyone’s guess. I’ve written about this elsewhere, on what interviews measure when AI is in the room: the screen is now testing for a skill the job requires in a fundamentally different shape than it did three years ago, and the screen hasn’t updated.
The defense organizations offer is that the screen still measures “fundamentals.” Maybe. The empirical question, which Brack would correctly insist on, is whether the screen’s accept/reject decisions correlate with subsequent job performance in the AI-augmented workplace. Almost no organization is measuring this, for the same reason they don’t measure hiring outcomes generally: doing so would generate evidence that the current process is selecting on noise.
What real interview measurement requires
Brack is right that the fix is measuring the interview itself. The shape of that measurement has to be more uncomfortable than most organizations have admitted.
A real interview measurement program would do four things, none of which most companies do.
It would track hiring outcomes by interviewer. Which engineers passed candidates who later thrived? Which engineers passed candidates who later washed out? Which engineers rejected candidates who later thrived at competing companies? The pattern of false positives and false negatives by interviewer would be diagnostic. Most companies cannot tell you this because they don’t track it.
It would compute inter-rater reliability on real interview signals. Two interviewers see the same candidate. Do their evaluations agree? If they don’t, the bar isn’t a bar; it’s a lottery. If they agree, but disagree with later job performance, the bar is well-calibrated to a thing that isn’t the job.
It would generate counterfactual data through rejected-candidate follow-up. The most expensive interview errors aren’t the false positives, the bad hires that washed out. They’re the false negatives, the candidates you rejected who went to a competitor and outperformed your senior engineers within eighteen months. You will never see this data unless you go looking for it. Almost no engineering organization does.
It would commit to revising the process based on what the data shows. This is the part where most reform efforts collapse. The data shows the interview is poorly predictive, and the response is “but we have to interview people somehow,” and the process continues unchanged because the people who would have to change it are the people the current process protects.
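The first two measurements above are small enough to sketch. What follows is a minimal Python illustration, not anything taken from Brack’s piece. The record layout (`interviewer`, `voted_hire`, `thrived_later`) and both function names are assumptions made for the example, and Cohen’s kappa is used here as one common agreement statistic for two raters, not as the only way to compute inter-rater reliability.

```python
from collections import Counter, defaultdict


def error_counts_by_interviewer(records):
    """Tally each interviewer's hire/no-hire votes against later outcomes.

    records: iterable of (interviewer, voted_hire, thrived_later) tuples.
    This layout is hypothetical. Outcomes for rejected candidates exist
    only if someone actually follows up on them.
    """
    counts = defaultdict(Counter)
    for interviewer, voted_hire, thrived_later in records:
        if voted_hire and thrived_later:
            label = "true_positive"
        elif voted_hire:
            label = "false_positive"   # passed someone who washed out
        elif thrived_later:
            label = "false_negative"   # rejected someone who thrived
        else:
            label = "true_negative"
        counts[interviewer][label] += 1
    return counts


def cohen_kappa(votes_a, votes_b):
    """Cohen's kappa for two interviewers' boolean votes on the same candidates.

    1.0 is perfect agreement; 0.0 is the agreement expected by chance.
    """
    if len(votes_a) != len(votes_b) or not votes_a:
        raise ValueError("need two equal-length, non-empty vote lists")
    n = len(votes_a)
    observed = sum(a == b for a, b in zip(votes_a, votes_b)) / n
    p_a = sum(votes_a) / n             # share of "hire" votes from rater A
    p_b = sum(votes_b) / n             # share of "hire" votes from rater B
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        return float("nan")            # undefined when neither rater's votes vary
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Made-up records, for illustration only.
    sample = [("ana", True, True), ("ana", True, False), ("ben", False, True)]
    for name, tally in error_counts_by_interviewer(sample).items():
        print(name, dict(tally))
    print(cohen_kappa([True, True, True, False], [True, True, False, False]))
```

A kappa near zero is the lottery case: the two interviewers agree no more often than chance would predict. A high kappa alongside a poor outcome tally is the other case, a consistent bar aimed at something other than the job.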
What this means for engineering leaders
If you’re running an engineering organization that hires more than a handful of people a year, Brack’s argument should make you uncomfortable, and the discomfort is the signal that the argument is operating on something real.
The honest assessment of your own process probably looks like this. You don’t track hiring outcomes by interviewer because nobody has time. You don’t compute inter-rater reliability because the calibration meetings already feel productive and people would resist being measured against each other. You don’t follow up on rejected candidates because you don’t want to know. Your “bar” is whatever the senior engineers in your loop have agreed to over the last few years through a process that nobody documented and nobody is going to revisit.
This is the part where the standard recommendation would be “build the measurement infrastructure.” Sometimes that’s right. More often, it’s wrong about the order of operations.
The actual first move is making one person accountable for the predictive validity of your hiring loop, separately from the people who run the loop. This person’s job is to ask, every quarter, whether the engineers you’ve hired in the last year are performing where the interview predicted they would. If they aren’t, the interview needs adjustment. If they are, the interview can stand. Most organizations do not have this role because creating it would require admitting that hiring is a measurement problem with a measurement solution, rather than a ritual that produces consensus.
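One way to make that quarterly question concrete is a rank correlation between interview scores and later performance ratings for the people you hired. The sketch below is an illustration under assumptions, not a prescribed method: it assumes each hire has a numeric interview score and a later numeric performance rating, and it uses Spearman’s rank correlation because such ratings are usually ordinal. The function names are made up for the example.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1       # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(interview_scores, performance_ratings):
    """Spearman rank correlation between two equal-length numeric lists."""
    if len(interview_scores) != len(performance_ratings) or len(interview_scores) < 2:
        raise ValueError("need two equal-length lists with at least two entries")
    rx = average_ranks(interview_scores)
    ry = average_ranks(performance_ratings)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")            # undefined when one list has no variation
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    print(spearman([3.5, 4.0, 2.5, 4.5], [2, 4, 3, 5]))
```

A caveat for whoever holds this role: the calculation only sees people who were hired, so it says nothing about the candidates the loop turned away. That gap is exactly what rejected-candidate follow-up is meant to fill.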
The second move is harder. It’s accepting that the cost of better hiring is reduced ambiguity about who made which call. The current system is comfortable because the responsibility is diffused across the loop. A measured system makes the responsibility legible, and the people whose calls are revealed to be poorly predictive will not enjoy the legibility. This is the political problem that the technical solution can’t reach.
The thing Brack is gesturing at
Brack closes her piece on the framework, which is the constructive move and the right one for an engineering audience. The piece she isn’t writing but is plainly aware of is the one about why the framework keeps not getting adopted.
The technical interview persists in its current form because the current form serves the people who would have to change it. Brilliant engineers will continue to be rejected. Mediocre engineers will continue to be hired. The cost of a bad hire will continue to be cited from a source nobody can produce. And the whole apparatus will continue to be defended as rigor.
The framework is sound. It is also not the bottleneck. The bottleneck is the request hidden inside “measure the interview”: to make hiring decisions accountable in a way that would expose how much of the current process is theatre.
The engineers who get rejected aren’t the wrong engineers. They’re the right engineers being filtered through a process that was never optimized for finding them. The process was optimized for something else. Until we name what that something else is, every reform proposal will continue to hit the same wall.


