AI isn’t a replacement problem…
… it’s a redesign problem
Everyone is having the wrong argument about AI and work.
The dominant narrative has two poles. On one side: AI will replace humans, millions of jobs will disappear, we need to prepare for a post-work future. On the other: AI is overhyped, it hallucinates, it can’t do what humans do, your job is safe. Both sides are arguing about whether AI can do human work. Almost nobody is asking the more interesting question: should it?
Because here’s what’s getting lost in the replacement debate. The structure of most knowledge work is, itself, the problem. We’ve built jobs that require humans to spend the majority of their time on tasks humans are bad at - repetitive, precise, rule-bound execution - and a minority of their time on the thing humans are irreplaceably good at: making judgment calls with incomplete information.
AI doesn’t need to replace anyone. It needs to let us redesign the work so that each actor in the system, human and machine, operates where they’re strongest. The result isn’t fewer humans. It’s humans doing fundamentally different work, for fewer hours, with greater output.
That sounds utopian. It’s not. It’s an engineering problem with real tradeoffs, and the biggest tradeoff is one nobody’s talking about.
The dirty secret of most knowledge work
Watch a senior financial analyst for a day. Track what they actually do with their time. You’ll find something depressing.
Maybe 15% of their day involves the thing they were hired for: interpreting ambiguous data, weighing competing signals, making a recommendation under uncertainty. The thing that requires their judgment, their experience, their ability to hold contradictory information and still choose a direction. The rest? Pulling data from three different systems. Formatting spreadsheets. Cross-referencing figures. Building slide decks that present what they already know in a format someone else requires. Sending follow-up emails. Attending status meetings where the only new information is that there is no new information.
This isn’t unique to finance. Software engineers spend more time reading documentation, configuring environments, and writing boilerplate than they spend on the architectural decisions that determine whether a system works. Lawyers spend more time reviewing standard clause language than they spend on the novel legal reasoning that wins cases. Doctors spend more time on charting and insurance coding than on diagnosis.
The knowledge economy promised us thinking work. What it delivered was thinking work buried under an avalanche of doing work - and most of the doing work is exactly the kind of precise, directed, rule-following execution that humans find draining and do inconsistently.
We’ve been coping. We call it “the boring parts of the job.” We treat it as the price of admission. But it’s not an immutable feature of the work. It’s an artifact of how we designed the workflow before we had a tool that could handle the boring parts reliably.
What AI is actually good at (and what it’s not)
The replacement debate gets muddled because people talk about AI as a monolith. “AI can do X” or “AI can’t do X.” This is like saying “engines can drive” or “engines can’t drive.” Engines are good at a specific thing - converting fuel into rotational force - and terrible at everything else. You don’t evaluate an engine by asking whether it can replace a driver. You evaluate it by asking what it contributes to the system of driving.
AI, as it currently exists, is exceptionally good at a well-defined category of work: tasks that are rote, precise, and directed. Summarize this document according to these criteria. Extract these fields from these records. Generate a first draft following this template. Cross-reference this dataset against that one. Translate this format into that format. Monitor this stream and flag anything that matches this pattern.
These tasks share three properties. They have clear inputs. They have clear completion criteria. And the judgment required to execute them is bounded: you can specify what “right” looks like in advance.
AI is conspicuously bad at a different category of work: decisions where the criteria for “right” can’t be fully specified in advance. Should we enter this market? Is this patient’s presentation consistent with their history, or is something being missed? Does this contract protect our interests in a scenario nobody has articulated yet? Should we ship this feature now or wait for more data? Is this employee struggling, or are they checked out?
These decisions require something AI fundamentally lacks: the ability to weigh factors that haven’t been enumerated, to draw on experiential knowledge that can’t be formalized, to make a call when the framework for making the call doesn’t yet exist. This is the domain of human judgment, and it’s not a gap that scales away with more training data. It’s a different kind of cognitive work.
The mistake is treating these two categories as points on a continuum, as if AI will eventually slide from one into the other. They’re not on a continuum. They’re different in kind. And recognizing that difference is what makes redesign possible.
The redesign, concretely
Here’s what redesigned work actually looks like, not in theory but in the three places I’ve watched it happen.
A mid-size investment firm restructured its analyst workflow last year. Previously, junior analysts spent roughly 60% of their time on data assembly - pulling figures, building comps, formatting models - and 40% on analysis. The firm didn’t replace the junior analysts. It handed the data assembly to AI tools and asked the analysts to spend the reclaimed time doing more analysis. Same headcount. Different job.
The result was uncomfortable. The analysts could now evaluate more opportunities per week. But “more evaluation” doesn’t mean “more of the same.” It means more decisions. More moments of applying judgment under uncertainty. More instances of putting your name on a recommendation and saying “I think this is the right call.” The cognitive intensity per hour went up substantially.
Two things happened. Output increased - the team covered roughly 2.5x more opportunities per quarter with the same number of analysts. And the analysts reported being more exhausted at the end of the day, even though several of them were leaving two hours earlier than they used to. They’d traded eight hours of mixed cognitive load, some intense, mostly routine, for six hours of almost exclusively intense cognitive work. The work got better. It also got harder.
A legal team at a technology company did something similar. They used AI to handle first-pass contract review: flagging deviations from standard terms, generating redline suggestions against a template, summarizing counterparty positions. The attorneys weren’t replaced. They were repositioned: their job became reviewing the AI’s work product and making the calls the AI couldn’t make. Which clause deviations actually matter given the business relationship? Where should we push back and where should we concede? What’s the real risk in this language, given context the AI doesn’t have? (I’ve written elsewhere, using the neofirm model in professional services as a case study, about why the framing that the role is elevated, not diminished, is psychologically insufficient for the practitioners living through it.)
Same pattern. Output went up and the team processed contracts faster. Cognitive load went up too. Every hour at the desk was now spent on judgment, not formatting. Hours went down - several attorneys shifted to a compressed schedule because the intense decision work was genuinely unsustainable for eight consecutive hours.
A clinical documentation team at a hospital system tried the most radical version. They used AI to generate draft clinical notes from recorded patient encounters, handle insurance pre-authorization paperwork, and flag potential coding errors. The clinicians reviewed and approved the AI’s output, then spent the reclaimed time on patient interaction and clinical decision-making.
The physicians I spoke with described the same paradox. They were doing less total work by the clock, and the work they were doing was what they’d gone to medical school for. But they were making more clinical decisions per hour with less buffer time between them. The administrative padding that used to give them unconscious recovery time between hard judgment calls was gone. They got the work they wanted. It turned out to be the hardest part of the job for a reason.
I’ve written separately about how this shift is changing software engineers’ jobs in ways that aren’t sustainable, but the pattern is the same everywhere.
The tradeoff nobody mentions
Every article about AI productivity tells half the story. Output goes up. Hours can go down. Efficiency improves. All true.
Here’s the other half: when you strip away the routine tasks that dilute a knowledge worker’s day, what remains is the concentrated essence of their role, and that essence is cognitively expensive.
Decision-making under uncertainty is among the most demanding work the brain does. Research on decision fatigue suggests that the quality of decisions degrades over a sustained period of judgment-intensive work. A widely cited, though contested, study found that judges granted parole at different rates before and after meal breaks. Fatigue at the end of long shifts is a standing concern in surgery for the same reason. The brain that makes good decisions is a brain that hasn’t been making decisions non-stop.
In the old model, routine work served an accidental function: it provided cognitive rest between bouts of judgment. The analyst formatting a spreadsheet wasn’t doing analytically demanding work, but that was the point. Their brain was recovering from the last hard decision and preparing for the next one. The lawyer reviewing standard clauses was in a low-intensity mode that allowed their judgment faculties to recharge.
Strip away the routine work and you lose the rest. What remains is a continuous stream of the hardest work the person does, with nothing in between to modulate the intensity. This is like removing the rest periods from interval training and expecting the same performance in each sprint. It doesn’t work that way. The rest was functional.
This means the redesigned workday can’t simply be “the old workday minus the boring parts.” It has to be structured around a realistic model of how long a human can sustain high-quality judgment work. And that number, according to most of the research and most of the people I’ve talked to who are living it, is somewhere between four and six hours.
Not four to six hours of being at work. Four to six hours of actual cognitive output - making decisions, evaluating tradeoffs, exercising judgment on ambiguous problems. After that, the quality drops. Not dramatically. Not immediately. But measurably, and in roles where the quality of judgment is the whole point, measurably is enough.
The six-hour day isn’t a perk. It’s a design requirement
This is the part that makes organizations uncomfortable. Not the AI. Not the task redistribution. The implication that an optimally designed human-AI workflow might require shorter human hours - not as a benefit, but as an engineering constraint.
If your redesigned role consists almost entirely of judgment work, and humans can sustain high-quality judgment work for roughly six hours, then scheduling eight hours of judgment work doesn’t get you two more hours of output. It gets you six hours of good output and two hours of degraded output that creates problems you’ll spend tomorrow fixing. I explored the organizational failure mode this creates in the context of high-intensity engineering cultures, where “more work than people” looks like energy until it becomes attrition.
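To make the arithmetic concrete, here is a toy model of that claim. The six-hour ceiling comes from the argument above; everything else is an assumption made up for illustration, not a measurement: `degraded_yield` (how much a tired hour is worth relative to a good one) and `rework_cost` (how much of tomorrow each tired hour eats) are hypothetical parameters you would have to estimate for your own team.

```python
def net_output_hours(scheduled: float,
                     sustainable: float = 6.0,
                     degraded_yield: float = 0.5,
                     rework_cost: float = 0.75) -> float:
    """Good-output-equivalent hours for one scheduled day (toy model).

    sustainable:    hours of high-quality judgment work per day (from the text)
    degraded_yield: ASSUMED value of one degraded hour, as a fraction of a good hour
    rework_cost:    ASSUMED hours of future cleanup caused by one degraded hour
    """
    good = min(scheduled, sustainable)
    degraded = max(scheduled - sustainable, 0.0)
    # Each degraded hour produces something, but also creates work for tomorrow.
    return good + degraded * (degraded_yield - rework_cost)


six_hour_day = net_output_hours(6.0)
eight_hour_day = net_output_hours(8.0)
print(six_hour_day, eight_hour_day)  # 6.0 5.5
```

With these made-up numbers the eight-hour day nets less than the six-hour one. The point is not the specific values but the shape: once rework costs more than a degraded hour yields, extra scheduled hours subtract.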
The old eight-hour day worked (to the extent it did) because it was never eight hours of a single mode of work. It was a mix of intensities. Remove the low-intensity components and the sustainable duration shrinks. This isn’t a motivational problem. It isn’t about willpower or work ethic. It’s a constraint imposed by the architecture of the human brain, and no amount of organizational culture can override it.
The organizations that are getting this right, and there aren’t many yet, are the ones treating human cognitive capacity as a design constraint rather than a performance variable. They’re not asking “how can we get people to sustain eight hours of judgment work?” They’re asking “given that our people can sustain six hours of judgment work, how do we design the system so those six hours produce more value than eight hours of the old mixed-intensity model?”
The answer, in every case I’ve seen, is that the six hours win. Easily. Six hours of focused judgment work, supported by AI handling the execution, produces more high-quality output than eight hours of the old model where judgment was diluted by routine. The math works. The humans are better. The work is better. That math holds only as long as the AI’s output quality holds, which requires ongoing monitoring that most organizations haven’t built. The redesign isn’t a one-time event.
The organizational courage required to actually schedule a six-hour day (to tell people that their work is done and they should leave) is apparently harder than any technical implementation challenge.
What this means if you’re adopting AI
If you’re thinking about AI adoption as a question of which jobs to automate, you’re solving the wrong problem. The question isn’t which people to replace. It’s which tasks to reallocate and how to restructure what remains so that humans operate in their zone of maximum contribution.
Start with task decomposition, not role elimination. Take any role in your organization and break it into its component tasks. Categorize each task: Is it rote, precise, and directed - meaning the criteria for “done well” can be specified in advance? Or does it require judgment under ambiguity - meaning the person doing it needs to weigh factors that can’t be fully enumerated? The first category is AI work. The second is human work. The role isn’t going away. Its composition is changing. The same logic applies to deployment design more broadly: an unscoped AI adoption creates an unbounded surface area for human-systems failures, and the failures surface one at a time, each looking isolated.
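A minimal sketch of that decomposition, assuming the three properties named earlier (clear inputs, clear completion criteria, bounded judgment) as the test for “rote, precise, and directed.” The `Task` fields, the `allocate` rule, and the weekly hours are all hypothetical; the task names echo the analyst example, and the hours are chosen only so the judgment share lands near the 15% mentioned there.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    hours_per_week: float           # illustrative numbers, not measured
    clear_inputs: bool
    clear_completion_criteria: bool
    bounded_judgment: bool          # can "done well" be specified in advance?


def allocate(task: Task) -> str:
    """Return 'ai' only if all three properties hold; otherwise 'human'."""
    if task.clear_inputs and task.clear_completion_criteria and task.bounded_judgment:
        return "ai"
    return "human"


analyst_role = [
    Task("pull data from source systems", 14, True, True, True),
    Task("format spreadsheets and decks", 10, True, True, True),
    Task("cross-reference figures", 10, True, True, True),
    Task("recommend under uncertainty", 6, False, False, False),
]

hours = {"ai": 0.0, "human": 0.0}
for task in analyst_role:
    hours[allocate(task)] += task.hours_per_week

human_share = hours["human"] / sum(hours.values())
print(hours, round(human_share, 2))
```

The role keeps all four tasks on paper; what changes is who executes each one. A real decomposition would also need a category for tasks where the AI drafts and a human reviews, which this sketch leaves out.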
Design for cognitive sustainability. Once you’ve reallocated the routine tasks, look at what remains. If what’s left is six hours of wall-to-wall judgment work, don’t schedule eight hours and wonder why quality degrades in the afternoon. Treat human cognitive capacity as a constraint you’re designing around, not a problem you’re managing through.
Expect the transition to be uncomfortable. People who’ve spent years in roles that mix routine and judgment have adapted to that rhythm. Removing the routine doesn’t just free up time; it removes the cognitive pacing that made the judgment work sustainable. It also removes a source of identity, a mechanism I described in the case of a tenured professor confronting the same shift: the routine wasn’t just padding, it was how people knew they were good at their jobs. The transition period will feel harder, not easier. That’s not a sign it isn’t working. It’s a sign you need to redesign the cadence of the day, not just the content.
Measure output, not hours. This is the only way the redesign survives contact with organizational inertia. If your analysts cover 2.5x more opportunities in six hours than they used to in eight, the value case is clear. If you measure by hours-at-desk, you’ll never make the shift because the new model will always look like people working less. They are working less. They’re also producing more. You have to decide which number you care about. A related distinction, between production and understanding (what someone can generate versus what they can own), shows up in hiring and in how organizations evaluate whether a redesign is actually working.
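The two numbers are easy to put side by side. This sketch uses only the figures from the analyst example (2.5x the opportunities, six hours instead of eight); the function name is mine, and “opportunities covered” stands in for whatever unit of output your team actually tracks.

```python
def per_hour_gain(output_multiplier: float, old_hours: float, new_hours: float) -> float:
    """How much more output each hour at the desk produces after the redesign."""
    return output_multiplier * old_hours / new_hours


hours_ratio = 6 / 8                      # what an hours-at-desk metric sees
hourly_gain = per_hour_gain(2.5, 8, 6)   # what an output metric sees

print(f"hours at desk: {hours_ratio:.2f}x")      # hours at desk: 0.75x
print(f"output per hour: {hourly_gain:.2f}x")    # output per hour: 3.33x
```

Measured by the clock, the team looks 25% less busy. Measured by output per hour, each hour is worth roughly 3.3 of the old ones.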
The real transformation
The AI replacement debate is a distraction. It frames a design problem as an existential threat, which makes it both scarier and less actionable than it needs to be.
The actual opportunity is more mundane and more transformative: redesign knowledge work so that humans do the work only humans can do, and machines do the work machines do better. The result isn’t a jobless future. It’s a future where the nature of work changes, the intensity of work increases, the duration of work decreases, and the output improves.
The hardest part isn’t the technology. The technology is the easy part. The hardest part is accepting that an optimally designed system (one where humans operate at their cognitive best and AI handles the rest) might produce its best results in a six-hour day. And then having the organizational courage to actually implement that.
The work doesn’t need to be longer. It needs to be better. AI doesn’t make that possible by replacing humans. It makes it possible by finally letting humans do what they’ve been too busy to do all along: think.


