AI's Enterprise Risk: It's Replacing the Experts It Needs

For artificial intelligence systems to keep advancing in knowledge-based work, they need either a robust mechanism for autonomous self-improvement or human evaluators who can identify errors and provide high-quality feedback. The industry has poured immense resources into the former while largely neglecting the latter.

Human evaluation deserves the same rigor and investment that goes into building model capabilities. New graduate hiring at major tech firms has halved since 2019. Tasks such as document review, preliminary research, data cleaning, and code review are now handled by AI. Economists call this displacement; companies call it efficiency. Neither label addresses what happens next.

The Limitations of Self-Improvement in Knowledge Work

The most immediate counterargument involves reinforcement learning (RL). Systems like AlphaZero have achieved superhuman performance in games such as Go, chess, and Shogi, learning without human data and even devising novel strategies. Move 37, played by AlphaGo in the 2016 match against Lee Sedol, struck professional players as unconventional; it was not derived from human annotation but emerged from AI self-play.

What makes this possible is the stability of the game environment. Move 37 is a novel action, but it sits within the fixed state space of Go. The rules are complete, unambiguous, and unchanging. Crucially, the reward signal is definitive: every game ends in a win or a loss, and the result is unequivocal. The system can judge how good a move was by the eventual outcome of the game.

Knowledge work lacks these stable properties. The operating rules of professional domains are dynamic, constantly reshaped by the people who work within them. New legislation is enacted, new financial instruments are developed, and a legal strategy that works one year may be obsolete the next in a jurisdiction that has changed its interpretation of the law. The correctness of a medical diagnosis may only become apparent years later. Without a stable environment and an unambiguous reward signal, the learning loop cannot close on its own, so humans must stay involved in evaluation to keep guiding the AI.

The Challenge of Expertise Formation

Today's AI systems are trained on the accumulated expertise of people who went through rigorous professional development. The critical difference now is that the entry-level positions that historically built such expertise have been largely automated. As a result, the next generation of potential experts is not acquiring the specialized judgment that makes human evaluators valuable.

History offers examples of lost knowledge, such as Roman concrete formulas, Gothic construction techniques, and mathematical traditions that took centuries to reconstruct. Those losses, however, were driven by external factors like plagues, invasions, or the collapse of supporting institutions. The current situation is different because no external force is required. Fields can degrade not through catastrophe but through numerous individually rational economic decisions, each logical in isolation. This is a novel mechanism, and we have little experience recognizing it as it unfolds.

When Entire Disciplines Face Obsolescence

At its extreme, this trend signifies not merely a pipeline issue, but a collapse in the demand for specialized expertise itself.

Consider advanced mathematics. Its decline would not stem from a lack of mathematicians in training, but from organizations ceasing to require their services for daily operations. This erodes the economic incentive for pursuing such careers, diminishes the pool of individuals capable of frontier mathematical reasoning, and ultimately leads to a quiet erosion of the field’s capacity for novel insight. The same logic extends to coding. The relevant question is not whether AI will write code, but rather, if AI handles all production code, who will cultivate the deep architectural intuition necessary for genuinely innovative system design?

There exists a fundamental distinction between automating a discipline and truly understanding it. While we can automate a significant portion of structural engineering today, the abstract knowledge underpinning why certain approaches succeed resides within individuals who have spent years learning from their mistakes. Eliminating the practical application of a skill does not merely eliminate practitioners; it eradicates the capacity to comprehend what has been lost.

Disciplines such as advanced mathematics, theoretical computer science, deep legal reasoning, and complex systems architecture face a critical risk. When the last expert in a specialized area of algebra retires without a successor because funding has dwindled and career paths have disappeared, that knowledge may not be readily rediscovered.

It is simply lost, and the loss often goes unnoticed because models trained on earlier work continue to perform adequately on benchmarks for another decade. This can be described as a hollowing out: the superficial capability persists (models can still generate expert-like outputs), while the underlying human capacity to validate, extend, or correct that expertise quietly vanishes.

The Inadequacy of Rubrics as a Complete Solution

The prevailing approach relies on rubric-based evaluation. Methodologies like Constitutional AI, reinforcement learning from AI feedback (RLAIF), and structured criteria enabling AI to score AI performance represent significant advancements that demonstrably reduce reliance on human evaluators. These techniques are not to be dismissed.

Their inherent limitation, however, is that a rubric can only capture what its author anticipated and could measure. Optimizing hard against such a rubric produces a model that is good at meeting the rubric's criteria, which is not the same as being correct.

Rubrics effectively scale the explicit, articulable components of judgment. The deeper, intuitive aspects, such as the subtle sense that something is amiss, cannot be readily codified. That kind of understanding comes from direct experience, and it has to exist before anyone can articulate what needs to be written down.
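
The gap between meeting a rubric and being correct can be illustrated with a minimal sketch. Everything in it is a made-up assumption for illustration: the three criteria, the function name rubric_score, and the two sample answers (including the invented "§ 12" rule) do not come from any real evaluation system.

```python
# Toy rubric: each criterion is a surface check the rubric author thought to measure.
# All names, criteria, and sample answers here are hypothetical.
RUBRIC = {
    "cites_a_statute": lambda text: "§" in text,
    "states_a_conclusion": lambda text: "therefore" in text.lower(),
    "under_50_words": lambda text: len(text.split()) < 50,
}


def rubric_score(text: str) -> float:
    """Return the fraction of rubric criteria that the text satisfies."""
    passed = sum(1 for check in RUBRIC.values() if check(text))
    return passed / len(RUBRIC)


# Two answers to the same invented question. Suppose, for this example only,
# that the true notice period is 30 days, so the second answer is wrong.
correct_answer = "Under § 12 the notice period is 30 days; therefore the claim is timely."
wrong_answer = "Under § 12 the notice period is 90 days; therefore the claim is late."

# Both answers satisfy every criterion, so the rubric cannot tell them apart.
print(rubric_score(correct_answer), rubric_score(wrong_answer))
```

In this toy case, telling the two answers apart requires knowing the actual rule, which is the kind of judgment the rubric's author never wrote down as a criterion.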

Practical Implications and Future Outlook

This analysis is not an argument for decelerating AI development. The observed capability gains are substantial. It remains possible that researchers will devise methods for closing the evaluation loop without human judgment, perhaps through sufficiently advanced synthetic data pipelines or novel self-correction mechanisms within AI models.

Yet such solutions are not available today. In the interim, we are inadvertently dismantling the human infrastructure that bridges this gap, not through deliberate policy but as a consequence of numerous individually rational economic decisions. A responsible approach to this transition treats the evaluation gap not as a problem that will resolve itself, but as an open research challenge that demands the same urgency applied to capability advances.

The very thing AI most needs from humans is the thing receiving the least attention when it comes to preservation. Whether this situation is permanent or temporary, the cost of neglecting it remains significant.

Ahmad Al-Dahle is CTO of Airbnb.

Business Style Takeaway: The automation of knowledge work, while driving efficiency, risks eroding the very human expertise needed to evaluate and advance AI systems, potentially leading to a silent degradation of deep understanding across critical fields. Businesses must proactively invest in preserving and cultivating human judgment alongside AI development to ensure long-term innovation and prevent the loss of irreplaceable specialized knowledge.

Information compiled from materials: venturebeat.com
