Eighty-eight percent of organizations now use artificial intelligence in at least one business function, yet only 39 percent report any enterprise-level EBIT impact from those deployments. That figure, drawn from McKinsey’s November 2025 global survey of nearly 2,000 executives across 105 countries, identifies a gap that has emerged as one of the defining management challenges of the current technology cycle. Adoption is near-universal at the function level; the financial results remain concentrated in a minority of organizations.

McKinsey’s analysis identified the variable separating companies capturing real returns from those running expensive pilots: whether organizations redesigned the workflows into which AI was deployed, or simply layered new tools onto old structures. AI high performers (companies attributing at least 5 percent of EBIT to AI use) are nearly three times as likely as their peers to have fundamentally restructured individual workflows. Of all the organizational factors the survey tested, that redesign showed one of the strongest statistical associations with meaningful business impact.

Phaneesh Murthy, whose career spans senior roles from Worldwide Head of Sales and Marketing at Infosys to CEO of iGATE, where he grew the company’s enterprise value from approximately $70 million to $4.8 billion over a decade, has articulated the problem in terms that cut to its managerial core. “The manager of the future does not just delegate to people,” he has written. “They design collaboration between people and machines.” That design obligation, in his framing, is now the central competency of effective leadership, and most organizations have not begun treating it with the seriousness it requires.

Why AI Adoption Isn’t Producing Enterprise-Level Returns

The McKinsey findings disrupt a widespread organizational assumption: that broader AI adoption leads naturally to better outcomes. Among the 88 percent of companies now using AI in at least one business function, roughly two-thirds remain in experimentation or piloting stages. Only around one-third have reached scaled deployment, and among those, only the organizations that simultaneously redesigned workflows, established human validation processes for AI outputs, and set growth objectives alongside efficiency targets are reporting meaningful bottom-line results.

Organizations deploying AI without restructuring the work around it generate a recognizable failure pattern. Teams produce more reports, more summaries, more initial drafts. The humans reviewing those outputs haven’t been freed from their prior responsibilities, because no one changed the role. The automation adds capacity in one part of a workflow without adjusting the load anywhere else. Output volume rises while decision quality stays flat.

Phaneesh Murthy has addressed this structural problem directly in his writing on enterprise AI. Managers who treat AI as a throughput accelerator without examining what throughput actually means in an AI-assisted environment are addressing the wrong problem. Whether the existing workflow is the right one for an AI-augmented team is the question that needs answering first, ahead of deployment decisions and tool selection.

How the Traditional Logic of Delegation Breaks Down

For most of the twentieth century, the internal logic of work allocation inside large organizations was relatively stable. Analytical tasks flowed to analysts. Reporting to coordinators. Strategic synthesis to senior staff. Experience determined who handled complexity, and tenure signaled readiness for greater autonomy. The hierarchy of delegation tracked the natural progression of skill development.

Artificial intelligence compresses this progression in ways that create genuine structural disruption. Pattern-based analysis, demand forecasting, document summarization, and first-draft report generation can now be automated or significantly assisted by intelligent systems. The work that historically occupied the lower and middle tiers of knowledge work hierarchies is precisely where AI has shown its most consistent performance advantages.

Gartner’s 2024 technology predictions projected that by 2026, approximately 20 percent of enterprises would use AI to flatten their organizational structures, eliminating more than half of current middle management positions in those companies. The rationale the firm identified: a substantial portion of middle management work consists of reading reports, synthesizing data, and translating information between organizational layers — precisely the class of tasks AI handles faster and more consistently than human intermediaries.

Work allocation now involves a prior question that most organizations have yet to build into their management processes: which parts of a given task should be handled by a machine, which should be reserved for human judgment, and how the outputs of each feed into what comes next. That design question did not exist a decade ago. For the majority of companies, it still isn’t being asked in any systematic way.

The Task-Type Distinction That Should Drive AI Workflow Design

A 2024 meta-analysis published in Nature Human Behaviour by researchers at MIT’s Center for Collective Intelligence introduced a finding that significantly complicates the prevailing enthusiasm for human-AI teaming. Drawing on 370 experimental results from 106 studies, the MIT research team found that for decision-making tasks (classifying deepfakes, forecasting demand, diagnosing medical cases), human-AI combinations frequently underperformed compared to AI systems operating alone. The addition of human judgment, in these contexts, degraded AI performance rather than improving it.

For creative tasks, the pattern reversed. Human-AI combinations outperformed both humans and AI working independently, producing outcomes neither could achieve alone. The research team, led by MIT doctoral student Michelle Vaccaro alongside professors Abdullah Almaatouq and Thomas Malone, attributed this advantage to the dual nature of creative work: it requires human knowledge and interpretive insight alongside repetitive execution work where AI consistency is an asset. Malone summarized the practical design principle directly: assign AI the background research, pattern recognition, predictions, and data analysis; then apply human judgment to spot nuances and contextual understanding that models reliably miss.

Tasks primarily involving pattern recognition and structured analysis should be configured to let AI lead, with human review focused on validation and exception-handling. Tasks requiring contextual synthesis, relationship intelligence, ethical judgment, and creative direction should center human expertise, with AI handling the preparatory and mechanical dimensions. Which approach fits depends entirely on the specific structure of the task at hand, assessed honestly and task by task.
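To make the allocation concrete, the sketch below encodes the two configurations as a simple task router: AI leads on pattern-recognition work with a human validation gate, and humans lead on creative work with AI confined to preparation. It is a minimal illustration of the design principle, not an implementation from the research; the task taxonomy, function names, and routing rules are assumptions chosen for clarity.

```python
from dataclasses import dataclass
from enum import Enum, auto


class TaskType(Enum):
    """Coarse taxonomy following the decision/creative distinction."""
    DECISION = auto()  # classification, forecasting, structured analysis
    CREATIVE = auto()  # synthesis, relationship work, creative direction


@dataclass
class Task:
    name: str
    task_type: TaskType


def allocate(task: Task) -> dict:
    """Return an illustrative allocation plan for one task.

    Decision-type work lets the AI system lead, reserving human effort
    for validating outputs and handling exceptions. Creative work keeps
    the human in the lead, with AI limited to preparatory steps.
    """
    if task.task_type is TaskType.DECISION:
        return {
            "lead": "ai",
            "human_role": "validate outputs, handle flagged exceptions",
            "handoff": "ai output -> human review -> decision",
        }
    return {
        "lead": "human",
        "ai_role": "background research, pattern scans, first drafts",
        "handoff": "ai prep -> human synthesis -> final output",
    }


if __name__ == "__main__":
    for t in (Task("demand forecast", TaskType.DECISION),
              Task("client proposal", TaskType.CREATIVE)):
        print(f"{t.name}: {allocate(t)}")
```

The point of the sketch is the shape of the handoff, not the code itself: each configuration names who leads, what the other party contributes, and how the output moves to the next step.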

Phaneesh Murthy has articulated the managerial obligation here in specific terms: “If AI is doing what humans are uniquely good at, leadership has failed to design the system correctly.” The goal is elevation of human contribution: structuring roles so that the work AI handles at scale frees human effort for the work that AI cannot do reliably. That outcome does not happen by default. It requires deliberate allocation decisions, mapped against an honest account of where each party performs best.

Rethinking Performance Metrics for AI-Augmented Teams

One of the organizational consequences of AI integration that companies consistently underestimate is the obsolescence of volume-based performance metrics. When an AI system can produce ten structured analyses in the time it previously took a team to produce one, measuring employees by analysis volume stops capturing any meaningful signal about their contribution. The competitive value shifts entirely to what people do with those analyses: the quality of interpretation, the soundness of judgment, the strategic alignment of the decisions that follow.

McKinsey’s survey data shows that high-performing AI organizations have deliberately reoriented their evaluation frameworks. Rather than output quantity, these organizations assess strategic contribution, interpretive quality, and the ability to identify where AI outputs require human correction. This shift requires explicit management action: if performance review processes still reward volume, employees will optimize for volume, and the human effort that should be migrating toward judgment and synthesis stays anchored to production.

The business case for getting this right is not abstract. Organizations that invest in workforce development alongside AI deployment, specifically in skills that complement machine capabilities rather than duplicate them, are 1.8 times more likely to report better financial results, according to Deloitte’s 2025 Human Capital Trends research, as reported by the World Economic Forum. Companies treating AI as a technology decision, without the parallel investment in human capability development and measurement redesign, are accepting a substantial performance gap.

The employee experience dimension reinforces this finding from a different angle. Teams that experience AI as a tool removing low-complexity work from their roles, creating genuine space for higher-order contribution, report greater engagement and stronger retention signals than teams experiencing AI as an output pressure added to unchanged role expectations. Murthy’s consistent advisory emphasis, expressed at Primentor through his work with technology companies and advisory board engagements, is that technology should expand what people can contribute. Making that principle operational requires pairing AI deployment with deliberate changes to how performance is defined and measured.

Passive Dependence and the Erosion of Analytical Judgment

Efficiency and dependency occupy the same space in organizations where AI review habits go unmanaged. When AI tools reliably produce accurate summaries, forecasts, and recommendations, the cognitive pressure to maintain independent analytical capacity gradually decreases. Teams begin accepting AI outputs with less scrutiny than they would apply to analysis from even a senior colleague. The habit of critical review, which generates no visible productivity metrics, quietly atrophies.

Murthy has identified this pattern with precision: accountability for organizational decisions belongs to the human leaders who act on AI outputs, irrespective of which system generated the underlying analysis. Delegation to a machine does not transfer accountability. When an AI-generated forecast proves wrong, or when an AI-drafted communication misreads a client relationship, the manager who approved the output owns the consequence. The accountability structure of an organization remains unchanged by the origin of the analysis that informs a decision.

IBM’s Phaedra Boinodiris, the company’s Global Leader for Trustworthy AI, identifies human accountability as the critical structural requirement in responsible AI governance. “We need people in funded positions of power who are held accountable for the outcomes of these models,” she stated in IBM’s 2025 analysis of the governance landscape. The structure she describes, with clear and documented lines of human responsibility attached to AI-generated outputs, is what distinguishes organizations managing AI risk deliberately from those accumulating it through uncritical adoption.

The management intervention is cultural before it is procedural. Teams explicitly encouraged to interrogate AI outputs (what a model might have missed, whether its confidence is realistic, whether its training data reflects current conditions) retain the analytical capacity that passive reliance erodes. Building a culture where questioning AI is treated as professional diligence rather than inefficiency is a managerial design choice, and one that Murthy’s framework treats as a leadership responsibility rather than a compliance exercise.

The Governance Obligation in AI-Delegated Decision-Making

When AI systems influence hiring decisions, customer communications, credit assessments, or strategic resource allocation, delegation introduces ethical obligations that cannot be addressed after problems surface. A model that produces one biased recommendation in isolation does limited harm. The same model, deployed at scale across thousands of consequential decisions, amplifies that bias proportionally. The leverage that makes AI powerful (its speed, consistency, and sheer volume of processing) is exactly what magnifies the downstream impact of a flawed underlying judgment.

Murthy addresses this in terms that locate it as a leadership problem: “Technology scales decisions. If those decisions lack ethical clarity, scale becomes dangerous.” This is the argument for embedding governance upstream of deployment rather than treating it as a corrective intervention. The EU AI Act, in force since 2024 with obligations phasing in over the following years, imposes binding requirements on organizations deploying AI in high-risk contexts (hiring, credit, and certain customer-facing applications), with penalties reaching 7 percent of global annual revenue for the most serious violations. The regulatory framework makes legally explicit what sound management practice should require regardless: documented evidence of how systems generate outputs, what their limitations are, and how consequential decisions maintain meaningful human oversight.

The current implementation gap is significant. IBM’s 2025 analysis of AI governance found that clear accountability structures remain the exception rather than standard practice, and compliance leaders across industries consistently rank AI as a top-tier risk category. The enterprise landscape is still catching up to deployment decisions already made. For managers without governance backgrounds, the practical obligation is narrower than a full compliance program: before delegating any consequential decision-making function to an AI system, understand how the system generates its outputs, what data informs it, and how errors will be identified and corrected.
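One lightweight way a manager might operationalize that narrower obligation is a decision-audit record attached to each consequential AI-assisted decision, capturing the elements named above: how the system generated its output, what data informed it, who approved it, and how an error would be caught. The sketch below is a hypothetical structure with assumed field names, not a compliance framework drawn from the IBM analysis or the EU regulation.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class DecisionAuditRecord:
    """Hypothetical record of one AI-assisted decision.

    Captures what a manager should be able to produce on demand: how
    the system generates outputs, what data informed this output, who
    is accountable for acting on it, and how errors get corrected.
    """
    decision: str                  # the business decision taken
    model_description: str         # plain-language account of how outputs are generated
    input_data_sources: list[str]  # data that informed the recommendation
    ai_recommendation: str         # what the system proposed
    human_approver: str            # the person accountable for the outcome
    override_applied: bool         # True if the approver departed from the AI output
    error_correction_path: str     # how a wrong outcome is detected and reversed
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


# Example usage with invented values:
record = DecisionAuditRecord(
    decision="approve vendor shortlist",
    model_description="ranking model over historical vendor performance",
    input_data_sources=["vendor_scorecards_2024", "contract_outcomes"],
    ai_recommendation="shortlist vendors A, C, F",
    human_approver="procurement lead",
    override_applied=False,
    error_correction_path="quarterly outcome review; flagged decisions re-run",
)
```

Even a structure this simple makes the accountability line explicit: no record can be completed without naming the human who owns the decision.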

What Designing Human-Machine Collaboration Actually Requires of Managers

The World Economic Forum’s 2025 Future of Jobs Report estimates that while AI-related automation may eliminate 92 million roles by 2030, 170 million new roles will emerge in parallel, a net addition of 78 million positions requiring fundamentally different competencies than those being displaced. The work that grows in value as pattern recognition and structured analysis shifts to machines is precisely the work that machines do least reliably: contextual synthesis, ethical reasoning, relationship management, and creative judgment.

These are the capabilities that Phaneesh Murthy has consistently argued should be the destination for human effort once AI absorbs the mechanical dimensions of knowledge work. His earlier development of the iTOPS model, which restructured IT service delivery around measurable client outcomes rather than task volumes, anticipated the measurement challenge that AI-integrated organizations now face at scale. When AI handles production, human value-add lies in the judgment applied to what AI produces. Evaluating that contribution, developing the talent required for it, and allocating work in ways that draw it out demand a clear account of what judgment actually consists of in each role.

The managerial task that AI adds is specific and demanding. Work allocation decisions now require a prior analysis of task structure: what kind of work this is, which elements benefit from AI scale and speed, which elements require human accountability and contextual intelligence, and whether the handoffs between machine-generated and human-applied work are designed to use each effectively. This analysis does not happen automatically. It requires managers who understand both the capabilities of their AI systems and the full range of capabilities within their teams.

McKinsey’s data suggests that relatively few organizations are doing this work systematically. AI high performers are the exception, companies that have made workflow redesign explicit, elevated it to a strategic priority, and established senior-level accountability for it. Murthy has put the challenge directly: the coming changes to how work gets organized are no longer hypothetical, and the open variable is whether leaders adapt their management approach in time to shape those changes rather than absorb them. For organizations still treating AI adoption as a technology deployment exercise, the answer to that question is already accumulating in their performance data.