Research · AI Transformation
Autonomy, Not Automation
Why leaders must stop treating agentic AI like RPA, and how to govern the difference between a clockwork script and a synthetic employee.
The transition from Robotic Process Automation (RPA) to agentic AI is being dangerously misunderstood as a simple upgrade in capability. It is not. It represents a fundamental shift from deterministic execution to delegated autonomy.
This paper argues that governing agentic systems with the tools designed for rigid software automation creates a critical liability gap. When an organisation deploys an agent, it is no longer issuing deterministic instructions; it is delegating authority. Recognising this shift demands a radical restructuring of enterprise economics, talent acquisition, and risk management - moving capital expenditure away from initial development and toward continuous behavioural assurance.
A linguistic drift is happening in the boardroom, and it is leading directly to a governance disaster.
Executives are desperate to contextualise AI within familiar historical bounds. They compare it to the advent of the Internet, or even the Printing Press, treating it as the next great phase of technological evolution. This is a psychological coping mechanism, and it is fundamentally, historically wrong.
As recent economic literature points out, previous General Purpose Technologies (GPTs) have almost exclusively been engines of physical power (the steam engine) or engines of distribution (the printing press, the internet). The Printing Press and the Internet share the exact same architectural constraint: they take human logic, human reasoning, and human endeavour, and they move it faster, further, or cheaper. They are deterministic tools that sit completely idle until a human provides the cognitive input.
Agentic AI is not an engine of distribution. As researchers framing the “cognitive revolution” argue, it is an engine of synthetic cognition. It does not speed up human reasoning; it generates its own. It thrives on unstructured ambiguity, prioritising options and deciding on directions without human intervention. You don't map the steps; you provide the boundaries and the objective. The system figures out the path. When the environment changes, it adapts. It isn't clockwork. It is, for all practical purposes, a synthetic employee.
Calling a reasoning engine “automation” is like calling a self-driving car a cruise control upgrade. The language masks the revolution. If you govern cognitive autonomy using the tools built for software efficiency, you will either strangle the value entirely, or expose your firm to risks you haven't even begun to measure.
RPA fails by doing nothing. Agentic AI fails by doing the wrong thing, confidently, at machine speed. Managing that requires a completely different playbook.
The brittleness of rules vs. the hallucination of autonomy
The primary failure mode of RPA is brittleness. It fails by halting. It throws an exception and logs it. It's annoying, but it's safe.
The primary failure mode of an LLM-backed agent is not halting. It is confabulation or, worse, uncontrolled adaptation. When Air Canada deployed an automated customer service chatbot, it autonomously fabricated a retroactive bereavement discount that directly contradicted the airline's actual policy. In the resulting 2024 tribunal case (Moffatt v. Air Canada), the airline attempted the extraordinary defence that the chatbot was a “separate legal entity” responsible for its own actions. The tribunal rejected this out of hand, cementing a brutal new legal precedent: when a firm delegates authority to an agentic system, it assumes total, unmitigated liability for the system's improvisations.
This changes the control environment completely. RPA testing is about exhaustive path mapping. Have we tested every branch of the logic tree? Yes? Deploy it.
Agentic testing is about behavioural bounds. You cannot write a unit test for every possible conversation a customer service agent might have. It's mathematically impossible. Instead, you have to test the agent's judgement under pressure, just exactly as you would test a human employee during a probationary period.
Delegation, not instruction
The distinction is stark: When you deploy RPA, you are giving instructions. When you deploy an agent, you are delegating authority.
Delegation requires an entirely different management architecture. Effective human managers do not operate by reviewing every single keystroke an employee makes. Instead, they manage through the establishment of clear boundaries that define what cannot be done, explicit escalation paths that dictate when an employee must pause to seek guidance, and rigorous post-action reviews to evaluate outcomes against original intent. Finally, they rely on reward mechanics to ensure the definition of a “good decision” is mutually understood.
This is exactly how agentic systems must be governed. Yet, look around. We see organisations trying to apply rigid change-advisory board (CAB) processes to prompt engineering. They are trying to pre-approve the exact words the agent will use in a live conversation. It is functionally impossible, and it destroys the very adaptive value they bought the technology for in the first place.
You are no longer writing instructions. You are delegating authority. And delegation requires trust, boundaries, and consequences.
The hidden economics of trust
RPA projects were capital-intensive to build and cheap to run. The ROI equation was simple: hours saved versus build cost. This deterministic predictability is exactly why legacy RPA providers like UiPath saw nearly 50% wiped off their market value in mid-2024, as enterprise budgets aggressively pivoted away from rigid task execution and toward generative reasoning.
Agentic systems flip this on its head. They are often astonishingly cheap to prototype. A clever developer can wire together an LLM, a search tool, and an email API over a weekend. The cost isn't in the build. The cost is in the assurance.
The new economics dictate that agentic AI shifts capital expenditure away from building automation and toward maintaining trustworthiness. Rather than paying for code, organisations must now pay for confidence. This transition requires funding continuous red-teaming and adversarial testing - essentially paying specialists to actively break the system. It demands incident response protocols that treat behavioural anomalies as serious operational risk events, rather than mere IT bugs. Furthermore, deploying autonomous systems safely necessitates robust human-in-the-loop oversight for high-stakes decisions and granular audit traceability to mathematically prove why an agent took a specific action.
If a leader tries to squeeze agentic AI into a one-off IT project budget, they are effectively choosing one of two outcomes: wildly under-controlled systems running amok, or systems that get permanently abandoned the moment they make their first unpredictable mistake.
We need behaviour engineers, not process mappers
RPA delivery is process-centric. You map the workflow, you automate the steps. That world belongs to process analysts and low-code developers.
Agentic AI delivery is behaviour-centric. The requirement to define goals, constraints, and acceptable parameters under deep uncertainty demands a radically different mix of talent. It requires AI product leaders who are accountable for commercial outcomes and risk trade-offs, rather than merely system uptime. The engineering function must evolve to include evaluation engineers who build behavioural test harnesses - acting essentially as psychologists for machines - alongside adversarial testers whose sole mandate is to deliberately induce failure before customers do. Most critically, domain experts must be embedded directly in the execution loop, because judging a “good” outcome versus a “wrong” one is now a matter of contextual nuance, not syntax.
Staff an agentic AI project like an RPA project, and you will get an RPA-shaped control environment wrapped around a wildly non-RPA system. That mismatch is exactly where compounding risk breeds.
The Autonomy Imperative
We have to stop soothing ourselves with yesterday's language. The shift from RPA to agentic AI is the shift from deterministic execution to delegated autonomy. To survive this transition, leadership teams must immediately begin by renaming the problem: classifying RPA strictly as “deterministic automation” and agentic AI as “goal-driven autonomy,” recognising that vocabulary serves as the first line of governance. With the taxonomy corrected, organisations must build rigorous behavioural assurance before attempting to scale; an inability to explicitly describe how an agent's judgement is evaluated means the firm does not have a production-grade agent, but rather a liability. Cultivating this assurance requires funding agentic systems as permanent products rather than temporary IT projects, acknowledging that trustworthiness demands ongoing investment. Finally, firms must staff for the reality of the work, moving past the era of hiring developers to build impressive demos and instead recruiting professionals who know how to evaluate, govern, and operate autonomous systems safely.
RPA taught organisations to love automation because it was legible. It looked like productivity with clean, neat edges. Agentic AI is not neat. It introduces genuine autonomy directly into the execution layer of your business.
Autonomy is the massive, untapped opportunity. But autonomy is also the hazard. It's time we started managing the difference.
About the authors
Angus Morrison is Transformation Director and Founder of Evolve Cubed. A former British Army Intelligence Officer, he has led £100m+ transformations for organisations including Lloyds Banking Group and Tesco. He holds an MSc in Digital Transformation from Henley Business School and brings 30 years of FTSE 100, private equity, and public sector experience to the governance of emerging technology.
Warren Paull is Strategic Advisor to Evolve Cubed. With 15 years building commercial systems across professional services, fintech, cryptocurrency, media and logistics, he ensures the firm's diagnostic methodology delivers measurable business outcomes — not just better thinking.
