What actually makes an AI system agentic?
“Agentic AI” entered the mainstream vocabulary in 2024, and the definition has been stretched to cover everything from a chatbot that calls a function to a fully autonomous system that plans, executes, and self-corrects over extended task sequences. The confusion is counterproductive: organisations evaluating agentic AI capabilities need a clear definition to make build-or-buy decisions, and the current marketing-driven terminology makes that difficult.
The distinction is actually straightforward. Generative AI produces output in response to input — a prompt goes in, a response comes out. The model does not take actions, does not maintain state across interactions (beyond the context window), and does not autonomously pursue a goal. Agentic AI uses a generative model as its reasoning engine but adds the capability to take actions (call APIs, query databases, execute code), maintain state across multiple steps, plan multi-step workflows toward a defined goal, and self-correct when intermediate steps produce unexpected results.
Gartner’s Predicts 2025: AI Agents report (October 2024) forecasts that agentic AI will be embedded in a significant share of enterprise software applications by 2028, up from less than 1% of applications in 2024. Venture capital trends point in the same direction: a growing share of AI startups receiving Series A funding now describe their product as “agentic” — a signal of market direction, though not all of those products meet the technical definition below. These are market-adoption forecasts, not technical maturity claims: the label “agentic” is applied more broadly in marketing than in engineering practice.
The model architecture is often the same — GPT-4, Claude, Gemini, or Llama can serve as the reasoning engine for both generative and agentic applications. The difference is not in the model but in the surrounding system: the tool interfaces, the planning loop, the state management, and the autonomy level. The technical definition below is more restrictive than the market usage — deliberately so, because deployment decisions should be based on architectural properties, not on label adoption.
What makes a system agentic
An agentic system has four properties that a purely generative system does not:
Tool use. The model can invoke external tools — APIs, databases, calculators, code interpreters, web browsers, file systems — and incorporate the tool output into its reasoning. A generative model that can search the web and use the search results in its response is exhibiting tool use. A generative model that only produces text from its pre-trained knowledge is not.
Planning. The model decomposes a high-level goal into a sequence of steps and executes them in order. The plan may be explicit (the model generates a step-by-step plan before executing) or implicit (the model decides the next action based on the current state without generating a complete plan upfront). Planning enables multi-step task completion that goes beyond single-turn generation.
Memory and state. The agent maintains context across steps — remembering what it has already done, what results previous steps produced, and what remains to be done. This state management may use the model’s context window, an external memory store, or a combination. Without persistent state, the agent cannot reason about progress toward a goal across multiple actions.
Autonomy and self-correction. The agent detects when an action produces an unexpected result and adjusts its approach. If a database query returns no results, the agent reformulates the query rather than reporting failure. If a code execution produces an error, the agent reads the error message and modifies the code. This feedback loop — act, observe, adjust — is what separates agentic behaviour from scripted automation.
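The four properties above can be sketched as a single control loop. The following is an illustrative sketch, not a real framework API: `search_db` is a stub tool, and `decide_next_action` hard-codes the reformulation behaviour that a real agent would delegate to the model.

```python
# Minimal agent loop illustrating the four properties: tool use,
# implicit planning (one step decided at a time), state, and
# self-correction. All names here are hypothetical stubs.

def search_db(query: str) -> list:
    """Stub tool: pretend database lookup."""
    return ["matching record"] if "invoice" in query else []

TOOLS = {"search_db": search_db}

def decide_next_action(goal: str, history: list) -> dict:
    """Stand-in for the model's reasoning step. A real agent would
    prompt an LLM with the goal and history; here the self-correction
    behaviour is hard-coded for demonstration."""
    if not history:
        return {"tool": "search_db", "args": {"query": "payments"}}
    last = history[-1]
    if last["result"] == []:                 # unexpected empty result:
        return {"tool": "search_db",         # reformulate rather than fail
                "args": {"query": "invoice payments"}}
    return {"tool": None, "answer": last["result"]}

def run_agent(goal: str, max_steps: int = 5):
    history = []                             # memory/state across steps
    for _ in range(max_steps):
        action = decide_next_action(goal, history)
        if action["tool"] is None:           # goal reached
            return action["answer"]
        result = TOOLS[action["tool"]](**action["args"])   # tool use
        history.append({"action": action, "result": result})  # observe
    return None                              # step budget exhausted

print(run_agent("find invoice payments"))
# -> ['matching record']
```

The act–observe–adjust loop lives in `run_agent`, not in the model: the first query returns nothing, the agent observes the empty result in its history, and the second, reformulated query succeeds.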
Where the boundary is drawn
The boundary between “generative AI with tool use” and “agentic AI” is fuzzy, and reasonable people draw it differently. Our working definition: a system is agentic when it autonomously executes multiple steps toward a goal, with the ability to branch, retry, or adapt based on intermediate results. A system that takes a single action (e.g., searching the web and summarising the results) is tool-augmented generative AI, not agentic.
This distinction matters for deployment because agentic systems introduce risks that single-turn generative systems do not:
- The agent can take a wrong action with real-world consequences (sending an email, modifying a database, executing a payment).
- The agent can enter infinite loops or take excessive actions, running up API costs.
- The agent’s decision-making is harder to audit — a 15-step reasoning chain is harder to review than a single response.
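The runaway-loop and cost risks can be bounded mechanically, outside the model’s control. A minimal sketch, with a hypothetical `Budget` class and illustrative cost figures:

```python
# Illustrative guard against runaway agents: hard caps on step count
# and cumulative cost, enforced by the host application rather than
# by the model. The Budget class and cost figures are hypothetical.

class BudgetExceeded(Exception):
    pass

class Budget:
    def __init__(self, max_steps: int, max_cost_usd: float):
        self.max_steps = max_steps
        self.max_cost_usd = max_cost_usd
        self.steps = 0
        self.cost_usd = 0.0

    def charge(self, cost_usd: float):
        """Record one agent step; raise if either cap is breached."""
        self.steps += 1
        self.cost_usd += cost_usd
        if self.steps > self.max_steps:
            raise BudgetExceeded(f"step limit {self.max_steps} exceeded")
        if self.cost_usd > self.max_cost_usd:
            raise BudgetExceeded(f"cost limit ${self.max_cost_usd} exceeded")

budget = Budget(max_steps=10, max_cost_usd=0.50)
try:
    while True:                 # a loop that would otherwise never stop
        budget.charge(0.06)     # e.g. one model call costing ~$0.06
except BudgetExceeded as e:
    print(f"Agent halted: {e}")
```

Placing the cap in the dispatch layer, rather than asking the model to police itself, means a confused agent cannot reason its way past the limit.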
McKinsey’s 2023 analysis estimated that generative AI broadly could unlock up to $4.4 trillion in annual productivity value across the economy — and agentic workflows, which automate multi-step knowledge work, represent a significant share of that opportunity. This is a market-direction estimate that explains the intensity of vendor investment, not a near-term deployment forecast.
Agentic architecture example: invoice processing pipeline
A concrete example illustrates the agentic properties in practice. Consider an invoice processing pipeline with three agents:
Intake agent — monitors an email inbox and shared drive for new invoices. It extracts the document, identifies the format (PDF, scanned image, structured EDI), and routes it to the appropriate processing path. Control boundary: the intake agent can read inbound documents and write to the processing queue, but cannot modify financial records or approve payments.
Extraction agent — receives a routed document, extracts structured fields (vendor, amount, line items, payment terms, PO number), and validates the extracted data against the vendor master and purchase order database. If extraction confidence is below 85% on any field, the agent flags the invoice for human review rather than proceeding. Failure mode: OCR errors on scanned documents produce low-confidence extractions; the agent must recognise its own uncertainty rather than hallucinating field values.
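The extraction agent’s uncertainty handling can be expressed as a simple routing rule. The field names and mocked confidence values below are illustrative; the 85% threshold follows the text:

```python
# Illustrative confidence-threshold routing for the extraction step.
# Each extracted field carries a (value, confidence) pair; any field
# below the threshold sends the whole invoice to human review rather
# than letting the agent guess at low-confidence values.

CONFIDENCE_THRESHOLD = 0.85

def route_extraction(fields: dict) -> str:
    low = [name for name, (value, conf) in fields.items()
           if conf < CONFIDENCE_THRESHOLD]
    if low:
        return f"human_review: low confidence on {', '.join(sorted(low))}"
    return "auto_validate"

extracted = {
    "vendor":    ("Acme GmbH", 0.97),
    "amount":    ("1,240.00", 0.62),   # OCR on a scanned total: uncertain
    "po_number": ("PO-8841", 0.91),
}
print(route_extraction(extracted))
# -> human_review: low confidence on amount
```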
Approval agent — matches the validated invoice against approval rules (amount thresholds, budget codes, duplicate detection) and either routes to the appropriate approver or auto-approves invoices below the threshold. Control boundary: the approval agent can route for approval and flag anomalies, but cannot execute payments — payment execution remains in the existing ERP workflow with human authorisation.
The pipeline’s failure boundaries are explicit: no single agent can both extract data and approve payment (separation of concerns), confidence thresholds force human review when the system is uncertain, and each agent’s write permissions are restricted to its specific function. These boundaries prevent the cascading failure mode where one agent’s error propagates unchecked through the entire workflow.
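The control boundaries described above can be enforced mechanically at the tool-dispatch layer, as an allow-list per agent. A minimal sketch, with agent and tool names that mirror the pipeline but are hypothetical:

```python
# Illustrative enforcement of per-agent write permissions as an
# allow-list checked at tool dispatch, not inside the model. Note that
# no agent holds execute_payment: payment stays in the ERP workflow.

PERMISSIONS = {
    "intake":     {"read_inbox", "write_queue"},
    "extraction": {"read_queue", "query_vendor_master", "flag_for_review"},
    "approval":   {"route_for_approval", "flag_anomaly"},
}

class ScopeViolation(Exception):
    pass

def dispatch(agent: str, tool: str, *args):
    """Refuse any tool call outside the agent's declared scope."""
    if tool not in PERMISSIONS.get(agent, set()):
        raise ScopeViolation(f"{agent} may not call {tool}")
    print(f"{agent} -> {tool}{args}")   # stand-in for the real tool call

dispatch("intake", "write_queue", "invoice-123.pdf")     # allowed
try:
    dispatch("approval", "execute_payment", "invoice-123")
except ScopeViolation as e:
    print(f"blocked: {e}")
```

Because the check runs in the dispatcher, a compromised or confused agent cannot broaden its own scope: an out-of-scope call fails regardless of what the model generates.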
Current agentic frameworks
The practical implementation of agentic AI uses frameworks that provide the tool use, planning, memory, and orchestration infrastructure. Enterprise adoption has accelerated rapidly: industry surveys suggest that the proportion of AI teams evaluating or piloting agentic systems grew several-fold between 2023 and 2024, driven by the availability of production-ready orchestration frameworks.
LangChain / LangGraph provides a composable framework for building agentic workflows with tool use, state management, and conditional branching. LangGraph extends LangChain with explicit graph-based workflow definitions, enabling complex multi-step agents with defined control flow.
AutoGen (Microsoft) provides a multi-agent framework where multiple AI agents with different roles collaborate on tasks. The agents communicate through structured messages, with each agent specialising in a subset of the task (e.g., a “coder” agent, a “reviewer” agent, a “planner” agent).
CrewAI provides a role-based multi-agent framework focused on defining agent “crews” with specific roles, goals, and tools. The framework manages the agent coordination and task delegation.
OpenAI Assistants API / Anthropic Tool Use provide built-in agentic capabilities at the model API level — tool calling, code interpretation, and file retrieval as native API features rather than external framework layers.
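Across all of these frameworks, tool use starts from a declarative tool definition in a JSON-Schema style. The shape below is representative rather than any one provider’s exact format (OpenAI, for instance, nests the definition under a "function" key), and the tool name and fields are hypothetical:

```python
# A representative tool definition in the JSON-Schema style used by
# model-level tool-calling APIs. Exact field names vary by provider;
# this shape is illustrative only.

lookup_invoice_tool = {
    "name": "lookup_invoice",
    "description": "Fetch an invoice record by its purchase order number.",
    "input_schema": {
        "type": "object",
        "properties": {
            "po_number": {
                "type": "string",
                "description": "Purchase order number, e.g. PO-8841",
            }
        },
        "required": ["po_number"],
    },
}

# The model responds with a structured call such as
# {"name": "lookup_invoice", "input": {"po_number": "PO-8841"}};
# the host application executes the matching function and feeds the
# result back to the model as the next turn in the loop.
```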
The multi-agent coordination patterns that emerge from these frameworks are an active area of development — the architectures are not yet standardised, and best practices are still consolidating.
When agentic AI is appropriate — and when it is not
Agentic AI is appropriate when the task requires multiple autonomous steps, tool use, and adaptive decision-making: research tasks (gathering information from multiple sources, synthesising findings), workflow automation (processing multi-step business processes that require judgment), code generation and debugging (writing code, testing it, and fixing errors iteratively), and complex data analysis (querying multiple data sources, combining results, and interpreting findings).
Agentic AI is not appropriate when the task is a single-step generation (write a marketing email), when the error cost of autonomous action is too high (the agent should not autonomously modify production databases without human approval), or when the task is well-defined enough that traditional automation (scripts, workflow engines, rule-based systems) can handle it more reliably and cheaply.
We advise clients to evaluate whether the task genuinely requires the adaptive multi-step capability that agentic systems provide — or whether a simpler approach (a generative model with a single tool call, or a traditional automation pipeline) would achieve the same result with less complexity and lower risk. This determination is part of a broader GenAI feasibility assessment.