AI Agents: What They Actually Are and What They Can Do for Your Business
Everyone is talking about AI agents. Most of what you hear is vendor marketing. This guide explains what agents actually are, what works today in consumer-facing businesses, what does not, and how to think about them for your programme.

A note on currency. This article was published in March 2026. The AI agent landscape is moving at least as fast as the model landscape, possibly faster. New frameworks, platforms, and vendor offerings appear weekly. The principles and decision frameworks in this article will hold, but treat specific vendor capabilities, pricing, and adoption figures as a snapshot. Verify directly before making purchasing decisions.
What an agent actually is
An AI agent is not a chatbot. It is not an automated workflow. It is something genuinely different, and the distinction matters.
A chatbot takes a question, sends it to a large language model, and returns the answer. It is a single turn: question in, answer out. An automated workflow follows a predefined sequence of steps with no decision-making at runtime. Both are useful. Neither is an agent.
An agent is an LLM that can use tools and operate in a loop. It receives a goal, decides what to do, takes an action (calling an API, querying a database, sending a message), observes the result, and then decides what to do next. It keeps going until the goal is achieved or it determines it cannot proceed.
Think of it this way. A chatbot is like asking a colleague a question and getting an answer. An automated workflow is like a conveyor belt that moves items through fixed stations. An agent is like giving a competent team member a task and letting them figure out the steps, use the tools available, and come back when it is done.
The three components that make something an agent:
- A reasoning engine (the LLM). This is what decides what to do next. It interprets the goal, evaluates the current state, and chooses the next action.
- Tools. These are the things the agent can do: search a database, call an API, read a document, send an email, update a record. Without tools, it is just a chatbot.
- A loop. The agent acts, observes the result, and decides what to do next. This iterative cycle is what separates agents from single-turn interactions.
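The three components can be sketched in a few lines. This is a minimal illustration, not any vendor's API: `decide_next_action` is a stub standing in for a real LLM call, and the tools are toy functions.

```python
# Minimal agent: reasoning engine + tools + loop.
# `decide_next_action` is a stub standing in for a real LLM call.

TOOLS = {
    "lookup_order": lambda order_id: {"order_id": order_id, "status": "shipped"},
    "finish": lambda summary: summary,
}

def decide_next_action(goal, history):
    """Stub for the LLM reasoning step: choose the next tool and its argument."""
    if not history:                       # first step: gather information
        return ("lookup_order", "A1001")
    last_result = history[-1][1]          # otherwise: use what we observed
    return ("finish", f"Order {last_result['order_id']} is {last_result['status']}.")

def run_agent(goal, max_steps=5):
    history = []
    for _ in range(max_steps):            # the loop: act, observe, decide again
        tool, arg = decide_next_action(goal, history)
        result = TOOLS[tool](arg)
        history.append((tool, result))
        if tool == "finish":              # goal achieved
            return result
    return "Stopped: step budget exhausted."

print(run_agent("What is the status of order A1001?"))
```

Note the `max_steps` budget: production agents always need a hard stop so a confused loop cannot run forever.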
This is not science fiction. It is production software running in enterprises today. According to recent industry data, 57% of companies already have AI agents in production, with another 22% in pilot.
The spectrum of autonomy
Not all agents are equal. There is a spectrum from human-assisted to fully autonomous, and where you sit on that spectrum has profound implications for risk, cost, and value.
| Level | Description | Human Role | Example |
|---|---|---|---|
| Copilot | Suggests actions, human executes | Human decides and acts | GitHub Copilot suggesting code, email draft suggestions |
| Assistant | Takes actions with human approval | Human reviews and approves | Customer service agent drafting a response for an operator to send |
| Semi-autonomous | Acts independently within guardrails, escalates exceptions | Human handles edge cases | Returns processing agent that auto-approves standard returns, escalates unusual ones |
| Fully autonomous | Acts independently end-to-end | Human monitors outcomes | Inventory reorder agent that monitors stock levels and places purchase orders |
Most production deployments today sit at the assistant and semi-autonomous levels. Fully autonomous agents exist but are typically limited to low-risk, well-defined tasks where the cost of error is small. The marketing often implies full autonomy. The reality is usually human-in-the-loop.
Gartner forecasts that 40% of enterprise applications will embed task-specific AI agents by 2026, up from less than 5% in 2025. That is rapid growth, but note the qualifier: task-specific. These are agents doing defined jobs, not general-purpose digital workers.
What works today in consumer-facing businesses
Here is where agents are delivering real value right now, in retail, hospitality, and direct-to-consumer operations.
Customer service triage and resolution
This is the most mature use case. AI agents handle first-line customer enquiries, resolve straightforward issues (order status, delivery tracking, FAQ responses), and route complex cases to human agents with full context. Wyndham Hotels deployed AI agents that cut brand review time by 94% and significantly improved guest support response times. In hospitality more broadly, AI agents are automating up to 80% of guest enquiries, with some platforms reporting 66% of enquiries resolved without human involvement.
The key: these agents work because customer service has well-defined inputs (customer messages), clear success criteria (issue resolved), and an obvious escalation path (hand to a human). The failure mode is manageable.
Order management and returns processing
Agents that can look up an order, check return eligibility against policy, generate a return label, and process a refund. This is a natural fit because the rules are codifiable, the data is structured, and the volume is high enough to justify automation. Semi-autonomous agents handle standard cases and escalate exceptions like damaged goods or out-of-policy requests.
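The codifiable part of that workflow can be made concrete. A sketch of the triage logic, with invented policy values (a 30-day window and a £100 auto-refund limit are illustrative, not recommendations):

```python
from datetime import date, timedelta

# Illustrative return policy: the window and limit are invented for this sketch.
RETURN_WINDOW_DAYS = 30
AUTO_REFUND_LIMIT = 100.00

def triage_return(order_date, item_price, condition, today=None):
    """Decide whether a return is auto-approved or escalated to a human."""
    today = today or date.today()
    if condition != "unused":
        return "escalate: non-standard condition"       # e.g. damaged goods
    if today - order_date > timedelta(days=RETURN_WINDOW_DAYS):
        return "escalate: outside return window"        # out-of-policy request
    if item_price > AUTO_REFUND_LIMIT:
        return "escalate: above auto-approval limit"
    return "auto-approve: generate label and refund"

print(triage_return(date(2026, 3, 1), 49.99, "unused", today=date(2026, 3, 10)))
```

The agent's job is to extract `order_date`, `item_price`, and `condition` from the customer's message and the order record; the decision itself stays deterministic and auditable.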
Inventory and demand queries
Internal-facing agents that let merchandisers, store managers, or operations teams ask natural language questions about stock levels, sell-through rates, and demand patterns. Instead of running reports or navigating BI tools, the team asks the agent and gets an answer. This is closer to the assistant end of the spectrum — the agent retrieves and synthesises information rather than taking actions.
Scheduling and workforce management
Agents that assist with shift scheduling, taking into account availability, labour rules, demand forecasts, and employee preferences. These typically operate as assistants — proposing schedules for manager approval rather than publishing them autonomously. The value is in reducing the hours spent on schedule construction, not in removing human judgement.
Content generation and personalisation at scale
Product descriptions, email campaigns, social media content, and localised marketing materials generated by agents that pull product data, brand guidelines, and customer segment information to produce tailored content. Migros, the Swiss retailer, demonstrated this approach with AI-generated product descriptions at scale. The agent model extends this by connecting directly to product databases and publishing systems.
Internal knowledge retrieval
Agents that sit on top of company policies, training materials, and operational documentation. Staff ask questions in natural language and get answers grounded in actual company documents. This is particularly valuable in hospitality and retail where staff turnover is high and policy knowledge is uneven.
What does not work yet and why
The gap between demo and production is real. Here is where agents consistently fall short.
Anything needing nuanced judgement in ambiguous situations
An agent can process a standard return. It struggles when a loyal customer with a complex history has a complaint that requires reading between the lines, making a commercial judgement call, and preserving the relationship. Agents are pattern-matchers, not relationship managers. They handle the predictable middle of the bell curve well. The tails — the unusual, the ambiguous, the emotionally charged — still need humans.
Multi-system orchestration with poor APIs or legacy systems
Agents are only as good as the tools they can access. If your order management system has a robust API, an agent can query and update it. If your workforce management tool requires screen-scraping or manual CSV uploads, the agent is stuck. Most consumer-facing businesses run a patchwork of systems, many of which were not designed for programmatic access. The agent bottleneck is often not the AI — it is the integration layer.
Customer-facing agents where brand risk is high
Every retailer and hospitality brand remembers the early chatbot disasters: agents offering absurd discounts, making offensive statements, or confidently providing wrong information. Customer-facing agents carry brand risk. The more autonomous the agent, the higher the risk. For premium brands where every interaction matters, the tolerance for error is close to zero, which pushes you back towards the assistant model with human review.
Anything requiring real-time data that is not API-accessible
An agent cannot tell a customer whether an item is in stock at their local store if the stock system does not expose that data via an API. IDC estimates that by 2027, 80% of agentic AI use cases will require real-time, contextual data access. Most organisations are not there yet. Data infrastructure is the unglamorous prerequisite that vendor demos gloss over.
The demo-to-production gap
A demo shows an agent handling a single, well-crafted scenario. Production means handling thousands of messy, unpredictable interactions simultaneously. The agent that works beautifully in a controlled demo may fail at scale due to latency, cost, edge cases, or simple reliability issues. Nearly 60% of hospitality firms express concerns about trust and accuracy of AI systems, and those concerns are well-founded for anyone who has tried to scale an agent from pilot to production.

The vendor landscape
The market for AI agents splits into five broad categories. The right one depends on your size, technical capability, and how much control you need.
Platform agents
These are AI agents built into the enterprise platforms you already use.
Salesforce Agentforce is the most visible. Salesforce reports it as their fastest-growing product ever, with 18,500 deals closed in 2025 (9,500 paid), 330% ARR growth year-over-year, and customers in production jumping 70% quarter-over-quarter. The proposition: agents that work natively with your Salesforce data, handling customer service, sales development, and commerce tasks.
Microsoft Copilot Studio lets organisations build agents on top of the Microsoft 365 and Dynamics ecosystem. Recent updates focus heavily on governance — agent lifecycle management, evaluations, and enterprise controls. Microsoft introduced Agent 365 as a unified control plane for managing agents across the organisation.
ServiceNow is pushing autonomous agents for IT service management, with their L1 Service Desk AI Specialist shipping in Q2 2026. They claim these agents handle more than 90% of employee IT requests and resolve them 99% faster than human handling. The broader play is agents across HR, facilities, and operations workflows.
When to choose platform agents: When you are already invested in the ecosystem, the use case aligns with what the platform supports, and you want the fastest path to production with managed governance.
Desktop and knowledge work agents
A newer category that brings agent capability to individual knowledge workers rather than requiring enterprise deployment.
Claude Cowork is Anthropic's entry here, launched in January 2026. It brings the agentic architecture from Claude Code (the developer tool) to general knowledge work. Cowork can read and write local files, break complex tasks into sub-agents, generate Excel spreadsheets with working formulas, build presentations, and connect to Google Drive, Gmail, DocuSign, and other services. Microsoft has integrated Claude Cowork into Copilot as part of a new E7 licensing tier. For a small business, this is one of the fastest routes from "we should use AI" to "AI is doing useful work" — no integration project required.
When to choose desktop agents: When you want to augment individual productivity without building infrastructure. Particularly useful for small teams where the bottleneck is knowledge work (reports, analysis, document preparation) rather than high-volume transactional processing.
No-code and low-code agent platforms
For businesses without dedicated engineering teams, workflow automation platforms now offer genuine agent capabilities.
n8n is the standout for teams with any technical inclination. It is open source, self-hostable (important for data privacy), and its AI Agent nodes support multi-step reasoning workflows that go well beyond simple automation. You can build an agent that receives a customer enquiry, classifies it, looks up the order, drafts a response, and routes exceptions to a human — all in a visual workflow builder. Because it is self-hosted, your data stays on your infrastructure. Self-hosting is free; cloud plans start at $24/month.
Make.com (formerly Integromat) takes a fully managed approach with a visual builder, pre-built AI integrations, and shareable agents. Less technical flexibility than n8n, but lower barrier to entry. It is building towards conversational workflow creation and visual multi-agent orchestration. Plans from $10/month.
Zapier has added AI agent capabilities to its established automation platform. If you already use Zapier for simple automations, the upgrade path to agents is natural. Less powerful than n8n for complex agent logic, but the ecosystem of 6,000+ app integrations is unmatched.
For most small and mid-size consumer-facing businesses, this category is where agent adoption will actually start. A typical stack — n8n or Make plus an LLM API — costs £200-500/month and can automate workflows that would otherwise require additional hires.
When to choose no-code platforms: When you do not have a development team, your use cases involve connecting existing SaaS tools, and you need to move fast on a modest budget.
Vertical-specific agents
Specialist vendors building agents for specific industries. In hospitality: Conduit, Canary Technologies, and others building guest communication agents, revenue management assistants, and operations copilots. In retail: AI shopping assistants, inventory optimisation agents, and merchandising copilots from vendors like Vue.ai, Lily AI, and others.
When to choose vertical agents: When your use case is industry-specific and the vendor has domain expertise and pre-built integrations with the systems you use.
Build-your-own frameworks
For organisations with engineering capability, open-source and commercial frameworks let you build custom agents.
| Framework | Strengths | Considerations |
|---|---|---|
| LangChain / LangGraph | Largest ecosystem (47M+ PyPI downloads), model-agnostic, production-grade with state management and human-in-the-loop | 1-2 week learning curve, complexity can be overkill for simple agents |
| CrewAI | Fast prototyping, role-based multi-agent collaboration, first-class MCP support | Less mature for production-scale deployments |
| OpenAI Agents SDK | Lowest barrier to entry, handoffs and guardrails in under 100 lines of code | Vendor-locked to OpenAI models |
| Anthropic Agent SDK | Strong tool-use capabilities, works with Claude models | Newer, smaller ecosystem |
| Microsoft AutoGen / Semantic Kernel | Enterprise-grade, strong Microsoft integration | Heavier frameworks, steeper learning curve |
A notable development: Anthropic's Model Context Protocol (MCP) is emerging as the standard for how agents connect to external tools and data sources. Frameworks that support MCP gain access to a growing ecosystem of pre-built integrations, similar to how REST APIs standardised web service communication. If you are building custom agents, MCP compatibility should be on your checklist.
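What standardisation looks like in practice: an MCP server advertises each tool as a name, a description, and a JSON Schema for its inputs, which is how an agent discovers what it can call. A minimal illustration (the `check_stock` tool and its fields are invented for this sketch):

```python
import json

# An MCP server describes each tool it exposes as name + description
# + a JSON Schema for inputs. The "check_stock" tool here is invented.
tool_descriptor = {
    "name": "check_stock",
    "description": "Return the current stock level for a SKU at a store.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "sku":      {"type": "string"},
            "store_id": {"type": "string"},
        },
        "required": ["sku", "store_id"],
    },
}

# Agents list descriptors like this, then invoke the tool with arguments
# shaped by the schema. Any MCP-aware agent can use any MCP server's tools.
print(json.dumps(tool_descriptor, indent=2))
```

Because the description and schema are machine-readable, the same tool works with any MCP-compatible framework without bespoke integration code, which is the point of the standard.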
A word on OpenClaw. OpenClaw (formerly Clawdbot/Moltbot) is the open-source autonomous agent that went viral in late 2025, amassing 247,000 GitHub stars by March 2026. It runs locally, connects to any LLM (Claude, GPT, DeepSeek), and is accessed via messaging platforms like WhatsApp, Signal, or Telegram. The appeal is obvious: a personal AI agent that can take real actions on your behalf, running on your own hardware. The creator, Peter Steinberger, joined OpenAI in February 2026 and the project is transitioning to an independent foundation.

The caution: OpenClaw gives an agent broad access to your systems with minimal guardrails by default. For personal experimentation, it is fascinating. For business use, you need to think carefully about what actions the agent can take, what data it can access, and what happens when it gets something wrong. The governance and accountability frameworks that enterprise platforms build in are largely absent here. Treat it as a glimpse of where things are heading, not as a production business tool — yet.
When to build your own: When your use case does not map to existing platform or vertical agents, you have engineering capability, and you need full control over the agent's behaviour, cost, and data handling.
Choosing your entry point
The right starting point depends on your size and technical capability, not on ambition.
If you are a small business (under 50 people, no dev team): Start with Claude Cowork for individual productivity or n8n/Make for workflow automation. You can have agents doing useful work within a week for under £500/month. Do not try to build custom agents — you will burn time and money.
If you are a mid-size business with some technical resource: Evaluate whether your existing platforms (Salesforce, Microsoft 365, ServiceNow) already offer agent capabilities. If they do, start there — the integration work is already done. If not, n8n with an LLM API gives you significant flexibility without a full engineering project.
If you are a larger business with engineering capability: You have the full spectrum available. Platform agents for standard use cases, vertical agents for industry-specific needs, and custom-built agents (using LangGraph, CrewAI, or similar) for genuinely differentiated applications. Most will use a mix.
The common mistake is over-engineering the first deployment. A Make.com workflow that routes customer enquiries to the right team and drafts responses is an agent in practical terms, and it can be built in an afternoon. Start there, prove the value, then invest in more sophisticated approaches.
Cost reality
Agents are more expensive to run than chatbots, and the difference is not marginal.
A simple chatbot makes one LLM call per user message. An agent might make 5-15 LLM calls per task: one to understand the request, one to decide which tool to use, one to formulate the tool call, one to interpret the result, one to decide the next step, and so on. Each call costs tokens.
Here is what that looks like in practice:
| Scenario | LLM Calls Per Task | Estimated Token Usage | Monthly Cost (10,000 tasks) |
|---|---|---|---|
| Simple chatbot (FAQ) | 1 | ~1,000 tokens | $50-200 |
| Customer service agent | 3-5 | ~3,000-5,000 tokens | $300-1,500 |
| Order management agent | 5-8 | ~5,000-8,000 tokens | $500-3,000 |
| Multi-step research agent | 8-15 | ~10,000-20,000 tokens | $1,000-8,000 |
These are LLM API costs only. Add infrastructure, monitoring, development, and maintenance, and production agent costs typically run $3,200-$13,000 per month for a real deployment serving real users.
The smart approach mirrors model selection: use cheap, fast models (GPT-5 Nano, Claude Haiku, Gemini Flash) for simple reasoning steps, and route complex decisions to more capable models. A well-designed routing strategy can cut API costs by 60-80% compared to using a frontier model for every call.
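The routing idea is simple enough to sketch. Model names, per-token prices, and the complexity classification below are placeholders for illustration; check current vendor pricing before building a business case on numbers like these.

```python
# Route each reasoning step to a model tier by complexity.
# Model names and per-1M-token prices are placeholders, not real pricing.
MODELS = {
    "cheap":    {"name": "small-fast-model", "usd_per_1m_tokens": 0.30},
    "frontier": {"name": "frontier-model",   "usd_per_1m_tokens": 15.00},
}

def pick_model(step):
    """Simple steps (classification, extraction, formatting) go to the cheap tier."""
    simple = {"classify", "extract", "summarise", "format"}
    return MODELS["cheap"] if step in simple else MODELS["frontier"]

def estimate_task_cost(steps, tokens_per_step=1000):
    """Estimated LLM cost for one agent task under this routing policy."""
    return sum(
        pick_model(s)["usd_per_1m_tokens"] * tokens_per_step / 1_000_000
        for s in steps
    )

# A five-step task where only the planning step needs the frontier model:
steps = ["classify", "extract", "plan", "format", "summarise"]
routed = estimate_task_cost(steps)
all_frontier = len(steps) * MODELS["frontier"]["usd_per_1m_tokens"] * 1000 / 1_000_000
print(f"routed: ${routed:.4f} per task vs all-frontier: ${all_frontier:.4f}")
```

In this toy example, routing four of five steps to the cheap tier cuts the per-task cost by roughly 78%, which is why the 60-80% savings figure is plausible when most steps are mechanical.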
But cost is not just about API calls. The 2026 reality is that enterprises are discovering significant new line items on cloud bills from agentic AI. If you are planning agent deployments, build cost modelling into your business case from day one, not as an afterthought.
Risk and governance
Agents introduce risks that are qualitatively different from traditional software, and most organisations are not yet equipped to manage them.
Accountability when agents make mistakes
When an agent makes a wrong decision — refunding a fraudulent return, sending incorrect information to a customer, placing an order with the wrong supplier — who is accountable? The team that built it? The team that deployed it? The vendor? This is not a theoretical question. It needs answering before you put an agent into production, and the answer needs to be written down and agreed.
The human-in-the-loop question
How much autonomy to give an agent is a risk decision, not a technology decision. The framework is straightforward:
- Low-risk, high-volume, reversible actions: More autonomy. Auto-approve standard returns, auto-respond to common enquiries, auto-update stock records.
- High-risk, irreversible, or customer-facing actions: Human approval. Issuing large refunds, sending communications to customers, making purchasing commitments.
The mistake most organisations make is treating this as binary — either the agent is autonomous or it is not. In practice, the same agent should have different autonomy levels for different actions within the same workflow.
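One way to make per-action autonomy concrete is an explicit policy table the agent consults before acting. The action names, modes, and the £50 threshold below are illustrative, not a recommended policy:

```python
# Per-action autonomy within a single workflow.
# Action names, modes, and thresholds are illustrative only.
AUTONOMY_POLICY = {
    "respond_to_faq":      {"mode": "auto"},                          # reversible, low-risk
    "update_stock_record": {"mode": "auto"},
    "issue_refund":        {"mode": "auto_below", "limit_gbp": 50.0}, # autonomous below threshold
    "send_customer_email": {"mode": "approve"},                       # always human-approved
}

def requires_approval(action, value_gbp=0.0):
    """Return True if this action must be routed to a human first."""
    rule = AUTONOMY_POLICY[action]
    if rule["mode"] == "auto":
        return False
    if rule["mode"] == "auto_below":
        return value_gbp > rule["limit_gbp"]
    return True  # "approve": always escalate

print(requires_approval("issue_refund", 20.0))   # small refund: autonomous
print(requires_approval("issue_refund", 200.0))  # large refund: human approval
```

The useful property is that autonomy becomes a reviewable configuration rather than something buried in prompts, so risk owners can adjust thresholds without touching the agent itself.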
Data access and security
An agent with tool access can do things. That is the whole point. But an agent that can query your customer database, update order records, and send emails has a much larger blast radius than a chatbot that can only generate text. Every tool you give an agent is an attack surface. Principle of least privilege applies: give agents access only to the tools and data they need for their specific task, nothing more.
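Least privilege translates directly into how you wire up tools: each agent receives an explicit allowlist rather than ambient access to everything. A minimal sketch with invented tool names:

```python
# Least privilege: each agent gets an explicit allowlist of tools,
# never the full registry. Tool names and bodies are illustrative.
ALL_TOOLS = {
    "read_order":   lambda order_id: f"order {order_id}",
    "update_order": lambda order_id: f"updated {order_id}",
    "send_email":   lambda recipient: f"emailed {recipient}",
}

def make_toolbox(allowed):
    """Expose only the allowlisted tools to a given agent."""
    return {name: ALL_TOOLS[name] for name in allowed}

# A read-only enquiry agent cannot update records or send email,
# no matter what its prompt or its reasoning decides.
enquiry_agent_tools = make_toolbox({"read_order"})

print(sorted(enquiry_agent_tools))
```

Enforcing the boundary in the tool layer rather than in the prompt matters: a prompt instruction can be talked around, a missing function cannot be called.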
EU AI Act implications
The EU AI Act becomes fully enforceable on 2 August 2026. For autonomous agents, two requirements are particularly relevant:
- High-risk classification. If your agent operates in areas the Act classifies as high-risk (which includes certain employment, essential services, and law enforcement contexts), it faces mandatory requirements for data governance, automated logging, technical documentation, and human oversight.
- Human override. The Act explicitly requires that high-risk AI systems include a mechanism allowing a human to intervene or shut the system down safely. Fully autonomous agents without override capabilities are not compliant.
Even if your agents do not fall into high-risk categories, the Act requires transparency — users must be informed when they are interacting with an AI system. If you operate in the EU or serve EU customers, build compliance into agent design from the start. Only one in five companies currently has a mature model for AI governance. Do not be in the other four.
What this means for your programme
If you are running or planning an AI transformation in a consumer-facing business, here is how to think about agents.
Start internal before going external
Internal-facing agents (knowledge retrieval, scheduling assistance, inventory queries, reporting) carry lower risk than customer-facing ones. They let you build organisational capability, understand the technology, and develop governance frameworks before putting agents in front of customers. Most successful agent programmes follow this sequence.
Pick use cases where failure is low-cost and recoverable
The best first agent use case is one where the agent handles a high volume of repetitive, well-defined tasks, where errors are easily spotted and corrected, and where the cost of a single mistake is small. Returns processing, internal FAQ, and data retrieval are good starting points. Pricing decisions, legal compliance, and crisis communications are not.
Instrument everything
You need to see what the agent is doing: every reasoning step, every tool call, every decision point. This is not optional. Without observability, you cannot debug failures, optimise performance, manage cost, or demonstrate compliance. Build logging and monitoring into the agent from day one. Microsoft, Salesforce, and ServiceNow are all investing heavily in agent observability for good reason.
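The minimum viable version of this is one structured record per agent action, emitted as JSON lines that any monitoring pipeline can ingest. The field names below are illustrative, not a standard schema:

```python
import json
import time

# One structured record per agent action, so runs can be replayed,
# debugged, and costed. Field names are illustrative, not a standard.
def log_step(log, run_id, step, tool, tool_input, result):
    """Append one structured record for a single agent action."""
    log.append({
        "run_id": run_id,
        "step": step,
        "timestamp": time.time(),
        "tool": tool,
        "input": tool_input,
        "result_summary": str(result)[:200],  # truncate large payloads
    })

run_log = []
log_step(run_log, "run-42", 1, "lookup_order", {"order_id": "A1001"}, {"status": "shipped"})
log_step(run_log, "run-42", 2, "finish", {}, "Order A1001 is shipped.")

# Emit as JSON lines for whatever log pipeline you already run.
for record in run_log:
    print(json.dumps(record))
```

With a `run_id` on every record you can reconstruct a full trace of any task after the fact, which is exactly what you need when a customer asks why the agent did what it did.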
Plan for human escalation paths
Every agent needs a clear, well-designed path to hand off to a human. This means the human receives full context (not just "a customer needs help"), the handoff is seamless from the customer's perspective, and the system tracks what gets escalated so you can improve the agent over time.
Do not build agents where a simple workflow would do
Not everything needs an LLM reasoning in a loop. If the logic is deterministic — if the same input always produces the same output — a traditional automated workflow is cheaper, faster, and more reliable than an agent. Agents add value when the task requires judgement, interpretation, or dynamic decision-making. For everything else, a well-built integration or workflow automation is the better choice.
Budget realistically
Factor in not just development costs but ongoing operational costs: LLM API usage, infrastructure, monitoring, maintenance, and the human time required for oversight and escalation handling. Agent deployments that look compelling in a business case built on demo-day assumptions often look different when real-world usage, edge cases, and operational overhead are factored in.
Further reading in the playbook
- Section 5: Technology Landscape covers how to evaluate and select AI tools, including agent platforms, for your programme
- Section 7: Experimentation provides the framework for testing agent use cases before committing at scale
- Section 8: Risk and Governance addresses the governance frameworks you need before deploying autonomous systems
- Section 9: Ethics covers the ethical considerations of AI autonomy, transparency, and accountability
Sources
- The State of AI in the Enterprise 2026 (Deloitte)
- AI Agent Statistics: Adoption and Insights 2026 (Warmly)
- AI Agent Adoption Statistics by Industry 2026 (Salesmate)
- G2 Enterprise AI Agents Report 2026 (G2)
- Agentforce Becomes Salesforce's Fastest-Growing Product (CX Today)
- Salesforce Q3 FY26 Results (Digital Commerce 360)
- 6 Core Capabilities to Scale Agent Adoption in 2026 (Microsoft)
- ServiceNow Agentic AI 2026 Guide (Kellton)
- Wyndham Boosts Speed and Service with AI Agents (PwC)
- AI for Hotels: Best Use Cases in 2026 (Conduit)
- IDC: Agentic AI Will Redefine Travel and Hospitality in 2026 (IDC)
- AI Agents, Data Governance and Workforce Shifts Redefine Retail in 2026 (BizTech Magazine)
- AI Agent Frameworks Compared 2026 (Arsum)
- LangGraph vs CrewAI vs OpenAI Agents SDK 2026 (Particula)
- EU AI Act 2026 Compliance Guide (Secure Privacy)
- How AI Agents Are Governed Under the EU AI Act (The Future Society)
- The $400M Cloud Leak: Why 2026 is the Year of AI FinOps (Analytics Week)
- Agentic AI Explained (MIT Sloan)
- Introducing Cowork (Anthropic)
- First Impressions of Claude Cowork (Simon Willison)
- Microsoft Copilot Cowork Integrates Anthropic's Claude (GeekWire)
- From Clawdbot to Moltbot to OpenClaw (CNBC)
- What Is OpenClaw? (DigitalOcean)
- n8n AI Agents (n8n)
- AI Agents in No-Code Automation Platforms Comparison (Made by Agents)
- Agentic AI for Small Business: Integration Guide 2026 (Digital Applied)