AI Agents
Last updated: June 1, 2026
AI agents are the shift from artificial intelligence that answers to artificial intelligence that acts. Most explanations stop at the first-order definition: an AI agent is software that perceives, plans, uses tools, and works toward a goal. That definition is useful, but it is not enough. The second-order question is what changes when AI can act. The third-order question is what humans should delegate when action becomes cheap, scalable, and partly autonomous.
This page makes the argument in three movements: first, what an AI agent is; second, why the agent stack matters more than the model alone; third, how agentic AI turns individual agents into an operating layer for work, commerce, software, and institutions. A fourth thesis then reads agentic history as a mirror of human history, and the page ends where the word agentic began: with human agency.
Editorial source note: this guide combines primary AI-agent literature, dated product releases, modern agent infrastructure documents, failure records, and original museum interpretation. Definitions are cross-checked against the AI Agent Taxonomy, AI Agent History Timeline, and Primary Sources Library. Corrections and missing primary sources can be sent to curator@agentichistory.org.
On this page
- Evidence standard for this guide
- Thesis 1: An AI agent is software that can act
- So what: from answers to consequences
- Thesis 2: power comes from tools, memory, and permissions
- So what: the real agent is the runtime
- Thesis 3: agentic AI is the operating layer
- So what: the unit is the delegated workflow
- Agentic history as a mirror of human history
- Human agency is the boundary of agentic AI
Evidence Standard for This AI Agents Guide
How This Page Uses Sources
Primary Sources Before Marketing Claims
This page is an interpretive guide, not a marketing glossary. It uses primary papers and technical references for definitions, dated product releases for modern agent milestones, failure records for risk claims, and clearly labeled museum interpretation for the human-agency argument. When a claim is definitional, it should connect to the taxonomy or library. When a claim is historical, it should connect to the timeline. When a claim is about operational risk, it should connect to incidents, governance sources, or the Failure Archive.
How We Separate Definition from Interpretation
Technical Claim, Operational Claim, Human Claim
The first-order claim is technical: an agent perceives, decides, and acts toward a goal. The second-order claim is operational: action changes risk because an agent can affect external systems. The third-order claim is human and institutional: delegation changes accountability. This structure is intentional, so readers can separate accepted agent definitions from the museum's broader argument about human agency.
What Would Change This Page
Material Update Threshold
This page should be updated when a stronger primary definition emerges, when a major agent capability changes the practical boundary of delegation, when new public failures clarify risk, or when the museum revises its interpretation of agentic AI. Update dates and schema dates should change only when the page has materially changed.
Thesis 1: An AI Agent Is Software That Can Act
AI Agent Definition
Formal Definition
An AI agent is a software system that can observe some context, decide what action to take, and execute that action toward an objective. In older AI literature, the language was perception, environment, action, and goals. In modern LLM systems, the practical stack is usually a model connected to tools, memory, planning logic, permissions, and feedback.
Classical Agents and LLM Agents
This is why the phrase "AI agent" is broader than "LLM agent." A thermostat can be described as a simple agent in classical AI terms; a reinforcement-learning system is an agent in an environment; a BDI system is an agent with beliefs, desires, and intentions; a modern LLM agent is an agent whose reasoning engine is a language model and whose actions are tool calls, API calls, code execution, browser actions, or computer use. For the formal classification, see the AI Agent Taxonomy.
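The thermostat case can be written out as a minimal sketch of the perceive, decide, act pattern. The class name, the tolerance value, and the action labels are illustrative assumptions, not terms from the agent literature.

```python
from dataclasses import dataclass


@dataclass
class Thermostat:
    """A classical agent: one percept, one rule, three possible actions."""
    target_c: float            # the goal: hold the room near this temperature
    tolerance_c: float = 0.5   # assumed dead band around the target

    def decide(self, observed_c: float) -> str:
        # Compare the percept (observed temperature) with the goal.
        if observed_c < self.target_c - self.tolerance_c:
            return "heat_on"
        if observed_c > self.target_c + self.tolerance_c:
            return "heat_off"
        return "no_change"


agent = Thermostat(target_c=20.0)
actions = [agent.decide(t) for t in (18.0, 20.2, 22.5)]
print(actions)  # ['heat_on', 'no_change', 'heat_off']
```

An LLM agent keeps the same shape; what changes is that the decision rule is a language model and the action set is a collection of tools.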
What Is Not an AI Agent?
Conversational Output Is Not Enough
A model that only produces text is not automatically an AI agent. A chatbot that waits for a prompt, returns an answer, and cannot take action outside the conversation is better described as an assistant or conversational interface. It may be useful, but it is not yet acting in the world.
External State Is the Boundary
The boundary changes when the system gains action channels: search, code execution, database writes, calendar access, payments, browser control, issue creation, CRM updates, pull requests, ticket routing, or physical actuation. Once the system can change external state, the problem is no longer just whether the answer is correct. The problem is whether the action should have happened.
The Minimum Test for an AI Agent
Five Practical Qualification Questions
| Question | If no | If yes |
|---|---|---|
| Can it pursue a goal across more than one step? | Likely a chatbot or single-turn tool | Candidate agent |
| Can it choose among actions? | Likely scripted automation | Candidate agent |
| Can it use tools or affect external systems? | Assistant, not operational agent | Operational agent |
| Can it observe results and adjust? | One-shot workflow | Agentic loop |
| Can humans inspect, limit, and reverse what it does? | Unsafe agentic deployment | Governable agentic system |
Research basis: IBM describes AI agents in terms of tools, memory, reasoning, and planning; NVIDIA emphasizes orchestration, tools, memory, and policy controls for autonomous agents; Wooldridge and Jennings' 1995 paper established autonomy, social ability, reactivity, and proactivity as core agent properties. See also the museum's AI Agent Taxonomy and multi-agent systems sources.
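One way to make the five qualification questions concrete is to treat them as ordered gates, as in the sketch below. Reading the table top to bottom as a strict sequence is an assumption of this example, and the field names are hypothetical; the returned labels are taken from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemProfile:
    multi_step: bool         # pursues a goal across more than one step
    chooses_actions: bool    # chooses among actions
    affects_external: bool   # uses tools or affects external systems
    adjusts: bool            # observes results and adjusts
    governable: bool         # humans can inspect, limit, and reverse it


def classify(p: SystemProfile) -> str:
    # Each "no" answer stops at the label in the table's "If no" column.
    if not p.multi_step:
        return "chatbot or single-turn tool"
    if not p.chooses_actions:
        return "scripted automation"
    if not p.affects_external:
        return "assistant, not operational agent"
    if not p.adjusts:
        return "one-shot workflow"
    if not p.governable:
        return "unsafe agentic deployment"
    return "governable agentic system"


print(classify(SystemProfile(True, True, True, True, False)))
# unsafe agentic deployment
```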
So What: The Shift Is from Producing Answers to Creating Consequences
Wrong Answers Are Content Risk
Failure Inside the Answer
A traditional chatbot can mislead, hallucinate, offend, or confuse. Those failures matter, especially in high-stakes settings. But the failure usually remains inside the answer until a human acts on it.
Wrong Actions Are Operational Risk
Failure Inside the System
An AI agent can update a ticket, send an email, submit code, change a configuration, book a trip, make a recommendation, delete a file, approve a workflow, or trigger a payment. The error leaves the conversation and enters an operational system. That is why the history of AI agents cannot be separated from the history of responsibility.
Failure History Shows the Boundary
Public Incidents as Evidence
The Failure Archive documents the early warning signs. Air Canada was held liable for its chatbot's false bereavement-fare guidance. The NEDA Tessa chatbot gave unsafe advice to vulnerable users. The Chevrolet of Watsonville chatbot accepted a hostile prompt and agreed to sell a vehicle for $1. The NYC MyCity chatbot gave businesses illegal guidance. These were not all "agents" in the strongest modern sense, but they show the same direction of travel: as AI interfaces become authoritative, embedded, and action-adjacent, consequences move faster than governance.
Human Review Changes Shape
Review Moves Upstream
In a chatbot world, human review usually means checking output. In an agent world, human review must mean approving goals, tools, permissions, thresholds, reversibility, escalation rules, and logs. Review moves from the end of the process to the design of the process.
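A rough sketch of what upstream review can look like in practice: the human approves a policy once, and the policy then routes each proposed action. The class names, the dollar threshold, and the three routing outcomes are assumptions made for illustration, not a standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedAction:
    tool: str
    cost_usd: float
    reversible: bool


@dataclass(frozen=True)
class ReviewPolicy:
    allowed_tools: frozenset[str]     # tools a human approved in advance
    auto_approve_limit_usd: float     # spending threshold set by a human

    def route(self, action: ProposedAction) -> str:
        if action.tool not in self.allowed_tools:
            return "deny"
        # Irreversible or expensive actions go to a person before they run.
        if not action.reversible or action.cost_usd > self.auto_approve_limit_usd:
            return "escalate_to_human"
        return "auto_approve"


policy = ReviewPolicy(frozenset({"send_email", "refund"}), auto_approve_limit_usd=50.0)
print(policy.route(ProposedAction("refund", 20.0, reversible=True)))   # auto_approve
print(policy.route(ProposedAction("refund", 500.0, reversible=True)))  # escalate_to_human
```

The design choice worth noting is that the reviewer never sees most individual actions; the reviewer's judgment is encoded in the policy, which is why the policy itself needs review.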
Trust basis: risk examples are drawn from the AI Agent & Chatbot Failure Archive. The distinction between content risk and operational risk is museum interpretation based on those public records and on the technical distinction between output generation and external-state-changing action.
Thesis 2: AI Agents Become Powerful When They Gain Tools, Memory, and Permissions
The AI Agent Stack
Runtime Components
A modern AI agent is better understood as a stack than as a model. The stack usually includes:
- Model: the language model or reasoning engine that interprets the task and proposes actions.
- Tools: APIs, databases, code execution, search, browsers, file systems, and other action channels.
- Memory: short-term context, long-term retrieval, user preferences, project state, or institutional knowledge.
- Planning loop: the control pattern that breaks goals into steps, acts, observes results, and revises the plan.
- Permissions: the boundaries that decide what the agent can read, write, spend, send, change, or delete.
- Feedback: logs, evaluations, human corrections, automated tests, and runtime monitoring.
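The components above can be sketched as a single runtime object. This is a deliberately small illustration under stated assumptions: the model is a stub function, the planning loop is reduced to one step, and every name here (`AgentRuntime`, `step`, the tool names) is hypothetical rather than drawn from any framework.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AgentRuntime:
    model: Callable[[str], str]                        # proposes a tool name for a task
    tools: dict[str, Callable[[str], str]]             # action channels
    permissions: set[str]                              # tools this agent may call
    memory: list[str] = field(default_factory=list)    # short-term context
    log: list[tuple[str, str]] = field(default_factory=list)  # feedback record

    def step(self, task: str) -> str:
        tool_name = self.model(task)
        # The permission check sits in the runtime, not in the model.
        if tool_name not in self.permissions or tool_name not in self.tools:
            self.log.append((tool_name, "blocked"))
            return "blocked"
        result = self.tools[tool_name](task)
        self.memory.append(result)
        self.log.append((tool_name, "ok"))
        return result


def stub_model(task: str) -> str:
    # Stand-in for a language model call.
    return "delete" if "delete" in task else "search"


runtime = AgentRuntime(
    model=stub_model,
    tools={"search": lambda q: f"results for {q}", "delete": lambda q: "deleted"},
    permissions={"search"},
)
print(runtime.step("find docs"))        # results for find docs
print(runtime.step("delete records"))   # blocked
```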
Why ReAct Matters
Reason, Act, Observe
The 2022 ReAct paper is one of the defining documents of modern LLM agents because it joined reasoning and action in an interleaved loop. The model reasons about the task, acts through an external source or environment, observes the result, and updates its next step. That pattern is now visible in many agent frameworks and products. It is not the whole history of agents, but it is the technical bridge between language models and modern tool-using agents.
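The interleaved loop can be illustrated with a toy version. This is not the paper's implementation: the "model" below is a scripted stand-in for an LLM call, the single `lookup` tool is a hypothetical dictionary, and the step limit is an assumed safeguard.

```python
from typing import Callable

# Hypothetical tool: a lookup table standing in for a search API.
FACTS = {"capital of France": "Paris"}


def lookup(query: str) -> str:
    return FACTS.get(query, "no result")


TOOLS: dict[str, Callable[[str], str]] = {"lookup": lookup}


def scripted_model(question: str, observations: list[str]) -> tuple[str, str, str]:
    """Stand-in for an LLM call. Returns (thought, action, argument)."""
    if not observations:
        return ("I should look this up.", "lookup", question)
    return ("The observation answers the question.", "finish", observations[-1])


def react(question: str, max_steps: int = 5) -> tuple[str, list[str]]:
    observations: list[str] = []
    trace: list[str] = []
    for _ in range(max_steps):
        thought, action, arg = scripted_model(question, observations)  # reason
        trace.append(f"Thought: {thought}")
        if action == "finish":
            return arg, trace
        observation = TOOLS[action](arg)                               # act
        trace.append(f"Action: {action}[{arg}]")
        trace.append(f"Observation: {observation}")                    # observe
        observations.append(observation)
    return "step limit reached", trace


answer, trace = react("capital of France")
print(answer)  # Paris
```

The point of the pattern is that each observation feeds the next round of reasoning, so the plan can change mid-task.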
Why MCP Matters
Connector Infrastructure
The Model Context Protocol, introduced by Anthropic on November 25, 2024, matters because it standardizes how AI systems connect to external tools and data sources. MCP is infrastructure for the agent stack. It moves the field from custom one-off integrations toward a shared connector layer. That makes agents easier to build, but it also raises the importance of identity, permissions, sandboxing, and audit trails.
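To give a sense of what a shared connector layer means on the wire, the sketch below builds a tool-call request. It assumes, per the public MCP specification as understood at the time of writing, that messages follow JSON-RPC 2.0 and that a tool invocation uses a `tools/call` method with `name` and `arguments` parameters; readers should confirm the details against the current specification. The tool name `search_archive` and its arguments are hypothetical.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry an id for matching responses


def tool_call_request(name: str, arguments: dict) -> str:
    """Serialize a tool-call request in the assumed MCP JSON-RPC shape."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


print(tool_call_request("search_archive", {"query": "ReAct"}))
```

Because every tool is reached through the same message shape, identity checks, permission checks, and audit logging can be attached at one layer instead of inside each custom integration.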
Memory Is Power and Liability
Memory as an Influence System
Memory is what lets an agent persist beyond a single prompt. It can make an agent more useful, more personalized, and more coherent across long tasks. But memory also creates risk: stale memory, overbroad recall, hidden assumptions, privacy exposure, and unreviewed context can all shape future actions. A memory system is not just storage. It is an influence system.
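Two of the risks named above, stale memory and overbroad recall, can be addressed at retrieval time. The sketch below is one possible shape, assuming each memory item carries a scope label and an age; the field names, scope labels, and the age limit are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryItem:
    text: str
    scope: str      # hypothetical labels such as "project" or "personal"
    age_days: int


def recall(items: list[MemoryItem], allowed_scopes: set[str], max_age_days: int) -> list[str]:
    # Only memory that is both in scope and fresh enough may influence the next action.
    return [
        m.text
        for m in items
        if m.scope in allowed_scopes and m.age_days <= max_age_days
    ]


store = [
    MemoryItem("Deploy target is staging", "project", age_days=2),
    MemoryItem("Deploy target is the old cluster", "project", age_days=400),
    MemoryItem("User's home address", "personal", age_days=1),
]
print(recall(store, allowed_scopes={"project"}, max_age_days=30))
# ['Deploy target is staging']
```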
So What: The Real Agent Is the Whole Runtime, Not the Model
Model Capability Is Not Operational Readiness
Benchmarks Are Not Deployment Proof
A model can be impressive in a benchmark and still fail as an agent. Production agents encounter messy state, ambiguous goals, changing interfaces, unreliable tools, missing permissions, stale memory, malformed data, and users who do not behave like test cases. Agent reliability is a software engineering problem as much as a model intelligence problem.
Permissions Define the Agent's Real Power
Same Model, Different Agent
An AI agent with read-only search access is not the same system as an AI agent with write access to code, email, calendars, CRM records, payment systems, customer accounts, or production infrastructure. The model may be identical. The agent is not. In agentic systems, permissions are part of identity.
Governance Must Be in the Runtime
Agent-Grade Controls
Gartner's warning that over 40% of agentic AI projects may be canceled by the end of 2027 because of cost, unclear value, or inadequate risk controls is not just a market prediction. It is a diagnosis of the runtime gap. Many teams are deploying agent-shaped demos without agent-grade governance.
Research basis: Gartner's June 2025 prediction cites escalating costs, unclear business value, inadequate risk controls, and agent washing. NVIDIA's agent glossary emphasizes sandboxes, identity controls, and policy engines. IBM's AgentOps work focuses on observing whether deployed agents operate as intended. The museum tracks the broader claim in the Hype Cycle Annotation and the Failure Archive.
Thesis 3: Agentic AI Begins When Many Agents Become an Operating Layer
AI Agents vs. Agentic AI
Unit vs. Operating Model
An AI agent is the unit: a system that can act. Agentic AI is the broader architecture and operating model: agents connected to tools, memory, workflows, supervisors, policies, and other agents. One agent can complete a task. Agentic AI changes how tasks are organized. For the terminology history, see the agentic AI entry in the Terminology Archaeology.
From Single Agents to Agentic Systems
Coordination Across Roles
A single coding agent can fix a bug. An agentic software-development system can triage issues, assign tasks, inspect repositories, write tests, open pull requests, request review, monitor deployment, and roll back if checks fail. The second system is not just "more AI." It is a workflow layer.
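The difference between a task and a workflow layer can be sketched as a sequence of stages with a rollback path. This is a simplified illustration: the stage names and the `run_workflow` helper are hypothetical, and each stage is a plain function standing in for an agent or a check.

```python
from typing import Callable

Stage = tuple[str, Callable[[], bool]]  # (name, function returning success)


def run_workflow(stages: list[Stage], rollback: Callable[[], None]) -> list[str]:
    """Run stages in order; on the first failure, roll back and stop."""
    history: list[str] = []
    for name, stage in stages:
        ok = stage()
        history.append(f"{name}:{'ok' if ok else 'failed'}")
        if not ok:
            rollback()
            history.append("rollback")
            break
    return history


history = run_workflow(
    [
        ("write_tests", lambda: True),
        ("open_pull_request", lambda: True),
        ("deployment_checks", lambda: False),  # simulated failing check
        ("announce_release", lambda: True),
    ],
    rollback=lambda: None,
)
print(history)
# ['write_tests:ok', 'open_pull_request:ok', 'deployment_checks:failed', 'rollback']
```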
From Tools to Delegated Workflows
Workflow as the New Unit
Traditional software tools wait for humans to operate them. AI agents can operate tools on behalf of humans. Agentic AI emerges when the important unit is no longer the tool or the prompt, but the delegated workflow: "Handle this claim," "prepare this research memo," "monitor this system," "book this trip," "resolve this customer request," "ship this patch."
The Agentic Economy
Delegated Cognitive Work
The agentic economy is the economic layer built from delegated cognitive work. It is not merely automation. It is the movement of decision, coordination, and action into software systems that can operate with varying degrees of autonomy. That shift changes cost, speed, accountability, and the meaning of work. The commercial evidence appears in the funding and ecosystem record and the primary product documents.
So What: The Important Unit Is No Longer the Prompt, but the Delegated Workflow
The Autonomy Ladder
Seven Levels of Delegation
- Chatbot: answers a user's prompt.
- Assistant: helps complete a task but waits for human direction.
- Copilot: works beside a human inside a domain, often suggesting or drafting actions.
- Tool-using agent: chooses and calls tools to complete steps toward a goal.
- Computer-use agent: operates software interfaces directly, using screens, clicks, and typing.
- Multi-agent system: coordinates specialized agents across roles or subtasks.
- Agentic operating layer: delegates whole workflows across tools, agents, policies, and human review points.
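For teams that want to record where a system sits on this ladder, the seven levels can be encoded as an ordered enumeration. The numeric values and the helper below are assumptions of this sketch; in particular, drawing the "acts on external systems" line at the tool-using agent is one reading of the list, not a formal rule.

```python
from enum import IntEnum


class AutonomyLevel(IntEnum):
    CHATBOT = 1
    ASSISTANT = 2
    COPILOT = 3
    TOOL_USING_AGENT = 4
    COMPUTER_USE_AGENT = 5
    MULTI_AGENT_SYSTEM = 6
    AGENTIC_OPERATING_LAYER = 7


def acts_on_external_systems(level: AutonomyLevel) -> bool:
    # Assumed threshold: from tool-using agents upward, the system calls tools itself.
    return level >= AutonomyLevel.TOOL_USING_AGENT


print(acts_on_external_systems(AutonomyLevel.COPILOT))             # False
print(acts_on_external_systems(AutonomyLevel.COMPUTER_USE_AGENT))  # True
```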
Delegated Work Creates Delegated Risk
Accountability Remains Human
Every delegation creates a question: who remains responsible? If a human delegates a task to an agent, the human may no longer make every micro-decision, but the organization still owns the result. That is why auditability, reversibility, and human escalation are not optional features. They are the moral and operational frame that makes delegation legitimate.
What AI Agents Reveal About Work
Work Contains Judgment
AI agents reveal which parts of work are routine, which are judgment-heavy, which require trust, which require memory, and which depend on tacit human context. The more carefully we study what agents can and cannot do, the more clearly we see what human work actually contains.
Thesis 4: Agentic History Is a Mirror of Human History
What We Automate Shows What We Value
Automation as a Historical Signal
Humans automate what is repetitive, expensive, dangerous, boring, or too fast for human attention. That is why the history of agents runs through distributed sensing, planning, logistics, games, software development, customer support, search, scheduling, and commerce. Each wave of agents marks a boundary where humans decided that some form of action could be transferred to software.
What We Refuse to Automate Shows Where We Locate Meaning
Meaning Marks the Boundary
The opposite boundary matters just as much. Humans hesitate to delegate care, justice, trust, moral judgment, irreversible decisions, and intimate human relationships. When organizations push agents into those domains too quickly, the failures are not merely technical. They reveal that the delegated task carried human significance the system did not understand.
Agentic Progress Can Motivate Human Progress
Better Human Goals
Agentic progress should not be understood only as machines becoming more capable. It should force a human question: if software can take over more execution, what should humans become better at? The answer is not less agency. It is clearer agency: better goals, better judgment, better institutions, better safeguards, better education, better definitions of responsibility.
Human Agency Is the Boundary of Agentic AI Progress
The Human Moves Upstream
Purpose, Constraint, Review
As agents become more capable, the human role does not disappear. It moves upstream. Humans define purposes, choose constraints, decide what tools are allowed, set review thresholds, design institutions, audit results, and accept responsibility. The more autonomous the system becomes downstream, the more important human judgment becomes upstream.
The Full Circle of Agentic History
Artificial Agency Reflects Human Agency
The word agentic originally described human agency: the capacity to act intentionally. AI agents borrow that language because they imitate part of that capacity. But imitation is not replacement. The history of AI agents comes full circle when it teaches us to ask more precise questions about human agency: What do we want? What should we delegate? What must remain accountable to people? What kinds of progress are worth accelerating?
The Practical Test
Five Deployment Questions
Before deploying or trusting any AI agent, ask five questions:
- Goal: What is the agent trying to accomplish?
- Power: What tools, memory, and permissions does it have?
- Boundary: What is it forbidden to do?
- Review: When must a human approve, inspect, or interrupt?
- Responsibility: Who owns the consequence if the agent acts incorrectly?
Those questions are not only technical. They are historical and human. They are why AI agents belong inside Agentic History.
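The five questions can also be kept as a simple pre-deployment record that refuses to pass while any answer is blank. The class and method names are hypothetical, and a blank-field check is only a minimal sketch; it confirms that an answer exists, not that the answer is good.

```python
from dataclasses import dataclass, fields


@dataclass
class DeploymentReview:
    goal: str            # what the agent is trying to accomplish
    power: str           # tools, memory, and permissions it has
    boundary: str        # what it is forbidden to do
    review: str          # when a human must approve, inspect, or interrupt
    responsibility: str  # who owns the consequence of a wrong action

    def unanswered(self) -> list[str]:
        # Collect the questions whose answers are empty or whitespace only.
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def ready(self) -> bool:
        return not self.unanswered()


draft = DeploymentReview(
    goal="Resolve billing tickets",
    power="Read CRM, issue refunds",
    boundary="",
    review="Human approves refunds above a set limit",
    responsibility=" ",
)
print(draft.unanswered())  # ['boundary', 'responsibility']
```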
Editorial position: the human-agency argument is museum interpretation. It is grounded in the historical record on this site, especially the timeline, failure archive, predictions tracker, and editorial positions.
Sources and Further Reading Guide
- IBM, "What Are AI Agents?"
- IBM, "What is AI Agent Planning?"
- IBM, "What Is AI Agent Memory?"
- NVIDIA, "What are Autonomous AI Agents?"
- Gartner, "Gartner Predicts Over 40% of Agentic AI Projects Will Be Canceled by End of 2027"
- Shunyu Yao et al., "ReAct: Synergizing Reasoning and Acting in Language Models", arXiv:2210.03629, 2022.
- Anthropic, "Introducing the Model Context Protocol", November 25, 2024.
- Michael Wooldridge and Nicholas R. Jennings, "Intelligent Agents: Theory and Practice", The Knowledge Engineering Review, 1995.
- Agentic History, AI Agent History Timeline.
- Agentic History, AI Agent Taxonomy.
- Agentic History, AI Agent Terminology Archaeology.
- Agentic History, AI Agent and Chatbot Failure Archive.
Related: Full AI agent timeline · AI Agent Taxonomy · Terminology Archaeology · Agentic AI vs. AI agent FAQ · Failure Archive