AI agents, agentic AI, and human agency

AI Agents

Last updated: June 1, 2026

AI agents are the shift from artificial intelligence that answers to artificial intelligence that acts. Most explanations stop at the first-order definition: an AI agent is software that perceives, plans, uses tools, and works toward a goal. That definition is useful, but it is not enough. The second-order question is what changes when AI can act. The third-order question is what humans should delegate when action becomes cheap, scalable, and partly autonomous.

This page makes the argument in three movements: first, what an AI agent is; second, why the agent stack matters more than the model alone; third, how agentic AI turns individual agents into an operating layer for work, commerce, software, and institutions. The page ends where the word agentic began: with human agency.

Editorial source note: this guide combines primary AI-agent literature, dated product releases, modern agent infrastructure documents, failure records, and original museum interpretation. Definitions are cross-checked against the AI Agent Taxonomy, AI Agent History Timeline, and Primary Sources Library. Corrections and missing primary sources can be sent to curator@agentichistory.org.


Evidence Standard for This AI Agents Guide

How This Page Uses Sources

Primary Sources Before Marketing Claims

This page is an interpretive guide, not a marketing glossary. It uses primary papers and technical references for definitions, dated product releases for modern agent milestones, failure records for risk claims, and clearly labeled museum interpretation for the human-agency argument. When a claim is definitional, it should connect to the taxonomy or library. When a claim is historical, it should connect to the timeline. When a claim is about operational risk, it should connect to incidents, governance sources, or the Failure Archive.

How We Separate Definition from Interpretation

Technical Claim, Operational Claim, Human Claim

The first-order claim is technical: an agent perceives, decides, and acts toward a goal. The second-order claim is operational: action changes risk because an agent can affect external systems. The third-order claim is human and institutional: delegation changes accountability. This structure is intentional, so readers can separate accepted agent definitions from the museum's broader argument about human agency.

What Would Change This Page

Material Update Threshold

This page should be updated when a stronger primary definition emerges, when a major agent capability changes the practical boundary of delegation, when new public failures clarify risk, or when the museum revises its interpretation of agentic AI. Update dates and schema dates should change only when the page has materially changed.


Thesis 1: An AI Agent Is Software That Can Act

First-order claim: An AI agent is not defined by sounding intelligent. It is defined by the ability to connect perception, decision, and action in pursuit of a goal.

AI Agent Definition

Formal Definition

An AI agent is a software system that can observe some context, decide what action to take, and execute that action toward an objective. In older AI literature, the language was perception, environment, action, and goals. In modern LLM systems, the practical stack is usually a model connected to tools, memory, planning logic, permissions, and feedback.

Classical Agents and LLM Agents

This is why the phrase "AI agent" is broader than "LLM agent." A thermostat can be described as a simple agent in classical AI terms; a reinforcement-learning system is an agent in an environment; a BDI system is an agent with beliefs, desires, and intentions; a modern LLM agent is an agent whose reasoning engine is a language model and whose actions are tool calls, API calls, code execution, browser actions, or computer use. For the formal classification, see the AI Agent Taxonomy.
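The thermostat case can be written out as a perceive, decide, act loop. The following is a minimal sketch, not a reference implementation: the `Room` class, its toy temperature model, and the half-degree tolerance band are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Room:
    """Hypothetical environment with a toy temperature model."""
    temperature: float
    heater_on: bool = False

    def step(self) -> None:
        # Toy physics (an assumption): +0.5 degrees per step when heating, -0.5 otherwise.
        self.temperature += 0.5 if self.heater_on else -0.5


class ThermostatAgent:
    """Goal: keep the room within `tolerance` of `target`."""

    def __init__(self, target: float, tolerance: float = 0.5) -> None:
        self.target = target
        self.tolerance = tolerance

    def perceive(self, room: Room) -> float:
        # Perception: read the one piece of context this agent can observe.
        return room.temperature

    def decide(self, reading: float, heater_on: bool) -> bool:
        # Decision: choose the heater state that moves the room toward the goal.
        if reading < self.target - self.tolerance:
            return True
        if reading > self.target + self.tolerance:
            return False
        return heater_on  # inside the band: keep the current state

    def act(self, room: Room, heater_on: bool) -> None:
        # Action: change external state.
        room.heater_on = heater_on


def run(agent: ThermostatAgent, room: Room, steps: int) -> List[float]:
    history = []
    for _ in range(steps):
        reading = agent.perceive(room)
        command = agent.decide(reading, room.heater_on)
        agent.act(room, command)
        room.step()
        history.append(room.temperature)
    return history


if __name__ == "__main__":
    print(run(ThermostatAgent(target=20.0), Room(temperature=18.0), steps=10))
```

An LLM agent keeps the same loop shape. What changes is that `decide` is backed by a language model and `act` becomes a tool call, API call, or interface action.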

What Is Not an AI Agent?

Conversational Output Is Not Enough

A model that only produces text is not automatically an AI agent. A chatbot that waits for a prompt, returns an answer, and cannot take action outside the conversation is better described as an assistant or conversational interface. It may be useful, but it is not yet acting in the world.

External State Is the Boundary

The boundary changes when the system gains action channels: search, code execution, database writes, calendar access, payments, browser control, issue creation, CRM updates, pull requests, ticket routing, or physical actuation. Once the system can change external state, the problem is no longer just whether the answer is correct. The problem is whether the action should have happened.

The Minimum Test for an AI Agent

Five Practical Qualification Questions

Question | If no | If yes
Can it pursue a goal across more than one step? | Likely a chatbot or single-turn tool | Candidate agent
Can it choose among actions? | Likely scripted automation | Candidate agent
Can it use tools or affect external systems? | Assistant, not operational agent | Operational agent
Can it observe results and adjust? | One-shot workflow | Agentic loop
Can humans inspect, limit, and reverse what it does? | Unsafe agentic deployment | Governable agentic system
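The five questions can be read as a checklist applied in order. The sketch below encodes them that way, returning the "if no" label of the first question a system fails. Treating the questions as a cascade is an assumption of this example, as are the `Capabilities` field names; the labels are taken from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capabilities:
    """Hypothetical answers to the five qualification questions."""
    multi_step_goal: bool
    chooses_actions: bool
    affects_external_systems: bool
    observes_and_adjusts: bool
    inspectable_and_reversible: bool


def classify(c: Capabilities) -> str:
    # Each check returns the table's "if no" label for the first failed question.
    if not c.multi_step_goal:
        return "Likely a chatbot or single-turn tool"
    if not c.chooses_actions:
        return "Likely scripted automation"
    if not c.affects_external_systems:
        return "Assistant, not operational agent"
    if not c.observes_and_adjusts:
        return "One-shot workflow"
    if not c.inspectable_and_reversible:
        return "Unsafe agentic deployment"
    return "Governable agentic system"


if __name__ == "__main__":
    faq_bot = Capabilities(False, False, False, False, False)
    ungoverned = Capabilities(True, True, True, True, False)
    print(classify(faq_bot))
    print(classify(ungoverned))
```

The last check is the one most often skipped in practice: a system can pass the first four questions and still be an unsafe deployment.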

Research basis: IBM describes AI agents in terms of tools, memory, reasoning, and planning; NVIDIA emphasizes orchestration, tools, memory, and policy controls for autonomous agents; Wooldridge and Jennings' 1995 paper established autonomy, social ability, reactivity, and proactivity as core agent properties. See also the museum's AI Agent Taxonomy and multi-agent systems sources.


So What: The Shift Is from Producing Answers to Creating Consequences

Why this matters: The central risk of an AI chatbot is a bad answer. The central risk of an AI agent is a bad action.

Wrong Answers Are Content Risk

Failure Inside the Answer

A traditional chatbot can mislead, hallucinate, offend, or confuse. Those failures matter, especially in high-stakes settings. But the failure usually remains inside the answer until a human acts on it.

Wrong Actions Are Operational Risk

Failure Inside the System

An AI agent can update a ticket, send an email, submit code, change a configuration, book a trip, make a recommendation, delete a file, approve a workflow, or trigger a payment. The error leaves the conversation and enters an operational system. That is why the history of AI agents cannot be separated from the history of responsibility.

Failure History Shows the Boundary

Public Incidents as Evidence

The Failure Archive documents the early warning signs. Air Canada was held liable for its chatbot's false bereavement-fare guidance. The NEDA Tessa chatbot gave unsafe advice to vulnerable users. The Chevrolet of Watsonville chatbot accepted a hostile prompt and agreed to sell a vehicle for $1. The NYC MyCity chatbot gave businesses illegal guidance. These were not all "agents" in the strongest modern sense, but they show the same direction of travel: as AI interfaces become authoritative, embedded, and action-adjacent, consequences move faster than governance.

Human Review Changes Shape

Review Moves Upstream

In a chatbot world, human review usually means checking output. In an agent world, human review must mean approving goals, tools, permissions, thresholds, reversibility, escalation rules, and logs. Review moves from the end of the process to the design of the process.

Trust basis: risk examples are drawn from the AI Agent & Chatbot Failure Archive. The distinction between content risk and operational risk is museum interpretation based on those public records and on the technical distinction between output generation and external-state-changing action.


Thesis 2: AI Agents Become Powerful When They Gain Tools, Memory, and Permissions

Second-order claim: The model is only one component. The power of an AI agent comes from the system wrapped around the model.

The AI Agent Stack

Runtime Components

A modern AI agent is better understood as a stack than as a model. The stack usually includes a model connected to tools, memory, planning logic, permissions, and feedback.

Why ReAct Matters

Reason, Act, Observe

The 2022 ReAct paper is one of the defining documents of modern LLM agents because it joined reasoning and action in an interleaved loop. The model reasons about the task, acts through an external source or environment, observes the result, and updates its next step. That pattern is now visible in many agent frameworks and products. It is not the whole history of agents, but it is the technical bridge between language models and modern tool-using agents.
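The interleaved reason, act, observe pattern can be sketched without a real language model. In the example below, `scripted_model` is a hypothetical stand-in that returns a thought and a next step, and `calculator` is a toy tool. Neither comes from the ReAct paper; they only illustrate the loop's shape.

```python
from typing import Callable, Dict, List, Tuple

# A step is (kind, tool_name, payload), where kind is "act" or "finish".
Step = Tuple[str, str, str]


def calculator(expression: str) -> str:
    """Toy tool: evaluates 'a op b' for integer a, b and op in + - *."""
    left, op, right = expression.split()
    a, b = int(left), int(right)
    return str({"+": a + b, "-": a - b, "*": a * b}[op])


def scripted_model(observations: List[str]) -> Tuple[str, Step]:
    """Hypothetical stand-in for an LLM answering 'What is 17 * 3 + 4?'."""
    if len(observations) == 0:
        return "First multiply 17 by 3.", ("act", "calculator", "17 * 3")
    if len(observations) == 1:
        return "Now add 4 to the product.", ("act", "calculator", f"{observations[0]} + 4")
    return "I have the result.", ("finish", "", observations[-1])


def react_loop(
    model: Callable[[List[str]], Tuple[str, Step]],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> Tuple[str, List[str]]:
    observations: List[str] = []
    trace: List[str] = []
    for _ in range(max_steps):
        thought, (kind, tool_name, payload) = model(observations)  # reason
        trace.append(f"Thought: {thought}")
        if kind == "finish":
            trace.append(f"Answer: {payload}")
            return payload, trace
        trace.append(f"Action: {tool_name}[{payload}]")
        observation = tools[tool_name](payload)                     # act
        trace.append(f"Observation: {observation}")                 # observe
        observations.append(observation)
    raise RuntimeError("step budget exhausted before the model finished")


if __name__ == "__main__":
    answer, trace = react_loop(scripted_model, {"calculator": calculator})
    print("\n".join(trace))
```

The `max_steps` budget is a design choice of this sketch: a loop that can act needs a bound that does not depend on the model deciding to stop.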

Why MCP Matters

Connector Infrastructure

The Model Context Protocol, introduced by Anthropic on November 25, 2024, matters because it standardizes how AI systems connect to external tools and data sources. MCP is infrastructure for the agent stack. It moves the field from custom one-off integrations toward a shared connector layer. That makes agents easier to build, but it also raises the importance of identity, permissions, sandboxing, and audit trails.
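To make the shared connector layer concrete: MCP messages are JSON-RPC 2.0, and a tool invocation is sent as a `tools/call` request carrying a tool name and its arguments. The sketch below builds such a request using only the Python standard library. The tool name `create_issue` and its arguments are hypothetical, and the MCP specification remains the authoritative source for the full message schema.

```python
import json
from itertools import count
from typing import Any, Dict

_request_ids = count(1)  # JSON-RPC requests carry an id so responses can be matched


def tools_call_request(tool_name: str, arguments: Dict[str, Any]) -> str:
    """Serialize a JSON-RPC 2.0 request in the shape of an MCP tool call."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


if __name__ == "__main__":
    # Hypothetical tool and arguments, for illustration only.
    print(tools_call_request("create_issue", {"title": "Login page returns 500"}))
```

Because every tool call has the same envelope, a runtime can log, filter, or deny calls at one choke point, which is where identity, permissions, and audit trails attach.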

Memory Is Power and Liability

Memory as an Influence System

Memory is what lets an agent persist beyond a single prompt. It can make an agent more useful, more personalized, and more coherent across long tasks. But memory also creates risk: stale memory, overbroad recall, hidden assumptions, privacy exposure, and unreviewed context can all shape future actions. A memory system is not just storage. It is an influence system.
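Three of these risks (stale memory, overbroad recall, and unreviewed context) can be addressed at recall time rather than at storage time. The following is a minimal sketch under assumed names: `MemoryItem`, its `scope` and `reviewed` fields, and the age limit are illustrative, not drawn from any particular agent framework.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MemoryItem:
    text: str
    scope: str        # hypothetical label for the task or customer the memory belongs to
    stored_at: float  # timestamp in seconds
    reviewed: bool    # whether a human or policy check has approved this memory


def recall(items: List[MemoryItem], scope: str, now: float, max_age: float) -> List[str]:
    """Return only memories that are in scope, reviewed, and not stale."""
    return [
        item.text
        for item in items
        if item.scope == scope                 # guards against overbroad recall
        and item.reviewed                      # guards against unreviewed context
        and now - item.stored_at <= max_age    # guards against stale memory
    ]


if __name__ == "__main__":
    memory = [
        MemoryItem("prefers email contact", "customer-1", stored_at=900.0, reviewed=True),
        MemoryItem("old shipping address", "customer-1", stored_at=10.0, reviewed=True),
        MemoryItem("unverified complaint", "customer-1", stored_at=950.0, reviewed=False),
        MemoryItem("prefers phone contact", "customer-2", stored_at=950.0, reviewed=True),
    ]
    print(recall(memory, "customer-1", now=1000.0, max_age=200.0))
```

Whatever passes the filter enters the model's context and shapes the next action, which is why the filter is part of governance, not just retrieval.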


So What: The Real Agent Is the Whole Runtime, Not the Model

Why this matters: Better models alone do not create reliable agents. Reliability comes from the runtime: tools, permissions, state, tests, monitoring, and rollback.

Model Capability Is Not Operational Readiness

Benchmarks Are Not Deployment Proof

A model can be impressive in a benchmark and still fail as an agent. Production agents encounter messy state, ambiguous goals, changing interfaces, unreliable tools, missing permissions, stale memory, malformed data, and users who do not behave like test cases. Agent reliability is a software engineering problem as much as a model intelligence problem.

Permissions Define the Agent's Real Power

Same Model, Different Agent

An AI agent with read-only search access is not the same system as an AI agent with write access to code, email, calendars, CRM records, payment systems, customer accounts, or production infrastructure. The model may be identical. The agent is not. In agentic systems, permissions are part of identity.
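The point that the same model yields different agents can be shown with a tool allowlist. This is a sketch under assumptions: the `Agent` class, the `search` and `send_email` tools, and the model name are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


class PermissionDenied(Exception):
    """Raised when an agent requests a tool outside its allowlist."""


@dataclass(frozen=True)
class Agent:
    model_name: str
    allowed_tools: FrozenSet[str]

    def call(self, tool_name: str, tools: Dict[str, Callable[[str], str]], argument: str) -> str:
        # The permission check runs before the tool, regardless of what the model wants.
        if tool_name not in self.allowed_tools:
            raise PermissionDenied(f"{tool_name} is not permitted for this agent")
        return tools[tool_name](argument)


# Hypothetical tools: one read-only, one that changes external state.
def search(query: str) -> str:
    return f"results for {query}"


def send_email(body: str) -> str:
    return f"sent: {body}"


TOOLS = {"search": search, "send_email": send_email}

reader = Agent("same-model", frozenset({"search"}))
operator = Agent("same-model", frozenset({"search", "send_email"}))

if __name__ == "__main__":
    print(reader.model_name == operator.model_name)  # identical model
    print(reader == operator)                        # different agent
```

Because `allowed_tools` is a field of the dataclass, two agents with the same model but different permissions compare as unequal, which mirrors the claim that permissions are part of identity.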

Governance Must Be in the Runtime

Agent-Grade Controls

Gartner's warning that over 40% of agentic AI projects may be canceled by the end of 2027 because of cost, unclear value, or inadequate risk controls is not just a market prediction. It is a diagnosis of the runtime gap. Many teams are deploying agent-shaped demos without agent-grade governance.

Research basis: Gartner's June 2025 prediction cites escalating costs, unclear business value, inadequate risk controls, and agent washing. NVIDIA's agent glossary emphasizes sandboxes, identity controls, and policy engines. IBM's AgentOps work focuses on observing whether deployed agents operate as intended. The museum tracks the broader claim in the Hype Cycle Annotation and the Failure Archive.


Thesis 3: Agentic AI Begins When Many Agents Become an Operating Layer

Third-order claim: Agentic AI is not simply "an AI agent." It is what happens when agents become the operating layer through which work is delegated, coordinated, and governed.

AI Agents vs. Agentic AI

Unit vs. Operating Model

An AI agent is the unit: a system that can act. Agentic AI is the broader architecture and operating model: agents connected to tools, memory, workflows, supervisors, policies, and other agents. One agent can complete a task. Agentic AI changes how tasks are organized. For the terminology history, see the agentic AI entry in the Terminology Archaeology.

From Single Agents to Agentic Systems

Coordination Across Roles

A single coding agent can fix a bug. An agentic software-development system can triage issues, assign tasks, inspect repositories, write tests, open pull requests, request review, monitor deployment, and roll back if checks fail. The second system is not just "more AI." It is a workflow layer.

From Tools to Delegated Workflows

Workflow as the New Unit

Traditional software tools wait for humans to operate them. AI agents can operate tools on behalf of humans. Agentic AI emerges when the important unit is no longer the tool or the prompt, but the delegated workflow: "Handle this claim," "prepare this research memo," "monitor this system," "book this trip," "resolve this customer request," "ship this patch."

The Agentic Economy

Delegated Cognitive Work

The agentic economy is the economic layer built from delegated cognitive work. It is not merely automation. It is the movement of decision, coordination, and action into software systems that can operate with varying degrees of autonomy. That shift changes cost, speed, accountability, and the meaning of work. The commercial evidence appears in the funding and ecosystem record and the primary product documents.


So What: The Important Unit Is No Longer the Prompt, but the Delegated Workflow

Why this matters: Prompts produce outputs. Delegated workflows produce outcomes. Outcomes require accountability.

The Autonomy Ladder

Seven Levels of Delegation

  1. Chatbot: answers a user's prompt.
  2. Assistant: helps complete a task but waits for human direction.
  3. Copilot: works beside a human inside a domain, often suggesting or drafting actions.
  4. Tool-using agent: chooses and calls tools to complete steps toward a goal.
  5. Computer-use agent: operates software interfaces directly, using screens, clicks, and typing.
  6. Multi-agent system: coordinates specialized agents across roles or subtasks.
  7. Agentic operating layer: delegates whole workflows across tools, agents, policies, and human review points.

Delegated Work Creates Delegated Risk

Accountability Remains Human

Every delegation creates a question: who remains responsible? If a human delegates a task to an agent, the human may no longer make every micro-decision, but the organization still owns the result. That is why auditability, reversibility, and human escalation are not optional features. They are the moral and operational frame that makes delegation legitimate.
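Auditability, reversibility, and human escalation can each be expressed as a small runtime mechanism. The sketch below is one possible arrangement, not a standard design: `GovernedRunner`, the refund scenario, and the use of an amount as the escalation trigger are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Action:
    description: str
    apply: Callable[[], None]
    undo: Callable[[], None]   # reversibility: every action ships with its inverse
    amount: float = 0.0        # hypothetical risk measure used for escalation


@dataclass
class GovernedRunner:
    escalation_threshold: float
    audit_log: List[str] = field(default_factory=list)
    applied: List[Action] = field(default_factory=list)

    def execute(self, action: Action, approved_by: Optional[str] = None) -> bool:
        # Escalation: risky actions are refused until a named human approves them.
        if action.amount > self.escalation_threshold and approved_by is None:
            self.audit_log.append(f"ESCALATED: {action.description}")
            return False
        action.apply()
        self.applied.append(action)
        # Auditability: record who authorized what.
        self.audit_log.append(f"APPLIED by {approved_by or 'agent'}: {action.description}")
        return True

    def rollback_last(self) -> None:
        action = self.applied.pop()
        action.undo()
        self.audit_log.append(f"ROLLED BACK: {action.description}")


def make_refund(ledger: Dict[str, float], amount: float) -> Action:
    """Hypothetical reversible action against an in-memory ledger."""
    def apply() -> None:
        ledger["balance"] -= amount

    def undo() -> None:
        ledger["balance"] += amount

    return Action(f"refund {amount}", apply, undo, amount)


if __name__ == "__main__":
    ledger = {"balance": 100}
    runner = GovernedRunner(escalation_threshold=50)
    runner.execute(make_refund(ledger, 10))                         # below threshold
    runner.execute(make_refund(ledger, 80))                         # escalated, not applied
    runner.execute(make_refund(ledger, 80), approved_by="manager")  # applied with approval
    runner.rollback_last()
    print(ledger, runner.audit_log)
```

The log records the approver's name on every applied action, so responsibility stays attached to a person even when the agent performs the step.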

What AI Agents Reveal About Work

Work Contains Judgment

AI agents reveal which parts of work are routine, which are judgment-heavy, which require trust, which require memory, and which depend on tacit human context. The more carefully we study what agents can and cannot do, the more clearly we see what human work actually contains.


Thesis 4: Agentic History Mirrors the Patterns of Human History

Historical claim: Every artificial agent reveals something about human agency: what humans delegate, what they automate, what they fear, and what they refuse to surrender.

What We Automate Shows What We Value

Automation as a Historical Signal

Humans automate what is repetitive, expensive, dangerous, boring, or too fast for human attention. That is why the history of agents runs through distributed sensing, planning, logistics, games, software development, customer support, search, scheduling, and commerce. Each wave of agents marks a boundary where humans decided that some form of action could be transferred to software.

What We Refuse to Automate Shows Where We Locate Meaning

Meaning Marks the Boundary

The opposite boundary matters just as much. Humans hesitate to delegate care, justice, trust, moral judgment, irreversible decisions, and intimate human relationships. When organizations push agents into those domains too quickly, the failures are not merely technical. They reveal that the delegated task carried human significance the system did not understand.

Agentic Progress Can Motivate Human Progress

Better Human Goals

Agentic progress should not be understood only as machines becoming more capable. It should force a human question: if software can take over more execution, what should humans become better at? The answer is not less agency. It is clearer agency: better goals, better judgment, better institutions, better safeguards, better education, better definitions of responsibility.


Human Agency Is the Boundary of Agentic AI Progress

Final so what: The deepest question is not what an AI agent can do. The deepest question is what a human should delegate.

The Human Moves Upstream

Purpose, Constraint, Review

As agents become more capable, the human role does not disappear. It moves upstream. Humans define purposes, choose constraints, decide what tools are allowed, set review thresholds, design institutions, audit results, and accept responsibility. The more autonomous the system becomes downstream, the more important human judgment becomes upstream.

The Full Circle of Agentic History

Artificial Agency Reflects Human Agency

The word agentic originally described human agency: the capacity to act intentionally. AI agents borrow that language because they imitate part of that capacity. But imitation is not replacement. The history of AI agents comes full circle when it teaches us to ask more precise questions about human agency: What do we want? What should we delegate? What must remain accountable to people? What kinds of progress are worth accelerating?

The Practical Test

Five Deployment Questions

Before deploying or trusting any AI agent, ask five questions:

Those questions are not only technical. They are historical and human. They are why AI agents belong inside Agentic History.

Editorial position: the human-agency argument is museum interpretation. It is grounded in the historical record on this site, especially the timeline, failure archive, predictions tracker, and editorial positions.


Sources and Further Reading Guide

Related: Full AI agent timeline · AI Agent Taxonomy · Terminology Archaeology · Agentic AI vs. AI agent FAQ · Failure Archive