AI Agent Taxonomy
Last updated: June 1, 2026
Every definition article defines "an AI agent" as one thing. This page maps the actual diversity of agent types — ten distinct categories, each with a formal definition, the paper that named it, historical examples from the field's 70-year history, and the current products that fall within it.
The word "agent" covers a thermostat (simple reflex), a chess engine (model-based reflex), a BDI-programmed industrial system (deliberative), AlphaGo (learning), AutoGPT (tool-use LLM), Anthropic's computer use (computer-use), Boston Dynamics Spot (embodied), and the Sakana AI Scientist (autonomous research). These are not the same thing. Using one word for all of them — as the current industry does — produces definitional confusion at scale, which is exactly what agent washing exploits.
This taxonomy uses two organizing frameworks: the classical framework from Russell and Norvig (1995), which classifies agents by their internal architecture (reflex → model → goal → utility → learning), and the modern framework that distinguishes agents by their operational context (tool-use LLM, computer-use, embodied, research). The two frameworks are complementary: the classical framework describes how an agent reasons; the modern framework describes where and how it acts.
Editorial source note: Taxonomy claims are checked against named papers, dated systems, and cross-page evidence in the Primary Sources Library, AI agent history timeline, and editorial methodology. When a modern product spans multiple categories, this page names the primary type first and explains secondary types instead of treating marketing language as the classification.
- Evidence Standard for This Taxonomy — how categories are sourced and reviewed
- Reactive Agents — perceive and act with no world model
- Model-Based Reflex Agents — perceive, track state, act
- Deliberative Agents — symbolic world model + planning
- BDI Agents — beliefs, desires, intentions
- Learning Agents — improve through experience
- Multi-Agent Systems — coordination among multiple agents
- Tool-Use LLM Agents — the modern agentic AI paradigm
- Computer-Use Agents — operating software interfaces directly
- Embodied Agents — acting in the physical world
- Autonomous Research Agents — full scientific lifecycle automation
Evidence Standard for This Taxonomy
How This Taxonomy Is Sourced
Classical Literature and Modern Agent Evidence
Each category is anchored to either a formal research lineage or a documented product capability. Classical categories rely on Russell and Norvig, Brooks, Wooldridge and Jennings, Rao and Georgeff, Watkins, Sutton and Barto, and the multi-agent systems literature. Modern categories rely on primary papers and releases for ReAct, Toolformer, SWE-bench, computer use, embodied AI, and autonomous research agents.
How Categories Are Assigned
Primary Type, Secondary Types, and Boundary Cases
The taxonomy assigns a primary type based on the feature that most defines the system's behavior. Secondary types are listed when a product spans categories: Claude with tools is primarily a tool-use LLM agent, Anthropic Computer Use is primarily a computer-use agent, Waymo is primarily an embodied agent, and Sakana AI Scientist is primarily an autonomous research agent. Borderline cases are treated explicitly because many current systems combine old architectures with modern interfaces.
How Product Placement Is Reviewed
Agent Washing and Category Discipline
Product placement is not copied from vendor marketing. A chatbot is not treated as an agent unless it can pursue a goal across steps, use tools or actions, maintain task state, or operate in an environment. This is the same category discipline used on the agent washing terminology entry: the label must describe observable behavior, not only the way a product is sold.
Update and Correction Policy
Material Taxonomy Changes
Material changes to category definitions, product placement, or source interpretation should be visible on the page through updated text, corrected dates, or revised source notes. Corrections and primary-source additions can be sent to curator@agentichistory.org.
Classical AI Agent Types
Russell & Norvig 1995 framework: internal agent architectures from reactive rules through learning systems.
1. Reactive Agents
Definition
Reactive agents are the simplest agent type: sense the environment, fire a rule, take action. They cannot reason about what they haven't observed, cannot plan ahead, and cannot improve over time. What they can do — and do extremely well — is respond to environmental stimuli in real time without the computational overhead of deliberation.
Rodney Brooks at MIT developed the defining implementation: the subsumption architecture, introduced in his 1986 paper "A Robust Layered Control System for a Mobile Robot" (IEEE Journal of Robotics and Automation, 2(1), DOI: 10.1109/JRA.1986.1087032). Brooks argued that intelligence emerges from direct coupling of perception and action in the physical world — without explicit internal representations, world models, or planning. The agent is defined by layered behaviors (avoid obstacles → wander → follow a person), where higher layers can suppress lower ones. Brooks's robots — Herbert, Allen, Squirt — were physically embodied reactive agents.
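The layering idea can be sketched as simple priority arbitration. This is a minimal illustration, not Brooks's implementation: his robots wired augmented finite state machines together with suppression and inhibition links on individual signal lines, so low-level reflexes could still fire. The behavior names, percept keys, and command strings below are hypothetical.

```python
from typing import Callable, Optional

Percept = dict  # hypothetical sensor readings
Behavior = Callable[[Percept], Optional[str]]


def avoid_obstacles(p: Percept) -> Optional[str]:
    # Layer 0: always proposes a command, so the robot is never idle.
    return "turn_away" if p["obstacle_cm"] < 20 else "forward"


def wander(p: Percept) -> Optional[str]:
    # Layer 1: proposes a new heading only when its (assumed) trigger fires.
    return "random_heading" if p["bored"] else None


def follow_person(p: Percept) -> Optional[str]:
    # Layer 2: active only while a person is in view.
    return "approach_person" if p["person_visible"] else None


LAYERS = [avoid_obstacles, wander, follow_person]  # lowest to highest


def arbitrate(p: Percept) -> Optional[str]:
    """Return the command of the highest layer that is currently active."""
    command = None
    for behavior in LAYERS:
        proposal = behavior(p)
        if proposal is not None:
            command = proposal  # a higher layer suppresses lower output
    return command
```

Note what is absent: no map, no stored history, no plan. Each call to `arbitrate` depends only on the current percept. A side effect of this coarse simplification is that person-following overrides obstacle avoidance entirely, which the finer-grained wiring in a real subsumption design is meant to prevent.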
Russell and Norvig formalize the type as the "simple reflex agent": "if condition → then action." The thermostat is the canonical example: temperature below threshold → activate heat. There is no concept of "last temperature" or "target by 6pm" — only the current percept and its associated action.
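As a sketch, the thermostat rule fits in a single function. The function name, the action strings, and the 20 °C default threshold are illustrative assumptions.

```python
def thermostat(current_temp_c: float, threshold_c: float = 20.0) -> str:
    """Simple reflex rule: the action depends only on the current percept."""
    return "heat_on" if current_temp_c < threshold_c else "heat_off"
```

Calling `thermostat(18.0)` returns `"heat_on"` and `thermostat(22.5)` returns `"heat_off"`. The function keeps no state between calls, which is exactly the limitation the next agent type removes.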
Historical Examples
Modern Examples and Boundary Cases
Pure reactive agents are rare in modern commercial AI — they have been superseded by model-based and learning approaches. However, the reactive pattern persists inside larger systems: the collision-avoidance layer in an autonomous vehicle is reactive even when the overall vehicle system is deliberative. Many "AI agents" marketed in 2025 that simply trigger workflows based on keyword detection or webhook events are functionally reactive, even when built on LLMs.
Primary Sources
2. Model-Based Reflex Agents
Definition
The model-based reflex agent extends the reactive agent with memory. Where a reactive agent sees only the current state, a model-based agent remembers relevant history and maintains a representation of the world sufficient to make better decisions than current perception alone would allow. The agent updates its internal state based on (a) what has happened, (b) what it just did, and (c) a model of how its actions affect the world.
The classic Russell and Norvig example is an automated vacuum cleaner with a map of rooms it has already cleaned: the vacuum's next action depends on where it currently is and which rooms it has visited, not just its current percept. This is structurally distinct from a reactive agent, which would re-clean the same room if randomly placed there again.
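The vacuum example can be sketched as follows. This is a toy model under stated assumptions: the room names, the action strings, and the action model ("cleaning always succeeds") are hypothetical and are not taken from Russell and Norvig's text.

```python
from dataclasses import dataclass, field


@dataclass
class VacuumAgent:
    rooms: list                                 # all rooms the agent knows about
    cleaned: set = field(default_factory=set)   # internal state: rooms known to be clean

    def step(self, location: str, is_dirty: bool) -> str:
        """Choose an action from the current percept plus remembered state."""
        # Assumed action model: "clean" always leaves the room clean, so the
        # room can be recorded as clean whether it was dirty or not.
        self.cleaned.add(location)
        if is_dirty:
            return "clean"
        # Rules still govern action, but they consult memory, not just the percept.
        for room in self.rooms:
            if room not in self.cleaned:
                return f"move_to:{room}"
        return "idle"
```

If this agent is placed back in room A after already visiting A and B, it heads for the next unvisited room rather than treating A as new, because `cleaned` persists across calls.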
Historical Examples
Primary Sources
3. Deliberative Agents
Definition
The deliberative agent adds to the model-based reflex agent the capacity for explicit reasoning about that model — using symbolic logic, planning algorithms, or inference to decide what to do. Where a model-based reflex agent selects actions by matching internal state to condition-action rules, a deliberative agent can reason across chains of hypothetical actions, evaluate their consequences, and select a plan.
The deliberative approach faces two classic problems identified by Wooldridge and Jennings: the transduction problem (how do you translate the messy real world into a precise symbolic description, fast enough for the description to be useful?) and the representation/reasoning problem (how do you represent complex real-world knowledge symbolically, and reason with it in real time?). These are the same problems that limited classical AI — what critics called GOFAI (Good Old-Fashioned Artificial Intelligence) — through the 1980s and 1990s.
Most BDI agents (Type 4 below) are deliberative agents, but not all deliberative agents are BDI agents. STRIPS-based planners, for instance, are deliberative without being BDI. The distinction is whether the deliberation is organized around a belief-desire-intention mental model.
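To make "reasoning across chains of hypothetical actions" concrete, here is a minimal STRIPS-style planner sketch. It uses plain breadth-first search over sets of facts, which is an assumption chosen for brevity; the original STRIPS system used means-ends analysis. The delivery-robot domain is invented for illustration.

```python
from collections import deque
from typing import NamedTuple, Optional


class Action(NamedTuple):
    name: str
    preconditions: frozenset  # facts that must hold before the action
    add: frozenset            # facts the action makes true
    delete: frozenset         # facts the action makes false


def plan(initial: frozenset, goal: frozenset, actions: list) -> Optional[list]:
    """Breadth-first search over symbolic states; returns a shortest plan or None."""
    frontier = deque([(initial, [])])
    seen = {initial}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:  # every goal fact holds in this state
            return steps
        for a in actions:
            if a.preconditions <= state:
                nxt = (state - a.delete) | a.add  # simulate the action's effect
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, steps + [a.name]))
    return None


# Hypothetical domain: fetch a box from a shelf and bring it to the door.
ACTIONS = [
    Action("move_to_shelf", frozenset({"at_door"}), frozenset({"at_shelf"}), frozenset({"at_door"})),
    Action("pick_box", frozenset({"at_shelf", "hand_empty"}), frozenset({"holding_box"}), frozenset({"hand_empty"})),
    Action("move_to_door", frozenset({"at_shelf"}), frozenset({"at_door"}), frozenset({"at_shelf"})),
]
START = frozenset({"at_door", "hand_empty"})
GOAL = frozenset({"at_door", "holding_box"})
steps = plan(START, GOAL, ACTIONS)
```

Here `steps` is `["move_to_shelf", "pick_box", "move_to_door"]`. The planner never acts in the world while searching: it evaluates consequences on the symbolic model first, which is the defining difference from a model-based reflex agent. Nothing in it refers to beliefs, desires, or intentions, which is why it is deliberative without being BDI.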
Historical Examples
Primary Sources
4. BDI Agents (Belief-Desire-Intention)
Definition
BDI agents are a specific subtype of deliberative agent, organized around the philosophical framework introduced by Michael Bratman in 1987. Bratman's key insight is that rational agents don't continuously reconsider their goals — they commit to plans (intentions) and follow through, reconsidering only when there is a clear reason to do so. This commitment is what enables action over extended time horizons without constant re-planning.
The computational operationalization came from Anand Rao and Michael Georgeff (1991), who formalized BDI in modal logic and described the control loop: update beliefs from percepts → generate options (desires) → filter options using beliefs → form intentions → execute. This architecture was implemented in the Procedural Reasoning System (PRS) and later dMARS, then formalized as the programming language AgentSpeak(L) by Rao (1996), and implemented in the Jason interpreter by Hübner and Bordini.
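The control loop described above can be sketched in a few lines. This is a toy illustration, not PRS, dMARS, or AgentSpeak(L): the plan library, belief keys, and step names are hypothetical, and the commitment strategy (finish the current plan before deliberating again) is one simple choice among several discussed in the BDI literature.

```python
# Hypothetical plan library: desire -> context condition over beliefs + plan body.
PLANS = {
    "recharge": {"context": lambda b: b.get("battery", 100) < 20,
                 "steps": ["go_to_dock", "charge"]},
    "deliver": {"context": lambda b: b.get("has_parcel", False),
                "steps": ["go_to_address", "drop_parcel"]},
}


class BDIAgent:
    def __init__(self):
        self.beliefs = {}
        self.intention = None  # the desire the agent has committed to
        self.queue = []        # remaining steps of the committed plan

    def step(self, percept: dict) -> str:
        self.beliefs.update(percept)              # 1. update beliefs from percepts
        if not self.queue:                        # deliberate only when uncommitted
            desires = list(PLANS)                 # 2. generate options
            options = [d for d in desires
                       if PLANS[d]["context"](self.beliefs)]  # 3. filter using beliefs
            if not options:
                return "idle"
            self.intention = options[0]           # 4. form an intention (library order = priority)
            self.queue = list(PLANS[self.intention]["steps"])
        return self.queue.pop(0)                  # 5. execute the next committed step
```

If the battery drops below the threshold midway through a delivery, this agent still finishes the delivery before recharging. That is Bratman's commitment in miniature, and it also shows the cost: a real system needs an explicit reconsideration policy for cases where sticking to the plan is unsafe.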
Historical Examples
BDI and LLM Agents: The Conceptual Parallel
Modern LLM-based agents implement a structurally similar architecture, usually without reference to the BDI literature: the system prompt functions as beliefs (world model and context), the user's objective functions as desire (goal), and the ReAct reasoning trace functions as intention (committed plan). The LLM's tendency to "commit" to a line of reasoning across a multi-step task mirrors Bratman's insight about intention-as-commitment. This parallel is noted in the Agentic History Terminology Archaeology and in the arXiv paper "Agentic AI and Multiagentic: Are We Reinventing the Wheel?" (arXiv:2506.01463, 2025).
Primary Sources
5. Learning Agents
Definition
Learning agents improve their performance over time through experience. Unlike the types above — which operate with fixed rules, fixed models, or fixed architectures — a learning agent adapts based on what happens when it acts. The dominant implementation paradigm is reinforcement learning (RL): the agent receives reward signals from the environment and learns a policy that maximizes cumulative reward.
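One common tabular instance of this idea is Q-learning, which updates an action-value estimate with Q(s, a) ← Q(s, a) + α · (r + γ · max Q(s′, ·) − Q(s, a)). The sketch below applies it to a hypothetical five-cell corridor with a reward at the right end. The environment, hyperparameters, and seed are assumptions chosen to keep the example small.

```python
import random

N_STATES, GOAL = 5, 4   # corridor cells 0..4; reaching cell 4 ends the episode
ACTIONS = (-1, +1)      # step left, step right


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

    def greedy(s):
        # Break ties at random so the untrained agent does not favor one direction.
        best = max(q[(s, a)] for a in ACTIONS)
        return rng.choice([a for a in ACTIONS if q[(s, a)] == best])

    for _ in range(episodes):
        s = 0
        while s != GOAL:
            # Epsilon-greedy: mostly exploit, sometimes explore.
            a = rng.choice(ACTIONS) if rng.random() < epsilon else greedy(s)
            s2 = min(max(s + a, 0), GOAL)          # walls at both ends
            r = 1.0 if s2 == GOAL else 0.0         # reward only at the goal
            future = 0.0 if s2 == GOAL else gamma * max(q[(s2, b)] for b in ACTIONS)
            q[(s, a)] += alpha * (r + future - q[(s, a)])
            s = s2
    return q


q = train()
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)]
```

After training, `policy` prefers `+1` (move right) in every non-goal cell. Nothing in the code states that the reward is on the right; the agent discovers it from experience, which is what separates this type from the fixed-rule agents above.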
Deep reinforcement learning — combining deep neural networks with RL — produced the most dramatic AI capability demonstrations of the 2013–2022 era: DQN playing Atari (Mnih et al., 2013), AlphaGo defeating world champion Lee Sedol (Silver et al., 2016), AlphaZero mastering chess, Go, and shogi from scratch (Silver et al., 2017), and AlphaStar reaching Grandmaster level at StarCraft II (Vinyals et al., 2019). These are all learning agents.
Modern LLMs are also learning agents in a specific sense: they are trained via reinforcement learning from human feedback (RLHF, Christiano et al. 2017) and Constitutional AI (Bai et al. 2022). The "learning" in this case happens during training, not at runtime — which distinguishes them from RL agents that continue learning during deployment.
Historical Examples
Primary Sources
Multi-Agent Coordination
Systems where the organizing question is how multiple agents allocate tasks, communicate, and coordinate.
6. Multi-Agent Systems (MAS)
Definition
Multi-agent systems are not a type of individual agent — they are an organizational structure in which multiple agents (of any of the types above) interact. The key research questions in MAS: how do agents communicate (KQML, FIPA-ACL, MCP); how do they coordinate tasks (Contract Net Protocol, auctions, organizational structures); how do they reach agreement when interests conflict (game theory, mechanism design); and how does emergent collective behavior arise.
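As an illustration of task coordination, here is a minimal sketch of one Contract Net round: a manager announces a task, contractors bid, and the manager awards the task to the best bid. The contractor names, cost tables, and lowest-cost award rule are hypothetical, and the sketch omits the messaging, deadlines, and refusal handling of a full protocol implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Contractor:
    name: str
    costs: dict  # task type -> estimated cost (hypothetical self-assessment)

    def bid(self, task: str) -> Optional[float]:
        # Returning None means "no bid": this agent cannot do the task.
        return self.costs.get(task)


def contract_net(task: str, contractors: list) -> Optional[str]:
    """Run one announce / bid / award round and return the winner's name."""
    bids = {c.name: c.bid(task) for c in contractors}                # announce, collect bids
    valid = {name: cost for name, cost in bids.items() if cost is not None}
    if not valid:
        return None                                                  # nobody can take the task
    return min(valid, key=valid.get)                                 # award to the lowest cost


TEAM = [
    Contractor("agent_a", {"translate": 5}),
    Contractor("agent_b", {"translate": 3, "summarize": 4}),
    Contractor("agent_c", {"summarize": 2}),
]
```

With this team, `contract_net("translate", TEAM)` awards the task to `agent_b` and `contract_net("summarize", TEAM)` awards it to `agent_c`. The manager needs no knowledge of each agent's internals; allocation emerges from the bids.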
The modern LLM-based MAS — AutoGen, CrewAI, LangGraph, OpenAI's multi-agent patterns — are direct descendants of this 40-year research tradition, usually without knowing it. AutoGen's "GroupChat" is structurally analogous to the KQML blackboard architecture. CrewAI's role-based collaboration is structurally analogous to organizational agent frameworks from the 1990s. LangGraph's graph-based orchestration is a formal state-machine approach that mirrors FIPA-compliant agent coordination protocols.
Historical Examples
Modern LLM-Based Multi-Agent Systems
Primary Sources
Modern AI Agent Types Explained
LLM-era agents defined by tool use, software operation, and autonomous task execution.
7. Tool-Use LLM Agents
Definition
The tool-use LLM agent is what most people mean when they say "AI agent" in 2025–2026. It is the category into which AutoGPT, BabyAGI, Claude with tools, ChatGPT with plugins, Devin, and most enterprise agentic AI products fall. The defining architecture is the ReAct loop (Yao et al. 2022): reason → act (call a tool) → observe (receive tool output) → reason → repeat.
The key architectural components: an LLM as the reasoning and planning core; a tool registry (search, code execution, file I/O, API calls); a memory system (conversation history, vector database, or scratchpad); an observation mechanism (tool outputs returned to the context); and a termination condition (task complete, maximum steps reached, or handoff to human).
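These components can be sketched without a real model. In the toy loop below, `scripted_policy` is a stand-in for the LLM call, and the tool names, price data, and step budget are all hypothetical. It illustrates the control flow only and is not any vendor's API.

```python
# Tool registry: name -> callable taking and returning strings (toy data).
TOOLS = {
    "lookup_price": lambda item: {"widget": "4", "gadget": "6"}.get(item, "unknown"),
    "add": lambda expr: str(sum(int(x) for x in expr.split("+"))),
}


def scripted_policy(goal, memory):
    """Stand-in for the LLM: returns (thought, action, argument)."""
    if len(memory) == 0:
        return ("Need the widget price.", "lookup_price", "widget")
    if len(memory) == 1:
        return ("Need the gadget price.", "lookup_price", "gadget")
    if len(memory) == 2:
        # Use the two earlier observations as inputs to the next tool call.
        return ("Add the two prices.", "add", f"{memory[0][2]}+{memory[1][2]}")
    return ("I have the total.", "finish", memory[2][2])


def run_agent(goal, policy, tools, max_steps=5):
    memory = []                                         # scratchpad of past steps
    for _ in range(max_steps):
        thought, action, arg = policy(goal, memory)     # reason
        if action == "finish":                          # termination: task complete
            return arg
        observation = tools[action](arg)                # act: call the tool
        memory.append((thought, action, observation))   # observe: feed the output back
    return None                                         # termination: budget spent, hand off


answer = run_agent("total price of widget and gadget", scripted_policy, TOOLS)
```

Here `answer` is `"10"`. Swapping `scripted_policy` for a function that prompts a model with the goal and the memory would give the usual production shape; the loop, registry, memory, and termination conditions stay the same.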
Tool-use LLM agents are, in the classical framework, a hybrid of model-based and deliberative agents: they maintain context across steps (model-based) and reason about what to do next using that context (deliberative). But the "deliberation" happens in natural language via LLM inference rather than formal symbolic logic, which makes them qualitatively different from classical deliberative agents.
Defining Moments
The category is large enough to have meaningful internal distinctions:
- Single-turn tool-use (chatbot with plugins) — one round of tool calls per user message; not truly agentic
- Autonomous loop agent (BabyAGI, AutoGPT pattern) — self-directed multi-step loops; the core "agent" pattern
- Long-horizon agent (Devin, Claude Code) — sustains coherent goal pursuit over extended sessions (30+ minutes, 100+ steps)
- Specialized domain agent (legal research agent, financial analysis agent) — tool-use LLM agent fine-tuned or constrained to a domain
Primary Sources
8. Computer-Use Agents
Definition
Computer-use agents are a subtype of tool-use LLM agent distinguished by how they interact with software. Where tool-use agents call APIs (structured interfaces designed for machine access), computer-use agents interact with GUIs (interfaces designed for humans). This distinction is practically enormous: most enterprise software has no API, or its API is incomplete, or API access requires developer integration. A computer-use agent can access all of it through the same interface a human employee would use.
The conceptual predecessor is Adept AI's ACT-1 (Action Transformer, September 2022), the first widely publicized demonstration of a transformer model operating web interfaces. Adept's co-founders included two authors of the original Transformer paper (Vaswani et al. 2017): Ashish Vaswani and Niki Parmar. Anthropic's October 22, 2024 release, a public beta, was the first production deployment from a frontier lab; OpenAI's Operator (January 23, 2025) was the first consumer product in the category.
Defining Products
Computer-Use vs Tool-Use: The Key Distinction
Tool-use agents call APIs: search(query="climate change"). Computer-use agents take GUI actions: click the search bar, type "climate change," press Enter, read the results from the screen. The same underlying task; completely different interface. Tool use requires the tool to have been pre-integrated; computer use requires only that the target software have a graphical interface. This is why computer use is considered a qualitative capability expansion: it removes the API-integration bottleneck.
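The contrast can be sketched against a toy application that exposes both surfaces. `ToySearchApp`, its methods, and the element name `search_bar` are hypothetical; a real computer-use agent would work from screenshots and pixel coordinates rather than named elements.

```python
class ToySearchApp:
    """Hypothetical application with a machine-facing API and a human-facing GUI."""

    def __init__(self):
        self.focused = None
        self.typed = ""
        self.screen = "home"

    # API surface: one structured call designed for programs.
    def search(self, query: str) -> str:
        return f"results for {query}"

    # GUI surface: low-level events designed for people.
    def click(self, element: str) -> None:
        self.focused = element

    def type_text(self, text: str) -> None:
        if self.focused == "search_bar":   # keystrokes land only in the focused field
            self.typed += text

    def press(self, key: str) -> None:
        if key == "Enter" and self.typed:
            self.screen = self.search(self.typed)

    def screenshot(self) -> str:
        return self.screen


def tool_use_agent(app: ToySearchApp, query: str) -> str:
    # Requires that someone has already integrated the `search` API.
    return app.search(query=query)


def computer_use_agent(app: ToySearchApp, query: str) -> str:
    # Requires only the interface a human would use.
    for action, arg in [("click", "search_bar"), ("type_text", query), ("press", "Enter")]:
        getattr(app, action)(arg)
    return app.screenshot()   # read the result off the screen


api_result = tool_use_agent(ToySearchApp(), "climate change")
gui_result = computer_use_agent(ToySearchApp(), "climate change")
```

Both paths end with the same result string, but the GUI path takes three actions and depends on interface state (focus, typed text), which is where computer-use agents tend to be slower and more error-prone than direct API calls.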
Primary Sources
Physical and Frontier Agent Types Compared
Agents that operate in physical environments or attempt full research lifecycle automation.
9. Embodied Agents
Definition
Embodied agents face a set of challenges that purely digital agents do not: the physical world is continuous, noisy, and unpredictable; actions can have irreversible consequences; perception is limited by sensor placement and quality; and the agent's own body must be modeled as part of the environment. Rodney Brooks argued in his 1990 paper "Elephants Don't Play Chess" that intelligence cannot be separated from embodiment, and that the kind of intelligence needed in the physical world is fundamentally different from the kind that wins at chess.
The integration of large language models and vision models with physical robotic platforms is the defining trend in embodied AI as of 2024–2026. Google DeepMind's RT-2 (Robotic Transformer 2, 2023) demonstrated that a vision-language model trained on internet data could transfer semantic knowledge to robot control: a robot told "move to the Coke can" could identify the can from visual context, and the same transfer held for objects not seen during robot training. Physical Intelligence's π₀ (2024) and Google's Gemini Robotics (2025) extend this to more complex manipulation tasks.
Historical Lineage
Modern Embodied AI
Primary Sources
10. Autonomous Research Agents
Definition
Autonomous research agents represent the most ambitious current application of agentic AI. Rather than assisting human researchers, they are designed to conduct research autonomously — forming the hypothesis, running the experiment, analyzing the data, and writing the paper. This is the class of agent that comes closest to the predictions made by Dario Amodei ("AI smarter than Nobel Prize winners") and Sam Altman ("AI compressing decades of scientific progress").
The landmark system is Sakana AI's AI Scientist, released August 2024 and developed in collaboration with scientists from Oxford and the University of British Columbia. It generates machine learning research papers for approximately $15 each. AI Scientist v2 (2025) produced the first workshop paper written entirely by AI and accepted through peer review. A Nature paper summarizing the AI Scientist's capabilities was published in 2026 (Lu et al., Nature 651, 914–919, 2026). An independent evaluation (Beel et al., arXiv:2502.14297, 2025) found that the system "significantly limits autonomy" in practice through its reliance on human-authored templates, and that Sakana's claims were more optimistic than independent replication supported.
AlphaFold 2 (Jumper et al., 2021) represents the learning-agent route to scientific discovery: rather than reasoning through the problem step by step, AlphaFold learned to predict protein structure from sequence data. The 2024 Nobel Prize in Chemistry, half of which went jointly to Demis Hassabis and John Jumper (the other half went to David Baker for computational protein design), explicitly recognized AI's role. It was the first Nobel to acknowledge an AI system's scientific contribution as primary rather than merely assistive.
Defining Systems
Frontier Status and Honest Caveats
Autonomous research agents are the most hyped and least mature category. The AI Scientist v1 requires human-authored templates and cannot yet operate without significant human scaffolding, per the independent Beel et al. (2025) evaluation. The systems produce papers, but the quality, novelty, and reproducibility of those papers are actively debated in the research community. This category is where the most ambitious agent predictions are concentrated; it is also where the evidence base is thinnest relative to the claims.
Primary Sources
AI Agent Comparison and Product Placement Matrix
Full Comparison Table
All ten types across the key dimensions that define agent behavior.
| Type | World model? | Plans ahead? | Learns over time? | Uses external tools? | Needs physical body? | Natural language? | Defined by |
|---|---|---|---|---|---|---|---|
| Reactive | No | No | No | No | No | No | Condition-action rules; current percept only |
| Model-Based Reflex | Yes | No | No | No | No | No | Internal state tracks past; rules still govern action |
| Deliberative | Yes | Yes | No | Sometimes | No | No | Symbolic reasoning over explicit world model |
| BDI | Yes (beliefs) | Yes (intentions) | Sometimes | Sometimes | No | No | Beliefs + desires + committed intentions |
| Learning | Often | Often | Yes | Sometimes | Sometimes | Sometimes | Updates behavior from feedback/reward signals |
| Multi-Agent System | Per agent | Per agent | Per agent | Per agent | Per agent | Increasingly | Organizational: coordination among multiple agents |
| Tool-Use LLM Agent | Yes (context window) | Yes (ReAct) | No (at runtime) | Yes (core feature) | No | Yes (required) | LLM + tool calls in a reason-act-observe loop |
| Computer-Use Agent | Yes (screenshot) | Yes | No | Yes (GUI is the tool) | No | Yes | LLM + GUI interaction (click, type, scroll) |
| Embodied Agent | Yes | Yes | Often | Yes (physical actuators) | Yes (defining) | Increasingly | Physical body; sensors + actuators in real world |
| Autonomous Research | Yes | Yes | Yes | Yes | No | Yes | Full research lifecycle from hypothesis to paper |
Where Does Each Product Fall?
The ten types above are not mutually exclusive — most commercial products span multiple categories. Here is how the most frequently discussed AI agent products map to the taxonomy.
Placement review note: The table below is an editorial classification, not a vendor category list. It separates "not an agent," "tool-use agent," "computer-use agent," "embodied agent," and "autonomous research agent" by observable behavior so readers can compare products without relying on inconsistent commercial labels.
| Product | Primary type | Secondary type(s) | Notes |
|---|---|---|---|
| ChatGPT (no tools) | Not an agent | — | Single-turn chatbot. Responds to messages; does not pursue goals across steps. |
| ChatGPT (with tools, Code Interpreter) | Tool-use LLM agent | — | Calls tools within a response; limited multi-step autonomy. |
| Claude (with tools) | Tool-use LLM agent | Computer-use (via computer use API) | Native tool use + MCP + computer use. Benchmarked on long-horizon agentic tasks. |
| AutoGPT | Tool-use LLM agent | Multi-agent (task delegation loop) | Autonomous loop; web browsing + file I/O + code execution. Popularized "AI agent" as a mainstream term in 2023. |
| Devin | Tool-use LLM agent | Long-horizon | Software engineering specialist; shell + editor + browser. ARR $1M → $73M in 9 months. |
| OpenAI Operator | Computer-use agent | Tool-use LLM agent | Browser-native; fills forms, books travel, places orders via GUI interaction. |
| Anthropic Computer Use | Computer-use agent | Tool-use LLM agent | Claude 3.5 Sonnet with screenshot, mouse, keyboard capabilities. First frontier lab production computer-use release. |
| Manus | Computer-use agent | Tool-use LLM agent, multi-agent | General-purpose autonomous agent; combines GUI interaction with multi-step planning. |
| CrewAI / AutoGen / LangGraph | Multi-agent system | Tool-use LLM agent (each sub-agent) | Frameworks, not products. Each coordinates multiple LLM tool-use agents with different roles. |
| AlphaGo | Learning agent | — | Pure RL + MCTS. Not an LLM agent; not a tool-use agent. Canonical RL learning agent. |
| AlphaFold 2 | Learning agent | Autonomous research agent (capability) | Solved protein folding via deep learning. Not agentic in the loop-based sense; achieves scientific discovery through training. |
| Waymo | Embodied agent | Learning agent | Hybrid deliberative + reactive + learning; physically embodied in vehicles. Commercial robotaxi in San Francisco and Phoenix. |
| Boston Dynamics Spot | Embodied agent | Reactive (low-level), deliberative (high-level) | Multi-layer: reactive for balance/collision avoidance; deliberative for mission planning. LLM integration in 2023+ versions. |
| Sakana AI Scientist | Autonomous research agent | Tool-use LLM agent, multi-agent | Full research lifecycle; $15/paper; first workshop paper accepted through peer review (v2). Requires human-authored templates (v1). |
| Jason / BDI agents | BDI agent | Multi-agent system | AgentSpeak(L) interpreter; academic and industrial deployments; the canonical BDI agent platform. |
| Thermostat | Reactive agent | — | The textbook example. Temperature below threshold → heat. No world model, no planning, no learning. |