Primary sources

AI Agent Primary Sources Library

Last updated: June 1, 2026 · Jump to reading paths · Suggest a paper

Every paper that matters to the history of AI agents, cited in full, with DOIs, open-access links, and the context to understand why each one was consequential. This library is organized by era and by concept. At the bottom are guided reading paths for specific questions: what to read if you want to understand BDI, understand the LLM-agent literature from scratch, or trace the line from multi-agent systems to modern agentic frameworks.

AI Agent Papers and Primary Sources

The problem this page solves: everyone in the field says "the ReAct paper" or "the Contract Net Protocol" but most links go to summaries, blog posts, or secondary sources rather than the actual papers. This library goes directly to the primary sources — the original publication, the DOI, the arXiv number where available, and the full citation.

Editorial source noteThis library is a curated primary-source index, not a general bibliography. Entries are included when they introduced a concept, documented a dated capability, became a canonical reference in the AI-agent literature, or provide the original source for a claim used elsewhere on Agentic History. Definitions are cross-checked against the AI Agent Taxonomy, dated events against the timeline, and terminology claims against Terminology Archaeology.

33Papers in library

1971–2026Date range

22Open access

7Reading paths

Access key: Open = freely available PDF. Paywall = institutional access or purchase required; many are available via Google Scholar or Semantic Scholar. Book = full-length book; check your library.

Sections

Evidence standard for the library
Foundations (1971–1995): The pre-LLM agent canon
Multi-agent systems era (1980–2001): Protocols, architectures, languages
Reinforcement learning agents (1992–2019): From Q-learning to AlphaStar
LLM substrate (2017–2020): The technology that made modern agents possible
LLM agents emerge (2022–2023): ReAct, tool use, and the first frameworks
Benchmarking agents (2023–2026): Measuring what agents can actually do
Safety, governance, alignment (2022–2026)
Primary product documents (2023–2025): Official releases
Guided reading paths

Evidence Standard for the Library

What Qualifies as a Primary Source

Original Papers, Official Releases, and Archival Records

A source qualifies for this page when it is the original research paper, book, technical report, product announcement, benchmark release, standards document, or repository that established the claim being cited. Secondary explainers are useful for context, but they do not replace the paper, DOI, arXiv record, official announcement, or repository that documents what happened.

How Sources Are Prioritized

Canonical Influence, Dated Evidence, and Concept Origin

Priority goes to sources that introduced durable vocabulary, became standard citations, or connect directly to claims made across the museum. A paper can belong here because it defined a concept, created an architecture, introduced a benchmark, established a safety problem, or documented the first product release of a capability such as computer use or MCP.

How Access Links Are Maintained

DOIs, arXiv Records, Official PDFs, and Open Copies

When possible, each entry includes a DOI or arXiv identifier and an open-access path. Paywalled entries remain in the library when they are historically important, but the access badge makes that limitation visible. If a PDF link changes, the citation should remain stable through the DOI, publisher page, arXiv record, or repository history.

Correction and Expansion Policy

How New Papers Enter the Canon

New entries should explain why the source belongs in the core canon rather than the broader literature. The standard is not novelty alone; the source should materially affect the history of AI agents, agentic AI, multi-agent systems, tool-use agents, computer-use agents, embodied agents, benchmarks, or governance. Corrections and missing primary sources can be sent to curator@agentichistory.org.

Era I — Foundations (1971–1995)

The papers that established the conceptual vocabulary for AI agents before LLMs existed. Every modern AI agent inherits ideas from this era, usually without knowing it.

Goal-directed reasoning and planning

Fikes, R. E., & Nilsson, N. J. (1971). STRIPS: A New Approach to the Application of Theorem Proving to Problem Solving. Artificial Intelligence, 2(3–4), 189–208.

DOI: 10.1016/0004-3702(71)90010-5 PDF (Stanford) Open Semantic Scholar: Corpus ID 7367095

Introduced the STRIPS planner — a formalism for representing actions by their preconditions and effects, enabling automated sequential planning toward a goal. The first widely used AI planning system and the origin of virtually all subsequent planning-based agent architectures.

Why it matters for agent history Every agent that plans — from BDI systems in the 1990s to ReAct-based LLM agents in 2022 — is solving the same problem STRIPS solved: how does an agent represent actions and their consequences in order to plan a sequence that achieves a goal? The vocabulary of preconditions, effects, and goal states that STRIPS introduced still appears in the LLM agent literature, usually without citation.

Cited by → BDI architectures (1987–1996) · PDDL planning language (1998) · STRIPS is the canonical precursor cited in Bratman 1987's philosophical treatment of intention.

Bratman, M. E. (1987). Intention, Plans, and Practical Reason. Harvard University Press. ISBN: 978-1-57586-054-5.

Book Available via library; excerpts via Google Books

A work of analytical philosophy that provided the theoretical foundation for the Belief-Desire-Intention (BDI) agent architecture. Bratman distinguishes intention from desire: intention involves commitment to a plan and requires that the agent not reconsider unless there is a reason to do so. This stability of commitment is what makes rational action over extended time horizons possible.

Why it matters for agent history BDI is the dominant architecture in academic agent programming from the late 1980s through the 2010s. Bratman's philosophical analysis — beliefs (world model), desires (goals), intentions (committed plans) — maps directly onto the "beliefs, goals, intentions" components of the PRS and dMARS systems, and conceptually onto the "system prompt + objectives + action plan" structure of modern LLM agents. The operationalization of Bratman's framework is Rao & Georgeff (1991) and AgentSpeak(L) (Rao, 1996).

Cited by → Rao & Georgeff 1991 (BDI formalization) · AgentSpeak(L) 1996 · Wooldridge & Jennings 1995 · Shoham 1993

Rao, A. S., & Georgeff, M. P. (1991). Modeling Rational Agents within a BDI-Architecture. In J. Allen, R. Fikes, & E. Sandewall (Eds.), Proceedings of the 2nd International Conference on Principles of Knowledge Representation and Reasoning (KR91), 473–484. Morgan Kaufmann.

Semantic Scholar: Corpus ID 1775228 Open — PDF via CiteSeerX and Semantic Scholar

The paper that formally operationalized Bratman's philosophical BDI framework into a computational model. Rao and Georgeff introduce a modal logic with three primitive modalities — Bel (belief), Goal (desire), and Intend (intention) — and specify the semantics that make BDI agents computationally tractable.

Why it matters for agent history This is the bridge between Bratman's philosophy and the 1990s implementation of agent programming systems (PRS, dMARS, AgentSpeak, Jason). Without Rao & Georgeff's formalization, BDI remains a philosophical theory rather than a programming paradigm. Anand Rao's AgentSpeak(L) (1996) is the direct descendant of this formalization.

Cites → Bratman 1987 · Cited by → AgentSpeak(L) 1996 · Jason interpreter · Wooldridge 1992 (BDI agent logics)

Rao, A. S. (1996). AgentSpeak(L): BDI Agents Speak Out in a Logical Computable Language. In W. Van de Velde & J. W. Perram (Eds.), Agents Breaking Away: Proceedings of the 7th European Workshop on Modelling Autonomous Agents in a Multi-Agent World (MAAMAW-96). Lecture Notes in Artificial Intelligence, vol. 1038, pp. 42–55. Springer.

DOI: 10.1007/BFb0031845 Paywall — Springer; PDF via ResearchGate and CiteSeerX Semantic Scholar: Corpus ID 2787463

Introduced AgentSpeak(L), a logic-based agent programming language that operationalized BDI theory in an executable notation. Agents are specified as sets of beliefs and plans; plan selection is governed by triggering events and context conditions. AgentSpeak(L) provided both a formal specification and a reference for implementation.

Why it matters for agent history AgentSpeak(L) is the direct predecessor of the Jason interpreter (Hübner & Bordini, 2004), which remains the most widely deployed BDI agent programming platform for academic research. It closes the loop from Bratman's philosophy (1987) to a practical, runnable programming language — the same conceptual arc that ReAct (2022) closes for LLM-based agents. This paper was brought to our attention by a curator submission in May 2026 as a significantly underrepresented contribution.

Cites → Bratman 1987 · Rao & Georgeff 1991 · Cited by → Hübner & Bordini, Jason (2004) · Bordini et al. 2007 (Programming Multi-Agent Systems in AgentSpeak using Jason)

Russell, S., & Norvig, P. (1995). Artificial Intelligence: A Modern Approach (1st ed.). Prentice Hall. ISBN: 978-0-13-103805-9. (4th ed., 2020: ISBN 978-0-13-468189-7.)

Book aima.cs.berkeley.edu — companion site with resources

The most widely used AI textbook in the world, organized around the concept of the rational agent. Russell and Norvig define AI as the study of agents that perceive their environment through sensors and act upon it through actuators to maximize a performance measure. The rational agent abstraction becomes the standard organizing principle of academic AI education for the next three decades.

Why it matters for agent history Russell and Norvig cemented "agent" as the standard unit of analysis in AI. Every student who learned AI between 1995 and 2020 learned it through this framework. This is why the word "agent" was already available as vocabulary when LLM-based autonomous systems appeared in 2022 — the concept had been defined as the foundation of AI for 25 years. The textbook's rational-agent definition is broad enough to encompass thermostats and chess engines, which is both its strength (generality) and the source of the definitional ambiguity in "AI agent" today.

Cites → STRIPS (Fikes & Nilsson 1971) · BDI work (Bratman 1987 et al.) · Smith 1980 · Cited by → virtually all subsequent AI agent literature

Shoham's Agent-Oriented Programming

Shoham, Y. (1993). Agent-Oriented Programming. Artificial Intelligence, 60(1), 51–92.

DOI: 10.1016/0004-3702(93)90034-9 Paywall — ScienceDirect; PDF via Semantic Scholar and CiteSeerX Semantic Scholar: Corpus ID 2612202

Proposed treating agents — software systems with beliefs, capabilities, and commitments — as the primary unit of software design, analogous to how object-oriented programming elevated objects. Introduced AGENT-0, a simple agent programming language based on these principles.

Why it matters for agent history Shoham gave the multi-agent systems field its software-engineering formalization. Rather than treating agent theory as purely academic, AOP argued that agents should be the natural unit of software construction. This argument is now being made again — in 2024–2025 language — about "agentic AI" as the new paradigm for enterprise software. The conceptual argument is the same; only the implementation layer changed.

Cites → Bratman 1987 · Rao & Georgeff 1991 · Cited by → Wooldridge & Jennings 1995 · Franklin & Graesser 1997 · virtually all MAS textbooks

Era II — Multi-Agent Systems (1980–2001)

The papers that defined coordination, communication, and organizational structure among multiple autonomous agents.

Coordination protocols

Smith, R. G. (1980). The Contract Net Protocol: High-Level Communication and Control in a Distributed Problem Solver. IEEE Transactions on Computers, C-29(12), 1104–1113.

DOI: 10.1109/TC.1980.1675516 ISSN: 0018-9340 · S2CID: 15267324 PDF (reidgsmith.com) Open 4,259 citations (Semantic Scholar)

Introduced the Contract Net Protocol — a formalization of how autonomous nodes in a distributed problem solver negotiate task allocation through a manager-contractor bidding process. A manager node announces a task; potential contractors submit bids; the manager awards the contract; the contractor executes and reports. The paper also describes the first distributed sensing system using this framework.

Why it matters for agent history The CNP is the founding document of formal multi-agent coordination. It solved a real problem — how do autonomous nodes decide who does what, without a central controller — using a protocol that remains structurally recognizable in how modern multi-agent LLM systems (CrewAI, AutoGen, LangGraph) assign tasks across specialized agents. The paper is the canonical "first" citation in the MAS literature with over 4,000 Semantic Scholar citations. Its basic mechanism has not been superseded in 46 years.

Cited by → Wooldridge & Jennings 1995 · Jennings 1993 (coordination in industrial multi-agent systems) · virtually all MAS surveys · FIPA Contract Net Interaction Protocol (1998)

Defining "intelligent agent"

Wooldridge, M. J., & Jennings, N. R. (1995). Intelligent agents: Theory and practice. The Knowledge Engineering Review, 10(2), 115–152. Cambridge University Press.

DOI: 10.1017/S0269888900008122 PDF (University of Southampton) Open Semantic Scholar: Corpus ID 5303386

The paper that defined "intelligent agent" as a technical term: a hardware or software system with autonomy (operates without direct human intervention), social ability (interacts with other agents), reactivity (responds to environment changes), and pro-activeness (exhibits goal-directed behavior and takes initiative). Surveyed agent architectures, languages, and applications as a field-defining overview.

Why it matters for agent history Wooldridge and Jennings gave the field its canonical definition. The four properties they specify — autonomy, social ability, reactivity, pro-activeness — are the framework against which nearly all subsequent agent definitions are compared. When Gartner, IBM, or Anthropic defines "AI agent" in 2025, they are implicitly working within or against this definition. The paper is also the most cited survey in the MAS literature.

Cites → Smith 1980 · Bratman 1987 · Shoham 1993 · Cited by → Russell & Norvig 1995 · Franklin & Graesser 1997 · Weiss (ed.) Multiagent Systems 1999

Franklin, S., & Graesser, A. (1997). Is It an Agent, or Just a Program?: A Taxonomy for Autonomous Agents. In J. Müller, M. Wooldridge, & N. Jennings (Eds.), Intelligent Agents III: Agent Theories, Architectures, and Languages. Lecture Notes in Computer Science, vol. 1193, pp. 21–35. Springer.

DOI: 10.1007/BFb0013570 PDF (University of Memphis) Open Semantic Scholar: Corpus ID 2204430

Built a taxonomy of autonomous agents that distinguished between simple reactive agents, utility-based agents, goal-based agents, learning agents, and more. The paper was written in 1996 precisely because "agent" had already proliferated without a shared definition in the mid-1990s — an observation equally applicable to 2025.

Why it matters for agent history The definitional problem Franklin and Graesser documented in 1996 — "agent" being applied to anything without a consistent standard — recurs identically in 2025 with "AI agent" and "agentic AI." This paper is both a historical document and a mirror for current definitional debates. The taxonomy it proposes (reactive/deliberative/hybrid/learning) maps onto the types of agents deployed today.

Cites → Wooldridge & Jennings 1995 · Shoham 1993 · Cited by → most agent taxonomy discussions in 2023–2026 literature

Maes, P. (Ed.). (1991). Designing Autonomous Agents: Theory and Practice from Biology to Engineering and Back. MIT Press / Elsevier. ISBN: 978-0-262-63138-5.

Book MIT Press catalog; some chapters available via Google Scholar

The edited volume that gave "autonomous agent" its first major platform as a named technical term. Maes's introduction defines an autonomous agent as "a system situated within and a part of an environment that senses that environment and acts on it, over time, in pursuit of its own agenda." The book collected foundational papers across robotics, biology, and AI.

Why it matters for agent history This volume and its 1995 companion are the earliest primary sources for "autonomous agent" as a defined compound term. Pattie Maes's MIT Media Lab work on agents pioneered the concept of software agents for personal computing — the direct ancestor of what Anthropic and OpenAI are now commercializing. See also the Terminology Archaeology entry.

Maes, P. (1994). Agents that reduce work and information overload. Communications of the ACM, 37(7), 31–40.

DOI: 10.1145/176789.176792 Paywall — ACM DL; PDF via Semantic Scholar Semantic Scholar: Corpus ID 15025575

Appeared in the CACM special issue "Intelligent Agents" (July 1994) — the issue that introduced software agents to mainstream computer science. Maes describes agents that learn user preferences and autonomously filter information and take actions on the user's behalf. The paper describes "personal agents" that would anticipate user needs — essentially describing the AI personal assistant that Bill Gates predicted in 2023.

Why it matters for agent history The 1994 CACM special issue on Intelligent Agents is the publication event that introduced software agents to the mainstream. Maes's specific framing — an agent that reduces work overload by acting autonomously on your behalf — is almost word-for-word the value proposition of modern consumer AI agents (Operator, Apple Intelligence, Claude). The idea is 32 years old; the implementation has finally caught up.

Era III — Reinforcement Learning Agents (1992–2019)

The papers that built the empirical science of agents learning through interaction with environments.

Sutton, R. S., & Barto, A. G. (1998). Reinforcement Learning: An Introduction (1st ed.). MIT Press. (2nd ed. 2018: ISBN 978-0-262-03924-6.)

Free online (2nd ed.) Open incompleteideas.net

The definitive textbook on reinforcement learning — an agent learning to maximize cumulative reward through trial-and-error interaction with an environment. Defines the core framework: state, action, reward, policy, and value function. The 2nd edition (2018) incorporates deep RL developments including the DQN work (Mnih et al. 2013).

Why it matters for agent history RL agents — systems that learn through reward signals rather than explicit programming — are the dominant paradigm for game-playing agents and the conceptual framework behind RLHF (reinforcement learning from human feedback), which is how modern LLMs are fine-tuned. The LLM agent and the RL agent are two distinct lineages that converge: LLMs provide the reasoning substrate; RL provides the learning-from-feedback mechanism. RLHF is what trains ChatGPT, Claude, and every other frontier model to be helpful.

Cited by → Mnih et al. 2013 (DQN) · Silver et al. 2016 (AlphaGo) · RLHF literature · Christiano et al. 2017

Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., & Riedmiller, M. (2013). Playing Atari with Deep Reinforcement Learning. arXiv preprint arXiv:1312.5602. DeepMind Technologies.

arXiv: 1312.5602 Open DOI: 10.48550/arXiv.1312.5602 Semantic Scholar: Corpus ID 15238391 · 13,161 citations

Presented the first deep learning model to learn control policies directly from raw pixel input using reinforcement learning. A convolutional neural network trained with Q-learning achieved superhuman performance on six Atari 2600 games without game-specific engineering. Submitted to NIPS 2013 Workshop on Deep Learning.

Why it matters for agent history The DQN paper launched the era of deep reinforcement learning agents and led directly to AlphaGo (2016), AlphaZero (2017), and AlphaStar (2019). It demonstrated that a general learning algorithm plus sufficient compute could produce superhuman performance on complex tasks — the same fundamental claim being made about LLM agents in 2025. The DQN is the RL-agent lineage's equivalent of AutoGPT: the moment a capability became undeniably real.

Cites → Sutton & Barto 1998 · Cited by → Silver et al. 2016 (AlphaGo) · Schulman et al. 2017 (PPO) · OpenAI Five · AlphaStar

Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., van den Driessche, G., … Hassabis, D. (2016). Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587), 484–489.

DOI: 10.1038/nature16961 Open — nature.com Semantic Scholar: Corpus ID 515925 · 14,000+ citations

Introduced AlphaGo, which defeated European champion Fan Hui 5-0 using a combination of deep neural networks and Monte Carlo tree search. Widely regarded as a pivotal demonstration that deep reinforcement learning could achieve superhuman performance on tasks previously thought to require uniquely human intuition.

Why it matters for agent history AlphaGo is the RL-agent lineage's most important public milestone. Its March 2016 defeat of Lee Sedol received global coverage and established in public consciousness that AI agents could perform at a level genuinely beyond human capability on complex, intuitive tasks. AlphaZero (2017) and AlphaStar (2019) extended this to other domains. The 2024 Nobel Prize in Chemistry awarded to Demis Hassabis and John Jumper (AlphaFold) is the direct continuation of this lineage.

Christiano, P., Leike, J., Brown, T. B., Martic, M., Legg, S., & Amodei, D. (2017). Deep Reinforcement Learning from Human Preferences. Advances in Neural Information Processing Systems 30 (NeurIPS 2017), 4302–4310.

arXiv: 1706.03741 Open Semantic Scholar: Corpus ID 1685732

Introduced the core technique behind RLHF (Reinforcement Learning from Human Feedback) — using human comparisons between agent behaviors to train a reward model, then using that reward model to train the agent. Demonstrated that agents could be trained from human preferences without a programmed reward function, enabling alignment of agent behavior with human intent.

Why it matters for agent history RLHF is the training technique that transformed LLMs from academic curiosities into useful assistants. ChatGPT, Claude, and all frontier models that followed are trained with variants of this technique. Dario Amodei is a co-author — this paper is a direct link between the RL-agent lineage and the founding of Anthropic. Understanding RLHF is essential to understanding why modern AI agents behave as instructed rather than as purely reward-maximizing.

Cites → Sutton & Barto 1998 · Cited by → InstructGPT (Ouyang et al. 2022) · Constitutional AI (Anthropic 2022) · essentially all modern LLM training literature

Era IV — The LLM Substrate (2017–2020)

The papers that built the technology on which all modern AI agents run.

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention Is All You Need. Advances in Neural Information Processing Systems 30 (NeurIPS 2017), 5998–6008.

arXiv: 1706.03762 Open DOI: 10.48550/arXiv.1706.03762 Submitted June 12, 2017 · NeurIPS 2017 · ~93,950 citations (SciSpace)

Proposed the Transformer architecture — a network based entirely on self-attention mechanisms, dispensing with recurrent and convolutional layers. The Transformer processes all input positions in parallel rather than sequentially, enabling far faster training and better modeling of long-range dependencies. Demonstrated state-of-the-art performance on English-German and English-French translation.

Why it matters for agent history Every modern AI agent — GPT-4, Claude, Gemini, Llama, and all their successors — runs on a Transformer or a direct descendant of one. "Attention Is All You Need" is one of the most cited papers in machine learning history. Without the Transformer, none of the modern LLM-agent era exists. It is the substrate on which the entire 2022–2026 agent story runs. Note: Ashish Vaswani, Niki Parmar, and other co-authors later founded Adept AI (ACT-1, 2022), making this paper also the origin of the first computer-use agent team.

Cited by → GPT (Radford et al. 2018) · BERT (Devlin et al. 2018) · GPT-2 (2019) · GPT-3 (2020) · every subsequent LLM · ReAct (Yao et al. 2022)

Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., … Amodei, D. (2020). Language Models are Few-Shot Learners. Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 1877–1901.

arXiv: 2005.14165 Open DOI: 10.48550/arXiv.2005.14165 Semantic Scholar: Corpus ID 218971783 · 20,000+ citations

Introduced GPT-3, demonstrating that scaling language models to 175 billion parameters produces few-shot learning capabilities — the ability to perform new tasks from just a few demonstrations in the prompt, without gradient updates. This emergent capability made GPT-3 the first LLM capable of being repurposed for new tasks via prompting alone.

Why it matters for agent history GPT-3 established the key property that makes LLM-based agents possible: few-shot instruction following. An agent that can be told what to do in natural language, and follows those instructions reliably, is different in kind from any previous AI system. The OpenAI API, opened in June 2020, made GPT-3 callable from external programs — the infrastructure event that enabled the agent ecosystem of 2022–2023.

Cites → Vaswani et al. 2017 · Cited by → InstructGPT (2022) · ReAct (2022) · Chain of Thought (2022) · BabyAGI · AutoGPT

Era V — LLM Agents Emerge (2022–2023)

The papers that defined the modern LLM-agent paradigm: the architecture, the tool-use mechanism, and the first frameworks.

Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E., Le, Q. V., & Zhou, D. (2022). Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35 (NeurIPS 2022).

arXiv: 2201.11903 Open Submitted January 28, 2022 · NeurIPS 2022 DOI: 10.48550/arXiv.2201.11903

Demonstrated that prompting large language models with intermediate reasoning steps ("chain of thought") dramatically improves performance on arithmetic, commonsense, and symbolic reasoning tasks. Chain-of-thought prompting provides a few examples where the model shows its reasoning before giving an answer, enabling the model to decompose complex problems step by step.

Why it matters for agent history Chain-of-thought prompting is the direct precursor to ReAct. Once Wei et al. demonstrated that LLMs could reason step-by-step, the next question was: could those reasoning steps interleave with actions on the world? The answer — yes — is what ReAct (Yao et al. 2022) demonstrated four months later. CoT is the reasoning half of ReAct's reason-act-observe loop.

Cited by → ReAct (Yao et al. 2022) · Tree of Thoughts (Yao et al. 2023) · virtually all subsequent LLM reasoning work

Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., & Cao, Y. (2022/2023). ReAct: Synergizing Reasoning and Acting in Language Models. International Conference on Learning Representations (ICLR 2023).

arXiv: 2210.03629 Open DOI: 10.48550/arXiv.2210.03629 Submitted October 6, 2022 · Published ICLR 2023, Kigali GitHub: ysymyth/ReAct

Proposed ReAct: interleaving reasoning traces ("Thought") with actions ("Act") and observations from an external environment in a single LLM loop. The model reasons about what to do, takes an action (e.g., searches Wikipedia), observes the result, reasons again, and repeats. Demonstrated improvements on question-answering, fact-checking, and interactive tasks over pure reasoning or pure action generation in isolation.

Why it matters for agent history ReAct is the architectural template that virtually every modern LLM agent framework implements. AutoGPT, BabyAGI, LangChain agents, CrewAI, AutoGen, OpenAI Agents SDK, Anthropic's agent scaffolding — all implement a version of the reason-act-observe loop that ReAct defined. It is the single most cited methodological reference in the modern AI-agent literature. The paper was submitted to arXiv six months before AutoGPT; it is the conceptual bridge between GPT-4's capabilities and the agent products of 2023. See the full entry in the timeline.

Cites → Wei et al. 2022 (CoT) · Cited by → AutoGPT (2023) · BabyAGI (2023) · LangChain · CrewAI · AutoGen · OpenAI Agents SDK documentation

Schick, T., Dwivedi-Yu, J., Dessì, R., Raileanu, R., Lomeli, M., Zettlemoyer, L., Cancedda, N., & Scialom, T. (2023). Toolformer: Language Models Can Teach Themselves to Use Tools. Advances in Neural Information Processing Systems (NeurIPS 2023).

arXiv: 2302.04761 Open DOI: 10.48550/arXiv.2302.04761 Submitted February 9, 2023 · Meta AI Research & Universitat Pompeu Fabra Semantic Scholar: Corpus ID 256697342

Introduced Toolformer — a language model fine-tuned to autonomously decide which external APIs to call, when, with what arguments, and how to incorporate results into subsequent generation. Tools include a calculator, search engine, translation system, Q&A system, and calendar. Learned entirely through self-supervised training on a handful of demonstrations per tool.

Why it matters for agent history Toolformer demonstrated that tool use could be learned as a model capability rather than bolted on through prompting. Where ReAct showed that a model could be prompted to use tools, Toolformer showed a model could be trained to use tools. This distinction matters for the development of capable agents: OpenAI's function calling API (June 2023) and Claude's tool use are implementations of this idea at production scale. Toolformer cites ReAct and sits alongside it as a foundational paper in the modern tool-use literature.

Cites → Yao et al. 2022 (ReAct) · Cited by → OpenAI function calling · GPT-4 tool use · HuggingGPT · Gorilla

Shen, Y., Song, K., Tan, X., Li, D., Lu, W., & Zhuang, Y. (2023). HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace. Advances in Neural Information Processing Systems (NeurIPS 2023).

arXiv: 2303.17580 Open Submitted March 30, 2023 · Zhejiang University & Microsoft Research Asia

Proposed using ChatGPT as a "task planner" that decomposes complex requests into subtasks, routes each subtask to the appropriate specialist model on HuggingFace Hub, collects the results, and synthesizes a final response. Demonstrates multi-model coordination as an agent architecture.

Why it matters for agent history HuggingGPT is one of the first papers to demonstrate LLM-as-orchestrator for a multi-model agent system — the architectural pattern that underlies modern agentic frameworks where an LLM coordinates multiple specialized tools or models. It was published the same week as AutoGPT and captures the rapid parallel exploration of LLM agent architectures in March 2023.

Richards, T. B. (2023). Auto-GPT: An Autonomous GPT-4 Experiment. GitHub. March 30, 2023.

GitHub: Significant-Gravitas/AutoGPT Open No formal arXiv / DOI — primary source is the GitHub repository and README

AutoGPT is an open-source Python application that enables GPT-4 to autonomously pursue user-defined goals through a self-prompting loop with web browsing, file operations, code execution, and memory management. Upon release on March 30, 2023, it became the top-trending GitHub repository by April 3 and reached 100,000 stars within weeks — the fastest-growing open-source project of its era.

Why it matters for agent history AutoGPT is the cultural origin of what the public calls "AI agents." The paper that defines the modern autonomous-agent paradigm is ReAct (Yao et al. 2022); the project that introduced it to mass consciousness is AutoGPT. Cataloged here as a primary source because the GitHub repository — not any secondary coverage — is the authoritative record of what was released and when. See the full timeline entry.

Era VI — Benchmarking Agents (2023–2026)

The papers that created rigorous ways to measure what AI agents can actually do — distinct from what their creators claim they can do.

Jimenez, C. E., Yang, J., Wettig, A., Yao, S., Pei, K., Press, O., & Narasimhan, K. (2023/2024). SWE-bench: Can Language Models Resolve Real-World GitHub Issues? International Conference on Learning Representations (ICLR 2024).

arXiv: 2310.06770 Open Submitted October 10, 2023 · ICLR 2024 DOI: 10.48550/arXiv.2310.06770 Live leaderboard: swebench.com

Introduced SWE-bench, an evaluation framework of 2,294 software engineering problems drawn from real GitHub issues across 12 popular Python repositories. A model is given a codebase and an issue description and must generate a patch that passes the associated tests. At launch, the best-performing model (Claude 2) solved only 1.96% of issues.

Why it matters for agent history SWE-bench became the standard benchmark for measuring AI agent capability in software engineering — the most economically important agentic use case. The trajectory from 1.96% (October 2023) to 78.4% (April 2026) is the most precisely documented capability improvement curve in the modern agent literature. It is the empirical backbone for claims about AI coding productivity. Notable: Shunyu Yao (ReAct lead author) is also a co-author of SWE-bench.

Cites → Yao et al. 2022 (ReAct) · Cited by → SWE-agent (2024) · Claude Code evaluations · Devin benchmarks · all 2024–2026 coding agent literature

Liu, X., Yu, H., Zhang, H., Xu, Y., Lei, X., Lai, H., … Tang, J. (2023). AgentBench: Evaluating LLMs as Agents. International Conference on Learning Representations (ICLR 2024).

arXiv: 2308.03688 Open Submitted August 7, 2023 · Tsinghua University & co-authors

Introduced AgentBench, a multi-task benchmark evaluating LLMs across eight distinct environments including web browsing, online shopping, household tasks, digital card games, and coding challenges. The benchmark tests agents across varied reasoning, planning, and tool-use scenarios to provide a broader capability assessment than any single-domain benchmark.

Why it matters for agent history AgentBench recognized that SWE-bench, while rigorous, measured only software engineering. Real-world agent deployment spans far more domains. AgentBench is the multi-domain complement to SWE-bench and reveals that capability gains in one domain (coding) do not automatically transfer to others — an important constraint on claims about "general" AI agents.

Era VII — Safety, Alignment, and Governance (2022–2026)

Bai, Y., Jones, A., Ndousse, K., Askell, A., Chen, A., DasSarma, N., … Kaplan, J. (2022). Constitutional AI: Harmlessness from AI Feedback. arXiv preprint arXiv:2212.08073. Anthropic.

arXiv: 2212.08073 Open Submitted December 15, 2022

Introduced Constitutional AI (CAI), a technique for training AI systems to be helpful and harmless by using AI feedback — rather than human feedback alone — against a set of explicit principles (a "constitution"). The model is trained to critique and revise its own outputs against the principles, then further trained via RLHF on the refined outputs.

Why it matters for agent history CAI is Anthropic's core alignment technique and the basis for Claude's training. For agent history specifically, CAI addresses one of the most critical unsolved problems: how do you ensure an autonomous agent that takes actions in the world — not just generates text — behaves according to human values? The "constitutional" framing explicitly acknowledges that agents need rules governing their behavior, not just capability.

Perez, F., & Ribeiro, I. (2022). Ignore Previous Prompt: Attack Techniques For Language Models. Workshop on Trustworthy and Socially Responsible Machine Learning, NeurIPS 2022.

arXiv: 2211.09527 Open Submitted November 17, 2022

Formally named and characterized prompt injection attacks — where malicious content in the LLM's input causes it to ignore its original instructions and follow attacker-specified ones instead. Documented "direct prompt injection" (user overrides system prompt) and "indirect prompt injection" (environmental content overrides instructions).

Why it matters for agent history Prompt injection is the primary attack vector for AI agents, and this paper is the first formal treatment of the attack class. Every incident in the Failure Archive involving a chatbot that was made to "override its rules" — the Chevy $1 car, the DPD swearing incident, Bing Sydney's persona shift — is a prompt injection incident. As agents gain access to more tools and take more consequential actions, prompt injection becomes a critical security problem rather than a curiosity.

Agentic History Blog. (2026). Long-Horizon Drift Is Becoming the Real Safety Boundary for Enterprise Agents. blog.agentichistory.org.

blog.agentichistory.org Open

Documents the emerging safety problem of long-horizon drift — the phenomenon by which agents pursuing extended goals gradually diverge from their original objectives in subtle ways, particularly as they gain memory and autonomy. Traces the connection to MCP-era tool connectivity and enterprise governance gaps.

Why it matters for agent history Long-horizon drift is the safety problem that emerges specifically from agentic AI (as distinct from chatbot AI): a single-turn response cannot drift; a multi-day autonomous agent can. This is documented contemporaneously by Agentic History's research blog and links to the Gartner 2026 Hype Cycle's concern about agents gaining autonomy at scale.

Era VIII — Primary Product Documents and Releases (2023–2025)

Official announcements and blog posts that are themselves primary sources for the history of AI agents as deployed products.

Anthropic. (2024, November 25). Introducing the Model Context Protocol. Anthropic Blog.

anthropic.com/news/model-context-protocol Open GitHub spec: modelcontextprotocol/specification

Introduced and open-sourced the Model Context Protocol (MCP), an open standard for connecting AI models to external tools, data sources, and applications. MCP defines a client-server architecture using JSON-RPC 2.0 where AI systems (clients) connect to data/tool providers (servers) through a standardized interface.

Why it matters for agent history MCP became the de facto agent-to-tool connectivity standard, adopted by OpenAI, Google, Microsoft, and hundreds of third-party providers within months. It is the modern equivalent of FIPA-ACL (1997) — an industry-wide communication standard for agents. The speed of MCP's adoption is historically remarkable.

Anthropic. (2024, October 22). Introducing computer use, a new Claude 3.5 Sonnet, and Claude 3.5 Haiku. Anthropic Blog.

anthropic.com/news/3-5-models-and-computer-use Open

Announced computer use in public beta — the ability for Claude 3.5 Sonnet to view a computer screen and interact with it by moving the cursor, clicking, and typing. First widely available computer-use capability from a frontier AI lab.

Why it matters for agent history Computer use marks the transition from "agents that call APIs" to "agents that operate any software a human can use." It is a category shift in what agents can access. The product announcement is the primary source for the October 22, 2024 date, and the capability's vocabulary — "computer use" — entered industry usage from this document.

OpenAI. (2025, January 23). Introducing Operator. OpenAI Blog.

openai.com/index/introducing-operator/ Open Wikipedia: OpenAI Operator — initial release January 23, 2025

Announced OpenAI Operator — an AI agent capable of autonomously performing tasks through web browser interactions including filling forms, placing orders, scheduling appointments, and other browser-based tasks. Initially available to ChatGPT Pro subscribers in the United States.

Why it matters for agent history Operator is the first consumer-facing browser-use agent from a frontier lab offered as a commercial product rather than a research preview. It operationalizes the computer-use capability for a mass-market context. The January 23, 2025 date is documented in the Wikipedia article on OpenAI Operator (sourced to the official announcement).

Guided Reading Paths by Question

The library above contains 33 primary sources spanning 55 years. These reading paths are starting points for specific questions. Each path is ordered — read the papers in sequence, not at random, to build the right conceptual scaffolding.

Path 1 — "I want to understand the BDI architecture from first principles"

Estimated time: 8–12 hours across all four papers.

Start: Fikes & Nilsson 1971 (STRIPS) — understand the planning problem first. 30 min.
Bratman 1987 (Intention, Plans, and Practical Reason) — read Chapters 1–3 on the structure of intention. 3 hours.
Rao & Georgeff 1991 (Modeling Rational Agents within a BDI Architecture) — the computational formalization. 2 hours.
Finish: Rao 1996 (AgentSpeak(L)) — the executable language that implements BDI theory. 2 hours.

Path 2 — "I want to understand multi-agent systems from the beginning"

Estimated time: 6–8 hours.

Start: Smith 1980 (Contract Net Protocol) — the foundational coordination mechanism. 1.5 hours.
Shoham 1993 (Agent-Oriented Programming) — agents as software design units. 2 hours.
Wooldridge & Jennings 1995 (Intelligent Agents: Theory and Practice) — the canonical field definition and survey. 3 hours.
Finish: Franklin & Graesser 1997 (Is It an Agent, or Just a Program?) — the taxonomy that reveals why definitions still matter. 1 hour.

Path 3 — "I want to understand modern LLM agents from first principles"

Estimated time: 5–7 hours. No prior AI knowledge required beyond basic familiarity with language models.

Start: Wei et al. 2022 (Chain-of-Thought Prompting) — how LLMs can reason step-by-step. 1.5 hours.
Yao et al. 2022 (ReAct) — the core agent architecture, reason-act-observe. 2 hours.
Schick et al. 2023 (Toolformer) — how models learn to use external tools. 1.5 hours.
Finish: Jimenez et al. 2023 (SWE-bench) — how we actually measure whether agents work. 1 hour.

Path 4 — "I want to trace the complete lineage from 1980 to 2023"

Estimated time: 12–16 hours. The comprehensive path.

Smith 1980 (Contract Net Protocol)
Bratman 1987 (Intention, Plans, Practical Reason) — Chapters 1–3
Wooldridge & Jennings 1995 (Intelligent Agents)
Russell & Norvig 1995 (AI: A Modern Approach) — Chapters 1–2 (rational agent abstraction)
Sutton & Barto 1998 (Reinforcement Learning) — Chapters 1–3 (core framework)
Mnih et al. 2013 (Playing Atari)
Vaswani et al. 2017 (Attention Is All You Need)
Brown et al. 2020 (GPT-3)
Wei et al. 2022 (Chain-of-Thought Prompting)
Yao et al. 2022 (ReAct)
Finish: Richards 2023 (AutoGPT) — the public arrival

Path 5 — "I want to understand AI agent safety and governance risks"

Estimated time: 4–6 hours.

Start: Perez & Ribeiro 2022 (Prompt Injection) — the core attack class. 1 hour.
Bai et al. 2022 (Constitutional AI) — how Anthropic addresses alignment. 2 hours.
Agentic History Failure Archive — what has already gone wrong, with sources.
Finish: Gartner 2026 Hype Cycle governance annotation — the enterprise risk landscape.

Path 6 — "I want to understand reinforcement learning as an agent paradigm"

Estimated time: 8–10 hours.

Start: Sutton & Barto 1998 — Chapters 1–4 (tabular methods and the core framework). 4 hours.
Christiano et al. 2017 (RLHF) — how human preferences become reward signals. 2 hours.
Mnih et al. 2013 (Playing Atari) — RL meets deep learning. 2 hours.
Finish: Silver et al. 2016 (AlphaGo) — RL at superhuman performance. 2 hours.

Path 7 — "I'm building an LLM agent and need the practical literature"

Estimated time: 4–5 hours. Skip history; focus on what you'll implement.

Start: Yao et al. 2022 (ReAct) — the pattern you'll implement. 2 hours.
Schick et al. 2023 (Toolformer) — tool use mechanics. 1.5 hours.
Jimenez et al. 2023 (SWE-bench) — how to measure if it works. 1 hour.
Finish: Perez & Ribeiro 2022 (Prompt Injection) — what to guard against. 1 hour.
Then read: Gartner Hype Cycle annotation for the production deployment landscape.

Suggest a Primary Paper

This library is maintained with the same sourcing standards as the main timeline. To suggest a paper for inclusion, please send to curator@agentichistory.org:

Full citation (authors, title, venue, year, DOI or arXiv number).
Open-access link if available.
A brief explanation of why it belongs in the core primary-sources canon rather than the broader literature.

Priority is given to papers that: (a) introduced a concept the field subsequently adopted, (b) are the original source for claims widely made without citation, or (c) connect the historical MAS/RL/philosophy lineage to the modern LLM-agent literature.