I. Theoretical Definition of Agency
Agency in Large Language Models (LLMs) is defined as the capacity to generate goal-directed actions rather than token predictions alone. Unlike standard "chat" models, which optimize strictly for next-token likelihood \( P(x_t \mid x_{<t}) \), an agentic system optimizes a multi-step objective in which intermediate tokens (reasoning traces, or "thoughts") serve as latent variables guiding external execution.
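To make the latent-variable framing concrete, the action distribution can be written as a marginal over reasoning traces. As a sketch (the symbols here are illustrative, not drawn from a specific source): $$ P_\theta(a \mid s) = \sum_{z} P_\theta(a \mid z, s)\, P_\theta(z \mid s) $$ where \( s \) is the current context, \( a \) the emitted action, and \( z \) a sampled reasoning trace. The model first samples a thought conditioned on the state, then conditions its action on both.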
The Agency Hypothesis formalizes this requirement: genuine intelligence demands grounding in an interactive environment. The system operates in a continuous control loop: $$ S_t \xrightarrow{\pi} A_t \xrightarrow{\text{env}} O_{t+1},\, R_{t+1} $$ where \( S \) is the State, \( A \) the Action, \( O \) the Observation, \( R \) the Reward/Feedback, and \( \pi \) the policy implemented by the LLM.
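A minimal sketch of this loop in Python, assuming a hypothetical `llm_policy` (maps state text to an action string) and an `environment` exposing `reset()` and `step()`; neither interface comes from a specific library:

```python
from dataclasses import dataclass, field

@dataclass
class Transcript:
    """Running record of the S -> A -> O, R loop."""
    events: list = field(default_factory=list)

def run_agent(llm_policy, environment, max_steps: int = 10) -> Transcript:
    """Drive the control loop S_t --pi--> A_t --env--> O_{t+1}, R_{t+1}."""
    transcript = Transcript()
    state = environment.reset()  # initial state S_0 as text
    for _ in range(max_steps):
        action = llm_policy(state)                     # pi: state -> action (an LLM call)
        obs, reward, done = environment.step(action)   # env executes A_t
        transcript.events.append((state, action, obs, reward))
        if done:
            break
        # Fold the new observation back into the context to form S_{t+1}.
        state = f"{state}\nACTION: {action}\nOBSERVATION: {obs}"
    return transcript
```

The key design point is that the state is textual: each iteration appends the action and its observed consequence, so the policy always conditions on the full interaction history.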
The Perception Loop
The perception loop converts unstructured environmental feedback (API errors, HTML DOMs, sensor logs) into compact textual representations that fit the model's context window without saturating the KV cache.
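As an illustration, one common tactic is to clip each observation to a fixed budget before appending it to the prompt. A sketch; the character budget and the head-and-tail truncation heuristic are assumptions, not a prescribed method:

```python
def compact_observation(raw: str, budget_chars: int = 2000) -> str:
    """Fit a raw observation (stack trace, DOM dump, log) into a fixed budget.

    Keeps the head and tail of the text, since errors and final states
    usually carry the most signal; the middle is elided with a marker.
    """
    if len(raw) <= budget_chars:
        return raw
    head = raw[: budget_chars // 2]
    tail = raw[-(budget_chars // 2):]
    return f"{head}\n... [{len(raw) - budget_chars} chars elided] ...\n{tail}"

# Example: compacting a long HTML DOM before it enters the context window.
dom = "<html>" + "<div>row</div>" * 5000 + "</html>"
print(compact_observation(dom, budget_chars=200))
```

Head-and-tail truncation is only one option; summarizing the observation with a cheaper model or extracting structured fields are common alternatives.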
Primary Sources & Further Reading
- Yao et al. (2022). ReAct: Synergizing Reasoning and Acting in Language Models.
- Shinn et al. (2023). Reflexion: Language Agents with Verbal Reinforcement Learning.
- Wang et al. (2023). Voyager: An Open-Ended Embodied Agent with Large Language Models.
- Wu et al. (2023). AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation.
- Lilian Weng (2023). LLM Powered Autonomous Agents (Blog).
- OpenAI (2024). Function Calling and Tool Use Guides.