I. Theoretical Definition of Agency
Agency in Large Language Models (LLMs) is defined as the capacity to
generate goal-directed actions alongside token predictions. Unlike standard "chat" models,
which optimize strictly for next-token likelihood $P(x_t \mid x_{<t})$, agentic models must
also select actions that affect their environment. As formalized by the Agency Hypothesis,
true intelligence requires valid grounding in an interactive environment. The system
operates in a continuous control loop:
$$ S_t \xrightarrow{\pi} A_t \xrightarrow{\text{env}} O_{t+1}, R_{t+1} $$ where $S$ is the
State, $A$ is the Action, $O$ is the Observation, and $R$ is the
Reward/Feedback.

The Perception Loop
The central challenge is converting unstructured environmental feedback (API errors,
HTML DOMs, sensor logs) into textual representations that fit the model's context
window without cache saturation.
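As a minimal sketch of this control loop, the fragment below pairs a toy numeric environment with a binary-search stand-in for the LLM policy $\pi$, and a perception step that truncates feedback to fit a context budget. The names `Env`, `perceive`, and `policy` are hypothetical illustrations, not an API from any of the cited systems.

```python
from dataclasses import dataclass


@dataclass
class Env:
    """Hypothetical environment: the agent must guess a hidden number.

    step() implements the transition A_t -> O_{t+1}, R_{t+1}, returning
    textual feedback plus a scalar reward."""
    target: int = 7

    def step(self, action: int) -> tuple[str, float]:
        if action == self.target:
            return "correct", 1.0
        return ("too low" if action < self.target else "too high"), 0.0


def perceive(raw_feedback: str, max_tokens: int = 32) -> str:
    """Perception step: truncate raw feedback so it fits the context window."""
    return " ".join(raw_feedback.split()[:max_tokens])


def policy(state: list[str]) -> int:
    """Stand-in for an LLM policy pi: binary search over past textual feedback."""
    lo, hi = 0, 100
    for obs in state:
        guess = (lo + hi) // 2
        if "too low" in obs:
            lo = guess + 1
        elif "too high" in obs:
            hi = guess - 1
    return (lo + hi) // 2


def run_agent(env: Env, max_steps: int = 10) -> tuple[int, float]:
    state: list[str] = []                 # S_t: accumulated textual observations
    action, reward = 0, 0.0
    for _ in range(max_steps):
        action = policy(state)            # A_t = pi(S_t)
        obs, reward = env.step(action)    # environment transition
        state.append(perceive(obs))       # O_{t+1} folded back into the state
        if reward > 0:
            break
    return action, reward
```

Replacing `policy` with a real LLM call is the only structural change needed: the loop, the textual state, and the perception truncation carry over unchanged.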
Primary Sources & Further Reading
- Yao et al. (2022). ReAct: Synergizing Reasoning and Acting in Language Models.
- Shinn et al. (2023). Reflexion: Language Agents with Verbal Reinforcement Learning.
- Wang et al. (2023). Voyager: An Open-Ended Embodied Agent with Large Language Models.
- Wu et al. (2023). AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation.
- Lilian Weng (2023). LLM Powered Autonomous Agents (Blog).
- OpenAI (2024). Function Calling and Tool Use Guides.