Mathematics Of Agency Study Guide And Gap Analysis

Source: mathematics-of-agency-study-guide-and-gap-analysis.md (ingested 2026-03-28)

The Mathematics of Agency: A Critical Review and Study Guide

Executive Summary

This document synthesizes a comprehensive survey of foundational mathematical frameworks for autonomous agents, examining why critical work was marginalized during hype cycles and what we're currently missing in the "agentic AI" era. The analysis reveals a cyclical pattern: each AI wave builds on benchmarks and demonstrations while ignoring computational costs, uncertainty, and long-term stability—only to rediscover these fundamentals when systems fail at scale.


Part I: Historical Foundations (What We Forgot)

1. Bounded Rationality and Realistic Agent Models

Core Insight: Perfect optimization is neither achievable nor desirable for real agents.

Key Work:

  • Herbert Simon (1955): "A Behavioral Model of Rational Choice"
    • Introduced satisficing vs optimizing
    • Agents operate under cognitive/computational limits
    • Formalized as: find "good enough" solutions within resource constraints

Why It Matters Now: Modern LLM agents assume infinite compute for chain-of-thought reasoning. No cost model exists for thinking itself.

Mathematical Framework:

Agent objective = max[reward - λ·computation_cost - μ·time]

Currently, λ = μ = 0 in most implementations.
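
As a minimal sketch (the names `lam`, `mu` and all numbers are illustrative, not from the source), a cost-aware objective could look like:

```python
def bounded_objective(reward: float, computation_cost: float,
                      elapsed_time: float, lam: float = 0.01,
                      mu: float = 0.001) -> float:
    """Simon-style objective: reward net of thinking and time costs."""
    return reward - lam * computation_cost - mu * elapsed_time

# With lam = mu = 0 (today's default), extra reasoning is "free";
# with nonzero weights, a correct answer can still be a net loss.
free_thinking = bounded_objective(10.0, 5000, 60, lam=0.0, mu=0.0)
costed_thinking = bounded_objective(10.0, 5000, 60)
```

The point is not the particular weights but that the weights exist at all: once λ and μ are nonzero, "think longer" becomes a decision with a price.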

Study Focus:

  • Anytime algorithms
  • Resource-rational analysis
  • Metareasoning (when to stop planning)

2. Cybernetics: The Original Agent Theory

Core Insight: Agents are control systems maintaining invariants through feedback.

Key Work:

  • W. Ross Ashby (1952): "Design for a Brain"

    • Ultrastability: adaptive systems that preserve essential variables
    • Homeostasis as a control law
    • The Law of Requisite Variety: only variety in the regulator can absorb variety in the environment
  • Ashby (1956): "An Introduction to Cybernetics"

    • Formalized regulation: agent must have internal variety matching environment
    • Anticipation and predictive control

Why It Was Ignored:

  • Too interdisciplinary (engineering + biology + psychology)
  • Symbolic AI focused on logic, not dynamics
  • No clear benchmark tasks

Why It's Critical Now:

  • Agents must maintain goal stability over time
  • Self-correction requires feedback loops
  • Multi-agent systems need coordination through shared regulation

Mathematical Tools:

  • State-space models
  • Lyapunov stability
  • Feedback control theory

3. Decision Theory: Belief-Based Agency

Core Insight: Rational agents reason under uncertainty with subjective probabilities.

Key Work:

  • Leonard Savage (1954): "The Foundations of Statistics"

    • Subjective expected utility (SEU)
    • Personal probability
    • Coherence conditions (Dutch book arguments)
  • Richard Jeffrey (1965): "The Logic of Decision"

    • Jeffrey conditioning: belief updates without certainty
    • More realistic than strict Bayesian conditioning, which assumes evidence is learned with certainty

Current Gap: LLM agents don't maintain coherent probability distributions. Confidence scores are post-hoc, not decision-theoretic.

Study Applications:

  • Belief state tracking
  • Preference elicitation under uncertainty
  • Value of information calculations
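
Jeffrey conditioning from above can be sketched directly; the weather example and variable names are illustrative:

```python
def jeffrey_update(prior, partition, new_cell_probs):
    """Jeffrey conditioning: redistribute probability across an evidence
    partition whose cells are NOT learned with certainty.

    prior:          {world: probability}
    partition:      {world: evidence cell containing that world}
    new_cell_probs: {cell: revised probability of that cell}
    """
    cell_mass = {}
    for world, p in prior.items():
        cell = partition[world]
        cell_mass[cell] = cell_mass.get(cell, 0.0) + p
    # Within each cell, relative probabilities are preserved (rigidity).
    return {world: p * new_cell_probs[partition[world]] / cell_mass[partition[world]]
            for world, p in prior.items()}

# A glimpse of the sky raises P(cloudy) from 0.5 to 0.8 -- without certainty.
prior = {"rain": 0.2, "clouds_no_rain": 0.3, "clear": 0.5}
partition = {"rain": "cloudy", "clouds_no_rain": "cloudy", "clear": "clear"}
posterior = jeffrey_update(prior, partition, {"cloudy": 0.8, "clear": 0.2})
```

Setting a cell's new probability to 1 recovers ordinary conditioning as a special case.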

4. Multi-Agent Systems: Game Theory Foundations

Core Insight: Agent interaction requires modeling beliefs about beliefs.

Key Work:

  • Harsanyi (1967-68): Games with incomplete information

    • Type spaces: agents have beliefs about others' types
    • Bayesian games
    • Foundation for mechanism design
  • Aumann (1976): "Agreeing to Disagree"

    • Common knowledge and its limits
    • Rational agents with common priors cannot agree to disagree: if their posteriors are common knowledge, the posteriors must be equal
    • Implications for communication protocols

Current Gap: "Multi-agent systems" today often lack:

  • Explicit belief modeling (theory of mind)
  • Incentive compatibility
  • Communication semantics beyond message passing

Mathematical Framework:

Agent i's belief hierarchy:
- θ_i: agent i's private type
- p_i(θ_{-i}): beliefs about others' types  
- p_i(p_{-i}(·)): beliefs about others' beliefs
- ... (infinite hierarchy)

Harsanyi types collapse this into a single state space.
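
A toy sketch of what deciding against a type space looks like (the payoffs, types, and beliefs below are invented for illustration): a Harsanyi type pins down the opponent's behavior, so agent i best-responds to its belief over types rather than to an infinite hierarchy.

```python
# payoff[my_action][opp_action] -> my payoff (illustrative numbers)
payoff = {
    "cooperate": {"cooperate": 3, "defect": 0},
    "defect":    {"cooperate": 5, "defect": 1},
}
# Conjectured strategy per opponent type, and agent i's belief p_i over types.
opp_strategy = {"nice": "cooperate", "tough": "defect"}
belief = {"nice": 0.7, "tough": 0.3}

def expected_payoff(my_action):
    """Expectation over opponent types, as in a Bayesian game."""
    return sum(belief[t] * payoff[my_action][opp_strategy[t]] for t in belief)

best_response = max(payoff, key=expected_payoff)
```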


5. Information-Theoretic Agency

Core Insight: Agents compress observations into actionable representations.

Key Work:

  • Tishby, Pereira, Bialek (1999): The Information Bottleneck
    • Optimal tradeoff: minimize I(X;Z) while preserving I(Z;Y), i.e. min I(X;Z) − β·I(Z;Y)
    • Representation Z captures task-relevant information about Y from input X
    • Explains representation learning in deep networks
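
The bottleneck tradeoff can be evaluated numerically for a toy discrete channel (the joint distribution, encoder, and β below are invented for illustration). Note the data-processing inequality: with the Markov chain Z — X — Y, the representation can never know more about Y than X did.

```python
import numpy as np

def mutual_info(joint):
    """Mutual information in nats for a joint distribution (2-D array)."""
    p1 = joint.sum(axis=1, keepdims=True)
    p2 = joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / (p1 @ p2)[mask])))

# Toy joint p(x, y) and a noisy encoder p(z|x).
pxy = np.array([[0.4, 0.1],
                [0.1, 0.4]])
pz_given_x = np.array([[0.9, 0.1],
                       [0.1, 0.9]])
px = pxy.sum(axis=1)
pxz = px[:, None] * pz_given_x   # joint p(x, z)
pzy = pz_given_x.T @ pxy         # joint p(z, y)

beta = 2.0
ib_objective = mutual_info(pxz) - beta * mutual_info(pzy)  # minimize this
```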

Extension to Agency:

Agent's representation should:
- Maximize predictive power for value function
- Minimize memory/computation

Study Applications:

  • State abstraction in MDPs
  • Hierarchical representations
  • Transfer learning (what to compress vs preserve)

6. Control Theory: Stability and Robustness

Core Insight: Real agents operate under noise, delays, and model mismatch.

Key Work:

  • Kalman (1960): Kalman Filter

    • Optimal state estimation under Gaussian noise
    • Belief state tracking for POMDPs
  • Todorov (2006): Linearly-solvable optimal control

    • Control as inference
    • Elegant handling of stochastic dynamics
    • Path integral control

Why It Was Ignored by AI:

  • Lived in EE/ME departments
  • Assumed known dynamics
  • RL preferred model-free methods

Why It's Essential Now:

  • Guarantees on convergence
  • Robustness analysis
  • Sample efficiency through model exploitation

Mathematical Tools:

  • Lyapunov functions (stability proofs)
  • LQR/LQG (optimal linear control)
  • H-infinity control (worst-case robustness)
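
Of these tools, LQR is the easiest to make concrete. A sketch for a discrete-time double integrator (the dynamics, weights, and iteration count are illustrative), iterating the Riccati equation to a stabilizing feedback gain:

```python
import numpy as np

def lqr_gain(A, B, Q, R, iters=500):
    """Infinite-horizon discrete LQR via Riccati iteration; control u = -K x."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])  # position integrates velocity
B = np.array([[0.0], [dt]])            # control input accelerates
Q = np.eye(2)                          # state cost
R = np.array([[0.1]])                  # control cost

K = lqr_gain(A, B, Q, R)
# Stability check: closed-loop eigenvalues must lie inside the unit circle.
closed_loop_eigs = np.linalg.eigvals(A - B @ K)
```

This is exactly the kind of guarantee ("the closed loop is provably stable") that the prose above says agent systems currently lack.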

Part II: Current Gaps in "Agentic AI"

Gap 1: No Theory of Agency Itself

What's Missing:

  • When does a system count as one agent vs multiple?
  • What defines goal persistence?
  • When should an agent stop acting?

Philosophical Foundations (ignored):

  • Bratman's intention theory
  • Dennett's intentional stance
  • Frankfurt's hierarchy of desires

Practical Consequence: Agents that:

  • Loop indefinitely
  • Have contradictory goals
  • Can't distinguish "done" from "stuck"

Gap 2: No Cost Model for Cognition

Current Practice:

# Pseudocode of typical agent loop
while not task_complete:
    thought = llm.generate(context, max_tokens=None)  # unbounded "thinking"
    action = parse(thought)
    execute(action)

What's Wrong: No cost for thinking, planning, or tool use.

Correct Formulation (from bounded rationality):

V*(s) = max_a [R(s,a) - c_think·|reasoning| - c_time·delay + γ·E[V*(s')]]

Study Framework:

  • Rational metareasoning (Russell & Wefald)
  • Information value theory
  • Anytime algorithms with performance profiles

Gap 3: Planning Without World Models

Current Practice:

  • Prompt engineering for "planning"
  • Chain-of-thought as surrogate for simulation
  • No explicit dynamics model

What's Missing:

  • Predictive models of action outcomes
  • Counterfactual reasoning (what if I hadn't?)
  • Epistemic vs aleatoric uncertainty

Classical Solution: POMDP framework

Belief state: b(s) = P(s | history)
Action selection: a* = argmax_a ∑_s b(s)·Q(s,a)
Belief update: b'(s') ∝ P(o|s',a)·∑_s P(s'|s,a)·b(s)
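
The belief update above is a Bayes filter and can be sketched directly (the two-state example and observation model below are invented for illustration):

```python
def belief_update(b, a, o, T, O):
    """b'(s') ∝ P(o | s', a) · Σ_s P(s' | s, a) · b(s).
    b: {s: prob}; T[(s, a)]: {s': prob}; O[(s', a)]: {o: prob}."""
    successors = {s2 for (s, act), dist in T.items() if act == a for s2 in dist}
    unnorm = {}
    for s2 in successors:
        predicted = sum(T[(s, a)].get(s2, 0.0) * p for s, p in b.items())
        unnorm[s2] = O[(s2, a)].get(o, 0.0) * predicted
    z = sum(unnorm.values())
    return {s2: p / z for s2, p in unnorm.items()}

# Static hidden state, noisy sensor: one positive reading shifts the belief.
T = {("good", "listen"): {"good": 1.0}, ("bad", "listen"): {"bad": 1.0}}
O = {("good", "listen"): {"hear_good": 0.85, "hear_bad": 0.15},
     ("bad", "listen"): {"hear_good": 0.15, "hear_bad": 0.85}}
belief = belief_update({"good": 0.5, "bad": 0.5}, "listen", "hear_good", T, O)
```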

Why It Matters: Without world models, agents cannot:

  • Anticipate side effects
  • Plan multi-step sequences reliably
  • Learn from counterfactuals

Gap 4: Memory as Database, Not Cognition

Current Practice:

memory = VectorDB()
memory.add(experience)
relevant = memory.query(current_context, k=5)

What's Missing:

  • Forgetting (when and what)
  • Abstraction (concept formation)
  • Consolidation (integration of new beliefs)
  • Non-monotonic reasoning (belief revision)

Cognitive Science Foundations:

  • Spacing effect
  • Interference theory
  • Schema theory
  • ACT-R memory model

Mathematical Framework:

  • Jeffrey conditioning (soft updates)
  • AGM belief revision postulates
  • Memory decay models (exponential, power law)
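
The two decay families can be compared directly (time constants are illustrative); the practical difference is the tail, since power-law memories retain far more at long delays:

```python
import math

def exponential_retention(t, tau=10.0):
    """P(recall after delay t) under exponential decay."""
    return math.exp(-t / tau)

def power_law_retention(t, alpha=0.5):
    """P(recall after delay t) under power-law (Wickelgren-style) decay."""
    return (1.0 + t) ** (-alpha)
```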

Gap 5: Multi-Agent Coordination Is Superficial

Current "Multi-Agent" Systems:

agents = [Agent1(), Agent2(), Agent3()]
for agent in agents:
    agent.act(shared_context)  # No real coordination

Missing Components:

  • Common knowledge establishment
  • Theory of mind (belief about beliefs)
  • Incentive alignment
  • Coalition formation

Game-Theoretic Foundations:

  • Correlated equilibrium (coordination devices)
  • Mechanism design (incentive compatibility)
  • Communication complexity
  • Signaling games
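
Correlated equilibrium is the most directly checkable of these. A sketch for the game of Chicken (standard textbook payoffs; the mediator distribution is the classic one): a coordination device draws a joint recommendation, and obeying must beat deviating conditional on what each player is told.

```python
# payoff[(row, col)] = (row player's payoff, column player's payoff)
payoff = {
    ("C", "C"): (6, 6), ("C", "D"): (2, 7),
    ("D", "C"): (7, 2), ("D", "D"): (0, 0),
}
# Correlation device: never recommend mutual Dare.
device = {("C", "C"): 1/3, ("C", "D"): 1/3, ("D", "C"): 1/3}

def obedience_gain(rec, deviation):
    """Row player's expected gain from obeying recommendation `rec`
    rather than playing `deviation`, conditional on being told `rec`."""
    mass = sum(p for (r, _), p in device.items() if r == rec)
    gain = 0.0
    for (r, c), p in device.items():
        if r == rec:
            gain += (p / mass) * (payoff[(rec, c)][0] - payoff[(deviation, c)][0])
    return gain
```

Obedience is weakly optimal after every recommendation, so the device is a correlated equilibrium, which is the formal version of "coordination without centralized control."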

Gap 6: Embodiment Constraints Ignored

Even "software agents" are embodied in:

  • API rate limits
  • Latency
  • Irreversible actions
  • Safety constraints

Missing Theory:

  • Affordances (Gibson): what actions are possible?
  • Viability kernels: what states are safe?
  • Precondition modeling
  • Graceful degradation

Example: An agent calling a payment API should model:

  • Idempotency (can I retry?)
  • Reversibility (can I undo?)
  • Rate limits (when can I act again?)
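
These constraints can be made first-class in the action interface; a minimal sketch (the class and field names are invented for illustration):

```python
from dataclasses import dataclass

@dataclass
class ActionProfile:
    """Embodiment constraints attached to a tool call."""
    idempotent: bool         # safe to retry after an ambiguous failure?
    reversible: bool         # does a compensating action exist (e.g. refund)?
    rate_limit_per_min: int  # how often may the agent act?

def safe_to_retry(profile: ActionProfile, ambiguous_failure: bool) -> bool:
    """Retry only when a duplicate execution cannot cause harm."""
    if ambiguous_failure and not profile.idempotent:
        return False  # the first call may have succeeded; retry could double-charge
    return True

charge_card = ActionProfile(idempotent=False, reversible=True, rate_limit_per_min=10)
```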

Gap 7: Evaluation Is Broken

Current Metrics:

  • Task success rate
  • Benchmark scores
  • Human preference

What's Not Measured:

  • Long-term stability (does performance decay?)
  • Goal drift (does the agent change what it wants?)
  • Robustness to distribution shift
  • Recovery from self-error

Needed Framework:

Agent quality = f(
    short_term_reward,
    long_term_stability,
    robustness_to_perturbation,
    interpretability,
    cost_efficiency
)

Gap 8: Alignment Beyond Instruction Following

Current Approach:

  • RLHF on human preferences
  • Constitutional AI (rules + feedback)
  • Prompt engineering for "safety"

Deep Problems:

  • Value learning under uncertainty
  • Corrigibility (accepting correction)
  • Shutdownability
  • Handling preference change over time

Research Areas (under-explored):

  • Cooperative inverse RL
  • Assistance games
  • Value alignment problem (Stuart Russell)
  • Embedded agency (agent models including itself)

Part III: Post-Bubble Fundamentals (What's Coming)

The Pattern Recognition

Every AI hype cycle:

  1. Demo phase: Focus on impressive outputs
  2. Scale phase: "More compute/data will fix it"
  3. Reality phase: Costs explode, edge cases dominate
  4. Fundamentals phase: Return to ignored math

We're currently at stage 2-3 for "agentic AI."


What the Correction Will Look Like

Shift 1: From Vibes to Variational Principles

Current:

"The agent uses chain-of-thought to reason about the problem"

Future:

"The agent minimizes expected free energy under computational constraints"

Shift 2: Explicit Cost Accounting

All agent systems will include:

  • Compute budgets
  • Time discounting for thinking
  • Memory costs
  • Communication costs

Shift 3: Fewer, Better Agents

Instead of:

  • "Swarms" of simple agents
  • One mega-agent for everything

We'll see:

  • Carefully designed agent boundaries
  • Explicit interfaces and contracts
  • Provable coordination properties

Shift 4: Return to Control Theory

  • Model-based RL will dominate
  • Stability guarantees required
  • Safety envelopes explicit
  • Formal verification for critical applications

Part IV: Study Roadmap

Prerequisites

Mathematics:

  • Probability theory (measure-theoretic)
  • Optimization (convex, stochastic)
  • Dynamical systems
  • Information theory
  • Game theory basics

Computer Science:

  • Algorithms (complexity, approximation)
  • Formal methods (logic, verification)
  • Systems (distributed computing, databases)

Core Curriculum

Module 1: Decision Theory (4 weeks)

Texts:

  • Savage: "The Foundations of Statistics" (Ch 1-5)
  • Jeffrey: "The Logic of Decision" (Ch 1, 4, 11)
  • Gilboa: "Theory of Decision Under Uncertainty"

Exercises:

  • Prove Dutch book theorem
  • Implement Jeffrey conditioning
  • Model agent preference learning from observations
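
For the Dutch book exercise, the finite case can also be checked numerically (a sketch; the function names are ours): incoherent prices over a partition hand a bookie a risk-free profit.

```python
def coherence_gap(credences):
    """Credences over an exhaustive, mutually exclusive partition.
    Coherence (de Finetti) requires them to sum to exactly 1."""
    return sum(credences.values()) - 1.0

def bookie_profit(credences):
    """If total price != 1, the bookie sells (or buys) a unit bet on every
    event at these prices; exactly one event occurs, so total payout is
    exactly 1, and the gap is locked in as risk-free profit."""
    return abs(coherence_gap(credences))

incoherent = {"rain": 0.7, "no_rain": 0.5}  # prices sum to 1.2
coherent = {"rain": 0.7, "no_rain": 0.3}
```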

Module 2: Reinforcement Learning Foundations (6 weeks)

Texts:

  • Sutton & Barto: "Reinforcement Learning" (2nd ed)
  • Bertsekas: "Dynamic Programming and Optimal Control"
  • Puterman: "Markov Decision Processes"

Focus Areas:

  • Bellman operators and contraction mappings
  • Policy gradient methods (proper derivations)
  • POMDP planning algorithms
  • Regret bounds and sample complexity

Module 3: Control Theory (5 weeks)

Texts:

  • Åström & Murray: "Feedback Systems"
  • Todorov: Papers on optimal control theory
  • Tedrake: "Underactuated Robotics"

Key Concepts:

  • Lyapunov stability
  • LQR/LQG
  • Model predictive control
  • Robust control (H-infinity)

Module 4: Multi-Agent Systems (5 weeks)

Texts:

  • Osborne & Rubinstein: "A Course in Game Theory"
  • Shoham & Leyton-Brown: "Multiagent Systems"
  • Fudenberg & Tirole: "Game Theory"

Focus:

  • Bayesian games (Harsanyi)
  • Mechanism design
  • Communication complexity
  • Learning in games

Module 5: Information Theory & Bounded Rationality (4 weeks)

Texts:

  • Cover & Thomas: "Elements of Information Theory"
  • Ortega & Braun: Papers on thermodynamics of computation
  • Tishby: Information bottleneck papers

Applications:

  • Rate-distortion theory for agents
  • Free energy principle (Friston)
  • Resource-rational analysis

Module 6: Cybernetics & Systems Theory (3 weeks)

Texts:

  • Ashby: "An Introduction to Cybernetics"
  • Ashby: "Design for a Brain"
  • Wiener: "Cybernetics"

Modern Connections:

  • Homeostatic control in RL
  • Meta-learning as ultrastability
  • Variety matching in multi-agent systems

Advanced Topics

Topic A: Formal Verification for Agents

Goal: Prove properties of agent systems

Tools:

  • Temporal logic (LTL, CTL)
  • Model checking
  • Abstract interpretation
  • Probabilistic verification

Application: Safety guarantees for autonomous systems

Topic B: Embedded Agency

Goal: Agents that model themselves

Challenges:

  • Self-reference and paradoxes
  • Logical uncertainty
  • Updatelessness
  • Grain of truth problem

Key Papers: MIRI's agent foundations work

Topic C: Compositional Agent Design

Goal: Build complex agents from verified components

Framework:

  • Category theory for agent composition
  • Open games (Ghani et al.)
  • String diagrams
  • Interface specifications

Part V: Practical Implementation Guide

Building a "Real" Agent: Checklist

1. Define the Agent Boundary

  • What is the agent's scope of control?
  • What is environment vs self?
  • What actions are reversible?
  • What states are safe/viable?

2. Specify Objectives with Costs

class AgentObjective:
    def __init__(self):
        self.reward_function = ...
        self.compute_cost = ...  # per inference step
        self.time_cost = ...     # per wall-clock second
        self.memory_cost = ...   # per byte stored
        
    def value(self, state, action, resources_used):
        return (self.reward_function(state, action) 
                - self.compute_cost * resources_used.compute
                - self.time_cost * resources_used.time
                - self.memory_cost * resources_used.memory)

3. Implement Explicit World Models

class WorldModel:
    def predict(self, state, action) -> Distribution[State]:
        """Returns distribution over next states"""
        pass
    
    def epistemic_uncertainty(self, state) -> float:
        """Model uncertainty (reducible with more data)"""
        pass
    
    def aleatoric_uncertainty(self, state, action) -> float:
        """Environmental stochasticity (irreducible)"""
        pass

4. Design Memory with Forgetting

class AgentMemory:
    def store(self, experience, importance):
        """Store with priority/importance weighting"""
        pass
    
    def decay(self, time_elapsed):
        """Probabilistic forgetting based on age and access"""
        pass
    
    def consolidate(self):
        """Abstract/compress old memories"""
        pass

5. Add Metareasoning

class MetaReasoner:
    def should_continue_planning(self, current_plan, next_step_duration) -> bool:
        """VOI: is one more planning step worth its marginal cost?"""
        expected_improvement = self.estimate_plan_improvement(current_plan)
        # Compare against the cost of the NEXT step; time already spent
        # is sunk and should not enter the stopping decision.
        marginal_cost = self.compute_cost * next_step_duration
        return expected_improvement > marginal_cost

6. Instrument for Long-Term Evaluation

class AgentMonitor:
    def track_metrics(self):
        return {
            'immediate_reward': ...,
            'goal_drift': self.measure_goal_change_over_time(),
            'belief_coherence': self.dutch_book_vulnerability(),
            'robustness': self.performance_under_perturbation(),
            'cost_efficiency': self.reward_per_unit_compute()
        }

Part VI: Research Frontiers

Open Problems

1. Computational Theory of Agency

Question: What is the minimal formal definition of an agent?

Approaches:

  • Category-theoretic agents
  • Agent-environment boundary as a Markov blanket
  • Information-theoretic definitions (causal emergence)

2. Bounded Rationality at Scale

Question: How should agents allocate compute across:

  • Planning
  • Learning
  • Acting
  • Monitoring

Tools:

  • Optimal stopping theory
  • Multi-armed bandits for metareasoning
  • Rational metareasoning frameworks

3. Multi-Agent Emergence

Question: When do independent agents form coalitions, protocols, or institutions?

Relevant Fields:

  • Evolutionary game theory
  • Mechanism design
  • Social choice theory
  • Distributed systems (consensus)

4. Safe Exploration for Embodied Agents

Question: How do agents learn without catastrophic failures?

Approaches:

  • Viability theory
  • Reachability analysis
  • Safe RL (constrained MDPs)
  • Shield synthesis

Conclusion: The Coming Correction

Why the Current Bubble Will Pop

Symptom 1: Agents are too expensive

  • Cost per task grows linearly with complexity
  • No architectural efficiency improvements
  • Compute scaling plateaus

Symptom 2: Reliability doesn't improve with scale

  • More parameters ≠ better long-term behavior
  • Edge cases dominate real deployments
  • No stability guarantees

Symptom 3: Multi-agent systems don't compose

  • Swarms collapse into chaos
  • No coordination without centralized control
  • Emergent behavior is unpredictable

What Comes Next

Phase 1 (2024-2025): Reality Check

  • High-profile agent failures
  • Cost explosion for production systems
  • Realization that demos ≠ products

Phase 2 (2025-2027): Fundamentals Renaissance

  • Return to control theory
  • Explicit cost models
  • Formal verification requirements
  • Model-based methods resurgence

Phase 3 (2027+): Mature Agent Engineering

  • Standardized agent architectures
  • Compositional design patterns
  • Provable properties
  • Boring reliability

How to Position for the Correction

For Researchers:

  • Work on cost-aware agent design
  • Develop long-term evaluation benchmarks
  • Bridge to control theory and decision theory
  • Focus on guarantees, not demos

For Engineers:

  • Instrument everything (costs, stability, drift)
  • Build world models explicitly
  • Design for composition and verification
  • Avoid "vibes-based" architecture

For Organizations:

  • Invest in fundamentals teams
  • Hire people who know the ignored math
  • Build for 10-year timelines, not 10-month demos
  • Establish rigorous evaluation beyond benchmarks

Appendix: Reading List

Tier 1: Essential Foundations

  1. Simon - "A Behavioral Model of Rational Choice"
  2. Ashby - "An Introduction to Cybernetics"
  3. Savage - "The Foundations of Statistics"
  4. Sutton & Barto - "Reinforcement Learning" (full book)
  5. Osborne & Rubinstein - "A Course in Game Theory"

Tier 2: Deep Dives

  1. Bertsekas - "Dynamic Programming and Optimal Control"
  2. Åström & Murray - "Feedback Systems"
  3. Tishby et al. - "The Information Bottleneck Method"
  4. Harsanyi - "Games with Incomplete Information"
  5. Todorov - "Optimal Control Theory" (survey paper)

Tier 3: Frontier Topics

  1. Friston - "The Free Energy Principle"
  2. Ortega & Braun - "Thermodynamics as a theory of decision-making"
  3. Soares et al. - "Agent Foundations for Aligning Machine Intelligence"
  4. Russell - "Human Compatible"
  5. Open Games literature (Jules Hedges et al.)

Final Note: The companies that survive the correction will be those that invested in these fundamentals early—treating agents as engineered systems with costs, constraints, and formal properties, not magical text-generators with tool use bolted on.