Mathematics Of Agency Study Guide And Gap Analysis

Source: mathematics-of-agency-study-guide-and-gap-analysis.md (ingested 2026-03-28)

The Mathematics of Agency: A Critical Review and Study Guide

Executive Summary

This document synthesizes a comprehensive survey of foundational mathematical frameworks for autonomous agents, examining why critical work was marginalized during hype cycles and what we're currently missing in the "agentic AI" era. The analysis reveals a cyclical pattern: each AI wave builds on benchmarks and demonstrations while ignoring computational costs, uncertainty, and long-term stability—only to rediscover these fundamentals when systems fail at scale.


Part I: Historical Foundations (What We Forgot)

1. Bounded Rationality and Realistic Agent Models

Core Insight: Perfect optimization is neither achievable nor desirable for real agents.

Key Work:

  • Herbert Simon (1955): "A Behavioral Model of Rational Choice"
    • Introduced satisficing vs optimizing
    • Agents operate under cognitive/computational limits
    • Formalized as: find "good enough" solutions within resource constraints

Why It Matters Now: Modern LLM agents assume infinite compute for chain-of-thought reasoning. No cost model exists for thinking itself.

Mathematical Framework:

Agent objective = max[reward - λ·computation_cost - μ·time]

Currently, λ = μ = 0 in most implementations.
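
As a minimal sketch (the names `lam`, `mu` and all numbers are illustrative, not from the source), a cost-aware objective could look like:

```python
def bounded_objective(reward: float, computation_cost: float,
                      elapsed_time: float, lam: float = 0.01,
                      mu: float = 0.001) -> float:
    """Simon-style objective: reward net of thinking and time costs."""
    return reward - lam * computation_cost - mu * elapsed_time

# With lam = mu = 0 (today's default), extra reasoning is "free";
# with nonzero weights, a correct answer can still be a net loss.
free_thinking = bounded_objective(10.0, 5000, 60, lam=0.0, mu=0.0)
costed_thinking = bounded_objective(10.0, 5000, 60)
```

The point is not the particular weights but that the weights exist at all: once λ and μ are nonzero, "think longer" becomes a decision with a price.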

Study Focus:

  • Anytime algorithms
  • Resource-rational analysis
  • Metareasoning (when to stop planning)

2. Cybernetics: The Original Agent Theory

Core Insight: Agents are control systems maintaining invariants through feedback.

Key Work:

  • W. Ross Ashby (1952): "Design for a Brain"

    • Ultrastability: adaptive systems that preserve essential variables
    • Homeostasis as a control law
    • The Law of Requisite Variety: only variety in the regulator can absorb variety in the environment
  • Ashby (1956): "An Introduction to Cybernetics"

    • Formalized regulation: agent must have internal variety matching environment
    • Anticipation and predictive control

Why It Was Ignored:

  • Too interdisciplinary (engineering + biology + psychology)
  • Symbolic AI focused on logic, not dynamics
  • No clear benchmark tasks

Why It's Critical Now:

  • Agents must maintain goal stability over time
  • Self-correction requires feedback loops
  • Multi-agent systems need coordination through shared regulation

Mathematical Tools:

  • State-space models
  • Lyapunov stability
  • Feedback control theory

3. Decision Theory: Belief-Based Agency

Core Insight: Rational agents reason under uncertainty with subjective probabilities.

Key Work:

  • Leonard Savage (1954): "The Foundations of Statistics"

    • Subjective expected utility (SEU)
    • Personal probability
    • Coherence conditions (Dutch book arguments)
  • Richard Jeffrey (1965): "The Logic of Decision"

    • Jeffrey conditioning: belief updates without certainty
    • More realistic than strict Bayesian conditioning, which assumes evidence is learned with certainty

Current Gap: LLM agents don't maintain coherent probability distributions. Confidence scores are post-hoc, not decision-theoretic.

Study Applications:

  • Belief state tracking
  • Preference elicitation under uncertainty
  • Value of information calculations
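
Jeffrey conditioning from above can be sketched directly; the weather example and variable names are illustrative:

```python
def jeffrey_update(prior, partition, new_cell_probs):
    """Jeffrey conditioning: redistribute probability across an evidence
    partition whose cells are NOT learned with certainty.

    prior:          {world: probability}
    partition:      {world: evidence cell containing that world}
    new_cell_probs: {cell: revised probability of that cell}
    """
    cell_mass = {}
    for world, p in prior.items():
        cell = partition[world]
        cell_mass[cell] = cell_mass.get(cell, 0.0) + p
    # Within each cell, relative probabilities are preserved (rigidity).
    return {world: p * new_cell_probs[partition[world]] / cell_mass[partition[world]]
            for world, p in prior.items()}

# A glimpse of the sky raises P(cloudy) from 0.5 to 0.8 -- without certainty.
prior = {"rain": 0.2, "clouds_no_rain": 0.3, "clear": 0.5}
partition = {"rain": "cloudy", "clouds_no_rain": "cloudy", "clear": "clear"}
posterior = jeffrey_update(prior, partition, {"cloudy": 0.8, "clear": 0.2})
```

Setting a cell's new probability to 1 recovers ordinary conditioning as a special case.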

4. Multi-Agent Systems: Game Theory Foundations

Core Insight: Agent interaction requires modeling beliefs about beliefs.

Key Work:

  • Harsanyi (1967-68): Games with incomplete information

    • Type spaces: agents have beliefs about others' types
    • Bayesian games
    • Foundation for mechanism design
  • Aumann (1976): "Agreeing to Disagree"

    • Common knowledge and its limits
    • Rational agents with common priors cannot agree to disagree: if their posteriors are common knowledge, the posteriors must be equal
    • Implications for communication protocols

Current Gap: "Multi-agent systems" today often lack:

  • Explicit belief modeling (theory of mind)
  • Incentive compatibility
  • Communication semantics beyond message passing

Mathematical Framework:

Agent i's belief hierarchy:
- θ_i: agent i's private type
- p_i(θ_{-i}): beliefs about others' types  
- p_i(p_{-i}(·)): beliefs about others' beliefs
- ... (infinite hierarchy)

Harsanyi types collapse this into a single state space.
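
A toy sketch of what deciding against a type space looks like (the payoffs, types, and beliefs below are invented for illustration): a Harsanyi type pins down the opponent's behavior, so agent i best-responds to its belief over types rather than to an infinite hierarchy.

```python
# payoff[my_action][opp_action] -> my payoff (illustrative numbers)
payoff = {
    "cooperate": {"cooperate": 3, "defect": 0},
    "defect":    {"cooperate": 5, "defect": 1},
}
# Conjectured strategy per opponent type, and agent i's belief p_i over types.
opp_strategy = {"nice": "cooperate", "tough": "defect"}
belief = {"nice": 0.7, "tough": 0.3}

def expected_payoff(my_action):
    """Expectation over opponent types, as in a Bayesian game."""
    return sum(belief[t] * payoff[my_action][opp_strategy[t]] for t in belief)

best_response = max(payoff, key=expected_payoff)
```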


5. Information-Theoretic Agency

Core Insight: Agents compress observations into actionable representations.

Key Work:

  • Tishby, Pereira, Bialek (1999): The Information Bottleneck
    • Optimal tradeoff: minimize I(X;Z) while preserving I(Z;Y), i.e. min I(X;Z) − β·I(Z;Y)
    • Representation Z captures task-relevant information about Y from input X
    • Explains representation learning in deep networks
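
The bottleneck tradeoff can be evaluated numerically for a toy discrete channel (the joint distribution, encoder, and β below are invented for illustration). Note the data-processing inequality: with the Markov chain Z — X — Y, the representation can never know more about Y than X did.

```python
import numpy as np

def mutual_info(joint):
    """Mutual information in nats for a joint distribution (2-D array)."""
    p1 = joint.sum(axis=1, keepdims=True)
    p2 = joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / (p1 @ p2)[mask])))

# Toy joint p(x, y) and a noisy encoder p(z|x).
pxy = np.array([[0.4, 0.1],
                [0.1, 0.4]])
pz_given_x = np.array([[0.9, 0.1],
                       [0.1, 0.9]])
px = pxy.sum(axis=1)
pxz = px[:, None] * pz_given_x   # joint p(x, z)
pzy = pz_given_x.T @ pxy         # joint p(z, y)

beta = 2.0
ib_objective = mutual_info(pxz) - beta * mutual_info(pzy)  # minimize this
```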

Extension to Agency:

Agent's representation should:
- Maximize predictive power for value function
- Minimize memory/computation

Study Applications:

  • State abstraction in MDPs
  • Hierarchical representations
  • Transfer learning (what to compress vs preserve)

6. Control Theory: Stability and Robustness

Core Insight: Real agents operate under noise, delays, and model mismatch.

Key Work:

  • Kalman (1960): Kalman Filter

    • Optimal state estimation under Gaussian noise
    • Belief state tracking for POMDPs
  • Todorov (2006): Linearly-solvable optimal control

    • Control as inference
    • Elegant handling of stochastic dynamics
    • Path integral control

Why It Was Ignored by AI:

  • Lived in EE/ME departments
  • Assumed known dynamics
  • RL preferred model-free methods

Why It's Essential Now:

  • Guarantees on convergence
  • Robustness analysis
  • Sample efficiency through model exploitation

Mathematical Tools:

  • Lyapunov functions (stability proofs)
  • LQR/LQG (optimal linear control)
  • H-infinity control (worst-case robustness)
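
Of these tools, LQR is the easiest to make concrete. A sketch for a discrete-time double integrator (the dynamics, weights, and iteration count are illustrative), iterating the Riccati equation to a stabilizing feedback gain:

```python
import numpy as np

def lqr_gain(A, B, Q, R, iters=500):
    """Infinite-horizon discrete LQR via Riccati iteration; control u = -K x."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])  # position integrates velocity
B = np.array([[0.0], [dt]])            # control input accelerates
Q = np.eye(2)                          # state cost
R = np.array([[0.1]])                  # control cost

K = lqr_gain(A, B, Q, R)
# Stability check: closed-loop eigenvalues must lie inside the unit circle.
closed_loop_eigs = np.linalg.eigvals(A - B @ K)
```

This is exactly the kind of guarantee ("the closed loop is provably stable") that the prose above says agent systems currently lack.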

Part II: Current Gaps in "Agentic AI"

Gap 1: No Theory of Agency Itself

What's Missing:

  • When does a system count as one agent vs multiple?
  • What defines goal persistence?
  • When should an agent stop acting?

Philosophical Foundations (ignored):

  • Bratman's intention theory
  • Dennett's intentional stance
  • Frankfurt's hierarchy of desires

Practical Consequence: Agents that:

  • Loop indefinitely
  • Have contradictory goals
  • Can't distinguish "done" from "stuck"

Gap 2: No Cost Model for Cognition

Current Practice:

# Pseudocode of typical agent loop
while not task_complete:
    thought = llm.generate(context, max_tokens=None)  # unbounded "thinking"
    action = parse(thought)
    execute(action)

What's Wrong: No cost for thinking, planning, or tool use.

Correct Formulation (from bounded rationality):

V*(s) = max_a [R(s,a) - c_think·|reasoning| - c_time·delay + γ·E[V*(s')]]

Study Framework:

  • Rational metareasoning (Russell & Wefald)
  • Information value theory
  • Anytime algorithms with performance profiles

Gap 3: Planning Without World Models

Current Practice:

  • Prompt engineering for "planning"
  • Chain-of-thought as surrogate for simulation
  • No explicit dynamics model

What's Missing:

  • Predictive models of action outcomes
  • Counterfactual reasoning (what if I hadn't?)
  • Epistemic vs aleatoric uncertainty

Classical Solution: POMDP framework

Belief state: b(s) = P(s | history)
Action selection: a* = argmax_a ∑_s b(s)·Q(s,a)
Belief update: b'(s') ∝ P(o|s',a)·∑_s P(s'|s,a)·b(s)
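
The belief update above is a Bayes filter and can be sketched directly (the two-state example and observation model below are invented for illustration):

```python
def belief_update(b, a, o, T, O):
    """b'(s') ∝ P(o | s', a) · Σ_s P(s' | s, a) · b(s).
    b: {s: prob}; T[(s, a)]: {s': prob}; O[(s', a)]: {o: prob}."""
    successors = {s2 for (s, act), dist in T.items() if act == a for s2 in dist}
    unnorm = {}
    for s2 in successors:
        predicted = sum(T[(s, a)].get(s2, 0.0) * p for s, p in b.items())
        unnorm[s2] = O[(s2, a)].get(o, 0.0) * predicted
    z = sum(unnorm.values())
    return {s2: p / z for s2, p in unnorm.items()}

# Static hidden state, noisy sensor: one positive reading shifts the belief.
T = {("good", "listen"): {"good": 1.0}, ("bad", "listen"): {"bad": 1.0}}
O = {("good", "listen"): {"hear_good": 0.85, "hear_bad": 0.15},
     ("bad", "listen"): {"hear_good": 0.15, "hear_bad": 0.85}}
belief = belief_update({"good": 0.5, "bad": 0.5}, "listen", "hear_good", T, O)
```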

Why It Matters: Without world models, agents cannot:

  • Anticipate side effects
  • Plan multi-step sequences reliably
  • Learn from counterfactuals

Gap 4: Memory as Database, Not Cognition

Current Practice:

memory = VectorDB()
memory.add(experience)
relevant = memory.query(current_context, k=5)

What's Missing:

  • Forgetting (when and what)
  • Abstraction (concept formation)
  • Consolidation (integration of new beliefs)
  • Non-monotonic reasoning (belief revision)

Cognitive Science Foundations:

  • Spacing effect
  • Interference theory
  • Schema theory
  • ACT-R memory model

Mathematical Framework:

  • Jeffrey conditioning (soft updates)
  • AGM belief revision postulates
  • Memory decay models (exponential, power law)
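
The two decay families can be compared directly (time constants are illustrative); the practical difference is the tail, since power-law memories retain far more at long delays:

```python
import math

def exponential_retention(t, tau=10.0):
    """P(recall after delay t) under exponential decay."""
    return math.exp(-t / tau)

def power_law_retention(t, alpha=0.5):
    """P(recall after delay t) under power-law (Wickelgren-style) decay."""
    return (1.0 + t) ** (-alpha)
```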

Gap 5: Multi-Agent Coordination Is Superficial

Current "Multi-Agent" Systems:

agents = [Agent1(), Agent2(), Agent3()]
for agent in agents:
    agent.act(shared_context)  # No real coordination

Missing Components:

  • Common knowledge establishment
  • Theory of mind (belief about beliefs)
  • Incentive alignment
  • Coalition formation

Game-Theoretic Foundations:

  • Correlated equilibrium (coordination devices)
  • Mechanism design (incentive compatibility)
  • Communication complexity
  • Signaling games
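
Correlated equilibrium is the most directly checkable of these. A sketch for the game of Chicken (standard textbook payoffs; the mediator distribution is the classic one): a coordination device draws a joint recommendation, and obeying must beat deviating conditional on what each player is told.

```python
# payoff[(row, col)] = (row player's payoff, column player's payoff)
payoff = {
    ("C", "C"): (6, 6), ("C", "D"): (2, 7),
    ("D", "C"): (7, 2), ("D", "D"): (0, 0),
}
# Correlation device: never recommend mutual Dare.
device = {("C", "C"): 1/3, ("C", "D"): 1/3, ("D", "C"): 1/3}

def obedience_gain(rec, deviation):
    """Row player's expected gain from obeying recommendation `rec`
    rather than playing `deviation`, conditional on being told `rec`."""
    mass = sum(p for (r, _), p in device.items() if r == rec)
    gain = 0.0
    for (r, c), p in device.items():
        if r == rec:
            gain += (p / mass) * (payoff[(rec, c)][0] - payoff[(deviation, c)][0])
    return gain
```

Obedience is weakly optimal after every recommendation, so the device is a correlated equilibrium, which is the formal version of "coordination without centralized control."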

Gap 6: Embodiment Constraints Ignored

Even "software agents" are embodied in:

  • API rate limits
  • Latency
  • Irreversible actions
  • Safety constraints

Missing Theory:

  • Affordances (Gibson): what actions are possible?
  • Viability kernels: what states are safe?
  • Precondition modeling
  • Graceful degradation

Example: An agent calling a payment API should model:

  • Idempotency (can I retry?)
  • Reversibility (can I undo?)
  • Rate limits (when can I act again?)
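
These constraints can be made first-class in the action interface; a minimal sketch (the class and field names are invented for illustration):

```python
from dataclasses import dataclass

@dataclass
class ActionProfile:
    """Embodiment constraints attached to a tool call."""
    idempotent: bool         # safe to retry after an ambiguous failure?
    reversible: bool         # does a compensating action exist (e.g. refund)?
    rate_limit_per_min: int  # how often may the agent act?

def safe_to_retry(profile: ActionProfile, ambiguous_failure: bool) -> bool:
    """Retry only when a duplicate execution cannot cause harm."""
    if ambiguous_failure and not profile.idempotent:
        return False  # the first call may have succeeded; retry could double-charge
    return True

charge_card = ActionProfile(idempotent=False, reversible=True, rate_limit_per_min=10)
```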

Gap 7: Evaluation Is Broken

Current Metrics:

  • Task success rate
  • Benchmark scores
  • Human preference

What's Not Measured:

  • Long-term stability (does performance decay?)
  • Goal drift (does the agent change what it wants?)
  • Robustness to distribution shift
  • Recovery from self-error

Needed Framework:

Agent quality = f(
    short_term_reward,
    long_term_stability,
    robustness_to_perturbation,
    interpretability,
    cost_efficiency
)

Gap 8: Alignment Beyond Instruction Following

Current Approach:

  • RLHF on human preferences
  • Constitutional AI (rules + feedback)
  • Prompt engineering for "safety"

Deep Problems:

  • Value learning under uncertainty
  • Corrigibility (accepting correction)
  • Shutdownability
  • Handling preference change over time

Research Areas (under-explored):

  • Cooperative inverse RL
  • Assistance games
  • Value alignment problem (Stuart Russell)
  • Embedded agency (agent models including itself)

Part III: Post-Bubble Fundamentals (What's Coming)

The Pattern Recognition

Every AI hype cycle:

  1. Demo phase: Focus on impressive outputs
  2. Scale phase: "More compute/data will fix it"
  3. Reality phase: Costs explode, edge cases dominate
  4. Fundamentals phase: Return to ignored math

We're currently at stage 2-3 for "agentic AI."


What the Correction Will Look Like

Shift 1: From Vibes to Variational Principles

Current:

"The agent uses chain-of-thought to reason about the problem"

Future:

"The agent minimizes expected free energy under computational constraints"

Shift 2: Explicit Cost Accounting

All agent systems will include:

  • Compute budgets
  • Time discounting for thinking
  • Memory costs
  • Communication costs

Shift 3: Fewer, Better Agents

Instead of:

  • "Swarms" of simple agents
  • One mega-agent for everything

We'll see:

  • Carefully designed agent boundaries
  • Explicit interfaces and contracts
  • Provable coordination properties

Shift 4: Return to Control Theory

  • Model-based RL will dominate
  • Stability guarantees required
  • Safety envelopes explicit
  • Formal verification for critical applications

Part IV: Study Roadmap

Prerequisites

Mathematics:

  • Probability theory (measure-theoretic)
  • Optimization (convex, stochastic)
  • Dynamical systems
  • Information theory
  • Game theory basics

Computer Science:

  • Algorithms (complexity, approximation)
  • Formal methods (logic, verification)
  • Systems (distributed computing, databases)

Core Curriculum

Module 1: Decision Theory (4 weeks)

Texts:

  • Savage: "The Foundations of Statistics" (Ch 1-5)
  • Jeffrey: "The Logic of Decision" (Ch 1, 4, 11)
  • Gilboa: "Theory of Decision Under Uncertainty"

Exercises:

  • Prove Dutch book theorem
  • Implement Jeffrey conditioning
  • Model agent preference learning from observations
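
For the Dutch book exercise, the finite case can also be checked numerically (a sketch; the function names are ours): incoherent prices over a partition hand a bookie a risk-free profit.

```python
def coherence_gap(credences):
    """Credences over an exhaustive, mutually exclusive partition.
    Coherence (de Finetti) requires them to sum to exactly 1."""
    return sum(credences.values()) - 1.0

def bookie_profit(credences):
    """If total price != 1, the bookie sells (or buys) a unit bet on every
    event at these prices; exactly one event occurs, so total payout is
    exactly 1, and the gap is locked in as risk-free profit."""
    return abs(coherence_gap(credences))

incoherent = {"rain": 0.7, "no_rain": 0.5}  # prices sum to 1.2
coherent = {"rain": 0.7, "no_rain": 0.3}
```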

Module 2: Reinforcement Learning Foundations (6 weeks)

Texts:

  • Sutton & Barto: "Reinforcement Learning" (2nd ed)
  • Bertsekas: "Dynamic Programming and Optimal Control"
  • Puterman: "Markov Decision Processes"

Focus Areas:

  • Bellman operators and contraction mappings
  • Policy gradient methods (proper derivations)
  • POMDP planning algorithms
  • Regret bounds and sample complexity

Module 3: Control Theory (5 weeks)

Texts:

  • Åström & Murray: "Feedback Systems"
  • Todorov: Papers on optimal control theory
  • Tedrake: "Underactuated Robotics"

Key Concepts:

  • Lyapunov stability
  • LQR/LQG
  • Model predictive control
  • Robust control (H-infinity)

Module 4: Multi-Agent Systems (5 weeks)

Texts:

  • Osborne & Rubinstein: "A Course in Game Theory"
  • Shoham & Leyton-Brown: "Multiagent Systems"
  • Fudenberg & Tirole: "Game Theory"

Focus:

  • Bayesian games (Harsanyi)
  • Mechanism design
  • Communication complexity
  • Learning in games

Module 5: Information Theory & Bounded Rationality (4 weeks)

Texts:

  • Cover & Thomas: "Elements of Information Theory"
  • Ortega & Braun: Papers on thermodynamics of computation
  • Tishby: Information bottleneck papers

Applications:

  • Rate-distortion theory for agents
  • Free energy principle (Friston)
  • Resource-rational analysis

Module 6: Cybernetics & Systems Theory (3 weeks)

Texts:

  • Ashby: "An Introduction to Cybernetics"
  • Ashby: "Design for a Brain"
  • Wiener: "Cybernetics"

Modern Connections:

  • Homeostatic control in RL
  • Meta-learning as ultrastability
  • Variety matching in multi-agent systems

Advanced Topics

Topic A: Formal Verification for Agents

Goal: Prove properties of agent systems

Tools:

  • Temporal logic (LTL, CTL)
  • Model checking
  • Abstract interpretation
  • Probabilistic verification

Application: Safety guarantees for autonomous systems

Topic B: Embedded Agency

Goal: Agents that model themselves

Challenges:

  • Self-reference and paradoxes
  • Logical uncertainty
  • Updatelessness
  • Grain of truth problem

Key Papers: MIRI's agent foundations work

Topic C: Compositional Agent Design

Goal: Build complex agents from verified components

Framework:

  • Category theory for agent composition
  • Open games (Ghani et al.)
  • String diagrams
  • Interface specifications

Part V: Practical Implementation Guide

Building a "Real" Agent: Checklist

1. Define the Agent Boundary

  • What is the agent's scope of control?
  • What is environment vs self?
  • What actions are reversible?
  • What states are safe/viable?

2. Specify Objectives with Costs

class AgentObjective:
    def __init__(self):
        self.reward_function = ...
        self.compute_cost = ...  # per inference step
        self.time_cost = ...     # per wall-clock second
        self.memory_cost = ...   # per byte stored
        
    def value(self, state, action, resources_used):
        return (self.reward_function(state, action) 
                - self.compute_cost * resources_used.compute
                - self.time_cost * resources_used.time
                - self.memory_cost * resources_used.memory)

3. Implement Explicit World Models

class WorldModel:
    def predict(self, state, action) -> Distribution[State]:
        """Returns distribution over next states"""
        pass
    
    def epistemic_uncertainty(self, state) -> float:
        """Model uncertainty (reducible with more data)"""
        pass
    
    def aleatoric_uncertainty(self, state, action) -> float:
        """Environmental stochasticity (irreducible)"""
        pass

4. Design Memory with Forgetting

class AgentMemory:
    def store(self, experience, importance):
        """Store with priority/importance weighting"""
        pass
    
    def decay(self, time_elapsed):
        """Probabilistic forgetting based on age and access"""
        pass
    
    def consolidate(self):
        """Abstract/compress old memories"""
        pass

5. Add Metareasoning

class MetaReasoner:
    def should_continue_planning(self, current_plan, next_step_duration) -> bool:
        """VOI: is one more planning step worth its marginal cost?"""
        expected_improvement = self.estimate_plan_improvement(current_plan)
        # Compare against the cost of the NEXT step; time already spent
        # is sunk and should not enter the stopping decision.
        marginal_cost = self.compute_cost * next_step_duration
        return expected_improvement > marginal_cost

6. Instrument for Long-Term Evaluation

class AgentMonitor:
    def track_metrics(self):
        return {
            'immediate_reward': ...,
            'goal_drift': self.measure_goal_change_over_time(),
            'belief_coherence': self.dutch_book_vulnerability(),
            'robustness': self.performance_under_perturbation(),
            'cost_efficiency': self.reward_per_unit_compute()
        }

Part VI: Research Frontiers

Open Problems

1. Computational Theory of Agency

Question: What is the minimal formal definition of an agent?

Approaches:

  • Category-theoretic agents
  • Agent-environment boundary as a Markov blanket
  • Information-theoretic definitions (causal emergence)

2. Bounded Rationality at Scale

Question: How should agents allocate compute across:

  • Planning
  • Learning
  • Acting
  • Monitoring

Tools:

  • Optimal stopping theory
  • Multi-armed bandits for metareasoning
  • Rational metareasoning frameworks

3. Multi-Agent Emergence

Question: When do independent agents form coalitions, protocols, or institutions?

Relevant Fields:

  • Evolutionary game theory
  • Mechanism design
  • Social choice theory
  • Distributed systems (consensus)

4. Safe Exploration for Embodied Agents

Question: How do agents learn without catastrophic failures?

Approaches:

  • Viability theory
  • Reachability analysis
  • Safe RL (constrained MDPs)
  • Shield synthesis

Conclusion: The Coming Correction

Why the Current Bubble Will Pop

Symptom 1: Agents are too expensive

  • Cost per task grows linearly with complexity
  • No architectural efficiency improvements
  • Compute scaling plateaus

Symptom 2: Reliability doesn't improve with scale

  • More parameters ≠ better long-term behavior
  • Edge cases dominate real deployments
  • No stability guarantees

Symptom 3: Multi-agent systems don't compose

  • Swarms collapse into chaos
  • No coordination without centralized control
  • Emergent behavior is unpredictable

What Comes Next

Phase 1 (2024-2025): Reality Check

  • High-profile agent failures
  • Cost explosion for production systems
  • Realization that demos ≠ products

Phase 2 (2025-2027): Fundamentals Renaissance

  • Return to control theory
  • Explicit cost models
  • Formal verification requirements
  • Model-based methods resurgence

Phase 3 (2027+): Mature Agent Engineering

  • Standardized agent architectures
  • Compositional design patterns
  • Provable properties
  • Boring reliability

How to Position for the Correction

For Researchers:

  • Work on cost-aware agent design
  • Develop long-term evaluation benchmarks
  • Bridge to control theory and decision theory
  • Focus on guarantees, not demos

For Engineers:

  • Instrument everything (costs, stability, drift)
  • Build world models explicitly
  • Design for composition and verification
  • Avoid "vibes-based" architecture

For Organizations:

  • Invest in fundamentals teams
  • Hire people who know the ignored math
  • Build for 10-year timelines, not 10-month demos
  • Establish rigorous evaluation beyond benchmarks

Appendix: Reading List

Tier 1: Essential Foundations

  1. Simon - "A Behavioral Model of Rational Choice"
  2. Ashby - "An Introduction to Cybernetics"
  3. Savage - "The Foundations of Statistics"
  4. Sutton & Barto - "Reinforcement Learning" (full book)
  5. Osborne & Rubinstein - "A Course in Game Theory"

Tier 2: Deep Dives

  1. Bertsekas - "Dynamic Programming and Optimal Control"
  2. Åström & Murray - "Feedback Systems"
  3. Tishby et al. - "The Information Bottleneck Method"
  4. Harsanyi - "Games with Incomplete Information"
  5. Todorov - "Optimal Control Theory" (survey paper)

Tier 3: Frontier Topics

  1. Friston - "The Free Energy Principle"
  2. Ortega & Braun - "Thermodynamics as a theory of decision-making"
  3. Soares et al. - "Agent Foundations for Aligning Machine Intelligence"
  4. Russell - "Human Compatible"
  5. Open Games literature (Jules Hedges et al.)

Final Note: The companies that survive the correction will be those that invested in these fundamentals early—treating agents as engineered systems with costs, constraints, and formal properties, not magical text-generators with tool use bolted on.