Agentic Notes Library

Chapter 3: Formal Agent Architectures — Theoretical Frameworks and Design Patterns

March 19, 2026

Preamble#

Agent architectures are not implementation accidents; they are structural commitments that determine what an agent can represent, how it reasons, when it acts, and how it fails. Every production agentic system—whether built on large language models (LLMs), classical planners, or hybrid stacks—implicitly or explicitly instantiates one or more of the formal architectures catalogued in this chapter. The difference between a brittle demonstration and a reliable, scalable agentic application lies precisely in whether the engineering team has made these architectural commitments explicit, typed, bounded, and verifiable.

This chapter provides an exhaustive formal treatment of twelve foundational agent architectures, each analyzed along five axes:

  1. Formal definition — mathematical objects, state spaces, transition semantics.
  2. Structural decomposition — components, interfaces, data flow.
  3. Mapping to LLM-based agentic systems — how each architecture instantiates concretely in modern context-engineered, tool-augmented, memory-layered agent stacks.
  4. Production implications — reliability, observability, fault tolerance, token efficiency, latency.
  5. Composition properties — how the architecture interacts with other architectures in layered or multi-agent configurations.

The reader should emerge with the capacity to select, combine, and formally justify architectural decisions for any agentic system at enterprise scale.


3.1 BDI (Belief–Desire–Intention) Architecture Applied to LLM Agents#

3.1.1 Philosophical and Formal Foundations#

The Belief–Desire–Intention architecture originates in Michael Bratman's theory of practical reasoning (1987) and was formalized computationally by Rao and Georgeff (1991, 1995). Its central insight is that rational agency requires not only beliefs about the world and desires regarding future states, but a distinct commitment mechanism—intentions—that constrains future deliberation, filters out irrelevant options, and stabilizes behavior over time.

Formal BDI Tuple#

A BDI agent is defined as a tuple:

\mathcal{A}_{BDI} = \langle \mathcal{B}, \mathcal{D}, \mathcal{I}, \mathcal{P}, \delta_B, \delta_D, \delta_I, \pi \rangle

where:

Symbol | Definition
───────────────────
\mathcal{B} | Belief base: a set of propositions \{b_1, b_2, \ldots, b_n\} representing the agent's current model of the world, including uncertainty.
\mathcal{D} | Desire set: a set of goal states \{d_1, d_2, \ldots, d_m\} the agent finds desirable, not necessarily mutually consistent.
\mathcal{I} | Intention stack: an ordered, committed subset of plans currently being pursued, satisfying \mathcal{I} \subseteq \text{Plans}(\mathcal{D}).
\mathcal{P} | Plan library: a mapping from triggering events and context conditions to partially ordered sequences of actions and subgoals.
\delta_B | Belief revision function: \delta_B : \mathcal{B} \times \text{Percept} \rightarrow \mathcal{B}'.
\delta_D | Desire deliberation function: \delta_D : \mathcal{B} \times \mathcal{D} \rightarrow \mathcal{D}', generating or pruning desires given current beliefs.
\delta_I | Intention selection / filtering function: \delta_I : \mathcal{B} \times \mathcal{D} \times \mathcal{I} \rightarrow \mathcal{I}', committing to or dropping intentions.
\pi | Plan selection function: \pi : \mathcal{B} \times \mathcal{I} \times \mathcal{P} \rightarrow \text{Action}, selecting the next executable action.

The BDI Reasoning Cycle#

The canonical BDI reasoning cycle proceeds as follows at each time step t:

ALGORITHM 3.1: BDI-Reasoning-Cycle
────────────────────────────────────
Input: Agent state (B, D, I), percept stream P_stream, plan library PL
Output: Action a_t, updated agent state (B', D', I')
 
1.  percept ← OBSERVE(P_stream)
2.  B' ← BELIEF_REVISE(B, percept)
3.  D' ← DELIBERATE(B', D)                    // generate/filter desires
4.  I_candidates ← FILTER(B', D', I)           // compatible intentions
5.  IF RECONSIDER?(B', I) THEN
6.      I' ← SELECT_INTENTIONS(B', D', I_candidates)
7.  ELSE
8.      I' ← I                                  // commitment persists
9.  plan ← SELECT_PLAN(B', I', PL)
10. a_t ← HEAD(plan)                           // next primitive action
11. EXECUTE(a_t)
12. RETURN (B', D', I'), a_t

The critical design decision is the reconsideration policy at step 5. An agent that reconsiders too frequently is overcautious (high deliberation cost, low throughput). An agent that reconsiders too rarely is overcommitted (brittle under environmental change). This trade-off is formally captured as:

\text{Reconsider}(t) \iff \Delta(\mathcal{B}_t, \mathcal{B}_{t-1}) > \tau_{\text{recon}} \lor \text{IntentionFailure}(\mathcal{I}_t)

where \tau_{\text{recon}} is a tunable belief-change threshold and \text{IntentionFailure} detects plan infeasibility.
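
The reconsideration predicate can be sketched directly in code. This is a minimal illustration, assuming beliefs are a set of propositions and belief change is measured as a normalized symmetric difference; both choices are assumptions, not part of the formalism above.

```python
def belief_delta(prev: set, curr: set) -> float:
    """Normalized symmetric difference between two propositional belief sets."""
    if not prev and not curr:
        return 0.0
    return len(prev ^ curr) / len(prev | curr)  # propositions added or retracted

def should_reconsider(prev_beliefs: set, curr_beliefs: set,
                      intention_failed: bool, tau_recon: float = 0.3) -> bool:
    """Reconsider(t) iff belief change exceeds tau_recon or an intention failed."""
    return belief_delta(prev_beliefs, curr_beliefs) > tau_recon or intention_failed
```

Tuning tau_recon trades deliberation cost against commitment brittleness, exactly as the trade-off above describes.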

3.1.2 Mapping BDI to LLM-Based Agentic Systems#

The BDI framework maps with high fidelity onto modern LLM agent stacks when each BDI component is given a concrete, typed implementation:

BDI Component | LLM Agent Realization
─────────────────────────────────────
Beliefs \mathcal{B} | Context window contents: system prompt policy, retrieved evidence (provenance-tagged), tool output history, environment observations, validated memory summaries. The belief base is the compiled prefill at inference time.
Desires \mathcal{D} | User-specified goals, system-level objectives (safety constraints, quality targets), and derived subgoals from task decomposition. Represented as structured goal objects with priority, deadline, and success criteria.
Intentions \mathcal{I} | The currently committed plan trace: a sequence of planned tool calls, retrieval operations, and generation steps with explicit ordering and dependency edges. Stored in working memory with rollback metadata.
Plan Library \mathcal{P} | A registry of known workflows, tool-chain templates, and decomposition strategies, exposed as typed MCP tool/resource surfaces or as few-shot plan exemplars in the prompt.
Belief Revision \delta_B | Context engineering pipeline: retrieval re-ranking, memory refresh, observation integration, contradiction detection, and stale-context pruning—executed before each generation step.
Desire Deliberation \delta_D | LLM-driven goal analysis: the model evaluates which user objectives are achievable given current beliefs, generates subgoals, and flags conflicts.
Intention Selection \delta_I | Plan critique and commitment: the model (or a separate verifier agent) evaluates candidate plans against feasibility, cost, and risk, then commits to an intention stack with bounded depth.
Plan Selection \pi | Tool routing and action selection: given the current intention, the model selects the next tool call or generation action, parameterized by typed schemas from the tool registry.

Belief Revision as Context Engineering#

The most critical BDI function in LLM agents is belief revision. In classical BDI, \delta_B is a logical update. In LLM agents, it is context construction:

\mathcal{B}_{t+1} = \text{Compile}\Big(\underbrace{\text{Policy}}_{\text{role, constraints}} \oplus \underbrace{\text{Retrieve}(q_t)}_{\text{evidence}} \oplus \underbrace{\text{Memory}(s_t)}_{\text{session + episodic}} \oplus \underbrace{\text{Obs}(e_t)}_{\text{tool outputs}} \oplus \underbrace{\text{History}(h_t)}_{\text{pruned}}\Big)

subject to the token budget constraint:

|\mathcal{B}_{t+1}|_{\text{tokens}} \leq W_{\text{max}} - R_{\text{reserved}}

where W_{\text{max}} is the model's context window and R_{\text{reserved}} is the token budget reserved for generation and reasoning.
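
The Compile step under the budget constraint can be sketched as a greedy assembly. This is a minimal illustration only: the whitespace token estimator and the assumption that sections arrive pre-sorted by priority are stand-ins for a real tokenizer and a real admission policy.

```python
def estimate_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: whitespace-delimited words.
    return len(text.split())

def compile_beliefs(policy: str, sections: list[str],
                    w_max: int, r_reserved: int) -> str:
    """Assemble the prefill B_{t+1}: always keep the policy, then add
    sections in priority order until the budget W_max - R_reserved is spent."""
    budget = w_max - r_reserved
    parts, used = [policy], estimate_tokens(policy)
    for section in sections:              # assumed pre-sorted by priority
        cost = estimate_tokens(section)
        if used + cost <= budget:
            parts.append(section)
            used += cost
    return "\n\n".join(parts)
```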

Commitment and Reconsideration in LLM Agents#

LLM agents face a specific form of the commitment–reconsideration trade-off: each reconsideration step costs a full inference call (latency + cost), but insufficient reconsideration leads to hallucinated plan continuations when the environment has changed. The production-grade strategy is:

  • Observation-triggered reconsideration: reconsider only when a tool call fails, a verification step detects an error, or retrieved evidence contradicts the current plan.
  • Bounded intention depth: limit the intention stack to k steps before requiring a verification checkpoint. This prevents runaway plan execution.
  • Explicit rollback metadata: each intention carries a compensating action or rollback specification, enabling recovery without full replanning.

3.1.3 Extended BDI: Obligations, Norms, and Trust#

Production agents operate under normative constraints (safety policies, authorization boundaries, organizational rules). The extended BDI model adds:

\mathcal{A}_{BDI+} = \langle \mathcal{B}, \mathcal{D}, \mathcal{I}, \mathcal{O}, \mathcal{N}, \mathcal{P}, \ldots \rangle

where \mathcal{O} is a set of obligations (must-do constraints from policy or user approval gates) and \mathcal{N} is a normative filter that blocks intention adoption if it violates safety, authorization, or ethical constraints. In typed protocol terms, \mathcal{N} is enforced as a pre-execution validation gate in the agent loop—a mechanically enforced invariant, not a soft prompt instruction.

3.1.4 Production Implications#

Concern | BDI Mitigation
────────────────────────
Hallucination control | Beliefs are grounded in provenance-tagged retrieval and verified tool outputs, not in model-generated assertions. The belief base is explicitly constructed, not implicitly assumed.
Fault tolerance | Intentions carry rollback metadata; plan failure triggers reconsideration rather than silent continuation.
Observability | Each BDI cycle emits a structured trace: beliefs consulted, desires evaluated, intentions adopted/dropped, plan selected, action executed, outcome observed.
Token efficiency | The compiled prefill budget enforces that beliefs are compressed and relevant, preventing context window saturation.
Idempotency | Intentions are logged with execution state; re-entry after crash replays from the last committed checkpoint, not from the beginning.

3.2 Cognitive Architectures: SOAR, ACT-R, Global Workspace Theory — Mappings to Agentic AI#

3.2.1 SOAR (State, Operator, And Result)#

Formal Structure#

SOAR, developed by Laird, Newell, and Rosenbloom (1987), models cognition as a search through a problem space. Its core objects are:

\mathcal{S}_{SOAR} = \langle S, O, \text{WM}, \text{LTM}, \text{Chunking}, \text{Impasse} \rangle

Component | Definition
──────────────────────
S | State: the current representation of the problem, including goals, subgoals, and working context.
O | Operators: transformations applicable to states, selected through preference rules.
\text{WM} | Working Memory: a short-term, high-bandwidth store holding the current state, active goals, and operator applications. Analogous to the LLM context window.
\text{LTM} | Long-Term Memory: divided into procedural (production rules), semantic (facts), and episodic (event sequences) stores.
\text{Chunking} | Chunking mechanism: when an impasse is resolved, the resolution path is compiled into a new production rule and stored in LTM, enabling one-step resolution in the future.
\text{Impasse} | Impasse detection: when the system cannot select an operator, apply an operator, or determine the result, a subgoal is automatically created.

The SOAR Decision Cycle#

ALGORITHM 3.2: SOAR-Decision-Cycle
───────────────────────────────────
Input: Working Memory WM, Long-Term Memory LTM
Output: Updated WM, executed actions
 
1.  LOOP:
2.      proposals ← MATCH_PRODUCTIONS(WM, LTM.procedural)
3.      preferences ← EVALUATE_PREFERENCES(proposals)
4.      IF UNIQUE_BEST(preferences) THEN
5.          op ← SELECT(preferences)
6.          WM' ← APPLY(op, WM)
7.          IF IS_EXTERNAL(op) THEN EXECUTE_ACTION(op)
8.      ELSE
9.          impasse_type ← CLASSIFY_IMPASSE(preferences)  // tie, conflict, no-change, rejection
10.         subgoal ← CREATE_SUBGOAL(impasse_type, WM)
11.         PUSH(subgoal, goal_stack)
12.         resolution ← SOAR_DECISION_CYCLE(WM ∪ subgoal, LTM)  // recursive
13.         chunk ← COMPILE_CHUNK(subgoal, resolution)
14.         LTM.procedural ← LTM.procedural ∪ {chunk}
15.     WM ← WM'

Mapping to LLM Agents#

SOAR Concept | Agentic AI Realization
─────────────────────────────────────
Working Memory | The active context window: compiled prefill + recent tool outputs + current plan state.
Long-Term Procedural Memory | Tool registry schemas, workflow templates, and fine-tuned behavioral patterns encoded in model weights or retrieved few-shot exemplars.
Long-Term Semantic Memory | Organizational knowledge base accessed via hybrid retrieval (semantic + exact + graph).
Long-Term Episodic Memory | Validated episode logs: past task traces with outcomes, stored in the episodic memory layer with provenance and expiry.
Impasse → Subgoal | When the LLM cannot determine the next action (ambiguity, missing information, tool failure), the agent loop creates a subgoal: ask the user, retrieve more context, invoke a specialist sub-agent.
Chunking | Workflow compilation: successful multi-step resolutions are abstracted into reusable plan templates and stored in the plan library. In practice, this maps to automated pipeline construction from successful traces.
Operator Proposal | Tool/action candidates proposed by the model given the current context; ranked by a preference function (relevance, cost, latency, safety).
Operator Selection | Tool routing: the orchestrator selects the highest-ranked action, with tie-breaking by policy priority.

Critical Insight: Impasse-Driven Subgoaling as Agent Escalation#

The impasse mechanism is the most valuable SOAR concept for production agents. Rather than allowing the LLM to hallucinate when uncertain, the agent architecture should detect impasses mechanically:

  • No applicable tool: the model's proposed action does not match any registered tool schema → trigger retrieval subgoal or human escalation.
  • Ambiguous tool selection: multiple tools are equally ranked → trigger a clarification subgoal or invoke a critic agent.
  • Verification failure: the output of a tool call does not satisfy the postcondition → trigger a repair subgoal with bounded recursion.

This transforms the failure mode from silent hallucination to explicit escalation, which is the single most important reliability improvement in agentic systems.
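
The three impasse classes above can be detected mechanically before any action executes. The registry shape, score representation, and impasse labels below are illustrative assumptions, not a fixed interface.

```python
def classify_impasse(proposed_tool, registry, rankings, verification_ok):
    """Return an impasse label, or None if the action can proceed.
    registry: dict of tool name -> schema; rankings: dict of tool name -> score."""
    if proposed_tool not in registry:
        return "no_applicable_tool"        # -> retrieval subgoal or human escalation
    top = max(rankings.values(), default=0.0)
    tied = [t for t, s in rankings.items() if abs(s - top) < 1e-9]
    if len(tied) > 1:
        return "ambiguous_tool_selection"  # -> clarification subgoal / critic agent
    if not verification_ok:
        return "verification_failure"      # -> repair subgoal, bounded recursion
    return None
```

Routing each label to a distinct subgoal is what converts silent hallucination into explicit escalation.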

3.2.2 ACT-R (Adaptive Control of Thought — Rational)#

Formal Structure#

ACT-R (Anderson, 1993, 2007) models cognition as a modular system with parallel perceptual-motor modules and a serial procedural bottleneck:

\mathcal{S}_{ACT\text{-}R} = \langle \text{Buffers}, \text{Modules}, \text{Productions}, \text{Utility}, \text{Activation} \rangle

Modules include declarative memory, procedural memory, goal, imaginal (working memory), visual, motor, and vocal modules. Each module communicates through a buffer (a single-chunk interface), and the procedural module fires productions that match buffer contents.

Subsymbolic Layer: Activation and Utility#

ACT-R's power lies in its subsymbolic equations. The activation of a declarative memory chunk i is:

A_i = B_i + \sum_j W_j \cdot S_{ji} + \epsilon_i

where:

  • B_i = \ln\left(\sum_{k=1}^{n} t_k^{-d}\right) is the base-level activation (frequency and recency of access, with decay parameter d \approx 0.5).
  • W_j \cdot S_{ji} is the spreading activation from elements j currently in the goal buffer, with associative strength S_{ji}.
  • \epsilon_i \sim \text{Logistic}(0, s) is stochastic noise.

A retrieval request succeeds if A_i > \tau (retrieval threshold); the retrieval latency is:

T_{\text{retrieval}} = F \cdot e^{-f \cdot A_i}

where F and f are scaling parameters. Higher activation yields faster retrieval.

Production utility determines which production fires when multiple match:

U_p = U_p^{(0)} + \alpha \cdot [R - U_p^{(0)}]

updated by temporal-difference-like reward learning, where R is the reward received and \alpha is the learning rate.
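
The three subsymbolic equations translate directly into code. The parameter defaults below (d = 0.5, F = f = 1.0, alpha = 0.2) are conventional illustrative values, not prescriptions.

```python
import math

def base_level_activation(access_times, now, d=0.5):
    """B_i = ln(sum_k lag_k^{-d}), where lag_k is time since the k-th access."""
    return math.log(sum((now - t) ** -d for t in access_times))

def retrieval_latency(a_i, F=1.0, f=1.0):
    """T_retrieval = F * exp(-f * A_i): higher activation, faster retrieval."""
    return F * math.exp(-f * a_i)

def update_utility(u_p, reward, alpha=0.2):
    """TD-style utility update: U_p <- U_p + alpha * (R - U_p)."""
    return u_p + alpha * (reward - u_p)
```

Note how the equations' qualitative predictions fall out: more (and more recent) accesses raise B_i, and higher activation shortens latency.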

Mapping to LLM Agents#

ACT-R Concept | Agentic AI Realization
──────────────────────────────────────
Declarative Memory Activation | Retrieval scoring function: rank retrieved chunks by a composite of base-level activation (access frequency/recency), spreading activation (relevance to current goal), and noise (diversity). This directly informs hybrid retrieval ranking.
Base-Level Activation (B_i) | Memory item freshness and usage frequency scores in the episodic/semantic memory layer. Items with higher B_i are prioritized in retrieval; items below threshold are candidates for eviction.
Spreading Activation | Query-aware re-ranking: the current goal and working context items spread activation to related memory chunks, boosting retrieval of contextually relevant items. This is the theoretical justification for query expansion and contextual re-ranking in RAG pipelines.
Retrieval Threshold \tau | Minimum relevance cutoff in the retrieval pipeline: chunks scoring below threshold are excluded from the context window, preventing low-relevance noise.
Production Utility Learning | Tool/action selection learning: track the utility of each tool invocation pattern over time; increase utility for patterns that lead to successful task completion, decrease for failures. This maps to bandit-style tool routing optimization.
Single-Chunk Buffer Interface | Typed buffer abstraction: each module (retrieval, tool execution, memory, goal management) communicates through a single structured message at a time, enforcing serialization at the orchestration layer and preventing unbounded parallelism.

Production Implication: ACT-R-Informed Retrieval Scoring#

The ACT-R activation equation provides a principled scoring function for retrieval in agentic systems:

\text{Score}(chunk_i, query_t, goal_t) = \underbrace{\ln\!\left(\sum_{k} t_k^{-d}\right)}_{\text{recency/frequency}} + \underbrace{\sum_{j \in \text{goal}_t} w_j \cdot \text{sim}(j, chunk_i)}_{\text{goal-directed relevance}} + \underbrace{\lambda \cdot \text{authority}(chunk_i)}_{\text{provenance weight}}

This composite score integrates temporal dynamics, goal-directed attention, and source authority into a single ranking metric that outperforms naive cosine similarity for agentic retrieval.
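
A sketch of the composite scorer applied as a ranking pipeline. The chunk representation, precomputed similarity values, and weights here are illustrative assumptions; in practice sim(·) would be an embedding similarity.

```python
import math

def composite_score(chunk, goal_weights, goal_sims, d=0.5, lam=0.5):
    """Score = base-level activation (recency/frequency) + goal-directed
    spreading activation + provenance-weighted authority."""
    recency = math.log(sum(lag ** -d for lag in chunk["access_lags"]))
    relevance = sum(w * s for w, s in zip(goal_weights, goal_sims))
    return recency + relevance + lam * chunk["authority"]

def rank_chunks(chunks, goal_weights, sims_by_id, tau=0.0):
    """Drop chunks scoring below the retrieval threshold tau; best first."""
    scored = [(composite_score(c, goal_weights, sims_by_id[c["id"]]), c["id"])
              for c in chunks]
    return [cid for score, cid in sorted(scored, reverse=True) if score > tau]
```

The threshold tau plays the same role as ACT-R's retrieval threshold: low-activation chunks simply never enter the context window.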

3.2.3 Global Workspace Theory (GWT)#

Formal Structure#

Global Workspace Theory (Baars, 1988, 2005) models consciousness as a broadcast mechanism: many specialized unconscious processors compete to have their outputs broadcast to the global workspace, making those outputs available to all other processors simultaneously.

\mathcal{S}_{GWT} = \langle \mathcal{W}, \{P_1, \ldots, P_n\}, \text{Compete}, \text{Broadcast}, \text{Attend} \rangle

Component | Definition
──────────────────────
\mathcal{W} | Global Workspace: a shared, capacity-limited broadcast medium.
P_i | Specialist processors: domain-specific modules that operate in parallel on their own inputs.
\text{Compete} | Competition for access: processors form coalitions and compete for workspace access based on relevance, urgency, and activation strength.
\text{Broadcast} | Global broadcast: the winning coalition's output is broadcast to all processors, enabling integration.
\text{Attend} | Attentional selection: a gating mechanism that determines which coalition wins workspace access.

Mapping to Agentic AI: The Context Window as Global Workspace#

The LLM context window is a direct structural analog of the Global Workspace:

  • Capacity-limited: the context window has a fixed token budget. Not all information can be present simultaneously.
  • Broadcast medium: whatever is in the context window is available to the model's entire parameter space during inference—functionally, it is broadcast to all "processors" (attention heads, layers).
  • Competition for access: retrieval results, memory summaries, tool outputs, and user inputs compete for inclusion in the context window. The context engineering pipeline is the attentional selection mechanism.
  • Integration through broadcast: the model synthesizes information from all context sources simultaneously during the forward pass, exactly as GWT predicts.
ALGORITHM 3.3: GWT-Inspired-Context-Assembly
─────────────────────────────────────────────
Input: Candidate context items C = {c_1, ..., c_N}, token budget W
Output: Assembled global workspace GW
 
1.  FOR EACH c_i IN C:
2.      score_i ← ACTIVATION(c_i)   // relevance + urgency + authority + recency
3.  coalitions ← FORM_COALITIONS(C, scores)   // group related items
4.  SORT coalitions BY aggregate_score DESCENDING
5.  GW ← ∅; tokens_used ← 0
6.  FOR EACH coalition IN coalitions:
7.      cost ← TOKEN_COUNT(coalition)
8.      IF tokens_used + cost ≤ W THEN
9.          GW ← GW ∪ COMPRESS(coalition)
10.         tokens_used ← tokens_used + TOKEN_COUNT(COMPRESS(coalition))
11.     ELSE
12.         BREAK   // budget exhausted
13. RETURN GW
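
A compact executable rendering of Algorithm 3.3. Two simplifications are assumed for illustration: coalitions arrive pre-grouped as (score, items) pairs, and the COMPRESS step is omitted (items are admitted verbatim); the whitespace token estimator is likewise a stand-in for a real tokenizer.

```python
def assemble_workspace(coalitions, budget):
    """Greedy admission control over the context window: coalitions are
    (aggregate_score, [items]) pairs competing for broadcast access."""
    def token_count(items):
        return sum(len(item.split()) for item in items)   # crude estimator

    workspace, used = [], 0
    for score, items in sorted(coalitions, reverse=True):  # best coalitions first
        cost = token_count(items)
        if used + cost <= budget:
            workspace.extend(items)
            used += cost
        else:
            break                                          # budget exhausted
    return workspace
```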

Production Implication: Context as Attentional Bottleneck#

GWT explains why context engineering is the most important capability in agentic AI: the context window is the agent's consciousness. If critical information does not make it into the context window, it does not exist for the agent's reasoning. Conversely, irrelevant information in the context window displaces relevant information and degrades reasoning quality (analogous to attentional capture by irrelevant stimuli).

This motivates the engineering principle: the context window must be treated as a scarce, managed resource with explicit admission control, not as an unbounded append-only log.

3.2.4 Comparative Synthesis#

Property | SOAR | ACT-R | GWT | LLM Agent Implication
─────────────────────────────────────────────────────
Memory architecture | WM + 3-type LTM | Modular buffers + declarative/procedural | Global workspace + specialists | Multi-layer memory with typed interfaces
Learning mechanism | Chunking (compile experience) | Utility learning + activation tuning | Coalition strengthening | Trace-to-workflow compilation + retrieval scoring adaptation
Failure handling | Impasse → subgoal | Retrieval failure → default | Competition failure → no broadcast | Mechanical impasse detection → escalation
Attention/Selection | Preference-based | Activation-based | Competition-based | Composite scoring for context admission
Bottleneck | Operator selection | Production firing | Workspace capacity | Context window token budget

3.3 Reactive, Deliberative, and Hybrid Agent Architectures#

3.3.1 Reactive Architectures#

Definition and Principles#

A purely reactive agent maintains no internal world model and no explicit goals. It maps percepts directly to actions through a set of condition–action rules (behaviors):

\text{Action}(t) = \sigma(\text{Percept}(t))

where \sigma is a fixed, stateless mapping function. The canonical reactive architecture is Brooks' Subsumption Architecture (detailed in §3.7), but the principles extend to any system that acts without deliberation.

Properties#

  • Latency: minimal; no planning or search overhead.
  • Robustness: high for well-defined environments; no world model to become inconsistent.
  • Scalability of behavior: poor; complex behavior requires many hand-crafted rules with complex interaction patterns.
  • Adaptability: low; cannot reason about novel situations not covered by existing rules.

LLM Agent Analog#

A "zero-shot" LLM call with a system prompt and no retrieval, no memory, no tool use, and no verification loop is the closest LLM analog to a reactive agent. It maps input directly to output through the model's learned \sigma (parameter-encoded behaviors). This is the weakest agent architecture and the one most prone to hallucination, because there is no mechanism for grounding, verification, or adaptation.

3.3.2 Deliberative Architectures#

Definition and Principles#

A deliberative agent maintains an explicit symbolic world model and uses search-based planning to select actions:

\text{Action}(t) = \text{Plan}(\text{Model}(t), \text{Goal}(t))

The agent constructs a model of the world, defines a goal state, and uses a planning algorithm (e.g., STRIPS, GraphPlan, HTN) to find a sequence of actions that transforms the current state into the goal state.

Properties#

  • Latency: high; planning is computationally expensive, especially for large state spaces.
  • Correctness: high for well-modeled domains; the plan is provably correct with respect to the model.
  • Brittleness: high sensitivity to model inaccuracies; if the world model is wrong, the plan is wrong.
  • Adaptability: replanning is possible but expensive.

LLM Agent Analog#

An LLM agent that performs explicit multi-step planning before acting—decomposing a user request into a structured task graph, selecting tools for each step, and executing sequentially with verification—is a deliberative agent. The "world model" is the agent's compiled context (beliefs), and the "planning algorithm" is the LLM's chain-of-thought reasoning constrained by the plan library.

3.3.3 Hybrid Architectures#

Definition and Layered Structure#

Hybrid architectures combine reactive and deliberative layers, typically in a layered configuration:

\mathcal{A}_{hybrid} = \langle L_{\text{reactive}}, L_{\text{planning}}, L_{\text{cooperative}}, \text{Control} \rangle

The canonical examples are:

  • InteRRaP (Müller, 1996): three layers (behavior, plan, cooperation) with bottom-up activation and top-down commitment.
  • TouringMachines (Ferguson, 1992): three layers (reactive, planning, modeling) with a control framework that mediates between layers.
  • 3T Architecture (Bonasso et al., 1997): three tiers (skills, sequencing, deliberation) with each tier operating at a different temporal granularity.

The Layered Control Pattern#

ALGORITHM 3.4: Hybrid-Layered-Agent
────────────────────────────────────
Input: Percept p, current agent state S
Output: Action a
 
1.  // Layer 0: Safety / Reactive (highest priority, lowest latency)
2.  IF SAFETY_VIOLATION(p, S) THEN
3.      RETURN SAFETY_ACTION(p)          // immediate, no deliberation
4.
5.  // Layer 1: Procedural / Skill Execution
6.  IF ACTIVE_PLAN(S) AND NEXT_STEP_APPLICABLE(S, p) THEN
7.      RETURN EXECUTE_NEXT_STEP(S)      // continue committed plan
8.
9.  // Layer 2: Deliberative / Planning
10. IF GOAL_UNACHIEVED(S) THEN
11.     plan ← REPLAN(S.model, S.goal, p)
12.     S.plan ← plan
13.     RETURN EXECUTE_NEXT_STEP(S)
14.
15. // Layer 3: Cooperative / Meta-reasoning
16. IF COORDINATION_NEEDED(S) THEN
17.     RETURN COORDINATE(S, peers)
18.
19. RETURN IDLE_ACTION()

Mapping to Production LLM Agent Stacks#

The hybrid layered architecture is the dominant pattern in production agentic systems:

Layer | LLM Agent Realization | Temporal Granularity
────────────────────────────────────────────────────
Reactive / Safety | Input/output guardrails, content filters, authorization checks, rate limiters. No LLM inference required; executed as typed pre/post-conditions. | Microseconds to milliseconds
Procedural / Skill | Tool execution, API calls, file operations—following a committed plan step. Tool routing based on typed schemas. | Milliseconds to seconds
Deliberative / Planning | LLM-driven task decomposition, plan generation, retrieval-augmented reasoning. Full inference cycle. | Seconds to minutes
Cooperative / Meta | Multi-agent coordination, human escalation, cross-agent memory sharing, approval gates. | Minutes to hours

The control framework mediates between layers through priority-based arbitration: safety always preempts planning; planning preempts idle exploration; ongoing skill execution preempts replanning unless a verification failure or impasse is detected.
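
Algorithm 3.4's arbitration reduces to a first-match dispatch over layers in priority order. The predicate names below mirror the pseudocode; representing agent state as a flat dict of flags is an illustrative assumption.

```python
def arbitrate(state):
    """Priority-based layer arbitration: safety preempts ongoing skill
    execution, which preempts replanning, which preempts coordination."""
    layers = [
        ("safety_violation",     "safety_action"),      # Layer 0: reactive/safety
        ("plan_step_applicable", "execute_next_step"),  # Layer 1: procedural/skill
        ("goal_unachieved",      "replan"),             # Layer 2: deliberative
        ("coordination_needed",  "coordinate"),         # Layer 3: cooperative/meta
    ]
    for condition, action in layers:
        if state.get(condition, False):
            return action
    return "idle"
```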

3.3.4 Trade-Off Analysis#

\text{Architecture Utility} = f(\text{reactivity}, \text{optimality}, \text{adaptability}, \text{complexity})

Architecture | Reactivity | Optimality | Adaptability | Implementation Complexity
─────────────────────────────────────────────────────────────────────────────────
Reactive | ★★★★★ | ★☆☆☆☆ | ★☆☆☆☆ | ★★☆☆☆
Deliberative | ★☆☆☆☆ | ★★★★★ | ★★★☆☆ | ★★★★☆
Hybrid | ★★★★☆ | ★★★★☆ | ★★★★☆ | ★★★★★

The hybrid architecture is strictly preferred for production agentic systems because it provides safety guarantees (reactive layer), plan quality (deliberative layer), and coordination capacity (cooperative layer) simultaneously. The cost is implementation complexity, which is managed through typed layer interfaces and explicit control policies.


3.4 Hierarchical Task Networks (HTN) for Complex Plan Decomposition#

3.4.1 Formal Definition#

A Hierarchical Task Network planning domain is defined as:

\mathcal{H} = \langle \mathcal{T}_p, \mathcal{T}_c, \mathcal{M}, \mathcal{O}, s_0 \rangle

Symbol | Definition
───────────────────
\mathcal{T}_p | Primitive tasks: directly executable actions (operators).
\mathcal{T}_c | Compound (abstract) tasks: tasks that must be decomposed further.
\mathcal{M} | Methods: decomposition rules m = (t, \text{pre}(m), \text{subtasks}(m)), where t \in \mathcal{T}_c is the task being decomposed, \text{pre}(m) are preconditions, and \text{subtasks}(m) is an ordered or partially ordered set of subtasks.
\mathcal{O} | Operators: state-transition specifications for primitive tasks, o = (t, \text{pre}(o), \text{eff}(o)).
s_0 | Initial state: the starting world state.

A task network is a tuple w = (U, C) where U is a set of task nodes and C is a set of constraints (ordering, variable binding, state conditions).

HTN Planning as Recursive Decomposition#

HTN planning proceeds by recursively replacing compound tasks with their method decompositions until only primitive tasks remain:

w_0 \xrightarrow{m_1} w_1 \xrightarrow{m_2} \cdots \xrightarrow{m_k} w_k

where w_0 contains the initial abstract task and w_k contains only primitive tasks whose execution sequence achieves the goal. The decomposition is valid if:

  1. Every compound task is replaced by a method whose preconditions are satisfied.
  2. The resulting primitive task sequence is executable in the initial state.
  3. All ordering and binding constraints are satisfied.

3.4.2 HTN Decomposition Algorithm#

ALGORITHM 3.5: HTN-Decompose
─────────────────────────────
Input: Task network w = (U, C), methods M, operators O, state s
Output: Executable plan π or FAILURE
 
1.  IF U contains only primitive tasks THEN
2.      π ← LINEARIZE(U, C)                    // respect ordering constraints
3.      IF EXECUTABLE(π, s, O) THEN RETURN π
4.      ELSE RETURN FAILURE
5.  SELECT t ∈ U such that t ∈ T_c             // choose a compound task
6.  applicable ← {m ∈ M : m.task = t AND SATISFIES(m.pre, s)}
7.  IF applicable = ∅ THEN RETURN FAILURE
8.  FOR EACH m ∈ applicable (ordered by heuristic priority):
9.      w' ← REPLACE(w, t, m.subtasks)          // substitute decomposition
10.     C' ← C ∪ m.constraints
11.     result ← HTN_DECOMPOSE(w' = (U', C'), M, O, s)
12.     IF result ≠ FAILURE THEN RETURN result
13. RETURN FAILURE                               // all methods failed → backtrack
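
Algorithm 3.5 can be rendered executably for the totally ordered case. Two simplifying assumptions for illustration: the world state is a set of facts, and operator effects only add facts (real HTN operators also delete); backtracking over method alternatives is preserved.

```python
def htn_decompose(tasks, methods, operators, state):
    """Totally ordered HTN decomposition with backtracking.
    tasks: list of task names; methods: compound task -> list of
    (preconditions, subtasks) alternatives; operators: primitive task ->
    (preconditions, add_effects). Returns a primitive plan or None."""
    if not tasks:
        return []
    head, rest = tasks[0], tasks[1:]
    if head in operators:                        # primitive task
        pre, add = operators[head]
        if not pre <= state:
            return None                          # not executable here
        tail = htn_decompose(rest, methods, operators, state | add)
        return None if tail is None else [head] + tail
    for pre, subtasks in methods.get(head, []):  # compound task: try methods
        if pre <= state:
            plan = htn_decompose(subtasks + rest, methods, operators, state)
            if plan is not None:
                return plan
    return None                                  # all methods failed -> backtrack
```

Returning None on exhausted alternatives corresponds to the FAILURE-driven backtrack in line 13 of the pseudocode.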

3.4.3 Mapping HTN to LLM Agent Task Decomposition#

HTN is the most natural planning formalism for LLM-based agents because LLMs already perform task decomposition as a reasoning pattern (chain-of-thought, tree-of-thought). The mapping is:

HTN Concept | LLM Agent Realization
───────────────────────────────────
Compound task | High-level user request or agent-generated subgoal (e.g., "Refactor the authentication module").
Method | A decomposition template: a known workflow for breaking a compound task into subtasks. Stored in the plan library as typed method schemas with preconditions.
Primitive task | A single tool call, API invocation, or atomic generation step with a typed schema (MCP tool surface).
Preconditions | Belief checks: "Is the repository cloned? Is the file accessible? Does the user have authorization?" Evaluated against the agent's current belief base.
Ordering constraints | Dependency edges in the task graph: "Subtask B requires the output of Subtask A." Enforced by the orchestrator.
Decomposition | LLM-driven planning: given a compound task and the current context, the model generates a decomposition. The plan library provides method templates; the LLM selects and adapts.
Backtracking | If a decomposition path fails (tool error, verification failure), the agent backtracks to try an alternative method, bounded by a retry budget.

Typed Method Schema (Pseudo-Schema)#

METHOD DecomposeCodeRefactor:
  task: "refactor_module"
  preconditions:
    - belief("repository_cloned") = true
    - belief("module_exists", module_name) = true
  subtasks:
    1. analyze_module(module_name) → analysis      [primitive, tool: code_analysis]
    2. generate_plan(analysis) → refactor_plan     [primitive, tool: llm_generate]
    3. apply_changes(refactor_plan) → diff          [primitive, tool: code_edit]
    4. run_tests(module_name) → test_result         [primitive, tool: test_runner]
    5. verify_result(test_result, refactor_plan)    [primitive, tool: llm_verify]
  ordering: 1 < 2 < 3 < 4 < 5
  rollback: revert_changes(diff)
  max_retries: 2

3.4.4 Ordered vs. Partially Ordered HTN#

  • Totally Ordered STN (Simple Task Networks): subtasks have a linear order. Simpler to execute but less parallelizable. Used when task dependencies are strictly sequential.
  • Partially Ordered STN: subtasks have partial ordering constraints. Enables parallel execution of independent subtasks. Requires a concurrent orchestrator with merge semantics.

For production multi-agent systems, partially ordered HTN is preferred because it enables:

$$\text{Speedup} = \frac{T_{\text{sequential}}}{T_{\text{critical path}}}$$

where the critical path is the longest dependency chain. Independent subtasks can be distributed to parallel agents with isolated workspaces.
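Given known (or estimated) subtask durations, the critical path — and hence the speedup ceiling — follows from a longest-path pass over the dependency DAG. A minimal sketch with hypothetical durations:

```python
def critical_path_time(durations, deps):
    """Longest dependency chain through a task DAG (memoized DFS).

    durations: task -> execution time; deps: task -> list of prerequisites.
    """
    memo = {}
    def finish(t):                      # earliest finish time of task t
        if t not in memo:
            memo[t] = durations[t] + max((finish(d) for d in deps.get(t, [])),
                                         default=0.0)
        return memo[t]
    return max(finish(t) for t in durations)

# Diamond graph: A precedes B and C (parallelizable), both precede D.
durations = {"A": 2.0, "B": 3.0, "C": 5.0, "D": 1.0}
deps = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}

t_seq = sum(durations.values())                  # 11.0: fully sequential
t_crit = critical_path_time(durations, deps)     # 8.0: A -> C -> D
speedup = t_seq / t_crit                         # 1.375
```

Here B's three units of work hide entirely behind C's five, so adding agents beyond the width of the DAG buys nothing: the critical path is the floor.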

3.4.5 HTN and Verification#

HTN naturally integrates verification as a decomposition primitive. For every method that produces an artifact, a verification subtask is added:

$$\text{subtasks}(m) = [\ldots, \text{produce}(artifact), \text{verify}(artifact, spec), \ldots]$$

If $\text{verify}$ fails, the orchestrator triggers a repair method or backtracks. This is the formal basis for the plan → act → verify → repair loop in agentic systems.

3.4.6 Depth Bounding and Cost Control#

Unbounded HTN decomposition is a risk in LLM agents: the model may recursively decompose tasks into ever-finer subtasks, consuming tokens and latency without converging. The mitigation is a depth bound:

$$\text{depth}(\text{decomposition}) \leq D_{\max}$$

and a cost bound:

$$\sum_{i=1}^{|\pi|} \text{cost}(t_i) \leq C_{\max}$$

where $\text{cost}(t_i)$ includes inference tokens, tool invocation cost, and latency. The orchestrator enforces these bounds mechanically, triggering escalation or summarization when limits are approached.
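Mechanical enforcement can be as simple as a budget object consulted before every decomposition step or tool call — a hypothetical sketch, where the return value signals "near limit, start summarizing":

```python
class BudgetExceeded(Exception):
    pass

class DecompositionBudget:
    """Enforces depth and cumulative cost bounds on HTN expansion."""
    def __init__(self, max_depth, max_cost, warn_fraction=0.8):
        self.max_depth, self.max_cost = max_depth, max_cost
        self.warn_fraction = warn_fraction   # trigger summarization early
        self.spent = 0.0

    def charge(self, depth, cost):
        """Record the cost of one step; raise when a bound is violated.

        Returns True once spend crosses the warn threshold, so the
        orchestrator can summarize or escalate before hitting C_max.
        """
        if depth > self.max_depth:
            raise BudgetExceeded(f"depth {depth} > D_max={self.max_depth}")
        self.spent += cost
        if self.spent > self.max_cost:
            raise BudgetExceeded(f"cost {self.spent:.0f} > C_max={self.max_cost}")
        return self.spent >= self.warn_fraction * self.max_cost

budget = DecompositionBudget(max_depth=4, max_cost=10_000)  # token units
near_limit = budget.charge(depth=1, cost=3_000)   # False: 3k of 10k spent
near_limit = budget.charge(depth=2, cost=6_000)   # True: 9k >= 0.8 * 10k
```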


3.5 OODA (Observe–Orient–Decide–Act) Loop as Agent Execution Primitive#

3.5.1 Origin and Formal Structure#

The OODA loop was developed by military strategist John Boyd (1976, 1987) as a model of competitive decision-making. Its power lies not in the individual phases but in the tempo of the loop: the agent that cycles through OODA faster than its environment changes can maintain decision advantage.

$$\mathcal{L}_{OODA} = \langle \text{Observe}, \text{Orient}, \text{Decide}, \text{Act}, \text{Feedback} \rangle$$

| Phase | Definition |
| --- | --- |
| Observe | Gather raw data from the environment: percepts, tool outputs, retrieval results, user messages, system telemetry. |
| Orient | The critical phase. Synthesize observations with prior knowledge, mental models, cultural context, and previous experience to form a situational awareness picture. Orientation shapes what the agent notices and how it interprets information. |
| Decide | Select a course of action from the oriented understanding. This may be explicit (plan selection) or implicit (pattern-matched response). |
| Act | Execute the selected action and generate new environmental effects, which feed back into the next Observe phase. |
| Feedback | Two feedback loops: (1) Act → Observe (external feedback through environment change), and (2) Orient → Observe (internal feedback; orientation biases future observation). |

3.5.2 Boyd's Insight: Orientation as the Schwerpunkt#

Boyd's most important contribution is the recognition that Orient is the central phase. Orientation determines:

  • What the agent attends to in observations (selective perception).
  • How the agent interprets observations (framing, schema application).
  • What options the agent considers (option generation is orientation-dependent).
  • How quickly the agent decides (well-oriented agents decide implicitly, without deliberative search).

In agentic AI terms, Orientation is Context Engineering. The quality of the agent's orientation—its compiled context, retrieved evidence, memory summaries, and policy constraints—determines the quality of every downstream decision.

3.5.3 OODA Mapping to the Agentic Agent Loop#

ALGORITHM 3.6: OODA-Agent-Loop
───────────────────────────────
Input: Environment E, agent state A, context budget W
Output: Continuous action stream
 
1.  LOOP (bounded by max_iterations or convergence):
2.      // OBSERVE
3.      raw ← COLLECT_OBSERVATIONS(E)             // tool outputs, user input, telemetry
4.      filtered ← FILTER_NOISE(raw, A.attention_policy)
5.
6.      // ORIENT
7.      queries ← DECOMPOSE_AND_EXPAND(filtered, A.goals)
8.      evidence ← HYBRID_RETRIEVE(queries, retrieval_config)
9.      memory ← RECALL(A.memory_layers, A.goals)
10.     context ← COMPILE_PREFILL(A.policy, evidence, memory, filtered, budget=W)
11.     situation ← LLM_ORIENT(context)            // situational assessment
12.
13.     // DECIDE
14.     options ← GENERATE_OPTIONS(situation, A.tool_registry)
15.     selected ← EVALUATE_AND_SELECT(options, A.preferences, A.constraints)
16.
17.     // ACT
18.     result ← EXECUTE(selected, A.tool_runtime)
19.     A.history ← APPEND(A.history, (filtered, situation, selected, result))
20.
21.     // FEEDBACK
22.     IF VERIFY(result, selected.postcondition) = FAIL THEN
23.         A.orientation_bias ← UPDATE_BIAS(A.orientation_bias, failure_signal)
24.         CONTINUE                                // rapid re-OODA
25.     ELSE
26.         COMMIT(result)
27.         UPDATE_MEMORY(A.memory_layers, (situation, selected, result))

3.5.4 Tempo and Latency Engineering#

Boyd's competitive advantage through tempo translates directly to latency engineering in agentic systems:

$$\text{OODA Cycle Time} = T_{\text{observe}} + T_{\text{orient}} + T_{\text{decide}} + T_{\text{act}}$$

For an agent to maintain decision advantage over environmental change, the cycle time must be less than the environment's change rate:

$$T_{\text{OODA}} < \frac{1}{\lambda_{\text{env}}}$$

where $\lambda_{\text{env}}$ is the rate of environmentally significant state changes.

| Phase | Dominant Latency Contributors | Optimization Strategies |
| --- | --- | --- |
| Observe | I/O, polling, event subscription | Event-driven observation (webhooks, streams) rather than polling; parallel observation channels. |
| Orient | Retrieval latency, LLM inference | Cache retrieval results; use tiered retrieval (exact match first, semantic only if needed); pre-compute orientation summaries. |
| Decide | LLM inference | Implicit orientation (pattern-matched decisions bypass deliberative inference); pre-computed decision tables for routine cases. |
| Act | Tool invocation, API latency | Async tool execution; parallelized independent actions; pre-staged environments. |

3.5.5 OODA vs. BDI: Complementary, Not Competing#

OODA and BDI are orthogonal frameworks that compose well:

  • OODA provides the execution tempo: the continuous loop structure with feedback.
  • BDI provides the commitment mechanism: intentions persist across OODA cycles, providing behavioral stability.
  • Combined: the agent runs an OODA loop, where Orient corresponds to belief revision, Decide corresponds to intention selection/filtering, and Act corresponds to plan execution. Reconsideration (§3.1) is triggered by Orient detecting significant change.

3.6 Blackboard Architectures for Multi-Source Knowledge Integration#

3.6.1 Formal Structure#

The Blackboard Architecture (Erman et al., 1980; Nii, 1986) was originally developed for the HEARSAY-II speech understanding system. Its core insight is that complex problems requiring heterogeneous knowledge sources are best solved by posting partial solutions to a shared data structure (the blackboard) where independent knowledge sources can contribute incrementally.

$$\mathcal{A}_{BB} = \langle \text{BB}, \{KS_1, \ldots, KS_n\}, \text{Control} \rangle$$

| Component | Definition |
| --- | --- |
| BB (Blackboard) | A structured, shared workspace organized into hierarchical levels (e.g., raw data → features → hypotheses → solutions). Knowledge sources read from and write to the blackboard. |
| $KS_i$ (Knowledge Sources) | Independent, specialized modules, each with a trigger condition (pattern that activates the KS when matched on the blackboard) and a contribution function (transformation applied to blackboard contents). Knowledge sources do not communicate directly; all communication is mediated through the blackboard. |
| Control | A scheduling mechanism that determines which knowledge source to activate when multiple are triggered. May be priority-based, focus-based, or opportunistic. |

3.6.2 Blackboard Operation Cycle#

ALGORITHM 3.7: Blackboard-Cycle
────────────────────────────────
Input: Blackboard BB, Knowledge Sources KS[], Control strategy
Output: Updated Blackboard BB'
 
1.  LOOP until SOLUTION_FOUND(BB) or MAX_CYCLES:
2.      triggered ← {ks ∈ KS : TRIGGER(ks, BB) = true}
3.      IF triggered = ∅ THEN RETURN BB    // quiescent state
4.      ks* ← CONTROL_SELECT(triggered, BB) // select highest-priority KS
5.      contribution ← ks*.EXECUTE(BB)
6.      BB ← APPLY(BB, contribution)
7.      NOTIFY(KS, BB.changes)              // re-evaluate triggers
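A minimal executable rendering of Algorithm 3.7, assuming a blackboard held as a dict of levels and two hypothetical knowledge sources (a retriever and a synthesizer):

```python
def blackboard_cycle(bb, knowledge_sources, solution_found, max_cycles=50):
    """Run the trigger / select / execute loop until solution or quiescence.

    bb: dict mapping level name -> list of entries.
    knowledge_sources: list of (priority, trigger, execute) tuples, where
    trigger(bb) -> bool and execute(bb) -> (level, entry) contribution.
    """
    for _ in range(max_cycles):
        if solution_found(bb):
            break
        triggered = [ks for ks in knowledge_sources if ks[1](bb)]
        if not triggered:
            break                                    # quiescent state
        _, _, execute = max(triggered, key=lambda ks: ks[0])  # control select
        level, entry = execute(bb)
        bb[level].append(entry)                      # apply contribution
    return bb

bb = {"raw": ["what is HTN?"], "evidence": [], "answer": []}
retriever = (1,
             lambda bb: bb["raw"] and not bb["evidence"],
             lambda bb: ("evidence", f"doc about: {bb['raw'][0]}"))
synthesizer = (2,
               lambda bb: bb["evidence"] and not bb["answer"],
               lambda bb: ("answer", f"synthesis of {len(bb['evidence'])} chunks"))
result = blackboard_cycle(bb, [retriever, synthesizer],
                          solution_found=lambda bb: bool(bb["answer"]))
```

Note that the knowledge sources never reference each other: the synthesizer activates purely because the retriever's contribution appears on the blackboard, which is the defining property of the pattern.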

3.6.3 Mapping to Agentic AI: Multi-Source Retrieval and Reasoning#

The blackboard architecture maps elegantly to multi-source knowledge integration in agentic systems:

| Blackboard Concept | Agentic AI Realization |
| --- | --- |
| Blackboard | A structured shared workspace (e.g., a JSON document, a knowledge graph, or a structured working memory) where partial results from different sources are posted. |
| Knowledge Source: Semantic Search | A specialist that takes a query from the blackboard, executes semantic search against a vector store, and posts provenance-tagged results back. |
| Knowledge Source: Exact Match | A specialist that searches structured databases, code indices, or documentation by exact keyword/ID match. |
| Knowledge Source: Graph Traversal | A specialist that follows relationship edges in a knowledge graph to find related entities and posts lineage-enriched context. |
| Knowledge Source: Code Analysis | A specialist that runs static analysis, AST parsing, or dependency resolution and posts code-derived enrichment. |
| Knowledge Source: Memory Recall | A specialist that queries episodic and semantic memory layers, applying activation-based scoring (§3.2.2). |
| Knowledge Source: Live Inspection | A specialist that queries runtime state, logs, metrics, or browser state and posts current observations. |
| Knowledge Source: LLM Synthesis | A specialist that takes the accumulated blackboard contents and generates a synthesized answer, critique, or plan revision. |
| Control | The context engineering pipeline's ranking and admission control: which knowledge source to invoke next, which results to admit to the context window. |

Hierarchical Blackboard Levels for Agentic Retrieval#

Level 4: SYNTHESIZED RESPONSE     [Final answer, verified]
Level 3: HYPOTHESES/PLANS         [Candidate answers, plan options]
Level 2: EVIDENCE CHUNKS          [Ranked, provenance-tagged retrieval results]
Level 1: PROCESSED OBSERVATIONS   [Parsed queries, expanded subqueries, metadata]
Level 0: RAW INPUT                [User query, tool outputs, environment signals]

Each knowledge source reads from its input level(s) and writes to its output level. This hierarchy ensures that the synthesis agent (Level 4) operates only on processed, ranked, provenance-tagged evidence, not on raw data.

3.6.4 Production Advantages of the Blackboard Pattern#

  1. Modularity: knowledge sources are independently deployable, testable, and replaceable. Adding a new retrieval source requires only registering a new KS with the blackboard, not modifying the orchestrator.
  2. Heterogeneity: different KS modules can use different technologies (vector search, SQL, graph queries, API calls) without coupling.
  3. Incremental refinement: partial solutions are progressively enriched, which aligns with the agent's iterative reasoning pattern.
  4. Observability: the blackboard is a complete, inspectable record of all intermediate contributions, enabling debugging and audit.

3.6.5 Risks and Mitigations#

| Risk | Mitigation |
| --- | --- |
| Blackboard contention | Typed write schemas; each KS writes to a designated level with a structured format. Concurrent writes are merge-safe by schema design. |
| Runaway KS activation | Bounded cycle count; each KS has an invocation budget per task. Control enforces monotonic progress (no KS can repeatedly write the same content). |
| Stale data on blackboard | TTL (time-to-live) on blackboard entries; entries older than threshold are evicted or re-validated before use. |
| Token budget overflow | The control component enforces a token budget: only the highest-ranked blackboard entries are admitted to the LLM's context window. |

3.7 Subsumption Architectures for Priority-Based Behavior Arbitration#

3.7.1 Formal Definition#

The Subsumption Architecture (Brooks, 1986) organizes agent behavior as a stack of behavior layers, where each layer implements a complete, independent behavior mapping from percepts to actions. Higher layers subsume (suppress or override) lower layers:

$$\mathcal{A}_{sub} = \langle (L_0, L_1, \ldots, L_k), \text{Suppress}, \text{Inhibit} \rangle$$

| Component | Definition |
| --- | --- |
| $L_i$ | Behavior layer $i$: a complete percept-to-action mapping. Higher index = higher competence level. |
| $\text{Suppress}(L_j, L_i)$ | Layer $j$ suppresses the input to layer $i$ (replaces $i$'s percepts with $j$'s signals). |
| $\text{Inhibit}(L_j, L_i)$ | Layer $j$ inhibits the output of layer $i$ (blocks $i$'s actions and substitutes $j$'s). |

The key property: lower layers are complete, operational systems that function correctly even if all higher layers are removed. Higher layers add competence without modifying lower layers.

3.7.2 Canonical Layer Stack#

Layer 3: EXPLORE        — seek novel states, learn    [lowest priority]
Layer 2: BUILD_MAP      — construct world model
Layer 1: WANDER         — move through environment
Layer 0: AVOID_OBSTACLE — emergency collision avoidance [highest priority]

Layer 0 operates independently and correctly. Layer 1 adds wandering behavior but Layer 0 can inhibit Layer 1's motor output when an obstacle is detected. Layer 2 adds map building but does not affect lower layers' safety guarantees.

3.7.3 Mapping to LLM Agent Safety and Priority Architecture#

The subsumption architecture provides the theoretical foundation for layered safety and priority arbitration in agentic AI systems:

| Subsumption Layer | LLM Agent Realization | Priority |
| --- | --- | --- |
| Layer 0: Safety Stop | Hard-coded guardrails: content filters, PII detection, authorization enforcement, rate limiting. Operates without LLM inference. Can inhibit all higher-layer outputs. | Highest (absolute) |
| Layer 1: Policy Enforcement | System prompt constraints, organizational policy rules, normative filters ($\mathcal{N}$ from §3.1.3). Suppresses inputs that violate policy. | High |
| Layer 2: Plan Execution | Committed plan step execution: tool calls, API invocations following the current intention. | Medium |
| Layer 3: Deliberative Planning | LLM-driven reasoning, plan generation, goal decomposition. Can be inhibited by safety or policy layers. | Lower |
| Layer 4: Exploratory Reasoning | Speculative reasoning, creative generation, exploratory retrieval. Lowest priority; suppressed when any higher layer is active. | Lowest |

Arbitration Semantics#

The output action at time tt is determined by:

$$a_t = \begin{cases} L_0(p_t) & \text{if } \text{active}(L_0) \\ L_1(p_t) & \text{if } \text{active}(L_1) \land \neg\text{active}(L_0) \\ \;\;\vdots \\ L_k(p_t) & \text{if } \text{active}(L_k) \land \neg\text{active}(L_{k-1}) \land \cdots \land \neg\text{active}(L_0) \end{cases}$$

This is a strict priority scheme: safety always wins. This is the formal justification for why safety guardrails must be implemented as pre-inference, mechanically enforced gates, not as prompt instructions that the LLM might ignore.
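The strict priority scheme reduces to first-match dispatch over layers ordered from highest priority downward. A sketch with hypothetical layer predicates, where the safety layer is an ordinary function evaluated before any model inference:

```python
def arbitrate(layers, percept):
    """Return the action of the highest-priority active layer.

    layers: list ordered from highest priority (index 0) to lowest; each is
    (name, active, act), where active(percept) -> bool gates the layer.
    """
    for name, active, act in layers:
        if active(percept):
            return name, act(percept)
    return "noop", None                   # no layer active

layers = [
    ("safety_stop",  lambda p: p.get("pii_detected", False),
                     lambda p: "BLOCK_OUTPUT"),
    ("policy",       lambda p: p.get("policy_violation", False),
                     lambda p: "REWRITE_REQUEST"),
    ("plan_exec",    lambda p: p.get("plan_step") is not None,
                     lambda p: f"EXECUTE:{p['plan_step']}"),
    ("deliberation", lambda p: True,      # always-available fallback
                     lambda p: "GENERATE_PLAN"),
]

# Safety inhibits everything else, even when a plan step is ready to run.
result = arbitrate(layers, {"pii_detected": True, "plan_step": "t1"})
```

Because `arbitrate` never consults the model when a higher layer fires, the safety decision is mechanically enforced rather than prompt-mediated.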

3.7.4 Production Implications#

  • Safety invariants are independent of model behavior: Layer 0 operates without LLM inference and cannot be circumvented by prompt injection, jailbreaking, or model error.
  • Graceful degradation: if the deliberative layer (Layer 3) fails, the agent falls back to plan execution (Layer 2). If plan execution fails, it falls back to policy enforcement (Layer 1). The system always has a safe, operational mode.
  • Incremental capability addition: new competence layers can be added without modifying safety or policy layers, reducing regression risk.
  • Testability: each layer can be tested independently, including testing that safety layers correctly inhibit lower layers.

3.8 Actor Model and Communicating Sequential Processes (CSP) for Agent Concurrency#

3.8.1 The Actor Model#

Formal Definition#

The Actor Model (Hewitt et al., 1973; Agha, 1986) defines the fundamental unit of concurrent computation as an actor—an entity that:

  1. Has a private state (not shared with any other actor).
  2. Processes messages from an unbounded mailbox, one at a time.
  3. In response to a message, can: (a) send messages to other actors, (b) create new actors, (c) designate a replacement behavior for the next message.
$$\text{Actor} : \text{Message} \times \text{State} \rightarrow (\text{Messages}_{\text{sent}}, \text{Actors}_{\text{created}}, \text{State}')$$

Key Properties#

  • Location transparency: actors are addressed by identity, not location; the runtime handles routing.
  • No shared state: all communication is via asynchronous message passing; no locks, no shared memory.
  • Mailbox semantics: messages are queued; processing is serial within an actor, parallel across actors.
  • Supervision hierarchy: actors are organized in trees; parent actors supervise children and handle their failures (let-it-crash philosophy, formalized in Erlang/OTP and Akka).
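The mailbox discipline — private state, serial processing within an actor, parallelism across actors — can be approximated in Python with one thread and one queue per actor. A minimal sketch (no supervision tree, no location transparency):

```python
import queue
import threading

class Actor:
    """One thread, one mailbox, strictly serial message handling."""
    def __init__(self, handler, initial_state):
        self._handler = handler             # (state, msg) -> new state
        self.state = initial_state          # private: mutated only by own thread
        self.mailbox = queue.Queue()        # unbounded FIFO mailbox
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, msg):                    # asynchronous: never blocks sender
        self.mailbox.put(msg)

    def _run(self):
        while True:
            msg = self.mailbox.get()        # process one message at a time
            if msg is None:                 # poison pill terminates the actor
                break
            self.state = self._handler(self.state, msg)

    def stop(self):
        self.mailbox.put(None)              # queued after pending messages
        self._thread.join()

# Hypothetical counter actor: tallies token counts it is sent.
counter = Actor(lambda state, msg: state + msg["tokens"], initial_state=0)
for n in (10, 20, 12):
    counter.send({"tokens": n})
counter.stop()                              # drain mailbox, then join
# counter.state == 42
```

Reading `counter.state` is safe only after `stop()`: before the join, the state belongs exclusively to the actor's own thread, which is the no-shared-state property in miniature.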

3.8.2 Communicating Sequential Processes (CSP)#

Formal Definition#

CSP (Hoare, 1978, 1985) models concurrent systems as processes that communicate through synchronous channels:

$$P \;=\; a \rightarrow P' \;\mid\; P \parallel Q \;\mid\; P \;\square\; Q \;\mid\; P \sqcap Q \;\mid\; \text{STOP} \;\mid\; \text{SKIP}$$

| Construct | Meaning |
| --- | --- |
| $a \rightarrow P$ | Prefix: perform event $a$, then behave as $P$. |
| $P \parallel Q$ | Parallel composition: $P$ and $Q$ execute concurrently, synchronizing on shared events. |
| $P \;\square\; Q$ | External choice: the environment chooses between $P$ and $Q$. |
| $P \sqcap Q$ | Internal (nondeterministic) choice: the process chooses. |
| $\text{STOP}$ | Deadlock: no further events. |
| $\text{SKIP}$ | Successful termination. |

Key Properties#

  • Synchronous communication: send and receive on a channel happen simultaneously; the sender blocks until the receiver is ready (rendezvous semantics).
  • Compositionality: complex processes are built by composing simpler processes through parallel composition, choice, and sequential composition.
  • Formal verification: CSP supports model checking for deadlock freedom, livelock freedom, and refinement checking.

3.8.3 Actor vs. CSP: Trade-Off Analysis for Agent Concurrency#

| Dimension | Actor Model | CSP |
| --- | --- | --- |
| Communication | Asynchronous, unbounded mailbox | Synchronous, channel-based |
| Coupling | Loose; fire-and-forget messages | Tight; rendezvous synchronization |
| Ordering | No guaranteed message ordering (in pure model) | Deterministic ordering through channel discipline |
| Failure handling | Supervision trees (let-it-crash) | Process algebra (refinement, no built-in failure model) |
| Deadlock risk | No deadlock (no blocking) but possible mailbox overflow | Possible deadlock (detectable and verifiable) |
| Suitability | Distributed, loosely coupled multi-agent systems | Tightly coordinated, pipeline-style agent workflows |

3.8.4 Mapping to Multi-Agent Orchestration#

Actor-Based Multi-Agent Architecture#

ALGORITHM 3.8: Actor-Based-Agent-Orchestration
───────────────────────────────────────────────
Actors:
  - OrchestratorActor: receives user requests, spawns task actors
  - PlannerActor: decomposes tasks into subtasks
  - ExecutorActor[i]: executes primitive tasks (tool calls)
  - VerifierActor: validates execution results
  - MemoryActor: manages memory read/write with validation
  - SupervisorActor: monitors child actors, handles failures
 
Message Flow:
  User → OrchestratorActor: {type: "request", payload: task}
  OrchestratorActor → PlannerActor: {type: "plan", payload: task}
  PlannerActor → OrchestratorActor: {type: "plan_result", payload: subtasks[]}
  OrchestratorActor → ExecutorActor[i]: {type: "execute", payload: subtask_i}
  ExecutorActor[i] → VerifierActor: {type: "verify", payload: result_i}
  VerifierActor → OrchestratorActor: {type: "verified" | "failed", payload: ...}
  
Failure Handling:
  IF ExecutorActor[i] crashes THEN
    SupervisorActor receives EXIT signal
    SupervisorActor restarts ExecutorActor[i] with last checkpoint
    SupervisorActor replays failed message
  IF restart_count > max_retries THEN
    SupervisorActor escalates to OrchestratorActor
    OrchestratorActor triggers alternative decomposition or human escalation

CSP-Based Pipeline Composition#

For sequential agent workflows (plan → retrieve → act → verify → commit), CSP provides a cleaner model:

$$\text{AgentPipeline} = \text{Plan} \xrightarrow{ch_1} \text{Retrieve} \xrightarrow{ch_2} \text{Act} \xrightarrow{ch_3} \text{Verify} \xrightarrow{ch_4} \text{Commit}$$

Each stage is a CSP process; channels ($ch_i$) enforce synchronization points. The pipeline is deadlock-free by construction if every process eventually produces output on its output channel for every input on its input channel (no infinite loops without output).
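Channel-connected stages can be sketched with threads and bounded queues. This is an approximation, not true CSP: `queue.Queue(maxsize=1)` gives backpressure close to rendezvous, but a real rendezvous needs a zero-capacity channel, which `queue.Queue` does not support. The pipeline below is a hypothetical two-stage reduction of the plan → act chain:

```python
import queue
import threading

def stage(fn, inbox, outbox):
    """A CSP-style process: read from its input channel, write to its output."""
    def run():
        while True:
            item = inbox.get()
            if item is StopIteration:       # end-of-stream sentinel
                outbox.put(StopIteration)
                break
            outbox.put(fn(item))            # blocks until downstream is ready
    t = threading.Thread(target=run, daemon=True)
    t.start()
    return t

# Channels with capacity 1: a slow consumer stalls the whole pipeline,
# which is exactly the backpressure property the text describes.
ch0, ch1, ch2 = (queue.Queue(maxsize=1) for _ in range(3))
stage(lambda task: f"plan({task})", ch0, ch1)      # Plan process
stage(lambda plan: f"act({plan})", ch1, ch2)       # Act process

for task in ("t1", "t2"):
    ch0.put(task)
ch0.put(StopIteration)

results = []
while (item := ch2.get()) is not StopIteration:
    results.append(item)
# results == ["act(plan(t1))", "act(plan(t2))"]
```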

3.8.5 Production Implications#

| Concern | Actor/CSP Solution |
| --- | --- |
| Isolation | Each agent actor has private state; no shared mutable state. Memory access is mediated through MemoryActor with validation. |
| Fault tolerance | Supervision trees with bounded restart policies. Crash recovery is the default, not the exception. |
| Backpressure | Bounded mailboxes (actor) or synchronous channels (CSP) provide natural backpressure when downstream agents are overloaded. |
| Observability | Every message is traceable; structured logging at message boundaries provides complete execution traces. |
| Concurrency control | Work units are claimed by actors with explicit task leases; concurrent modification is prevented by design, not by locks. |
| Scalability | Actors can be distributed across nodes; the mailbox abstraction is location-transparent. |

3.9 Stigmergic Coordination: Environment-Mediated Multi-Agent Communication#

3.9.1 Formal Definition#

Stigmergy (Grassé, 1959) is a coordination mechanism in which agents communicate indirectly through modifications to a shared environment rather than through direct message passing.

$$\mathcal{C}_{stig} = \langle \mathcal{E}, \{A_1, \ldots, A_n\}, \text{Mark}, \text{Sense}, \text{Decay} \rangle$$

| Component | Definition |
| --- | --- |
| $\mathcal{E}$ | Shared environment: a data structure (artifact repository, shared workspace, knowledge base, task board) that all agents can read from and write to. |
| $A_i$ | Agents: autonomous entities that operate independently, without direct awareness of other agents. |
| $\text{Mark}(A_i, \mathcal{E}) \rightarrow \mathcal{E}'$ | Marking: agent $A_i$ modifies the environment (writes an artifact, updates a status, leaves a "pheromone" signal). |
| $\text{Sense}(A_j, \mathcal{E}) \rightarrow p_j$ | Sensing: agent $A_j$ perceives the current state of the environment, including marks left by other agents. |
| $\text{Decay}(\mathcal{E}, t) \rightarrow \mathcal{E}'$ | Decay: environmental marks decay over time, preventing stale signals from persisting indefinitely. |

Quantitative Stigmergy#

The strength of a stigmergic signal at location $\ell$ at time $t$ is:

$$\phi(\ell, t) = \sum_{i} \sum_{k} w_k \cdot e^{-\lambda(t - t_k^{(i)})}$$

where $w_k$ is the weight of mark $k$ deposited by agent $i$ at time $t_k^{(i)}$, and $\lambda$ is the decay rate. Agents preferentially attend to locations with high $\phi$, creating positive feedback loops that amplify successful patterns.
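The decay-weighted sum can be evaluated directly from the deposited marks — a numeric sketch of phi(l, t) with hypothetical weights and timestamps:

```python
import math

def signal_strength(marks, t, decay_rate):
    """phi(l, t): decay-weighted sum of marks deposited at one location.

    marks: list of (weight, deposit_time) pairs from any agents;
    decay_rate: lambda — larger means the environment forgets faster.
    """
    return sum(w * math.exp(-decay_rate * (t - t_dep))
               for w, t_dep in marks if t >= t_dep)

# Two unit-weight marks deposited at t=0 and t=5, decay rate lambda=0.1.
marks = [(1.0, 0.0), (1.0, 5.0)]
fresh = signal_strength(marks, t=10.0, decay_rate=0.1)   # ~0.974
stale = signal_strength(marks, t=50.0, decay_rate=0.1)   # ~0.018
# A task-selection policy would preferentially attend to high-phi locations.
```

The same two marks are worth fifty times less forty time units later, which is the mechanism that prevents stale signals from dominating agent attention.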

3.9.2 Mapping to Multi-Agent Agentic Systems#

| Stigmergic Concept | Agentic AI Realization |
| --- | --- |
| Shared environment | Shared artifact store (Git repository, shared workspace, task board, blackboard). |
| Marks | Committed code, documentation updates, test results, status annotations, TODO comments, task completions, error reports. |
| Sensing | Agents observe the shared artifact store before acting: reading current code state, checking test results, reviewing task board status. |
| Decay | Artifact staleness scoring: old TODOs lose priority; stale test results trigger re-execution; completed tasks are archived. |
| Positive feedback | Successful patterns (e.g., well-structured code modules) attract more agent attention (more tests, more documentation, more refinement). |
| Negative feedback | Failed patterns (e.g., failing tests, unresolved errors) trigger corrective agent attention. |

Stigmergic Task Board#

ALGORITHM 3.9: Stigmergic-Task-Board-Coordination
──────────────────────────────────────────────────
Shared State: TaskBoard TB (visible to all agents)
 
Agent_i behavior loop:
1.  tasks ← SENSE(TB, status="available")
2.  task* ← SELECT(tasks, A_i.competency, priority_score)
3.  success ← CLAIM(TB, task*, A_i.id, lease_duration)  // atomic CAS
4.  IF NOT success THEN GOTO 1                           // another agent claimed it
5.  result ← EXECUTE(task*, A_i.tools)
6.  MARK(TB, task*, result, status="completed" | "failed")
7.  IF result.status = "failed" THEN
8.      MARK(TB, task*, error_signal, status="needs_repair")
9.      // Decay will eventually re-surface this for another agent
10. GOTO 1
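The atomic CAS claim in step 3 can be modeled in-process with a lock guarding the board; in production this would be a conditional database update or a compare-and-swap on the task record. A sketch with hypothetical record fields:

```python
import threading
import time

class TaskBoard:
    """Shared environment: tasks are claimed via an atomic check-and-set."""
    def __init__(self, tasks, lease_duration=30.0):
        self._lock = threading.Lock()
        self.lease_duration = lease_duration
        # task_id -> {"status", "owner", "lease_expires"}
        self.tasks = {t: {"status": "available", "owner": None,
                          "lease_expires": 0.0} for t in tasks}

    def claim(self, task_id, agent_id):
        """Atomic CAS: succeed only if available or the prior lease expired."""
        now = time.monotonic()
        with self._lock:
            rec = self.tasks[task_id]
            if rec["status"] == "available" or rec["lease_expires"] < now:
                rec.update(status="claimed", owner=agent_id,
                           lease_expires=now + self.lease_duration)
                return True
            return False                 # another agent holds the lease

    def mark(self, task_id, status):
        """Step 6: leave a mark for other agents to sense."""
        with self._lock:
            self.tasks[task_id]["status"] = status

board = TaskBoard(["t1"])
first = board.claim("t1", "agent_a")     # True: first claim wins
second = board.claim("t1", "agent_b")    # False: lease still held
```

The lease expiry is what converts a crashed agent's claim into a decaying mark: once the lease lapses, the task re-surfaces for any other agent without explicit failure notification.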

3.9.3 Advantages for Large-Scale Multi-Agent Systems#

  • Decoupling: agents need no knowledge of each other; coordination emerges from environment state.
  • Scalability: adding agents does not require updating routing tables or communication topologies.
  • Robustness: agent failure does not break coordination; unclaimed tasks remain on the board.
  • Emergent coordination: complex coordinated behavior arises from simple individual rules, without centralized planning.

3.9.4 Risks and Mitigations#

| Risk | Mitigation |
| --- | --- |
| Livelock (agents repeatedly pick up and drop tasks) | Lease-based claiming with exponential backoff; monotonic progress requirements. |
| Stale signals | Decay function with configurable $\lambda$; periodic garbage collection of resolved signals. |
| Duplicate work | Atomic claim operations (compare-and-swap); work unit idempotency. |
| Convergence | Termination detection: a cleanup agent monitors the task board and declares completion when all tasks are resolved and no new marks are being generated. |

3.10 Contract-Net Protocol and Auction-Based Task Allocation#

3.10.1 Contract-Net Protocol (CNP)#

Formal Definition#

The Contract-Net Protocol (Smith, 1980) is a negotiation-based task allocation mechanism for multi-agent systems:

$$\text{CNP} = \langle \text{Manager}, \{C_1, \ldots, C_n\}, \text{Announce}, \text{Bid}, \text{Award}, \text{Report} \rangle$$

| Phase | Description |
| --- | --- |
| Task Announcement | The manager broadcasts a task specification to all potential contractors, including task description, deadline, constraints, and evaluation criteria. |
| Bidding | Each contractor evaluates the task against its capabilities, current load, and cost model, and submits a bid (or declines). |
| Awarding | The manager evaluates bids using a scoring function and awards the contract to the best bidder. |
| Execution & Reporting | The winning contractor executes the task and reports results to the manager. |

Bid Evaluation Function#

The manager's bid evaluation function is:

$$\text{Score}(bid_j) = \alpha \cdot \text{capability}(j) + \beta \cdot \text{cost}^{-1}(j) + \gamma \cdot \text{load}^{-1}(j) + \delta \cdot \text{reliability}(j)$$

where the weights satisfy $\alpha + \beta + \gamma + \delta = 1$ and reflect the manager's priorities.
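A direct transcription of the scoring function, with hypothetical normalized inputs (capability and reliability in [0, 1]; cost and load strictly positive so the inverses are defined):

```python
def score_bid(bid, weights=(0.4, 0.2, 0.2, 0.2)):
    """Score = a*capability + b/cost + g/load + d*reliability, with
    a + b + g + d = 1 encoding the manager's priorities."""
    alpha, beta, gamma, delta = weights
    assert abs(sum(weights) - 1.0) < 1e-9   # weights must sum to 1
    return (alpha * bid["capability"]
            + beta / bid["cost"]
            + gamma / bid["load"]
            + delta * bid["reliability"])

bids = [
    {"agent": "A", "capability": 0.9, "cost": 2.0, "load": 1.0, "reliability": 0.95},
    {"agent": "B", "capability": 0.6, "cost": 1.0, "load": 4.0, "reliability": 0.80},
]
winner = max(bids, key=score_bid)   # agent "A" under these example weights
```

Agent B is cheaper, but under a capability-heavy weighting the lightly loaded, more capable agent A scores higher (0.85 vs. 0.65) — tuning the weights is how the manager trades quality against cost and load.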

3.10.2 CNP Protocol Flow for LLM Agents#

ALGORITHM 3.10: Contract-Net-Agent-Allocation
──────────────────────────────────────────────
Participants: Manager M, Contractor agents C[] 
 
1.  // ANNOUNCE
2.  task_spec ← M.decompose_and_specify(user_request)
3.  FOR EACH c IN C:
4.      SEND(c, {type: "cfp", task: task_spec, deadline: T_d})  // call for proposals
5.
6.  // BID
7.  bids ← ∅
8.  WAIT_UNTIL(deadline_bid OR all_responded):
9.      FOR EACH response IN incoming:
10.         IF response.type = "propose" THEN
11.             bids ← bids ∪ {response}
12.         // "refuse" responses are discarded
13.
14. // AWARD
15. IF bids = ∅ THEN
16.     ESCALATE(task_spec, "no_capable_contractor")
17. ELSE
18.     ranked ← SORT(bids, BY score_function, DESCENDING)
19.     winner ← HEAD(ranked)
20.     SEND(winner.agent, {type: "accept"})
21.     FOR EACH loser IN TAIL(ranked):
22.         SEND(loser.agent, {type: "reject"})
23.
24. // EXECUTE & REPORT
25. result ← WAIT(winner.agent, timeout=T_exec)
26. IF result.status = "success" THEN
27.     RETURN result.payload
28. ELSE
29.     // Re-announce to remaining contractors or escalate
30.     RETRY with EXCLUDE(winner.agent)

3.10.3 Auction-Based Task Allocation#

For competitive multi-agent environments or resource-constrained settings, auction mechanisms generalize CNP:

| Auction Type | Mechanism | Agent Application |
| --- | --- | --- |
| First-price sealed-bid | Each agent submits one bid; highest bidder wins and pays its bid. | Simple task allocation where agents bid based on estimated capability. |
| Vickrey (second-price) | Highest bidder wins but pays the second-highest bid. Incentive-compatible (truthful bidding is dominant strategy). | Preferred when truthful capability reporting is important for system-level optimization. |
| Combinatorial | Agents bid on bundles of tasks; winner determination is an optimization problem. | Multi-task allocation where tasks have synergies (e.g., related code modules). |
| Dutch (descending) | Price starts high and decreases; first agent to accept wins. | Fast allocation under time pressure; agents with lowest cost thresholds win. |

Formal Auction for Task Allocation#

For $m$ tasks $\{T_1, \ldots, T_m\}$ and $n$ agents $\{A_1, \ldots, A_n\}$, the allocation problem is:

maxxiji=1nj=1mvijxij\max_{x_{ij}} \sum_{i=1}^{n} \sum_{j=1}^{m} v_{ij} \cdot x_{ij}

subject to:

i=1nxij=1j(each task assigned to exactly one agent)\sum_{i=1}^{n} x_{ij} = 1 \quad \forall j \quad \text{(each task assigned to exactly one agent)} j=1mxijkii(agent capacity constraint)\sum_{j=1}^{m} x_{ij} \leq k_i \quad \forall i \quad \text{(agent capacity constraint)} xij{0,1}x_{ij} \in \{0, 1\}

where vijv_{ij} is the value (expected quality) of agent ii performing task jj, and kik_i is agent ii's maximum concurrent task capacity.
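For small instances, this ILP can be solved by exhaustive enumeration; the sketch below is illustrative only, and a production system would hand the same formulation to an ILP solver (or, for the unit-capacity case, the Hungarian algorithm):

```python
from itertools import product

def allocate(values, capacity):
    """Exhaustively solve the task-allocation ILP for small instances.

    values[i][j] = v_ij, the value of agent i performing task j.
    capacity[i]  = k_i, agent i's maximum concurrent task capacity.
    Returns (best_value, assignment) where assignment[j] is the index
    of the agent that task j is assigned to.
    """
    n, m = len(values), len(values[0])
    best_val, best_assign = float("-inf"), None
    # "Each task assigned to exactly one agent": enumerate all n^m maps.
    for assign in product(range(n), repeat=m):
        # Enforce the capacity constraint: sum_j x_ij <= k_i for every i.
        if any(assign.count(i) > capacity[i] for i in range(n)):
            continue
        val = sum(values[i][j] for j, i in enumerate(assign))
        if val > best_val:
            best_val, best_assign = val, assign
    return best_val, best_assign
```

The enumeration is O(n^m), so this is a specification-by-execution of the objective and constraints, not an algorithm for real workloads.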

3.10.4 Production Considerations#

| Concern | Mitigation |
| --- | --- |
| Communication overhead | Use targeted announcement (only agents with matching capability schemas receive CFPs), not global broadcast. |
| Strategic misrepresentation | A Vickrey auction incentivizes truthful bidding; reputation systems penalize agents that consistently over-promise. |
| Latency | Bounded bid windows; timeout-based fallback to direct assignment if insufficient bids are received. |
| Failure after award | Contracts include SLAs; failure triggers re-announcement with the failed agent excluded. Compensation actions are specified in the contract. |
| Load balancing | Bids include current load as a factor; overloaded agents bid lower or decline. |

3.11 Formal Petri Net and State Machine Representations of Agent Lifecycles#

3.11.1 Finite State Machines for Agent Lifecycle#

Formal Definition#

An agent's lifecycle can be modeled as a Deterministic Finite Automaton (DFA):

$$\mathcal{M} = \langle Q, \Sigma, \delta, q_0, F \rangle$$

| Symbol | Definition |
| --- | --- |
| $Q$ | Finite set of agent states: $\{\text{Idle}, \text{Planning}, \text{Retrieving}, \text{Executing}, \text{Verifying}, \text{Repairing}, \text{Committing}, \text{Failed}, \text{Completed}\}$. |
| $\Sigma$ | Input alphabet: events that trigger state transitions (e.g., $\text{task\_received}$, $\text{plan\_ready}$, $\text{tool\_result}$, $\text{verify\_pass}$, $\text{verify\_fail}$, $\text{max\_retries}$). |
| $\delta$ | Transition function: $\delta : Q \times \Sigma \rightarrow Q$. |
| $q_0$ | Initial state: $\text{Idle}$. |
| $F$ | Accepting states: $\{\text{Completed}\}$. |

Agent Lifecycle State Machine#

       task_received            plan_ready           evidence_ready
Idle ───────────────→ Planning ──────────→ Retrieving ──────────→ Executing
                         ↑                                            │
                         │                    tool_result             │
                         │         ┌──────────────────────────────────┘
                         │         ↓
                         │     Verifying ←── repair_success ── Repairing
                         │      │      │                           ↑
                         │      │      ├── verify_fail ────────────┘
                         │      │      │   (repair_attempts < max_retries)
                         │ verify_pass │
                         │      │      └── verify_fail ───→ Failed
                         │      ↓       (repair_attempts ≥ max_retries)
                         │  Committing                         │
                         │      │ done                         │ retry
                         │      ↓                              │ (if retry_budget > 0)
                         │  Completed                          │
                         │                                     │
                         └─────────────────────────────────────┘

Transition Function (Excerpt)#

| Current State | Event | Next State | Guard Condition |
| --- | --- | --- | --- |
| Idle | task_received | Planning | authorization_check = pass |
| Planning | plan_ready | Retrieving | plan.subtask_count ≤ max_depth |
| Retrieving | evidence_ready | Executing | evidence.relevance ≥ threshold |
| Executing | tool_result | Verifying | (none) |
| Verifying | verify_pass | Committing | (none) |
| Verifying | verify_fail | Repairing | repair_attempts < max_retries |
| Verifying | verify_fail | Failed | repair_attempts ≥ max_retries |
| Repairing | repair_success | Verifying | (none) |
| Committing | done | Completed | (none) |
| Failed | retry | Planning | retry_budget > 0 |
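The guarded transition table can be implemented directly as data. A minimal sketch, assuming illustrative context keys such as `authorized` and `relevance` in place of real checks:

```python
# Guarded transition table for the lifecycle DFA. The two verify_fail
# rows have mutually exclusive guards, so delta remains a function.
TRANSITIONS = {
    ("Idle", "task_received"):        [("Planning",   lambda c: c["authorized"])],
    ("Planning", "plan_ready"):       [("Retrieving", lambda c: c["subtasks"] <= c["max_depth"])],
    ("Retrieving", "evidence_ready"): [("Executing",  lambda c: c["relevance"] >= c["threshold"])],
    ("Executing", "tool_result"):     [("Verifying",  lambda c: True)],
    ("Verifying", "verify_pass"):     [("Committing", lambda c: True)],
    ("Verifying", "verify_fail"):     [("Repairing",  lambda c: c["repairs"] < c["max_retries"]),
                                       ("Failed",     lambda c: c["repairs"] >= c["max_retries"])],
    ("Repairing", "repair_success"):  [("Verifying",  lambda c: True)],
    ("Committing", "done"):           [("Completed",  lambda c: True)],
    ("Failed", "retry"):              [("Planning",   lambda c: c["retry_budget"] > 0)],
}

def step(state, event, ctx):
    """Fire one event; illegal transitions raise instead of silently proceeding."""
    for nxt, guard in TRANSITIONS.get((state, event), []):
        if guard(ctx):
            return nxt
    raise ValueError(f"illegal transition: {state} --{event}-->")
```

Making the table explicit data, rather than scattered `if` statements, is what enables the invariant enforcement and coverage analysis discussed in §3.11.4.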

3.11.2 Petri Nets for Concurrent Agent Modeling#

Formal Definition#

A Petri Net is a tuple:

$$\mathcal{N} = \langle P, T, F, W, M_0 \rangle$$

| Symbol | Definition |
| --- | --- |
| $P$ | Finite set of places (representing conditions or states). |
| $T$ | Finite set of transitions (representing events or actions). |
| $F \subseteq (P \times T) \cup (T \times P)$ | Flow relation: directed arcs between places and transitions. |
| $W : F \rightarrow \mathbb{N}^+$ | Weight function: arc weights (default 1). |
| $M_0 : P \rightarrow \mathbb{N}$ | Initial marking: number of tokens in each place at start. |

A transition $t$ is enabled at marking $M$ if:

$$\forall p \in \bullet t : M(p) \geq W(p, t)$$

where $\bullet t$ denotes the input places of $t$. Firing $t$ produces a new marking:

$$M'(p) = M(p) - W(p, t) + W(t, p) \quad \forall p \in P$$

Why Petri Nets for Agentic Systems#

Petri nets are strictly more expressive than finite state machines for modeling agentic systems because they naturally represent:

  1. Concurrency: multiple tokens can be in different places simultaneously, modeling parallel agent execution.
  2. Synchronization: transitions with multiple input places model synchronization barriers (e.g., "wait for all sub-agent results before synthesis").
  3. Resource constraints: token counts model resource availability (e.g., available tool invocation budget, concurrent request capacity).
  4. Conflict/Choice: when multiple transitions are enabled by the same tokens, this models nondeterministic choice (e.g., tool selection).

Agent Execution Petri Net#

Places:
  p_idle, p_planning, p_plan_ready, p_retrieving, p_evidence_ready,
  p_executing, p_tool_result, p_verifying, p_verified, p_failed,
  p_repairing, p_committing, p_completed,
  p_token_budget (resource place, initial tokens = max_budget),
  p_retry_budget (resource place, initial tokens = max_retries)
 
Transitions:
  t_receive_task: {p_idle} → {p_planning}
  t_plan: {p_planning} → {p_plan_ready}
  t_retrieve: {p_plan_ready} → {p_retrieving}
  t_evidence: {p_retrieving} → {p_evidence_ready}
  t_execute: {p_evidence_ready, p_token_budget} → {p_executing}  
      // consumes a token budget unit
  t_result: {p_executing} → {p_tool_result}
  t_verify: {p_tool_result} → {p_verifying}
  t_pass: {p_verifying} → {p_verified}
  t_fail: {p_verifying, p_retry_budget} → {p_repairing}  
      // consumes a retry token
  t_repair: {p_repairing} → {p_tool_result}    // loops back to verify
  t_exhaust: {p_verifying} → {p_failed}  
      // fires when p_retry_budget is empty (no retry tokens); this zero-test
      // requires an inhibitor arc from p_retry_budget, since plain Petri nets
      // cannot test a place for emptiness
  t_commit: {p_verified} → {p_committing}
  t_done: {p_committing} → {p_completed}
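The enabling and firing rules translate directly into code. A minimal sketch over the net above, representing markings as token multisets and each transition as its pre- and post-sets with arc weights:

```python
from collections import Counter

def enabled(marking, pre):
    """t is enabled iff, for every input place p: M(p) >= W(p, t)."""
    return all(marking[p] >= w for p, w in pre.items())

def fire(marking, pre, post):
    """Return M'(p) = M(p) - W(p, t) + W(t, p); reject disabled firings."""
    if not enabled(marking, pre):
        raise ValueError("transition not enabled")
    m = Counter(marking)
    for p, w in pre.items():
        m[p] -= w
    for p, w in post.items():
        m[p] += w
    return m

# t_execute from the net above: consumes evidence plus one budget token.
m0 = Counter({"p_evidence_ready": 1, "p_token_budget": 3})
m1 = fire(m0,
          pre={"p_evidence_ready": 1, "p_token_budget": 1},
          post={"p_executing": 1})
```

Because `Counter` returns 0 for absent places, the same two functions cover every transition in the net, including the resource places.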

Formal Analysis Properties#

Petri nets enable formal verification of agent lifecycle properties:

| Property | Definition | Agent Interpretation |
| --- | --- | --- |
| Reachability | Can marking $M$ be reached from $M_0$? | Can the agent reach the Completed state from Idle for a given task? |
| Boundedness | Is the number of tokens in every place bounded? | Does the agent stay within resource limits (token budget, retry budget, concurrent tasks)? |
| Liveness | Can every transition eventually fire from every reachable marking? | Is every agent capability eventually usable? Are there dead-end states? |
| Deadlock freedom | Is there always at least one enabled transition? | Can the agent get stuck in a state with no possible progress? |
| Fairness | Does every continuously enabled transition eventually fire? | Are all agent paths eventually explored (no starvation)? |

3.11.3 Colored Petri Nets for Multi-Agent Modeling#

Colored Petri Nets (CPN) extend standard Petri nets with typed tokens, enabling modeling of multiple agents, task types, and data flows in a single net:

$$\mathcal{N}_{CPN} = \langle P, T, F, \Sigma, C, G, E, M_0 \rangle$$

where $\Sigma$ is a set of color sets (types), $C : P \rightarrow \Sigma$ assigns a type to each place, $G : T \rightarrow \text{Guard}$ assigns guard conditions to transitions, and $E : F \rightarrow \text{ArcExpression}$ assigns typed arc expressions.

In multi-agent systems, token colors can encode:

  • Agent identity: which agent is in which state.
  • Task parameters: task type, priority, deadline.
  • Resource type: GPU budget, API rate limit, token budget.

This enables modeling and verification of complex multi-agent orchestration with heterogeneous agents and resources.

3.11.4 Production Application: Agent Lifecycle as Verifiable State Machine#

Every production agent orchestrator should implement its agent lifecycle as an explicit, verifiable state machine (or Petri net), not as implicit control flow in code. Benefits:

  1. Invariant enforcement: the state machine definition mechanically prevents illegal transitions (e.g., executing without planning, committing without verifying).
  2. Observability: current state is always inspectable; state transitions are logged with timestamps and payloads.
  3. Recovery: after crash, the agent resumes from its last committed state, not from the beginning.
  4. Testing: state machine coverage analysis ensures all states and transitions are exercised in testing.
  5. Formal verification: for critical applications, model checking can verify deadlock freedom, boundedness, and liveness.

3.12 Category-Theoretic Composition of Agent Pipelines#

3.12.1 Motivation: Compositionality as Engineering Discipline#

Agent systems are assembled from components: retrievers, planners, executors, verifiers, memory managers, synthesizers. The central question is: when can components be composed, and does the composite inherit the properties of its parts?

Category theory provides the most general framework for answering this question. It abstracts away implementation details and focuses on composition structure: what composes with what, and what properties are preserved.

3.12.2 Category-Theoretic Foundations#

Categories#

A category C\mathcal{C} consists of:

  • Objects: Ob(C)\text{Ob}(\mathcal{C}) — e.g., types, state spaces, data schemas.
  • Morphisms: for each pair of objects A,BA, B, a set Hom(A,B)\text{Hom}(A, B) of arrows from AA to BB — e.g., typed functions, agent transformations, pipeline stages.
  • Composition: for morphisms f:ABf : A \rightarrow B and g:BCg : B \rightarrow C, a composite gf:ACg \circ f : A \rightarrow C.
  • Identity: for each object AA, an identity morphism idA:AA\text{id}_A : A \rightarrow A.
  • Associativity: h(gf)=(hg)fh \circ (g \circ f) = (h \circ g) \circ f.
  • Identity law: fidA=f=idBff \circ \text{id}_A = f = \text{id}_B \circ f.

The Category of Agent Components#

Define CAgent\mathcal{C}_{\text{Agent}} with:

  • Objects: typed data schemas (e.g., $\text{UserQuery}$, $\text{RetrievalResult}$, $\text{Plan}$, $\text{ToolOutput}$, $\text{VerifiedResult}$, $\text{Response}$).
  • Morphisms: agent components as typed transformations:
    • $\text{retrieve} : \text{UserQuery} \rightarrow \text{RetrievalResult}$
    • $\text{plan} : \text{RetrievalResult} \rightarrow \text{Plan}$
    • $\text{execute} : \text{Plan} \rightarrow \text{ToolOutput}$
    • $\text{verify} : \text{ToolOutput} \rightarrow \text{VerifiedResult}$
    • $\text{synthesize} : \text{VerifiedResult} \rightarrow \text{Response}$

The pipeline is the composite morphism:

$$\text{pipeline} = \text{synthesize} \circ \text{verify} \circ \text{execute} \circ \text{plan} \circ \text{retrieve} : \text{UserQuery} \rightarrow \text{Response}$$

Associativity guarantees that we can group and refactor pipeline stages without changing semantics:

$$(\text{synthesize} \circ \text{verify}) \circ (\text{execute} \circ \text{plan} \circ \text{retrieve}) = \text{synthesize} \circ (\text{verify} \circ \text{execute}) \circ (\text{plan} \circ \text{retrieve})$$
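The category $\mathcal{C}_{\text{Agent}}$ can be sketched concretely: objects become schema names, morphisms become records carrying their domain and codomain, and composition type-checks at construction time. The stage bodies below are hypothetical stand-ins:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Morphism:
    """An agent component as a typed arrow dom -> cod over schema names."""
    dom: str
    cod: str
    fn: Callable

def compose(g: Morphism, f: Morphism) -> Morphism:
    """g after f, defined only when cod(f) = dom(g): a construction-time type check."""
    if f.cod != g.dom:
        raise TypeError(f"cannot compose: {f.cod} != {g.dom}")
    return Morphism(f.dom, g.cod, lambda x: g.fn(f.fn(x)))

def identity(obj: str) -> Morphism:
    """id_A : A -> A; composing with it changes nothing (identity law)."""
    return Morphism(obj, obj, lambda x: x)

# Hypothetical stages with string-named schemas as objects.
retrieve = Morphism("UserQuery", "RetrievalResult", lambda q: [q, "doc1"])
plan     = Morphism("RetrievalResult", "Plan", lambda ev: {"steps": ev})
pipeline = compose(plan, retrieve)   # UserQuery -> Plan
```

Attempting `compose(retrieve, plan)` fails at construction time, which is exactly the "detected at design time" guarantee of §3.12.8.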

3.12.3 Functors: Structure-Preserving Mappings Between Agent Categories#

A functor $F : \mathcal{C} \rightarrow \mathcal{D}$ maps objects and morphisms from one category to another while preserving composition and identity:

$$F(g \circ f) = F(g) \circ F(f) \qquad F(\text{id}_A) = \text{id}_{F(A)}$$

Application: Testing Functor#

Define a testing functor $T : \mathcal{C}_{\text{Prod}} \rightarrow \mathcal{C}_{\text{Test}}$ that maps production components to their test doubles:

  • $T(\text{retrieve}_{\text{prod}}) = \text{retrieve}_{\text{mock}}$ (returns canned results)
  • $T(\text{execute}_{\text{prod}}) = \text{execute}_{\text{stub}}$ (simulates tool calls)

The functor property guarantees that if the production pipeline composes correctly, the test pipeline composes correctly with the same structure. This is the formal basis for compositional testing of agent pipelines.

Application: Observability Functor#

Define an observability functor $O : \mathcal{C}_{\text{Agent}} \rightarrow \mathcal{C}_{\text{Traced}}$ that wraps each morphism with tracing:

$$O(f : A \rightarrow B) = \text{trace}(f) : A \rightarrow (B, \text{TraceSpan})$$

The functor preserves composition: tracing the composite pipeline is equivalent to composing the individually traced stages.
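A minimal sketch of the observability functor, under the assumption that traced stages thread a span list alongside the value; the span dicts are illustrative, standing in for a real trace exporter:

```python
def trace(name, fn):
    """Map a stage f : A -> B to O(f) : (A, spans) -> (B, spans + [span]).
    Traced stages compose by ordinary function composition, so tracing
    the composite pipeline equals composing the traced stages."""
    def wrapped(state):
        value, spans = state
        return fn(value), spans + [{"stage": name}]
    return wrapped

# Hypothetical stages lifted into the traced category.
retrieve_t = trace("retrieve", lambda q: [q.upper()])
plan_t     = trace("plan", lambda docs: {"steps": docs})

result, spans = plan_t(retrieve_t(("find docs", [])))
```

Because the wrapper only appends to the span list, the order of spans in the output is exactly the composition order of the pipeline.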

3.12.4 Monoidal Categories: Parallel Composition#

A monoidal category $(\mathcal{C}, \otimes, I)$ adds a tensor product $\otimes$ that models parallel composition:

  • $f : A \rightarrow B$ and $g : C \rightarrow D$ compose in parallel as $f \otimes g : A \otimes C \rightarrow B \otimes D$.
  • $I$ is the unit object (the "empty" type; doing nothing in parallel changes nothing).

Application: Parallel Agent Execution#

Two independent agent tasks $f$ and $g$ can be executed in parallel:

$$f \otimes g : \text{Task}_1 \otimes \text{Task}_2 \rightarrow \text{Result}_1 \otimes \text{Result}_2$$

The monoidal structure guarantees that:

  1. Independence: $f$ and $g$ do not interact (no shared state, isolated workspaces).
  2. Determinism: the result of $f \otimes g$ is the pair of results, regardless of execution order.
  3. Composability: $(f \otimes g) \circ (h \otimes k)$ is well-defined when types align.

This provides the formal justification for a standard rule of parallel agent orchestration: parallel agents should operate on independently claimable work units.

3.12.5 Natural Transformations: Component Substitution#

A natural transformation $\eta : F \Rightarrow G$ between functors $F, G : \mathcal{C} \rightarrow \mathcal{D}$ is a family of morphisms $\{\eta_A : F(A) \rightarrow G(A)\}_{A \in \mathcal{C}}$ such that for every $f : A \rightarrow B$:

$$\eta_B \circ F(f) = G(f) \circ \eta_A$$

This naturality square commutes:

$$\begin{array}{ccc} F(A) & \xrightarrow{F(f)} & F(B) \\ \downarrow{\eta_A} & & \downarrow{\eta_B} \\ G(A) & \xrightarrow{G(f)} & G(B) \end{array}$$

Application: Retriever Substitution#

If $F$ and $G$ are two retrieval strategies (e.g., semantic search and hybrid search), a natural transformation $\eta$ provides a systematic migration path: for every query type, there exists a transformation from $F$'s result to $G$'s result that commutes with all downstream pipeline stages. This guarantees that replacing a retriever does not break the pipeline's composition structure.

3.12.6 Monads: Effectful Agent Computation#

A monad $(M, \eta, \mu)$ on a category $\mathcal{C}$ consists of:

  • An endofunctor $M : \mathcal{C} \rightarrow \mathcal{C}$.
  • A unit (return) $\eta_A : A \rightarrow M(A)$.
  • A multiplication (join/flatten) $\mu_A : M(M(A)) \rightarrow M(A)$.

satisfying associativity and unit laws.

Application: Error Handling as Monad#

Define $M(A) = \text{Result}\langle A, \text{Error}\rangle$. Every agent component returns $M(A)$ instead of $A$, explicitly propagating errors through the pipeline:

$$\text{retrieve} : \text{Query} \rightarrow M(\text{Evidence}) \qquad \text{plan} : \text{Evidence} \rightarrow M(\text{Plan})$$

Monadic composition (Kleisli composition) $g \circ_K f$ automatically handles error short-circuiting:

$$(g \circ_K f)(x) = \begin{cases} g(y) & \text{if } f(x) = \text{Ok}(y) \\ \text{Err}(e) & \text{if } f(x) = \text{Err}(e) \end{cases}$$

This provides the formal basis for typed error propagation in agent pipelines, ensuring that failures are handled uniformly and cannot be silently dropped.
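The Result monad and its Kleisli composition fit in a few lines. A minimal sketch; the `retrieve`/`plan` stage bodies are hypothetical:

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class Ok:
    value: Any

@dataclass(frozen=True)
class Err:
    error: str

def kleisli(g: Callable, f: Callable) -> Callable:
    """(g o_K f)(x): run f; pass Ok payloads to g, short-circuit on Err."""
    def composed(x):
        r = f(x)
        return g(r.value) if isinstance(r, Ok) else r
    return composed

# Hypothetical stages returning M(A) = Result<A, Error>.
retrieve = lambda q: Ok([q, "doc"]) if q else Err("empty_query")
plan     = lambda ev: Ok({"steps": ev})
pipeline = kleisli(plan, retrieve)   # Query -> M(Plan)
```

An `Err` produced by any stage flows to the end of the pipeline unchanged, so failures cannot be silently dropped between stages.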

Application: Retrieval Context as Monad#

Define $M(A) = \text{Contextualized}\langle A, \text{Provenance}, \text{Score}\rangle$. Every retrieval result carries provenance and relevance metadata. Monadic composition ensures that provenance is preserved through all pipeline stages:

$$\mu : M(M(A)) \rightarrow M(A)$$

flattens nested contextualization (e.g., retrieval of a retrieval result) into a single provenance chain.

3.12.7 String Diagrams: Visual Pipeline Composition#

String diagrams provide a graphical syntax for monoidal categories that makes pipeline composition visually explicit:

Sequential:    A ─── f ─── B ─── g ─── C

Parallel:      A ─── f ─── B
               C ─── g ─── D

Feedback:      A ─── f ──┬── B
                         └── h ──→ A   (feedback to A via h)

String diagrams are sound and complete for monoidal categories: every valid diagram corresponds to a well-typed pipeline, and every well-typed pipeline has a diagram. This provides a formal specification language for agent pipeline architecture.

3.12.8 Composition Properties and Invariant Preservation#

The category-theoretic framework guarantees the following properties when agent pipelines are composed through categorical constructs:

| Property | Categorical Guarantee | Agent Implication |
| --- | --- | --- |
| Type safety | Morphism composition requires matching domain/codomain. | Pipeline stages must have compatible input/output schemas. Detected at design time. |
| Associativity | $(h \circ g) \circ f = h \circ (g \circ f)$. | Pipeline refactoring (re-grouping stages) does not change semantics. |
| Identity | $f \circ \text{id} = f = \text{id} \circ f$. | Inserting a no-op stage does not change behavior; useful for observability wrappers. |
| Functorial testing | $T(g \circ f) = T(g) \circ T(f)$. | Testing the composed pipeline is equivalent to composing tested components. |
| Parallel independence | $f \otimes g$ guarantees no interaction. | Formally verifies that parallel agents do not share mutable state. |
| Error propagation | Monadic composition short-circuits on error. | Failures propagate uniformly; no silent error swallowing. |

3.12.9 Practical Typed Interface Design from Category Theory#

Category theory motivates the following production design principles:

  1. Every agent component is a typed morphism: input schema → output schema, with explicit error types.
  2. Pipelines are compositions: validated at construction time by type checking, not at runtime by exception handling.
  3. Parallel execution requires monoidal independence: components in a parallel group must have no shared mutable state. Verify mechanically.
  4. Component substitution requires natural transformation: replacing a retriever, planner, or verifier requires demonstrating that the new component satisfies the same naturality conditions (same interface contract, commuting with downstream stages).
  5. Error handling is monadic: every component wraps its output in a Result monad; the pipeline framework provides Kleisli composition for automatic error propagation.
  6. Observability is functorial: tracing, logging, and metrics are applied as functors that preserve pipeline structure.

Chapter Summary and Architectural Selection Guide#

Synthesis: Architecture Selection Matrix#

| Architecture | Primary Strength | Best Application in Agentic AI | Composition Pattern |
| --- | --- | --- | --- |
| BDI (§3.1) | Commitment and practical reasoning | Single-agent goal pursuit with plan stability | Core agent loop structure |
| SOAR (§3.2.1) | Impasse-driven learning | Escalation and workflow compilation | Memory and learning subsystem |
| ACT-R (§3.2.2) | Activation-based retrieval | Retrieval scoring and tool utility learning | Retrieval ranking function |
| GWT (§3.2.3) | Attention and broadcast | Context window engineering | Context admission control |
| Reactive (§3.3.1) | Minimal latency | Safety guardrails, input filters | Subsumption layer 0 |
| Deliberative (§3.3.2) | Plan optimality | Complex multi-step task planning | HTN decomposition |
| Hybrid (§3.3.3) | Balanced capability | Production agent stacks | Layered architecture |
| HTN (§3.4) | Hierarchical decomposition | Task breakdown and workflow templates | Plan library |
| OODA (§3.5) | Execution tempo | Real-time and interactive agent loops | Outer execution loop |
| Blackboard (§3.6) | Multi-source integration | Heterogeneous retrieval aggregation | Retrieval orchestration |
| Subsumption (§3.7) | Priority arbitration | Safety and behavior layering | Control hierarchy |
| Actor/CSP (§3.8) | Concurrent execution | Multi-agent orchestration | Communication substrate |
| Stigmergy (§3.9) | Decoupled coordination | Large-scale multi-agent collaboration | Coordination protocol |
| Contract-Net (§3.10) | Negotiated allocation | Dynamic task distribution | Task allocation protocol |
| Petri Nets/FSM (§3.11) | Formal verification | Agent lifecycle management | State management |
| Category Theory (§3.12) | Compositional guarantees | Pipeline design and refactoring | Composition framework |

The Production Agentic Stack as Architectural Composition#

A production-grade agentic system does not use one architecture; it composes multiple architectures at different layers:

$$\underbrace{\text{Category-theoretic typed pipelines}}_{\text{Composition framework}} \circ \underbrace{\text{Petri net lifecycle}}_{\text{State management}} \circ \underbrace{\text{Actor-based concurrency}}_{\text{Execution substrate}} \circ \underbrace{\text{OODA outer loop}}_{\text{Tempo control}} \circ \underbrace{\text{Hybrid layered control}}_{\text{Priority arbitration}} \circ \underbrace{\text{BDI core reasoning}}_{\text{Commitment engine}} \circ \underbrace{\text{HTN decomposition}}_{\text{Planning}} \circ \underbrace{\text{Blackboard retrieval}}_{\text{Knowledge integration}}$$

Each architecture provides a specific structural guarantee. Together, they yield a system that is formally grounded, concurrency-safe, verifiable, observable, and compositionally maintainable.

The discipline of selecting, combining, and formally justifying these architectural choices—rather than improvising control flow around prompt templates—is what separates engineering from experimentation in the construction of agentic AI systems.


End of Chapter 3.