Operational Embedded Agency Theory

Editorial Note: This document is a consolidated version organized through sequential dialogue relays with multiple AIs. For the generation process, scope of involvement, and limits of mathematical guarantees, see the top page (index.html). The relay logs and code/figures are in the appendix (appendix.html).

Abstract

This paper formulates "self" and "observer" not as substance, soul, or qualia, but as operationally and physically closed structures. Specifically, by stacking the access boundary via von Neumann algebra, the Petz recovery map, the reparameterization-invariant counterfactual divergence (RICD), and self-measurement back-action, the following hierarchy is constructed: Adaptive Controller (Level 1) → Observer-Agent (Level 2) → Perspectival Observer (Level 3). Furthermore, a Selection Principle is presented showing that Δ_cf-sensitive updating becomes an evolutionarily stable attractor, supported by numerical simulation in a classical surrogate environment. The hard problem of phenomenal consciousness is not solved; instead, the theory locates where that cliff lies relative to the algebraic structure.

§1 Starting Point: From Observer as "Point" to Structure

Starting from the naive picture that "an observer is a point somewhere in the universe" leads to inconsistencies with quantum information, AdS/CFT, and quantum gravity alike. Instead, this theory defines the observer as a local structure with recoverability. The following three stages form the backbone of the theory.

1.1 Holographic QEC and Recoverability

In holographic quantum error correction represented by the HaPPY code (Pastawski et al. 2015), bulk logical information is redundantly encoded across multiple boundary regions. The recent Evenbly codes (Steinberg et al., Quantum 9, 1826, 2025) realize a new class of hyperinvariant holographic codes using non-perfect tensors, with a threshold of approximately 19.1% against depolarizing noise.

Key Insight: Information is not localized at a point. It is encoded so that it can be stably recovered from multiple boundary regions. "Which region can know what" emerges as the geometry of the bulk.

1.2 Entanglement Wedge and Island Formula

Through entanglement wedge reconstruction, the bulk region (entanglement wedge) that can be reconstructed from a boundary sub-region $R$ is determined. The island formula further shows that even the "inside/outside" of the boundary is dynamically determined as a quantum extremal surface. The structure whereby the optimal surface of reconstruction is determined variationally rather than statically becomes the prototype of the observer's "boundary" concept.

1.3 Quantum Reference Frames (QRF)

In the perspective-neutral framework by Vanrietvelde et al. (Quantum 4, 225, 2020), all physical quantities are relational, and the specific viewpoint of "I" arises from a choice of gauge-fixing.

Consequence: The claim is not "I see the world" but "from the redundant description of the universe's information, the gauge-fixing corresponding to this perspective is selected." The observer is fixed not as an external coordinate but as a gauge choice within the quantum system.

Stacking these three structures — holographic QEC / entanglement wedge / QRF — the skeleton of the observer condenses into the following single sentence:

An observer is not a point within the world, but the structure by which the world locally, stably, and reference-frame-dependently reconstructs itself.

§2 Operational Definition of Observer

2.1 Access Boundary via von Neumann Algebra

For a region $R$, the closure of all observables accessible there naturally forms a von Neumann algebra $\mathcal{A}(R) \subset \mathcal{B}(\mathcal{H})$.

Definition: Operational Observer $$O_R := \mathcal{A}(R)$$ This replaces the question "What is an observer?" with the operational question "What can be measured from this region?"

Haag duality, $\mathcal{A}(R)' = \mathcal{A}(\bar{R})$, algebraically separates the accessible quantities from those on the environment side. This is an operational access boundary, not a semantic "self/environment" boundary; the semantic boundary additionally requires $C_O$ and exploitability.

2.2 Petz Recovery Map and Adaptive Decoder

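The Petz recovery map for a channel $\Lambda$ and reference state $\rho$ is tied to relative entropy: it reverses $\Lambda$ exactly on those states whose relative entropy to $\rho$ is not decreased by $\Lambda$: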

$$\mathcal{R}_{\rho,\Lambda}(\cdot) = \rho^{1/2}\Lambda^\dagger\!\bigl(\Lambda(\rho)^{-1/2}(\cdot)\Lambda(\rho)^{-1/2}\bigr)\rho^{1/2}$$
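As a minimal numerical sketch of the formula above (not the code from the appendix), the following NumPy snippet builds the Petz map from a Kraus representation and checks the standard property that it returns the reference state exactly, $\mathcal{R}_{\rho,\Lambda}(\Lambda(\rho)) = \rho$. The amplitude-damping channel, the damping strength `g`, the example state `rho`, and all function names are illustrative assumptions; the sketch also assumes $\rho$ and $\Lambda(\rho)$ are full rank so the inverse square root exists.

```python
import numpy as np

def psd_power(a, p):
    """Real power of a Hermitian positive-definite matrix via eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * w**p) @ v.conj().T

def channel(kraus, x):
    """Apply the channel: Lambda(x) = sum_k K x K^dagger."""
    return sum(k @ x @ k.conj().T for k in kraus)

def adjoint_channel(kraus, x):
    """Apply the adjoint (Heisenberg-picture) map: sum_k K^dagger x K."""
    return sum(k.conj().T @ x @ k for k in kraus)

def petz(kraus, rho, x):
    """Petz recovery map for the channel `kraus` and reference state `rho`."""
    sigma = channel(kraus, rho)              # Lambda(rho), assumed full rank
    s_inv_half = psd_power(sigma, -0.5)
    r_half = psd_power(rho, 0.5)
    inner = adjoint_channel(kraus, s_inv_half @ x @ s_inv_half)
    return r_half @ inner @ r_half

# Illustrative single-qubit amplitude-damping channel (assumed example).
g = 0.3
K0 = np.array([[1, 0], [0, np.sqrt(1 - g)]], dtype=complex)
K1 = np.array([[0, np.sqrt(g)], [0, 0]], dtype=complex)
kraus = [K0, K1]

# Illustrative full-rank reference state.
rho = np.array([[0.6, 0.2], [0.2, 0.4]], dtype=complex)

# The reference state is recovered exactly.
recovered = petz(kraus, rho, channel(kraus, rho))
```

Other input states are generally recovered only approximately; how well depends on how much relative entropy to `rho` the channel destroys.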

The variational free energy of active inference also contains relative entropy:

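$$\mathcal{F}[q] = D_{\mathrm{KL}}\bigl(q(z) \,\|\, p(z \mid o)\bigr) - \ln p(o)$$

where $q$ is the approximate posterior over latent states $z$ and $p(z \mid o)$ is the exact posterior given the observation $o$.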

Both rest on a common objective: the minimization of relative entropy. The bridge between QEC (preservation) and predictive processing (updating) is the correspondence: Petz recovery map ≈ optimal adaptive decoder ≈ relative entropy minimization.
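The role of relative entropy in the variational free energy can be checked on a small discrete model. The sketch below uses the standard identity $\mathbb{E}_q[\ln q(z) - \ln p(o,z)] = D_{\mathrm{KL}}(q(z)\,\|\,p(z \mid o)) - \ln p(o)$; the two-state prior, the likelihood values, and the variable names are hypothetical and chosen only for illustration.

```python
import numpy as np

def kl(p, q):
    """KL divergence D(p || q) in nats for discrete distributions with full support."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * (np.log(p) - np.log(q))))

def free_energy(q, joint_o):
    """Variational free energy E_q[ln q(z) - ln p(o, z)] for one fixed observation o."""
    q = np.asarray(q, dtype=float)
    return float(np.sum(q * (np.log(q) - np.log(joint_o))))

# Hypothetical two-state generative model.
prior = np.array([0.7, 0.3])            # p(z)
likelihood_o = np.array([0.2, 0.9])     # p(o | z) for the observed o
joint_o = prior * likelihood_o          # p(o, z)
evidence = joint_o.sum()                # p(o)
posterior = joint_o / evidence          # p(z | o)

q = np.array([0.5, 0.5])                # arbitrary approximate posterior
F = free_energy(q, joint_o)

# Same quantity written as "relative entropy minus log evidence".
decomposed = kl(q, posterior) - np.log(evidence)

# Minimizing over q drives the KL term to zero, leaving -ln p(o).
F_min = free_energy(posterior, joint_o)
```

Since $-\ln p(o)$ does not depend on $q$, minimizing $\mathcal{F}$ over $q$ is the same as minimizing the relative entropy to the exact posterior.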

Note: QEC "preserves" the code subspace, while PP (predictive processing) "updates" the belief distribution, so the directions of the objective function differ. The bridging concept is adaptive decoding: QEC that updates the recovery map while estimating the noise model.

2.3 Mini-Model (4-qubit)

The boundary consists of four physical qubits $b_1, b_2, b_3, b_4$ (2 self-qubits + 2 environment-qubits); the self-model $M$ and the environment model $E$ are defined as bulk logical information.

Recovery Map

$$D_R : \mathcal{H}_R \to \mathcal{H}_{\mathrm{logical}}$$

Observer Patch Definition

$$O = (R_O,\; D_O,\; S_O,\; E_O,\; \varepsilon)$$ $$D_O(R_O) \approx S_O \otimes E_O \quad (\text{fidelity } 1-\varepsilon)$$

Gauge-fixing (fixing gauge qubits in the subsystem code) changes the resolution and stability of the reconstruction region, which structurally corresponds to perspective selection in QRF.

§3 Operational Definition of Self

3.1 Identity Condition: Internal Use of Δ_cf

The strong claim that "the self-model is a logical qubit" is merely a metaphor. Instead, self is defined as a generative latent state:

$$M(t) := \text{action-conditioned predictive sufficient statistic}$$

The central quantity of the observer-agent is the counterfactual future distinguishability:

Definition: Δ_cf $$\Delta_{\mathrm{cf}}(M_t) = \sum_{a < a'} \pi(a|M_t)\,\pi(a'|M_t)\; D\!\bigl(\rho^{(a)}_{t+1} \,\|\, \rho^{(a')}_{t+1}\bigr)$$ This is the policy-weighted sum of pairwise divergences between the future distributions produced by actions $a$ and $a'$ under the current policy.
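For classical (diagonal) future distributions the definition reduces to a short computation. The following sketch assumes the divergence $D$ is the KL divergence and that the futures are discrete distributions with full support; the function names and the example numbers are illustrative, not taken from the 4-qubit experiment.

```python
import numpy as np
from itertools import combinations

def kl(p, q):
    """KL divergence D(p || q) in nats for discrete distributions with full support."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * (np.log(p) - np.log(q))))

def delta_cf(policy, futures, divergence=kl):
    """Sum over action pairs a < a' of pi(a) * pi(a') * D(rho^(a) || rho^(a')).

    policy  : probabilities pi(a | M_t), one per action
    futures : futures[a] is the next-step distribution under action a
    """
    total = 0.0
    for a, b in combinations(range(len(policy)), 2):
        total += policy[a] * policy[b] * divergence(futures[a], futures[b])
    return total

# Two actions that push the future in opposite directions.
value = delta_cf([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]])

# Two actions with identical consequences carry no counterfactual information.
zero = delta_cf([0.5, 0.5], [[0.6, 0.4], [0.6, 0.4]])
```

Because the sum runs over ordered pairs $a < a'$ and KL is not symmetric, the value as written depends on action labelling; this is one of the dependencies that §5.1 removes.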

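Identity Condition: A necessary and sufficient condition (taken as the operational definition) for a system to be an observer-agent is $$\frac{\partial M_{t+1}}{\partial D_{\mathrm{ch}}(a,a'|M_t)} \neq 0$$ where $D_{\mathrm{ch}}(a,a'|M_t)$ is read here as the pairwise divergence $D(\rho^{(a)}_{t+1} \,\|\, \rho^{(a')}_{t+1})$ that enters $\Delta_{\mathrm{cf}}$. That is, the distinguishability of intervention-conditioned future distributions is internally utilized in the latent state update.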

For a Controller (Level 1), $\partial M_{t+1} / \partial \Delta_{\mathrm{cf}} = 0$; for an Observer-Agent (Level 2), it is $\neq 0$. In the 4-qubit experiment, the numerically observed correlation is corr(Δ_cf, gain) ≈ 0.987.
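The distinction between the two levels can be illustrated with a toy scalar update. This is a sketch under stated assumptions, not the update rule used in the experiment: the latent state is a single number, the update is a gain-weighted step toward the observation, and the gain is a clipped linear function `g0 + g1 * delta_cf`, where `g1` plays the role of the Δ_cf sensitivity and `g0` is a hypothetical baseline gain.

```python
def update(m, obs, delta_cf, g0, g1):
    """One latent update step M_{t+1} = M_t + gain * (obs - M_t).

    The gain g0 + g1 * delta_cf is clipped to [0, 1]; g1 = 0 gives a fixed-gain controller.
    """
    gain = min(1.0, max(0.0, g0 + g1 * delta_cf))
    return m + gain * (obs - m)

def sensitivity(m, obs, delta_cf, g0, g1, h=1e-6):
    """Central finite-difference estimate of dM_{t+1} / dDelta_cf."""
    up = update(m, obs, delta_cf + h, g0, g1)
    down = update(m, obs, delta_cf - h, g0, g1)
    return (up - down) / (2.0 * h)

# Level 1: fixed gain, the update ignores Delta_cf entirely.
controller = sensitivity(m=0.0, obs=1.0, delta_cf=0.4, g0=0.3, g1=0.0)

# Level 2: the gain depends on Delta_cf, so the derivative is nonzero.
agent = sensitivity(m=0.0, obs=1.0, delta_cf=0.4, g0=0.1, g1=0.5)
```

In this toy rule the derivative equals `g1 * (obs - m)` wherever the gain is not clipped, so the identity condition fails exactly when `g1 = 0` or the prediction error vanishes.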

3.2 F-Coalgebraic Fixed Point

The question "who references π?" triggers a homunculus regress. To avoid this, self is defined as an F-coalgebraic fixed point:

$$M^* = F(M^*)$$

Here $F : M \mapsto \Phi\bigl(M,\, R_M(\mathcal{E}_\pi(\rho))\bigr)$ is the composition of the policy-conditioned channel $\mathcal{E}_\pi$ and the Petz recovery map $R_M$. There is no separate "referencing subject" — the closed self-referential structure itself is the observer-agent.

On the Gödelian Remainder: Directly importing Gödel's incompleteness theorems is inappropriate for finite-dimensional quantum systems. Instead, the remainder is formulated as a quantum self-measurement gap (no-cloning + post-measurement back-action): $M^*$ cannot fully capture itself within $\mathcal{A}(R)$. This is maintained as a physical-basis candidate for the cliff of phenomenology, but the connection is unproven.

3.3 Endogenous Intervention Structure C_O

The $do(a)$ in Pearl's do-calculus is an external intervention and does not represent the observer's autonomy. What is needed is endogenous intervention:

$$C_O \subset \mathcal{A}(R)$$

$C_O$ is the set of operations generated from within $\mathcal{A}(R)$. A rock has $C_O = \emptyset$ and no connection between Δ_cf and $M^*$. In the 4-qubit model, $C_O = \{I, X_1, Z_1, CZ_{1,3}\}$.
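The four operators listed for the 4-qubit model can be written out explicitly as $16 \times 16$ matrices. The sketch below assumes that qubit 1 is the leftmost tensor factor (most significant bit) and that $CZ_{1,3}$ is the controlled-Z gate between qubits 1 and 3; the helper names `on_qubit` and `cz` are hypothetical.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.diag([1.0, -1.0])
P1 = np.diag([0.0, 1.0])  # projector onto |1>

def on_qubit(op, k, n=4):
    """Embed a single-qubit operator on qubit k (1-indexed, qubit 1 leftmost)."""
    factors = [op if i == k else I2 for i in range(1, n + 1)]
    return reduce(np.kron, factors)

def cz(j, k, n=4):
    """Controlled-Z between qubits j and k: flips the sign when both are |1>."""
    both_one = on_qubit(P1, j, n) @ on_qubit(P1, k, n)
    return np.eye(2**n) - 2.0 * both_one

# The endogenous intervention set from the text.
C_O = {
    "I": np.eye(16),
    "X1": on_qubit(X, 1),
    "Z1": on_qubit(Z, 1),
    "CZ13": cz(1, 3),
}

# Every element is unitary, so each is a reversible intervention.
all_unitary = all(np.allclose(u @ u.conj().T, np.eye(16)) for u in C_O.values())
```

Under the indexing assumed here, $X_1$ and $Z_1$ act on a single qubit while $CZ_{1,3}$ is the only element that couples two qubits; whether qubit 3 belongs to the self or the environment side depends on how $b_1,\dots,b_4$ are assigned, which the text does not fix.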

The Markov blanket $B$ (Friston) provides "statistical separation of self and environment": $\mu_O \perp \eta \mid B$. This functions as a two-layer structure that complements Haag duality (algebraic access boundary) with a statistical-causal boundary.

Final Integrated Definition $$\text{observer-agent} := \bigl(\mathcal{A}(R),\;\Psi,\;\sigma_t^\Psi,\;P_\Psi,\;C_O,\;F,\;M^*\bigr)$$ Roles of each element: $\mathcal{A}(R)$: accessible von Neumann algebra, $\Psi$: reference state, $\sigma_t^\Psi$: modular flow (Tomita–Takesaki), $P_\Psi$: Petz recovery map, $C_O$: set of endogenous intervention operators, $F$: self-referential update operator, $M^*$: fixed point (self-model).

§4 Hierarchical Structure: From Controller to Perspectival Observer

Level	Name	Condition	Numerical Evidence
Level 0	Passive Dissipative Structure	No internal model. Response to external forces only.	—
Level 1	Adaptive Controller	Action → future change → update. But does not use Δ_cf: $\partial M_{t+1}/\partial \Delta_{\mathrm{cf}} = 0$	corr ≈ 0.000
Level 2	Observer-Agent	Identity condition: Δ_cf is internally utilized in latent update. $\partial M_{t+1}/\partial D_{\mathrm{ch}} \neq 0$	corr ≈ 0.987
Level 3	Perspectival Observer	Complete externalization of perspective is impossible due to self-measurement back-action: $R_{M_t}(\rho_{t+1}) \neq \rho_{t+1}$	back-action > 0 (confirmed)
Level 4	Phenomenological Subject	Unresolved cliff. Unreachable with current framework.	—

Continuous Self (Selfhood)

Define temporal continuity of self as the stability of the reconstruction chain:

$$\text{Selfhood}(t) = \bigl\{O(t) \to O(t+\Delta t) \to O(t+2\Delta t) \cdots\bigr\}$$

Stability conditions:

— $d_{\mathrm{Bures}}(M(t), M(t+\Delta t))$ is small
— Prediction error is correctable within $\varepsilon$
— Reconstruction fidelity $F(P_\Psi) \geq 1-\delta$
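The first stability condition can be made concrete with the standard Bures distance, $d_{\mathrm{Bures}}(\rho,\sigma)^2 = 2\bigl(1 - \sqrt{F(\rho,\sigma)}\bigr)$ with Uhlmann fidelity $F(\rho,\sigma) = \bigl(\mathrm{Tr}\sqrt{\sqrt{\rho}\,\sigma\sqrt{\rho}}\bigr)^2$. The sketch below assumes that $M(t)$ is represented by a density matrix; the example states and the threshold `0.1` are hypothetical.

```python
import numpy as np

def psd_sqrt(a):
    """Square root of a Hermitian positive-semidefinite matrix."""
    w, v = np.linalg.eigh(a)
    w = np.clip(w, 0.0, None)          # remove tiny negative round-off
    return (v * np.sqrt(w)) @ v.conj().T

def fidelity(rho, sigma):
    """Uhlmann fidelity (Tr sqrt(sqrt(rho) sigma sqrt(rho)))^2."""
    s = psd_sqrt(rho)
    w = np.clip(np.linalg.eigvalsh(s @ sigma @ s), 0.0, None)
    return float(np.sum(np.sqrt(w)) ** 2)

def bures_distance(rho, sigma):
    """Bures distance sqrt(2 * (1 - sqrt(F)))."""
    f = min(1.0, fidelity(rho, sigma))  # guard against round-off above 1
    return float(np.sqrt(2.0 * (1.0 - np.sqrt(f))))

# Hypothetical self-model states at t and t + dt.
M_t = np.array([[0.60, 0.20], [0.20, 0.40]])
M_next = np.array([[0.62, 0.19], [0.19, 0.38]])

step = bures_distance(M_t, M_next)
is_stable = step < 0.1                  # hypothetical stability threshold
```

The distance ranges from 0 for identical states to $\sqrt{2}$ for orthogonal ones, so any threshold for "small" has to be chosen relative to that scale.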

Central Proposition: Selfhood is not the persistence of a substance, but the stability of a reconstruction process.

Connection to Many-Worlds Branching

After branching, both $O_A(t) \to O_A(t+\Delta t)$ and $O_B(t) \to O_B(t+\Delta t)$ are valid as reconstruction chains. It is not a matter of "which is real" — both are locally experienced as chains that maintain lower free energy.

§5 Selection Principle: Why Observer-Agents Emerge

The identity condition defines "what an observer-agent is." The selection principle explains "why such structures naturally emerge." Without the former there is no latter; confusing the two causes the theory to collapse.

5.1 RICD (Reparameterization-Invariant Counterfactual Divergence)

The raw Δ_cf depends on action labels, granularity, and policy. To remove this dependence, we define the following quantity.

Equivalence Class of Actions

$$[a] := \{a' \mid \rho^{(a')}_{t+h} = \rho^{(a)}_{t+h} \;\forall h \geq 1\}$$ (Actions producing identical future distributions are identified)

Symmetrized KL Distance h Steps Ahead

$$d_h([a],[a'])^2 = D\!\bigl(\rho^{(a,h)} \,\|\, \rho^{(a',h)}\bigr) + D\!\bigl(\rho^{(a',h)} \,\|\, \rho^{(a,h)}\bigr)$$

Reconstruction Permeation Rate

$$\mathcal{R}_h([a],[a'],M_t) = 1 - \frac{D(M_{t+h}^{(a)} \| M_{t+h}^{(a')})}{\varepsilon + d_h^2}$$ (How much of the intervention divergence is preserved in the internal model)

Definition: RICD $$\Delta_{\mathrm{RICD}}^{*,\gamma}(M_t) = \sum_{h=1}^{\infty} \gamma^h \; \mathbb{E}_{[a],[a']\sim\bar\pi} \bigl[d_h([a],[a'])^2 \cdot \mathcal{R}_h([a],[a'],M_t)\bigr]$$ Reparameterization-invariant, horizon-aware, exploitability-aware counterfactual divergence. $\bar\pi$ is a reference policy (e.g., uniform distribution).
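The three ingredients above (action classes, symmetrized KL, permeation rate) can be combined in a small classical sketch. It implements the formulas exactly as written, under several assumptions that are not in the source: futures and internal models are discrete distributions with full support, the horizon sum is truncated at the number of horizons supplied, $\bar\pi$ is uniform over action classes, and each class is represented by its first member. All function and variable names are hypothetical.

```python
import numpy as np

def kl(p, q):
    """KL divergence D(p || q) in nats for discrete distributions with full support."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sum(p * (np.log(p) - np.log(q))))

def action_classes(futures, tol=1e-12):
    """Return one representative action per equivalence class.

    futures has shape (H, A, S): horizon, action, outcome. Two actions are
    equivalent when their future distributions agree at every horizon.
    """
    reps = []
    for a in range(futures.shape[1]):
        same = [np.allclose(futures[:, a], futures[:, r], rtol=0.0, atol=tol) for r in reps]
        if not any(same):
            reps.append(a)
    return reps

def ricd(futures, models, gamma=0.9, eps=1e-6):
    """Truncated RICD with a uniform reference policy over action classes."""
    futures = np.asarray(futures, dtype=float)
    models = np.asarray(models, dtype=float)
    reps = action_classes(futures)
    weight = 1.0 / len(reps) ** 2        # uniform over ordered pairs of classes
    total = 0.0
    for h in range(futures.shape[0]):
        for a in reps:
            for b in reps:
                # Symmetrized KL distance d_h^2 between the two futures.
                d2 = kl(futures[h, a], futures[h, b]) + kl(futures[h, b], futures[h, a])
                # Permeation rate R_h as defined in the text.
                perm = 1.0 - kl(models[h, a], models[h, b]) / (eps + d2)
                total += gamma ** (h + 1) * weight * d2 * perm
    return total

# Two genuinely different actions, one horizon.
futures2 = [[[0.9, 0.1], [0.1, 0.9]]]
models2 = [[[0.6, 0.4], [0.4, 0.6]]]

# Same environment with action 0 duplicated under a new label.
futures3 = [[[0.9, 0.1], [0.9, 0.1], [0.1, 0.9]]]
models3 = [[[0.6, 0.4], [0.6, 0.4], [0.4, 0.6]]]

base = ricd(futures2, models2)
relabelled = ricd(futures3, models3)
```

Duplicating an action leaves the value unchanged because the duplicate falls into an existing class; this is the label and granularity invariance the definition aims for, shown here only for this toy case.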

Relationship with Empowerment (Klyubin & Polani 2005):

$\Delta_{\mathrm{RICD}} \;\leq\; I(A;S'|\pi) \;\leq\; E(s) = \max_\pi I(A;S'|s)$

RICD is "the realized, available intervention divergence under the current policy," while Empowerment is its optimized upper bound. The two are not equal; RICD is positioned as a policy-conditioned, exploitability-aware sub-concept of empowerment.
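The second inequality, $I(A;S'|\pi) \leq E(s)$, can be checked numerically because empowerment is the capacity of the action-to-outcome channel. The sketch below computes that capacity with the Blahut–Arimoto iteration, which is a standard algorithm swapped in here for illustration and not something the source specifies. It assumes a strictly positive transition matrix `channel[a, s']` for one fixed state $s$; the first inequality involving RICD is not tested.

```python
import numpy as np

def mutual_information(policy, channel):
    """I(A; S') in nats for action distribution `policy` and p(s' | a) = channel[a]."""
    policy = np.asarray(policy, dtype=float)
    channel = np.asarray(channel, dtype=float)
    marginal = policy @ channel                       # p(s')
    return float(np.sum(policy[:, None] * channel * np.log(channel / marginal)))

def empowerment(channel, iters=500):
    """Channel capacity max_pi I(A; S') via Blahut-Arimoto (positive channel assumed)."""
    channel = np.asarray(channel, dtype=float)
    r = np.full(channel.shape[0], 1.0 / channel.shape[0])
    for _ in range(iters):
        marginal = r @ channel
        # Per-action divergence D(p(. | a) || marginal).
        d = np.sum(channel * np.log(channel / marginal), axis=1)
        r = r * np.exp(d)
        r = r / r.sum()
    return mutual_information(r, channel), r

# Binary symmetric channel with flip probability 0.1 (illustrative).
bsc = np.array([[0.9, 0.1], [0.1, 0.9]])
capacity, best_policy = empowerment(bsc)

# A skewed policy realizes less information than the optimum.
realized = mutual_information([0.8, 0.2], bsc)
```

The gap between `realized` and `capacity` is what the text describes as the difference between a policy-conditioned quantity and its optimized upper bound.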

5.2 Free-Energy Advantage Principle (FEAP)

Selection Principle: $$\mathcal{J} = U_{\mathrm{pred}} + \lambda U_{\mathrm{ctrl}} - C_{\mathrm{therm}}$$ The observer-agent is naturally selected as a structure that maximizes $\mathcal{J}$: prediction accuracy $U_{\mathrm{pred}}$ plus $\lambda$-weighted control performance $U_{\mathrm{ctrl}}$, minus the maintenance cost $C_{\mathrm{therm}}$.

Intuition: When Δ_cf (or RICD) is large, updating is beneficial; when it is small (noise-dominated), updating only increases cost. Therefore, the update rule that maximizes $\mathcal{J}$ naturally becomes RICD-sensitive.

Numerical verification (classical surrogate environment, evolutionary simulation): The sensitivity parameter $g_1$ of Δ_cf converges from an initially random population to positive values, with a final mean of $g_1 \approx 0.85$ and $g_1 \approx 1.0$ in top individuals. This shows that RICD-sensitive updating emerges as "a consequence of selection pressure" rather than "design" (note: results from classical surrogate environment; the quantum version requires a richer set of operators).

5.3 L2 Proposition and Environment Class

Structurally define the environment class $\mathcal{E}''$:

S1: rank of action channel > 1 (actions change something)
S2: finite mixing time $\tau_{\mathrm{mix}} < \infty$
S3: causal path $a \to Y$ exists and contributes to future inference/control (exploitability)

Observer-Agent Selection Theorem (L2 Proposition) In $\mathcal{E}''$, an observer-agent with RICD-sensitive adaptive gain has a positive asymptotic regret advantage over any fixed-gain controller: $$\lim_{T\to\infty} \frac{1}{T}\bigl(\mathrm{Regret}_{\mathrm{ctrl}} - \mathrm{Regret}_{\mathrm{agent}}\bigr) \geq \delta(\gamma,\tau_{\mathrm{mix}},\varepsilon) > 0$$ This advantage can be derived from the empowerment literature (Salge et al.) and optimal filtering theory for Markov switching environments.
Note: Not a new theorem but an application of existing frameworks. Finite-time guarantees depend on $\tau_{\mathrm{mix}}$; only asymptotic claims are possible.

L3 (universality class): Whether $\mathcal{E}''$ has positive measure in the natural environment distribution family $\mathcal{P}$ — this is an empirical question and lies outside the current framework.

§6 Perspectival Incompleteness

The Level 2 observer-agent internalizes "being a causal node of future generation," but what is missing is "from where" — there is causal self-reference, but no positional self-reference.

6.1 Self-Measurement Back-Action

Introducing the self-measurement channel $S(\rho) = \sum_i P_i \rho P_i$ (irreversible) yields the structural constraint that "trying to read the future changes oneself":

$$R_{M_t}(\rho_{t+1}) \neq \rho_{t+1}$$

Through the combination of the no-cloning theorem and post-measurement back-action, $M^*$ cannot fully capture itself within $\mathcal{A}(R)$. This "self-excess remainder" is the physical basis of perspectival incompleteness.
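The back-action of the self-measurement channel $S(\rho) = \sum_i P_i \rho P_i$ is easy to see on a single qubit. The sketch below illustrates only the channel $S$, not the full recovery statement $R_{M_t}(\rho_{t+1}) \neq \rho_{t+1}$; the computational-basis projectors and the choice of the $|+\rangle$ state are assumptions made for illustration.

```python
import numpy as np

def self_measure(rho, projectors):
    """Non-selective projective measurement: S(rho) = sum_i P_i rho P_i."""
    return sum(p @ rho @ p for p in projectors)

def trace_distance(a, b):
    """Trace distance 0.5 * ||a - b||_1 for Hermitian matrices."""
    return 0.5 * float(np.abs(np.linalg.eigvalsh(a - b)).sum())

# Computational-basis projectors (assumed measurement basis).
P0 = np.diag([1.0, 0.0])
P1 = np.diag([0.0, 1.0])
projectors = [P0, P1]

# |+><+| carries coherence between the two measured outcomes.
plus = np.full((2, 2), 0.5)
after = self_measure(plus, projectors)

# Back-action: how far the state moved because it was read out.
back_action = trace_distance(after, plus)

# A state that is already diagonal in the measured basis is untouched.
diagonal = np.diag([0.7, 0.3])
untouched = trace_distance(self_measure(diagonal, projectors), diagonal)
```

The channel is idempotent, so a second readout adds nothing, but the coherence removed by the first one cannot be restored from the output alone; this is the sense in which the map is irreversible.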

6.2 Modular Flow and the Emergence of Time

By Tomita–Takesaki theory, from the algebra $\mathcal{A}(R)$ and reference state $|\Psi\rangle$ the modular Hamiltonian $K_R$ is determined, and the modular flow

$$\sigma_t^\Psi(A) = \Delta^{it} A \Delta^{-it}$$

is generated. This is an automorphism group of $\mathcal{A}(R)$ fixed by the algebra and the state themselves, not an external time parameter; here $\Delta = e^{-K_R}$ is the modular operator and is unrelated to $\Delta_{\mathrm{cf}}$. By the Bisognano–Wichmann theorem, in the Rindler wedge the modular flow acts as a Lorentz boost. This suggests that the sense of time may emerge from the stable ordering of reconstruction chains. However, the flow is state- and algebra-dependent, so one cannot say "the time of the universe has emerged."
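In finite dimensions the modular flow has an explicit form that can be checked directly. For a full matrix algebra with a faithful (full-rank) state $\rho$, the flow reduces to $\sigma_t(A) = \rho^{it} A \rho^{-it}$ with modular Hamiltonian $K = -\ln\rho$. The sketch below uses this standard finite-dimensional reduction; it does not model the Rindler-wedge setting, and the example state and operators are illustrative assumptions.

```python
import numpy as np

def modular_flow(rho, a, t):
    """sigma_t(A) = rho^{it} A rho^{-it} for a full-rank density matrix rho."""
    w, v = np.linalg.eigh(rho)
    u = (v * np.exp(1j * t * np.log(w))) @ v.conj().T   # rho^{it}, a unitary
    return u @ a @ u.conj().T

# Illustrative faithful state and two observables.
rho = np.array([[0.6, 0.2], [0.2, 0.4]], dtype=complex)
A = np.array([[0.0, 1.0], [0.0, 0.0]], dtype=complex)
B = np.array([[1.0, 2.0], [0.5, -1.0]], dtype=complex)

flowed = modular_flow(rho, A, 0.7)

# The reference state does not change along its own modular flow.
expectation_before = np.trace(rho @ A)
expectation_after = np.trace(rho @ flowed)
```

The parameter `t` is defined entirely by the pair (algebra, state): changing `rho` changes the flow, which is the state dependence the text cautions about.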

6.3 Correspondence with Perspective-Neutral Structure

The "viewed from here" structure of the observer-agent can be written as gauge-fixing from a perspective-neutral world description:

$$\text{perspective-neutral structure} \xrightarrow{\text{gauge-fixing}} \text{indexical self-location}$$

This is closed as an intrinsic and self-referential operation, and together with the incompleteness due to back-action (complete externalization impossible), constitutes the perspectival observer (Level 3).

§7 Map of the Cliff: Coordinates of Phenomenal Consciousness

Following Block, A-consciousness (the state where information is available for inference, reporting, and action control) is distinguished from P-consciousness (the "something it is like" inner qualitative feel). This theory describes the structural conditions of A-consciousness; regarding P-consciousness, it claims the following.

Coordinates of the Cliff (Final Version)

What this framework provides:

— Access boundary ($\mathcal{A}(R)$, Haag duality)
— Adaptive recovery (Petz map, RICD-sensitive updating)
— Self-model (fixed point $M^*$ of action-conditioned Quantum IB)
— Formation of perspective (gauge-fixed indexical self-location)
— Perspectival incompleteness (quantum self-measurement gap)

What this framework does not provide:

— Why "something it is like" inhabits this structure

Shape of the cliff:
Previously: "recoverability → qualia" (vague)
Now: "gauge-fixed F-coalgebraic fixed point → first-person phenomenology" (precise)

The cliff has not disappeared. But its location can now be written in algebraic terms.

From the perspective of structural realism (Russellian Monism), $\mathcal{A}(R)$ describes structure, and "what realizes that structure" lies outside physics. The hypothesis that the intrinsic nature of that realization may be P-consciousness is retained — while honestly not proving it.

Terminal Open Questions · What Can Be Claimed Strongly · What Cannot Yet Be Claimed

What Can Be Claimed Strongly

The observer can be defined as a von Neumann algebra $\mathcal{A}(R)$ rather than a "point" (consistent with AQFT)
Controller and Observer-Agent can be operationally separated by the identity condition $\partial M_{t+1}/\partial \Delta_{\mathrm{cf}} \neq 0$ (numerical evidence: corr ≈ 0.987 vs. 0.000)
The Petz recovery map and active inference share a common objective function of relative entropy minimization (not metaphor, but structural correspondence)
RICD can be defined as a reparameterization-invariant quantity independent of action labels, granularity, and policy
Self-measurement back-action produces "complete externalization impossibility" as a structure (implementation of Level 3)
Numerically confirmed in evolutionary simulation (classical surrogate environment) that RICD-sensitive updating is naturally selected
L2 theorem: asymptotic regret advantage in $\mathcal{E}''$ can be derived from existing filter/control theory
The boundary reachable by the current framework can be located algebraically

What Cannot Yet Be Claimed

Explanation of phenomenal consciousness (P-consciousness / qualia). "Something it is like" lies outside this framework
Rigorous proof of the quantum version of the Selection Principle (the current 4-qubit model has the problem that Δ_cf becomes nearly constant)
Guarantee of finite-time regret (depends on mixing time; only asymptotic claims are possible)
L3 (universality class): whether $\mathcal{E}''$ has positive measure in the natural environment distribution family is an empirical question
The EFE epistemic term and RICD differ in that the former concerns the observation channel and the latter the control channel; the two are generally not equal
Modular flow being "the emergence of time" itself — it is state- and algebra-dependent, and generalization to universal time is unproven
"Self-consciousness" (self knowing self) and "self-referential structure" are not identical
A connection with IIT (φ value) is currently risky to assert: the redundancy of QEC and the irreducibility required by IIT point in opposite directions

Open Questions (Where to Dig Next)

Proof of generic emergence: Define L3 as a "phase transition in the causal geometry of natural environments" and find the measure of the environment set where RICD tracks predictive-control relevance
Implementation of quantum RICD: Design of a richer quantum operator set where Δ_cf genuinely fluctuates. Verification of whether the quantum version has an essential advantage over the classical version
Non-circular definition of exploitability: Formulation of actionability based solely on the structure of the causal graph, without depending on control utility $U_{\mathrm{ctrl}}$
Quantification of perspectival incompleteness: Concrete measurement method for the quantum self-measurement gap $\Delta_{\mathrm{self}} = 1 - \max_{M^* \subset \mathcal{A}(R)} F(M^*, \rho_{\mathrm{self}})$
Phase II (compression): Can RICD_exp be further minimized and the definition of observer compressed into a single line: "a structure that continues to retain usable intervention divergence"?
Finite-time regret: Conditions under which initial overshoot of adaptive gain and finite-time reversal with the Controller occur