Where does the actual–iid discrepancy become visible?
A coordinate-wise localization of Δ in Collatz escape words
This paper gives a coordinate-wise measurement of where Δ localizes most strongly.
Δ is projected onto state, prefix-cylinder, transition, boundary
(remaining_K), and remaining_K-chain coordinates. The discrepancy
is not concentrated in any single prefix or transition; it localizes more sharply in
state coordinates and along the remaining_K boundary distance, with the
largest observed \(|\Delta|\) at remaining_K = 32–63. Along the
remaining_K chain, mass Δ can be negative while conditional downstream
transition Δ is positive, separating mass placement from local transition. The paper reports
finite-sample descriptive statistics: remaining_K identifies the band with the
largest observed discrepancy in this scan.
1. Introduction
Finite-block diagnostics for Collatz escape words showed where the actual–iid discrepancy can be detected, but did not reproduce it generatively. This study takes the discrepancy itself as the object of observation.
Throughout this paper, the discrepancy refers to the difference between the observed finite-integer escape-word measure \(\mu_{\text{actual}}\) and the iid 2-adic reference measure \(\mu_{\text{iid}}\). The object of study is that difference itself,
\[ \Delta(\cdot) \;=\; \mu_{\text{actual}}(\cdot) \;-\; \mu_{\text{iid}}(\cdot). \]
The single question the paper asks is where this discrepancy becomes visible.
The analysis changes coordinates while keeping the same Δ fixed, and reports where localization is strongest.
This is a measurement paper: the analysis is descriptive rather than generative.
Every reported quantity is a descriptive statistic, split-sample and sampled.
In particular, remaining_K identifies, as shown in §6, the band
where the largest \(|\Delta|\) is observed, but we do not interpret it as a source of
the discrepancy.
All numbers in this paper are quantities already computed in the prior study's auxiliary analysis (the Δ-map of §4.6), re-presented here under a single, self-contained theme. No new data generation or model fitting is performed.
2. Coordinate systems
The discrepancy is projected onto five coordinates, all defined on the same population of escape words. They differ only in which quantity bins the words; in each coordinate, Δ is read as the per-cell mass difference.
state
The state coordinate is the same conditioning cell as in the prior study,
\[ \text{state} \;=\; \text{bridge\_cluster} \;\big|\; \text{x\_K\_window} \;\big|\; \text{parity}. \]
It combines path shape (bridge_cluster), valuation depth
(x_K_window), and parity into one coarse-grained cell.
prefix cylinder
The classical cylinder coordinate that bins words by fixing the first \(L\) symbols of the escape word. Projecting Δ onto prefix cylinders lets us ask whether the discrepancy is visible from an early part of the word, and whether it concentrates in a single initial word-fragment.
transition
A coordinate that bins words by the transition (edge) between consecutive blocks, or the branch in prefix growth. Projecting Δ onto transitions lets us ask whether any single local transition carries the bulk of the discrepancy.
boundary (remaining_K)
A coordinate of distance to the stopping boundary. remaining_K is a
\(K_\tau\)-based boundary distance, here bucketed into bands such as
16–31/32–63/64–95/96–127.
Projecting Δ onto remaining_K lets us ask whether the discrepancy is
organized along distance to the boundary.
remaining_K chain
A coordinate treating the boundary not as a single cell but as a chain of boundary-distance bands. Alongside the mass difference in each band (mass Δ), the chain records the difference in the share moving downstream conditioned on being in a given band (conditional transition Δ). The central observation of §7 is that these two Δ's can have opposite signs.
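One way to write the two chain quantities, as a sketch under assumed notation (the symbols \(b\), \(b'\), and \(\mu(b \to b')\) do not appear in the source): for a band \(b\) with downstream neighbour \(b'\), let \(\mu(b)\) be the mass in the band and \(\mu(b \to b')\) the mass that moves from \(b\) to \(b'\). Then
\[ \Delta_{\text{mass}}(b) \;=\; \mu_{\text{actual}}(b) - \mu_{\text{iid}}(b), \qquad \Delta_{\text{cond}}(b \to b') \;=\; \frac{\mu_{\text{actual}}(b \to b')}{\mu_{\text{actual}}(b)} - \frac{\mu_{\text{iid}}(b \to b')}{\mu_{\text{iid}}(b)}. \]
Under this reading the two signs are free to differ, because the conditional term is normalized by each measure's own band mass.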
| coordinate | what it bins by | what it can ask |
|---|---|---|
| state | path shape + valuation depth + parity | does Δ localize in a coarse-grained cell? |
| prefix | first \(L\) symbols | does Δ concentrate in an initial fragment? |
| transition | inter-block edge / branch | does a single transition carry Δ? |
| boundary | boundary distance remaining_K | is Δ organized along boundary distance? |
| chain | chain of boundary-distance bands | do mass placement and local transition agree? |
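The per-cell reading of Δ shared by all five coordinates can be sketched as a single projection routine with a pluggable binning function. The sketch assumes each measure is available as a list of (word, mass) samples; the names `project_mass`, `delta_by_cell`, and `prefix_key`, and the tuple encoding of words, are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, Tuple

Word = Tuple[int, ...]       # hypothetical encoding of an escape word
Sample = Tuple[Word, float]  # (word, mass carried by that word)
Key = Callable[[Word], Hashable]

def project_mass(samples: Iterable[Sample], key: Key) -> Dict[Hashable, float]:
    """Sum the mass of all words that fall into the same cell."""
    mass: Dict[Hashable, float] = defaultdict(float)
    for word, weight in samples:
        mass[key(word)] += weight
    return dict(mass)

def delta_by_cell(actual, iid, key):
    """Per-cell delta = actual mass minus iid mass, over the union of cells."""
    m_actual = project_mass(actual, key)
    m_iid = project_mass(iid, key)
    cells = set(m_actual) | set(m_iid)
    return {c: m_actual.get(c, 0.0) - m_iid.get(c, 0.0) for c in cells}

def prefix_key(length: int) -> Key:
    """Prefix-cylinder coordinate: bin a word by its first `length` symbols."""
    return lambda word: word[:length]

# Toy samples, made up for illustration; these are not data from the scan.
toy_actual = [((1, 0, 1), 0.5), ((1, 1, 0), 0.5)]
toy_iid = [((1, 0, 1), 0.25), ((1, 1, 0), 0.25), ((0, 1, 1), 0.5)]
toy_delta = delta_by_cell(toy_actual, toy_iid, prefix_key(1))
```

Changing coordinates then amounts to swapping the `key` function while the two sample lists, and hence Δ itself, stay fixed.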
3. State localization
Projected onto the coarse-grained state coordinate, Δ gathers in a small number of cells.
Results
Projected onto state coordinates, the discrepancy localizes in a small number of states. The representative stress case singled out in the prior study,
\[ \textbf{focus state} \;=\; \texttt{late\_growth | tail\_64\_95 | even}, \]
is the state where the actual–iid discrepancy is largest and where sampling is
sparsest. In this state the masses carried by actual and by iid differ substantially.
Across states, the difference gathers into a few states and nearly cancels in most. The
combination of the three coordinates
bridge_cluster + x_K_window + parity provided the strongest localization of
Δ.
Discussion
State localizes Δ with medium-to-high sharpness. This indicates that path shape
(bridge_cluster), valuation depth (x_K_window), and parity
act not singly but in combination. In other words, Δ is not pinned to "one
shape" or "one depth," but appears at their intersection. Because the focus state sits in a
sparsely sampled band, the localization is strong but rests on thin
mass.
Open questions
State is a coarse-grained coordinate; whether localization sharpens or the image diffuses when each of the three coordinates is further subdivided is not measured here. This remains a question of coordinate resolution.
4. Prefix localization
In prefix cylinders, Δ is already visible early but remains spread over many initial fragments.
Results
Projected onto prefix cylinders, two things happen at once. First, the discrepancy is visible from the early prefix on — fixing only the very beginning of the word already gives a non-zero actual–iid mass difference. Second, however, the discrepancy does not concentrate in a single prefix. Lengthening the prefix window does not draw Δ into one particular initial fragment; it remains spread thinly across many cylinders.
Discussion
The sharpness of Δ in the prefix coordinate is low. Early visibility shows that the discrepancy is not a tail-only phenomenon; the lack of concentration in any single prefix shows that the prefix cylinder is a poor coordinate for capturing Δ. Δ does not appear as excess mass of words with a fixed opening.
Open questions
How far Δ can be captured by longer prefixes, or by products of prefix with other coordinates, remains open. The present scan records the two-sidedness here: visible early, but not concentrated.
5. Transition localization
In the transition coordinate, Δ does not collapse onto any single edge or branch.
Results
Projected onto transitions (inter-block edges / branches in prefix growth), no single transition captures the discrepancy. Δ is not collapsed onto any one edge or branch; the discrepancy is distributed across many transitions. Whichever local transition we pick out, on its own it carries only a small part of the total discrepancy.
Discussion
The sharpness of Δ in the transition coordinate is low. This points the same way as the prefix result: Δ is not pinned to the microscopic units of "a local word-fragment" or "a local transition." However finely we examine local transitions, the discrepancy does not reside locally; it is distributed.
This finding also foreshadows the chain analysis of §7. A single transition does not capture the discrepancy; from the coordinate side, the discrepancy may appear not in transitions themselves but in the resulting mass placement.
Open questions
Whether higher-order transitions (two or more steps) or combinations of transitions can capture Δ is not measured. The claim here is limited to the low localization of a single transition.
6. Boundary localization (remaining_K)
The boundary-distance coordinate remaining_K is one of
the coordinates in which Δ localizes most sharply.
Results
Projected onto remaining_K (distance to the stopping boundary), the
discrepancy is organized along boundary distance. The largest \(|\Delta|\) appears at
remaining_K = 32–63. The actual mass stays thinner than iid at
remaining_K = 64–95 and 96–127, but the largest absolute mass
difference remains at 32–63.
| remaining_K | actual | iid | \(\Delta\) | ratio | L1 share |
|---|---|---|---|---|---|
| 32–63 | 1.959532 | 2.139743 | −0.180211 | 0.916 | 38.57% |
| 64–95 | 0.266435 | 0.341662 | −0.075227 | 0.780 | 16.10% |
| 96–127 | 0.018644 | 0.031704 | −0.013059 | 0.588 | 2.80% |
In all three bands \(\Delta\) is negative: the mass actual carries in each band is thinner than iid. The largest absolute mass difference is at 32–63 (\(|\Delta|=0.180211\), L1 share 38.57%); the lowest ratio is at 96–127 (0.588), but the mass and L1 share of that band are small. So the "band with the largest observed discrepancy" is 32–63 by absolute mass difference and 96–127 by ratio — two coexisting readings. We take absolute mass difference as the criterion and call 32–63 the band with the largest observed discrepancy.
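The two readings of "largest discrepancy" can be checked directly from the table above. The sketch below recomputes \(\Delta\) and the ratio from the tabulated actual and iid columns and picks the extreme band under each criterion; the names `BANDS` and `band_stats` are hypothetical. The L1 share is not recomputed, since it depends on a total over all bands of the scan that the table does not list.

```python
# (actual, iid) per band, copied from the remaining_K table above.
BANDS = {
    "32-63": (1.959532, 2.139743),
    "64-95": (0.266435, 0.341662),
    "96-127": (0.018644, 0.031704),
}

def band_stats(bands):
    """Recompute delta = actual - iid and ratio = actual / iid for each band."""
    return {
        name: {"delta": actual - iid, "ratio": actual / iid}
        for name, (actual, iid) in bands.items()
    }

stats = band_stats(BANDS)

# Criterion used in the text: largest absolute mass difference.
peak_by_abs_delta = max(stats, key=lambda b: abs(stats[b]["delta"]))

# Alternative criterion: lowest actual/iid ratio.
peak_by_ratio = min(stats, key=lambda b: stats[b]["ratio"])
```

Because the tabulated inputs are rounded to six decimals, a recomputed \(\Delta\) can differ from the tabulated one in the last digit.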
Discussion
The sharpness of Δ in the boundary coordinate is high. The discrepancy, distributed
in the prefix and transition coordinates, appears as a clear band structure once
rearranged along the single axis of boundary distance. This shows the effectiveness of
remaining_K as a coordinate that captures Δ.
Open questions
How the largest-discrepancy band moves under different binnings (here roughly dyadic windows), and whether a finer boundary-distance decomposition reveals structure within 32–63, are not measured here.
7. remaining_K chain
The boundary coordinate can also be read as a chain of boundary-distance bands. In that chain, mass placement and local transition separate.
Results
Along the remaining_K chain we measure two distinct Δ's.
- mass Δ: the mass difference in a band (the same direction as Table 6.1 in §6). It is negative in the leading bands.
- conditional transition Δ: the difference in the share moving to a downstream band, conditioned on being in a given band. There are bands where it is positive.
\[ \text{mass } \Delta < 0, \qquad \text{conditional } \Delta > 0 \]
In two concrete cases the signs of the two diverge:
| transition | mass Δ | conditional Δ | sign |
|---|---|---|---|
| 64-95 → 32-63 | −0.006059 | +0.007861 | opposite |
| 32-63 → 16-31 | −0.009262 | +0.004618 | opposite |
That is, in these bands the mass actual carries is thinner than iid (mass Δ negative), yet the share moving downstream, conditioned on being in the band, is not necessarily weak (conditional Δ positive). Carrying little mass and having a weak conditional transition do not coincide here.
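A toy calculation shows how the two signs can separate. It assumes one particular reading of the definitions: the conditional share is the mass moving downstream divided by the band's own mass. The function names and all numbers are hypothetical illustrations, not values from the scan.

```python
def conditional_share(band_mass: float, flow_mass: float) -> float:
    """Share of a band's mass that moves to the downstream band."""
    if band_mass <= 0:
        raise ValueError("band mass must be positive")
    return flow_mass / band_mass

def chain_deltas(actual, iid):
    """Each argument is a (band_mass, flow_mass) pair.

    Returns (mass delta, conditional transition delta), both as actual - iid.
    """
    mass_delta = actual[0] - iid[0]
    cond_delta = conditional_share(*actual) - conditional_share(*iid)
    return mass_delta, cond_delta

# Hypothetical illustrative numbers, not values from the scan.
toy_actual = (0.20, 0.12)  # thinner band, about 60% of it moves downstream
toy_iid = (0.30, 0.15)     # thicker band, 50% of it moves downstream

mass_delta, cond_delta = chain_deltas(toy_actual, toy_iid)
opposite_signs = mass_delta < 0 < cond_delta
```

The band carries less mass under actual, yet a larger fraction of that mass moves downstream, so a thin band and a strong conditional transition are compatible.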
Discussion
This separation of signs shows that local transition and mass
placement are distinct. If the discrepancy were simply a local-transition
malfunction — "the transition from a band downstream is weak" — mass Δ and conditional Δ
would move in the same direction. They do not: the conditional transition is, if
anything, on the strong side, while the band's mass is thin. The observed discrepancy is
therefore better read not as an excess or deficit of transition probability, but as a
difference in where mass is placed along the remaining_K chain.
This is consistent with the finding of §5: no single transition captures the discrepancy. Examined finely, local transitions do not house the discrepancy, yet viewed as per-band mass placement it appears sharply. The chain is the coordinate that shows this "placement, not transition" image most clearly, in the form of a sign mismatch.
Open questions
The sign difference is recorded here in only two cases. How it distributes across the whole chain (more bands, deeper chains) and how systematic the positive-conditional-Δ bands are remain unmeasured. A systematic measurement varying chain length and band resolution is a possible continuation.
8. Coordinate comparison
The projections of §3–§7 can be compared by how concentrated Δ becomes in each coordinate.
Results
| coordinate | sharpness | main finding |
|---|---|---|
| block score | — | diagnostic signal present, generation not reproduced (prior §4.1–4.4) |
| state | medium–high | localized by bridge_cluster + x_K_window + parity |
| prefix | low | visible early but not concentrated in a single prefix |
| transition | low | not collapsed onto a single edge / branch |
| boundary remaining_K | high | largest \(\lvert\Delta\rvert\) at 32–63 (peak band) |
| remaining_K chain | high | mass Δ and conditional Δ have opposite signs; visible as placement |
Discussion
Changing coordinates changes the concentration of the same Δ. In the local
coordinates of prefix and transition, Δ disperses over many cells. In the coarse-grained
state coordinate and the boundary-distance remaining_K coordinate, the
concentration evidence in Table 8.1 is stronger: Δ gathers into a few cells or bands.
The chain provides a finer localization of the boundary structure by showing a sign
mismatch between mass placement and local transition.
This comparison is itself the paper's main result. Asked "where is Δ," one can answer that it lies not in local word-fragments (prefix, transition) but in coarse-grained state and boundary distance (boundary / chain).
Open questions
"Sharpness" is a qualitative label here. We do not quantify sharpness on a single scale across AUC, L1 share, ratio, and the like. Comparing sharpness across coordinates on a common scale is a possible continuation.
9. Discussion
The coordinate dependence of Δ is itself informative: the discrepancy becomes sharp after coarse-graining by state and boundary distance, not after isolating local word-fragments or single transitions.
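The observations gather into one careful description: Δ is diffuse in the prefix and transition coordinates and concentrated in the state and boundary coordinates, where it appears as mass placement along the remaining_K chain rather than as local transition.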
This reading is supported by the two negative/separating observations of §5 and §7. That no single transition captures the discrepancy (§5) shows the discrepancy does not reside in local transitions. The sign difference between mass Δ and conditional Δ in the chain (§7) shows that mass placement can differ independently of local transition. Taken together, the result is that the location of Δ is placement, not transition.
The coordinate dependence supplies a boundary condition for later accounts. Any proposed
account of the actual–iid discrepancy should explain why the discrepancy is diffuse in
prefix and transition coordinates, but concentrated in state and boundary coordinates,
especially as mass placement along the remaining_K chain.
10. Limitations
The claims are limited to coordinate-wise descriptive statistics.
- Descriptive statistics only. The reported quantities are coordinate-wise projections of the actual–iid mass difference Δ, not a generative model or probabilistic model.
- They are sample-based. Every quantity is split-sample and sampled. In particular, the focus state and the deep remaining_K bands that show the strongest localization sit in sparsely sampled regions. Sampling error is not formally quantified.
- Coordinate localization is the object of study. The 32–63 band and the chain sign difference are reported as features of this coordinate projection. They are not, by themselves, a generative account.
- Coordinate resolution is coarse. The three state coordinates and the remaining_K bands are coarse decompositions, and finer coordinates could change the sharpness of localization. The chain sign difference is recorded in only two cases.
- "Sharpness" is qualitative. We do not quantify the clarity of localization across coordinates on a common scale. The coordinate comparison (§8) should be read under this limitation.
11. Conclusion
The paper provides a coordinate-wise map of where the actual–iid discrepancy becomes visible.
The discrepancy is diffuse in the prefix and transition coordinates and localizes more
sharply in the state coordinate and the remaining_K boundary coordinate. The
largest absolute mass difference is observed at remaining_K = 32–63 (the
peak band). Along the remaining_K chain there
are bands where the mass Δ is negative while the conditional downstream transition Δ is
positive: local transition and mass placement are distinct.
The conclusion follows from one controlled comparison: the same Δ is observed while the coordinate system changes. Future theoretical studies can build on this map. Natural continuations are re-measurement at finer coordinate resolution, a systematic study of the chain sign difference, and a common quantitative scale for localization sharpness.