02 Background | Collatz finite-block diagnostics

Background and notation

This section fixes the objects we compare: escape words, the iid 2-adic reference, and the coordinates we condition on. Nothing here is novel; it is recorded so the measurements in §4 are unambiguous.

2.1 Escape words

Work with the accelerated odd-to-odd map. For an odd \(n\), one step removes the factor of two from \(3n+1\):

\[ n \;\longmapsto\; \frac{3n+1}{2^{\,k}}, \qquad k = v_2(3n+1) \ge 1 . \]
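As a minimal sketch of this map (the function names `v2` and `odd_step` are illustrative choices, not identifiers from the paper's code), one accelerated step can be written as:

```python
from __future__ import annotations


def v2(m: int) -> int:
    """Return the 2-adic valuation of a positive integer m."""
    if m <= 0:
        raise ValueError("m must be positive")
    # m & -m isolates the lowest set bit; its bit position is the valuation.
    return (m & -m).bit_length() - 1


def odd_step(n: int) -> tuple[int, int]:
    """Apply one accelerated odd-to-odd step; return (next odd value, k)."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    m = 3 * n + 1
    k = v2(m)          # k >= 1 because 3n + 1 is even for odd n
    return m >> k, k   # dividing by 2**k leaves an odd integer
```

For example, `odd_step(7)` maps 7 to 22 / 2 = 11 with valuation 1, and `odd_step(5)` maps 5 to 16 / 16 = 1 with valuation 4.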

Following a trajectory until it escapes the layer produces a finite sequence of valuations

\[ \mathbf{k} = (k_1, k_2, \dots, k_\tau), \qquad k_i \ge 1, \]

which we call the escape word; \(\tau\) is its length. The cumulative valuation is \(K_\tau = \sum_{i=1}^{\tau} k_i\). In log-magnitude terms, a step from \(n\) changes \(\log_2 n\) by \(\log_2(3 + 1/n) - k_i \approx \log_2 3 - k_i\), so the partial sums of \(\log_2 3 - k_i\) describe the trajectory's descent up to an error that is small for large \(n\); this is the path we summarize by shape features below.
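The word and its drift path can be sketched as follows. The escape criterion depends on the layer definition, which this section does not restate, so the sketch assumes a stand-in stopping rule (stop at 1 or after `max_steps` steps); the names `valuation_word` and `drift_path` are hypothetical.

```python
import math


def valuation_word(n: int, max_steps: int) -> list[int]:
    """Collect valuations k_i along the accelerated trajectory of odd n.

    Assumption: stops on reaching 1 or after max_steps steps. The paper's
    layer-escape rule is not reproduced here.
    """
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    word = []
    for _ in range(max_steps):
        if n == 1:
            break
        m = 3 * n + 1
        k = (m & -m).bit_length() - 1  # 2-adic valuation of 3n + 1
        word.append(k)
        n = m >> k
    return word


def drift_path(word: list[int]) -> list[float]:
    """Partial sums of log2(3) - k_i, the approximate log-magnitude path."""
    c = math.log2(3)
    path, total = [], 0.0
    for k in word:
        total += c - k
        path.append(total)
    return path
```

Starting from 7 the trajectory is 7, 11, 17, 13, 5, 1, so the word is (1, 1, 2, 3, 4) with cumulative valuation 11, and the last entry of the drift path is \(5\log_2 3 - 11\).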

2.2 The two measures being compared

We compare two distributions over escape words inside each conditioning cell:

actual — words of finite integers, enumerated from exhaustive residue-status caches over odd residues up to \(2^{\text{power}}\), for power in \(\{24,\dots,28\}\). Trajectories flagged ESCAPE are traced and weighted by their layer mass.
iid — words sampled from the iid 2-adic reference: valuations drawn from a tilted geometric-like law (tilted_k) independently at each step, with importance weights that match the layer geometry. The reference is the null against which "actual" is measured.
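The exact form of tilted_k and of the importance weights is not given in this section. Purely as an illustration, the sketch below assumes an exponential tilt of the untilted 2-adic law \(P(k) = 2^{-k}\), namely \(P_\theta(k) \propto 2^{-k} e^{\theta k}\), which is again geometric with ratio \(r = e^{\theta}/2\). The tilt parameter `theta`, the function names, and the omission of importance weights are all assumptions of this sketch, not the paper's implementation.

```python
import math
import random


def tilted_geometric_sampler(theta: float, seed: int = 0):
    """Return a sampler for P_theta(k) = (1 - r) * r**(k - 1), k >= 1,
    with r = exp(theta) / 2. theta = 0 recovers P(k) = 2**-k.
    """
    r = 0.5 * math.exp(theta)
    if not 0.0 < r < 1.0:
        raise ValueError("theta must satisfy exp(theta) < 2")
    rng = random.Random(seed)
    log_r = math.log(r)

    def sample() -> int:
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
        # Inverse-transform sampling: P(k >= j + 1) = r**j.
        return 1 + int(math.log(u) / log_r)

    return sample


def iid_word(sample, length: int) -> list[int]:
    """Draw a word of independent valuations of the given length."""
    return [sample() for _ in range(length)]
```

With `theta = 0` the sample mean of the valuations should be close to 2, the mean of the untilted law.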

Framing. Throughout this paper, the object of study is the discrepancy between actual and iid inside a state, not either measure on its own. Every reported quantity (AUC, RMSE, survival, residual) is a comparison.

2.3 Valuation categories and blocks

Raw valuations are bucketed into three categories,

\[ \text{k\_cat}(k) = \begin{cases} \texttt{"1"} & k = 1 \\ \texttt{"2"} & k = 2 \\ \texttt{"3+"} & k \ge 3, \end{cases} \]

and a block of length \(L\) is a window of \(L\) consecutive categories. There are \(3^L\) possible blocks; we use \(L \in \{3,4,5,6\}\). The coarse bucketing keeps the per-block support small enough to estimate from samples while still resolving the short-range structure that the one-step view misses.
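A minimal sketch of the bucketing and block extraction, assuming overlapping windows with stride 1 (the text does not say whether windows overlap) and using the hypothetical helper name `block_counts`:

```python
from collections import Counter


def k_cat(k: int) -> str:
    """Bucket a raw valuation into one of the categories "1", "2", "3+"."""
    if k < 1:
        raise ValueError("valuations are >= 1")
    if k == 1:
        return "1"
    if k == 2:
        return "2"
    return "3+"


def block_counts(word: list[int], L: int) -> Counter:
    """Count length-L blocks of categories over sliding windows (stride 1).

    Assumption: windows overlap. Words shorter than L yield no blocks.
    """
    cats = [k_cat(k) for k in word]
    return Counter(tuple(cats[i:i + L]) for i in range(len(cats) - L + 1))
```

For the word (1, 1, 2, 3, 4) and \(L = 3\) this yields three blocks, each seen once: ("1", "1", "2"), ("1", "2", "3+"), and ("2", "3+", "3+").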

2.4 Why a finite-block view at all

Previous analyses suggest two things: no single low-order summary (\(K_\tau\), \(\tau\), mean valuation, or cumulative drift) adequately captures the discrepancy, and one-step transitions are close to iid. In the regression baseline used here, those covariates together barely separate actual from iid (AUC \(\approx 0.50\), i.e. near chance; see §4). This leads to the finite-block question: if the deviation is neither one-step nor adequately captured by these scalars, does it live in short multi-step patterns, and if so, can those patterns be turned into a generator?

The remainder of the paper answers: the deviation does show up in short blocks (diagnostically), but turning the blocks into a generator fails in the two ways we tried.