Negative results: a hierarchy of failures
This is the central chapter. The contribution of the study is not a model but a structured map of where an iid 2-adic approximation breaks: a list of descriptions that do not account for the finite-vs-iid discrepancy, ordered from the simplest to the most expressive, together with the residual that survives all of them.
Throughout this chapter, the discrepancy means the difference between the observed finite-integer escape-word measure and the iid 2-adic reference measure.
5.1 The ladder of eliminations
Read top to bottom, each rung is a more capable description than the one above, and each is ruled out as providing a complete explanation of the discrepancy. The first four rungs come from previous analyses and from the regression baseline used here; the last three are the present tests.
| Rung | Rejected explanation | Evidence here |
|---|---|---|
| 1 | cumulative valuation \(x_K\) alone | in the baseline, \(x_K\)+parity+bridge+\(z\) give AUC ≈ 0.50 |
| 2 | escape-word length \(\tau\) alone | prior step; \(\tau\) is informative but not sufficient |
| 3 | mean valuation / cumulative drift alone | prior step; correlated but not the main effect |
| 4 | a one-step (near-iid local) picture | prior step; local transitions close to iid |
| 5 | finite blocks as a generative model (reweighting) | Test 3: overcorrects; best fit is the damped baseline (class C) |
| 6 | finite-block maximum entropy | Test 4: no better than the raw/damped baseline (class C) |
| 7 | accumulation of block anomalies → whole-word deficit | Tests 1–2: AUC grows, but bridge/parity residual structure persists (class B) |
5.2 What the diagnostic tests do show
These results should not be read as implying that finite blocks reveal nothing. The finite-block diagnostics reveal a real, monotonic signal: the \(+\)score AUC rises from \(0.5363\) at \(L=3\) to \(0.5643\) at \(L=6\) (Table 4.2), and the focus-state \(B_4\) separation reaches AUC \(0.719\) (§4.1). The negative conclusion is more specific:
- the rising AUC does not drive the bridge coefficient to zero (it falls by only \(0.0162\) when \(B_4\) is added; §4.1), so the block signal and the bridge structure capture largely independent aspects of the discrepancy;
- the remaining parity effect is not reduced at all by the block score (§4.1, §4.2);
- no block length tested closes the gap, and whether any finite \(L\) would is left open.
5.3 Why the generative tests fail differently
The two generative tests fail in two distinct ways, which is itself informative:
- Reweighting (Test 3) overcorrects. The only setting that improves the fit uses heavily damped short blocks (\(L=3\), \(\alpha=0.25\)); as soon as the blocks are long or the exponent is full, the reweighted measure becomes too sharp — the focus survival collapses from an actual \(0.472461\) to \(0.00416667\) at \(L=6,\ \alpha=1\). A raw product of block ratios over-counts.
- Maximum entropy (Test 4) does not over-count, and still does not win. Replacing the raw product with a regularized projection removes the oversharpening but yields an RMSE of \(0.000493147\), worse than the damped baseline's \(0.000440978\) (Figure 1). Matching block marginals is not enough to recover the state distribution.
Taken together, these results show that the failure is not merely "we reweighted too hard". A principled finite-block exponential family that matches block marginals also fails to recover the observed distribution. That is the stronger negative statement.
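The over-counting in Test 3 can be illustrated with a toy calculation. The sketch below is not the Test 3 script. It assumes that block ratios are multiplied over overlapping sliding windows (the text does not say how the windows are placed) and that the "true" measure is a simple per-letter tilt of the iid reference. The function name `window_log_ratio`, the tilt values, and the example word are all hypothetical.

```python
import math

def window_log_ratio(word, log_ratio_per_letter, L, alpha):
    """Sum alpha * log(block ratio) over all sliding windows of length L.

    Toy assumption: a block's log-ratio is the sum of its letters' log-ratios,
    which holds when both measures are product measures.
    """
    total = 0.0
    for start in range(len(word) - L + 1):
        block = word[start:start + L]
        total += alpha * sum(log_ratio_per_letter[c] for c in block)
    return total

# Hypothetical per-letter tilt: (true probability) / (iid probability).
log_ratio = {"0": math.log(0.9), "1": math.log(1.1)}
word = "1111111111"

# Exact whole-word log-ratio under the toy product measure.
true_log_ratio = sum(log_ratio[c] for c in word)

for L in (3, 6):
    for alpha in (1.0, 1.0 / L):
        est = window_log_ratio(word, log_ratio, L, alpha)
        print(f"L={L} alpha={alpha:.3f} reweighted={est:.4f} true={true_log_ratio:.4f}")
```

In this toy, each interior letter falls inside \(L\) overlapping windows, so at \(\alpha=1\) its log-ratio is counted up to \(L\) times and the reweighted measure is sharper than the true one. Damping the exponent reduces the effect, which is qualitatively in line with the observation that only the damped short-block setting improves the fit. The toy makes no claim about the actual magnitude seen in Test 3.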
5.4 Reading the classification letters
Each script emits a coarse self-classification. These letters are not a shared scale and not an external benchmark. They are per-test verdicts with thresholds chosen by the author, reported verbatim. Their meanings:
| Letter | In the diagnostic tests (1–2) | In the generative tests (3–4) |
|---|---|---|
| A | finite blocks reconstruct the whole-word deficit (AUC gain large and residual structure small) | — |
| B | AUC grows with the block score, but the bridge/parity structure remains | — |
| C | improvement saturates near \(B_4\) | the generator overcorrects, or is no better than the simple/damped baseline |
| D | signal too sparse or noisy to read | — |
Concretely, in the renormalization test the thresholds are: A requires AUC gain \(> 0.03\), bridge residual \(< 0.3\), and parity residual \(< 0.1\); B requires AUC gain \(> 0.005\); C applies if the \(L=6\) AUC barely exceeds the \(L=3\) AUC; otherwise D. The observed AUC gain (\(0.0280\)) clears the B threshold but falls short of the A threshold of \(0.03\), and the residuals (bridge \(1.1011\), parity \(0.3749\)) are far above the A limits; hence B, not A.
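The threshold logic for the renormalization test can be written out as a short check. This is a minimal sketch based only on the thresholds quoted above, not the script's own code; the function name `classify_renormalization` is hypothetical. Because no numeric cutoff is given for "barely exceeds", the sketch does not try to separate C from D.

```python
def classify_renormalization(auc_gain, bridge_resid, parity_resid):
    """Return 'A', 'B', or 'C/D' using the thresholds quoted in the text."""
    # A: large AUC gain and small bridge and parity residuals.
    if auc_gain > 0.03 and bridge_resid < 0.3 and parity_resid < 0.1:
        return "A"
    # B: the AUC gain is present, but the A conditions are not all met.
    if auc_gain > 0.005:
        return "B"
    # C vs D depends on whether the L=6 AUC "barely exceeds" the L=3 AUC;
    # the text gives no numeric cutoff, so the two are left unsplit here.
    return "C/D"

# Values reported in this chapter.
auc_gain = 0.5643 - 0.5363
verdict = classify_renormalization(auc_gain, bridge_resid=1.1011, parity_resid=0.3749)
print(verdict)
```

With the reported values, all three A conditions fail (the gain is below \(0.03\) and both residuals exceed their limits), while the B condition holds, so the sketch returns B.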
5.5 The residual that survives everything
The common conclusion of all four tests is the persistence of a single residual structure. What remains unexplained is specific and reproducible: a state-distribution component indexed by bridge shape and by parity. The corresponding coefficients remain large (bridge \(\approx 1.10\)–\(1.13\), parity \(\approx 0.37\)–\(0.44\)); the pattern is not explained away by adding block scores, and it is also flagged by the bridge RMSE and parity RMSE figures in Tests 3–4. We record this as the principal unresolved structure; §6 lists candidate frameworks for thinking about it, strictly as candidates.
The auxiliary Δ analysis (§4.6) adds detail on where this residual is visible. Rather than remaining merely unexplained, the residual localizes in state coordinates and in the remaining_K boundary distance, while it does not collapse onto any single prefix or transition cell. The largest \(|\Delta|\) is observed in the band remaining_K = 32–63, which marks where the discrepancy is largest, not an identified source of it. This reinforces the negative conclusion of this chapter: finite-block features diagnose the discrepancy but do not generate the whole-word measure difference.
5.6 What is closed here, and what is not
The closed claim of this paper is deliberately narrow: within the tested escape-word coordinates, the finite-vs-iid discrepancy is not reproduced by the finite-block approximations tried here. Short and medium blocks detect the discrepancy, but finite-block reweighting and finite-block maximum entropy do not generate the observed state distribution.
This is a boundary statement, not an explanation of the discrepancy or a proposal of a new Collatz mechanism: along this iid approximation route, local block corrections extend only this far. Locating the remaining structure in residue classes, inverse trees, prefix cylinders, stopping boundaries, or hidden states is left to later work; these directions are possible continuations, not consequences of the present results.