The Conway system: compression as game theory

40 Lean 4 files, ~14,500 lines
Richard Hoekstra · April 2026

The idea

Compression is a game between encoder and decoder. Conway's combinatorial game theory provides the right language: the star game * = {0|0} satisfies * + * = 0 and * · * = 0 — it is a NilSquare element. The encoder/decoder role bit lives in Z/2Z, and swapping roles is addition by *. This connects the compression problem directly to the algebraic theory of ε² = 0.

The Conway system is a layered formalization in Lean 4 that develops this idea from axioms to search algorithms. Each layer depends only on the layers below it. The full system comprises 40 files totaling ~14,500 lines.

Layer architecture

LayerFilesWhat it formalizes
1. Game coreConwayStar, ConwayCoderDuality, ConwayInformationGameStar game * + * = 0, encoder/decoder role duality via ZMod 2, separation of predictive state from emitted residual
2. Codec interfaceConwayStatefulCodec, ConwayPredictiveByteCodec, ConwayRecurrentCodecAsymmetric codec interface, forward/backward decomposition, recurrent decoder/update rules
3. Cost structuresConwayStatefulCost, ConwayPredictiveByteCost, ConwayRecurrentCostCost formalization for stateful, predictive, and recurrent codecs
4. Open systemsConwayOpenSystemMDL, ConwayOpenSystemScan, ConwayOpenGeometry, ConwayOpenConnectionOpen-system interaction step, trace, geometric structure of state space, connection formalism
5. ScatteringConwayScatteringAction, ConwayPathMeasure, ConwayBulkBoundaryScatteringScattering action functional, path integral measure over admissible histories, bulk-boundary duality
6. Byte mechanicsConwayByteMachine, ConwayByteMachineScan, ConwayByteResidualHamiltonian, ConwayByteMatterField, ConwayByteMatterFieldScan, ConwayByteFieldReversal, ConwayByteObstruction, ConwayByteBitstreamUniversal byte observations, bit-level state geometry, XOR/AND/OR transport, popcount energy, exact/harmonic/residual sector split, field reversal symmetries, obstruction theory
7. GeometryConwayExteriorDerivative, ConwayHyperbolicBulkde Rham exterior derivative on state space, Poincaré-ball hyperbolic geometry layer
8. DynamicsConwayHamiltonian, ConwayHamiltonianTraining, ConwayHamiltonianCoarse, ConwayHamiltonianScanGlobal exact-reconstruction energy functional, differentiable training variant, coarse-grained approximation
9. World modelConwayWorldModelGame, ConwayWorldModelGameCost, ConwayWorldModelGameMeasure, ConwayWorldModelGameScanMulti-role extension (environment, model, encoder, decoder, state, router), cost measures
10. SearchConwayHoarePath, ConwayGuardedMCTS, ConwaySwarmMeasure, ConwaySwarmSearch, ConwaySwarmGuardedBridgeWeighted Hoare edges, AlphaGo-style guarded tree search, ensemble swarm measurements, bridge between swarm and MCTS
11. IntegrationRepresentationalAtlas, ConwayCoordinateProblemFinite atlas layer for coordinate-relative MDL, MDL coordinate problem in Conway framework

Key structures

StarGame An additive group G with a distinguished element star : G satisfying star + star = 0. Instance: ZMod 2 with star = 1. The encoder/decoder swap is addition by star; swapping twice is the identity.
ConwayHamiltonian The energy functional H = joint description length of a representation choice. Exact reconstruction is a constraint. The family-level energy landscape over representation choices defines representational advantage as an energy gap.
GuardedMCTS Tree search where nodes are (state, assertion) pairs and children are guarded moves justified by weighted Hoare edges. Value is negative remaining action. Selection combines backed-up value, local action, and exploration prior.

What the game framing actually gives you

The genuine gain of the Conway framing over the ring framing is in Layer 1: role-swap symmetry. The encoder/decoder duality is addition by *, and swapping twice is the identity. In the ring framing you get the same thing from the flip involution on matrix factorizations, but in the game language it's native — you don't have to construct it.

The ConwayHamiltonian is a cost function with a reconstruction constraint. Calling it a "Hamiltonian" is justified if the Hamiltonian structure gives you something a plain cost function doesn't — conservation laws, symplectic structure, Noether currents. Currently, the formalization defines the energy landscape and proves that representational advantage equals the energy gap between two model families (advantage_eq_energy_gap). Whether this yields anything beyond "the cheaper representation wins" remains to be demonstrated.

What's honest and what's aspirational

Concrete and proved: Layers 1-3 (game core, codec interface, cost structures) are fully formalized with zero sorry's. The role duality theorem, the roundtrip from the PPM-C formalization, and the cost decompositions are machine-checked. The ConwayByteMachine spec is implemented as a Zig runtime that actually compresses: PPM-C D=8 + Hodge gate + skip-gram expert + range coder.
Aspirational: Layers 4-7 use physics language (scattering action, path measure, bulk-boundary, hyperbolic geometry, exterior derivative) for what are currently formal interfaces without empirical benchmarks on those specific layers. The question is whether "scattering action functional" does work that "sum over compression traces" doesn't. If ConwayPathMeasure produces a provably tighter bound than plain MDL, the name earns its keep. That proof doesn't exist yet.

GuardedMCTS: the interesting part

The most concrete algorithm in the system. Tree search where nodes are (state, assertion) pairs and children are guarded moves: each move carries a logical assertion (via weighted Hoare edges) ensuring the compression remains decodable. Selection uses PUCT over admissible edges.

A Python implementation runs MCTS over model-choice sequences: at each segment of a corpus, which compression model to use? The search recovers model sequences whose total bpb is strictly better than any single-model baseline on synthetic mixed corpora (XML + prose + dense text). If that result extends to enwik8, it's a standalone paper regardless of the Conway framing.

BGC0: the native runtime

The Conway byte machine spec is implemented in Zig as the BGC0 runtime. On enwik8 slices:

Sizebpb
20 KB3.03
200 KB2.30
2 MB2.18

Tracks the Python PPM-C compressor within ~0.25 bpb across two decades of stream length. The gap is algorithmic (PPM-C exclusion blend, not architecture). Zero parameters, online learning only.

Sorry count

Zero sorry's across all 41 Conway files. The system is fully type-checked. This means the interfaces and cost decompositions are proved, but it does not mean the physics-language layers produce empirical results beyond what plain MDL produces.

The RepresentationalAtlas (1,141 lines) defines a finite family of model costs and the profile (LF₀*(x), LF₁*(x), ...). This is coordinate-relative MDL — choosing between representations is choosing coordinates in the atlas. Whether the atlas structure provides a search advantage over enumeration is an open question.