The Kolmogorov complexity of machine-verified proofs

Paper 2 in the Hodge-Epsilon Program
Richard Hoekstra · April 2026

The pipeline

Machine-verified proofs are concrete objects. Lean 4 produces explicit proof terms that can be inspected, serialized, and measured. We build a three-stage pipeline:

1. Type erasure. Lean.Expr (the proof term) is converted to TLC, a seven-constructor ternary lambda calculus, by forgetting all type information. The resulting term captures the computational content — the "what" of the proof — without the "why."

2. DAG-CSE. The TLC tree is hash-consed into a directed acyclic graph. Structurally identical subtrees are shared. The DAG node count is our complexity measure.

3. Exact bounds. For the simplest proofs, we prove upper = lower bounds on KDAG, yielding exact Kolmogorov complexities.

The seven TLC constructors

CodeConstructorTrit costPre-CSE freqRole
0app(fn, arg)1 trit49%Application
10const(idx)2+ trits27%Global reference
11var(idx)2+ trits25%Local (de Bruijn)
12lam(body)2+ trits2%Abstraction
20let(val, body)2+ trits~0%Sharing
21case(scrut, alts)2+ trits~0%Observation
22quote(term)2+ trits~0%Reification

The Kraft sum is 1/3 + 6/9 = 1. Saturating. No prefix code over ternary can use these seven slots more efficiently.

K of the epsilon tower

TheoremRaw nodesDAG nodesSharedRatioLayer
eps_sq6511× (exact)L1
eps_nilpotent1113813L1
eps_zero_divisor3075726L1
mulEps_mulEps567533810×L2
mulEps_comp_zero1,964806024×L2
eps_pow_two1,4521005514×L1
exact_iff55311556L2
dual_mul298,0031,018527292×L1

The range is dramatic: from 5 to 1,018 DAG nodes, spanning three orders of magnitude in compression ratio.

K(eps_sq) = 5 exactly

Theorem KDAG(ε² = 0) = 5 DAG nodes exactly. Upper = lower bound, proved in Lean (K_epsSq_tight).

The proof term is lam(app(app(const 22, var 0), var 0)) — "lambda epsilon, mul(epsilon, epsilon)." The five DAG nodes are:

Node 0: const 22 (the multiplication operator) Node 1: var 0 (epsilon — shared, referenced twice) Node 2: app(0, 1) (mul applied to epsilon) Node 3: app(2, 1) (mul(epsilon) applied to epsilon — reuses Node 1) Node 4: lam(3) (the proof binder)

Upper bound: DAG size = 5 (proved: epsSqProof_dag_nodes). Lower bound: all five nodes are structurally distinct — const vs var vs app(0,1) vs app(2,1) vs lam. No further sharing is possible (proved: epsSqProof_dag_nodes_distinct).

Tower cohesion and mutual information

All eight theorems merged into a single hash-consed DAG:

MetricValue
Sum of individual DAGs1,571 nodes
Combined DAG1,409 nodes
Cross-theorem shared nodes162 nodes
Cohesion10%

The tower is 90% modular and 10% cohesive. dual_mul dominates at 1,018 of 1,571 individual DAG nodes (65%).

Mutual information

I(A; B) = K(A) + K(B) − K(A, B), computed via combined DAG:

RankPairI
1eps_pow_two ↔ eps_zero_divisor31
2eps_nilpotent ↔ eps_pow_two22
3mulEps_mulEps ↔ exact_iff19
eps_sq has ZERO mutual information with every other theorem. It is informationally atomic — it shares no proof structure with anything. The hub theorem is eps_pow_two (total I = 104), not the axiom and not the largest theorem.

K vs proof depth

DepthAvg DAG KRepresentative theorems
05eps_sq (the axiom)
181eps_nilpotent, eps_zero_divisor, eps_pow_two
2370dual_mul (outlier: ring tactic), mulEps
382mulEps compositions, exact_iff
4123derivation theorems
5271deformation theorems

Information production rate: ~29 DAG nodes per proof step. K grows sublinearly with depth (α < 1) — the signature of increasing structural reuse. The master theorem (NilSquare.lean, 3.3 nodes/line) is 6× more compact than the lemmas (Epsilon.lean, 19.3 nodes/line). Assembly is cheap. Proving is expensive.

Post-CSE constructor distribution

ConstructorPre-CSEPost-CSE
app49%94%
const27%~0%
var25%3%
lam2%3%

The leaves collapse into shared references; the wiring remains unique. Shannon entropy drops from 0.948 to 0.283 trits. Mathematics, after deduplication, is 94% application — applying things to other things.

Optimal axiom selection

MDL criterion Which single theorem, if taken as axiom, minimizes the total Kolmogorov complexity of describing the tower?
CandidateK(axiom)K(rest | A)Winner?
dual_mul1,018217YES
exact_iff1151,120
eps_pow_two1001,135
eps_sq51,230
Fundamentality ≠ simplicity. dual_mul's ring normalization proof contains 83% of the tower's shared algebraic machinery. Conditioning on it, the rest becomes nearly trivial. eps_sq is incompressible, isolated (I = 0 with all), and informationally atomic — the purest axiom but the worst foundation for reconstruction.

K = 5 universality and the gap theorem

EquationTree nodesDAG nodes
x² = c65
x³ = c87
(x²)² = c107
x² + x = c88

All three order-2 axiom systems (nilpotent ε² = 0, involution j² = 1, idempotent e² = e) share K = 5. A strict gap of 2 separates them from K ≥ 7 for anything more complex. No equation with K = 6 exists. Order 2 is the K-optimal algebraic structure.

Confluence complexity

Confluence proofKDAG
CR_idempotent273
CR_nilpotent444
CR_involution578

Nilpotent is NOT the simplest confluence proof — idempotent is. K-complexity of an axiom (all three have K = 5) does not predict K-complexity of its derived theorems. Simplicity of the axiom and simplicity of the theory are different measures.

The self-measurement tower

The measurement pipeline is itself Lean code. We measure it with its own tools:

FunctionK (trits)DAG nodesRole
exprToTLC164Type erasure (partial — opaque)
tritLength74944Trit counting
nodeCount65942Node counting
getConstValue10114Extract proof term
measure1,13283The full instrument
dagCSE23230Hash-consed compression

The fixed-point tower:

Level 0: K(ε² = 0) = 5 nodes (the axiom) Level 1: K(measure) = 83 nodes (measures Level 0) Level 2: K(meta-measure) = 23 nodes (measures Level 1) Level 3: K(meta-meta) = 25 nodes (measures Level 2)

Reflection overhead: R(0→1) = 16.6×, R(1→2) = 0.3×, R(2→3) = 1.1×. Meta-measurement is smaller than measurement — going up the tower strips overhead rather than adding it. The tower stabilizes at Level 2–3. Total reflection cost: 5 + 83 + 23 + 25 = 136 nodes = 27× the axiom.

Machine-verified compression

A formalized PPM-C compressor in Lean 4 with exact rational arithmetic. Three files, zero sorry's:

FileProves
ArithCoder.leanInterval narrowing, containment, exact width
PPMExclusion.leanEscape bounded, excluded symbols get zero mass
KolmogorovBound.leandecompressBytes(compressBytes(data)) = data

Over the rationals, interval arithmetic is exact. The roundtrip becomes a consequence of arithmetic identities, not precision analysis. Combined with the empirical 2.16 bpb on enwik8, this yields K(enwik8) ≤ 2.16|x| + O(1) — the first machine-verified Kolmogorov complexity upper bound on a nontrivial string.

The snake measures its own length. The length is 83 nodes. The price of knowing this is 136.
Full paper (HTML) PDF