Papers · Atlas · Data

The Kolmogorov complexity of machine-verified proofs

Paper 2 in the Hodge-Epsilon Program

Richard Hoekstra · April 2026

The pipeline

Machine-verified proofs are concrete objects. Lean 4 produces explicit proof terms that can be inspected, serialized, and measured. We build a three-stage pipeline:

1. Type erasure. Lean.Expr (the proof term) is converted to TLC, a seven-constructor ternary lambda calculus, by forgetting all type information. The resulting term captures the computational content — the "what" of the proof — without the "why."

2. DAG-CSE. The TLC tree is hash-consed into a directed acyclic graph. Structurally identical subtrees are shared. The DAG node count is our complexity measure.

3. Exact bounds. For the simplest proofs, we prove upper = lower bounds on K_DAG, yielding exact Kolmogorov complexities.

The seven TLC constructors

Code	Constructor	Trit cost	Pre-CSE freq	Role
0	app(fn, arg)	1 trit	49%	Application
10	const(idx)	2+ trits	27%	Global reference
11	var(idx)	2+ trits	25%	Local (de Bruijn)
12	lam(body)	2+ trits	2%	Abstraction
20	let(val, body)	2+ trits	~0%	Sharing
21	case(scrut, alts)	2+ trits	~0%	Observation
22	quote(term)	2+ trits	~0%	Reification

The Kraft sum is 1/3 + 6/9 = 1. Saturating. No prefix code over ternary can use these seven slots more efficiently.

K of the epsilon tower

Theorem	Raw nodes	DAG nodes	Shared	Ratio	Layer
eps_sq	6	5	1	1× (exact)	L1
eps_nilpotent	111	38	13	2×	L1
eps_zero_divisor	307	57	26	5×	L1
mulEps_mulEps	567	53	38	10×	L2
mulEps_comp_zero	1,964	80	60	24×	L2
eps_pow_two	1,452	100	55	14×	L1
exact_iff	553	115	56	4×	L2
dual_mul	298,003	1,018	527	292×	L1

The range is dramatic: from 5 to 1,018 DAG nodes, spanning three orders of magnitude in compression ratio.

K(eps_sq) = 5 exactly

Theorem K_DAG(ε² = 0) = 5 DAG nodes exactly. Upper = lower bound, proved in Lean (K_epsSq_tight).

The proof term is lam(app(app(const 22, var 0), var 0)) — "lambda epsilon, mul(epsilon, epsilon)." The five DAG nodes are:

Node 0: const 22 (the multiplication operator) Node 1: var 0 (epsilon — shared, referenced twice) Node 2: app(0, 1) (mul applied to epsilon) Node 3: app(2, 1) (mul(epsilon) applied to epsilon — reuses Node 1) Node 4: lam(3) (the proof binder)

Upper bound: DAG size = 5 (proved: epsSqProof_dag_nodes). Lower bound: all five nodes are structurally distinct — const vs var vs app(0,1) vs app(2,1) vs lam. No further sharing is possible (proved: epsSqProof_dag_nodes_distinct).

Tower cohesion and mutual information

All eight theorems merged into a single hash-consed DAG:

Metric	Value
Sum of individual DAGs	1,571 nodes
Combined DAG	1,409 nodes
Cross-theorem shared nodes	162 nodes
Cohesion	10%

The tower is 90% modular and 10% cohesive. dual_mul dominates at 1,018 of 1,571 individual DAG nodes (65%).

Mutual information

I(A; B) = K(A) + K(B) − K(A, B), computed via combined DAG:

Rank	Pair	I
1	eps_pow_two ↔ eps_zero_divisor	31
2	eps_nilpotent ↔ eps_pow_two	22
3	mulEps_mulEps ↔ exact_iff	19

eps_sq has ZERO mutual information with every other theorem. It is informationally atomic — it shares no proof structure with anything. The hub theorem is eps_pow_two (total I = 104), not the axiom and not the largest theorem.

K vs proof depth

Depth	Avg DAG K	Representative theorems
0	5	eps_sq (the axiom)
1	81	eps_nilpotent, eps_zero_divisor, eps_pow_two
2	370	dual_mul (outlier: ring tactic), mulEps
3	82	mulEps compositions, exact_iff
4	123	derivation theorems
5	271	deformation theorems

Information production rate: ~29 DAG nodes per proof step. K grows sublinearly with depth (α < 1) — the signature of increasing structural reuse. The master theorem (NilSquare.lean, 3.3 nodes/line) is 6× more compact than the lemmas (Epsilon.lean, 19.3 nodes/line). Assembly is cheap. Proving is expensive.

Post-CSE constructor distribution

Constructor	Pre-CSE	Post-CSE
app	49%	94%
const	27%	~0%
var	25%	3%
lam	2%	3%

The leaves collapse into shared references; the wiring remains unique. Shannon entropy drops from 0.948 to 0.283 trits. Mathematics, after deduplication, is 94% application — applying things to other things.

Optimal axiom selection

MDL criterion Which single theorem, if taken as axiom, minimizes the total Kolmogorov complexity of describing the tower?

Candidate	K(axiom)	K(rest \| A)	Winner?
dual_mul	1,018	217	YES
exact_iff	115	1,120
eps_pow_two	100	1,135
eps_sq	5	1,230

Fundamentality ≠ simplicity. dual_mul's ring normalization proof contains 83% of the tower's shared algebraic machinery. Conditioning on it, the rest becomes nearly trivial. eps_sq is incompressible, isolated (I = 0 with all), and informationally atomic — the purest axiom but the worst foundation for reconstruction.

K = 5 universality and the gap theorem

Equation	Tree nodes	DAG nodes
x² = c	6	5
x³ = c	8	7
(x²)² = c	10	7
x² + x = c	8	8

All three order-2 axiom systems (nilpotent ε² = 0, involution j² = 1, idempotent e² = e) share K = 5. A strict gap of 2 separates them from K ≥ 7 for anything more complex. No equation with K = 6 exists. Order 2 is the K-optimal algebraic structure.

Confluence complexity

Confluence proof	K_DAG
CR_idempotent	273
CR_nilpotent	444
CR_involution	578

Nilpotent is NOT the simplest confluence proof — idempotent is. K-complexity of an axiom (all three have K = 5) does not predict K-complexity of its derived theorems. Simplicity of the axiom and simplicity of the theory are different measures.

The self-measurement tower

The measurement pipeline is itself Lean code. We measure it with its own tools:

Function	K (trits)	DAG nodes	Role
exprToTLC	16	4	Type erasure (partial — opaque)
tritLength	749	44	Trit counting
nodeCount	659	42	Node counting
getConstValue	101	14	Extract proof term
measure	1,132	83	The full instrument
dagCSE	232	30	Hash-consed compression

The fixed-point tower:

Level 0: K(ε² = 0) = 5 nodes (the axiom) Level 1: K(measure) = 83 nodes (measures Level 0) Level 2: K(meta-measure) = 23 nodes (measures Level 1) Level 3: K(meta-meta) = 25 nodes (measures Level 2)

Reflection overhead: R(0→1) = 16.6×, R(1→2) = 0.3×, R(2→3) = 1.1×. Meta-measurement is smaller than measurement — going up the tower strips overhead rather than adding it. The tower stabilizes at Level 2–3. Total reflection cost: 5 + 83 + 23 + 25 = 136 nodes = 27× the axiom.

Machine-verified compression

A formalized PPM-C compressor in Lean 4 with exact rational arithmetic. Three files, zero sorry's:

File	Proves
ArithCoder.lean	Interval narrowing, containment, exact width
PPMExclusion.lean	Escape bounded, excluded symbols get zero mass
KolmogorovBound.lean	`decompressBytes(compressBytes(data)) = data`

Over the rationals, interval arithmetic is exact. The roundtrip becomes a consequence of arithmetic identities, not precision analysis. Combined with the empirical 2.16 bpb on enwik8, this yields K(enwik8) ≤ 2.16|x| + O(1) — the first machine-verified Kolmogorov complexity upper bound on a nontrivial string.

The snake measures its own length. The length is 83 nodes. The price of knowing this is 136.

Full paper (HTML) PDF