Machine-verified proofs are concrete objects. Lean 4 produces explicit proof terms that can be inspected, serialized, and measured. We build a three-stage pipeline:
1. Type erasure. Lean.Expr (the proof term) is converted to TLC, a seven-constructor ternary lambda calculus, by forgetting all type information. The resulting term captures the computational content — the "what" of the proof — without the "why."
2. DAG-CSE. The TLC tree is hash-consed into a directed acyclic graph. Structurally identical subtrees are shared. The DAG node count is our complexity measure.
3. Exact bounds. For the simplest proofs, we prove upper = lower bounds on KDAG, yielding exact Kolmogorov complexities.
| Code | Constructor | Trit cost | Pre-CSE freq | Role |
|---|---|---|---|---|
| 0 | app(fn, arg) | 1 trit | 49% | Application |
| 10 | const(idx) | 2+ trits | 27% | Global reference |
| 11 | var(idx) | 2+ trits | 25% | Local (de Bruijn) |
| 12 | lam(body) | 2+ trits | 2% | Abstraction |
| 20 | let(val, body) | 2+ trits | ~0% | Sharing |
| 21 | case(scrut, alts) | 2+ trits | ~0% | Observation |
| 22 | quote(term) | 2+ trits | ~0% | Reification |
The Kraft sum is 1/3 + 6/9 = 1. Saturating. No prefix code over ternary can use these seven slots more efficiently.
| Theorem | Raw nodes | DAG nodes | Shared | Ratio | Layer |
|---|---|---|---|---|---|
| eps_sq | 6 | 5 | 1 | 1× (exact) | L1 |
| eps_nilpotent | 111 | 38 | 13 | 2× | L1 |
| eps_zero_divisor | 307 | 57 | 26 | 5× | L1 |
| mulEps_mulEps | 567 | 53 | 38 | 10× | L2 |
| mulEps_comp_zero | 1,964 | 80 | 60 | 24× | L2 |
| eps_pow_two | 1,452 | 100 | 55 | 14× | L1 |
| exact_iff | 553 | 115 | 56 | 4× | L2 |
| dual_mul | 298,003 | 1,018 | 527 | 292× | L1 |
The range is dramatic: from 5 to 1,018 DAG nodes, spanning three orders of magnitude in compression ratio.
K_epsSq_tight).
The proof term is lam(app(app(const 22, var 0), var 0)) — "lambda epsilon, mul(epsilon, epsilon)." The five DAG nodes are:
Upper bound: DAG size = 5 (proved: epsSqProof_dag_nodes). Lower bound: all five nodes are structurally distinct — const vs var vs app(0,1) vs app(2,1) vs lam. No further sharing is possible (proved: epsSqProof_dag_nodes_distinct).
All eight theorems merged into a single hash-consed DAG:
| Metric | Value |
|---|---|
| Sum of individual DAGs | 1,571 nodes |
| Combined DAG | 1,409 nodes |
| Cross-theorem shared nodes | 162 nodes |
| Cohesion | 10% |
The tower is 90% modular and 10% cohesive. dual_mul dominates at 1,018 of 1,571 individual DAG nodes (65%).
I(A; B) = K(A) + K(B) − K(A, B), computed via combined DAG:
| Rank | Pair | I |
|---|---|---|
| 1 | eps_pow_two ↔ eps_zero_divisor | 31 |
| 2 | eps_nilpotent ↔ eps_pow_two | 22 |
| 3 | mulEps_mulEps ↔ exact_iff | 19 |
| Depth | Avg DAG K | Representative theorems |
|---|---|---|
| 0 | 5 | eps_sq (the axiom) |
| 1 | 81 | eps_nilpotent, eps_zero_divisor, eps_pow_two |
| 2 | 370 | dual_mul (outlier: ring tactic), mulEps |
| 3 | 82 | mulEps compositions, exact_iff |
| 4 | 123 | derivation theorems |
| 5 | 271 | deformation theorems |
Information production rate: ~29 DAG nodes per proof step. K grows sublinearly with depth (α < 1) — the signature of increasing structural reuse. The master theorem (NilSquare.lean, 3.3 nodes/line) is 6× more compact than the lemmas (Epsilon.lean, 19.3 nodes/line). Assembly is cheap. Proving is expensive.
| Constructor | Pre-CSE | Post-CSE |
|---|---|---|
| app | 49% | 94% |
| const | 27% | ~0% |
| var | 25% | 3% |
| lam | 2% | 3% |
The leaves collapse into shared references; the wiring remains unique. Shannon entropy drops from 0.948 to 0.283 trits. Mathematics, after deduplication, is 94% application — applying things to other things.
| Candidate | K(axiom) | K(rest | A) | Winner? |
|---|---|---|---|
| dual_mul | 1,018 | 217 | YES |
| exact_iff | 115 | 1,120 | |
| eps_pow_two | 100 | 1,135 | |
| eps_sq | 5 | 1,230 |
dual_mul's ring normalization proof contains 83% of the tower's shared algebraic machinery. Conditioning on it, the rest becomes nearly trivial. eps_sq is incompressible, isolated (I = 0 with all), and informationally atomic — the purest axiom but the worst foundation for reconstruction.
| Equation | Tree nodes | DAG nodes |
|---|---|---|
| x² = c | 6 | 5 |
| x³ = c | 8 | 7 |
| (x²)² = c | 10 | 7 |
| x² + x = c | 8 | 8 |
All three order-2 axiom systems (nilpotent ε² = 0, involution j² = 1, idempotent e² = e) share K = 5. A strict gap of 2 separates them from K ≥ 7 for anything more complex. No equation with K = 6 exists. Order 2 is the K-optimal algebraic structure.
| Confluence proof | KDAG |
|---|---|
| CR_idempotent | 273 |
| CR_nilpotent | 444 |
| CR_involution | 578 |
Nilpotent is NOT the simplest confluence proof — idempotent is. K-complexity of an axiom (all three have K = 5) does not predict K-complexity of its derived theorems. Simplicity of the axiom and simplicity of the theory are different measures.
The measurement pipeline is itself Lean code. We measure it with its own tools:
| Function | K (trits) | DAG nodes | Role |
|---|---|---|---|
| exprToTLC | 16 | 4 | Type erasure (partial — opaque) |
| tritLength | 749 | 44 | Trit counting |
| nodeCount | 659 | 42 | Node counting |
| getConstValue | 101 | 14 | Extract proof term |
| measure | 1,132 | 83 | The full instrument |
| dagCSE | 232 | 30 | Hash-consed compression |
The fixed-point tower:
Reflection overhead: R(0→1) = 16.6×, R(1→2) = 0.3×, R(2→3) = 1.1×. Meta-measurement is smaller than measurement — going up the tower strips overhead rather than adding it. The tower stabilizes at Level 2–3. Total reflection cost: 5 + 83 + 23 + 25 = 136 nodes = 27× the axiom.
A formalized PPM-C compressor in Lean 4 with exact rational arithmetic. Three files, zero sorry's:
| File | Proves |
|---|---|
| ArithCoder.lean | Interval narrowing, containment, exact width |
| PPMExclusion.lean | Escape bounded, excluded symbols get zero mass |
| KolmogorovBound.lean | decompressBytes(compressBytes(data)) = data |
Over the rationals, interval arithmetic is exact. The roundtrip becomes a consequence of arithmetic identities, not precision analysis. Combined with the empirical 2.16 bpb on enwik8, this yields K(enwik8) ≤ 2.16|x| + O(1) — the first machine-verified Kolmogorov complexity upper bound on a nontrivial string.