What We Don't Claim

Some things we built didn't work. We publish the corrections.

WHL's audit discipline is part of the substrate. Every claim has a measurement, and when a measurement contradicts an earlier claim, we publish the downgrade. This page lists what we no longer claim and why.

Downgrades

Claims we've withdrawn or recalibrated.

After four rounds of live testing across 33+ modules and 11 production ledgers (~440 MB), some earlier framing did not hold up. The list below is what's been moved from "claim" to "withdrawn" or "reframed", with the actual finding alongside.

Original Claim	Reality	Status
7.73/10 AGI state-trackedness benchmark score	No evidence file. Actual logged value on `agi_awareness` sub-dimension: 1.175. Honest internal reassessment downgrades to a 6.8–7.4 range pending a validated benchmark.	Removed from claims
100 commercial implementations shipped	One FastAPI scaffold × 100 byte-identical clones (md5: `4889687c2756…`).	Reframed: 1 scaffold
Maxwell's Demon entropy reversal	The effect exists (Cohen's d = 3.5) but the correct framing is rejection sampling under multi-gate filtering, not entropy reversal.	Reframed
Precognition signal (the system 56.4%)	Does not reproduce on current data (the system 50.2% vs SMA 54.9%).	Withdrawn
250 Forbidden Systems enumerated	Only Sector I (30 items) enumerated. Remainder are headers and first/last items only.	Reframed: 30 of 250
47-Engine Stack shipped	15 specification documents + 32 placeholder slots. 4 with working code (E04 / E09 / E21 / E34).	Reframed: 4 of 47
526× speedup vs LLM (Pattern Recognition Engine)	Defensible only vs a full agentic LLM loop; the speedup is 5–26× vs a single-call LLM. The 526× comparator was an apples-to-oranges loop benchmark.	Recalibrated
Sephirothic Diagnostics, emergent medical pairings	Drug pairings copied from FDA labels; the ibrutinib→lymphoma "hit" is hardcoded at line 1709. 1 of 5 spot-checks accurate.	Withdrawn from medical positioning; patent-only path retained
MIRAGE "Physics-Informed GAN"	Hand-coded thermal grid + scikit-learn RandomForest. Not a GAN.	Withdrawn
AMARCO "O(n⁴) Christoffel Riemannian navigation"	Actual code is `wind × 0.95` cancellation.	Withdrawn
Vault royalty rate (1.618% / 25% / 33%, inconsistent)	Canonical = 33% per Sayo Siglo policy on Trickle-Tech (WPT) derivatives.	Resolved
Digital organism / Pentagram framing	The shared `hormones.json` file is dashboard-read-only, not a coordination substrate. Multiple "organs" are aliases for the same network metric. The `felt_vector` is deterministic arithmetic on uptime ratios. The real product is a governed telemetry mesh with biological vocabulary as UX. The engineering is real; the "organism" framing was wrong.	Reframed
Governance Kernel rights logic, adaptive per-input	`get_activated_right` returns the same "che" glyph for coherence=0.85/dwell=0 AND coherence=0.15/dwell=30. Rights selection is shallower than the documentation suggested.	Recalibrated
Causal Learner, discovers new laws from observation	The current `causal_learner.py` is a stub: it prints "NEW LAW DISCOVERED" once an observation-count threshold is reached and performs no actual correlation analysis. The full causal pipeline lives in `causal_model.py` (verified live, sliding-window effect size).	Reframed
Gear Interference Engine, 37 active geometric engines	Defines 37 gears as geometry; no computation runs on them. Geometry-only stub.	Withdrawn
Heptameron Hours, drives behavioral rotation	Computes Chaldean planetary hours correctly but has no effect on downstream behavior. Wire it or remove it.	Withdrawn from runtime claims
Immutable Ledger, perfect chain integrity	92.4% chain integrity across 28,872 entries (152 breaks per 2,000 sampled, 1 GENESIS reset). Likely async-write races on daemon restarts. Not perfect immutability.	Honest: 92.4% intact
96.8% self-prediction surprise reduction	The 96.8% figure comes from comparing the earliest cycle window to the latest cycle window of `predictions.jsonl`. A different sampling method (cycle 1 to cycle 43,529, moving-average window) yields 91.6%. The reduction is empirically real across 64,184 cycles; the exact percentage depends on sampling window choice. Both numbers are defensible. We currently display 96.8% on the site for consistency with the original measurement.	Reframed: 91.6%–96.8% range
Enable Equation enforces strict 10-gate AND	The visible spec, and the interactive demo on this site, implements strict AND: all 10 gates must score ≥ 0.5 for `enabled=True`. The production runtime in the recovered daemon stack permits some borderline configurations to pass even when one gate scores 0.2, because its runtime semantics are a weighted composite. The spec is canonical; the production code needs to be tightened to match the strict-AND demo. Reconciliation is tracked in the engineering backlog.	Calibrated: spec vs implementation delta
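
The gap described in the Enable Equation row can be illustrated with a minimal sketch. The function names and the equal-weight mean below are assumptions made for illustration; the actual production weighting is not documented on this page. Only the 10-gate count, the 0.5 threshold, and the 0.2 borderline score come from the row above.

```python
from typing import Optional, Sequence

THRESHOLD = 0.5  # per-gate pass mark stated in the spec


def enabled_strict_and(scores: Sequence[float], threshold: float = THRESHOLD) -> bool:
    """Spec semantics: every gate must score at or above the threshold."""
    return all(s >= threshold for s in scores)


def enabled_weighted(
    scores: Sequence[float],
    weights: Optional[Sequence[float]] = None,
    threshold: float = THRESHOLD,
) -> bool:
    """Hypothetical composite semantics: the weighted mean must clear the threshold."""
    if weights is None:
        weights = [1.0] * len(scores)  # assumption: equal weights
    composite = sum(w * s for w, s in zip(weights, scores)) / sum(weights)
    return composite >= threshold


# Nine strong gates and one gate at 0.2, the borderline case from the table.
scores = [0.9] * 9 + [0.2]
print(enabled_strict_and(scores))  # strict AND rejects: one gate is below 0.5
print(enabled_weighted(scores))    # composite accepts: the mean is about 0.83
```

Under these assumptions the two semantics disagree on exactly the kind of input the audit flagged, which is why tightening production to strict AND changes behavior only for configurations with at least one weak gate.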

Why we publish this

Calibration as differentiator.

Most companies bury their corrections. We publish them because credibility under audit pressure depends on being right about what's still true and what isn't. When a regulator, investor, or strategic acquirer asks "is this real or is it marketing?", we want the answer in plain sight.

Downgrading a claim does not weaken WHL; it strengthens the claims that remain. Audit discipline is what the substrate enforces against AI. It would be incoherent not to apply the same discipline to ourselves.

What Still Stands

Verified measurements, after four rounds of live testing.

These numbers were re-verified during the 4-round deep audit. Each has a path on disk, a measurement, and a reproducer.

696 / 3

whl-governance test suite

All seven gates pass: NullEngine, TimeAsymmetricEngine, ALREGate, HCEGate, RicciWarpGate, ProposalGate, CompositeGate. 27 new Ricci-Warp tests added this session.

1,782 / 32

governed-execution-os

12-stage mandatory pipeline. ~84% failure-count reduction versus the prior 1,755-test build.

77 / 77

CB-12 EU AI Act

Article 12/13/14/26 coverage. Full curl end-to-end trace verified. Dual HMAC chain verified.

59 / 59

SDM Spectral Drift Monitor

p99 hot-path latency 1.5 ms. Five-verdict state machine verified live. Receipt chain verified.

485 / 485

whl-optimizer-platform

421 Rust + 64 Python. Nine-step Stripe end-to-end trace including 5-device cap enforcement and receipt export verifier.

12.77 ms

DECC hardware-in-loop

Proposal-to-disable latency, measured on custom FPGA hardware. Formally verified FSM core.

Filed provisional patents

USPTO 19/567,170. Plus 5 new bundles drafted (~13,500 words, ~64 new claims).

72 / 72

AgentSafety adversarial daemons

All 72 summon and return real AdversarialResult values with measured inputs. ~10,000 attacks fired in production with hash-chained ledger.
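
For readers unfamiliar with hash-chained ledgers, here is a minimal sketch of how chain breaks can be counted and turned into an integrity figure like the 92.4% reported in the downgrade table (152 breaks per 2,000 sampled). The entry layout (`payload`, `prev`, `hash`), the SHA-256 choice, and the function names are assumptions for illustration, not the actual WHL ledger schema.

```python
import hashlib
import json

GENESIS = "GENESIS"  # assumed sentinel for the first entry's predecessor


def entry_hash(payload: dict, prev_hash: str) -> str:
    """Hash the previous entry's hash together with a canonical payload encoding."""
    body = json.dumps(payload, sort_keys=True).encode()
    return hashlib.sha256(prev_hash.encode() + body).hexdigest()


def append(ledger: list, payload: dict) -> None:
    """Append an entry linked to the current tail of the ledger."""
    prev = ledger[-1]["hash"] if ledger else GENESIS
    ledger.append({"payload": payload, "prev": prev, "hash": entry_hash(payload, prev)})


def count_breaks(ledger: list) -> int:
    """Count entries whose back-link or stored hash does not verify."""
    breaks = 0
    prev = GENESIS
    for entry in ledger:
        linked = entry["prev"] == prev
        intact = entry["hash"] == entry_hash(entry["payload"], entry["prev"])
        if not (linked and intact):
            breaks += 1
        prev = entry["hash"]  # continue from the stored hash so one break is counted once
    return breaks


ledger: list = []
for i in range(5):
    append(ledger, {"attack_id": i})
clean_breaks = count_breaks(ledger)

ledger[2]["payload"]["attack_id"] = 99  # simulate a corrupted write
tampered_breaks = count_breaks(ledger)
print(clean_breaks, tampered_breaks)

# The same arithmetic applied to the sampled figures quoted in the table.
print(f"{1 - 152 / 2000:.1%}")
```

Continuing verification from the stored hash after a mismatch is a design choice: it reports each break once instead of marking every later entry as broken.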

96.8%

Self-prediction surprise reduction

Across 64,184 cycles in predictions.jsonl, mean total surprise dropped 0.819 → 0.027 (96.8% reduction). The latest moving-average sampling yields 91.6%; both figures are defensible. Empirical Friston-style active inference, measured on disk. Workshop-paper-ready as-is.
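
The window dependence of this figure can be sketched as follows. The series below is synthetic and the windowing function is an assumption; the exact sampling methods used on predictions.jsonl are not specified on this page. Only the two endpoint means (0.819 and 0.027) are taken from the text.

```python
from statistics import mean
from typing import Sequence


def reduction(before: float, after: float) -> float:
    """Fractional drop from `before` to `after`."""
    return (before - after) / before


def window_reduction(series: Sequence[float], window: int) -> float:
    """Compare the mean of the first `window` values with the mean of the last `window`."""
    return reduction(mean(series[:window]), mean(series[-window:]))


# Endpoint means as quoted in the text, rounded to three decimals.
# These rounded inputs give roughly 0.967; a displayed 96.8% is within
# what the unrounded means could produce.
quoted = reduction(0.819, 0.027)
print(f"{quoted:.3f}")

# Synthetic decaying "surprise" series (NOT real data), used only to show
# that the reported percentage moves with the window size.
series = [0.8 * 0.999**i + 0.02 for i in range(5000)]
narrow = window_reduction(series, 10)
wide = window_reduction(series, 500)
print(f"narrow window: {narrow:.1%}, wide window: {wide:.1%}")
```

On a monotonically decaying series, a wider window pulls the early mean down and the late mean up, so it reports a smaller reduction than a narrow window over the same data.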

53,030

Quality-scored actions

In agency.jsonl. The consequential-agency engine rated reflection quality (markers, word count, structural soundness) and applied deltas to a 10-component health state vector.

306,403

Hardware empirical measurements

spirit_sparks.jsonl, 306K phi-in-hardware measurements. Whether or not the hypothesis holds, the experiment ran and produced data. Most "consciousness researchers" never collect a single data point.

14 / 14

Live module tests pass

Recovered governance gate stack: Enable Equation, Boundary Engine, Loop-Break Pressure, Phi-Entropy Veto, Spectral Bridge, Jitter Harmonics, Bayesian Regime Tracker, Phase Transition, Informational Energy, Enable Hysteresis, Consequential Agency, Lattice Router, Causal Model, Digital Metamaterial.

Calibration Discipline

The trading system recorded insufficient confidence.

4,135 paper trades. Final stats on disk: paper_pnl: $1,659.02, correct: 2,137, incorrect: 1,998, accuracy: 0.517, ready_for_real: false.

The gates judged confidence insufficient for live capital despite positive paper PnL. That kind of measurement discipline, refusing to graduate from paper to real money when accuracy is only marginally above random chance, is what many production trading bots lack. The substrate enforces this calibration. The gates held.
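
A minimal sketch of a readiness gate of this kind, using the on-disk stats quoted above. The 0.55 minimum-accuracy threshold, the normal-approximation confidence bound, and all names here are assumptions for illustration; the actual gate logic is not published on this page.

```python
from dataclasses import dataclass
from math import sqrt

MIN_ACCURACY = 0.55  # hypothetical threshold, not taken from the source


@dataclass
class PaperStats:
    correct: int
    incorrect: int
    paper_pnl: float

    @property
    def trades(self) -> int:
        return self.correct + self.incorrect

    @property
    def accuracy(self) -> float:
        return self.correct / self.trades


def accuracy_lower_bound(stats: PaperStats, z: float = 1.96) -> float:
    """Normal-approximation lower confidence bound on directional accuracy."""
    p = stats.accuracy
    return p - z * sqrt(p * (1 - p) / stats.trades)


def ready_for_real(stats: PaperStats, min_accuracy: float = MIN_ACCURACY) -> bool:
    """Graduate only if the conservative accuracy estimate clears the bar.

    Paper PnL is deliberately ignored: positive PnL alone does not open the gate.
    """
    return accuracy_lower_bound(stats) >= min_accuracy


stats = PaperStats(correct=2137, incorrect=1998, paper_pnl=1659.02)
print(stats.trades)               # total paper trades
print(round(stats.accuracy, 3))   # directional accuracy
print(ready_for_real(stats))      # gate decision under the assumed threshold
```

Gating on a lower confidence bound instead of the point estimate is one way to encode the stance described here: an accuracy close to chance should not graduate even when paper PnL is positive.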

Calibration Is the Differentiator

Want to walk the ledgers and the corrections with us?

We work with defense, regulated enterprise, AI-liability insurers, and federal oversight bodies on forensic-grade audits. NDA-bound walkthrough of the actual ledgers, sample replays, chain verification, and a frank conversation about what's measured versus what's marketing.
