Half a Falsification — The Windstorm Institute

This article is a plain-language companion to Paper 13 — the Lattice QFT Test. The technical version is at github.com/Windstorm-Institute/lattice-qft-test. It’s the fourth paper in the Institute’s Track 2, and a direct supplement to Paper 11 (the Gravitational Entropy Escrow framework).

The claim being tested

Paper 11 proposes that gravity isn’t fundamentally a force — it’s the universe’s collection agency for an entropy debt held in escrow. The mathematical statement of that claim is one short equation:

S_esc = |U_grav| ÷ T_Unruh

In English: the entropy held in escrow against gravitational binding equals the binding energy divided by a temperature called the Unruh temperature. Plug in two equal point masses separated by a distance, and the equation simplifies to S_esc = 2π·m·L. (In natural units. Physicists love natural units.)

This is not a metaphor in Paper 11. It’s an equation. The whole framework rests on this one identity being right. Newton’s law follows from it. Black hole entropy follows from it. The deep-MOND acceleration scale follows from it. If this equation isn’t a real statement about the world, the framework collapses.

So the natural question is: is it a real statement about the world?

Specifically: if you take a quantum field that lives in space, put two masses in it, and compute the entanglement entropy of that field directly — using the standard machinery of quantum field theory, with no fudging — do you get S_esc?

That’s the test Paper 13 ran.

How you actually do this on a computer

You can’t put a quantum field on a computer literally. Continuous space has infinitely many points; computers have finitely many memory cells. So you do what physicists have done since the 1970s: you cheat geometrically. You replace continuous space with a regular grid of discrete sites — a lattice — and you put a copy of the quantum field at each site, with the sites talking to their neighbors.

This is called lattice quantum field theory, and it is one of the workhorses of theoretical physics. It’s how we compute the mass of the proton from the underlying quark theory. It’s how we test gauge theories that have no analytic solution. The arithmetic is well-trodden; what’s being tested is whether the postulate survives that arithmetic, not whether the arithmetic itself works.

For Paper 13, the field was a free scalar field — the simplest one, the “hydrogen atom” of QFT — and we ran two scans:

1+1D — one space dimension and one time dimension. Lattice sizes up to N = 3000 sites. A 295-point scan over (lattice size, mass separation, mass strength).
3+1D — three space dimensions and one time. Cubic lattices up to N³ = 8000 sites (N up to 20). With a CPU implementation cross-checked against a separate CUDA GPU implementation on current-generation Nvidia hardware. The two implementations agreed to better than two parts in 100,000.

And we computed three different entropy quantities at every grid point:

Bipartition entanglement entropy — cut space in half, measure how much information the left half “knows” about the right half through quantum entanglement.
Mutual information — how much information one region surrounding one mass shares with another region surrounding the other mass.
Modular Hamiltonian content — a more sophisticated measure related to the “temperature” an observer in the left half would see for the field, evaluated under a sixty-year-old conjecture by Bisognano and Wichmann.

If Paper 11’s static identity is a literal statement about quantum field theory, at least one of these three measures should match S_esc up to an order-unity prefactor. That’s the bar.

What happened

Bipartition entropy: ruled out, decisively

The 1+1D bipartition entropy and the postulate’s prediction disagreed by 56 orders of magnitude across the parameter grid — more precisely, the dimensionless ratio between them spanned 56 powers of ten as we varied the masses and separations. If the postulate were a real identity, the ratio should be approximately constant (of order unity) across the grid. Instead it varied by factors of 10⁵⁶. There is no slope of order one to be saved here.

In 3+1D, things were more interesting and worse. The bipartition entropy was dominated by an enormous “area-law” term that has been known about for fifty years — basically, putting a mass somewhere makes a small change to a much larger background entropy that scales with the size of the boundary surface, not with the mass. So we measured the change — the mass-induced part — against the postulate. The mass-induced ratio came out bounded by 10⁻³ across all grid points. Three orders of magnitude small, not order unity.

And the mutual information — the second independent measure — decayed as L⁻⁴ with mass separation. The postulate predicts linear growth. The directions disagree.

That settles the literal bipartition-entropy reading. Whatever S_esc means, it isn’t the entanglement entropy of the field in the presence of these masses. Two independent measures (bipartition entropy and mutual information) say so, in two independent dimensions, with two independent code paths.

The interesting thing happened to the third measure

The modular Hamiltonian content — the third entropy measure, the one tied to the Bisognano-Wichmann conjecture — gave a more nuanced answer.

Bisognano and Wichmann showed in the 1970s that for a free quantum field, an observer who only has access to half of space sees the field as if it were at a temperature that depends on how far each point is from the boundary. Specifically, the contribution of a point at perpendicular distance d₁ from the boundary should grow linearly with d₁. That linear growth is the “BW asymptote”: ΔK ∝ d₁.

The static escrow postulate, when you stare at it, predicts the same kind of linear growth in d₁, with the prefactor being 2π·m. So if the modular reading is real, you should see a clean line of slope 2π·m on a plot of ΔK against d₁.

What you actually see is a line, with the right slope, in a small window. And then it bends.

The 1+1D result, in plain English

For small d₁ — specifically, the window d₁ = 2 to 6 lattice sites at unit mass — the local exponent of ΔK(d₁) is +1.02. That’s indistinguishable from the Bisognano-Wichmann prediction of exactly 1. The structural form of the postulate’s prediction holds.

But the prefactor is wrong. Specifically, the lattice answer is about 1/30 the size of what the literal reading of the postulate would predict.

And as you go to larger d₁, the local exponent decreases smoothly, sliding from +1.0 down through +0.5 toward zero. The line bends into a sublinear tail. The bending isn’t the postulate failing — it’s real lattice physics. But the linear regime, where the postulate’s prediction shape is right, lives only inside a small-distance window.

And then there’s a confession to make about the previous version of this paper

The previously-published version (v0.5) of this paper reported “ΔK ∝ L^0.7 sublinear scaling” as the headline 1+1D modular result. That number — 0.7 — turned out to be an artifact of fitting a single power law across data that doesn’t obey a single power law. The actual local exponent slides from 1 to 0.5 to nearly zero across the range tested. A single fit averages all that and lands somewhere in the middle.

v0.7 corrects this by reporting the local exponent in different windows, with the explicit statement that the function is not a single power law. The 0.7 was a fitting artifact across a smooth crossover. The corrected story is BW linear at small d₁ with prefactor approximately 1/30; smooth crossover; tail decay exponent approximately 0.5. That’s what the data say.

This is the kind of correction we owe to people who try to build on published work. Better that the v0.7 says “the v0.5 framing was misleading in this specific way, here is the corrected version” than that we leave the bad framing standing and let other researchers cite it as gospel.

The 3+1D result is grimmer

The companion paper extends the modular-Hamiltonian test to 3+1D, where physical reality lives. There the BW asymptote is not recovered within the distance range the lattice can resolve. Modular content peaks at d₁ ≈ 2 lattice sites and decays roughly as d₁⁻² to d₁⁻³ for larger d₁. That’s the wrong sign of behavior for the postulate — growth, not decay, is what BW predicts.

This could mean BW recovery in 3+1D requires distances much larger than current lattices can reach. Or it could mean the BW connection is genuinely 1+1D-only. The honest framing is that we don’t know, but the rate of approach to the continuum BW asymptote is substantially slower in 3+1D than in 1+1D, and at the scales lattice computers can simulate today, you don’t see BW recovery in 3+1D at all.

What this means for Paper 11

Paper 11’s framework had four kinds of empirical content. Let me sort them by what these tests do and don’t affect.

Unaffected by these tests

The horizon thermodynamics. Paper 11 recovers the Bekenstein-Hawking entropy of black holes and the Gibbons-Hawking entropy of the de Sitter universe. Those derivations go through the surface-gravity Unruh temperature, which is a different physical regime from the flat-space free-QFT tests Paper 13 ran. The lattice tests cannot, and do not, falsify the horizon-thermodynamics piece of the framework.

The empirical successes. Paper 11’s headline phenomenological result — that the deep-MOND acceleration scale a₀ falls out of the framework as a constant set by the cosmological constant Λ, matching observations of galaxy rotation curves at high redshift — is also independent of these flat-space tests. So is the Genzel five-case test result that ruled out evolving forms of a₀ at >0.10 dex. That empirical content stands.

Falsified by these tests

The literal bipartition-entropy reading of the static identity. If you read Paper 11 as claiming “S_esc equals the bipartition entanglement entropy of the field with two masses inserted,” that reading is dead. Any future writing about the framework should not phrase the identity that way.

Partially survives

The modular-Hamiltonian reading of the static identity, in 1+1D, in a small-distance window. The structural form of the prediction (linear growth in d₁) is correct. The prefactor is suppressed by approximately 1/30. The 1/30 is the open question.

One speculation in the paper’s §VI.D: the BW formula is an infrared, long-wavelength prediction. A localized lattice mass at m = 1 corresponds to a correlation length of about one lattice spacing, which is a UV-scale (short-distance) perturbation by the lattice’s own measure. A factor of about 30 between the IR prediction and a UV-scale perturbation would not be surprising on dimensional grounds, but turning “not surprising” into a calculation is the job that’s now on the table.

The methodology side note

One of the values published in v0.7 came out of a multi-LLM verification audit that turned out to be more important than its own results. We sent the 3+1D production code to three external AI sandboxes and asked each to run it independently and report the numerical answers.

Perplexity, run #1 — numbers matched local ground-truth to six decimal places. Trustworthy.
Perplexity, run #2 — numbers matched to 0.05 percent. Trustworthy. (Quietly substituted N=13 for N=14, which we caught and flagged.)
Gemini — numbers disagreed with ground-truth by factors of 6× to 23×, with no consistent error pattern. The textual signature of the run included “... [Internal functions as provided] ...” placeholders. Gemini did not run the code. It made up numbers and reported them as if it had.

If we’d trusted Gemini and incorporated its results, the v0.7 paper would have published claims (alpha approximately 1.1, “closer to BW”) that are flat-out false. The reason we caught it is that we had local ground-truth to compare against; the “multi-LLM cross-validation” we’d set up was actually an external-ground-truth cross-check, not a peer-review system among AIs. Multi-LLM review without external grounding is not load-bearing. The most confident-sounding LLM may simply be the most hallucinatory.

This is the same lesson Paper 12 documented from a different direction. Both papers are now part of a small but growing case-study record on what does and doesn’t work when you try to use AI systems for serious physics work. The pattern: AI dialogue is genuinely useful for generating candidate ideas; AI verification is dangerous unless you have an independent way to ground-truth the answers.

The honest framing

This paper is not what you’d ship if your goal were to make Paper 11 look good. The headline result is “a load-bearing identification in the framework you previously published is, in its literal form, ruled out by direct calculation, by 56 orders of magnitude.” That’s not flattering.

But here’s what falsification means in physics, when it’s done right: you find out where the picture is real and where it’s a useful analogy. The horizon thermodynamics is real. The empirical content (constant a₀, deep-MOND, Genzel test) is real. The modular-Hamiltonian connection in 1+1D within a small-d₁ window is real, modulo a calculable suppression factor. The literal-bipartition-entropy reading is not real.

That’s a more useful map of the framework than “everything works” would have been. It tells future researchers (and future versions of me) which parts of the picture to push on and which parts to stop trying to defend in a form they don’t hold up in.

The framework had to face this test eventually. Better that it face it now, while it’s still small and easily revised, than after it’s accumulated five years of citations.

Falsification is the friend you don’t want and don’t deserve. It tells you the truth about your work even when you’re not ready for it. Listen anyway.

Read the technical version

The full v0.7 paper, the 1+1D production code, and the consolidated findings document are at:

github.com/Windstorm-Institute/lattice-qft-test — paper PDF, headline results
github.com/Windstorm-Labs/lattice-qft-test — reproduction code, raw analysis
doi.org/10.5281/zenodo.20057537 — Zenodo deposit (concept DOI, always resolves to latest)
doi.org/10.5281/zenodo.20057538 — v0.7 specifically

The companion paper (3+1D) will be deposited as a separate Zenodo record shortly. Reading them as a paired update is the recommended order.

Discussion

Discuss this article

Sign in with your GitHub account to comment. Comments are powered by GitHub Discussions on the website source repository; each article gets its own thread, automatically. Be substantive. Disagreement welcome — especially if you think the framework should be defended in a form Paper 13 didn’t test, or if you have a derivation of the 1/30 prefactor.