Bustabit RNG Consistency Report: 100 Million Rounds, Two Halves, One Verdict

Most auditors check a casino once and move on. We split Bustabit's entire 100-million-round hash chain into two halves and asked: Is the RNG consistent from start to finish? Eleven statistical tests. One answer.

Most provably fair audits check a casino once and call it a day. A green checkmark, a thumbs-up, move on.

We don't work like that.

This is the first published longitudinal consistency report for any crypto casino. We took Bustabit's entire 100-million-round Era 3 hash chain, split it into two halves — 50 million older games vs. 50 million newer games — and asked one question:

Does the RNG behave the same way across the entire chain, or did something change?

If a casino manipulates its RNG partway through a hash chain — even subtly — the statistical fingerprint shifts. Our 11-test comparison framework is designed to catch exactly that.


What We Tested

| Detail | Value |
| --- | --- |
| Casino | Bustabit (Era 3) |
| Game | Crash |
| Total rounds | 100,000,000 |
| Algorithm | HMAC-SHA256, hash chain (raw bytes) |
| Salt | Bitcoin block hash (...1e08b7fd...) |
| Seeding event | BitcoinTalk #5560454 |
| Sample size per half | 500,000 (every 100th round) |
| Phase A | Games 50,000,001–100,000,000 (older half) |
| Phase B | Games 1–50,000,000 (newer half) |

Original audit: Bustabit Provably Fair Audit Report — 100,000,000 Rounds Analyzed

Why This Matters

A standard provably fair audit verifies that individual hashes are correct. That's table stakes — any calculator can do it.

A consistency report answers a harder question: Is the distribution stable over time?

Here's why that matters:

  • A casino could use a perfectly valid HMAC-SHA256 algorithm but choose a biased salt for one segment of the chain
  • A hash chain could be partially regenerated after a seeding event if the operator controls the infrastructure
  • Subtle distribution drift (e.g., 0.5% fewer high multipliers in newer games) would be invisible to single-hash verification but detectable in aggregate statistics

Our comparison module runs 11 distinct statistical tests, each attacking the problem from a different mathematical angle. If the RNG's behavior shifted in a sustained way anywhere in the chain, at least one test should flag it.


Results: 11/11 Tests Passed

| Test | Result | Statistic | p-Value | Interpretation |
| --- | --- | --- | --- | --- |
| Two-Sample Kolmogorov-Smirnov | ✅ PASS | D = 0.00233 | 0.132 | Distributions are statistically identical |
| Chi-Square Homogeneity | ✅ PASS | χ² = 110.06 | 0.210 | Bin distributions are homogeneous |
| Jensen-Shannon Divergence | ✅ PASS | JSD = 0.0001 bits | n/a | Distributions are virtually identical |
| Welch's t-Test | ✅ PASS | t = -1.167 | 0.243 | Means are statistically equal |
| F-Test (Variance Ratio) | ✅ PASS | F = 1.002 | 0.425 | Variances are statistically equal |
| Cohen's d (Effect Size) | ✅ PASS | d = -0.002 | n/a | Negligible, no practical difference |
| Bhattacharyya Distance | ✅ PASS | BD = 0.00006 | n/a | Distributions are virtually identical |
| Mann-Whitney U | ✅ PASS | z = -1.166 | 0.243 | Rank distributions are identical |
| Anderson-Darling | ✅ PASS | A² = 1.318 | 0.100 | Tail distributions are identical |
| Cramér-von Mises | ✅ PASS | T = 0.046 | 0.901 | Integral CDF distance negligible |
| Energy Distance | ✅ PASS | E = 0.0000255 | n/a | Distributions are identical |

Consistency Score: 9.2 / 10 — CONSISTENT

Verdict: RNG output is statistically identical across both halves of the chain. No evidence of manipulation, degradation, or behavioral change.


What the Numbers Mean

Kolmogorov-Smirnov (p = 0.132)

The KS test compares the full cumulative distributions of both halves. A p-value of 0.132 means that two samples drawn from the same distribution would show a gap at least this large about 13.2% of the time. That is unremarkable. We'd only flag at p < 0.01.
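To see where a p-value like this comes from, here is a minimal pure-Python sketch of the two-sample KS statistic and its asymptotic p-value (the Kolmogorov series). It is an illustration only, not the audit's actual module, and the function names are hypothetical. Plugging in the reported D = 0.00233 with 500,000 samples per half lands on p ≈ 0.132.

```python
import math
from bisect import bisect_right


def ks_statistic(a, b):
    """Largest vertical gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:  # the gap can only peak at an observed value
        fa = bisect_right(a, x) / len(a)
        fb = bisect_right(b, x) / len(b)
        d = max(d, abs(fa - fb))
    return d


def ks_pvalue(d, n, m, terms=100):
    """Asymptotic two-sided p-value: 2 * sum((-1)^(k-1) * exp(-2 k^2 lam^2))."""
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        return 1.0  # the series converges slowly here; p is effectively 1
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
            for k in range(1, terms + 1))
    return max(0.0, min(1.0, 2.0 * s))


# Reported statistic from the results table, 500,000 samples per half.
print(round(ks_pvalue(0.00233, 500_000, 500_000), 3))  # 0.132
```

The asymptotic formula is a reasonable approximation at this sample size; for small samples an exact method would be the safer choice.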

Jensen-Shannon Divergence (0.0001 bits)

JSD measures how much information a single sample carries about which of the two distributions it came from. At 0.0001 bits, the answer is: virtually none. For reference, our red flag threshold is 0.01 bits, so this result is 100x below the alarm level.
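As a rough illustration, JSD in bits can be computed from two histograms built on the same bins. This is a sketch under assumptions: the report does not state the bin count or binning scheme used, and `js_divergence_bits` is a hypothetical name.

```python
import math


def js_divergence_bits(counts_p, counts_q):
    """Jensen-Shannon divergence (log base 2) between two histograms
    that share the same bin edges. Returns a value in [0, 1] bits."""
    total_p, total_q = sum(counts_p), sum(counts_q)
    jsd = 0.0
    for cp, cq in zip(counts_p, counts_q):
        p, q = cp / total_p, cq / total_q
        m = 0.5 * (p + q)  # mixture of the two distributions
        if p > 0:
            jsd += 0.5 * p * math.log2(p / m)
        if q > 0:
            jsd += 0.5 * q * math.log2(q / m)
    return jsd


# Identical histograms give 0 bits; fully disjoint ones give 1 bit.
print(js_divergence_bits([50, 50], [50, 50]))
print(js_divergence_bits([100, 0], [0, 100]))
```

Because the value depends on how the data is binned, a JSD figure is only comparable across reports that use the same bins.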

Cohen's d (-0.002)

Effect size tells you whether a difference is practically meaningful, even if statistically detectable. At d = -0.002, the difference is negligible: it sits two orders of magnitude below the conventional 0.2 cutoff for a "small" effect, and you'd need millions of bets to even notice it in your bankroll.
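Cohen's d and Welch's t are closely related, which gives a quick sanity check on the table. The sketch below (hypothetical helper names, standard textbook formulas) computes both. With equal sample sizes n and similar variances, t ≈ d · sqrt(n / 2), so the reported t = -1.167 at n = 500,000 implies d ≈ -1.167 / 500 ≈ -0.0023, matching the reported -0.002 after rounding.

```python
import math
from statistics import fmean, variance


def cohens_d(a, b):
    """Difference in means divided by the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (fmean(a) - fmean(b)) / math.sqrt(pooled_var)


def welch_t(a, b):
    """Welch's t statistic (does not assume equal variances)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (fmean(a) - fmean(b)) / se


a = [1, 2, 3, 4, 5]
b = [2, 3, 4, 5, 6]
d, t = cohens_d(a, b), welch_t(a, b)
# For equal n and equal variances, t equals d * sqrt(n / 2).
print(d, t, d * math.sqrt(len(a) / 2))
```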

Mann-Whitney U (p = 0.243)

A non-parametric rank-based test that compares whether one distribution tends to produce higher values than the other. Unlike the t-test, it makes no assumptions about the shape of the distribution. p = 0.243 means there is no detectable tendency for either half to rank higher.
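A minimal sketch of the rank-based calculation, assuming the usual normal approximation for large samples (without a tie-variance correction, which matters little for continuous floats). The helper names are hypothetical. Converting the reported z = -1.166 to a two-sided p-value gives about 0.24, in line with the table.

```python
import math


def mann_whitney_z(a, b):
    """z-score of the Mann-Whitney U statistic (normal approximation)."""
    pooled = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j for tied values
        rank_sum_a += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n, m = len(a), len(b)
    u = rank_sum_a - n * (n + 1) / 2.0
    mu = n * m / 2.0
    sigma = math.sqrt(n * m * (n + m + 1) / 12.0)
    return (u - mu) / sigma


def two_sided_p(z):
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


print(two_sided_p(-1.166))  # reported z from the results table
```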

Anderson-Darling (p = 0.100)

Similar to KS but with extra sensitivity at the tails of the distribution, exactly where manipulation would show up first. A² = 1.318 corresponds to p = 0.100, which is not significant at the conventional 5% level and is well above the p < 0.01 level we flag at.

Cramér-von Mises (p = 0.901)

Measures the integral of the squared difference between the two CDFs. More sensitive than KS to small, distributed shifts. p = 0.901 is about as clean as it gets.
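For readers who want to see the statistic itself, here is a sketch of one standard two-sample form (Anderson, 1962): the squared gap between the two empirical CDFs, summed over every pooled observation and scaled by nm / (n + m)². The report does not say which normalization its T uses, so treat this as an illustration with a hypothetical function name rather than a reproduction.

```python
from bisect import bisect_right


def cramer_von_mises_2samp(a, b):
    """Two-sample Cramér-von Mises statistic T (Anderson 1962 form)."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    total = 0.0
    for z in a + b:  # evaluate both empirical CDFs at every pooled point
        gap = bisect_right(a, z) / n - bisect_right(b, z) / m
        total += gap * gap
    return n * m / (n + m) ** 2 * total


print(cramer_von_mises_2samp([1, 2], [1, 2]))  # identical samples
print(cramer_von_mises_2samp([1, 2], [3, 4]))  # fully separated samples
```

Unlike KS, which only looks at the single largest gap, every point contributes here, which is why small but widespread shifts register more strongly.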

Energy Distance (E = 0.0000255)

A modern test (Székely & Rizzo, 2004) based on pairwise distances between samples. Can detect any type of distributional difference — location, scale, or shape. At 0.0000255, the distance is essentially zero.
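The pairwise-distance idea can be sketched in a few lines. This version computes E = 2·E|X−Y| − E|X−X'| − E|Y−Y'| directly; it is quadratic in sample size, so it only suits small samples, and the names are hypothetical. Some implementations report the square root of this quantity, and the report does not say which convention its E follows.

```python
def mean_abs_diff(xs, ys):
    """Average absolute difference over all pairs (x, y)."""
    return sum(abs(x - y) for x in xs for y in ys) / (len(xs) * len(ys))


def energy_distance(a, b):
    """Squared energy distance between two 1-D samples (V-statistic form)."""
    return 2.0 * mean_abs_diff(a, b) - mean_abs_diff(a, a) - mean_abs_diff(b, b)


print(energy_distance([0, 1], [0, 1]))  # identical samples
print(energy_distance([0, 0], [1, 1]))  # fully separated samples
```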


How We Did It

  1. Collected the full chain: Starting from the most recent game hash, we walked Bustabit's Era 3 hash chain backward through all 100,000,000 rounds. Each crash point was computed using the exact Bustabit algorithm (HMAC-SHA256 → 52-bit seed → crash formula).
  2. Sampled uniformly: Every 100th round was recorded as a raw float (the uniform [0,1) value before crash conversion), giving us 1,000,000 representative samples.
  3. Split the chain: First 500,000 samples (older 50M games) = Phase A. Last 500,000 samples (newer 50M games) = Phase B.
  4. Ran the comparison: Our 11-test consistency framework compared the statistical fingerprints of both halves.
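The four steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the audit tooling itself: it assumes each earlier hash is the SHA-256 of the raw bytes of the later one, that the HMAC key is the salt string and the message is the raw game hash, and that the crash formula is the commonly published Bustabit one. The salt below is a placeholder (the report only shows a truncated value), and all function names are hypothetical.

```python
import hashlib
import hmac

N_BITS = 52
SALT = "PLACEHOLDER_SALT"  # hypothetical; substitute the published Bitcoin block hash


def previous_hash(game_hash_hex):
    """Step one game back in the chain (assumed: SHA-256 over raw bytes)."""
    return hashlib.sha256(bytes.fromhex(game_hash_hex)).hexdigest()


def raw_float(game_hash_hex, salt=SALT):
    """Uniform [0, 1) value taken from the top 52 bits of the HMAC digest."""
    digest = hmac.new(salt.encode(), bytes.fromhex(game_hash_hex),
                      hashlib.sha256).hexdigest()
    return int(digest[: N_BITS // 4], 16) / 2 ** N_BITS


def crash_point(u):
    """Assumed crash formula: floor(99 / (1 - u)) / 100, never below 1.00x."""
    return max(1.0, int(99 / (1 - u)) / 100)


def sample_chain(newest_hash_hex, rounds, step=100):
    """Walk backward from the newest hash, keeping every `step`-th raw float."""
    samples, h = [], newest_hash_hex
    for i in range(rounds):
        if i % step == 0:
            samples.append(raw_float(h))
        h = previous_hash(h)
    samples.reverse()  # oldest first, so the first half is the older games
    return samples


# Toy run on a made-up starting hash: 1,000 rounds -> 10 samples -> two halves.
start = hashlib.sha256(b"demo").hexdigest()
samples = sample_chain(start, 1000)
phase_a, phase_b = samples[: len(samples) // 2], samples[len(samples) // 2:]
print(len(phase_a), len(phase_b))
```

Comparing the raw floats rather than the crash multipliers keeps the data uniform on [0, 1), which avoids the heavy tail of the multiplier distribution dominating the mean and variance tests.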

Total compute time: 410 seconds for collection + under 5 seconds for the consistency analysis.

Methodology

Full details on our testing framework, including NIST SP 800-22, PractRand, and TestU01 integration: fairplayaudit.com/methodology

Open Source

Every tool we used is publicly available. Verify our results yourself:


What This Report Does NOT Tell You

Let's be clear about the limits:

  • This report confirms statistical consistency across the hash chain. It does not guarantee future behavior if Bustabit switches to a new chain or algorithm.
  • Provably fair means the outcomes are verifiable. It does not mean the house edge disappears. Bustabit's 1% house edge is mathematically baked in — that's by design, not manipulation.
  • Consistency ≠ profitability. The RNG is fair. The math still favors the house. Those are two separate facts.
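To make the house-edge point concrete: under the commonly published crash formula (an assumption here, as the report does not reproduce it), the chance of reaching a target multiplier m is 0.99 / m, so the expected return on any cash-out target is about 0.99 per unit bet. The sketch below checks that by simulation with a fixed seed; the function names are hypothetical.

```python
import random


def crash_point(u):
    """Assumed crash formula: floor(99 / (1 - u)) / 100, never below 1.00x."""
    return max(1.0, int(99 / (1 - u)) / 100)


def win_rate(target, trials=200_000, seed=1):
    """Fraction of simulated rounds whose crash point reaches `target`."""
    rng = random.Random(seed)
    wins = sum(crash_point(rng.random()) >= target for _ in range(trials))
    return wins / trials


for target in (2.0, 10.0):
    rate = win_rate(target)
    # Expected return per unit bet = target * P(crash >= target), about 0.99.
    print(target, round(rate, 4), round(rate * target, 3))
```

The simulated return hovers near 0.99 regardless of the target, which is the 1% edge showing up in a perfectly fair RNG.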

The Bigger Picture: Why Longitudinal Monitoring Matters

Anyone can audit a casino once. We plan to do it continuously.

This Bustabit consistency report is the first in a series. Our roadmap:

  • Multi-casino baselines: CryptoGames, Wolf.bet, Bitsler, and more
  • Weekly automated monitoring: Fresh data, fresh comparisons, anomaly alerts
  • Public watchlist: A live dashboard showing which casinos are consistent — and which aren't

Other auditors tell you a casino was fair on audit day. We tell you it's still fair today.


Report ID: FPA-CONSISTENCY-1781762933375
Generated: June 18, 2026
Auditor: FairPlay Audit (fairplayaudit.com)
Framework: NIST SP 800-22 + PractRand + TestU01 + Custom Consistency Module