StochasticSplats — Stochastic Rasterization for Sorting-Free 3D Gaussian Splatting

Abstract

3D Gaussian Splatting (3DGS) represents scenes with tens of thousands of anisotropic Gaussians that are "splatted" onto the image plane and alpha-blended front-to-back. The mandatory depth sort is the Achilles heel of classic 3DGS:

  • 𝒪(N log N) CPU/GPU work per frame

  • branch-divergent fragment shaders

  • "popping" artifacts when the sort order changes under camera motion

  • lower-resolution renders are barely cheaper, because the sort cost scales with the number of splats rather than the number of pixels

Core Idea

StochasticSplats drops the sort entirely. Each pixel receives K Monte-Carlo samples of the volume-rendering equation, choosing splats probabilistically instead of deterministically front-to-back. The estimator is:

\[ \hat{C}(x) = \frac{1}{K} \sum_{k=1}^{K} \frac{\sigma(g_{i_k}, x)}{p(i_k \mid x)} \, T_{k-1}, \qquad T_{k-1} = \prod_{j<k} \bigl(1 - \alpha_{i_j}\bigr) \]

where p(i_k | x) ∝ the expected contribution of splat i_k to pixel x. This is an unbiased estimator of classical alpha compositing; its variance falls as 1/K.
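To make the idea of a sort-free, unbiased estimate concrete, the sketch below uses a stochastic-transparency style sampler for a single greyscale pixel. It is a stand-in under stated assumptions, not the importance-sampled estimator defined above: each splat is kept with probability equal to its alpha, and a depth test selects the nearest survivor. The tuple layout (depth, alpha, colour) and both function names are hypothetical.

```python
import random

def composite_sorted(splats):
    """Reference: classic front-to-back alpha compositing (requires a depth sort)."""
    color, transmittance = 0.0, 1.0
    for depth, alpha, c in sorted(splats):  # nearest splat first
        color += transmittance * alpha * c
        transmittance *= 1.0 - alpha
    return color

def composite_stochastic(splats, k, rng):
    """Sort-free estimate: average of k samples, splats visited in arbitrary order."""
    total = 0.0
    for _ in range(k):
        nearest_depth, sample_color = float("inf"), 0.0  # background is black
        for depth, alpha, c in splats:  # no sorting here
            # Keep the splat with probability alpha; the depth test
            # then retains only the nearest surviving splat.
            if rng.random() < alpha and depth < nearest_depth:
                nearest_depth, sample_color = depth, c
        total += sample_color
    return total / k

rng = random.Random(0)
splats = [(2.0, 0.5, 1.0), (1.0, 0.3, 0.2), (3.0, 0.8, 0.6)]  # deliberately unsorted
print(composite_sorted(splats), composite_stochastic(splats, 20000, rng))
```

A splat is the nearest survivor with probability α_i Π(1 - α_j) over all nearer splats j, which is exactly its weight in front-to-back compositing, so the sample mean converges to the sorted result as k grows.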


2. Mathematical Analysis

2.1 Complexity

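Let N be the splat count, P the pixel count, K the number of samples per pixel, and M the average number of splats shaded per pixel in the sorted pipeline.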

Pipeline            Expected operations
Sorted splats       P⋅M shade + N log N sort
StochasticSplats    P⋅K shade (no sort)

Equating the two costs, P⋅K* = P⋅M + N log N, gives the break-even sample count:

K* = M + (N log N) / P

Typical scenes (N≈60k, P≈1M, M≈40) give K*≈41 with a base-2 logarithm. Empirically a visually clean render needs 4–8 spp, well below break-even, so stochastic wins.
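The break-even point can be checked by equating the two operation counts listed for the sorted and stochastic pipelines. The sketch below is a unit-cost model only: it assumes M is the average number of splats shaded per pixel in the sorted pipeline (the text does not define it), a base-2 logarithm for the sort, and equal cost per shade and per sort operation. The function names are hypothetical.

```python
import math

def sorted_cost(n_splats, n_pixels, m_per_pixel):
    """Operation count of the sorted pipeline: P*M shading plus N*log2(N) sorting."""
    return n_pixels * m_per_pixel + n_splats * math.log2(n_splats)

def stochastic_cost(n_pixels, k_samples):
    """Operation count of the stochastic pipeline: P*K shading, no sort."""
    return n_pixels * k_samples

def break_even_k(n_splats, n_pixels, m_per_pixel):
    """K at which both pipelines cost the same: K* = M + N*log2(N) / P."""
    return sorted_cost(n_splats, n_pixels, m_per_pixel) / n_pixels

# Scene parameters quoted in the text: N ~ 60k splats, P ~ 1M pixels, M ~ 40.
k_star = break_even_k(60_000, 1_000_000, 40)
print(f"break-even K* = {k_star:.2f}")
for k in (4, 8, 16):
    ratio = sorted_cost(60_000, 1_000_000, 40) / stochastic_cost(1_000_000, k)
    print(f"K = {k:2d}: sorted / stochastic operation ratio = {ratio:.1f}")
```

Real GPU timings depend on constant factors that this count ignores, so the ratios are indicative rather than predictive.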

2.2 Variance Bound

Assuming per-pixel transmittance T and bounded per-splat contribution σ ≤ σ_max:

Var[Ĉ] ≤ (σ_max² / K) * E[T]

Render-time quality trade-off is therefore explicit and tunable.
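The 1/K scaling can be observed empirically. The sketch below assumes a toy pixel covered by a single splat (a hypothetical alpha of 0.4 and colour 1.0 over a black background), so each sample is a bounded random variable with variance α(1 - α); it illustrates the scaling only and does not evaluate the E[T] factor of the bound.

```python
import random
import statistics

def one_sample(rng, alpha=0.4, color=1.0):
    """One bounded per-pixel sample: the splat's colour with probability alpha, else black."""
    return color if rng.random() < alpha else 0.0

def estimator_variance(k, trials, rng):
    """Empirical variance of the K-sample mean over many independent trials."""
    means = [sum(one_sample(rng) for _ in range(k)) / k for _ in range(trials)]
    return statistics.pvariance(means)

rng = random.Random(1)
for k in (1, 2, 4, 8, 16):
    # Theory for this toy pixel: variance = alpha * (1 - alpha) / k = 0.24 / k.
    print(k, estimator_variance(k, 5000, rng), 0.24 / k)
```

Each doubling of K should roughly halve the measured variance, which is the dial the bound describes.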


3. Implementation

  • Re-use the GPU depth buffer by keeping only the nearest hit per stratum ("stochastic transparency").

  • Importance-sample splats with a hierarchical BVH built once after training.

  • Single-pass GLSL shader; differentiable via re-parameterisation for training/fine-tuning.
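The text does not say how the BVH is traversed when drawing samples. One common pattern for hierarchical importance sampling is to store summed weights at the inner nodes and descend from the root, choosing each child with probability proportional to its subtree weight. The sketch below assumes an implicit binary sum tree over a flat list of hypothetical per-splat weights as a stand-in for a spatial BVH; `build_sum_tree` and `sample_index` are illustrative names, not part of any described implementation.

```python
import random

def build_sum_tree(weights):
    """Implicit binary tree: leaves hold weights, each inner node the sum of its children."""
    size = 1
    while size < len(weights):
        size *= 2
    tree = [0.0] * (2 * size)  # node 1 is the root; leaves start at index `size`
    tree[size:size + len(weights)] = weights
    for node in range(size - 1, 0, -1):
        tree[node] = tree[2 * node] + tree[2 * node + 1]
    return tree, size

def sample_index(tree, size, rng):
    """Descend from the root, picking each child in proportion to its subtree weight."""
    node = 1
    u = rng.random() * tree[1]
    while node < size:
        left = 2 * node
        if u < tree[left]:
            node = left
        else:
            u -= tree[left]
            node = left + 1
    # Return the sampled leaf and its probability p(i), needed to weight the sample.
    return node - size, tree[node] / tree[1]

rng = random.Random(0)
tree, size = build_sum_tree([1.0, 2.0, 1.0])
print(sample_index(tree, size, rng))
```

Each draw costs O(log N) and the tree is built once, which matches the "built once after training" constraint; the returned probability is the p(i | x) term that an importance-sampled estimator divides by.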

4. Results

Resolution 1280×720, RTX 4090, 60k splats.

Samples/pixel   Time (ms)   PSNR (dB)↑   DSSIM↓
1               2.7         28.7         0.043
2               2.9         30.1         0.033
4               3.3         32.8         0.018
8               5.6         34.2         0.011
16              6.5         36.9         0.006
Sorted 3DGS     14.7        33.4         0.015

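StochasticSplats is roughly 2.6–4.5 × faster than depth-sorted splats at comparable quality (5.6 ms at 8 spp and 3.3 ms at 4 spp versus 14.7 ms, bracketing the sorted baseline's 33.4 dB PSNR) and eliminates popping artifacts.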

[Figure: StochasticSplats performance vs. quality]

5. Ablations

  1. No importance sampling → 2 × variance.

  2. No depth-buffer early-out → 40% slower.

  3. Lower-res render (640 × 360) takes 1.8 ms at 4 spp; sorted 3DGS barely speeds up (13.2 ms, down from 14.7 ms), confirming the sort bottleneck.

6. Discussion & Limitations

  • High-frequency transparency (foliage) needs ≥8 spp.

  • Mobile GPUs: random sampling incurs divergent memory access; a hashed-grid sampler could help.

  • Training-time gradients have extra variance; we use five-sample antithetic pairs to stabilise.
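The antithetic idea in the last point can be illustrated on the stochastic coverage test itself. The sketch below shows generic antithetic sampling with a single pair (u, 1 - u) on a toy one-splat pixel with a hypothetical alpha of 0.4; it is not the five-sample scheme mentioned above, whose details are not given, and it measures estimator variance rather than gradient variance.

```python
import random
import statistics

def covered(u, alpha):
    """Stochastic coverage test: 1.0 if the uniform draw u falls below alpha."""
    return 1.0 if u < alpha else 0.0

def independent_pair(alpha, rng):
    """Mean of two samples driven by two independent uniforms."""
    return 0.5 * (covered(rng.random(), alpha) + covered(rng.random(), alpha))

def antithetic_pair(alpha, rng):
    """Mean of two samples driven by u and its mirror image 1 - u."""
    u = rng.random()
    return 0.5 * (covered(u, alpha) + covered(1.0 - u, alpha))

rng = random.Random(0)
alpha, trials = 0.4, 20000
plain = [independent_pair(alpha, rng) for _ in range(trials)]
anti = [antithetic_pair(alpha, rng) for _ in range(trials)]
# Both means estimate alpha; the antithetic pairs should show the smaller variance.
print(statistics.fmean(plain), statistics.pvariance(plain))
print(statistics.fmean(anti), statistics.pvariance(anti))
```

Because the two draws in an antithetic pair are negatively correlated, their average fluctuates less than the average of two independent draws while keeping the same expectation.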

7. Conclusion

StochasticSplats reframes 3D Gaussian Splatting as Monte-Carlo volume rendering:

  • removes depth sorting,

  • exposes a continuous speed-quality dial,

  • attains real-time framerates while matching classical alpha compositing in expectation.

Future work: space–time sampling for motion-blurred splats and hardware ray-query support for even lower variance.
