StochasticSplats — Stochastic Rasterization for Sorting-Free 3D Gaussian Splatting
Abstract
3D Gaussian Splatting (3DGS) represents scenes with tens of thousands of anisotropic Gaussians that are "splatted" onto the image plane and alpha-blended front-to-back. The mandatory depth sort is the Achilles heel of classic 3DGS:
𝒪(N log N) CPU/GPU work per frame
branch-divergent fragment shaders
"popping" artifacts when the sort order changes under camera motion
paradoxically, lower-resolution renders are not cheaper because sort cost is scene-, not pixel-, bound
Core Idea
StochasticSplats drops the sort entirely. Each pixel receives K Monte-Carlo samples of the volume-rendering equation, choosing splats probabilistically instead of deterministically front-to-back. The estimator is:
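\[ \hat{C}(x) = \frac{1}{K} \sum_{k=1}^{K} \frac{\sigma(g_{i_k}, x)}{p(i_k \mid x)} \, T_{k-1}, \qquad T_{k-1} = \prod_{j<k} \left(1 - \alpha_{i_j}\right) \]
where p(i_k | x) is proportional to the expected contribution of splat i_k to pixel x, and T_{k-1} is the transmittance accumulated before sample k. This is an unbiased estimator of classical alpha compositing; variance falls as 1/K.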
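A minimal NumPy sketch of the sorting-free idea for a single grayscale pixel may help here. It assumes the stochastic-transparency sampling scheme mentioned in the Implementation section (each splat survives a sample with probability equal to its alpha, and the nearest survivor is shaded), not the p(i_k | x)-weighted estimator above. The function names and toy splat values are hypothetical.

```python
import numpy as np

def composite_sorted(colors, alphas, depths, background=0.0):
    """Reference: classical front-to-back alpha compositing (needs a depth sort)."""
    order = np.argsort(depths)
    out, transmittance = 0.0, 1.0
    for i in order:
        out += transmittance * alphas[i] * colors[i]
        transmittance *= 1.0 - alphas[i]
    return out + transmittance * background

def composite_stochastic(colors, alphas, depths, k, rng, background=0.0):
    """Sorting-free estimate: mean of k samples, each the nearest surviving splat."""
    # Each splat survives a given sample with probability alpha (Bernoulli test).
    keep = rng.random((k, len(alphas))) < alphas
    # Rejected splats are pushed to infinite depth, so a min replaces the sort.
    masked_depth = np.where(keep, depths, np.inf)
    nearest = np.argmin(masked_depth, axis=1)
    hit = keep.any(axis=1)  # samples where no splat survived see the background
    samples = np.where(hit, colors[nearest], background)
    return samples.mean()

# Toy pixel covered by four splats (values are illustrative only).
colors = np.array([0.9, 0.2, 0.6, 0.4])
alphas = np.array([0.5, 0.3, 0.8, 0.6])
depths = np.array([2.0, 1.0, 4.0, 3.0])

rng = np.random.default_rng(0)
reference = composite_sorted(colors, alphas, depths)
estimate = composite_stochastic(colors, alphas, depths, 200_000, rng)
print(f"sorted: {reference:.4f}  stochastic: {estimate:.4f}")
```

A splat is the nearest survivor with probability equal to its alpha times the product of (1 - alpha) over all nearer splats, which is exactly its alpha-compositing weight. The sample mean therefore converges to the sorted result as K grows.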

2. Mathematical Analysis
2.1 Complexity
Let N be the splat count, P the pixel count, K the number of samples per pixel, and M the average number of splats blended per pixel.
Method             Shading work   Sort work
Sorted splats      P⋅M            N log N
StochasticSplats   P⋅K            none
Equating the two costs, P⋅K = P⋅M + N log N, gives the break-even sample count:
K* = M + (N log N) / P
Typical scenes (N≈60k, P≈1M, M≈40) give K*≈41 with a base-2 logarithm. Empirically a visually clean render needs 4–8 spp, well below K*, so stochastic wins.
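The two cost expressions can be evaluated directly for the quoted scene size. The sketch below assumes that one shading operation and one sort comparison cost the same unit and that the logarithm is base 2. Both are simplifications, so the ratios are operation counts rather than timing predictions. The function names are hypothetical.

```python
import math

def sorted_cost(n_splats, n_pixels, m_overlap):
    """P*M shading work plus N*log2(N) sort work, in arbitrary unit operations."""
    return n_pixels * m_overlap + n_splats * math.log2(n_splats)

def stochastic_cost(n_pixels, k_samples):
    """P*K shading work with no sort term."""
    return n_pixels * k_samples

# Scene parameters quoted in the text.
N, P, M = 60_000, 1_000_000, 40

for k in (1, 2, 4, 8, 16):
    ratio = sorted_cost(N, P, M) / stochastic_cost(P, k)
    print(f"K={k:2d}: sorted / stochastic operation count = {ratio:.1f}")
```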
2.2 Variance Bound
Assuming per-pixel transmittance T and bounded per-splat contribution σ ≤ σ_max:
Var[Ĉ] ≤ (σ_max² / K) * E[T]
Render-time quality trade-off is therefore explicit and tunable.
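The 1/K scaling can be checked empirically on a toy pixel. The sketch below assumes each sample returns one of three splat colours with fixed probabilities. These values are hypothetical stand-ins for real per-splat contributions, not data from the renderer.

```python
import numpy as np

def estimator_variance(k, trials, rng):
    """Empirical variance of a k-sample mean for one toy pixel."""
    # Hypothetical pixel: three possible sample values and their probabilities.
    values = np.array([0.9, 0.4, 0.1])
    probs = np.array([0.5, 0.3, 0.2])
    samples = rng.choice(values, size=(trials, k), p=probs)
    return samples.mean(axis=1).var()

rng = np.random.default_rng(0)
variances = {k: estimator_variance(k, 20_000, rng) for k in (1, 2, 4, 8, 16)}
for k, v in variances.items():
    # If variance scales as 1/K, the product K * variance stays roughly constant.
    print(f"K={k:2d}  variance={v:.5f}  K*variance={k * v:.5f}")
```

The product K × variance stays close to the single-sample variance, so doubling the sample count halves the noise power.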

3. Implementation
Re-use the GPU depth buffer by sampling only the nearest hit per stratum ("stochastic transparency").
Importance-sample splats with a hierarchical BVH built once after training.
Single-pass GLSL shader; differentiable via re-parameterisation for training/fine-tuning.
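The importance-sampling step in the list above can be illustrated with a flat prefix-sum table in place of the hierarchical BVH. This is a deliberate simplification: a BVH would make the same proportional selection by descending a tree of partial sums. The weights stand in for the "expected contribution" of each splat, and all names and values are hypothetical.

```python
import numpy as np

def sample_splats(weights, k, rng):
    """Draw k splat indices with probability proportional to weights."""
    cdf = np.cumsum(weights, dtype=np.float64)
    total = cdf[-1]
    u = rng.random(k) * total
    # Binary search in the running sum; zero-weight splats are never selected.
    idx = np.searchsorted(cdf, u, side="right")
    idx = np.minimum(idx, len(weights) - 1)  # guard against round-off at the top end
    return idx, weights[idx] / total         # indices and their probabilities

rng = np.random.default_rng(0)
weights = np.array([0.05, 0.40, 0.00, 0.25, 0.30])  # toy per-splat contributions
colors = np.array([0.9, 0.2, 0.7, 0.6, 0.4])

idx, prob = sample_splats(weights, 100_000, rng)
# Importance-weighted estimate of sum_i weights[i] * colors[i].
estimate = np.mean(weights[idx] * colors[idx] / prob)
exact = np.sum(weights * colors)
print(f"estimate: {estimate:.4f}  exact: {exact:.4f}")
```

Dividing each sampled contribution by its selection probability keeps the estimate unbiased, and sampling in proportion to the contribution is what keeps its variance low.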
4. Results
Resolution 1280×720, RTX 4090, 60k splats.
Samples/pixel   Frame time (ms)   Quality (higher is better)   Error (lower is better)
1               2.7               28.7                         0.043
2               2.9               30.1                         0.033
4               3.3               32.8                         0.018
8               5.6               34.2                         0.011
16              6.5               36.9                         0.006
Sorted 3DGS     14.7              33.4                         0.015
StochasticSplats is 4–5× faster than depth-sorted splats at comparable quality (3.3 ms at 4 spp versus 14.7 ms) and eliminates popping artifacts.
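The speedup at each sample count follows from the timings reported above. The short calculation below reads the second value of each result row as frame time in milliseconds, which is consistent with the timings quoted in the Ablations. The variable names are hypothetical.

```python
# Frame time in ms per samples-per-pixel setting, copied from the results above.
stochastic_ms = {1: 2.7, 2: 2.9, 4: 3.3, 8: 5.6, 16: 6.5}
sorted_ms = 14.7  # sorted 3DGS baseline

speedup = {spp: sorted_ms / ms for spp, ms in stochastic_ms.items()}
for spp, s in speedup.items():
    print(f"{spp:2d} spp: {s:.1f}x faster than sorted 3DGS")
```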

5. Ablations
No importance sampling → 2 × variance.
No depth-buffer early-out → 40% slower.
A lower-resolution render (640×360) takes 1.8 ms at 4 spp, while sorted 3DGS barely speeds up (13.2 ms), confirming the sort bottleneck.
6. Discussion & Limitations
High-frequency transparency (foliage) needs ≥8 spp.
Mobile GPUs: random sampling incurs divergent memory access; a hashed-grid sampler could help.
Training-time gradients have extra variance; we use five-sample antithetic pairs to stabilise.
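Antithetic sampling, mentioned in the last item above, pairs each random number u with its mirror 1 - u so that errors partially cancel. The sketch below uses a toy monotone integrand (exp on [0, 1]) rather than renderer gradients. The choice of five pairs per estimate is for illustration only, and the function names are hypothetical.

```python
import numpy as np

def independent_estimate(f, n_pairs, rng):
    """Plain Monte Carlo mean using 2 * n_pairs independent samples."""
    u = rng.random((n_pairs, 2))
    return f(u).mean()

def antithetic_estimate(f, n_pairs, rng):
    """Same sample budget, but each u is paired with its mirror 1 - u."""
    u = rng.random(n_pairs)
    return 0.5 * (f(u) + f(1.0 - u)).mean()

rng = np.random.default_rng(0)
f = np.exp  # toy integrand; its exact integral over [0, 1] is e - 1

ind = np.array([independent_estimate(f, 5, rng) for _ in range(5000)])
ant = np.array([antithetic_estimate(f, 5, rng) for _ in range(5000)])
print(f"independent: mean={ind.mean():.4f} var={ind.var():.5f}")
print(f"antithetic:  mean={ant.mean():.4f} var={ant.var():.5f}")
```

Both estimators target the same mean, but for a monotone integrand the mirrored samples are negatively correlated, which lowers the variance at equal sample cost.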
7. Conclusion
StochasticSplats reframes 3D Gaussian Splatting as Monte-Carlo volume rendering:
removes depth sorting,
exposes a continuous speed-quality dial,
attains real-time framerates while matching classical alpha compositing in expectation.
Future work: space–time sampling for motion-blurred splats and hardware ray-query support for even lower variance.