Hacker News | pixelsynth's comments

We have a WebXR demo we built during Spark's development that showcases 3DGS running on Quest 3 or Vision Pro:

https://lofiworlds.ai

Make sure to enable hand tracking so you can "touch" the Gaussian splats :) (Tap your wrists together to toggle spotlight hands mode.)


It's an interesting idea, and with Spark you could test this by adjusting the maxStdDev parameter, which controls how far out from its center each splat is drawn.
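
To get a feel for what such a cutoff does, here is a minimal sketch that assumes the usual Gaussian falloff exp(-r²/2), with r measured in standard deviations. The function names are hypothetical and this is not Spark's code; it only shows how much opacity is left at the edge where drawing stops.

```typescript
// Falloff of a unit Gaussian at `n` standard deviations from its center.
function gaussianFalloff(n: number): number {
  return Math.exp(-0.5 * n * n);
}

// Alpha remaining at the cutoff edge for a splat with peak opacity `opacity`
// when drawing stops at `maxStdDev` standard deviations (hypothetical helper).
function edgeAlpha(opacity: number, maxStdDev: number): number {
  return opacity * gaussianFalloff(maxStdDev);
}

// Print the edge alpha for a few cutoffs to compare them side by side.
for (const maxStdDev of [1, 2, Math.sqrt(8), 3]) {
  console.log(maxStdDev.toFixed(2), edgeAlpha(1.0, maxStdDev).toFixed(4));
}
```

Under that assumption a small cutoff such as 1 stops drawing while the splat is still about 61% opaque, which reads as a hard edge, while a cutoff of sqrt(8) leaves under 2% and fades out smoothly.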

I agree with you though that in general 3DGS is a worse representation for hard, flat, synthetic things with hard edges. But on the flip side, I would argue it's a better representation for many organic, real-world things: imagine fur, hair, or leaves on a tree. These can render beautifully and photorealistically in a way that would otherwise require much, much more complex polygon geometry and texturing, plus careful sorting and blending of semi-transparent texels. This is one reason why 3DGS has become so popular in scanning and 3D reconstruction: you just get much better results with smaller file sizes. When 3DGS first appeared, everyone was shocked by how photorealistically you could render things in real time on a mobile device!

But one final thought I want to add: with Spark it's not an either/or. You can have BOTH in the same Three.js scene and they will blend together perfectly via the Z-buffer. So you can scan the world around you and render it with 3DGS, and then insert your hard-edged robot character polygon meshes right into that world, and get the best of both!


Cool - thanks for explaining that. I totally see how each has its place.

I imagine it's pretty complex to take the raw scan data and generate 3DGS. Are these algorithms simple & standard, or do they take a fair amount of tuning & tweaking to do a good job? Adapting these to work well with hard-edge ovals seems like it would take some work, and a lot more work to get them to output a mix of ovals & fuzzy blobs. But if you could do that, I agree the combination would be amazingly expressive.


There are a lot of tools to do this easily today, for free! Take a look at Postshot, or Brush. You can literally take a video with your mobile phone, toss it in Postshot, and a few minutes later you have a photorealistic 3DGS model you can use in Spark!

3DGS is still a rapidly evolving research field, but the "baseline" is pretty much standard these days.


Spark allows you to construct compute graphs at runtime in JavaScript and have them compiled and run on the GPU, so they are not bound by the CPU: https://sparkjs.dev/docs/dyno-overview/

WebGL2 isn't the best graphics API, but it allows anyone to write JavaScript code to harness the GPU for compute and rendering, and run on pretty much any device via the web browser. That's pretty amazing IMO!


Most 4DGS reconstruction methods right now are exactly that: setting up many cameras and recording them simultaneously so you can reconstruct each instant in time as a 3DGS. In the future it might be possible to use a single camera and have an AI/ML method figure out how all the 3D Gaussians move over time, including parts that are occluded from the single camera!


Yes, Spark does instanced rendering of quads, one covering each Gaussian splat. The sorting is done by 1) calculating sort distance for every splat on the GPU, 2) reading it back to the CPU as float16s, 3) doing a 1-pass bucket sort to get an ordering of all the splats from back to front.
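
Step 3 can be sketched roughly as a single counting-sort pass over the 16-bit keys. This is a minimal illustration rather than Spark's actual code: the function name is hypothetical, and it assumes the keys are non-negative, finite float16 bit patterns, so the raw bits order the same way as the distances they encode.

```typescript
// Number of distinct non-negative, finite float16 bit patterns (0x0000..0x7BFF).
const NUM_BUCKETS = 0x7c00;

// `keys` holds one float16 bit pattern per splat, as read back from the GPU.
// Returns splat indices ordered back to front (largest distance first).
function bucketSortBackToFront(keys: Uint16Array): Uint32Array {
  const offsets = new Uint32Array(NUM_BUCKETS + 1);

  // Histogram. Keys are flipped so the farthest splats land in the lowest buckets;
  // anything at or above 0x7C00 (infinity/NaN) is clamped into the farthest bucket.
  for (let i = 0; i < keys.length; i++) {
    const bucket = NUM_BUCKETS - 1 - Math.min(keys[i], NUM_BUCKETS - 1);
    offsets[bucket + 1]++;
  }

  // Prefix sum: afterwards offsets[b] is the first output slot of bucket b.
  for (let b = 0; b < NUM_BUCKETS; b++) {
    offsets[b + 1] += offsets[b];
  }

  // Scatter each splat index into its bucket's next free slot.
  const order = new Uint32Array(keys.length);
  for (let i = 0; i < keys.length; i++) {
    const bucket = NUM_BUCKETS - 1 - Math.min(keys[i], NUM_BUCKETS - 1);
    order[offsets[bucket]++] = i;
  }
  return order;
}
```

Splats that share a bucket keep their original relative order, so nothing is sorted within a bucket; that is the precision being traded away for a linear-time sort.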

On most newer devices the sorting can happen pretty much every frame with approx 1 frame latency, and runs in parallel on a Web Worker. So the sorting itself has minimal performance impact, and because of that Spark can do fully dynamic 3DGS where every splat can move independently each frame!

On some older Android devices it can be a few frames' worth of latency, and in that case you could say it's amortized over a few frames. But since it all happens in parallel there's no real impact on the overall rendering performance. I expect for most devices the sorting in Spark is mostly a solved problem, especially with increasing memory bandwidth and shared CPU-GPU memory.


If you say 1-pass bucket sorting, I assume you do sort the buckets as well?

I've implemented a radix sort on the GPU to sort the splats (every frame), and I'm not quite happy with the performance yet. A radix sort (+ prefix scan) is quite involved, with lots of dedicated hierarchical compute shaders. I might have to get back to tuning it.

I might switch to float16s as well, but I'm a bit hesitant, since 1 million+ splats may exceed the precision of halfs.


We are purposefully trading off some sorting precision for speed with float16, and for scenes with large Z extents you'd probably get more Z-fighting, so I'm not sure if I'd recommend it for you if your goal is max reconstruction accuracy! But we'll likely add a 2-pass sort (i.e. radix sort with a large base / #buckets) in the future for higher precision (user selectable so you can decide what's more important for you). But I will say that implementing a sort on the CPU is much simpler than on the GPU, so it opens up possibilities if you're willing to do a readback from GPU to CPU and tolerate at least 1 frame of latency (usually not perceivable).
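
For reference, a 2-pass sort along those lines could look roughly like the sketch below: an LSD radix sort over full 32-bit keys with 2^16 buckets per pass. The names are hypothetical and this is not code from Spark; it assumes non-negative, finite float32 distances, whose raw bits order the same way as their values.

```typescript
const RADIX = 1 << 16; // 65536 buckets per pass

// One stable counting-sort pass over the 16-bit digit starting at bit `shift`.
// `src` and `dst` hold splat indices; `keys` is indexed by splat.
function radixPass(keys: Uint32Array, src: Uint32Array, dst: Uint32Array, shift: number): void {
  const offsets = new Uint32Array(RADIX + 1);
  for (let i = 0; i < src.length; i++) {
    offsets[((keys[src[i]] >>> shift) & (RADIX - 1)) + 1]++;
  }
  for (let b = 0; b < RADIX; b++) {
    offsets[b + 1] += offsets[b];
  }
  for (let i = 0; i < src.length; i++) {
    const digit = (keys[src[i]] >>> shift) & (RADIX - 1);
    dst[offsets[digit]++] = src[i];
  }
}

// Returns splat indices ordered back to front (largest distance first).
function radixSortBackToFront(distances: Float32Array): Uint32Array {
  const n = distances.length;
  // Reinterpret the float32 values as their raw 32-bit patterns.
  const bits = new Uint32Array(distances.buffer, distances.byteOffset, n);
  // Invert the bits so an ascending sort on keys gives descending distance.
  const keys = new Uint32Array(n);
  for (let i = 0; i < n; i++) keys[i] = ~bits[i] >>> 0;

  const a = new Uint32Array(n);
  const b = new Uint32Array(n);
  for (let i = 0; i < n; i++) a[i] = i;

  radixPass(keys, a, b, 0); // low 16 bits
  radixPass(keys, b, a, 16); // high 16 bits
  return a;
}
```

Because each pass is stable, the second pass preserves the low-digit order from the first, so the result is exact for all 32 bits at roughly twice the cost of the 1-pass version.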


You might want to consider using 16-bit integers (words) instead of halfs? Then you can use all 65k values of precision in a range you choose (by remapping 32-bit floats to words), and potentially adjust that range every frame, or with a delay.
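
A minimal sketch of that remapping, assuming a linear mapping of the chosen depth range onto 0..65535 (the function names are hypothetical, not from any particular renderer):

```typescript
// Find the depth range to remap, e.g. once per frame or every few frames.
function depthRange(depths: Float32Array): [number, number] {
  let min = Infinity;
  let max = -Infinity;
  for (let i = 0; i < depths.length; i++) {
    if (depths[i] < min) min = depths[i];
    if (depths[i] > max) max = depths[i];
  }
  return [min, max];
}

// Linearly remap float32 depths in [minDepth, maxDepth] to 16-bit integer keys.
// Depths outside the range are clamped to the nearest end.
function quantizeDepths(depths: Float32Array, minDepth: number, maxDepth: number): Uint16Array {
  const keys = new Uint16Array(depths.length);
  const range = maxDepth - minDepth;
  const scale = range > 0 ? 65535 / range : 0; // degenerate range maps everything to 0
  for (let i = 0; i < depths.length; i++) {
    const t = Math.round((depths[i] - minDepth) * scale);
    keys[i] = Math.min(65535, Math.max(0, t));
  }
  return keys;
}
```

If the range is reused from an earlier frame, the clamping keeps splats that have since moved outside it at the nearest end instead of letting their keys wrap around.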


Yeah you're right, using float16 gets us only 0x7C00 buckets of resolution. We could explicitly turn it into a log encoding and spread it over 2^16 buckets and get about 2x the resolution there! Other renderers do this dynamic per-frame range adjustment, and we could do that too.
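
As a rough sketch of such a log encoding (hypothetical names, assuming 0 < near < far and clamping everything outside that range):

```typescript
// Map depths in [near, far] to 16-bit keys on a logarithmic scale, so every
// doubling of distance gets the same number of buckets.
function logEncodeDepths(depths: Float32Array, near: number, far: number): Uint16Array {
  const keys = new Uint16Array(depths.length);
  const scale = 65535 / Math.log(far / near);
  for (let i = 0; i < depths.length; i++) {
    // Clamp into [near, far] so the logarithm stays within [0, log(far / near)].
    const d = Math.min(far, Math.max(near, depths[i]));
    keys[i] = Math.round(Math.log(d / near) * scale);
  }
  return keys;
}
```

This keeps the relative precision that float16 gives across distances, but uses all 2^16 key values over the chosen range instead of the 0x7C00 finite non-negative half patterns.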

