Marble, an AI model from World Labs, can turn a single image into a volumetric scene that you can view in WebXR in a matter of minutes.
World Labs was founded by Fei-Fei Li, one of the pioneers of modern AI, best known for creating the ImageNet dataset that helped enable the rapid advancement of computer vision over the past 15 years.
As with almost all of the remarkable advancements in 3D reconstruction over the past few years, Marble generates Gaussian splats, fitting thousands of semitransparent colored blobs (Gaussians) in 3D space so that arbitrary viewpoints can be rendered realistically in real time. And both the variety of input types it supports and the speed of its output are, to date, unprecedented.
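To make the idea of rendering from semitransparent blobs concrete, here is a minimal Python sketch of how a single pixel's color can be computed from a handful of Gaussians using front-to-back alpha blending, the general approach used by Gaussian splat renderers. It is an illustration only and not Marble's implementation: the `Splat2D` class and `composite_pixel` function are hypothetical names, and the sketch assumes each Gaussian has already been projected onto the image plane as a circular blob, whereas real renderers handle stretched and rotated Gaussians.

```python
import math
from dataclasses import dataclass


@dataclass
class Splat2D:
    """A Gaussian already projected to the image plane (hypothetical, simplified)."""
    x: float        # projected center, in pixels
    y: float
    depth: float    # distance from the camera, used only for sorting
    sigma: float    # blob radius (standard deviation), in pixels
    opacity: float  # peak opacity at the blob center, between 0 and 1
    color: tuple    # (r, g, b), each between 0 and 1


def composite_pixel(splats, px, py):
    """Blend all splats covering pixel (px, py), nearest first.

    Returns the accumulated (r, g, b) color and the remaining
    transmittance, i.e. how much background would still show through.
    """
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0
    for s in sorted(splats, key=lambda s: s.depth):
        # Opacity falls off with distance from the blob center.
        d2 = (px - s.x) ** 2 + (py - s.y) ** 2
        alpha = s.opacity * math.exp(-0.5 * d2 / s.sigma ** 2)
        for i in range(3):
            color[i] += transmittance * alpha * s.color[i]
        # Whatever this blob absorbed is hidden from blobs behind it.
        transmittance *= 1.0 - alpha
    return tuple(color), transmittance


# A half-transparent red blob in front of an opaque blue one.
front = Splat2D(x=0.0, y=0.0, depth=1.0, sigma=1.0, opacity=0.5, color=(1.0, 0.0, 0.0))
back = Splat2D(x=0.0, y=0.0, depth=2.0, sigma=1.0, opacity=1.0, color=(0.0, 0.0, 1.0))
pixel_color, remaining = composite_pixel([back, front], 0.0, 0.0)
```

Because the blobs are sorted by depth for each view, the same set of Gaussians can be drawn from any camera position, which is what makes arbitrary viewpoints possible.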
While other splat generation systems like Meta’s Horizon Hyperscape and Varjo Teleport require hundreds of input frames and hours of processing, in its simplest mode Marble can generate splats from a single input image or text prompt in a matter of minutes.
For more advanced outputs, if you pay for the $20/month subscription, Marble can take multiple images, a short video, or even a 3D structure as input, the last of these via a tool World Labs calls Chisel.
Chisel lets you lay out a scene with crude 3D shapes, as you would in a game editor, and then use a text prompt to turn it into a detailed volumetric scene.
With the subscription, you can interactively edit and expand Marble outputs, and combine multiple worlds together. You can also export a scene as a high-quality traditional 3D mesh, though the conversion takes multiple hours.
Because of the unique capability set of Marble, World Labs describes it as a “first-in-class generative multimodal world model”.
On the Marble web app you can generate your own scenes for free, and view the output in VR via WebXR using the web browser of your headset.
Testing Marble with a single image of the Steam Dev Days 2014 VR room.
Trying out Marble on Quest 3 and Apple Vision Pro by turning a single image of the Steam Dev Days 2014 VR room into a volumetric scene, I found the quality to be noticeably inferior to Meta’s Hyperscape worlds and Varjo Teleport, and more akin to Niantic Scaniverse. While the parts of the scene brought in directly from your input image are relatively detailed, the further you move away from them, the more of the typical Gaussian splat visual artifacts you’ll see.
And of course, the elephant in the room here is that details beyond the image frame are hallucinated, so they will be very different from what was actually there behind the camera, unless you provide multiple input images.
Still, all this aside, the ability to generate volumetric scenes in minutes from a single image or sentence is remarkable, and the fact that you can then edit them with a combination of an editor UI and natural language is even more so.
Further, the ability to then export these scenes as traditional 3D worlds, with geometric steerability via Chisel, seems like it could have huge potential for VR developers to build environments for their interactive apps and games.
You can try out Marble at marble.worldlabs.ai. Note that if you don’t pay, any scenes you create will be publicly listed. You’ll need the $20/month subscription to create a private scene, alongside unlocking the advanced creation, editing, and export features.