World Labs Marble: Fei-Fei Li's Vision for Spatial Intelligence

The researcher who gave machines the ability to see is now teaching them to imagine entire worlds. With World Labs Marble, Fei-Fei Li is taking the next step beyond video generation, into persistent, explorable 3D environments.

From ImageNet to World Models

💡 For context on how world models fit into the evolution of AI video, see our overview of world models as the next frontier.

Fei-Fei Li revolutionized computer vision with ImageNet, the dataset that made modern deep learning possible. Now, after a year of building World Labs with $230 million in funding, she has launched Marble, the company's first commercial product.

The thesis is simple: AI first conquered text, then images, then video. The next frontier is spatial intelligence, the ability to perceive, generate, and interact with 3D worlds.

What Marble Does

Marble generates persistent, downloadable 3D environments from multiple input types:

✓ Text prompts
✓ Single images
✓ Videos
✓ Panoramas
✓ 3D layouts

Unlike real-time world models from competitors such as Decart's Oasis or Google's Genie, Marble creates stable worlds with minimal morphing. You generate once, then explore freely without the AI "forgetting" what it created.

Chisel Editor

🔨

AI-Native 3D Editing

Chisel decouples spatial structure from visual style. Block out your layout first, then apply text-based styling guidance.

This hybrid approach sets Marble apart from text-to-scene models. Instead of hoping the AI understands your spatial intent, you define the geometry explicitly. The AI handles aesthetics, materials, and lighting.

Think of it as sketching a floor plan before asking an interior designer to decorate. Control over spatial relationships stays with you.
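One way to picture this separation is as two independent inputs: a layout and a style prompt. The sketch below is purely illustrative. `Box`, `SceneRequest`, and `restyle` are hypothetical names invented here, not part of any World Labs API. It only shows that the style prompt can change while the block-out geometry stays exactly the same.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Box:
    """A coarse block-out volume, position and size in meters (hypothetical)."""
    name: str
    position: tuple[float, float, float]
    size: tuple[float, float, float]


@dataclass(frozen=True)
class SceneRequest:
    """Geometry and style kept as separate fields (hypothetical structure)."""
    layout: tuple[Box, ...]
    style_prompt: str


def restyle(request: SceneRequest, new_prompt: str) -> SceneRequest:
    """Return a copy with a new style prompt; the layout is left untouched."""
    return replace(request, style_prompt=new_prompt)


# Block out the geometry once...
layout = (
    Box("floor", (0.0, 0.0, 0.0), (6.0, 0.1, 4.0)),
    Box("counter", (1.0, 0.0, 0.5), (2.5, 0.9, 0.6)),
)

# ...then try different looks on the same spatial structure.
rustic = SceneRequest(layout, "rustic farmhouse kitchen, warm evening light")
modern = restyle(rustic, "minimalist kitchen, polished concrete, cool daylight")
```

Because both dataclasses are frozen, a restyle cannot accidentally move a wall: the only way to change geometry is to build a new layout.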

Export Formats and Compatibility

Generated worlds export in three formats:

Format	Use Case
Gaussian Splats	Real-time rendering, novel views
Meshes	Game engines, CAD integration
Videos	Content creation, pre-vis

💡 All Marble worlds are VR-compatible out of the box with Vision Pro and Quest 3 headsets.

Pricing Structure

World Labs offers four tiers:

Tier	Price	Generations	Key Features
Free	$0	4/month	Text, image, or panorama input
Standard	$20/month	12/month	Multi-image/video input, advanced editing
Pro	$35/month	25/month	Scene expansion, commercial rights
Max	$95/month	75/month	All features, maximum generations

The free tier lets you evaluate the technology. For production work that requires commercial rights, the Pro tier at $35/month is reasonable entry pricing for a capability this novel.
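To compare tiers on equal footing, it helps to work out the effective price of a single generation. The short calculation below uses only the prices and quotas from the table above, and it assumes you use your full monthly quota, which may not match real usage.

```python
# Monthly price (USD) and generations per month, copied from the pricing table.
TIERS = {
    "Free": (0, 4),
    "Standard": (20, 12),
    "Pro": (35, 25),
    "Max": (95, 75),
}


def cost_per_generation(price: float, generations: int) -> float:
    """Effective price of one generation if the monthly quota is fully used."""
    return price / generations


per_generation = {
    tier: round(cost_per_generation(price, gens), 2)
    for tier, (price, gens) in TIERS.items()
}

for tier, cost in per_generation.items():
    print(f"{tier:<8} ${cost:.2f} per generation")
```

Under that assumption, Standard works out to about $1.67 per generation, Pro to $1.40, and Max to about $1.27, so the per-generation price falls as the tier rises.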

Why Spatial Intelligence Matters

"Spatial intelligence is the defining challenge of the next decade." - Fei-Fei Li

Li argues that current AI has a fundamental limitation: it reasons poorly about 3D space. Language models hallucinate about physics. Video models create impossible geometries. Image generators struggle with consistent spatial relationships.

✗ Current Approaches

Video models generate frame sequences without true 3D understanding. Camera movements reveal inconsistencies. Objects change position or disappear.

✓ Spatial Intelligence

A native 3D representation enables physically consistent worlds. Move the camera freely. The environment persists because it exists as geometry, not as pixels.

For robotics, this matters enormously. A robot navigating a kitchen needs spatial understanding, not frame prediction. For VFX, directors need explorable environments, not fixed camera paths.

Use Cases Taking Shape

Gaming: Generate ambient environments and background spaces. Indie developers can create exploration areas that would take months of traditional art production.

Visual Effects: Pre-visualization becomes interactive. Block out a scene spatially, then explore camera angles before committing to shots.

Architecture: Convert floor plans into explorable walkthroughs. Clients experience spaces before construction begins.

Education: Li envisions students walking inside a cell and surgeons practicing inside anatomical simulations.

World Expansion and Composer Mode

Two features address scale limitations:

World Expansion lets you extend a generated world once, adding detail in the edge regions where quality typically degrades. This pushes the boundaries of the explorable space beyond the initial generation limits.

Composer Mode combines multiple worlds into larger environments. Generate individual rooms, then stitch them into a complete building.

These tools acknowledge current constraints while providing practical workarounds.
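Stitching rooms into a building is, at its core, a placement problem. The sketch below is a simplified illustration and not how Composer Mode is implemented. `Room`, `Placement`, `overlaps`, and `building_bounds` are hypothetical names, and each generated world is reduced to a rectangular footprint so the idea fits in a few lines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Room:
    """Axis-aligned footprint in meters (hypothetical stand-in for a generated world)."""
    name: str
    width: float
    depth: float


@dataclass(frozen=True)
class Placement:
    """A room positioned by its minimum corner in building coordinates."""
    room: Room
    x: float
    z: float

    @property
    def bounds(self) -> tuple[float, float, float, float]:
        return (self.x, self.z, self.x + self.room.width, self.z + self.room.depth)


def overlaps(a: Placement, b: Placement) -> bool:
    """True if two footprints share interior area (touching walls are allowed)."""
    ax0, az0, ax1, az1 = a.bounds
    bx0, bz0, bx1, bz1 = b.bounds
    return ax0 < bx1 and bx0 < ax1 and az0 < bz1 and bz0 < az1


def building_bounds(placements: list[Placement]) -> tuple[float, float, float, float]:
    """Bounding footprint that encloses every placed room."""
    xs0, zs0, xs1, zs1 = zip(*(p.bounds for p in placements))
    return (min(xs0), min(zs0), max(xs1), max(zs1))


kitchen = Placement(Room("kitchen", 4.0, 3.0), x=0.0, z=0.0)
hallway = Placement(Room("hallway", 2.0, 3.0), x=4.0, z=0.0)  # shares the kitchen's east wall
bedroom = Placement(Room("bedroom", 6.0, 4.0), x=0.0, z=3.0)  # sits behind both rooms

building = [kitchen, hallway, bedroom]
```

Here the three rooms tile a 6 m by 7 m footprint with shared walls and no overlapping interiors. A real composer would also have to align doorways, floor heights, and lighting across the seams.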

Competition Landscape

Marble is entering a crowded field:

Product	Approach	Differentiator
Decart Oasis	Real-time game generation	Interactive, but worlds shift during exploration
Google Genie	Game world generation	Frame prediction without true 3D
Odyssey	Persistent world models	Enterprise focus
World Labs Marble	Static 3D generation	Downloadable, editable, VR-ready

The trade-off is clear. Real-time models such as Oasis offer immediacy at the cost of stability. Marble prioritizes persistence and editability over interactivity.

Connecting to Video Generation

💡 For background on the diffusion architectures used in spatial AI, see our technical overview of diffusion transformers.

How does 3D world generation relate to video? They share mathematical foundations in diffusion models but solve different problems.

Video generation creates temporal sequences, frame after frame. Spatial AI creates geometric representations: surfaces and volumes. Video answers "what happens next?" Spatial AI answers "what exists here?"

The convergence point is navigable video. Generate a 3D world, then render video as you move through it. This approach offers camera control that is impossible with pure video generation.
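The "generate once, render along any path" idea can be shown with a toy pinhole camera. This is a minimal sketch under strong assumptions: the world is just three points, the camera never rotates, and `project` and `render_path` are illustrative helpers rather than anything Marble exposes. What it demonstrates is that the world data never changes. Only the camera does.

```python
# A static "world": a few 3D points in meters, generated once and never modified.
WORLD = (
    (0.0, 0.0, 5.0),    # near point, straight ahead
    (1.0, 0.5, 5.0),    # near point, up and to the right
    (-1.0, 0.0, 10.0),  # far point, to the left
)


def project(point, camera_pos, focal_length=1.0):
    """Pinhole projection for a camera at camera_pos looking down +z (no rotation)."""
    x, y, z = (p - c for p, c in zip(point, camera_pos))
    if z <= 0:
        return None  # point is behind the camera
    return (focal_length * x / z, focal_length * y / z)


def render_path(world, camera_path):
    """One 'frame' per camera position: the list of projected 2D points."""
    return [[project(p, cam) for p in world] for cam in camera_path]


# Any camera path can be chosen after the world exists: here, a dolly to the right.
camera_path = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (1.0, 0.0, 0.0)]
frames = render_path(WORLD, camera_path)
```

Across the dolly move, the near point shifts further in the image than the far point. That parallax falls out of the geometry for free, whereas a frame-prediction model has to approximate it for every new camera move.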

Limitations Worth Considering

Marble is not a complete solution:

○ No animated characters or dynamic elements
○ Generation caps can limit production workflows
○ Edge degradation requires expansion passes
○ Static environments only

For animated content, you still need video generation models. Marble excels at environments and spaces, not actors or actions.

Bigger Picture

Fei-Fei Li sees spatial intelligence as essential to AI progress:

"I think we all have a responsibility to steer AI toward a better state as it grows more powerful. We should all want humanity to prevail and thrive."

Her vision extends beyond entertainment: medical simulations where students explore anatomy, scientific visualizations where researchers navigate molecular structures, and robotic training environments generated on demand.

Marble is step one, a commercial proof of concept. Research continues toward more dynamic, interactive, and physically accurate world generation.

Getting Started

World Labs offers a free tier with 4 generations per month, enough to evaluate the technology and understand its constraints.

For creators already working in 3D, the mesh export capability integrates with existing pipelines. For video producers, video export provides pre-visualization capabilities unavailable elsewhere.

💡 Related reading: our guide to AI video character consistency covers techniques for maintaining coherence across generated content, a challenge Marble addresses through persistent 3D representation.

The transition from 2D generation to 3D world creation represents a fundamental shift in what AI can produce. Marble makes that shift accessible.