169 results for “topic:world-models”
Advancing Open-source World Models
Mastering Diverse Domains through World Models
Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels with Hunyuan3D World Model
A curated list of awesome works in world modeling, aiming to serve as a one-stop resource for researchers, practitioners, and enthusiasts interested in world modeling.
DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 Spotlight.
Voyager is an interactive RGBD video generation model conditioned on camera input, and supports real-time 3D reconstruction.
HY-World 1.5: A Systematic Framework for Interactive World Modeling with Real-Time Latency and Geometric Consistency
Helios: Real-Time Long Video Generation Model
Fast and universal 3D reconstruction model for versatile tasks
Mastering Atari with Discrete World Models
Cosmos-Predict2.5 is the latest version of the Cosmos World Foundation Models (WFMs) family, specialized for simulating and predicting the future state of the world in the form of video.
Transformers are Sample-Efficient World Models. ICLR 2023, notable top 5%.
🌐 3D and 4D World Modeling: A Survey
Notably, GenAD and a dataset survey. A collection of foundation driving models by OpenDriveLab. For Vista and DriveLM, please refer to their individual pages.
Generate large-scale explorable 3D scenes with high-quality panorama videos from a single image or text prompt.
Dream to Control: Learning Behaviors by Latent Imagination
[ICLR 2026] rCM: SOTA JVP-Based Diffusion Distillation & Few-Step Video Generation & Scaling Up sCM/MeanFlow & Real-Time Autoregressive Video Diffusion
A collection and survey of vision-language model papers and models. Continuously updated GitHub repository.
[CVPR 2026] SpatialVID: A Large-Scale Video Dataset with Spatial Annotations
[ACM CSUR 2025] Understanding World or Predicting Future? A Comprehensive Survey of World Models
Cosmos-Transfer2.5, built on top of Cosmos-Predict2.5, produces high-quality world simulations conditioned on multiple spatial control inputs.
A curated list of world models for autonomous driving.
Official codebase for "Causal Forcing: Autoregressive Diffusion Distillation Done Right for High-Quality Real-Time Interactive Video Generation"
[ICCV 2025 ⭐highlight⭐] Implementation of VMem: Consistent Interactive Video Scene Generation with Surfel-Indexed View Memory
DayDreamer: World Models for Physical Robot Learning
ICCV 2025 | TesserAct: Learning 4D Embodied World Models
An open-source code repository of driving world models, with training, inference, and evaluation tools, plus pretrained checkpoints.
World Model based Autonomous Driving Platform in CARLA :car:
Official Code for Epona: Autoregressive Diffusion World Model for Autonomous Driving (ICCV 2025)
A comprehensive list of papers investigating physical cognition in video generation, including papers, codes, and related websites.