ARXIV:2604.01765 · EMBODIED AI / DRIVING SIMULATION · SUBMITTED 03 APR · 20:50 UTC · FRESHNESS STALE

Verified · Source: PDF linked
Verified · PaperPack: citation fields available
Partial · Proof: unverified proof status

DriveDreamer-Policy: A Geometry-Grounded World-Action Model for Unified Generation and Planning

Yang Zhou · Xiaofeng Wang · Hao Shao · Letian Wang · Guosheng Zhao · Jiangnan Shao · +5 at arXiv

A geometry-grounded world-action model for unified driving simulation, future prediction, and motion planning.

Ship in 2-4 weeks · Score 7.0 · Evidence unverified

Opportunity summary

Pain: A geometry-grounded world-action model for unified driving simulation, future prediction, and motion planning.

Evidence: 0 refs | 0 sources | 33% coverage

Blocker: Evidence unverified


PROBLEM

Existing world-action model (WAM) approaches often focus on modeling 2D appearance or latent representations, with limited geometric grounding, an essential element for embodied…

METHOD

Full abstract

World-action models (WAMs) have recently emerged to bridge vision-language-action (VLA) models and world models, combining the reasoning and instruction-following capabilities of the former with the spatio-temporal world modeling of the latter. However, existing WAM approaches often focus on modeling 2D appearance or latent representations, with limited geometric grounding, an essential element for embodied systems operating in the physical world. We present DriveDreamer-Policy, a unified driving world-action model that integrates depth generation, future video generation, and motion planning within a single modular architecture. The model employs a large language model to process language instructions, multi-view images, and actions, followed by three lightweight generators that produce depth, future video, and actions. By learning a geometry-aware world representation and using it to guide both future prediction and planning within a unified framework, the proposed model produces more coherent imagined futures and more informed driving actions, while maintaining modularity and controllable latency. Experiments on the Navsim v1 and v2 benchmarks demonstrate that DriveDreamer-Policy achieves strong performance on both closed-loop planning and world generation tasks. In particular, our model reaches 89.2 PDMS on Navsim v1 and 88.7 EPDMS on Navsim v2, outperforming existing world-model-based approaches while producing higher-quality future video and depth predictions. Ablation studies further show that explicit depth learning provides complementary benefits to video imagination and improves planning robustness.
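The modular layout described in the abstract (a shared backbone followed by three lightweight generators for depth, future video, and actions) can be illustrated with a toy sketch. This is not the authors' implementation: every name here (Observation, backbone, depth_head, video_head, action_head, run, want_video) is hypothetical, the arithmetic is a placeholder for learned networks, and the choice to feed depth into both the video and action heads is an assumption based only on the abstract's statement that the geometry-aware representation guides prediction and planning.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence

Vector = List[float]


@dataclass
class Observation:
    instruction: str
    multi_view_images: Sequence[Vector]  # one toy feature vector per camera
    past_actions: Sequence[Vector]       # e.g. [steer, accel] per past step


@dataclass
class Outputs:
    depth: Optional[Vector] = None
    future_video: Optional[List[Vector]] = None
    actions: Optional[List[Vector]] = None


def mean(vectors: Sequence[Vector]) -> Vector:
    """Element-wise mean of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def backbone(obs: Observation) -> Vector:
    """Placeholder for the language-model backbone: fuse inputs into one context."""
    text_feature = float(len(obs.instruction.split()))
    return [text_feature] + mean(obs.multi_view_images) + mean(obs.past_actions)


def depth_head(context: Vector) -> Vector:
    """Placeholder depth generator: non-negative values derived from the context."""
    return [abs(x) for x in context]


def video_head(context: Vector, depth: Vector, horizon: int) -> List[Vector]:
    """Placeholder video generator: one 'frame' per step, conditioned on depth."""
    return [
        [c + 0.1 * t * d for c, d in zip(context, depth)]
        for t in range(1, horizon + 1)
    ]


def action_head(context: Vector, depth: Vector, horizon: int) -> List[Vector]:
    """Placeholder planner: one [steer, accel] pair per step, conditioned on depth."""
    clearance = min(depth)
    return [[0.0, clearance / (t + 1)] for t in range(horizon)]


def run(obs: Observation, horizon: int = 3, want_video: bool = True) -> Outputs:
    """Run the backbone once, then only the generators that were requested."""
    context = backbone(obs)
    depth = depth_head(context)
    out = Outputs(depth=depth)
    if want_video:  # skipping this head is one way to trade output for latency
        out.future_video = video_head(context, depth, horizon)
    out.actions = action_head(context, depth, horizon)
    return out
```

In this sketch, calling run(obs, want_video=False) returns depth and a plan without generating frames, which is one plausible reading of "modularity and controllable latency"; how the paper actually schedules its generators is not stated in the abstract.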

RESULT

ScienceToStartup currently rates this 7.0/10 on the public viability pass. Experiments on the Navsim v1 and v2 benchmarks demonstrate that DriveDreamer-Policy achieves strong performance on both closed-loop planning and world generation tasks. Code availability…

WHY NOW

Embodied AI / Driving Simulation moved forward this cycle; last verified April 2026. Public score 7.0/10. Production flags indicate code availability, although the paper record lists no public code link yet.


Opportunity summary

Score: 7.0

Pain: A geometry-grounded world-action model for unified driving simulation, future prediction, and motion planning.

Evidence: 0 refs | 0 sources | 33% coverage

Blocker: no shell-level blocker reported


Competitive landscape

A geometry-grounded world-action model for unified driving simulation, future prediction, and motion planning.

Segment: Embodied AI / Driving Simulation

Adoption evidence: No public code link in the paper record yet

Commercial read: 7.0/10 public viability

Direct: not classified

Adjacent: not classified

Substitute: not classified

Unknown: not classified


Paper record

Published: 2026-04-02 08:33:18 UTC

arXiv: https://arxiv.org/abs/2604.01765

Authors: Yang Zhou · Xiaofeng Wang · Hao Shao · Letian Wang · Guosheng Zhao · Jiangnan Shao · Jiagang Zhu · Tingdong Yu · Zheng Zhu · Guan Huang · Steven L. Waslander

Research domain: Embodied AI / Driving Simulation

Viability score: 7

Commercial readiness: code

Free to access: yes

