ARXIV:2603.28545 · ROBOTICS EVALUATION FRAMEWORK · SUBMITTED 31 MAR · 20:17 UTC · FRESHNESS STALE

Verified Source: PDF linked · Verified PaperPack: citation fields available · Partial Proof: unverified proof status

ManipArena: Comprehensive Real-world Evaluation of Reasoning-Oriented Generalist Robot Manipulation

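Yu Sun · Meng Cao · Ping Yang · Rongtao Xu · Yunxiao Yan · Runze Xu · Liang Ma · Roy Gan · Andy Zhai · Qingxuan Chen · Zunnan Xu · Hao Wang · Jincheng Yu · Lucy Liang · Qian Wang · Ivan Laptev · Ian D Reid · Xiaodan Liang at arXiv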

A standardized real-world evaluation framework for generalist robot manipulation models to bridge the simulation-to-reality gap.

Ship in 2-4 weeks · Score 7.0 · Evidence unverified

Opportunity summary

Pain A standardized real-world evaluation framework for generalist robot manipulation models to bridge the simulation-to-reality gap.

Evidence 17 refs | 3 sources | 50% coverage

Blocker Evidence unverified


PROBLEM

A standardized real-world evaluation framework for generalist robot manipulation models to bridge the simulation-to-reality gap. Existing benchmarks are largely simulator-centric, which provide controllability but fail to capture the reality gap caused by perception noise,…

METHOD

Full abstract

Vision-Language-Action (VLA) models and world models have recently emerged as promising paradigms for general-purpose robotic intelligence, yet their progress is hindered by the lack of reliable evaluation protocols that reflect real-world deployment. Existing benchmarks are largely simulator-centric, which provide controllability but fail to capture the reality gap caused by perception noise, complex contact dynamics, hardware constraints, and system latency. Moreover, fragmented real-world evaluations across different robot platforms prevent fair and reproducible comparison. To address these challenges, we introduce ManipArena, a standardized evaluation framework designed to bridge simulation and real-world execution. ManipArena comprises 20 diverse tasks across 10,812 expert trajectories emphasizing reasoning-oriented manipulation tasks requiring semantic and spatial reasoning, supports multi-level generalization through controlled out-of-distribution settings, and incorporates long-horizon mobile manipulation beyond tabletop scenarios. The framework further provides rich sensory diagnostics, including low-level motor signals, and synchronized real-to-sim environments constructed via high-quality 3D scanning. Together, these features enable fair, realistic, and reproducible evaluation for both VLA and world model approaches, providing a scalable foundation for diagnosing and advancing embodied intelligence systems.
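The abstract mentions multi-level generalization through controlled out-of-distribution settings. As a rough illustration only, the sketch below shows one way per-level success rates could be tabulated from evaluation rollouts. It is not the ManipArena API: the `Trial` record, the `success_rates` helper, the task name, and the level labels `in_distribution` and `ood` are all hypothetical, and the outcomes are made up for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    task: str      # hypothetical task name
    level: str     # hypothetical generalization-level label
    success: bool  # whether the rollout met the task's success criterion


def success_rates(trials):
    """Return {level: fraction of successful trials} over the given trials."""
    counts = defaultdict(lambda: [0, 0])  # level -> [successes, total]
    for t in trials:
        counts[t.level][0] += int(t.success)
        counts[t.level][1] += 1
    return {level: s / n for level, (s, n) in counts.items()}


# Made-up outcomes, purely illustrative (not results from the paper).
trials = [
    Trial("sort_blocks", "in_distribution", True),
    Trial("sort_blocks", "in_distribution", True),
    Trial("sort_blocks", "ood", True),
    Trial("sort_blocks", "ood", False),
]
rates = success_rates(trials)
```

Reporting one rate per level, rather than a single pooled number, keeps any drop between in-distribution and out-of-distribution settings visible.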

RESULT

ScienceToStartup currently rates this 7.0/10 on the public viability pass. ManipArena comprises 20 diverse tasks across 10,812 expert trajectories emphasizing reasoning-oriented manipulation tasks requiring semantic and spatial reasoning, supports multi-level generalization through controlled out-of-distribution…

WHY NOW

Robotics Evaluation Framework moved forward this cycle; last verified April 2026. Public score 7.0/10. Production flags indicate code availability.


Opportunity summary

Score 7.0

Pain A standardized real-world evaluation framework for generalist robot manipulation models to bridge the simulation-to-reality gap.

Evidence 17 refs | 3 sources | 50% coverage

Blocker no shell-level blocker reported

Analysis summary

A standardized real-world evaluation framework for generalist robot manipulation models to bridge the simulation-to-reality gap.


Competitive landscape

A standardized real-world evaluation framework for generalist robot manipulation models to bridge the simulation-to-reality gap.

Segment

Robotics Evaluation Framework

Adoption evidence

No public code link in the paper record yet

Commercial read

7.0/10 public viability

Direct

not classified

Adjacent

not classified

Substitute

not classified

Unknown

not classified


{ "@context": "https://schema.org", "@graph": [ { "@type": "WebPage", "@id": "https://sciencetostartup.com/paper/maniparena-comprehensive-real-world-evaluation-of-reasoning-oriented-generalist-robot-manipulation#webpage", "url": "https://sciencetostartup.com/paper/maniparena-comprehensive-real-world-evaluation-of-reasoning-oriented-generalist-robot-manipulation", "name": "ManipArena: Comprehensive Real-world Evaluation of Reasoning-Oriented Generalist Robot Manipulation", "description": "A standardized real-world evaluation framework for generalist robot manipulation models to bridge the simulation-to-reality gap.", "isPartOf": { "@id": "https://sciencetostartup.com/#website" } }, { "@type": "ScholarlyArticle", "@id": "https://sciencetostartup.com/paper/maniparena-comprehensive-real-world-evaluation-of-reasoning-oriented-generalist-robot-manipulation#scholarlyArticle", "headline": "ManipArena: Comprehensive Real-world Evaluation of Reasoning-Oriented Generalist Robot Manipulation", "description": "A standardized real-world evaluation framework for generalist robot manipulation models to bridge the simulation-to-reality gap.", "url": "https://sciencetostartup.com/paper/maniparena-comprehensive-real-world-evaluation-of-reasoning-oriented-generalist-robot-manipulation", "sameAs": "https://arxiv.org/abs/2603.28545", "identifier": { "@type": "PropertyValue", "propertyID": "arXiv", "value": "2603.28545" }, "isAccessibleForFree": true, "isPartOf": { "@id": "https://sciencetostartup.com/#website" }, "datePublished": "2026-03-30T15:06:41.000Z", "author": [ { "@type": "Person", "name": "Yu Sun" }, { "@type": "Person", "name": "Meng Cao" }, { "@type": "Person", "name": "Ping Yang" }, { "@type": "Person", "name": "Rongtao Xu" }, { "@type": "Person", "name": "Yunxiao Yan" }, { "@type": "Person", "name": "Runze Xu" }, { "@type": "Person", "name": "Liang Ma" }, { "@type": "Person", "name": "Roy Gan" }, { "@type": "Person", "name": "Andy Zhai" }, { "@type": "Person", "name": 
"Qingxuan Chen" }, { "@type": "Person", "name": "Zunnan Xu" }, { "@type": "Person", "name": "Hao Wang" }, { "@type": "Person", "name": "Jincheng Yu" }, { "@type": "Person", "name": "Lucy Liang" }, { "@type": "Person", "name": "Qian Wang" }, { "@type": "Person", "name": "Ivan Laptev" }, { "@type": "Person", "name": "Ian D Reid" }, { "@type": "Person", "name": "Xiaodan Liang" } ], "additionalProperty": [ { "@type": "PropertyValue", "propertyID": "viabilityScore", "value": 7 }, { "@type": "PropertyValue", "propertyID": "researchDomain", "value": "Robotics Evaluation Framework" }, { "@type": "PropertyValue", "propertyID": "commercialReadiness", "value": "code" } ] }, { "@type": "BreadcrumbList", "itemListElement": [ { "@type": "ListItem", "position": 1, "name": "Home", "item": "https://sciencetostartup.com" }, { "@type": "ListItem", "position": 2, "name": "Robotics Evaluation Framework", "item": "https://sciencetostartup.com/topics" }, { "@type": "ListItem", "position": 3, "name": "ManipArena: Comprehensive Real-world Evaluation of Reasoning", "item": "https://sciencetostartup.com/paper/maniparena-comprehensive-real-world-evaluation-of-reasoning-oriented-generalist-robot-manipulation" } ] } ] }



