ARXIV:2604.12320 · VIDEO LLMS · SUBMITTED 15 APR · 16:59 UTC · FRESHNESS STALE

Verified · Source: PDF linked | Verified · PaperPack: citation fields available | Partial · Proof: unverified proof status

EgoEsportsQA: An Egocentric Video Benchmark for Perception and Reasoning in Esports

Jianzhe Ma · Zhonghao Cao · Shangkui Chen · Yichen Xu · Wenxuan Wang · Qin Jin · arXiv

A new benchmark and dataset for evaluating Video-LLMs in high-velocity esports environments, revealing significant performance gaps and guiding future development.

Ship in 2-4 weeks | Score 7.0 | Evidence unverified

Opportunity summary

Pain: A new benchmark and dataset for evaluating Video-LLMs in high-velocity esports environments, revealing significant performance gaps and guiding future development.

Evidence: 0 refs | 3 sources | 50% coverage

Blocker: Evidence unverified


PROBLEM

Existing benchmarks focus on daily activities, yet lack a rigorous testbed for evaluating fast,…

METHOD

Full abstract

While video large language models (Video-LLMs) excel in understanding slow-paced, real-world egocentric videos, their capabilities in high-velocity, information-dense virtual environments remain under-explored. Existing benchmarks focus on daily activities, yet lack a rigorous testbed for evaluating fast, rule-bound reasoning in virtual scenarios. To fill this gap, we introduce EgoEsportsQA, a pioneering video question-answering (QA) benchmark for grounding perception and reasoning in expert esports knowledge. We curate 1,745 high-quality QA pairs from professional matches across 3 first-person shooter games via a scalable six-stage pipeline. These questions are structured into a two-dimensional decoupled taxonomy: 11 sub-tasks in the cognitive capability dimension (covering perception and reasoning levels) and 6 sub-tasks in the esports knowledge dimension. Comprehensive evaluations of state-of-the-art Video-LLMs reveal that current models still fail to achieve satisfactory performance, with the best model reaching only 71.58%. The results expose notable gaps across both axes: models exhibit stronger capabilities in basic visual perception than in deep tactical reasoning, and they grasp overall macro-progression better than fine-grained micro-operations. Extensive ablation experiments demonstrate the intrinsic weaknesses of current Video-LLM architectures. Further analysis suggests that our dataset not only reveals the connections between real-world and virtual egocentric domains, but also offers guidance for optimizing downstream esports applications, thereby fostering the future advancement of Video-LLMs in various egocentric environments.
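The two-dimensional taxonomy described in the abstract can be pictured as each QA pair carrying two independent labels, one per axis, so that scores can be reported along either axis separately. The sketch below is illustrative only and is not the authors' evaluation code: the `QAItem` fields, the sub-task label strings, the toy questions, and the exact-match scoring rule are all assumptions, since the abstract does not give the data schema or the answer format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class QAItem:
    """One benchmark question; every field name here is hypothetical."""
    question_id: str
    cognitive_task: str   # label on the cognitive capability axis (assumed name)
    knowledge_task: str   # label on the esports knowledge axis (assumed name)
    answer: str           # gold answer, assumed comparable by exact match


def accuracy_by_axis(items, predictions, axis):
    """Return {sub-task label: accuracy in percent} along one taxonomy axis."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for item in items:
        label = getattr(item, axis)
        total[label] += 1
        # A missing prediction counts as wrong.
        if predictions.get(item.question_id) == item.answer:
            correct[label] += 1
    return {label: 100.0 * correct[label] / total[label] for label in total}


# Toy data invented for illustration; not taken from the dataset.
items = [
    QAItem("q1", "perception/example-a", "macro/example-x", "A"),
    QAItem("q2", "perception/example-a", "micro/example-y", "C"),
    QAItem("q3", "reasoning/example-b", "micro/example-y", "B"),
    QAItem("q4", "reasoning/example-b", "macro/example-x", "D"),
]
predictions = {"q1": "A", "q2": "C", "q3": "A", "q4": "D"}

cognitive = accuracy_by_axis(items, predictions, "cognitive_task")
knowledge = accuracy_by_axis(items, predictions, "knowledge_task")
print(cognitive)
print(knowledge)
```

Because the two labels are stored independently, the same set of predictions yields one breakdown per axis without re-running the model, which is what a decoupled taxonomy makes possible.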

RESULT

ScienceToStartup currently rates this 7.0/10 on the public viability pass. Comprehensive evaluations of state-of-the-art Video-LLMs reveal that current models still fail to achieve satisfactory performance, with the best model only 71.58%. Code availability is…

WHY NOW

Video LLMs moved forward this cycle; last verified April 2026. Public score 7.0/10. Production flags indicate code availability.




Competitive landscape


Segment: Video LLMs

Adoption evidence: No public code link in the paper record yet

Commercial read: 7.0/10 public viability

Direct: not classified

Adjacent: not classified

Substitute: not classified

Unknown: not classified


