ARXIV:2605.21443 · VISION-LANGUAGE MODELS · SUBMITTED 21 MAY · 20:27 UTC · FRESHNESS STALE

Verified · Source: PDF linked | Verified · PaperPack: citation fields available | Partial · Proof: unverified proof status

TempGlitch: Evaluating Vision-Language Models for Temporal Glitch Detection in Gameplay Videos

Yakun Yu · Ashley Wiens · Adrián Barahona-Ríos · Benedict Wilkins · Saman Zadtootaghaj · Nabajeet Barman · Cor-Paul Bezemer

A new benchmark and evaluation of vision-language models for detecting temporal glitches in gameplay videos, revealing current model limitations.

Ship in 2-4 weeks · Score 7.0 · Evidence unverified

Opportunity summary

Pain A new benchmark and evaluation of vision-language models for detecting temporal glitches in gameplay videos, revealing current model limitations.

Evidence 0 refs | 3 sources | 50% coverage

Blocker Evidence unverified

PROBLEM

A new benchmark and evaluation of vision-language models for detecting temporal glitches in gameplay videos, revealing current model limitations. Most existing evaluations, however, treat glitches as static visual anomalies, asking models to detect failures…

METHOD

Full abstract

Vision-language models (VLMs) are increasingly being explored for video game quality assurance, especially gameplay glitch detection. Most existing evaluations, however, treat glitches as static visual anomalies, asking models to detect failures from a single frame. We argue that this framing misses a key distinction: some glitches are spatial and visible in an isolated frame, whereas others are temporal and become evident only through changes across ordered frames. A preliminary study confirms this gap, showing that temporal glitches are substantially harder for VLMs to detect than spatial ones. To enable systematic evaluation of this underexplored setting, we introduce TempGlitch, a controlled gameplay video benchmark for temporal glitch detection. TempGlitch covers five temporal glitch types with balanced per-category samples, together with paired glitch-free videos that enable reliable binary evaluation. We evaluate 12 proprietary and open-weight VLMs across multiple frame-sampling settings. Our results show that current VLMs remain near chance on TempGlitch, often collapsing into either overly conservative behavior that misses most glitches or overly sensitive behavior that flags clean videos as glitchy. Moreover, denser frame sampling and larger model size do not reliably resolve these failures. TempGlitch provides a focused testbed for temporal reasoning, robust gameplay understanding, and automated glitch detection with VLMs. Code and data are available at the project website.
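The two failure modes described in the abstract can be made concrete with a small scoring sketch. The code below is not the authors' evaluation code; the metric names, the `BinaryReport` type, the `evaluate` helper, and the ten-video toy set are all hypothetical and assume only a balanced set of glitch videos paired with clean ones.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BinaryReport:
    accuracy: float             # share of all videos classified correctly
    glitch_recall: float        # share of glitch videos that were flagged
    false_positive_rate: float  # share of clean videos that were flagged


def evaluate(labels: Sequence[bool], predictions: Sequence[bool]) -> BinaryReport:
    """Score binary glitch predictions; True means 'glitch', False means 'clean'."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    n_glitch = sum(labels)
    n_clean = len(labels) - n_glitch
    if n_glitch == 0 or n_clean == 0:
        raise ValueError("need at least one glitch video and one clean video")
    true_pos = sum(1 for y, p in zip(labels, predictions) if y and p)
    false_pos = sum(1 for y, p in zip(labels, predictions) if not y and p)
    true_neg = n_clean - false_pos
    return BinaryReport(
        accuracy=(true_pos + true_neg) / len(labels),
        glitch_recall=true_pos / n_glitch,
        false_positive_rate=false_pos / n_clean,
    )


# Hypothetical toy set: five glitch videos, each paired with a clean one.
labels = [True] * 5 + [False] * 5
always_clean = evaluate(labels, [False] * 10)   # overly conservative model
always_glitch = evaluate(labels, [True] * 10)   # overly sensitive model
print(always_clean)
print(always_glitch)
```

On a balanced paired set like this one, both degenerate strategies score 50% accuracy, so accuracy alone cannot tell them apart. Reporting glitch recall alongside the false positive rate on clean videos separates a model that misses most glitches from one that flags everything.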

RESULT

ScienceToStartup currently rates this 7.0/10 on the public viability pass. To enable systematic evaluation of this underexplored setting, we introduce TempGlitch, a controlled gameplay video benchmark for temporal glitch detection. Code availability is flagged…

WHY NOW

Vision-Language Models moved forward this cycle; last verified May 2026. Public score 7.0/10. Production flags indicate code availability.

Opportunity summary

Score 7.0

Pain A new benchmark and evaluation of vision-language models for detecting temporal glitches in gameplay videos, revealing current model limitations.

Evidence 0 refs | 3 sources | 50% coverage

Blocker no shell-level blocker reported

Analysis summary

A new benchmark and evaluation of vision-language models for detecting temporal glitches in gameplay videos, revealing current model limitations.

Competitive landscape

A new benchmark and evaluation of vision-language models for detecting temporal glitches in gameplay videos, revealing current model limitations.

Segment: Vision-Language Models

Adoption evidence: No public code link in the paper record yet

Commercial read: 7.0/10 public viability

Direct: not classified

Adjacent: not classified

Substitute: not classified

Unknown: not classified

arXiv: https://arxiv.org/abs/2605.21443 · Published 2026-05-20T17:32:26Z · Authors: Yakun Yu, Ashley Wiens, Adrián Barahona-Ríos, Benedict Wilkins, Saman Zadtootaghaj, Nabajeet Barman, Cor-Paul Bezemer
