ARXIV:2605.10142 · COMPUTER VISION EXPLAINABILITY · SUBMITTED 12 MAY · 20:16 UTC · FRESHNESS FRESH

Verified Source: PDF linked · Verified PaperPack: citation fields available · Partial Proof: unverified proof status

Scaling Vision Models Does Not Consistently Improve Localisation-Based Explanation Quality

Mateusz Cedro · Marcin Chlebus · arXiv

This paper investigates how scaling computer vision models affects the quality of localisation-based explanations and finds that larger models do not consistently improve explanation accuracy.

Ship in 2-4 weeks · Score 4.0 · Evidence unverified

Opportunity summary

Pain: This paper investigates how scaling computer vision models affects the quality of localisation-based explanations and finds that larger models do not consistently improve explanation accuracy.

Evidence: 0 refs | 0 sources | 0% coverage

Blocker: Evidence unverified

PROBLEM

This paper investigates how scaling computer vision models affects the quality of localisation-based explanations and finds that larger models do not consistently improve explanation accuracy. We investigate this relationship by evaluating 11 computer vision models…

METHOD

Full abstract

Artificial intelligence models are increasingly scaled to improve predictive accuracy, yet it remains unclear whether scale improves the quality of post-hoc explanations. We investigate this relationship by evaluating 11 computer vision models representing increasing levels of depth and complexity within the ResNet, DenseNet, and Vision Transformer families, trained from scratch or pretrained, across three image datasets with ground-truth segmentation masks. For each model, we generate explanations using five post-hoc explainable AI methods and quantify mask alignment using two localisation metrics: Relevance Rank Accuracy (Arras et al., 2022) and the proposed Dual-Polarity Precision, which measures positive attributions inside the class mask and negative attributions outside it. Across datasets and methods, increasing architectural depth and parameter count does not improve explanation quality in most statistical comparisons, and smaller models often match or exceed deeper variants. While pretraining typically improves predictive performance and increases the dependence of explanations on learned weights, it does not consistently increase localisation scores. We also observe scenarios in which models achieve strong predictive performance while localisation precision is near zero, suggesting that performance metrics alone may not indicate whether predictions are based on the annotated regions. These results indicate that larger models do not reliably provide higher-quality explanations, and that explainability should therefore be assessed explicitly during model selection for safety-sensitive deployments.
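The two localisation metrics named in the abstract can be sketched as follows. Relevance Rank Accuracy follows the standard definition from Arras et al. (2022): take the K highest-attribution pixels, where K is the mask size, and report the fraction that fall inside the mask. The abstract describes Dual-Polarity Precision only in words, so the formula below is an assumption made for illustration (positive attribution mass inside the mask plus negative attribution mass outside it, divided by total absolute attribution) and may differ from the paper's actual definition. The function names and the toy 4x4 map are hypothetical.

```python
import numpy as np


def relevance_rank_accuracy(attribution: np.ndarray, mask: np.ndarray) -> float:
    """Fraction of the K highest-attribution pixels that lie inside the mask,
    where K is the number of mask pixels (Arras et al., 2022)."""
    flat_mask = mask.astype(bool).ravel()
    k = int(flat_mask.sum())
    if k == 0:
        return 0.0
    # Indices of the K largest attribution values (order inside the top K is irrelevant).
    top_k = np.argpartition(attribution.ravel(), -k)[-k:]
    return float(flat_mask[top_k].sum()) / k


def dual_polarity_precision(attribution: np.ndarray, mask: np.ndarray) -> float:
    """ASSUMED formulation, not taken from the paper: share of absolute
    attribution mass that is positive inside the mask or negative outside it."""
    inside = mask.astype(bool)
    total = float(np.abs(attribution).sum())
    if total == 0.0:
        return 0.0
    positive_inside = np.clip(attribution, 0, None)[inside].sum()
    negative_outside = np.clip(-attribution, 0, None)[~inside].sum()
    return float(positive_inside + negative_outside) / total


# Toy example: the object occupies the top-left 2x2 block of a 4x4 image.
mask = np.zeros((4, 4), dtype=bool)
mask[:2, :2] = True
attribution = np.array([
    [0.9,  0.8,  0.1,  0.0],
    [0.7,  0.2,  0.6, -0.3],
    [0.0, -0.1, -0.4,  0.0],
    [0.0,  0.0, -0.2,  0.0],
])

rra = relevance_rank_accuracy(attribution, mask)
dpp = dual_polarity_precision(attribution, mask)
print(f"RRA = {rra:.3f}, DPP (assumed form) = {dpp:.3f}")
```

In this toy map three of the four highest-ranked pixels fall inside the mask, so Relevance Rank Accuracy is 0.75. A classifier could still predict the correct class with an attribution map whose scores are close to zero, which is the kind of mismatch between predictive performance and localisation that the abstract reports.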

RESULT

ScienceToStartup currently rates this 4.0/10 on the public viability pass. Artificial intelligence models are increasingly scaled to improve predictive accuracy, yet it remains unclear whether scale improves the quality of post-hoc explanations. Code availability…

WHY NOW

Computer Vision Explainability moved forward this cycle; last verified May 2026. Public score 4.0/10. Production flags indicate code availability.

Competitive landscape

This paper investigates how scaling computer vision models affects the quality of localisation-based explanations and finds that larger models do not consistently improve explanation accuracy.

Segment

Computer Vision Explainability

Adoption evidence

No public code link in the paper record yet

Commercial read

4.0/10 public viability

Direct

not classified

Adjacent

not classified

Substitute

not classified

Unknown

not classified
