ARXIV:2603.01305 · VISUAL ANOMALY SEGMENTATION · SUBMITTED 19 MAR · 21:31 UTC · FRESHNESS STALE

Verified · Source: PDF linked
Partial · PaperPack: 3 of 4 citation fields filled
Missing · Missing fields: authors
Error · Proof: failed

AG-VAS: Anchor-Guided Zero-Shot Visual Anomaly Segmentation with Large Multimodal Models

arXiv

AG-VAS offers advanced zero-shot visual anomaly segmentation for industrial and medical applications using multimodal models.

Blocked on Code · Score 8.0 · Evidence failed

Opportunity summary

Pain AG-VAS offers advanced zero-shot visual anomaly segmentation for industrial and medical applications using multimodal models.

Evidence 0 refs | 0 sources | 33% coverage

Blocker Evidence failed

PROBLEM

Existing LMM-based segmentation approaches still face fundamental limitations: anomaly concepts are inherently abstract and context-dependent, lacking stable visual…

METHOD

Full abstract

Large multimodal models (LMMs) exhibit strong task generalization capabilities, offering new opportunities for zero-shot visual anomaly segmentation (ZSAS). However, existing LMM-based segmentation approaches still face fundamental limitations: anomaly concepts are inherently abstract and context-dependent, lacking stable visual prototypes, and the weak alignment between high-level semantic embeddings and pixel-level spatial features hinders precise anomaly localization. To address these challenges, we present AG-VAS (Anchor-Guided Visual Anomaly Segmentation), a new framework that expands the LMM vocabulary with three learnable semantic anchor tokens ([SEG], [NOR], and [ANO]), establishing a unified anchor-guided segmentation paradigm. Specifically, [SEG] serves as an absolute semantic anchor that translates abstract anomaly semantics into explicit, spatially grounded visual entities (e.g., holes or scratches), while [NOR] and [ANO] act as relative anchors that model the contextual contrast between normal and abnormal patterns across categories. To further enhance cross-modal alignment, we introduce a Semantic-Pixel Alignment Module (SPAM) that aligns language-level semantic embeddings with high-resolution visual features, along with an Anchor-Guided Mask Decoder (AGMD) that performs anchor-conditioned mask prediction for precise anomaly localization. In addition, we curate Anomaly-Instruct20K, a large-scale instruction dataset that organizes anomaly knowledge into structured descriptions of appearance, shape, and spatial attributes, facilitating effective learning and integration of the proposed semantic anchors. Extensive experiments on six industrial and medical benchmarks demonstrate that AG-VAS achieves consistent state-of-the-art performance in the zero-shot setting.
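
To make the anchor idea in the abstract concrete, here is a toy sketch in plain Python. It is not the paper's SPAM or AGMD, whose internals the abstract does not specify. It only illustrates two steps under stated assumptions: appending the three anchor tokens to a token-to-id vocabulary, and scoring each pixel by contrasting its similarity to a normal anchor against its similarity to an abnormal anchor. The function names, the dot-product similarity, the two-way softmax, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
import math

# Anchor token names come from the abstract; everything else here is illustrative.
ANCHOR_TOKENS = ["[SEG]", "[NOR]", "[ANO]"]


def expand_vocabulary(vocab, new_tokens):
    """Return a copy of vocab (token -> id) with unseen tokens given fresh ids."""
    expanded = dict(vocab)
    for token in new_tokens:
        if token not in expanded:
            # Next free id is one past the largest id currently in use.
            expanded[token] = max(expanded.values(), default=-1) + 1
    return expanded


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def anchor_contrast_mask(pixel_features, nor_anchor, ano_anchor, threshold=0.5):
    """Assumed scoring rule: a two-way softmax over each pixel's similarity to
    the normal and abnormal anchor embeddings, thresholded into a binary mask.

    pixel_features is an H x W grid of feature vectors (nested lists).
    """
    probs, mask = [], []
    for row in pixel_features:
        prob_row = []
        for feat in row:
            s_nor = dot(feat, nor_anchor)
            s_ano = dot(feat, ano_anchor)
            # Softmax over two logits equals a sigmoid of their difference.
            prob_row.append(1.0 / (1.0 + math.exp(s_nor - s_ano)))
        probs.append(prob_row)
        mask.append([1 if p > threshold else 0 for p in prob_row])
    return probs, mask


if __name__ == "__main__":
    vocab = expand_vocabulary({"<pad>": 0, "hello": 1}, ANCHOR_TOKENS)
    print(vocab)

    # 2 x 2 image with 2-dimensional features per pixel (made-up values).
    features = [
        [[1.0, 0.0], [0.0, 1.0]],
        [[1.0, 0.0], [0.0, 2.0]],
    ]
    probs, mask = anchor_contrast_mask(features, [1.0, 0.0], [0.0, 1.0])
    print(mask)
```

In a real LMM the anchor embeddings would be learned rows of the embedding table and the pixel features would come from a vision encoder; here both are hand-written vectors so the contrast between the normal and abnormal anchors is easy to follow.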

RESULT

ScienceToStartup currently rates this 8.0/10 on the public viability pass. Extensive experiments on six industrial and medical benchmarks demonstrate that AG-VAS achieves consistent state-of-the-art performance in the zero-shot setting.

WHY NOW

Visual Anomaly Segmentation moved forward this cycle; last verified April 2026. Public score 8.0/10.

Opportunity summary

Score 8.0

Pain AG-VAS offers advanced zero-shot visual anomaly segmentation for industrial and medical applications using multimodal models.

Evidence 0 refs | 0 sources | 33% coverage

Blocker missing authors

Competitive landscape

Segment: Visual Anomaly Segmentation

Adoption evidence: No public code link in the paper record yet

Commercial read: 8.0/10 public viability

Direct: not classified

Adjacent: not classified

Substitute: not classified

Unknown: not classified

{ "@context": "https://schema.org", "@graph": [ { "@type": "WebPage", "@id": "https://sciencetostartup.com/paper/ag-vas-anchor-guided-zero-shot-visual-anomaly-segmentation-with-large-multimodal-models#webpage", "url": "https://sciencetostartup.com/paper/ag-vas-anchor-guided-zero-shot-visual-anomaly-segmentation-with-large-multimodal-models", "name": "AG-VAS: Anchor-Guided Zero-Shot Visual Anomaly Segmentation with Large Multimodal Models", "description": "AG-VAS offers advanced zero-shot visual anomaly segmentation for industrial and medical applications using multimodal models.", "isPartOf": { "@id": "https://sciencetostartup.com/#website" } }, { "@type": "ScholarlyArticle", "@id": "https://sciencetostartup.com/paper/ag-vas-anchor-guided-zero-shot-visual-anomaly-segmentation-with-large-multimodal-models#scholarlyArticle", "headline": "AG-VAS: Anchor-Guided Zero-Shot Visual Anomaly Segmentation with Large Multimodal Models", "description": "AG-VAS offers advanced zero-shot visual anomaly segmentation for industrial and medical applications using multimodal models.", "url": "https://sciencetostartup.com/paper/ag-vas-anchor-guided-zero-shot-visual-anomaly-segmentation-with-large-multimodal-models", "sameAs": "https://arxiv.org/abs/2603.01305", "identifier": { "@type": "PropertyValue", "propertyID": "arXiv", "value": "2603.01305" }, "isAccessibleForFree": true, "isPartOf": { "@id": "https://sciencetostartup.com/#website" }, "datePublished": "2026-03-01T22:25:23.000Z", "additionalProperty": [ { "@type": "PropertyValue", "propertyID": "viabilityScore", "value": 8 }, { "@type": "PropertyValue", "propertyID": "researchDomain", "value": "Visual Anomaly Segmentation" } ] }, { "@type": "BreadcrumbList", "itemListElement": [ { "@type": "ListItem", "position": 1, "name": "Home", "item": "https://sciencetostartup.com" }, { "@type": "ListItem", "position": 2, "name": "Visual Anomaly Segmentation", "item": "https://sciencetostartup.com/topics" }, { "@type": "ListItem", 
"position": 3, "name": "AG-VAS: Anchor-Guided Zero-Shot Visual Anomaly Segmentation ", "item": "https://sciencetostartup.com/paper/ag-vas-anchor-guided-zero-shot-visual-anomaly-segmentation-with-large-multimodal-models" } ] } ] }
