ARXIV:2602.22427 · AI SECURITY · SUBMITTED 19 MAR · 21:31 UTC · FRESHNESS STALE

Source: PDF linked (Verified) · PaperPack: 3 of 4 citation fields filled (Partial) · Missing fields: authors (Missing) · Proof: failed (Error)

HubScan: Detecting Hubness Poisoning in Retrieval-Augmented Generation Systems

arXiv

HubScan detects and mitigates hubness poisoning attacks in retrieval-augmented generation systems for secure AI data access.

Blocked on Code · Score 9.0 · Evidence failed

Opportunity summary

Pain HubScan detects and mitigates hubness poisoning attacks in retrieval-augmented generation systems for secure AI data access.

Evidence 0 refs | 0 sources | 33% coverage

Blocker Evidence failed

PROBLEM

HubScan detects and mitigates hubness poisoning attacks in retrieval-augmented generation systems for secure AI data access. Retrieval-augmented generation systems encounter a significant security flaw: hubness, that is, items that frequently appear in the top-k retrieval…
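To make the notion of a hub concrete, the following is a minimal sketch (not the paper's implementation) that counts how many queries retrieve each document in their top-k under cosine similarity. The function names `cosine` and `k_occurrence` and the toy vectors are assumptions made purely for illustration.

```python
import heapq
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def k_occurrence(docs, queries, k):
    """For each document index, count the queries that retrieve it in their top-k."""
    counts = Counter()
    for q in queries:
        scored = [(cosine(q, d), i) for i, d in enumerate(docs)]
        for _, i in heapq.nlargest(k, scored):
            counts[i] += 1
    return [counts[i] for i in range(len(docs))]


# Hypothetical toy corpus: three documents on separate axes plus one
# document that is moderately similar to every query direction.
docs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 1.0],  # document 3 behaves like a hub
]
queries = [
    [1.0, 0.1, 0.0],
    [0.1, 1.0, 0.0],
    [0.0, 0.1, 1.0],
    [1.0, 0.0, 0.1],
]
counts = k_occurrence(docs, queries, k=2)
print(counts)  # document 3 is retrieved by every query
```

In this toy setup each query retrieves its own on-axis document plus document 3, so document 3 accumulates the highest count. A real index would use an approximate nearest-neighbor search rather than this brute-force loop.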

METHOD

Full abstract

Retrieval-Augmented Generation (RAG) systems are essential to contemporary AI applications, allowing large language models to obtain external knowledge via vector similarity search. Nevertheless, these systems encounter a significant security flaw: hubness, in which certain items appear in the top-k retrieval results for a disproportionately high number of varied queries. These hubs can be exploited to introduce harmful content, alter search rankings, bypass content filtering, and degrade system performance. We introduce HubScan, an open-source security scanner that evaluates vector indices and embeddings to identify hubs in RAG systems. HubScan presents a multi-detector architecture that integrates: (1) robust statistical hubness detection using median/MAD-based z-scores, (2) cluster spread analysis to assess cross-cluster retrieval patterns, (3) stability testing under query perturbations, and (4) domain-aware and modality-aware detection for category-specific and cross-modal attacks. Our solution supports several vector databases (FAISS, Pinecone, Qdrant, Weaviate) and offers flexible retrieval techniques, including vector similarity, hybrid search, and lexical matching with reranking capabilities. We evaluate HubScan on Food-101, MS-COCO, and FiQA adversarial hubness benchmarks constructed using state-of-the-art gradient-optimized and centroid-based hub generation methods. HubScan achieves 90% recall at a 0.2% alert budget and 100% recall at 0.4%, with adversarial hubs ranking above the 99.8th percentile. Domain-scoped scanning recovers 100% of targeted attacks that evade global detection. Production validation on 1M real web documents from MS MARCO demonstrates significant score separation between clean documents and adversarial content. Our work provides a practical, extensible framework for detecting hubness threats in production RAG systems.
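The abstract's first detector and its alert budget can be illustrated with a short sketch. This is a generic median/MAD robust z-score followed by a top-fraction cutoff, not the authors' code. The function names, the 1.4826 consistency constant (the standard choice for normally distributed data), the zero-MAD fallback, and the synthetic hit counts are all assumptions for illustration.

```python
import math
import statistics


def robust_z_scores(counts):
    """Median/MAD z-scores for per-document top-k hit counts."""
    med = statistics.median(counts)
    mad = statistics.median([abs(c - med) for c in counts])
    # 1.4826 rescales MAD to match a standard deviation under normality.
    scale = 1.4826 * mad
    if scale == 0:
        # Assumption: fall back to unit scale when more than half the
        # counts equal the median, so scores stay finite.
        scale = 1.0
    return [(c - med) / scale for c in counts]


def flag_by_alert_budget(scores, budget):
    """Return indices of the top `budget` fraction of items, highest score first."""
    n_alerts = max(1, math.ceil(budget * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:n_alerts]


# Synthetic example: 999 ordinary documents with small hit counts and one
# planted hub (index 999) that is retrieved by 400 queries.
counts = [(i % 7) + 2 for i in range(999)] + [400]
scores = robust_z_scores(counts)
flagged = flag_by_alert_budget(scores, budget=0.002)  # 0.2% alert budget
print(flagged[0], round(scores[flagged[0]], 1))
```

Median and MAD are used instead of mean and standard deviation because a handful of extreme hubs would otherwise inflate the scale estimate and mask themselves. With a 0.2% budget over 1,000 documents only two items are reviewed, and the planted hub ranks first.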

RESULT

ScienceToStartup currently rates this 9.0/10 on the public viability pass.

WHY NOW

AI Security moved forward this cycle; last verified April 2026. Public score 9.0/10.

Competitive landscape

Segment

AI Security

Adoption evidence

No public code link in the paper record yet

Commercial read

9.0/10 public viability

Direct

not classified

Adjacent

not classified

Substitute

not classified

Unknown

not classified

Authors

Idan Habler (Cisco), Vineeth Sai Narajala (Cisco), Stav Koren (Tel Aviv University), Amy Chang (Cisco), Tiffany Saade (Cisco)

Published

2026-02-25 21:37 UTC · https://arxiv.org/abs/2602.22427

What products could be built from this research?

Transform HubScan into a plug-in or standalone security tool for AI-driven applications that utilize RAG systems, with integration options for common vector databases like FAISS and Weaviate.

What are the practical use cases?

Commercial cybersecurity software for companies using RAG systems to prevent data poisoning attacks, ensuring reliable AI outputs.

What industries could this research disrupt?

Replaces manual oversight and traditional security protocols in AI systems which are ineffective in real-time detection of hubness attacks, offering automated, proactive threat detection.

