ARXIV:2605.07209 · LLM HALLUCINATION DETECTION · SUBMITTED 11 MAY · 20:36 UTC · FRESHNESS STALE

Source: PDF linked (Verified) · PaperPack: citation fields available (Verified) · Proof: partial proof status (Partial)

Hallucination Detection via Activations of Open-Weight Proxy Analyzers

Akshita Singh · Prabesh Paudel · Siddhartha Roy · arXiv

A lightweight system that detects LLM hallucinations by reading already-generated text through a small open-weight model and analyzing that model's internal activations. It reports a higher token-level AUC than ReDeEP on RAGTruth.

Ship in 2-4 weeks · Score 7.0 · Evidence partial

Opportunity summary

Pain: A lightweight, open-weight model system that detects LLM hallucinations by analyzing generated text activations, outperforming existing methods.

Evidence: 0 refs | 4 sources | 83% coverage

Blocker: Evidence partial


PROBLEM

A lightweight, open-weight model system that detects LLM hallucinations by analyzing generated text activations, outperforming existing methods. Instead of looking inside the generating model, our system reads already-generated text through a small locally hosted…

METHOD

Full abstract

We introduce a proxy-analyzer framework for detecting hallucinations in large language models. Instead of looking inside the generating model, our system reads already-generated text through a small locally hosted open-weight model and spots hallucinations using the reader's own internal activations. This works just as well when the generator is a closed API like GPT-4 as when it is any open-weight model. We built eighteen features grounded in how transformers process text, covering residual stream norms, per-head source-document attention, entropy, MLP activations, logit-lens trajectories, and three new token-level grounding statistics. We trained a stacking ensemble on 72,135 samples from five hallucination datasets. We tested across seven analyzer architectures from 0.5 billion to 9 billion parameters: Qwen2.5 at 0.5B and 7B, Gemma-2 at 2B and 9B, Pythia at 1.4B, and LLaMA-3 at both 3B and 8B. Across all seven, we consistently beat ReDeEP's token-level AUC of 0.73 on RAGTruth by 7.4 to 10.3 percentage points. Qwen2.5-7B reached an F1 of 0.717, just above ReDeEP's 0.713, while Qwen2.5-0.5B hit 0.706. The most striking finding is how tightly all seven models cluster: AUC spans only 2.3 percentage points across an eighteen-fold difference in model size. Even more surprising, our 3B LLaMA outperforms our 8B LLaMA on RAGTruth, showing that bigger is not always better even within the same model family. Both RAGTruth and LLM-AggreFact include outputs from multiple LLM families, so our results are not skewed toward any particular generator.
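Two of the feature families named in the abstract, residual stream norms and entropy, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the analyzer model has already been run over the generated text and that its per-token hidden vectors and vocabulary logits are available as plain lists of floats. The function names (`token_entropy`, `l2_norm`, `token_features`) and the feature keys are hypothetical, chosen only for illustration.

```python
import math
from typing import Dict, List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one token's vocabulary logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def token_entropy(logits: Sequence[float]) -> float:
    """Shannon entropy (in nats) of the analyzer's next-token distribution."""
    return -sum(p * math.log(p) for p in softmax(logits) if p > 0.0)


def l2_norm(vector: Sequence[float]) -> float:
    """Euclidean norm of one residual-stream vector."""
    return math.sqrt(sum(x * x for x in vector))


def token_features(
    hidden_states: Sequence[Sequence[float]],
    logits: Sequence[Sequence[float]],
) -> List[Dict[str, float]]:
    """Build one feature dict per token (hypothetical feature names).

    hidden_states[i] is the analyzer's hidden vector at token i (one layer),
    logits[i] is the analyzer's vocabulary logits at token i.
    """
    if len(hidden_states) != len(logits):
        raise ValueError("hidden_states and logits must cover the same tokens")
    return [
        {"resid_norm": l2_norm(h), "entropy": token_entropy(z)}
        for h, z in zip(hidden_states, logits)
    ]


if __name__ == "__main__":
    # Toy values standing in for real analyzer outputs (assumed, not measured).
    toy_hidden = [[3.0, 4.0], [0.0, 1.0]]
    toy_logits = [[0.0, 0.0], [10.0, 0.0]]
    for index, feats in enumerate(token_features(toy_hidden, toy_logits)):
        print(index, feats)
```

In a fuller pipeline, per-token features like these would presumably be computed across layers and passed to a downstream classifier such as the stacking ensemble the abstract describes. How the paper aggregates them is not specified here, so that step is left out of the sketch.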

RESULT

ScienceToStartup currently rates this 7.0/10 on the public viability pass. Both RAGTruth and LLM-AggreFact include outputs from multiple LLM families, so our results are not skewed toward any particular generator. A public repository is…

WHY NOW

LLM Hallucination Detection moved forward this cycle; last verified May 2026. Public score 7.0/10. Implementation evidence is present through a linked repository.


Opportunity summary

Score: 7.0

Pain: A lightweight, open-weight model system that detects LLM hallucinations by analyzing generated text activations, outperforming existing methods.

Evidence: 0 refs | 4 sources | 83% coverage

Blocker: no shell-level blocker reported

Analysis summary

A lightweight, open-weight model system that detects LLM hallucinations by analyzing generated text activations, outperforming existing methods.


Competitive landscape

A lightweight, open-weight model system that detects LLM hallucinations by analyzing generated text activations, outperforming existing methods.

Segment: LLM Hallucination Detection

Adoption evidence: Public code linked for build inspection

Commercial read: 7.0/10 public viability

Direct: not classified

Adjacent: not classified

Substitute: not classified

Unknown: not classified

arXiv: https://arxiv.org/abs/2605.07209

Code repository: https://github.com/hallu-detect/llm_hallucination_detection

Published (page metadata): 2026-05-08T03:57:41Z

