ARXIV:2604.00455 · VISION-LANGUAGE MODELS · SUBMITTED 03 APR · 20:30 UTC · FRESHNESS STALE

Verified · Source: PDF linked | Verified · PaperPack: citation fields available | Partial · Proof: unverified proof status

First Logit Boosting: Visual Grounding Method to Mitigate Object Hallucination in Large Vision-Language Models

Jiwoo Ha · Jongwoo Baek · Jinhyun So · arXiv

First Logit Boosting (FLB) is a training-free method for mitigating object hallucination in Large Vision-Language Models with negligible inference overhead.

Ship in 2-4 weeks · Score 7.0 · Evidence unverified

Opportunity summary

Pain: Object hallucination in Large Vision-Language Models; First Logit Boosting is proposed as a training-free mitigation with negligible inference overhead.

Evidence: 54 refs | 4 sources | 67% coverage

Blocker: Evidence unverified


PROBLEM

Object hallucination, the generation of nonexistent objects in answers, remains a persistent challenge for Large Vision-Language Models. First Logit Boosting is a training-free method that aims to mitigate it with negligible inference overhead.

METHOD

Full abstract

Recent Large Vision-Language Models (LVLMs) have demonstrated remarkable performance across various multimodal tasks that require understanding both visual and linguistic inputs. However, object hallucination -- the generation of nonexistent objects in answers -- remains a persistent challenge. Although several approaches such as retraining and external grounding methods have been proposed to mitigate this issue, they still suffer from high data costs or structural complexity. Training-free methods such as Contrastive Decoding (CD) are more cost-effective, avoiding additional training or external models, but still suffer from long-term decay, where visual grounding weakens and language priors dominate as the generation progresses. In this paper, we propose First Logit Boosting (FLB), a simple yet effective training-free technique designed to alleviate long-term decay in LVLMs. FLB stores the logit of the first generated token and adds it to subsequent token predictions, effectively mitigating long-term decay of visual information. We observe that FLB (1) sustains the visual information embedded in the first token throughout generation, and (2) suppresses hallucinated words through the stabilizing effect of the "The" token. Experimental results show that FLB significantly reduces object hallucination across various tasks, benchmarks, and backbone models. Notably, it causes negligible inference overhead, making it highly applicable to real-time multimodal systems. Code is available at https://github.com/jiwooha20/FLB.
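
The mechanism described in the abstract, storing the first generated token's logits and adding them to later predictions, can be illustrated with a toy greedy decoder. This is a minimal sketch under stated assumptions, not the authors' implementation (the linked repository is the reference for that). The names `flb_greedy_decode` and `toy_model`, the scaling weight `alpha`, and the three-token vocabulary are all hypothetical; the abstract does not say whether or how the stored logit is scaled.

```python
from typing import Callable, List, Optional, Sequence

Logits = List[float]


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value (first index wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def flb_greedy_decode(
    next_logits: Callable[[List[int]], Logits],
    prompt: List[int],
    max_new_tokens: int,
    alpha: float = 1.0,  # assumed boost weight; not specified in the abstract
) -> List[int]:
    """Greedy decoding that adds the first step's logits to every later step."""
    generated: List[int] = []
    first_logits: Optional[Logits] = None
    for _ in range(max_new_tokens):
        logits = next_logits(prompt + generated)
        if first_logits is None:
            # First generated token: store its logits, decode unmodified.
            first_logits = list(logits)
            adjusted = logits
        else:
            # Later tokens: boost with the stored first-step logits.
            adjusted = [l + alpha * f for l, f in zip(logits, first_logits)]
        generated.append(argmax(adjusted))
    return generated


def toy_model(context: List[int]) -> Logits:
    """Hypothetical stand-in for an LVLM over a 3-token vocabulary.

    At the first step (context is just the 1-token prompt) the logits favor
    tokens 0 and 1; at later steps the prior drifts toward token 2.
    """
    return [2.0, 1.5, 0.0] if len(context) == 1 else [0.0, 1.0, 1.2]


baseline = flb_greedy_decode(toy_model, [9], 3, alpha=0.0)  # boost disabled
boosted = flb_greedy_decode(toy_model, [9], 3, alpha=1.0)   # boost enabled
print(baseline, boosted)
```

In this toy setup, token 2 stands in for a word driven by the language prior that overtakes token 1 at later steps. Adding the stored first-step logits tips the choice back to token 1. Setting `alpha=0.0` recovers plain greedy decoding, which makes the comparison easy to run side by side.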

RESULT

ScienceToStartup currently rates this 7.0/10 on the public viability pass. Experimental results show that FLB significantly reduces object hallucination across various tasks, benchmarks, and backbone models. A public repository is linked, so build verification…

WHY NOW

Vision-Language Models moved forward this cycle; last verified April 2026. Public score 7.0/10. Implementation evidence is present through a linked repository.




Competitive landscape


Segment: Vision-Language Models

Adoption evidence: Public code linked for build inspection

Commercial read: 7.0/10 public viability

Direct: not classified

Adjacent: not classified

Substitute: not classified

Unknown: not classified


arXiv: https://arxiv.org/abs/2604.00455 · Published: 2026-04-01T04:05:50Z · Code repository: https://github.com/jiwooha20/FLB
