ARXIV:2605.07640 · REMOTE SENSING INTERPRETATION · SUBMITTED 11 MAY · 20:40 UTC · FRESHNESS STALE

Verified Source: PDF linked | Verified PaperPack: citation fields available | Partial Proof: unverified proof status

LithoBench: Benchmarking Large Multimodal Models for Remote-Sensing Lithology Interpretation

Jun Wang · Fengpeng Li · Hang Dong · Tianjin Huang · Wei Han · arXiv

LithoBench is a multi-level benchmark for evaluating large multimodal models in remote sensing lithology interpretation, revealing significant limitations in geological semantic understanding.

Ship in 2-4 weeks | Score 7.0 | Evidence unverified

Opportunity summary

Pain LithoBench is a multi-level benchmark for evaluating large multimodal models in remote sensing lithology interpretation, revealing significant limitations in geological semantic understanding.

Evidence 0 refs | 3 sources | 50% coverage

Blocker Evidence unverified


PROBLEM

LithoBench is a multi-level benchmark for evaluating large multimodal models in remote sensing lithology interpretation, revealing significant limitations in geological semantic understanding. Unlike general land-cover recognition, lithology interpretation is a knowledge-intensive task that requires…

METHOD

Full abstract

Remote sensing lithology interpretation is fundamental to geological surveys, mineral exploration, and regional geological mapping. Unlike general land-cover recognition, lithology interpretation is a knowledge-intensive task that requires experts to infer rock types from various features, e.g., subtle visual, spectral, textural, geomorphological, and contextual cues, making reliable automated interpretation highly challenging. Geological knowledge-guided large multimodal models offer new opportunities, yet their evaluation remains constrained by the lack of benchmarks that capture lithological annotations, multi-level geological semantics, and expert-informed assessment. Here, we propose LithoBench, a multi-level benchmark for evaluating geological semantic understanding in remote sensing lithology interpretation. LithoBench contains 10,000 expert-annotated interpretation instances across 12 representative lithological categories, including 4,000 multiple-choice and 6,000 open-ended tasks organized into five cognitive levels: Identification and Description, Comparative Analysis, Mechanism Explanation, Practical Application, and Comprehensive Reasoning. We further develop an expert-in-the-loop, knowledge-grounded semi-automated construction pipeline, coupling multiple sub-processes, e.g., structured geological image descriptions, to enhance geological validity and evaluation reliability. Experiments with multiple large vision-language models reveal substantial limitations in geological semantic understanding, particularly on higher-order explanation, application, and reasoning tasks.
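The abstract describes multiple-choice tasks grouped into five cognitive levels, so one natural way to read results is accuracy per level. The following is a minimal sketch of such a tally, not the authors' evaluation code: the `MCItem` record, its field names, the `accuracy_by_level` helper, and the rock names in the demo are all hypothetical, since the abstract specifies neither a data schema nor the 12 category names. Only the five level names come from the source.

```python
from collections import defaultdict
from dataclasses import dataclass

# The five cognitive levels named in the abstract.
LEVELS = (
    "Identification and Description",
    "Comparative Analysis",
    "Mechanism Explanation",
    "Practical Application",
    "Comprehensive Reasoning",
)


@dataclass(frozen=True)
class MCItem:
    """Hypothetical record for one multiple-choice instance (assumed schema)."""
    level: str      # one of LEVELS
    lithology: str  # category label; the abstract does not list the 12 names
    answer: str     # gold option label, e.g. "B"


def accuracy_by_level(items, predictions):
    """Return {level: fraction correct} for every level that has items."""
    if len(items) != len(predictions):
        raise ValueError("items and predictions must have the same length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for item, pred in zip(items, predictions):
        if item.level not in LEVELS:
            raise ValueError(f"unknown level: {item.level!r}")
        total[item.level] += 1
        # Compare option labels case-insensitively, ignoring stray whitespace.
        if pred.strip().upper() == item.answer.strip().upper():
            correct[item.level] += 1
    return {lvl: correct[lvl] / total[lvl] for lvl in LEVELS if total[lvl]}


if __name__ == "__main__":
    # Illustrative items only; these are not taken from LithoBench.
    demo = [
        MCItem("Identification and Description", "granite", "A"),
        MCItem("Identification and Description", "basalt", "C"),
        MCItem("Comprehensive Reasoning", "limestone", "B"),
    ]
    print(accuracy_by_level(demo, ["a", "B", "B"]))
```

Reporting per level rather than one pooled number would make it visible whether a model degrades on the higher-order levels, which is the pattern the abstract reports. The open-ended tasks would need a different scoring method, which the abstract does not describe.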

RESULT

ScienceToStartup currently rates this 7.0/10 on the public viability pass. Experiments with multiple large vision-language models reveal substantial limitations in geological semantic understanding, particularly on higher-order explanation, application, and reasoning tasks. Code availability is…

WHY NOW

Remote Sensing Interpretation moved forward this cycle; last verified May 2026. Public score 7.0/10. Production flags indicate code availability.



Competitive landscape

LithoBench is a multi-level benchmark for evaluating large multimodal models in remote sensing lithology interpretation, revealing significant limitations in geological semantic understanding.

Segment: Remote Sensing Interpretation

Adoption evidence: No public code link in the paper record yet

Commercial read: 7.0/10 public viability

Direct: not classified

Adjacent: not classified

Substitute: not classified

Unknown: not classified


