ARXIV:2604.02020 · VISION-LANGUAGE MODELS · SUBMITTED 03 APR · 20:50 UTC · FRESHNESS STALE

Source: PDF linked (Verified) | PaperPack: citation fields available (Verified) | Proof: unverified proof status (Partial)

Are VLMs Lost Between Sky and Space? LinkS$^2$Bench for UAV-Satellite Dynamic Cross-View Spatial Intelligence

Dian Liu · Jie Feng · Di Li · Yuhui Zheng · Guanbin Li · Weisheng Dong · Guangming Shi

A new benchmark and adapter for Vision-Language Models to enable dynamic spatial intelligence between UAVs and satellites, addressing a critical gap in cross-view reasoning for emergency response and security.

Ship in 2-4 weeks · Score 7.0 · Evidence unverified

Opportunity summary

Pain: A new benchmark and adapter for Vision-Language Models to enable dynamic spatial intelligence between UAVs and satellites, addressing a critical gap in cross-view reasoning for emergency response and security.

Evidence: 0 refs | 0 sources | 33% coverage

Blocker: Evidence unverified

Full abstract

Synergistic spatial intelligence between UAVs and satellites is indispensable for emergency response and security operations, as it uniquely integrates macro-scale global coverage with dynamic, real-time local perception. However, the capacity of Vision-Language Models (VLMs) to master this complex interplay remains largely unexplored. This gap persists primarily because existing benchmarks are confined to isolated Unmanned Aerial Vehicle (UAV) videos or static satellite imagery, failing to evaluate the dynamic local-to-global spatial mapping essential for comprehensive cross-view reasoning. To bridge this gap, we introduce LinkS$^2$Bench, the first comprehensive benchmark designed to evaluate VLMs' wide-area, dynamic cross-view spatial intelligence. LinkS$^2$Bench links 1,022 minutes of dynamic UAV footage with high-resolution satellite imagery covering over 200 km$^2$. Through an LMM-assisted pipeline and rigorous human annotation, we constructed 17.9k high-quality question-answer pairs comprising 12 fine-grained tasks across four dimensions: perception, localization, relation, and reasoning. Evaluations of 18 representative VLMs reveal a substantial gap compared to human baselines, identifying accurate cross-view dynamic alignment as the critical bottleneck. To alleviate this, we design a Cross-View Alignment Adapter, demonstrating that explicit alignment significantly improves model performance. Furthermore, fine-tuning experiments underscore the potential of LinkS$^2$Bench in advancing VLM adaptation for complex spatial reasoning.
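The abstract describes question-answer pairs grouped into four dimensions (perception, localization, relation, reasoning) and a comparison of VLMs against human baselines. As a rough illustration of how per-dimension scores could be aggregated, here is a minimal sketch. The record layout (`dimension`, `answer`, `prediction`), the single-letter answer format, and the toy data are all assumptions made for this example; the actual LinkS$^2$Bench data format and scoring protocol are not given in the abstract.

```python
from collections import defaultdict

# The four dimensions named in the abstract.
DIMENSIONS = ("perception", "localization", "relation", "reasoning")


def accuracy_by_dimension(records):
    """Return {dimension: accuracy} for dimensions that have at least one record.

    Each record is a dict with hypothetical keys 'dimension', 'answer',
    and 'prediction' (assumed here to be short answer strings).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for r in records:
        dim = r["dimension"]
        if dim not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {dim!r}")
        total[dim] += 1
        # Case- and whitespace-insensitive exact match (an assumed scoring rule).
        if r["prediction"].strip().upper() == r["answer"].strip().upper():
            correct[dim] += 1
    return {d: correct[d] / total[d] for d in DIMENSIONS if total[d]}


# Toy records, invented for illustration only.
toy_records = [
    {"dimension": "perception", "answer": "A", "prediction": "a"},
    {"dimension": "perception", "answer": "B", "prediction": "C"},
    {"dimension": "localization", "answer": "D", "prediction": "D"},
    {"dimension": "reasoning", "answer": "B", "prediction": " B "},
]
scores = accuracy_by_dimension(toy_records)
print(scores)
```

Reporting one score per dimension, rather than a single pooled number, makes it easier to see which capability accounts for the gap to human baselines that the abstract reports.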

RESULT

ScienceToStartup currently rates this 7.0/10 on the public viability pass. The authors design a Cross-View Alignment Adapter and report that explicit alignment significantly improves model performance. Code availability is flagged in the production…

WHY NOW

Vision-Language Models moved forward this cycle; last verified April 2026. Public score 7.0/10. Production flags indicate code availability.

Competitive landscape

Segment: Vision-Language Models

Adoption evidence: No public code link in the paper record yet

Commercial read: 7.0/10 public viability

Direct: not classified

Adjacent: not classified

Substitute: not classified

Unknown: not classified

Paper record: arXiv 2604.02020 (https://arxiv.org/abs/2604.02020) · Published 2026-04-02 · Free to access · Authors: Dian Liu, Jie Feng, Di Li, Yuhui Zheng, Guanbin Li, Weisheng Dong, Guangming Shi
