ARXIV:2603.28458 · LLM OPTIMIZATION · SUBMITTED 31 MAR · 20:18 UTC · FRESHNESS STALE

Verified Source: PDF linked · Verified PaperPack: citation fields available · Partial Proof: unverified proof status

HISA: Efficient Hierarchical Indexing for Fine-Grained Sparse Attention

Yufei Xu · Fanxu Meng · Fan Jiang · Yuxuan Wang · Ruijie Zhou · Jiexi Wu · +8 at arXiv

HISA offers a drop-in replacement for sparse attention indexers, enabling 2-4x speedups at long context lengths with minimal quality loss.

Ship in 2-4 weeks · Score 7.0 · Evidence unverified

Opportunity summary

Pain HISA offers a drop-in replacement for sparse attention indexers, enabling 2-4x speedups at long context lengths with minimal quality loss.

Evidence 4 refs | 3 sources | 50% coverage

Blocker Evidence unverified

PROBLEM

While the downstream sparse attention scales efficiently, the indexer still scans the entire prefix…
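To make the cost of a full-prefix scan concrete, the sketch below counts how many key scores an indexer computes over a causal sequence of length L, once for a flat scan and once for a generic two-stage scheme. The function names, `block_size`, and `kept_blocks` are hypothetical and chosen only for illustration; they are not values from the paper, and operation counts are not the same thing as measured kernel speedups.

```python
import math


def flat_indexer_scores(seq_len: int) -> int:
    """Scores computed by a flat causal indexer: query t scores keys 1..t."""
    return seq_len * (seq_len + 1) // 2


def hierarchical_indexer_scores(seq_len: int, block_size: int, kept_blocks: int) -> int:
    """Upper bound on scores for an assumed two-stage scheme.

    Per query: one score per block representative in the prefix, then token
    scores inside at most `kept_blocks` surviving blocks.
    """
    total = 0
    for t in range(1, seq_len + 1):
        coarse = math.ceil(t / block_size)       # one score per prefix block
        fine = min(t, kept_blocks * block_size)  # tokens in surviving blocks
        total += coarse + fine
    return total


if __name__ == "__main__":
    # block_size and kept_blocks are illustrative assumptions only.
    for seq_len in (32_768, 131_072):
        flat = flat_indexer_scores(seq_len)
        hier = hierarchical_indexer_scores(seq_len, block_size=128, kept_blocks=16)
        print(seq_len, flat, hier, round(flat / hier, 1))
```

The flat count grows quadratically in L, while the two-stage count is dominated by a term linear in L once the prefix is much longer than `kept_blocks * block_size`, which is why the gap widens with context length.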

METHOD

Full abstract

Token-level sparse attention mechanisms, exemplified by DeepSeek Sparse Attention (DSA), achieve fine-grained key selection by scoring every historical token for each query using a lightweight indexer, and then computing attention only over the selected subset. While the downstream sparse attention scales efficiently, the indexer still scans the entire prefix for every query, introducing an $O(L^2)$ per-layer bottleneck that becomes prohibitive as context length grows. We propose HISA (Hierarchical Indexed Sparse Attention), a drop-in replacement for the indexer that transforms the search process from a flat token scan into a two-stage hierarchical procedure. First, a block-level coarse filter scores pooled block representatives to prune irrelevant regions. Then, a token-level refinement applies the original indexer only within the remaining candidate blocks. HISA preserves the exact token-level top-k sparsity pattern required by the downstream Sparse MLA operator and requires no additional training. On kernel-level benchmarks, HISA achieves a 2$\times$ speedup at 32K context length and 4$\times$ at 128K. On Needle-in-a-Haystack and LongBench, we directly replace the indexer in DeepSeek-V3.2 with HISA, without any fine-tuning. HISA closely matches the original DSA in quality while significantly outperforming block-sparse baselines. Moreover, the token selection sets produced by HISA and the original DSA exhibit a mean IoU greater than 99%, indicating that the efficiency gains come with virtually no impact on selection fidelity.
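The two-stage procedure described in the abstract can be sketched for a single query as follows. This is a minimal illustration, not the authors' implementation: a plain dot product stands in for the indexer's scoring function, mean pooling is an assumed choice of block representative, and the names `flat_select`, `hierarchical_select`, `block_size`, and `kept_blocks` are hypothetical. The `iou` helper shows how the overlap between the two selection sets could be measured.

```python
from typing import List, Sequence, Set

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Stand-in scoring function (assumption: the real indexer differs)."""
    return sum(x * y for x, y in zip(a, b))


def top_indices(scores: Sequence[float], k: int) -> List[int]:
    """Indices of the k largest scores, ties broken by lower index."""
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return order[:k]


def flat_select(query: Vector, keys: Sequence[Vector], k: int) -> Set[int]:
    """Flat indexer: score every key in the prefix, keep the top-k tokens."""
    scores = [dot(query, key) for key in keys]
    return set(top_indices(scores, k))


def mean_pool(block: Sequence[Vector]) -> List[float]:
    """Block representative as the mean key vector (assumed pooling)."""
    return [sum(column) / len(block) for column in zip(*block)]


def hierarchical_select(query: Vector, keys: Sequence[Vector], k: int,
                        block_size: int, kept_blocks: int) -> Set[int]:
    """Two-stage selection: coarse block filter, then token-level top-k."""
    # Stage 1: score one pooled representative per block, keep the best blocks.
    starts = list(range(0, len(keys), block_size))
    reps = [mean_pool(keys[s:s + block_size]) for s in starts]
    block_scores = [dot(query, rep) for rep in reps]
    chosen = top_indices(block_scores, kept_blocks)
    # Stage 2: apply the token-level scorer only inside the surviving blocks.
    candidates = [i for b in chosen
                  for i in range(starts[b], min(starts[b] + block_size, len(keys)))]
    cand_scores = [dot(query, keys[i]) for i in candidates]
    return {candidates[j] for j in top_indices(cand_scores, k)}


def iou(a: Set[int], b: Set[int]) -> float:
    """Intersection over union of two token index sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


if __name__ == "__main__":
    query = [1.0, 0.0]
    keys = [[0.1, 0.0], [0.2, 0.0], [0.9, 0.0], [0.8, 0.0],
            [0.0, 1.0], [0.1, 1.0], [0.7, 0.0], [0.6, 0.0]]
    flat = flat_select(query, keys, k=3)
    hier = hierarchical_select(query, keys, k=3, block_size=2, kept_blocks=2)
    print(sorted(flat), sorted(hier), iou(flat, hier))
```

In this toy setup the output of both paths is a set of token indices of size at most k, so either can feed the same downstream sparse attention step. How closely the two sets agree depends on how many blocks survive the coarse filter; pruning too aggressively drops tokens the flat scan would have kept.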

RESULT

ScienceToStartup currently rates this 7.0/10 on the public viability pass.

WHY NOW

LLM Optimization moved forward this cycle; last verified April 2026. Public score 7.0/10. Production flags indicate code availability.

Opportunity summary

Score 7.0

Pain HISA offers a drop-in replacement for sparse attention indexers, enabling 2-4x speedups at long context lengths with minimal quality loss.

Evidence 4 refs | 3 sources | 50% coverage

Blocker no shell-level blocker reported

Competitive landscape

Segment

LLM Optimization

Adoption evidence

No public code link in the paper record yet

Commercial read

7.0/10 public viability

Direct

not classified

Adjacent

not classified

Substitute

not classified

Unknown

not classified

Paper record

arXiv: https://arxiv.org/abs/2603.28458

Page: https://sciencetostartup.com/paper/hisa-efficient-hierarchical-indexing-for-fine-grained-sparse-attention

Published: 2026-03-30T13:59:51.000Z

Authors: Yufei Xu · Fanxu Meng · Fan Jiang · Yuxuan Wang · Ruijie Zhou · Jiexi Wu · Zhixin Pan · Zhaohui Wang · Xiaojuan Tang · Wenjie Pei · Tongxuan Liu · Di yin · Xing Sun · Muhan Zhang

Research domain: LLM Optimization

Viability score: 7

Commercial readiness: code
