ARXIV:2602.11799 · AI FOR RECOMMENDATIONS · SUBMITTED 17 MAR · 19:46 UTC · FRESHNESS STALE

Verified · Source: PDF linked | Partial · PaperPack: 3 of 4 citation fields filled | Missing · Missing fields: authors | Error · Proof: failed

Hi-SAM: A Hierarchical Structure-Aware Multi-modal Framework for Large-Scale Recommendation

Q: What products could be built from this research?

This framework can be productized by integrating it into the existing recommendation systems of online platforms such as social media, e-commerce, or streaming services, where it can use multi-modal data to improve personalization and user engagement.

Q: What are the practical use cases?

A recommendation engine for a music streaming service that offers users personalized song suggestions based on both their listening behavior and metadata such as song cover art and descriptions.

Q: What industries could this research disrupt?

Hi-SAM could replace traditional recommendation systems that rely heavily on sparse IDs by providing richer, multi-modal insights, improving recommendation quality especially in cold-start scenarios.

arXiv

Hi-SAM leverages multi-modal data to enhance large-scale recommendation systems for improved user engagement.

Blocked on Code › Score 8.0 · Evidence failed

Opportunity summary

Pain Hi-SAM leverages multi-modal data to enhance large-scale recommendation systems for improved user engagement.

Evidence 0 refs | 0 sources | 33% coverage

Blocker Evidence failed


PROBLEM

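Semantic ID-based multi-modal recommenders face two problems. (1) Suboptimal tokenization: existing methods such as RQ-VAE do not disentangle shared cross-modal semantics from modality-specific details, which causes redundancy or collapse. (2) Architecture-data mismatch: vanilla Transformers treat semantic IDs as flat streams and ignore the hierarchy of user interactions, items, and tokens, so expanding each item into multiple tokens lengthens the sequence, adds noise, and biases attention toward local details over holistic semantics.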

METHOD

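Hi-SAM combines two components. The Disentangled Semantic Tokenizer (DST) unifies modalities via geometry-aware alignment and quantizes them coarse-to-fine: shared codebooks distill cross-modal consensus, modality-specific codebooks recover nuances from the residuals, and mutual information minimization enforces the separation. The Hierarchical Memory-Anchor Transformer (HMAT) splits positional encoding into inter- and intra-item subspaces via Hierarchical RoPE and inserts Anchor Tokens that condense each item into compact memory, so the model keeps full detail for the current item and reads history only through compressed summaries.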

Full abstract

Multi-modal recommendation has gained traction as items possess rich attributes like text and images. Semantic ID-based approaches effectively discretize this information into compact tokens. However, two challenges persist: (1) Suboptimal Tokenization: existing methods (e.g., RQ-VAE) lack disentanglement between shared cross-modal semantics and modality-specific details, causing redundancy or collapse; (2) Architecture-Data Mismatch: vanilla Transformers treat semantic IDs as flat streams, ignoring the hierarchy of user interactions, items, and tokens. Expanding items into multiple tokens amplifies length and noise, biasing attention toward local details over holistic semantics. We propose Hi-SAM, a Hierarchical Structure-Aware Multi-modal framework with two designs: (1) Disentangled Semantic Tokenizer (DST): unifies modalities via geometry-aware alignment and quantizes them via a coarse-to-fine strategy. Shared codebooks distill consensus while modality-specific ones recover nuances from residuals, enforced by mutual information minimization; (2) Hierarchical Memory-Anchor Transformer (HMAT): splits positional encoding into inter- and intra-item subspaces via Hierarchical RoPE to restore hierarchy. It inserts Anchor Tokens to condense items into compact memory, retaining details for the current item while accessing history only through compressed summaries. Experiments on real-world datasets show consistent improvements over SOTA baselines, especially in cold-start scenarios. Deployed on a large-scale social platform serving millions of users, Hi-SAM achieved a 6.55% gain in the core online metric.
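The coarse-to-fine quantization that the abstract attributes to DST can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the authors' implementation: the mean of the modality vectors stands in for the geometry-aware alignment, the mutual information term is omitted, and the function names and codebook values are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def nearest(codebook: Sequence[Vector], x: Vector) -> int:
    """Index of the codebook entry closest to x (squared Euclidean distance)."""
    def dist(entry: Vector) -> float:
        return sum((a - b) ** 2 for a, b in zip(entry, x))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def tokenize(
    modal_vecs: Dict[str, Vector],
    shared_cb: Sequence[Vector],
    specific_cbs: Dict[str, Sequence[Vector]],
) -> Tuple[int, Dict[str, int]]:
    """Return (shared token id, {modality: modality-specific token id})."""
    dim = len(shared_cb[0])
    count = len(modal_vecs)
    # Coarse step (assumption): the mean across modalities approximates the
    # cross-modal consensus and is quantized with the shared codebook.
    consensus: List[float] = [
        sum(vec[d] for vec in modal_vecs.values()) / count for d in range(dim)
    ]
    shared_id = nearest(shared_cb, consensus)
    # Fine step: each modality quantizes what the shared code did not explain.
    specific_ids: Dict[str, int] = {}
    for name, vec in modal_vecs.items():
        residual = [a - b for a, b in zip(vec, shared_cb[shared_id])]
        specific_ids[name] = nearest(specific_cbs[name], residual)
    return shared_id, specific_ids


# Hypothetical 2-D codebooks and one item with text and image embeddings.
shared_cb = [[0.0, 0.0], [1.0, 1.0]]
specific_cbs = {
    "text": [[0.0, 0.0], [0.2, -0.2]],
    "image": [[0.0, 0.0], [-0.2, 0.2]],
}
item = {"text": [1.2, 0.8], "image": [0.8, 1.2]}
print(tokenize(item, shared_cb, specific_cbs))  # (1, {'text': 1, 'image': 1})
```

In this toy case both modalities share token 1 for the common content, while each modality gets its own token for the residual that the shared code leaves behind.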

RESULT

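ScienceToStartup currently rates this 8.0/10 on the public viability pass. Per the abstract, experiments on real-world datasets show consistent improvements over SOTA baselines, especially in cold-start scenarios, and a deployment on a large-scale social platform serving millions of users achieved a 6.55% gain in the core online metric.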

WHY NOW

AI for Recommendations moved forward this cycle; last verified April 2026. Public score 8.0/10.


Opportunity summary

Score 8.0

Pain Hi-SAM leverages multi-modal data to enhance large-scale recommendation systems for improved user engagement.

Evidence 0 refs | 0 sources | 33% coverage

Blocker missing authors

Analysis summary

Hi-SAM leverages multi-modal data to enhance large-scale recommendation systems for improved user engagement.


Paper Pack

10.48550/arXiv.2602.11799

Hi-SAM: A Hierarchical Structure-Aware Multi-modal Framework for Large-Scale Recommendation

Hi-SAM leverages multi-modal data to enhance large-scale recommendation systems for improved user engagement.

Abstract

Source availability

PDF linked

The paper record includes a public PDF URL.

Extraction status

Derived fallback

Read summaries are estimated from adjacent metadata, not verified extraction rows.

Proof status

failed

0 refs; 0 sources; 33% coverage.

What was readable

linked · on file · not materialized · 8 extracted · 42 indexed · not indexed

Derived fallback: Estimated from adjacent evidence; not verified from source.

Viability

8.0

Time to MVP

MVP estimate missing

Commercial

No commercial flags on file


Claim map

Strong 8 · Mixed 0 · Weak 0

Evidence (partial): Disentangled Semantic Tokenizer (DST): unifies modalities via geometry-aware alignment and quantizes them via a coarse-to-fine strategy. Shared codebooks distill consensus while modality-specific ones recover nuances from residuals
Implication (partial): Directly stated in the abstract with specific technical details about the method's design
Verification (partial): partial

Evidence (partial): Hierarchical Memory-Anchor Transformer (HMAT): splits positional encoding into inter- and intra-item subspaces via Hierarchical RoPE to restore hierarchy
Implication (partial): Directly stated in the abstract with specific technical details about the method's design
Verification (partial): partial

Evidence (partial): Deployed on a large-scale social platform serving millions of users, Hi-SAM achieved a 6.55% gain in the core online metric
Implication (partial): Directly stated in the abstract with specific numeric evidence from real-world deployment
Verification (partial): partial

Evidence (partial): Experiments on real-world datasets show consistent improvements over SOTA baselines, especially in cold-start scenarios
Implication (partial): Directly stated in the abstract with supporting evidence from experiments
Verification (partial): partial

Evidence (partial): existing methods (e.g., RQ-VAE) lack disentanglement between shared cross-modal semantics and modality-specific details, causing redundancy or collapse
Implication (partial): Directly stated in the abstract as a limitation of existing methods that Hi-SAM addresses
Verification (partial): partial

Evidence (partial): vanilla Transformers treat semantic IDs as flat streams, ignoring the hierarchy of user interactions, items, and tokens. Expanding items into multiple tokens amplifies length and noise, biasing attention toward local details over holistic semantics
Implication (partial): Directly stated in the abstract as a technical limitation that Hi-SAM addresses
Verification (partial): partial

Evidence (partial): The system's success heavily depends on the quality and richness of multi-modal input data, and its performance may degrade if data is sparse or inconsistent across modalities
Implication (partial): Directly stated in the analysis section as a caveat of the system
Verification (partial): partial

Evidence (partial): It inserts Anchor Tokens to condense items into compact memory, retaining details for the current item while accessing history only through compressed summaries
Implication (partial): Directly stated in the abstract with specific technical details about the method's design
Verification (partial): partial
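The two HMAT claims above (hierarchical positions and anchor-token memory) can be made concrete with a small sketch. Everything here is an assumption for illustration: placing one anchor after each item's tokens, giving half of the vector dimensions to each positional subspace, and the function names are all hypothetical, since the abstract does not specify these details.

```python
import math
from typing import List, Sequence, Tuple

Position = Tuple[int, int, bool]  # (inter-item index, intra-item index, is_anchor)


def build_layout(num_items: int, tokens_per_item: int) -> List[Position]:
    """Flatten a history of items into token positions, one anchor per item."""
    layout: List[Position] = []
    for item in range(num_items):
        for offset in range(tokens_per_item):
            layout.append((item, offset, False))
        layout.append((item, tokens_per_item, True))  # anchor closes the item
    return layout


def anchor_mask(layout: Sequence[Position]) -> List[List[bool]]:
    """Causal mask: full detail within the current item, anchors only for history."""
    size = len(layout)
    mask = [[False] * size for _ in range(size)]
    for q in range(size):
        q_item = layout[q][0]
        for k in range(q + 1):  # causal: keys at or before the query
            k_item, _, k_is_anchor = layout[k]
            mask[q][k] = (k_item == q_item) or k_is_anchor
    return mask


def rope(vec: Sequence[float], pos: int, base: float = 10000.0) -> List[float]:
    """Standard rotary embedding on consecutive pairs of an even-length vector."""
    dim = len(vec)
    out: List[float] = []
    for i in range(0, dim, 2):
        theta = pos * base ** (-i / dim)
        c, s = math.cos(theta), math.sin(theta)
        out += [vec[i] * c - vec[i + 1] * s, vec[i] * s + vec[i + 1] * c]
    return out


def hierarchical_rope(vec: Sequence[float], inter: int, intra: int) -> List[float]:
    """Rotate the first half by the inter-item position, the second half by the
    intra-item position (the half/half split is an assumption). The vector
    length must be divisible by 4."""
    half = len(vec) // 2
    return rope(vec[:half], inter) + rope(vec[half:], intra)


layout = build_layout(num_items=2, tokens_per_item=2)
mask = anchor_mask(layout)
print(mask[3])  # [False, False, True, True, False, False]
```

The printed row belongs to the first token of the second item: it sees itself and the first item's anchor, but not the first item's individual tokens, which is the "history only through compressed summaries" behaviour the claim describes.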

Constellation map

Paper-native neighborhood for concepts, methods, materials, markets, and competitors. Missing lanes stay labeled instead of disappearing behind commercialization gates.

Open full Signal Canvas

Concepts

not indexed

Methods

Multi-modal recommendation has gained traction as items possess rich attributes like text and images. Semantic ID-based approaches effectively discretize this information into compact tokens.

Materials

PDF linked

Markets

AI for Recommendations

Competitors

not indexed

Competitive landscape

Hi-SAM leverages multi-modal data to enhance large-scale recommendation systems for improved user engagement.

Segment

AI for Recommendations

Adoption evidence

No public code link in the paper record yet

Commercial read

8.0/10 public viability

Direct

not classified

Adjacent

not classified

Substitute

not classified

Unknown

not classified

Buzz

No indexed public discussion is attached to 2602.11799 yet. That is a visibility signal, not a blank module: the monitor is watching the public channels below.

Hacker News

Not indexed yet

Bluesky

Not indexed yet



CITED BY

No citing papers are indexed in the public S2S graph yet. This is an explicit zero-signal state, not a hidden lookup.

Foundation

Prior Work: VLM4Rec: Multimodal Semantic Representation for Recommendation with Large Vision-Language Models | 8.0

Prior Work: Anchored Alignment: Preventing Positional Collapse in Multimodal Recommender Systems | 8.0

Extension

Builds On This: StructSAM: Structure- and Spectrum-Preserving Token Merging for Segment Anything Models | 7.0

Builds On This: TokenFormer: Unify the Multi-Field and Sequential Recommendation Worlds | 7.0

Builds On This: Rethinking Generative Recommender Tokenizer: Recsys-Native Encoding and Semantic Quantization Beyond LLMs | 7.0

Builds On This: ET-SAM: Efficient Point Prompt Prediction in SAM for Unified Scene Text Detection and Layout Analysis | 7.0

Builds On This: VLM2Rec: Resolving Modality Collapse in Vision-Language Model Embedders for Multimodal Sequential Recommendation | 7.0

Builds On This: HFP-SAM: Hierarchical Frequency Prompted SAM for Efficient Marine Animal Segmentation | 7.0

Builds On This: PRISM: Purified Representation and Integrated Semantic Modeling for Generative Sequential Recommendation | 5.0

Builds On This: Asymmetric Generative Recommendation via Multi-Expert Projection and Multi-Faceted Hierarchical Quantization | 5.0

Commercially relevant

none indexed

Conflicting

none indexed


Agent drawer

5 surfaces preserved for agents. Humans can ignore.

Developer contracts, payload previews, evidence maps, and run controls stay here instead of the Read, Build, and Track workspace.

Run context

Paper: 2602.11799
Route: /paper/hi-sam-a-hierarchical-structure-aware-multi-modal-framework-for-large-scale-recommendation
Active tab: read
Artifact: hi-sam-a-hierarchical-structure-aware-multi-modal-framework-for-large-scale-recommendation

Available agents

Read extractor
Build planner
Track monitor
Competitive mapper
Related-paper scout

API/MCP endpoints

REST paper pack API: /api/v1/paper/hi-sam-a-hierarchical-structure-aware-multi-modal-framework-for-large-scale-recommendation/paper-pack
REST build passport API: /api/v1/paper/hi-sam-a-hierarchical-structure-aware-multi-modal-framework-for-large-scale-recommendation/build-passport
REST OpenAPI: /api/openapi.json
MCP descriptor: /api/mcp
MCP resource: sciencetostartup://surfaces/paper-workspace

Tool contracts

paper_pack, build_passport, opportunity_kernel, foresight, source_proof, evidence_state

Payload preview

Inspect payload

{
  "contract_version": "paper-r2",
  "paper_id": "1ef11a8d-daf1-408d-a523-6fecf27d40a2",
  "arxiv_id": "2602.11799",
  "canonical_route": "/paper/hi-sam-a-hierarchical-structure-aware-multi-modal-framework-for-large-scale-recommendation",
  "active_tab": "synced from current hash by the drawer client",
  "selected_artifact": "hi-sam-a-hierarchical-structure-aware-multi-modal-framework-for-large-scale-recommendation",
  "endpoints": {
    "paper_pack": "/api/v1/paper/hi-sam-a-hierarchical-structure-aware-multi-modal-framework-for-large-scale-recommendation/paper-pack",
    "build_passport": "/api/v1/paper/hi-sam-a-hierarchical-structure-aware-multi-modal-framework-for-large-scale-recommendation/build-passport",
    "mcp_resource": "sciencetostartup://surfaces/paper-workspace"
  }
}
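The payload preview above suggests that the REST endpoints are derived from the canonical route. The following is a minimal sketch of a consistency check under that assumption; the `/api/v1` prefix and the `paper-pack` and `build-passport` suffixes are read off the preview, while the function names and the idea that this rule holds for every paper are hypothetical.

```python
import json
from typing import Dict, List

API_PREFIX = "/api/v1"  # as seen in the preview's endpoint values


def expected_endpoints(canonical_route: str) -> Dict[str, str]:
    """Endpoints implied by the canonical route (assumed derivation rule)."""
    return {
        "paper_pack": f"{API_PREFIX}{canonical_route}/paper-pack",
        "build_passport": f"{API_PREFIX}{canonical_route}/build-passport",
    }


def check_payload(raw: str) -> List[str]:
    """Return human-readable problems; an empty list means the payload is consistent."""
    payload = json.loads(raw)
    problems: List[str] = []
    if payload.get("contract_version") != "paper-r2":
        problems.append("unexpected contract_version")
    endpoints = payload.get("endpoints", {})
    route = payload.get("canonical_route", "")
    for name, url in expected_endpoints(route).items():
        if endpoints.get(name) != url:
            problems.append(f"endpoint mismatch: {name}")
    return problems


# Shortened, hypothetical payload with the same shape as the preview.
sample = json.dumps({
    "contract_version": "paper-r2",
    "canonical_route": "/paper/example-slug",
    "endpoints": {
        "paper_pack": "/api/v1/paper/example-slug/paper-pack",
        "build_passport": "/api/v1/paper/example-slug/build-passport",
    },
})
print(check_payload(sample))  # []
```

A check like this is only useful if the derivation rule really is part of the paper-r2 contract, which the page does not state; the OpenAPI document listed under the endpoints would be the place to confirm it.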

Schema validation

paper-r2 contract: present
JSON-LD twin: SSR emitted
OpenAPI path parity: /api/openapi.json
MCP resource parity: paper-workspace

Job trace

queued: drawer opened by user action
running: inspect or copy payload
succeeded: payload available in SSR
failed: route errors appear in evidence cards

Evidence map

sources used: page freshness, source proof anchors, JSON-LD
missing sources: exposed by PaperPack and EvidenceState chips
derived fallbacks: marked unverified before handoff

Page Freshness

Canonical route, proof status, last verified, refs, sources, and coverage.

Paper proof surface

Canonical route: /paper/hi-sam-a-hierarchical-structure-aware-multi-modal-framework-for-large-scale-recommendation

degraded

Proof freshness: stale
Proof status: failed
Display score: 8/10
Last proof check: 2026-03-17
Score updated: 2026-04-02
Score fresh until: 2026-05-02
References: 0
Source count: 0
Coverage: 33%

This page has proof data, but the latest verification did not complete cleanly.

OpenAlex: pending — this preprint is not yet indexed by OpenAlex.

Agent Handoff

Endpoint list, payload shape, route context, and copyable handoff data.

Hi-SAM: A Hierarchical Structure-Aware Multi-modal Framework for Large-Scale Recommendation

Canonical ID hi-sam-a-hierarchical-structure-aware-multi-modal-framework-for-large-scale-recommendation | Route /paper/hi-sam-a-hierarchical-structure-aware-multi-modal-framework-for-large-scale-recommendation

REST example

curl https://sciencetostartup.com/api/v1/agent-handoff/paper/hi-sam-a-hierarchical-structure-aware-multi-modal-framework-for-large-scale-recommendation
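The same request can be issued from Python with the standard library. This is a sketch, assuming the endpoint returns a JSON body and needs no authentication, neither of which the page confirms; `handoff_url` and `fetch_handoff` are hypothetical helper names.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

BASE_URL = "https://sciencetostartup.com"


def handoff_url(paper_ref: str) -> str:
    """Build the agent-handoff URL for a paper reference (the route slug)."""
    return f"{BASE_URL}/api/v1/agent-handoff/paper/{quote(paper_ref, safe='')}"


def fetch_handoff(paper_ref: str, timeout: float = 10.0) -> dict:
    """GET the handoff document; assumes a JSON response (not confirmed by the page)."""
    request = Request(handoff_url(paper_ref), headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


PAPER_REF = (
    "hi-sam-a-hierarchical-structure-aware-multi-modal-framework"
    "-for-large-scale-recommendation"
)
print(handoff_url(PAPER_REF))
```

Calling `fetch_handoff(PAPER_REF)` performs the network request; it is left out of the example so the snippet runs offline.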

MCP example

{
  "tool": "get_paper",
  "arguments": {
    "arxiv_id": "2602.11799"
  }
}

source_context

{
  "surface": "paper",
  "mode": "paper",
  "query": "Hi-SAM: A Hierarchical Structure-Aware Multi-modal Framework for Large-Scale Recommendation",
  "normalized_query": "2602.11799",
  "route": "/paper/hi-sam-a-hierarchical-structure-aware-multi-modal-framework-for-large-scale-recommendation",
  "paper_ref": "hi-sam-a-hierarchical-structure-aware-multi-modal-framework-for-large-scale-recommendation",
  "topic_slug": null,
  "benchmark_ref": null,
  "dataset_ref": null
}

Buildability Receipt

Verdict, compute envelope, blockers, signature state, and receipt links.

Paper proof page receipt window

Watch and verify: Hi-SAM: A Hierarchical Structure-Aware Multi-modal Framework for Large-Scale Recommendation

/buildability/hi-sam-a-hierarchical-structure-aware-multi-modal-framework-for-large-scale-recommendation

Watch · watch

Subject: Hi-SAM: A Hierarchical Structure-Aware Multi-modal Framework for Large-Scale Recommendation

Verdict

Watch

Verdict is Watch because viability or proof quality is intermediate and should be re-evaluated before execution.

Time to first demo

Insufficient data

No first-demo timestamp, owner estimate, or elapsed demo receipt is attached to this surface.

Compute envelope

Structured compute envelope

Insufficient data

No data, compute, hardware, memory, latency, dependency, or serving requirement receipt is attached.

Evidence ids

Receipt path

/buildability/hi-sam-a-hierarchical-structure-aware-multi-modal-framework-for-large-scale-recommendation

Paper ref

hi-sam-a-hierarchical-structure-aware-multi-modal-framework-for-large-scale-recommendation

arXiv id

2602.11799

Freshness

Generated at

2026-03-17T19:46:04.153Z

Evidence freshness

stale

Last verification

2026-03-17T19:46:04.153Z

Sources

References

Coverage

33%

Hash state

Lineage hash

7613d6d577cdb37d99ecaab6a8b30259818bfc7defa2c3155ac2bcda8fd0ebd6

Canonical opportunity-kernel lineage hash.

Signature state

External signature

unsigned_external

No founder, registry, pilot, or production-adoption signature is attached to this receipt.

Verification

not_verified

Verification is blocked until an external signature is provided.

Blockers

Missing: repo_url
Missing: references
Missing: distribution_readiness_scores
Missing: paper_extraction_scorecards
Unknown: distribution readiness has not been computed yet

Verification pending / evidence receipt incomplete

repo_url

references

Missing proof, requirement, signature, approval, adoption, or telemetry fields are blockers and must not be inferred.


Source Proof anchors

Visual citations from the paper document graph.

JSON-LD twin

The application/ld+json payload rendered for agents.

{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "WebPage",
      "@id": "https://sciencetostartup.com/paper/hi-sam-a-hierarchical-structure-aware-multi-modal-framework-for-large-scale-recommendation#webpage",
      "url": "https://sciencetostartup.com/paper/hi-sam-a-hierarchical-structure-aware-multi-modal-framework-for-large-scale-recommendation",
      "name": "Hi-SAM: A Hierarchical Structure-Aware Multi-modal Framework for Large-Scale Recommendation",
      "description": "Hi-SAM leverages multi-modal data to enhance large-scale recommendation systems for improved user engagement.",
      "isPartOf": {
        "@id": "https://sciencetostartup.com/#website"
      }
    },
    {
      "@type": "ScholarlyArticle",
      "@id": "https://sciencetostartup.com/paper/hi-sam-a-hierarchical-structure-aware-multi-modal-framework-for-large-scale-recommendation#scholarlyArticle",
      "headline": "Hi-SAM: A Hierarchical Structure-Aware Multi-modal Framework for Large-Scale Recommendation",
      "description": "Hi-SAM leverages multi-modal data to enhance large-scale recommendation systems for improved user engagement.",
      "url": "https://sciencetostartup.com/paper/hi-sam-a-hierarchical-structure-aware-multi-modal-framework-for-large-scale-recommendation",
      "sameAs": "https://arxiv.org/abs/2602.11799",
      "identifier": {
        "@type": "PropertyValue",
        "propertyID": "arXiv",
        "value": "2602.11799"
      },
      "isAccessibleForFree": true,
      "isPartOf": {
        "@id": "https://sciencetostartup.com/#website"
      },
      "datePublished": "2026-02-12T10:26:15.000Z",
      "author": [
        {
          "@type": "Person",
          "name": "Pingjun Pan",
          "affiliation": {
            "@type": "Organization",
            "name": "NetEase Cloud Music, NetEase"
          }
        },
        {
          "@type": "Person",
          "name": "Tingting Zhou",
          "affiliation": {
            "@type": "Organization",
            "name": "NetEase Cloud Music, NetEase"
          }
        },
        {
          "@type": "Person",
          "name": "Peiyao Lu",
          "affiliation": {
            "@type": "Organization",
            "name": "NetEase Cloud Music, NetEase"
          }
        },
        {
          "@type": "Person",
          "name": "Tingting Fei",
          "affiliation": {
            "@type": "Organization",
            "name": "NetEase Cloud Music, NetEase"
          }
        },
        {
          "@type": "Person",
          "name": "Hongxiang Chen",
          "affiliation": {
            "@type": "Organization",
            "name": "NetEase Cloud Music, NetEase"
          }
        },
        {
          "@type": "Person",
          "name": "Chuanjiang Luo",
          "affiliation": {
            "@type": "Organization",
            "name": "NetEase Cloud Music, NetEase"
          }
        }
      ],
      "citation": [
        {
          "@type": "ScholarlyArticle",
          "identifier": {
            "@type": "PropertyValue",
            "propertyID": "SemanticScholar",
            "value": "591c56d316c4c04dc9625397aa56e026363ec49f"
          },
          "url": "https://www.semanticscholar.org/paper/591c56d316c4c04dc9625397aa56e026363ec49f"
        },
        {
          "@type": "ScholarlyArticle",
          "identifier": {
            "@type": "PropertyValue",
            "propertyID": "SemanticScholar",
            "value": "dee40ee2d11b76d887d1ac008687dcbe4ab470af"
          },
          "url": "https://www.semanticscholar.org/paper/dee40ee2d11b76d887d1ac008687dcbe4ab470af"
        },
        {
          "@type": "ScholarlyArticle",
          "identifier": {
            "@type": "PropertyValue",
            "propertyID": "SemanticScholar",
            "value": "1ef78e6684b233d54a9111bbf5a448256346533f"
          },
          "url": "https://www.semanticscholar.org/paper/1ef78e6684b233d54a9111bbf5a448256346533f"
        },
        {
          "@type": "ScholarlyArticle",
          "identifier": {
            "@type": "PropertyValue",
            "propertyID": "SemanticScholar",
            "value": "6198c35af6b3dd77718802f49b94a0d9b0cb4d4c"
          },
          "url": "https://www.semanticscholar.org/paper/6198c35af6b3dd77718802f49b94a0d9b0cb4d4c"
        },
        {
          "@type": "ScholarlyArticle",
          "identifier": {
            "@type": "PropertyValue",
            "propertyID": "SemanticScholar",
            "value": "af68798a6abb93e82ad4f85234d155c64f6c341f"
          },
          "url": "https://www.semanticscholar.org/paper/af68798a6abb93e82ad4f85234d155c64f6c341f"
        },
        {
          "@type": "ScholarlyArticle",
          "identifier": {
            "@type": "PropertyValue",
            "propertyID": "SemanticScholar",
            "value": "6ec13b203f9c6ab17f11b16cc9954d8a9c5dd4e1"
          },
          "url": "https://www.semanticscholar.org/paper/6ec13b203f9c6ab17f11b16cc9954d8a9c5dd4e1"
        },
        {
          "@type": "ScholarlyArticle",
          "identifier": {
            "@type": "PropertyValue",
            "propertyID": "SemanticScholar",
            "value": "13965d8d68217308ff8c7e738ced637739c6a1b8"
          },
          "url": "https://www.semanticscholar.org/paper/13965d8d68217308ff8c7e738ced637739c6a1b8"
        },
        {
          "@type": "ScholarlyArticle",
          "identifier": {
            "@type": "PropertyValue",
            "propertyID": "SemanticScholar",
            "value": "b5948add54dc0e44cd40b9747523d2494453b8b4"
          },
          "url": "https://www.semanticscholar.org/paper/b5948add54dc0e44cd40b9747523d2494453b8b4"
        },
        {
          "@type": "ScholarlyArticle",
          "identifier": {
            "@type": "PropertyValue",
            "propertyID": "SemanticScholar",
            "value": "8f3b6a299098eb2e615e344b2f76a23dfca4d9ca"
          },
          "url": "https://www.semanticscholar.org/paper/8f3b6a299098eb2e615e344b2f76a23dfca4d9ca"
        },
        {
          "@type": "ScholarlyArticle",
          "identifier": {
            "@type": "PropertyValue",
            "propertyID": "SemanticScholar",
            "value": "5fc1a3a49e8f1d106118b69d1d6be3b6caa23da0"
          },
          "url": "https://www.semanticscholar.org/paper/5fc1a3a49e8f1d106118b69d1d6be3b6caa23da0"
        },
        {
          "@type": "ScholarlyArticle",
          "identifier": {
            "@type": "PropertyValue",
            "propertyID": "SemanticScholar",
            "value": "b486982fa7c68a8a08df1111ba9607119419c488"
          },
          "url": "https://www.semanticscholar.org/paper/b486982fa7c68a8a08df1111ba9607119419c488"
        },
        {
          "@type": "ScholarlyArticle",
          "identifier": {
            "@type": "PropertyValue",
            "propertyID": "SemanticScholar",
            "value": "a289100678e7d94af836d91cd48d7821ebc5b83d"
          },
          "url": "https://www.semanticscholar.org/paper/a289100678e7d94af836d91cd48d7821ebc5b83d"
        },
        {
          "@type": "ScholarlyArticle",
          "identifier": {
            "@type": "PropertyValue",
            "propertyID": "SemanticScholar",
            "value": "11628f656257e75e46447ac21cdaa86c4b340a0a"
          },
          "url": "https://www.semanticscholar.org/paper/11628f656257e75e46447ac21cdaa86c4b340a0a"
        },
        {
          "@type": "ScholarlyArticle",
          "identifier": {
            "@type": "PropertyValue",
            "propertyID": "SemanticScholar",
            "value": "81a3f2545103d640397c36afa244a825f108e452"
          },
          "url": "https://www.semanticscholar.org/paper/81a3f2545103d640397c36afa244a825f108e452"
        },
        {
          "@type": "ScholarlyArticle",
          "identifier": {
            "@type": "PropertyValue",
            "propertyID": "SemanticScholar",
            "value": "f6fbc035aa6b04fa384fa993d1703e0c1ad00688"
          },
          "url": "https://www.semanticscholar.org/paper/f6fbc035aa6b04fa384fa993d1703e0c1ad00688"
        },
        {
          "@type": "ScholarlyArticle",
          "identifier": {
            "@type": "PropertyValue",
            "propertyID": "SemanticScholar",
            "value": "3f5b31c4f7350dc88002c121aecbdc82f86eb5bb"
          },
          "url": "https://www.semanticscholar.org/paper/3f5b31c4f7350dc88002c121aecbdc82f86eb5bb"
        },
        {
          "@type": "ScholarlyArticle",
          "identifier": {
            "@type": "PropertyValue",
            "propertyID": "SemanticScholar",
            "value": "5cb1872a34e1755d41ed9cd481fbeb33d0665b5f"
          },
          "url": "https://www.semanticscholar.org/paper/5cb1872a34e1755d41ed9cd481fbeb33d0665b5f"
        },
        {
          "@type": "ScholarlyArticle",
          "identifier": {
            "@type": "PropertyValue",
            "propertyID": "SemanticScholar",
            "value": "93f9d29445a1236c0b1ab45026c2e308b9b74c15"
          },
          "url": "https://www.semanticscholar.org/paper/93f9d29445a1236c0b1ab45026c2e308b9b74c15"
        },
        {
          "@type": "ScholarlyArticle",
          "identifier": {
            "@type": "PropertyValue",
            "propertyID": "SemanticScholar",
            "value": "e7f3f2b77994c2cabad612ba0881e63763ec2dad"
          },
          "url": "https://www.semanticscholar.org/paper/e7f3f2b77994c2cabad612ba0881e63763ec2dad"
        },
        {
          "@type": "ScholarlyArticle",
          "identifier": {
            "@type": "PropertyValue",
            "propertyID": "SemanticScholar",
            "value": "48967206cd2781e2a390754a5ca79230b8735d42"
          },
          "url": "https://www.semanticscholar.org/paper/48967206cd2781e2a390754a5ca79230b8735d42"
        }
      ],
      "additionalProperty": [
        {
          "@type": "PropertyValue",
          "propertyID": "viabilityScore",
          "value": 8
        },
        {
          "@type": "PropertyValue",
          "propertyID": "researchDomain",
          "value": "AI for Recommendations"
        }
      ]
    },
    {
      "@type": "BreadcrumbList",
      "itemListElement": [
        {
          "@type": "ListItem",
          "position": 1,
          "name": "Home",
          "item": "https://sciencetostartup.com"
        },
        {
          "@type": "ListItem",
          "position": 2,
          "name": "AI for Recommendations",
          "item": "https://sciencetostartup.com/topics"
        },
        {
          "@type": "ListItem",
          "position": 3,
          "name": "Hi-SAM: A Hierarchical Structure-Aware Multi-modal Framework",
          "item": "https://sciencetostartup.com/paper/hi-sam-a-hierarchical-structure-aware-multi-modal-framework-for-large-scale-recommendation"
        }
      ]
    },
    {
      "@type": "FAQPage",
      "mainEntity": [
        {
          "@type": "Question",
          "name": "What is the startup potential of \"Hi-SAM: A Hierarchical Structure-Aware Multi-modal Framework\"?",
          "acceptedAnswer": {
            "@type": "Answer",
            "text": "Hi-SAM leverages multi-modal data to enhance large-scale recommendation systems for improved user engagement."
          }
        },
        {
          "@type": "Question",
          "name": "What products could be built from this research?",
          "acceptedAnswer": {
            "@type": "Answer",
            "text": "This framework can be productized by integrating into existing recommendation systems of online platforms such as social media, e-commerce, or streaming services to harness multi-modal data for better personalization and user engagement."
          }
        },
        {
          "@type": "Question",
          "name": "What are the practical use cases?",
          "acceptedAnswer": {
            "@type": "Answer",
            "text": "A recommendation engine for a music streaming service that offers users personalized song suggestions based on both their listening behavior and metadata such as song cover art and descriptions."
          }
        },
        {
          "@type": "Question",
          "name": "What industries could this research disrupt?",
          "acceptedAnswer": {
            "@type": "Answer",
            "text": "Hi-SAM could replace traditional recommendation systems that rely heavily on sparse IDs by providing richer, multi-modal insights, improving recommendation quality especially in cold-start scenarios."
          }
        }
      ]
    }
  ]
}
