ARXIV:2603.23650 · MULTIMODAL EMOTION RECOGNITION · SUBMITTED 02 APR · 02:30 UTC · FRESHNESS STALE

Source: PDF linked (verified) | PaperPack: citation fields available (verified) | Proof: unverified proof status (partial)

Foundation Model Embeddings Meet Blended Emotions: A Multimodal Fusion Approach for the BLEMORE Challenge

Masoumeh Chapariniya · Aref Farhadipour · Sarah Ebling · Volker Dellwo · Teodora Vukovic · arXiv

A multimodal system for blended emotion recognition that leverages late fusion of specialized encoders, including a novel application of Gemini Embedding 2.0 for competitive accuracy with short video inputs.


Opportunity summary

Pain A multimodal system for blended emotion recognition that leverages late fusion of specialized encoders, including a novel application of Gemini Embedding 2.0 for competitive accuracy with short video inputs.

Evidence 0 refs | 0 sources | 17% coverage

Blocker Evidence unverified



METHOD

Full abstract

We present our system for the BLEMORE Challenge at FG 2026 on blended emotion recognition with relative salience prediction. Our approach combines six encoder families through late probability fusion: an S4D-ViTMoE face encoder adapted with soft-label KL training, frozen layer-selective Wav2Vec2 audio features, finetuned body-language encoders (TimeSformer, VideoMAE), and, for the first time in emotion recognition, Gemini Embedding 2.0, a large multimodal model whose video embeddings produce competitive presence accuracy (ACCP = 0.320) from only 2 seconds of input. Three key findings emerge from our experiments: selecting prosody-encoding layers (6-12) from frozen Wav2Vec2 outperforms end-to-end finetuning (Score 0.207 vs. 0.161), as the non-verbal nature of BLEMORE audio makes phonetic layers irrelevant; the post-processing salience threshold β varies from 0.05 to 0.43 across folds, revealing that personalized expression styles are the primary bottleneck; and task-adapted encoders collectively receive 62% of ensemble weight over general-purpose baselines. Our 12-encoder system achieves Score = 0.279 (ACCP = 0.391, ACCS = 0.168) on the test set, placing 6th.
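To make the abstract's "late probability fusion" and "post-processing salience threshold β" concrete, here is a minimal sketch. It is not the authors' implementation. The label set, the encoder outputs, the encoder weights, the `presence_thr` cutoff, and the 50/50 and 70/30 salience levels are all hypothetical. The sketch also assumes that β acts on the gap between the two strongest fused probabilities, which the abstract does not specify.

```python
# Illustrative only: labels, probabilities, and weights below are made up.
LABELS = ["anger", "fear", "happiness", "sadness"]


def late_fusion(probs_per_encoder, weights):
    """Weighted average of per-encoder class-probability vectors."""
    total_w = sum(weights)
    n_classes = len(probs_per_encoder[0])
    fused = [0.0] * n_classes
    for probs, w in zip(probs_per_encoder, weights):
        for i, p in enumerate(probs):
            fused[i] += (w / total_w) * p
    norm = sum(fused)
    return [p / norm for p in fused]  # renormalize so the result sums to 1


def decode_blend(fused, labels, beta, presence_thr=0.2):
    """Turn fused probabilities into a single emotion or a two-emotion blend.

    Assumption: `beta` thresholds the gap between the top two probabilities.
    A small gap is read as equal salience, a large gap as one dominant emotion.
    """
    order = sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)
    first, second = order[0], order[1]
    if fused[second] < presence_thr:
        return {labels[first]: 1.0}  # runner-up too weak: single emotion
    gap = fused[first] - fused[second]
    if gap < beta:
        return {labels[first]: 0.5, labels[second]: 0.5}  # equal salience
    return {labels[first]: 0.7, labels[second]: 0.3}  # dominant / secondary


# Three hypothetical encoders voting on one clip.
encoder_probs = [
    [0.60, 0.30, 0.05, 0.05],
    [0.40, 0.40, 0.10, 0.10],
    [0.50, 0.20, 0.20, 0.10],
]
encoder_weights = [2.0, 1.0, 1.0]  # e.g. a task-adapted encoder weighted higher

fused = late_fusion(encoder_probs, encoder_weights)
print(decode_blend(fused, LABELS, beta=0.05))  # {'anger': 0.7, 'fear': 0.3}
print(decode_blend(fused, LABELS, beta=0.43))  # {'anger': 0.5, 'fear': 0.5}
```

In this sketch, the same fused probabilities decode to different salience patterns at the two ends of the β range quoted in the abstract (0.05 and 0.43). This shows one way a fold-dependent threshold could change the salience prediction without changing which emotions are detected.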

RESULT

ScienceToStartup currently rates this 4.0/10 on the public viability pass. Our 12-encoder system achieves Score = 0.279 (ACCP = 0.391, ACCS = 0.168) on the test set, placing 6th.

WHY NOW

Multimodal Emotion Recognition moved forward this cycle; last verified April 2026. Public score 4.0/10.




Competitive landscape

Segment: Multimodal Emotion Recognition
Adoption evidence: No public code link in the paper record yet
Commercial read: 4.0/10 public viability
Direct: not classified
Adjacent: not classified
Substitute: not classified
Unknown: not classified


arXiv: https://arxiv.org/abs/2603.23650 · Published 2026-03-24 · Free to access

