ARXIV:2605.12154 · MULTIMODAL OPTIMIZATION · SUBMITTED 13 MAY · 20:59 UTC · FRESHNESS STALE

Verified Source: PDF linked · Verified PaperPack: citation fields available · Partial Proof: unverified proof status

MM-OptBench: A Solver-Grounded Benchmark for Multimodal Optimization Modeling

Zhong Li · Qi Huang · Yuxuan Zhu · Mohammad Mohammadi Amiri · Niki van Stein · Thomas Bäck · +3 at arXiv

A benchmark for multimodal optimization modeling that evaluates LLMs on constructing optimization models from text and visual inputs.

Ship in 2-4 weeks · Score 6.0 · Evidence unverified

Opportunity summary

Pain A benchmark for multimodal optimization modeling that evaluates LLMs on constructing optimization models from text and visual inputs.

Evidence 0 refs | 3 sources | 50% coverage

Blocker Evidence unverified

PROBLEM

A benchmark for multimodal optimization modeling that evaluates LLMs on constructing optimization models from text and visual inputs. Although language models are increasingly used to generate optimization formulations and solver code, existing benchmarks are…

METHOD

Full abstract

Optimization modeling translates real decision-making problems into mathematical optimization models and solver-executable implementations. Although language models are increasingly used to generate optimization formulations and solver code, existing benchmarks are almost entirely text-only. This omits many optimization-modeling tasks that arise in operational practice, where requirements are described in text but instance information is conveyed through visual artifacts such as tables, graphs, maps, schedules, and dashboards. We introduce multimodal optimization modeling, a benchmark setting in which models must construct both a mathematical formulation and executable solver code from a text-and-visual problem specification. To evaluate this setting, we develop a solver-grounded framework that generates structured optimization instances, verifies each with an exact solver, and builds both the model-facing inputs and hidden reference files from the same verified source. We instantiate the framework as MM-OptBench, a benchmark of 780 solver-verified instances spanning 6 optimization families, 26 subcategories, and 3 structural difficulty levels. We evaluate 9 multimodal large language models (MLLMs), including 6 frontier general-purpose models and 3 math-specialized models, with aggregate, family-level, difficulty-level, and failure-mode analyses. The results show that the task remains far from solved: the best two models reach 52.1% and 51.3% pass@1, while on average across the six general-purpose MLLMs, pass@1 is 43.4% on easy instances and 15.9% on hard instances. All three math-specialized MLLMs solve 0/780 instances. Failure attribution shows that errors arise both when extracting instance data from text and visuals and when turning extracted data into solver-correct formulations and code. MM-OptBench provides a testbed for solver-grounded, decision-oriented multimodal intelligence.
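The solver-grounded idea in the abstract (generate a structured instance, verify it with an exact solver, then derive both the model-facing inputs and the hidden reference from that one verified source) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `KnapsackInstance` schema, the function names, and the use of a 0/1 knapsack with a dynamic-programming solver are hypothetical, and a plain-text table stands in for the rendered visual artifact. It is not the authors' pipeline.

```python
import json
import random
from dataclasses import dataclass, asdict


@dataclass
class KnapsackInstance:
    """Hypothetical structured instance; the paper's real schema is not shown in this summary."""
    values: list[int]
    weights: list[int]
    capacity: int


def generate_instance(n_items: int, seed: int) -> KnapsackInstance:
    """Draw a small random instance from a seeded generator."""
    rng = random.Random(seed)
    values = [rng.randint(5, 50) for _ in range(n_items)]
    weights = [rng.randint(1, 20) for _ in range(n_items)]
    capacity = max(1, sum(weights) // 2)
    return KnapsackInstance(values, weights, capacity)


def solve_exact(inst: KnapsackInstance) -> int:
    """Exact 0/1 knapsack optimum via dynamic programming over capacity."""
    best = [0] * (inst.capacity + 1)
    for v, w in zip(inst.values, inst.weights):
        # Iterate capacity downward so each item is used at most once.
        for c in range(inst.capacity, w - 1, -1):
            best[c] = max(best[c], best[c - w] + v)
    return best[inst.capacity]


def build_benchmark_item(inst: KnapsackInstance) -> dict:
    """Derive the model-facing inputs and the hidden reference from the same instance."""
    optimum = solve_exact(inst)
    text = (
        "Select items to maximize total value without exceeding a "
        f"capacity of {inst.capacity}. Item data is given in the table."
    )
    rows = [f"{i} | {v} | {w}" for i, (v, w) in enumerate(zip(inst.values, inst.weights))]
    table = "\n".join(["item | value | weight"] + rows)  # stand-in for a rendered image
    return {
        "model_input": {"text": text, "visual_table": table},
        "hidden_reference": {"instance": asdict(inst), "optimal_value": optimum},
    }


item = build_benchmark_item(generate_instance(n_items=6, seed=0))
print(json.dumps(item["model_input"], indent=2))
```

Because both halves of the item are built from the same solved instance, the reference optimum cannot drift from what the model is shown, which is the property the abstract attributes to the framework.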

RESULT

ScienceToStartup currently rates this 6.0/10 on the public viability pass. The results show that the task remains far from solved: the best two models reach 52.1% and 51.3% pass@1, while on average across the…
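For readers unfamiliar with how a pass@1 figure broken down by difficulty might be computed, here is a rough sketch. It assumes an attempt passes when the generated code runs and its objective value matches the solver-verified reference within a tolerance; the paper's exact pass criterion is not stated in this summary, and the function names, tolerance, and records below are all hypothetical.

```python
import math
from collections import defaultdict


def is_pass(model_obj, ref_obj, rel_tol=1e-6):
    """One attempt passes if it produced an objective that matches the reference."""
    if model_obj is None:  # code crashed or returned no objective value
        return False
    return math.isclose(model_obj, ref_obj, rel_tol=rel_tol, abs_tol=1e-9)


def pass_at_1(records):
    """records: iterable of (difficulty, model_objective_or_None, reference_objective)."""
    hits, totals = defaultdict(int), defaultdict(int)
    for difficulty, model_obj, ref_obj in records:
        totals[difficulty] += 1
        hits[difficulty] += is_pass(model_obj, ref_obj)
    per_level = {d: hits[d] / totals[d] for d in totals}
    overall = sum(hits.values()) / sum(totals.values())
    return overall, per_level


# Made-up records for illustration only; these are not data from the paper.
records = [
    ("easy", 220.0, 220.0),  # matches the reference optimum
    ("easy", 215.0, 220.0),  # feasible-looking but suboptimal answer
    ("hard", None, 97.5),    # generated code failed to run
    ("hard", 97.5, 97.5),
]
overall, per_level = pass_at_1(records)
print(overall, per_level)
```

Grading against a solver-verified objective, rather than comparing formulations textually, lets differently written but equivalent models count as correct.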

WHY NOW

Multimodal Optimization moved forward this cycle; last verified May 2026. Public score 6.0/10. Production flags indicate code availability.

Opportunity summary

Score 6.0

Pain A benchmark for multimodal optimization modeling that evaluates LLMs on constructing optimization models from text and visual inputs.

Evidence 0 refs | 3 sources | 50% coverage

Blocker no shell-level blocker reported

Analysis summary

A benchmark for multimodal optimization modeling that evaluates LLMs on constructing optimization models from text and visual inputs.

Competitive landscape

A benchmark for multimodal optimization modeling that evaluates LLMs on constructing optimization models from text and visual inputs.

Segment

Multimodal Optimization

Adoption evidence

No public code link in the paper record yet

Commercial read

6.0/10 public viability

Direct

not classified

Adjacent

not classified

Substitute

not classified

Unknown

not classified

Published 2026-05-12T14:07:36.000Z · arXiv: https://arxiv.org/abs/2605.12154 · Free to access · Research domain: Multimodal Optimization · Viability score: 6 · Commercial readiness: code

Authors: Zhong Li · Qi Huang · Yuxuan Zhu · Mohammad Mohammadi Amiri · Niki van Stein · Thomas Bäck · Matthijs van Leeuwen · Zaiwen Wen · Lincen Yang
