ARXIV:2603.28068 · AI ILLUSTRATION GENERATION · SUBMITTED 31 MAR · 20:53 UTC · FRESHNESS STALE

Verified · Source: PDF linked | Verified · PaperPack: citation fields available | Partial · Proof: unverified proof status

AIBench: Evaluating Visual-Logical Consistency in Academic Illustration Generation

Zhaohe Liao · Kaixun Jiang · Zhihang Liu · Yujie Wei · Junqiu Yu · Quanhao Li · +8 at arXiv

AIBench provides the first benchmark for evaluating the visual-logical consistency of academic illustrations generated by AI, revealing significant gaps in current models and guiding future development.

Ship in 2-4 weeks · Score 7.0 · Evidence unverified

Opportunity summary

Pain: AIBench provides the first benchmark for evaluating the visual-logical consistency of academic illustrations generated by AI, revealing significant gaps in current models and guiding future development.
Evidence: 100 refs | 3 sources | 50% coverage
Blocker: Evidence unverified



Full abstract

Although the rapid evolution of image generation has boosted various applications, whether state-of-the-art models can produce ready-to-use academic illustrations for papers is still largely unexplored. Directly comparing or evaluating the illustration with a VLM is the natural approach, but it requires oracle-level multimodal understanding, which is unreliable for long, complex texts and illustrations. To address this, we propose AIBench, the first benchmark that uses VQA to evaluate the logical correctness of academic illustrations and VLMs to assess their aesthetics. In detail, we design four levels of questions derived from a logic diagram summarized from the method section of the paper; these questions query whether the generated illustration aligns with the paper at different scales. Our VQA-based approach yields more accurate and detailed evaluations of visual-logical consistency while relying less on the ability of the judge VLM. With our high-quality AIBench, we conduct extensive experiments and conclude that the performance gap between models on this task is significantly larger than on general tasks, reflecting differences in their complex reasoning and high-density generation abilities. Further, logic and aesthetics are hard to optimize simultaneously, as in handcrafted illustrations. Additional experiments further show that test-time scaling on both abilities significantly boosts performance on this task.
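The VQA-based scoring idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not the paper's implementation: the `Question` type, the integer level numbering, exact-match answer comparison, and the unweighted mean across levels are all hypothetical, and the judge VLM's answers are passed in as plain strings rather than produced by a model call.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    level: int     # 1..4: hypothetical numbering of the four question levels
    text: str      # question asked about the generated illustration
    expected: str  # reference answer derived from the paper's logic diagram


def per_level_accuracy(questions, answers):
    """Return {level: fraction of questions answered as expected}."""
    if len(questions) != len(answers):
        raise ValueError("need exactly one answer per question")
    correct = defaultdict(int)
    total = defaultdict(int)
    for question, answer in zip(questions, answers):
        total[question.level] += 1
        # Exact match after normalization; a real judge may need fuzzier matching.
        if answer.strip().lower() == question.expected.strip().lower():
            correct[question.level] += 1
    return {level: correct[level] / total[level] for level in sorted(total)}


def logic_score(questions, answers):
    """Unweighted mean of per-level accuracies (an assumed aggregation)."""
    accuracy = per_level_accuracy(questions, answers)
    return sum(accuracy.values()) / len(accuracy)


# Toy data: the questions and judge answers below are invented for illustration.
questions = [
    Question(1, "Is an encoder block drawn?", "yes"),
    Question(1, "Is a decoder block drawn?", "yes"),
    Question(2, "Does the encoder feed the decoder?", "yes"),
    Question(3, "Is the loss computed before decoding?", "no"),
    Question(4, "Does the overall flow match the method?", "yes"),
]
judge_answers = ["yes", "no", "Yes", "no", "no"]

print(per_level_accuracy(questions, judge_answers))
print(logic_score(questions, judge_answers))
```

Scoring per level keeps the judge's task small (one narrow question at a time), which is the property the abstract credits for relying less on the judge VLM's overall ability.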

RESULT

ScienceToStartup currently rates this paper 7.0/10 on the public viability pass. Additional experiments further show that test-time scaling on both abilities significantly boosts performance on this task. Code availability is flagged in the production…

WHY NOW

AI Illustration Generation moved forward this cycle; last verified April 2026. Public score 7.0/10. Production flags indicate code availability.


Opportunity summary


Blocker: no shell-level blocker reported


Competitive landscape

Segment: AI Illustration Generation
Adoption evidence: No public code link in the paper record yet
Commercial read: 7.0/10 public viability
Direct: not classified
Adjacent: not classified
Substitute: not classified
Unknown: not classified


Paper record

Title: AIBench: Evaluating Visual-Logical Consistency in Academic Illustration Generation
Authors: Zhaohe Liao, Kaixun Jiang, Zhihang Liu, Yujie Wei, Junqiu Yu, Quanhao Li, Hong-Tao Yu, Pandeng Li, Yuzheng Wang, Zhen Xing, Shiwei Zhang, Chen-Wei Xie, Yun Zheng, Xihui Liu
arXiv: 2603.28068 (https://arxiv.org/abs/2603.28068)
Date published: 2026-03-30T06:14:40.000Z
Page: https://sciencetostartup.com/paper/aibench-evaluating-visual-logical-consistency-in-academic-illustration-generation
Research domain: AI Illustration Generation
Viability score: 7
Commercial readiness: code
Accessible for free: yes
