ARXIV:2605.14355 · FINANCIAL AI AGENTS · SUBMITTED 15 MAY · 20:12 UTC · FRESHNESS FRESH

Verified · Source: PDF linked | Verified · PaperPack: citation fields available | Partial · Proof: unverified proof status

Herculean: An Agentic Benchmark for Financial Intelligence

Xueqing Peng · Zhuohan Xie · Yupeng Cao · Haohang Li · Lingfei Qian · Yan Wang · +58 at arXiv

Herculean is a benchmark for agentic financial intelligence, revealing a gap in current agents' ability to execute complex financial workflows.

Ship in 2-4 weeks · Score 7.0 · Evidence unverified

Opportunity summary

Pain Herculean is a benchmark for agentic financial intelligence, revealing a gap in current agents' ability to execute complex financial workflows.

Evidence 0 refs | 0 sources | 0% coverage

Blocker Evidence unverified


PROBLEM

Herculean is a benchmark for agentic financial intelligence, revealing a gap in current agents' ability to execute complex financial workflows. Existing financial benchmarks offer only a partial view of this ability, as they primarily…

METHOD

Full abstract

As AI agents improve, the central question is no longer whether they can solve isolated well-defined financial tasks, but whether they can reliably carry out financial professional work. Existing financial benchmarks offer only a partial view of this ability, as they primarily evaluate static competencies such as question answering, retrieval, summarization, and classification. We introduce Herculean, the first skilled benchmark for agentic financial intelligence spanning four representative workflows, including Trading, Hedging, Market Insights, and Auditing. Each workflow is instantiated as a standardized MCP-based skill environment with its own tools, interaction dynamics, constraints, and success criteria, enabling consistent end-to-end assessment of heterogeneous agent systems. Across frontier agents, we find agents perform relatively well on Trading and Market Insights, but struggle substantially on Hedging and Auditing, where long-horizon coordination, state consistency, and structured verification are critical. Overall, our results point to a key gap in current agents in turning financial reasoning into dependable workflow execution in high-stakes financial workflows.
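To make the environment description above more concrete, here is a minimal Python sketch of what "tools, constraints, and success criteria" could look like for a single workflow. Everything in it is an assumption made for illustration: `SkillEnvironment`, `run_episode`, `place_hedge`, and the toy exposure numbers are hypothetical names and values, not Herculean's actual interface, and the sketch does not use MCP itself. It shows only the general shape of an end-to-end check in which every tool call mutates shared state, constraints are re-checked after each call, and success is judged on the final state.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple

State = Dict[str, Any]
Tool = Callable[[State, Dict[str, Any]], Any]


@dataclass
class SkillEnvironment:
    """Hypothetical stand-in for one workflow environment (not Herculean's API)."""
    name: str
    tools: Dict[str, Tool]
    constraints: List[Callable[[State], bool]]
    success: Callable[[State], bool]
    state: State = field(default_factory=dict)
    violations: int = 0

    def call(self, tool_name: str, **args: Any) -> Any:
        """Run one tool call, then re-check every constraint on the new state."""
        result = self.tools[tool_name](self.state, args)
        if not all(check(self.state) for check in self.constraints):
            self.violations += 1
        return result


def run_episode(env: SkillEnvironment,
                plan: List[Tuple[str, Dict[str, Any]]]) -> Dict[str, Any]:
    """Execute a fixed plan of tool calls and report the end-to-end outcome."""
    for tool_name, args in plan:
        env.call(tool_name, **args)
    succeeded = env.success(env.state) and env.violations == 0
    return {"workflow": env.name, "success": succeeded,
            "violations": env.violations}


def place_hedge(state: State, args: Dict[str, Any]) -> Any:
    """Toy tool: add a hedge notional to the running hedged amount."""
    state["hedged"] = state.get("hedged", 0) + args["notional"]
    return state["hedged"]


def make_toy_hedging_env() -> SkillEnvironment:
    """Toy Hedging task (assumed): fully hedge an exposure of 100, never over-hedge."""
    return SkillEnvironment(
        name="Hedging",
        tools={"place_hedge": place_hedge},
        constraints=[lambda s: s.get("hedged", 0) <= s["exposure"]],
        success=lambda s: s.get("hedged", 0) == s["exposure"],
        state={"exposure": 100, "hedged": 0},
    )


if __name__ == "__main__":
    plan = [("place_hedge", {"notional": 60}), ("place_hedge", {"notional": 40})]
    print(run_episode(make_toy_hedging_env(), plan))
```

In this toy setup, a plan that over-hedges in a single step (for example a notional of 150) records a constraint violation and fails. This loosely mirrors the abstract's point that state consistency across steps matters as much as any individual decision.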

RESULT

ScienceToStartup currently rates this 7.0/10 on the public viability pass. As AI agents improve, the central question is no longer whether they can solve isolated well-defined financial tasks, but whether they can reliably carry…

WHY NOW

Financial AI Agents moved forward this cycle; last verified May 2026. Public score 7.0/10. Production flags indicate code availability.


Opportunity summary

Score 7.0

Pain Herculean is a benchmark for agentic financial intelligence, revealing a gap in current agents' ability to execute complex financial workflows.

Evidence 0 refs | 0 sources | 0% coverage

Blocker no shell-level blocker reported

Analysis summary

Herculean is a benchmark for agentic financial intelligence, revealing a gap in current agents' ability to execute complex financial workflows.


Competitive landscape

Herculean is a benchmark for agentic financial intelligence, revealing a gap in current agents' ability to execute complex financial workflows.

Segment: Financial AI Agents

Adoption evidence: No public code link in the paper record yet

Commercial read: 7.0/10 public viability

Direct: not classified

Adjacent: not classified

Substitute: not classified

Unknown: not classified


Paper record

Title: Herculean: An Agentic Benchmark for Financial Intelligence

arXiv: 2605.14355 (https://arxiv.org/abs/2605.14355)

Page: https://sciencetostartup.com/paper/herculean-an-agentic-benchmark-for-financial-intelligence

Published: 2026-05-14T04:30:49.000Z

Free to access: yes

Research domain: Financial AI Agents

Viability score: 7

Commercial readiness: code

Authors: Xueqing Peng, Zhuohan Xie, Yupeng Cao, Haohang Li, Lingfei Qian, Yan Wang, Vincent Jim Zhang, Huan He, Xuguang Ai, Linhai Ma, Ruoyu Xiang, Yueru He, Yi Han, Shuyao Wang, Yuqing Guo, Mingyang Jiang, Yilun Zhao, Youzhong Dong, Xiaoyu Wang, Yankai Chen, Ye Yuan, Qiyuan Zhang, Fuyuan Lyu, Haolun Wu, Yonghan Yang, Zichen Zhao, Yuyang Dai, Fan Zhang, Rania Elbadry, Ayesha Gull, Muhammad Usman Safder, Nuo Chen, Fengbin Zhu, Tianshi Cai, Zimu Wang, Polydoros Giannouris, Yuechen Jiang, Zhiwei Liu, Mohsinul Kabir, Yuyan Wang, Yixiang Zheng, Yangyang Yu, Weijin Liu, Wenbo Cao, Anke Xu, Peng Lu, Jerry Huang, Fengran Mo, Mingquan Lin, Prayag Tiwari, Yijia Zhao, Victor Gutierrez Basulto, Xiao-Yang Liu, Kaleb E Smith, Jiahuan Pei, Arman Cohan, Jimin Huang, Yuehua Tang, Alejandro Lopez-Lira, Xi Chen, Xue Liu, Junichi Tsujii, Jian-Yun Nie, Sophia Ananiadou

