ARXIV:2604.06814 · TABULAR DATA BENCHMARKING · SUBMITTED 10 APR · 00:15 UTC · FRESHNESS STALE

Source: PDF linked (Verified) · PaperPack: citation fields available (Verified) · Proof: unverified proof status (Partial)

OmniTabBench: Mapping the Empirical Frontiers of GBDTs, Neural Networks, and Foundation Models for Tabular Data at Scale

Dihong Jiang · Ruoqi Cao · Zhiyuan Dang · Li Huang · Qingsong Zhang · Zhiyu Wang · Shihao Piao · Shenggao Zhu · Jianlong Chang · Zhouchen Lin · Qi Tian

OmniTabBench is a large-scale benchmark for tabular data, evaluating GBDTs, neural networks, and foundation models to guide model selection.

Ship in 2-4 weeks · Score 5.0 · Evidence unverified

Opportunity summary

Pain OmniTabBench is a large-scale benchmark for tabular data, evaluating GBDTs, neural networks, and foundation models to guide model selection.

Evidence 35 refs | 5 sources | 67% coverage

Blocker Evidence unverified

PROBLEM

OmniTabBench is a large-scale benchmark for tabular data, evaluating GBDTs, neural networks, and foundation models to guide model selection. Existing benchmarks typically contain fewer than 100 datasets, raising concerns about evaluation sufficiency and potential…

METHOD

Full abstract

While traditional tree-based ensemble methods have long dominated tabular tasks, deep neural networks and emerging foundation models have challenged this primacy, yet no consensus exists on a universally superior paradigm. Existing benchmarks typically contain fewer than 100 datasets, raising concerns about evaluation sufficiency and potential selection biases. To address these limitations, we introduce OmniTabBench, the largest tabular benchmark to date, comprising 3030 datasets spanning diverse tasks that are comprehensively collected from diverse sources and categorized by industry using large language models. We conduct an unprecedented large-scale empirical evaluation of state-of-the-art models from all model families on OmniTabBench, confirming the absence of a dominant winner. Furthermore, through a decoupled metafeature analysis, which examines individual properties such as dataset size, feature types, feature and target skewness/kurtosis, we elucidate conditions favoring specific model categories, providing clearer, more actionable guidance than prior compound-metric studies.
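To make the idea of a decoupled metafeature analysis concrete, the sketch below computes the individual properties the abstract names (dataset size, feature types, and feature and target skewness/kurtosis) for one small table. It is only an illustration under stated assumptions, not the authors' pipeline: the function names `skew_kurtosis`, `is_numeric`, and `metafeatures`, the row-list input layout, and the choice of population (biased) moment estimators are all hypothetical, and the table is assumed to be non-empty with no missing values.

```python
from typing import Dict, List, Sequence, Tuple, Union

Cell = Union[int, float, str]


def skew_kurtosis(values: Sequence[float]) -> Tuple[float, float]:
    """Population skewness and excess kurtosis from central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        # Constant column: both statistics are undefined; report 0 by convention.
        return 0.0, 0.0
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def is_numeric(column: Sequence[Cell]) -> bool:
    """Treat a column as numeric only if every cell is an int or float (not bool)."""
    return all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in column
    )


def metafeatures(rows: List[List[Cell]], target_index: int) -> Dict[str, object]:
    """Compute each metafeature separately (hypothetical helper, not from the paper)."""
    columns = list(zip(*rows))  # transpose rows into columns
    target = columns[target_index]
    features = {i: c for i, c in enumerate(columns) if i != target_index}
    numeric = {i: c for i, c in features.items() if is_numeric(c)}

    result: Dict[str, object] = {
        "n_rows": len(rows),
        "n_features": len(features),
        "n_numeric": len(numeric),
        "n_categorical": len(features) - len(numeric),
        "feature_skewness": {i: skew_kurtosis(c)[0] for i, c in numeric.items()},
        "feature_kurtosis": {i: skew_kurtosis(c)[1] for i, c in numeric.items()},
    }
    # Target moments only make sense for a numeric (regression-style) target.
    if is_numeric(target):
        t_skew, t_kurt = skew_kurtosis(target)
        result["target_skewness"] = t_skew
        result["target_kurtosis"] = t_kurt
    return result


# Toy table: one numeric feature, one categorical feature, numeric target.
toy_rows = [
    [1.0, "a", 10.0],
    [2.0, "b", 20.0],
    [3.0, "a", 30.0],
    [4.0, "b", 40.0],
    [5.0, "a", 50.0],
]
toy_meta = metafeatures(toy_rows, target_index=2)
```

Keeping each property as its own entry, rather than folding them into one compound score, is what allows a per-property comparison of model families across many datasets, which is the kind of guidance the abstract describes.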

RESULT

ScienceToStartup currently rates this 5.0/10 on the public viability pass. Furthermore, through a decoupled metafeature analysis, which examines individual properties such as dataset size, feature types, feature and target skewness/kurtosis, we elucidate conditions favoring…

WHY NOW

Tabular Data Benchmarking moved forward this cycle; last verified April 2026. Public score 5.0/10. Production flags indicate code availability.

Opportunity summary

Score 5.0

Pain OmniTabBench is a large-scale benchmark for tabular data, evaluating GBDTs, neural networks, and foundation models to guide model selection.

Evidence 35 refs | 5 sources | 67% coverage

Blocker no shell-level blocker reported

Analysis summary

OmniTabBench is a large-scale benchmark for tabular data, evaluating GBDTs, neural networks, and foundation models to guide model selection.

Competitive landscape

OmniTabBench is a large-scale benchmark for tabular data, evaluating GBDTs, neural networks, and foundation models to guide model selection.

Segment

Tabular Data Benchmarking

Adoption evidence

No public code link in the paper record yet

Commercial read

5.0/10 public viability

Direct

not classified

Adjacent

not classified

Substitute

not classified

Unknown

not classified

{ "@context": "https://schema.org", "@graph": [ { "@type": "WebPage", "@id": "https://sciencetostartup.com/paper/omnitabbench-mapping-the-empirical-frontiers-of-gbdts-neural-networks-and-foundation-models-for-tabular-data-at-scale#webpage", "url": "https://sciencetostartup.com/paper/omnitabbench-mapping-the-empirical-frontiers-of-gbdts-neural-networks-and-foundation-models-for-tabular-data-at-scale", "name": "OmniTabBench: Mapping the Empirical Frontiers of GBDTs, Neural Networks, and Foundation Models for Tabular Data at Scale", "description": "OmniTabBench is a large-scale benchmark for tabular data, evaluating GBDTs, neural networks, and foundation models to guide model selection.", "isPartOf": { "@id": "https://sciencetostartup.com/#website" } }, { "@type": "ScholarlyArticle", "@id": "https://sciencetostartup.com/paper/omnitabbench-mapping-the-empirical-frontiers-of-gbdts-neural-networks-and-foundation-models-for-tabular-data-at-scale#scholarlyArticle", "headline": "OmniTabBench: Mapping the Empirical Frontiers of GBDTs, Neural Networks, and Foundation Models for Tabular Data at Scale", "description": "OmniTabBench is a large-scale benchmark for tabular data, evaluating GBDTs, neural networks, and foundation models to guide model selection.", "url": "https://sciencetostartup.com/paper/omnitabbench-mapping-the-empirical-frontiers-of-gbdts-neural-networks-and-foundation-models-for-tabular-data-at-scale", "sameAs": "https://arxiv.org/abs/2604.06814", "identifier": { "@type": "PropertyValue", "propertyID": "arXiv", "value": "2604.06814" }, "isAccessibleForFree": true, "isPartOf": { "@id": "https://sciencetostartup.com/#website" }, "datePublished": "2026-04-08T08:31:43.000Z", "author": [ { "@type": "Person", "name": "Dihong Jiang" }, { "@type": "Person", "name": "Ruoqi Cao" }, { "@type": "Person", "name": "Zhiyuan Dang" }, { "@type": "Person", "name": "Li Huang" }, { "@type": "Person", "name": "Qingsong Zhang" }, { "@type": "Person", "name": "Zhiyu Wang" }, { 
"@type": "Person", "name": "Shihao Piao" }, { "@type": "Person", "name": "Shenggao Zhu" }, { "@type": "Person", "name": "Jianlong Chang" }, { "@type": "Person", "name": "Zhouchen Lin" }, { "@type": "Person", "name": "Qi Tian" } ], "additionalProperty": [ { "@type": "PropertyValue", "propertyID": "viabilityScore", "value": 5 }, { "@type": "PropertyValue", "propertyID": "researchDomain", "value": "Tabular Data Benchmarking" }, { "@type": "PropertyValue", "propertyID": "commercialReadiness", "value": "code" } ] }, { "@type": "BreadcrumbList", "itemListElement": [ { "@type": "ListItem", "position": 1, "name": "Home", "item": "https://sciencetostartup.com" }, { "@type": "ListItem", "position": 2, "name": "Tabular Data Benchmarking", "item": "https://sciencetostartup.com/topics" }, { "@type": "ListItem", "position": 3, "name": "OmniTabBench: Mapping the Empirical Frontiers of GBDTs, Neur", "item": "https://sciencetostartup.com/paper/omnitabbench-mapping-the-empirical-frontiers-of-gbdts-neural-networks-and-foundation-models-for-tabular-data-at-scale" } ] } ] }
