ARXIV:2603.26015 · VISION-LANGUAGE MODELS · SUBMITTED 30 MAR · 21:55 UTC · FRESHNESS STALE

Verified Source: PDF linked · Verified PaperPack: citation fields available · Partial Proof: unverified proof status

VLAgeBench: Benchmarking Large Vision-Language Models for Zero-Shot Human Age Estimation

Rakib Hossain Sajib · Md Kishor Morol · Rajan Das Gupta · Mohammad Sakib Mahmood · Shuvra Smaran Das · arXiv

Leverage state-of-the-art large vision-language models for zero-shot human age estimation, offering a competitive alternative to traditional supervised methods for applications in biometrics and healthcare.

Ship in 2-4 weeks · Score 7.0 · Evidence unverified

Opportunity summary

Pain: Leverage state-of-the-art large vision-language models for zero-shot human age estimation, offering a competitive alternative to traditional supervised methods for applications in biometrics and healthcare.

Evidence: 33 refs | 3 sources | 50% coverage

Blocker: Evidence unverified

Full abstract

Human age estimation from facial images represents a challenging computer vision task with significant applications in biometrics, healthcare, and human-computer interaction. While traditional deep learning approaches require extensive labeled datasets and domain-specific training, recent advances in large vision-language models (LVLMs) offer the potential for zero-shot age estimation. This study presents a comprehensive zero-shot evaluation of state-of-the-art Large Vision-Language Models (LVLMs) for facial age estimation, a task traditionally dominated by domain-specific convolutional networks and supervised learning. We assess the performance of GPT-4o, Claude 3.5 Sonnet, and LLaMA 3.2 Vision on two benchmark datasets, UTKFace and FG-NET, without any fine-tuning or task-specific adaptation. Using eight evaluation metrics, including MAE, MSE, RMSE, MAPE, MBE, $R^2$, CCC, and $\pm$5-year accuracy, we demonstrate that general-purpose LVLMs can deliver competitive performance in zero-shot settings. Our findings highlight the emergent capabilities of LVLMs for accurate biometric age estimation and position these models as promising tools for real-world applications. Additionally, we highlight performance disparities linked to image quality and demographic subgroups, underscoring the need for fairness-aware multimodal inference. This work introduces a reproducible benchmark and positions LVLMs as promising tools for real-world applications in forensic science, healthcare monitoring, and human-computer interaction. The benchmark focuses on strict zero-shot inference without fine-tuning and highlights remaining challenges related to prompt sensitivity, interpretability, computational cost, and demographic fairness.
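The eight metrics named in the abstract have standard textbook definitions, so they can be sketched in a few lines of Python. The block below is an illustration only, not the authors' evaluation code. The function name `age_metrics` is hypothetical, and several conventions are assumptions that the abstract does not specify: the signed error is prediction minus truth, CCC uses population (1/n) moments, MAPE skips samples whose true age is 0, and the ±5-year band is inclusive.

```python
import math
from statistics import fmean


def age_metrics(y_true, y_pred, tol=5.0):
    """Return the eight metrics from the abstract for paired age lists.

    Assumed conventions (not taken from the paper): error = pred - true,
    population moments for CCC, MAPE ignores true ages of 0, and the
    tolerance band |error| <= tol is inclusive.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("need two non-empty sequences of equal length")

    err = [p - t for t, p in zip(y_true, y_pred)]
    mae = fmean(abs(e) for e in err)
    mse = fmean(e * e for e in err)
    rmse = math.sqrt(mse)
    mbe = fmean(err)  # positive means ages are overestimated on average

    # MAPE is undefined when the true age is 0, so those samples are skipped.
    pct = [abs(e) / t for e, t in zip(err, y_true) if t != 0]
    mape = 100.0 * fmean(pct) if pct else math.nan

    mean_t, mean_p = fmean(y_true), fmean(y_pred)
    var_t = fmean((t - mean_t) ** 2 for t in y_true)
    var_p = fmean((p - mean_p) ** 2 for p in y_pred)
    cov = fmean((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))

    # R^2 = 1 - SS_res / SS_tot; dividing both sums by n gives mse / var_t.
    r2 = 1.0 - mse / var_t if var_t > 0 else math.nan

    # Lin's concordance correlation coefficient.
    ccc_den = var_t + var_p + (mean_t - mean_p) ** 2
    ccc = 2.0 * cov / ccc_den if ccc_den > 0 else math.nan

    acc = fmean(1.0 if abs(e) <= tol else 0.0 for e in err)

    return {
        "MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape,
        "MBE": mbe, "R2": r2, "CCC": ccc, "ACC@5": acc,
    }


if __name__ == "__main__":
    # Toy ages for illustration only; these are not results from the paper.
    for name, value in age_metrics([20, 30, 40, 50], [22, 27, 45, 50]).items():
        print(f"{name}: {value:.4f}")
```

Reporting MBE alongside MAE separates systematic over- or underestimation from overall error size, and CCC penalizes predictions that correlate with the true ages but are shifted or rescaled, which R² alone can obscure.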

RESULT

ScienceToStartup currently rates this 7.0/10 on the public viability pass. Using eight evaluation metrics, including MAE, MSE, RMSE, MAPE, MBE, $R^2$, CCC, and $\pm$5-year accuracy, we demonstrate that general-purpose LVLMs can deliver competitive performance…

WHY NOW

Vision-Language Models moved forward this cycle; last verified April 2026. Public score 7.0/10. Production flags indicate code availability.

Opportunity summary

Blocker: no shell-level blocker reported

Competitive landscape

Segment: Vision-Language Models

Adoption evidence: No public code link in the paper record yet

Commercial read: 7.0/10 public viability

Direct: not classified

Adjacent: not classified

Substitute: not classified

Unknown: not classified

Published: 2026-03-27T02:16:22Z · arXiv: https://arxiv.org/abs/2603.26015 · Free to access
