ARXIV:2605.10365 · AGENT VALUES · SUBMITTED 12 MAY · 20:15 UTC · FRESHNESS FRESH

Source: PDF linked (Verified) · PaperPack: citation fields available (Verified) · Proof: unverified proof status (Partial)

Agent-ValueBench: A Comprehensive Benchmark for Evaluating Agent Values

Haonan Dong · Qiguan Feng · Kehan Jiang · Haoran Ye · Xin Zhang · Guojie Song · arXiv

Agent-ValueBench, the first benchmark for evaluating agent values, features 394 environments and 4,335 value-conflict tasks, revealing that agent values diverge from LLM values and are influenced by harness and skill…

Ship in 2-4 weeks · Score 7.0 · Evidence unverified

Opportunity summary

Pain: Agent-ValueBench, the first benchmark for evaluating agent values, features 394 environments and 4,335 value-conflict tasks, revealing that agent values diverge from LLM values and are influenced by harness and skill steering.

Evidence: 0 refs | 0 sources | 0% coverage

Blocker: Evidence unverified

Full abstract

Autonomous agents have rapidly matured as task executors and seen widespread deployment via harnesses such as OpenClaw. Safety concerns have rightly drawn growing research attention, and beneath them lie the values silently steering agent behavior. Existing value benchmarks, however, remain confined to LLMs, leaving agent values largely uncharted. From intuitive, empirical, and theoretical vantage points, we show that an agent's values diverge from those of its underlying LLM, and the agentic modality further introduces dataset-, evaluation-, and system-level challenges absent from text-only protocols. We close this gap with Agent-ValueBench, the first benchmark dedicated to agent values. It features 394 executable environments across 16 domains, offering 4,335 value-conflict tasks that cover 28 value systems and 332 dimensions. Every instance is co-synthesized through our purpose-built end-to-end pipeline and curated per-instance by professional psychologists. Each task ships with two pole-aligned golden trajectories whose checkpoints anchor a trajectory-level rubric-based judge. Benchmarking 14 frontier proprietary and open-weights models across 4 mainstream harnesses, we uncover three concerted findings. Agent values first manifest as a Value Tide of cross-model homogeneity beneath interpretable counter-currents. This tide bends non-additively under harness pull, and yet more decisively under deliberate steering via embedded skills. Together these results signal that the agent-alignment lever is shifting from classical model alignment and prompt steering toward harness alignment and skill steering.
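The abstract says each task ships with two pole-aligned golden trajectories whose checkpoints anchor a trajectory-level rubric-based judge. The sketch below is one way to picture that structure. It is not the authors' implementation: every class, field, and checkpoint name is hypothetical, and a plain checkpoint-overlap score stands in for the rubric-based judge, whose actual form the abstract does not describe.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class GoldenTrajectory:
    """One pole-aligned reference run (hypothetical structure)."""
    pole: str                      # assumed label for one side of the value conflict
    checkpoints: tuple[str, ...]   # assumed identifiers for key steps in the run


@dataclass(frozen=True)
class ValueConflictTask:
    """A task with one golden trajectory per pole (hypothetical structure)."""
    task_id: str
    pole_a: GoldenTrajectory
    pole_b: GoldenTrajectory


def checkpoint_coverage(observed: set[str], golden: GoldenTrajectory) -> float:
    """Fraction of a golden trajectory's checkpoints present in the observed run."""
    if not golden.checkpoints:
        return 0.0
    hits = sum(1 for checkpoint in golden.checkpoints if checkpoint in observed)
    return hits / len(golden.checkpoints)


def pole_lean(task: ValueConflictTask, observed: set[str]) -> float:
    """Toy stand-in for a judge: a score in [-1, 1].

    Positive values lean toward pole_a, negative toward pole_b.
    """
    return checkpoint_coverage(observed, task.pole_a) - checkpoint_coverage(
        observed, task.pole_b
    )


# Illustrative task; the poles and checkpoint names are invented for this example.
demo_task = ValueConflictTask(
    task_id="demo",
    pole_a=GoldenTrajectory("privacy", ("ask_consent", "redact_fields")),
    pole_b=GoldenTrajectory("efficiency", ("skip_consent", "send_raw")),
)

fully_pole_a = pole_lean(demo_task, {"ask_consent", "redact_fields"})
mixed_run = pole_lean(demo_task, {"ask_consent", "send_raw"})
```

Under these assumptions, a run that hits only pole A's checkpoints scores 1.0, and a run that hits half of each pole's checkpoints scores 0.0. A real rubric-based judge would presumably weigh checkpoints and context rather than count exact matches.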

RESULT

ScienceToStartup currently rates this 7.0/10 on the public viability pass. From intuitive, empirical, and theoretical vantage points, we show that an agent's values diverge from those of its underlying LLM, and the agentic modality…

WHY NOW

The Agent Values topic moved forward this cycle; last verified May 2026. Public score: 7.0/10. Production flags indicate code availability, though the paper record does not yet list a public code link.

Competitive landscape

Segment: Agent Values

Adoption evidence: No public code link in the paper record yet

Commercial read: 7.0/10 public viability

Direct: not classified

Adjacent: not classified

Substitute: not classified

Unknown: not classified
