ARXIV:2604.26235 · AGENTS · SUBMITTED 30 APR · 15:13 UTC · FRESHNESS STALE

Source: PDF linked (Verified) · PaperPack: citation fields available (Verified) · Proof: unverified proof status (Partial)

LATTICE: Evaluating Decision Support Utility of Crypto Agents

Aaron Chan · Tengfei Li · Tianyi Xiao · Angela Chen · Junyi Du · Xiang Ren · arXiv

LATTICE is a benchmark for evaluating crypto agents' decision support utility in real-world scenarios, using LLM judges for scalable assessment.

Ship in 2-4 weeks · Score 7.0 · Evidence unverified

Opportunity summary

Pain LATTICE is a benchmark for evaluating crypto agents' decision support utility in real-world scenarios, using LLM judges for scalable assessment.

Evidence 0 refs | 4 sources | 67% coverage

Blocker Evidence unverified


PROBLEM

LATTICE is a benchmark for evaluating crypto agents' decision support utility in real-world scenarios, using LLM judges for scalable assessment. Prior crypto agent benchmarks mainly focus on reasoning-based or outcome-based evaluation, but do not…

METHOD

Full abstract

We introduce LATTICE, a benchmark for evaluating the decision support utility of crypto agents in realistic user-facing scenarios. Prior crypto agent benchmarks mainly focus on reasoning-based or outcome-based evaluation, but do not assess agents' ability to assist user decision-making. LATTICE addresses this gap by: (1) defining six evaluation dimensions that capture key decision support properties; (2) proposing 16 task types that span the end-to-end crypto copilot workflow; and (3) using LLM judges to automatically score agent outputs based on these dimensions and tasks. Crucially, the dimensions and tasks are designed to be evaluable at scale using LLM judges, without relying on ground truth from expert annotators or external data sources. In lieu of these dependencies, LATTICE's LLM judge rubrics can be continually audited and updated given new dimensions, tasks, criteria, and human feedback, thus promoting reliable and extensible evaluation. While other benchmarks often compare foundation models sharing a generic agent framework, we use LATTICE to assess production-level agents used in actual crypto copilot products, reflecting the importance of orchestration and UI/UX design in determining agent quality. In this paper, we evaluate six real-world crypto copilots on 1,200 diverse queries and report breakdowns across dimensions, tasks, and query categories. Our experiments show that most of the tested copilots achieve comparable aggregate scores, but differ more significantly on dimension-level and task-level performance. This pattern suggests meaningful trade-offs in decision support quality: users with different priorities may be better served by different copilots than the aggregate rankings alone would indicate. To support reproducible research, we open-source all LATTICE code and data used in this paper.
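The scoring and breakdown workflow described in the abstract can be illustrated with a small sketch. This is not the LATTICE implementation: the dimension names, task names, copilot names, the 1-5 score scale, and the `stub_judge` function (a stand-in for a real LLM judge call) are all hypothetical placeholders. The sketch only shows how per-dimension judge scores might be aggregated, and how two copilots can tie on the aggregate while differing by dimension.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical placeholders; the abstract does not name the six dimensions.
DIMENSIONS = ["dim_a", "dim_b"]

def stub_judge(query, answer, dimension):
    """Stand-in for an LLM judge call; returns a score on an assumed 1-5 scale."""
    table = {
        ("answer_x", "dim_a"): 5, ("answer_x", "dim_b"): 3,
        ("answer_y", "dim_a"): 3, ("answer_y", "dim_b"): 5,
    }
    return table[(answer, dimension)]

def evaluate(responses, judge=stub_judge, dimensions=DIMENSIONS):
    """Score every (copilot, task, query, answer) row on every dimension."""
    rows = []
    for copilot, task, query, answer in responses:
        for dim in dimensions:
            rows.append({
                "copilot": copilot,
                "task": task,
                "dimension": dim,
                "score": judge(query, answer, dim),
            })
    return rows

def breakdown(rows, key):
    """Mean score per (copilot, key value); key is 'dimension' or 'task'."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["copilot"], row[key])].append(row["score"])
    return {group: mean(scores) for group, scores in groups.items()}

def aggregate(rows):
    """Mean score per copilot across all dimensions and tasks."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["copilot"]].append(row["score"])
    return {copilot: mean(scores) for copilot, scores in groups.items()}

# Hypothetical copilots answering the same query for one task type.
responses = [
    ("copilot_x", "task_1", "Summarize the risks of token T.", "answer_x"),
    ("copilot_y", "task_1", "Summarize the risks of token T.", "answer_y"),
]
rows = evaluate(responses)
print(aggregate(rows))               # identical aggregate means
print(breakdown(rows, "dimension"))  # but different per-dimension means
```

In this toy setup both copilots share the same aggregate mean, yet `copilot_x` leads on `dim_a` and `copilot_y` leads on `dim_b`, which mirrors the kind of trade-off the abstract reports. A real judge would replace `stub_judge` with a rubric-driven LLM call.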

RESULT

ScienceToStartup currently rates this 7.0/10 on the public viability pass. We introduce LATTICE, a benchmark for evaluating the decision support utility of crypto agents in realistic user-facing scenarios. A public repository is linked, so…

WHY NOW

Agents moved forward this cycle; last verified April 2026. Public score 7.0/10. Implementation evidence is present through a linked repository.


Opportunity summary

Score 7.0

Pain LATTICE is a benchmark for evaluating crypto agents' decision support utility in real-world scenarios, using LLM judges for scalable assessment.

Evidence 0 refs | 4 sources | 67% coverage

Blocker no shell-level blocker reported


Competitive landscape

LATTICE is a benchmark for evaluating crypto agents' decision support utility in real-world scenarios, using LLM judges for scalable assessment.

Segment: Agents

Adoption evidence: Public code linked for build inspection

Commercial read: 7.0/10 public viability

Direct: not classified

Adjacent: not classified

Substitute: not classified

Unknown: not classified


arXiv: https://arxiv.org/abs/2604.26235

Published: 2026-04-29T02:32:14.000Z

Code repository: https://github.com/SaharaLabsAI/lattice-benchmark

