ARXIV:2602.05354 · AGENTS · SUBMITTED 02 APR · 02:30 UTC · FRESHNESS STALE

Verified · Source: PDF linked
Partial · PaperPack: 3 of 4 citation fields filled
Missing · Missing fields: authors
Partial · Proof: unverified proof status

PATHWAYS: Evaluating Investigation and Context Discovery in AI Web Agents

arXiv

Develop PATHWAYS, a benchmark tool for evaluating the context discovery ability of AI web agents in multi-step decision tasks.

Blocked on Code · Score 5.0 · Evidence unverified

Opportunity summary

Pain Develop PATHWAYS, a benchmark tool for evaluating the context discovery ability of AI web agents in multi-step decision tasks.

Evidence 0 refs | 0 sources | 17% coverage

Blocker Evidence unverified

PROBLEM

Develop PATHWAYS, a benchmark tool for evaluating the context discovery ability of AI web agents in multi-step decision tasks. Across both closed and open models, agents typically navigate to relevant pages but retrieve decisive…

METHOD

Full abstract

We introduce PATHWAYS, a benchmark of 250 multi-step decision tasks that test whether web-based agents can discover and correctly use hidden contextual information. Across both closed and open models, agents typically navigate to relevant pages but retrieve decisive hidden evidence in only a small fraction of cases. When tasks require overturning misleading surface-level signals, performance drops sharply to near chance accuracy. Agents frequently hallucinate investigative reasoning by claiming to rely on evidence they never accessed. Even when correct context is discovered, agents often fail to integrate it into their final decision. Providing more explicit instructions improves context discovery but often reduces overall accuracy, revealing a tradeoff between procedural compliance and effective judgement. Together, these results show that current web agent architectures lack reliable mechanisms for adaptive investigation, evidence integration, and judgement override.
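The abstract separates several outcomes that are easy to conflate: reaching a relevant page, actually accessing the decisive evidence, citing evidence that was never accessed, and getting the final decision right. The following is a minimal sketch of how such rates could be tallied from per-task logs. The `Episode` record, its field names, and the metric names are hypothetical illustrations and are not taken from the paper, whose logging format and scoring code are not shown here.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    """One agent run on one task. All field names are hypothetical."""
    reached_relevant_page: bool  # agent navigated to a page relevant to the task
    accessed_evidence: bool      # agent actually opened the decisive hidden evidence
    claimed_evidence: bool       # agent's stated rationale cites that evidence
    decision_correct: bool       # final decision matches the ground-truth label


def rate(flags):
    """Fraction of True values; 0.0 for an empty sequence."""
    flags = list(flags)
    return sum(flags) / len(flags) if flags else 0.0


def summarize(episodes):
    """Aggregate illustrative rates over a list of Episode records."""
    discovered = [e for e in episodes if e.accessed_evidence]
    return {
        "navigation_rate": rate(e.reached_relevant_page for e in episodes),
        "discovery_rate": rate(e.accessed_evidence for e in episodes),
        "accuracy": rate(e.decision_correct for e in episodes),
        # Evidence cited in the rationale but never accessed during the run.
        "hallucinated_evidence_rate": rate(
            e.claimed_evidence and not e.accessed_evidence for e in episodes
        ),
        # Accuracy restricted to runs where the evidence was found; values
        # below 1.0 indicate evidence that was found but not integrated.
        "accuracy_given_discovery": rate(e.decision_correct for e in discovered),
    }


# Toy data, invented purely to exercise the function.
episodes = [
    Episode(True, True, True, True),
    Episode(True, False, True, False),
    Episode(True, False, False, False),
    Episode(False, False, False, True),
]
summary = summarize(episodes)
```

Keeping discovery and accuracy as separate rates is what makes the tradeoff described in the abstract visible: an intervention such as more explicit instructions could raise `discovery_rate` while lowering `accuracy`.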

RESULT

ScienceToStartup currently rates this 5.0/10 on the public viability pass. Providing more explicit instructions improves context discovery but often reduces overall accuracy, revealing a tradeoff between procedural compliance and effective judgement.

WHY NOW

Agents moved forward this cycle; last verified April 2026. Public score 5.0/10.

Opportunity summary

Score 5.0

Pain Develop PATHWAYS, a benchmark tool for evaluating the context discovery ability of AI web agents in multi-step decision tasks.

Evidence 0 refs | 0 sources | 17% coverage

Blocker missing authors

Competitive landscape

Segment: Agents
Adoption evidence: No public code link in the paper record yet
Commercial read: 5.0/10 public viability
Direct: not classified
Adjacent: not classified
Substitute: not classified
Unknown: not classified

Source: https://arxiv.org/abs/2602.05354 (arXiv:2602.05354) · Published 2026-02-05 · Free to access
