ARXIV:2605.16205 · LLM AGENTS · SUBMITTED 18 MAY · 20:31 UTC · FRESHNESS STALE

Verified Source: PDF linked · Verified PaperPack: citation fields available · Partial Proof: unverified proof status

Context, Reasoning, and Hierarchy: A Cost-Performance Study of Compound LLM Agent Design in an Adversarial POMDP

Igor Bogdanov · Chung-Horng Lung · Thomas Kunz · Jie Gao · Adrian Taylor · Marzia Zaman · arXiv

This study evaluates compound LLM agent design in adversarial environments, finding that programmatic state abstraction and hierarchical decomposition are more cost-effective than increased per-agent deliberation.

Blocked on Code · Score 4.0 · Evidence unverified

Opportunity summary

Pain This study evaluates compound LLM agent design in adversarial environments, finding that programmatic state abstraction and hierarchical decomposition are more cost-effective than increased per-agent deliberation.

Evidence 0 refs | 3 sources | 50% coverage

Blocker Evidence unverified



Full abstract

Deploying compound LLM agents in adversarial, partially observable sequential environments requires navigating several design dimensions: (1) what the agent sees, (2) how it reasons, and (3) how tasks are decomposed across components. Yet practitioners lack guidance on which design choices improve performance versus merely increase inference costs. We present a controlled study of compound LLM agent design in CybORG CAGE-2, a cyber defense environment modeled as a Partially Observable Markov Decision Process (POMDP). Reward is non-positive, so all configurations operate in a failure-mitigation mode. Our evaluation spans five model families, six models, and twelve configurations (3,475 episodes) with token-level cost accounting. We vary context representation (raw observations vs. a deterministic state-tracking layer with compressed history), deliberation (self-questioning, self-critique, and self-improvement tools, with optional chain-of-thought prompting), and hierarchical decomposition (monolithic ReAct vs. delegation to specialized sub-agents). We find that: (1) Programmatic state abstraction delivers the largest returns per token spent (RPTS), improving mean return by up to 76% over raw observations. (2) Distributing deliberation tools across a hierarchy degrades performance relative to hierarchy alone for all five model families, reaching up to 3.4$\times$ worse mean return while using 1.8-2.7$\times$ more tokens. We call this destructive pattern a deliberation cascade. (3) Hierarchical decomposition without deliberation achieves the best absolute performance for most models, and context engineering is generally more cost-effective than deliberation. These findings suggest a design principle for structured adversarial POMDPs: invest in programmatic infrastructure and clean task decomposition rather than deeper per-agent reasoning, as these strategies can interfere when combined.
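The abstract's "deterministic state-tracking layer with compressed history" can be pictured with a minimal sketch. This is not the authors' implementation: the class names, the host status labels, the action strings, and the `max_history` parameter are all hypothetical, chosen only to show how raw per-step observations might be folded into a compact summary before reaching the LLM. The helper `relative_improvement` is likewise an assumption about how a figure such as "76% over raw observations" could be computed when returns are non-positive (closer to zero is better); the paper may define it differently.

```python
from dataclasses import dataclass, field


@dataclass
class HostState:
    # Hypothetical status labels, e.g. "clean", "suspicious", "compromised".
    status: str = "clean"
    last_seen_step: int = -1


@dataclass
class StateTracker:
    """Deterministic layer that folds raw observations into a compact summary."""

    hosts: dict[str, HostState] = field(default_factory=dict)
    history: list[str] = field(default_factory=list)
    max_history: int = 5  # hypothetical cap on the compressed action history

    def update(self, step: int, observation: dict[str, str], action: str) -> None:
        # Overwrite the tracked status of every host mentioned this step.
        for host, status in observation.items():
            state = self.hosts.setdefault(host, HostState())
            state.status = status
            state.last_seen_step = step
        # Keep only the most recent actions so the prompt stays short.
        self.history.append(f"t={step}: {action}")
        self.history = self.history[-self.max_history:]

    def render(self) -> str:
        # Text block that would replace the raw observation in the prompt.
        lines = [
            f"{name}: {state.status} (seen t={state.last_seen_step})"
            for name, state in sorted(self.hosts.items())
        ]
        return "\n".join(lines + ["recent actions: " + "; ".join(self.history)])


def relative_improvement(baseline: float, candidate: float) -> float:
    """Fractional improvement of candidate over baseline for non-positive returns."""
    if baseline == 0:
        raise ValueError("baseline return must be non-zero")
    return (candidate - baseline) / abs(baseline)
```

As a usage illustration with made-up numbers, a baseline mean return of -100 and a candidate of -24 gives `relative_improvement(-100.0, -24.0)` of about 0.76, which is one way a 76% improvement could arise. The tracker is deliberately free of any model call, so it adds no inference tokens of its own.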

RESULT

ScienceToStartup currently rates this 4.0/10 on the public viability pass.

WHY NOW

LLM Agents moved forward this cycle; last verified May 2026. Public score 4.0/10.


Opportunity summary


Blocker no shell-level blocker reported


Competitive landscape

Segment: LLM Agents

Adoption evidence: No public code link in the paper record yet

Commercial read: 4.0/10 public viability

Direct: not classified

Adjacent: not classified

Substitute: not classified

Unknown: not classified


Source: https://arxiv.org/abs/2605.16205 · datePublished 2026-05-15T17:23:08.000Z

