ARXIV:2604.12177 · AGENTS · SUBMITTED 15 APR · 17:00 UTC · FRESHNESS STALE

Verified Source: PDF linked · Verified PaperPack: citation fields available · Partial Proof: unverified proof status

Policy-Invisible Violations in LLM-Based Agents

Jie Wu · Ming Gong · arXiv

A framework for LLM agents that enforces organizational policies by simulating world-state changes, significantly outperforming content-only baselines.

Ship in 2-4 weeks · Score 7.0 · Evidence unverified

Opportunity summary

Pain: A framework for LLM agents that enforces organizational policies by simulating world-state changes, significantly outperforming content-only baselines.

Evidence: 0 refs | 3 sources | 50% coverage

Blocker: Evidence unverified

PROBLEM

LLM-based agents can execute actions that are syntactically valid, user-sanctioned, and semantically appropriate, yet still violate organizational policy because the facts needed for correct policy judgment are hidden at decision time. We call this failure mode policy-invisible violations: cases in which compliance depends on entity attributes, contextual state,…

METHOD

Full abstract

LLM-based agents can execute actions that are syntactically valid, user-sanctioned, and semantically appropriate, yet still violate organizational policy because the facts needed for correct policy judgment are hidden at decision time. We call this failure mode policy-invisible violations: cases in which compliance depends on entity attributes, contextual state, or session history absent from the agent's visible context. We present PhantomPolicy, a benchmark spanning eight violation categories with balanced violation and safe-control cases, in which all tool responses contain clean business data without policy metadata. We manually review all 600 model traces produced by five frontier models and evaluate them using human-reviewed trace labels. Manual review changes 32 labels (5.3%) relative to the original case-level annotations, confirming the need for trace-level human review. To demonstrate what world-state-grounded enforcement can achieve under favorable conditions, we introduce Sentinel, an enforcement framework based on counterfactual graph simulation. Sentinel treats every agent action as a proposed mutation to an organizational knowledge graph, performs speculative execution to materialize the post-action world state, and verifies graph-structural invariants to decide Allow/Block/Clarify. Against human-reviewed trace labels, Sentinel substantially outperforms a content-only DLP baseline (93.0% vs. 68.8% accuracy) while maintaining high precision, though it still leaves room for improvement on certain violation categories. These results demonstrate what becomes achievable once policy-relevant world state is made available to the enforcement layer.
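The enforcement loop described in the abstract (propose a mutation, speculatively execute it, check invariants, decide Allow/Block/Clarify) can be illustrated with a minimal Python sketch. This is not Sentinel's implementation: the `Graph`, `AddEdge`, `simulate`, and `enforce` names, the `sensitivity` and `external` attributes, and the single sharing invariant are all hypothetical, and mapping "a needed fact is missing" to Clarify is an assumption made for illustration. The point it tries to show is that the recipient's `external` flag lives in the graph, not in anything the agent sees.

```python
from copy import deepcopy
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Decision(Enum):
    ALLOW = "Allow"
    BLOCK = "Block"
    CLARIFY = "Clarify"


@dataclass
class Graph:
    """Toy organizational knowledge graph (hypothetical schema)."""
    nodes: dict = field(default_factory=dict)  # node id -> attribute dict
    edges: set = field(default_factory=set)    # (source, relation, target)


@dataclass(frozen=True)
class AddEdge:
    """A proposed agent action, expressed as a graph mutation."""
    source: str
    relation: str
    target: str


def simulate(graph: Graph, action: AddEdge) -> Graph:
    """Speculatively apply the action to a copy; the live graph is untouched."""
    future = deepcopy(graph)
    future.edges.add((action.source, action.relation, action.target))
    return future


def no_restricted_share_to_external(graph: Graph) -> Optional[bool]:
    """Hypothetical invariant: restricted documents are never shared externally.

    Returns True if it holds, False if violated, None if a needed fact is missing.
    """
    unknown = False
    for source, relation, target in graph.edges:
        if relation != "shared_with":
            continue
        sensitivity = graph.nodes.get(source, {}).get("sensitivity")
        external = graph.nodes.get(target, {}).get("external")
        if sensitivity is None or external is None:
            unknown = True  # assumption: missing attributes defer the decision
        elif sensitivity == "restricted" and external:
            return False
    return None if unknown else True


def enforce(graph: Graph, action: AddEdge, invariants: list) -> Decision:
    """Check every invariant against the simulated post-action state."""
    future = simulate(graph, action)
    results = [check(future) for check in invariants]
    if any(r is False for r in results):
        return Decision.BLOCK
    if any(r is None for r in results):
        return Decision.CLARIFY
    return Decision.ALLOW


# Usage: the same "share" action gets three different verdicts depending on
# recipient attributes that would not appear in a clean tool response.
org = Graph(nodes={
    "doc:roadmap": {"sensitivity": "restricted"},
    "user:alice": {"external": False},
    "user:vendor": {"external": True},
    "user:new_hire": {},
})
checks = [no_restricted_share_to_external]

internal = enforce(org, AddEdge("doc:roadmap", "shared_with", "user:alice"), checks)
external = enforce(org, AddEdge("doc:roadmap", "shared_with", "user:vendor"), checks)
unknown = enforce(org, AddEdge("doc:roadmap", "shared_with", "user:new_hire"), checks)
print(internal.value, external.value, unknown.value)
```

Running the check on a deep copy keeps the verdict side-effect free: the action is only committed to the real graph if the caller acts on an Allow decision. A content-only filter looking at the document text alone would see the same payload in all three cases.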

RESULT

ScienceToStartup currently rates this 7.0/10 on the public viability pass. To demonstrate what world-state-grounded enforcement can achieve under favorable conditions, we introduce Sentinel, an enforcement framework based on counterfactual graph simulation. Code availability is…

WHY NOW

Agents moved forward this cycle; last verified April 2026. Public score 7.0/10. Production flags indicate code availability.

Opportunity summary

Score: 7.0

Pain: A framework for LLM agents that enforces organizational policies by simulating world-state changes, significantly outperforming content-only baselines.

Evidence: 0 refs | 3 sources | 50% coverage

Blocker: no shell-level blocker reported

Competitive landscape

A framework for LLM agents that enforces organizational policies by simulating world-state changes, significantly outperforming content-only baselines.

Segment: Agents

Adoption evidence: No public code link in the paper record yet

Commercial read: 7.0/10 public viability

Direct: not classified

Adjacent: not classified

Substitute: not classified

Unknown: not classified
