ARXIV:2603.17639 · AGENTS · SUBMITTED 19 MAR · 21:58 UTC · FRESHNESS STALE

VerifiedSource: PDF linkedPartialPaperPack: 3 of 4 citation fields filledMissingMissing fields: authorsPartialProof: partial proof status

VeriGrey: Greybox Agent Validation

arXiv

VeriGrey is a grey-box testing framework that uncovers security risks in LLM agents through dynamic prompt mutation.

Blocked on Code›Score7.0Evidence partial

Opportunity summary

Pain VeriGrey is a grey-box testing framework that uncovers security risks in LLM agents through dynamic prompt mutation.

Evidence 0 refs | 0 sources | 50% coverage

Blocker Evidence partial

Open Build Read PDF Signal Canvas Track

PROBLEM

VeriGrey is a grey-box testing framework that uncovers security risks in LLM agents through dynamic prompt mutation. A Large Language Model (LLM) agent involves one or more LLMs in the back-end.

METHOD

Agentic AI has been a topic of great interest recently. A Large Language Model (LLM) agent involves one or more LLMs in the back-end.

Full abstract

Agentic AI has been a topic of great interest recently. A Large Language Model (LLM) agent involves one or more LLMs in the back-end. In the front end, it conducts autonomous decision-making by combining the LLM outputs with results obtained by invoking several external tools. The autonomous interactions with the external environment introduce critical security risks. In this paper, we present a grey-box approach to explore diverse behaviors and uncover security risks in LLM agents. Our approach VeriGrey uses the sequence of tools invoked as a feedback function to drive the testing process. This helps uncover infrequent but dangerous tool invocations that cause unexpected agent behavior. As mutation operators in the testing process, we mutate prompts to design pernicious injection prompts. This is carefully accomplished by linking the task of the agent to an injection task, so that the injection task becomes a necessary step of completing the agent functionality. Comparing our approach with a black-box baseline on the well-known AgentDojo benchmark, VeriGrey achieves 33% additional efficacy in finding indirect prompt injection vulnerabilities with a GPT-4.1 back-end. We also conduct real-world case studies with the widely used coding agent Gemini CLI, and the well-known OpenClaw personal assistant. VeriGrey finds prompts inducing several attack scenarios that could not be identified by black-box approaches. In OpenClaw, by constructing a conversation agent which employs mutational fuzz testing as needed, VeriGrey is able to discover malicious skill variants from 10 malicious skills (with 10/10= 100% success rate on the Kimi-K2.5 LLM backend, and 9/10= 90% success rate on Opus 4.6 LLM backend). This demonstrates the value of a dynamic approach like VeriGrey to test agents, and to eventually lead to an agent assurance framework.

RESULT

ScienceToStartup currently rates this 7.0/10 on the public viability pass. In the front end, it conducts autonomous decision-making by combining the LLM outputs with results obtained by invoking several external tools. A public repository…

WHY NOW

Agents moved forward this cycle; last verified April 2026. Public score 7.0/10. Implementation evidence is present through a linked repository.

Continue into Read for claims, analysis, references, and neighboring papers.

Opportunity summary

Score7.0

PainVeriGrey is a grey-box testing framework that uncovers security risks in LLM agents through dynamic prompt mutation.

Evidence0 refs | 0 sources | 50% coverage

Blockermissing authors

Analysis summary

VeriGrey is a grey-box testing framework that uncovers security risks in LLM agents through dynamic prompt mutation.

VerifiedSource: PDF linkedPartialPaperPack: 3 of 4 citation fields filledMissingMissing fields: authorsPartialProof: partial proof status

Competitive landscape

VeriGrey is a grey-box testing framework that uncovers security risks in LLM agents through dynamic prompt mutation.

Segment

Agents

Adoption evidence

Public code linked for build inspection

Commercial read

7.0/10 public viability

Direct

not classified

Adjacent

not classified

Substitute

not classified

Unknown

not classified

{ "contract_version": "paper-r2", "paper_id": "8fba0424-e447-4349-858e-aea9fe635570", "arxiv_id": "2603.17639", "canonical_route": "/paper/verigrey-greybox-agent-validation", "active_tab": "synced from current hash by the drawer client", "selected_artifact": "verigrey-greybox-agent-validation", "endpoints": { "paper_pack": "/api/v1/paper/verigrey-greybox-agent-validation/paper-pack", "build_passport": "/api/v1/paper/verigrey-greybox-agent-validation/build-passport", "mcp_resource": "sciencetostartup://surfaces/paper-workspace" } }

{ "surface": "paper", "mode": "paper", "query": "VeriGrey: Greybox Agent Validation", "normalized_query": "2603.17639", "route": "/paper/verigrey-greybox-agent-validation", "paper_ref": "verigrey-greybox-agent-validation", "topic_slug": null, "benchmark_ref": null, "dataset_ref": null }

{ "@context": "https://schema.org", "@graph": [ { "@type": "WebPage", "@id": "https://sciencetostartup.com/paper/verigrey-greybox-agent-validation#webpage", "url": "https://sciencetostartup.com/paper/verigrey-greybox-agent-validation", "name": "VeriGrey: Greybox Agent Validation", "description": "VeriGrey is a grey-box testing framework that uncovers security risks in LLM agents through dynamic prompt mutation.", "isPartOf": { "@id": "https://sciencetostartup.com/#website" } }, { "@type": "ScholarlyArticle", "@id": "https://sciencetostartup.com/paper/verigrey-greybox-agent-validation#scholarlyArticle", "headline": "VeriGrey: Greybox Agent Validation", "description": "VeriGrey is a grey-box testing framework that uncovers security risks in LLM agents through dynamic prompt mutation.", "url": "https://sciencetostartup.com/paper/verigrey-greybox-agent-validation", "sameAs": "https://arxiv.org/abs/2603.17639", "identifier": { "@type": "PropertyValue", "propertyID": "arXiv", "value": "2603.17639" }, "isAccessibleForFree": true, "isPartOf": { "@id": "https://sciencetostartup.com/#website" }, "datePublished": "2026-03-18T12:00:54.000Z", "codeRepository": "https://github.com/modelcontextprotocol", "additionalProperty": [ { "@type": "PropertyValue", "propertyID": "viabilityScore", "value": 7 }, { "@type": "PropertyValue", "propertyID": "researchDomain", "value": "Agents" } ] }, { "@type": "SoftwareSourceCode", "@id": "https://sciencetostartup.com/paper/verigrey-greybox-agent-validation#software", "name": "VeriGrey: Greybox Agent Validation - Source Code", "description": "VeriGrey is a grey-box testing framework that uncovers security risks in LLM agents through dynamic prompt mutation.", "codeRepository": "https://github.com/modelcontextprotocol", "url": "https://github.com/modelcontextprotocol" }, { "@type": "BreadcrumbList", "itemListElement": [ { "@type": "ListItem", "position": 1, "name": "Home", "item": "https://sciencetostartup.com" }, { "@type": "ListItem", "position": 2, "name": "Agents", "item": "https://sciencetostartup.com/topics" }, { "@type": "ListItem", "position": 3, "name": "VeriGrey: Greybox Agent Validation", "item": "https://sciencetostartup.com/paper/verigrey-greybox-agent-validation" } ] } ] }

Competitive landscape

VeriGrey is a grey-box testing framework that uncovers security risks in LLM agents through dynamic prompt mutation.

Segment

Agents

Adoption evidence

Public code linked for build inspection

Commercial read

7.0/10 public viability

Direct

not classified

Adjacent

not classified

Substitute

not classified

Unknown

not classified

VeriGrey: Greybox Agent Validation

VeriGrey: Greybox Agent Validation

Claim map

Constellation map

Competitive landscape

Buzz

PDF

REFERENCES

Related Papers

Related Resources

Subscribe to the weekly brief

Build artifacts

Brief

Experiment plan

Validation checklist

Scientific founder

Translational engineer

Domain operator

GTM lead

Regulatory/clinical advisor

Timeline

Claim map

Constellation map

Competitive landscape

Buzz

PDF

REFERENCES

Related Papers

Related Resources

Subscribe to the weekly brief

Build artifacts

Brief

Experiment plan

Validation checklist

Scientific founder

Translational engineer

Domain operator

GTM lead

Regulatory/clinical advisor

Timeline