ARXIV:2605.00424 · AGENTS · SUBMITTED 04 MAY · 20:25 UTC · FRESHNESS STALE

Source: PDF linked (Verified) | PaperPack: citation fields available (Verified) | Proof: unverified proof status (Partial)

Skills as Verifiable Artifacts: A Trust Schema and a Biconditional Correctness Criterion for Human-in-the-Loop Agent Runtimes

Alfredo Metere · arXiv

A trust schema and verification criterion for LLM agent skills to ensure runtime security and sustainability.

Blocked on Code | Score 3.0 | Evidence unverified

Opportunity summary

Pain: A trust schema and verification criterion for LLM agent skills to ensure runtime security and sustainability.

Evidence: 0 refs | 3 sources | 50% coverage

Blocker: Evidence unverified

PROBLEM

A trust schema and verification criterion for LLM agent skills to ensure runtime security and sustainability. The runtime that loads them inherits the same problem package managers and operating systems have always faced: a…

METHOD

Full abstract

Agent skills -- structured packages of instructions, scripts, and references that augment a large language model (LLM) without modifying the model itself -- have moved from convenience to first-class deployment artifact. The runtime that loads them inherits the same problem package managers and operating systems have always faced: a piece of content claims a behavior; the runtime must decide whether to believe it. We argue this paper's central thesis up front: a skill is \emph{untrusted code} until it is verified, and the runtime that loads it must enforce that default rather than infer trust from a signature, a clearance, or a registry of origin. Without skill verification, a human-in-the-loop (HITL) gate must fire on every irreversible call -- which is operationally untenable and degrades into rubber-stamping at any non-trivial scale. With skill verification treated as a separate, gated process, HITL fires only for what is unverified, and the system becomes sustainable. We give a trust schema (§\ref{sec:schema}) that includes an explicit verification level on every skill manifest; a capability gate (§\ref{sec:gate}) whose HITL policy is a function of that verification level; a \emph{biconditional} correctness criterion (§\ref{sec:biconditional}) that any candidate verification procedure must satisfy on an adversarial-ensemble exercise (§\ref{sec:eval}); and a portable runtime profile (§\ref{sec:guidelines}) with ten normative guidelines abstracted from a working open-source reference implementation \cite{metere2026enclawed}. The contribution is harness- and model-agnostic; nothing here requires retraining, fine-tuning, or proprietary infrastructure.
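The gating idea in the abstract can be illustrated with a minimal sketch. It assumes a two-level verification field and a per-call irreversibility flag; the names `VerificationLevel`, `SkillManifest`, `Call`, and `requires_hitl` are hypothetical and are not taken from the paper or its reference implementation, whose actual schema may carry more levels and fields.

```python
from dataclasses import dataclass
from enum import IntEnum


class VerificationLevel(IntEnum):
    """Hypothetical levels; the paper's schema may define others."""
    UNVERIFIED = 0
    VERIFIED = 1


@dataclass(frozen=True)
class SkillManifest:
    name: str
    # Default is UNVERIFIED: a skill is treated as untrusted code until verified.
    verification: VerificationLevel = VerificationLevel.UNVERIFIED
    # Assumed field: a signature is recorded but never consulted by the gate.
    signed: bool = False


@dataclass(frozen=True)
class Call:
    action: str
    irreversible: bool


def requires_hitl(skill: SkillManifest, call: Call) -> bool:
    """Assumed policy: ask a human only for irreversible calls from unverified skills."""
    return call.irreversible and skill.verification < VerificationLevel.VERIFIED


delete = Call("delete_records", irreversible=True)
read = Call("list_records", irreversible=False)
new_skill = SkillManifest("cleanup", signed=True)  # signed, but not verified
vetted = SkillManifest("cleanup", VerificationLevel.VERIFIED)

print(requires_hitl(new_skill, delete))  # True: a signature alone does not grant trust
print(requires_hitl(vetted, delete))     # False: verified skill, gate stays quiet
print(requires_hitl(new_skill, read))    # False: reversible call
```

In this sketch the gate reads only the verification level, never the `signed` flag, which mirrors the abstract's point that trust should not be inferred from a signature, a clearance, or a registry of origin.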

RESULT

ScienceToStartup currently rates this 3.0/10 on the public viability pass. The contribution is harness- and model-agnostic; nothing here requires retraining, fine-tuning, or proprietary infrastructure.

WHY NOW

Agents moved forward this cycle; last verified May 2026. Public score 3.0/10.

Shell-level blocker: none reported

Competitive landscape

Segment: Agents

Adoption evidence: No public code link in the paper record yet

Commercial read: 3.0/10 public viability

Direct: not classified

Adjacent: not classified

Substitute: not classified

Unknown: not classified

Source: https://arxiv.org/abs/2605.00424 | Published: 2026-05-01 | Domain: Agents
