ARXIV:2604.02268 · AGENTIC LLMS · SUBMITTED 03 APR · 20:30 UTC · FRESHNESS STALE

Verified Source: PDF linked · Verified PaperPack: citation fields available

SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization

Zhengxi Lu · Zhiyuan Yao · Jinyang Wu · Chengcheng Han · Qi Gu · Xunliang Cai · Weiming Lu · Jun Xiao · Yueting Zhuang · Yongliang Shen

Internalize LLM agent skills into model parameters for zero-shot autonomous behavior, reducing token overhead and improving efficiency.

Ship in 2-4 weeks · Score 7.0 · Evidence verified

Opportunity summary

Pain Internalize LLM agent skills into model parameters for zero-shot autonomous behavior, reducing token overhead and improving efficiency.

Evidence 0 refs | 0 sources | 67% coverage

Blocker Evidence verified


PROBLEM

Internalize LLM agent skills into model parameters for zero-shot autonomous behavior, reducing token overhead and improving efficiency. Yet inference-time skill augmentation is fundamentally limited: retrieval noise introduces irrelevant guidance, injected skill content imposes substantial…

METHOD

Full abstract

Agent skills, structured packages of procedural knowledge and executable resources that agents dynamically load at inference time, have become a reliable mechanism for augmenting LLM agents. Yet inference-time skill augmentation is fundamentally limited: retrieval noise introduces irrelevant guidance, injected skill content imposes substantial token overhead, and the model never truly acquires the knowledge it merely follows. We ask whether skills can instead be internalized into model parameters, enabling zero-shot autonomous behavior without any runtime skill retrieval. We introduce SKILL0, an in-context reinforcement learning framework designed for skill internalization. SKILL0 introduces a training-time curriculum that begins with full skill context and progressively withdraws it. Skills are grouped offline by category and rendered with interaction history into a compact visual context, teaching the model tool invocation and multi-turn task completion. A Dynamic Curriculum then evaluates each skill file's on-policy helpfulness, retaining only those from which the current policy still benefits within a linearly decaying budget, until the agent operates in a fully zero-shot setting. Extensive agentic experiments demonstrate that SKILL0 achieves substantial improvements over the standard RL baseline (+9.7% for ALFWorld and +6.6% for Search-QA), while maintaining a highly efficient context of fewer than 0.5k tokens per step. Our code is available at https://github.com/ZJU-REAL/SkillZero.
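The Dynamic Curriculum described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how on-policy helpfulness is measured or how the budget is parameterized, so the sketch assumes helpfulness is the gap in mean return with and without a skill file in context, and that the budget counts skill files and decays linearly to zero over training. The names `SkillStat`, `linear_budget`, and `select_skills` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SkillStat:
    """Hypothetical per-skill record gathered from on-policy rollouts."""
    name: str
    return_with: float     # mean return when the skill file is in context
    return_without: float  # mean return when it is withheld

    @property
    def helpfulness(self) -> float:
        # Assumption: helpfulness is the return gap attributable to the skill.
        return self.return_with - self.return_without


def linear_budget(step: int, total_steps: int, initial_budget: int) -> int:
    """Number of skill files allowed in context, decaying linearly to zero."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    remaining_steps = max(0, total_steps - step)
    # Integer arithmetic keeps the schedule exact (floor of the linear ramp).
    return initial_budget * remaining_steps // total_steps


def select_skills(stats: list[SkillStat], step: int, total_steps: int,
                  initial_budget: int) -> list[str]:
    """Keep only skills the policy still benefits from, within the budget."""
    budget = linear_budget(step, total_steps, initial_budget)
    helpful = [s for s in stats if s.helpfulness > 0]
    helpful.sort(key=lambda s: s.helpfulness, reverse=True)
    return [s.name for s in helpful[:budget]]


if __name__ == "__main__":
    stats = [
        SkillStat("pick", 0.8, 0.5),
        SkillStat("heat", 0.6, 0.6),
        SkillStat("clean", 0.9, 0.7),
    ]
    for step in (0, 5, 10):
        print(step, select_skills(stats, step, total_steps=10, initial_budget=3))
```

Under these assumptions, a skill is dropped as soon as the policy performs equally well without it, and the shrinking budget forces the remaining skills out of context by the final step, which corresponds to the fully zero-shot setting the abstract describes.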

RESULT

ScienceToStartup currently rates this 7.0/10 on the public viability pass. Extensive agentic experiments demonstrate that SKILL0 achieves substantial improvements over the standard RL baseline (+9.7% for ALFWorld and +6.6% for Search-QA), while maintaining a…

WHY NOW

Agentic LLMs moved forward this cycle; last verified April 2026. Public score 7.0/10. Implementation evidence is present through a linked repository.


Opportunity summary

Score 7.0

Pain Internalize LLM agent skills into model parameters for zero-shot autonomous behavior, reducing token overhead and improving efficiency.

Evidence 0 refs | 0 sources | 67% coverage

Blocker no shell-level blocker reported

Analysis summary

Internalize LLM agent skills into model parameters for zero-shot autonomous behavior, reducing token overhead and improving efficiency.


Competitive landscape

Internalize LLM agent skills into model parameters for zero-shot autonomous behavior, reducing token overhead and improving efficiency.

Segment

Agentic LLMs

Adoption evidence

Public code linked for build inspection

Commercial read

7.0/10 public viability

Direct

not classified

Adjacent

not classified

Substitute

not classified

Unknown

not classified


{ "@context": "https://schema.org", "@graph": [ { "@type": "WebPage", "@id": "https://sciencetostartup.com/paper/skill0-in-context-agentic-reinforcement-learning-for-skill-internalization#webpage", "url": "https://sciencetostartup.com/paper/skill0-in-context-agentic-reinforcement-learning-for-skill-internalization", "name": "SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization", "description": "Internalize LLM agent skills into model parameters for zero-shot autonomous behavior, reducing token overhead and improving efficiency.", "isPartOf": { "@id": "https://sciencetostartup.com/#website" } }, { "@type": "ScholarlyArticle", "@id": "https://sciencetostartup.com/paper/skill0-in-context-agentic-reinforcement-learning-for-skill-internalization#scholarlyArticle", "headline": "SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization", "description": "Internalize LLM agent skills into model parameters for zero-shot autonomous behavior, reducing token overhead and improving efficiency.", "url": "https://sciencetostartup.com/paper/skill0-in-context-agentic-reinforcement-learning-for-skill-internalization", "sameAs": "https://arxiv.org/abs/2604.02268", "identifier": { "@type": "PropertyValue", "propertyID": "arXiv", "value": "2604.02268" }, "isAccessibleForFree": true, "isPartOf": { "@id": "https://sciencetostartup.com/#website" }, "datePublished": "2026-04-02T17:03:05.000Z", "author": [ { "@type": "Person", "name": "Zhengxi Lu" }, { "@type": "Person", "name": "Zhiyuan Yao" }, { "@type": "Person", "name": "Jinyang Wu" }, { "@type": "Person", "name": "Chengcheng Han" }, { "@type": "Person", "name": "Qi Gu" }, { "@type": "Person", "name": "Xunliang Cai" }, { "@type": "Person", "name": "Weiming Lu" }, { "@type": "Person", "name": "Jun Xiao" }, { "@type": "Person", "name": "Yueting Zhuang" }, { "@type": "Person", "name": "Yongliang Shen" } ], "codeRepository": "https://github.com/ZJU-REAL/SkillZero", "additionalProperty": [ { "@type": 
"PropertyValue", "propertyID": "viabilityScore", "value": 7 }, { "@type": "PropertyValue", "propertyID": "researchDomain", "value": "Agentic LLMs" }, { "@type": "PropertyValue", "propertyID": "commercialReadiness", "value": "code, repo url" } ] }, { "@type": "SoftwareSourceCode", "@id": "https://sciencetostartup.com/paper/skill0-in-context-agentic-reinforcement-learning-for-skill-internalization#software", "name": "SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization - Source Code", "description": "Internalize LLM agent skills into model parameters for zero-shot autonomous behavior, reducing token overhead and improving efficiency.", "codeRepository": "https://github.com/ZJU-REAL/SkillZero", "url": "https://github.com/ZJU-REAL/SkillZero" }, { "@type": "BreadcrumbList", "itemListElement": [ { "@type": "ListItem", "position": 1, "name": "Home", "item": "https://sciencetostartup.com" }, { "@type": "ListItem", "position": 2, "name": "Agentic LLMs", "item": "https://sciencetostartup.com/topics" }, { "@type": "ListItem", "position": 3, "name": "SKILL0: In-Context Agentic Reinforcement Learning for Skill ", "item": "https://sciencetostartup.com/paper/skill0-in-context-agentic-reinforcement-learning-for-skill-internalization" } ] } ] }

