To See is Not to Master: Teaching LLMs to Use Private Libraries for Code Generation introduces PriCoder, which enables LLMs to use private-library APIs effectively by synthesizing training data and improving its diversity and quality. Commercial viability score: 7/10 in AI Code Generation.
6mo ROI: 0.5-1x
3yr ROI: 6-15x
GPU-heavy products carry higher infrastructure costs but support premium pricing. Expect break-even by 12 months, then 40%+ margins at scale.
High Potential: 2/4 signals
Quick Build: 4/4 signals
Series A Potential: 2/4 signals
Sources used for this analysis:
arXiv Paper: full-text PDF analysis of the research paper
GitHub Repository: code availability, stars, and contributor activity
Citation Network: Semantic Scholar citations and co-citation patterns
Community Predictions: crowd-sourced unicorn probability assessments
Analysis model: GPT-4o · Last scored: 4/2/2026
This research aims to overcome the limitations of LLMs in generating code using private libraries, a critical challenge in real-world software development scenarios where such libraries are extensively used but rarely included in training data.
Productize PriCoder as a code generation tool that can be integrated with developer platforms to enhance LLMs' ability to use proprietary APIs effectively, especially for enterprise clients that use custom libraries.
PriCoder could disrupt current API-adoption workflows by making it easier for LLMs to generate code that uses private libraries, potentially replacing manual API-documentation lookup.
The solution targets enterprise software development, where private libraries are common. Companies would pay for improved developer productivity and faster onboarding by integrating private libraries into LLM training data efficiently.
Develop an AI-based IDE plugin that suggests optimized API calls for private libraries, enhancing developer productivity and code reliability.
PriCoder addresses the challenge of teaching LLMs to use private library APIs by synthesizing diverse and high-quality training data. It constructs and refines a graph where nodes represent training samples, progressively increasing diversity and ensuring high quality through a rigorous pruning process.
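The graph construction and pruning loop described above might be sketched as follows. The `SampleGraph` class, quality scores, and `mutate` hook are illustrative assumptions for exposition, not the paper's actual implementation:

```python
# Hypothetical sketch of a graph-based training-data synthesis loop in the
# spirit of PriCoder: nodes are training samples, expansion grows diversity,
# and pruning enforces quality. All names and thresholds are assumptions.

class SampleGraph:
    def __init__(self):
        self.nodes = {}     # sample id -> (prompt, code, quality score)
        self.edges = set()  # (parent id, child id) links between samples

    def add_sample(self, sid, prompt, code, quality):
        self.nodes[sid] = (prompt, code, quality)

    def link(self, parent, child):
        self.edges.add((parent, child))

    def expand(self, mutate):
        """Grow diversity: derive one new sample from each existing node.
        `mutate` maps (prompt, code) to a new (prompt, code) pair; here the
        derived sample inherits its parent's quality score for simplicity."""
        for sid in list(self.nodes):
            prompt, code, quality = self.nodes[sid]
            child_id = f"{sid}-v2"
            self.add_sample(child_id, *mutate(prompt, code), quality)
            self.link(sid, child_id)

    def prune(self, min_quality):
        """Drop low-quality samples and any edges touching them."""
        keep = {s for s, (_, _, q) in self.nodes.items() if q >= min_quality}
        self.nodes = {s: v for s, v in self.nodes.items() if s in keep}
        self.edges = {(a, b) for a, b in self.edges
                      if a in keep and b in keep}
```

In this sketch, alternating `expand` and `prune` rounds realize the "progressively increasing diversity" and "rigorous pruning" described above; a real system would re-score derived samples rather than inherit the parent's quality.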
The method utilizes a graph-based approach to synthesize and refine training data for private APIs, significantly improving LLM performance on custom code generation tasks, demonstrated by a 20% gain in pass@1 rates on new benchmarks.
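The pass@1 figure refers to the standard pass@k metric for code generation. The unbiased estimator below follows Chen et al. (2021); the benchmark and the 20% gain are the paper's own results, not reproduced here:

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k estimator: the probability that at least one of k
    samples drawn from n generations passes, given that c of the n
    generations pass all unit tests (Chen et al., 2021)."""
    if n - c < k:
        return 1.0  # fewer failures than draws, so some draw must pass
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For k = 1 this reduces to the simple pass fraction c/n, which is the quantity a "20% gain in pass@1" is measured on.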
A potential limitation is the dependence on the accuracy of synthesized data. Poorly synthesized data could degrade performance, and scalability might be limited if the private libraries evolve faster than the model retraining cycle.