ARXIV:2603.24837 · PROGRAM ANALYSIS · SUBMITTED 02 APR · 02:30 UTC · FRESHNESS STALE

Verified Source: PDF linked | Verified PaperPack: citation fields available | Partial Proof: unverified proof status

Bridging Code Property Graphs and Language Models for Program Analysis

Ahmed Lekssays · arXiv

A server that integrates code property graphs with LLMs to enable semantic code analysis across entire repositories for vulnerability discovery and patching.

Ship in 2-4 weeks | Score 7.0 | Evidence unverified

Opportunity summary

Pain A server that integrates code property graphs with LLMs to enable semantic code analysis across entire repositories for vulnerability discovery and patching.

Evidence 0 refs | 0 sources | 17% coverage

Blocker Evidence unverified

PROBLEM

Token limits, code embeddings that miss inter-procedural data flows, and the difficulty of generating complex static analysis queries limit how LLMs analyze real-world codebases. These limitations force existing approaches to operate on isolated code snippets, missing…

METHOD

Full abstract

Large Language Models (LLMs) face critical challenges when analyzing security vulnerabilities in real-world codebases: token limits prevent loading entire repositories, code embeddings fail to capture inter-procedural data flows, and LLMs struggle to generate complex static analysis queries. These limitations force existing approaches to operate on isolated code snippets, missing vulnerabilities that span multiple functions and files. We introduce codebadger, an open-source Model Context Protocol (MCP) server that integrates Joern's Code Property Graph (CPG) engine with LLMs. Rather than requiring LLMs to generate complex CPG queries, codebadger provides high-level tools for program slicing, taint tracking, data flow analysis, and semantic code navigation, enabling targeted exploration of large codebases without exhaustive file reading. We demonstrate its effectiveness through three use cases: (1) navigating an 8,000-method codebase to audit memory safety patterns, (2) discovering and exploiting a previously unreported buffer overflow in libtiff, and (3) generating a correct patch for an integer overflow vulnerability (CVE-2025-6021) in libxml2 on the first attempt. codebadger enables LLMs to reason about code semantically across entire repositories, supporting vulnerability discovery, patching, and program comprehension at scale.
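
To make the taint-tracking and slicing ideas in the abstract concrete, here is a minimal Python sketch over a toy data-flow graph. It is not codebadger's or Joern's API: the node names, the `FLOWS` edge list, and the functions `taint_paths` and `backward_slice` are all hypothetical and chosen only for illustration. A real code property graph also carries syntax and control-flow edges, which this sketch leaves out.

```python
from collections import deque

# Toy data-flow graph (hypothetical). An edge (a, b) means
# "the value at a may flow into b". Names are written as function:variable
# so that flows crossing function boundaries are visible.
FLOWS = [
    ("read_input:buf", "parse_header:data"),
    ("parse_header:data", "parse_header:length"),
    ("parse_header:length", "copy_payload:n"),
    ("copy_payload:n", "memcpy:size"),
    ("config:max_len", "log_limit:value"),
]


def build_graph(edges, reverse=False):
    """Build an adjacency map; reverse=True flips every edge."""
    graph = {}
    for src, dst in edges:
        if reverse:
            src, dst = dst, src
        graph.setdefault(src, []).append(dst)
    return graph


def find_path(graph, start, goal):
    """Breadth-first search; returns a shortest path as a list, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def taint_paths(edges, sources, sinks):
    """Taint tracking as reachability: one path per reachable (source, sink) pair."""
    graph = build_graph(edges)
    paths = []
    for source in sources:
        for sink in sinks:
            path = find_path(graph, source, sink)
            if path is not None:
                paths.append(path)
    return paths


def backward_slice(edges, criterion):
    """Backward slice: every node whose value can reach the criterion node."""
    graph = build_graph(edges, reverse=True)
    seen = {criterion}
    stack = [criterion]
    while stack:
        node = stack.pop()
        for prev in graph.get(node, []):
            if prev not in seen:
                seen.add(prev)
                stack.append(prev)
    return seen


if __name__ == "__main__":
    for path in taint_paths(FLOWS, ["read_input:buf"], ["memcpy:size"]):
        print(" -> ".join(path))
    print(sorted(backward_slice(FLOWS, "memcpy:size")))
```

Running the script prints the single path from `read_input:buf` to `memcpy:size` and the slice for `memcpy:size`; the unrelated `config:max_len` flow appears in neither. The point of the sketch is that a caller only names a source, a sink, or a slicing criterion, which is the kind of high-level request the abstract contrasts with hand-written graph queries.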

RESULT

ScienceToStartup currently rates this 7.0/10 on the public viability pass. The paper demonstrates codebadger's effectiveness through three use cases: (1) navigating an 8,000-method codebase to audit memory safety patterns, (2) discovering and exploiting a…

WHY NOW

Program Analysis moved forward this cycle; last verified April 2026. Public score 7.0/10. Production flags indicate code availability.

Opportunity summary

Score 7.0

Pain A server that integrates code property graphs with LLMs to enable semantic code analysis across entire repositories for vulnerability discovery and patching.

Evidence 0 refs | 0 sources | 17% coverage

Blocker no shell-level blocker reported

Competitive landscape

Segment: Program Analysis

Adoption evidence: No public code link in the paper record yet

Commercial read: 7.0/10 public viability

Direct: not classified

Adjacent: not classified

Substitute: not classified

Unknown: not classified

arXiv: https://arxiv.org/abs/2603.24837

Published: 2026-03-25T22:08:04.000Z

Page: https://sciencetostartup.com/paper/bridging-code-property-graphs-and-language-models-for-program-analysis
