ARXIV:2604.01977 · AI FOR CYBERSECURITY · SUBMITTED 03 APR · 20:50 UTC · FRESHNESS STALE

Source: PDF linked (verified) · PaperPack: citation fields available (verified) · Proof: unverified proof status (partial)

RuleForge: Automated Generation and Validation for Web Vulnerability Detection at Scale

Ayush Garg · Sophia Hager · Jacob Montiel · Aditya Tiwari · Michael Gentile · Zach Reavis · David Magnotti · Wayne Fullen

RuleForge uses LLMs to generate security detection rules from vulnerability descriptions; its LLM-as-a-judge validation is reported to cut false positives by 67% relative to synthetic-test-only validation.

Ship in 2-4 weeks · Score 7.0 · Evidence unverified

Opportunity summary

Pain: RuleForge automates the generation of security detection rules from vulnerability descriptions using LLMs, significantly reducing false positives and improving detection capacity.

Evidence: 0 refs | 0 sources | 33% coverage

Blocker: Evidence unverified

PROBLEM

In 2025, the National Vulnerability Database published over 48,000 new vulnerabilities, motivating the…

METHOD

Full abstract

Security teams face a challenge: the volume of newly disclosed Common Vulnerabilities and Exposures (CVEs) far exceeds the capacity to manually develop detection mechanisms. In 2025, the National Vulnerability Database published over 48,000 new vulnerabilities, motivating the need for automation. We present RuleForge, an AWS internal system that automatically generates detection rules (JSON-based patterns that identify malicious HTTP requests exploiting specific vulnerabilities) from structured Nuclei templates describing CVE details. Nuclei templates provide standardized, YAML-based vulnerability descriptions that serve as the structured input for our rule generation process. This paper focuses on RuleForge's architecture and operational deployment for CVE-related threat detection, with particular emphasis on our novel LLM-as-a-judge (Large Language Model as judge) confidence validation system and systematic feedback integration mechanism. This validation approach evaluates candidate rules across two dimensions, sensitivity (avoiding false negatives) and specificity (avoiding false positives), achieving an AUROC of 0.75 and reducing false positives by 67% compared to synthetic-test-only validation in production. Our 5x5 generation strategy (five parallel candidates with up to five refinement attempts each) combined with continuous feedback loops enables systematic quality improvement. We also present extensions enabling rule generation from unstructured data sources and demonstrate a proof-of-concept agentic workflow for multi-event-type detection. Our lessons learned highlight critical considerations for applying LLMs to cybersecurity tasks, including overconfidence mitigation and the importance of domain expertise in both prompt design and quality review of generated rules through human-in-the-loop validation.
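The abstract's validation metrics can be made concrete with a small sketch. It is not RuleForge's implementation: the function names, the example confidence scores, and the labels are all hypothetical. It only shows the standard definitions of sensitivity and specificity, plus AUROC computed as the fraction of (good rule, bad rule) pairs in which the judge gives the good rule the higher confidence.

```python
from itertools import product
from typing import Sequence


def sensitivity(tp: int, fn: int) -> float:
    """Share of real exploit requests the rule matches (low false negatives)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tn: int, fp: int) -> float:
    """Share of benign requests the rule leaves alone (low false positives)."""
    return tn / (tn + fp) if (tn + fp) else 0.0


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC via pairwise ranking; ties between a good and a bad rule count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one good and one bad rule")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical judge confidences for six candidate rules
# (label 1 = rule later confirmed good, 0 = rule later confirmed bad).
confidences = [0.9, 0.8, 0.4, 0.7, 0.3, 0.6]
confirmed = [1, 1, 1, 0, 0, 0]
print(round(auroc(confidences, confirmed), 3))  # 7 of 9 pairs ranked correctly -> 0.778
```

An AUROC of 0.5 would mean the judge's confidence carries no ranking information, and 1.0 would mean it ranks every good rule above every bad one; the reported 0.75 sits between the two.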

RESULT

ScienceToStartup currently rates this 7.0/10 on the public viability pass. Our 5x5 generation strategy (five parallel candidates with up to five refinement attempts each) combined with continuous feedback loops enables systematic quality improvement. Code…
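The 5x5 strategy (five parallel candidates, up to five refinement attempts each) can be sketched as a small control loop. Everything here is an assumption made for illustration: the `generate` and `validate` callables stand in for an LLM call and a validation step, the thread pool is one possible way to run candidates in parallel, and stopping at the first passing attempt is a guess about behavior the abstract does not specify.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional, Tuple

Generate = Callable[[int, str], str]           # (candidate seed, feedback) -> rule
Validate = Callable[[str], Tuple[bool, str]]   # rule -> (passed, feedback)


def refine_candidate(seed: int, generate: Generate, validate: Validate,
                     max_attempts: int = 5) -> Optional[str]:
    """Regenerate one candidate with validator feedback until it passes or attempts run out."""
    feedback = ""
    for _ in range(max_attempts):
        rule = generate(seed, feedback)
        passed, feedback = validate(rule)
        if passed:
            return rule
    return None


def five_by_five(generate: Generate, validate: Validate,
                 n_candidates: int = 5, max_attempts: int = 5) -> List[str]:
    """Run n_candidates refinement loops in parallel and keep the rules that passed."""
    with ThreadPoolExecutor(max_workers=n_candidates) as pool:
        results = pool.map(
            lambda seed: refine_candidate(seed, generate, validate, max_attempts),
            range(n_candidates),
        )
        return [rule for rule in results if rule is not None]


# Deterministic stand-ins so the sketch runs without an LLM (hypothetical).
def stub_generate(seed: int, feedback: str) -> str:
    # One "+" per character of feedback, i.e. per earlier failed attempt.
    return f"rule-{seed}" + "+" * len(feedback)


def stub_validate(rule: str) -> Tuple[bool, str]:
    revisions = rule.count("+")
    seed = int(rule.split("-")[1].rstrip("+"))
    # Pretend harder candidates (higher seed) need more revisions to pass.
    return revisions >= seed * 2, "x" * (revisions + 1)


print(five_by_five(stub_generate, stub_validate))
# ['rule-0', 'rule-1++', 'rule-2++++']
```

In this toy run, candidates 3 and 4 exhaust their five attempts and are dropped, which mirrors the idea that only candidates surviving validation move forward.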

WHY NOW

AI for Cybersecurity moved forward this cycle; last verified April 2026. Public score 7.0/10. Production flags indicate code availability.

Opportunity summary

Score: 7.0

Pain: RuleForge automates the generation of security detection rules from vulnerability descriptions using LLMs, significantly reducing false positives and improving detection capacity.

Evidence: 0 refs | 0 sources | 33% coverage

Blocker: no shell-level blocker reported

Competitive landscape

Segment: AI for Cybersecurity

Adoption evidence: No public code link in the paper record yet

Commercial read: 7.0/10 public viability

Direct: not classified

Adjacent: not classified

Substitute: not classified

Unknown: not classified

{ "@context": "https://schema.org", "@graph": [ { "@type": "WebPage", "@id": "https://sciencetostartup.com/paper/ruleforge-automated-generation-and-validation-for-web-vulnerability-detection-at-scale#webpage", "url": "https://sciencetostartup.com/paper/ruleforge-automated-generation-and-validation-for-web-vulnerability-detection-at-scale", "name": "RuleForge: Automated Generation and Validation for Web Vulnerability Detection at Scale", "description": "RuleForge automates the generation of security detection rules from vulnerability descriptions using LLMs, significantly reducing false positives and improving detection capacity.", "isPartOf": { "@id": "https://sciencetostartup.com/#website" } }, { "@type": "ScholarlyArticle", "@id": "https://sciencetostartup.com/paper/ruleforge-automated-generation-and-validation-for-web-vulnerability-detection-at-scale#scholarlyArticle", "headline": "RuleForge: Automated Generation and Validation for Web Vulnerability Detection at Scale", "description": "RuleForge automates the generation of security detection rules from vulnerability descriptions using LLMs, significantly reducing false positives and improving detection capacity.", "url": "https://sciencetostartup.com/paper/ruleforge-automated-generation-and-validation-for-web-vulnerability-detection-at-scale", "sameAs": "https://arxiv.org/abs/2604.01977", "identifier": { "@type": "PropertyValue", "propertyID": "arXiv", "value": "2604.01977" }, "isAccessibleForFree": true, "isPartOf": { "@id": "https://sciencetostartup.com/#website" }, "datePublished": "2026-04-02T12:39:26.000Z", "author": [ { "@type": "Person", "name": "Ayush Garg" }, { "@type": "Person", "name": "Sophia Hager" }, { "@type": "Person", "name": "Jacob Montiel" }, { "@type": "Person", "name": "Aditya Tiwari" }, { "@type": "Person", "name": "Michael Gentile" }, { "@type": "Person", "name": "Zach Reavis" }, { "@type": "Person", "name": "David Magnotti" }, { "@type": "Person", "name": "Wayne Fullen" } ], 
"additionalProperty": [ { "@type": "PropertyValue", "propertyID": "viabilityScore", "value": 7 }, { "@type": "PropertyValue", "propertyID": "researchDomain", "value": "AI for Cybersecurity" }, { "@type": "PropertyValue", "propertyID": "commercialReadiness", "value": "code" } ] }, { "@type": "BreadcrumbList", "itemListElement": [ { "@type": "ListItem", "position": 1, "name": "Home", "item": "https://sciencetostartup.com" }, { "@type": "ListItem", "position": 2, "name": "AI for Cybersecurity", "item": "https://sciencetostartup.com/topics" }, { "@type": "ListItem", "position": 3, "name": "RuleForge: Automated Generation and Validation for Web Vulne", "item": "https://sciencetostartup.com/paper/ruleforge-automated-generation-and-validation-for-web-vulnerability-detection-at-scale" } ] } ] }
