ARXIV:2604.25737 · CODE EDITING AGENTS · SUBMITTED 29 APR · 02:43 UTC · FRESHNESS STALE

Source: PDF linked (Verified) · PaperPack: citation fields available (Verified) · Proof: unverified proof status (Partial)

SAFEdit: Does Multi-Agent Decomposition Resolve the Reliability Challenges of Instructed Code Editing?

Noam Tarshish · Nofar Selouk · Daniel Hodisan · Bar Ezra Gafniel · Yuval Elovici · Asaf Shabtai · Eliya Nachmani

A multi-agent framework that decomposes instructed code editing into specialized roles to improve reliability and reduce unintended changes.

Ship in 2-4 weeks · Score 7.0 · Evidence unverified

Opportunity summary

Pain: A multi-agent framework that decomposes instructed code editing into specialized roles to improve reliability and reduce unintended changes.

Evidence: 0 refs | 3 sources | 50% coverage

Blocker: Evidence unverified


PROBLEM

A multi-agent framework that decomposes instructed code editing into specialized roles to improve reliability and reduce unintended changes. On the EditBench benchmark, 39 of 40 evaluated models obtain a task success rate (TSR) below…

METHOD

Full abstract

Instructed code editing is a significant challenge for large language models (LLMs). On the EditBench benchmark, 39 of 40 evaluated models obtain a task success rate (TSR) below 60 percent, highlighting a gap between general code generation and the ability to perform instruction-driven editing under executable test constraints. To address this, we propose SAFEdit, a multi-agent framework for instructed code editing that decomposes the editing process into specialized roles to improve reliability and reduce unintended code changes. A Planner Agent produces an explicit, visibility-aware edit plan, an Editor Agent applies minimal, literal code modifications, and a Verifier Agent executes real test runs. When tests fail, SAFEdit uses a Failure Abstraction Layer (FAL) to transform raw test logs into structured diagnostic feedback, which is fed back to the Editor to support iterative refinement. We compare SAFEdit against both prior single-model results reported for EditBench and an implemented ReAct single-agent baseline under the same evaluation conditions. We used EditBench to evaluate SAFEdit on 445 code editing instances in five languages (English, Polish, Spanish, Chinese, and Russian) under varying spatial context variants. SAFEdit achieved 68.6 percent TSR, outperforming the single-model baseline by 3.8 percentage points and the ReAct single-agent baseline by 8.6 percentage points. The iterative refinement loop was found to contribute 17.4 percentage points to SAFEdit's overall success rate. SAFEdit's automated error analysis further indicates a reduction in instruction-level hallucinations compared to single-agent approaches, providing an additional framework component for interpreting failures beyond pass or fail outcomes.
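The control flow described in the abstract (plan once, edit, verify with real tests, abstract failures, retry) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the names `run_edit_loop`, `abstract_failures`, `Diagnostic`, and `EditResult`, the `max_rounds` limit, and the pytest-style `FAILED <test> - <Error>: <message>` log format are all assumptions made for the example. The three agents are passed in as plain callables, so any LLM client or test runner could stand behind them.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Diagnostic:
    """One structured failure record (hypothetical FAL output format)."""
    test_name: str
    error_type: str
    message: str


@dataclass
class EditResult:
    code: str
    passed: bool
    rounds: int


# Assumed log format: pytest-style summary lines such as
# "FAILED tests/test_x.py::test_y - AssertionError: assert 1 == 2".
_FAILED_LINE = re.compile(r"^FAILED\s+(\S+)\s+-\s+(\w+):\s*(.*)$", re.MULTILINE)


def abstract_failures(raw_log: str) -> List[Diagnostic]:
    """Stand-in for the Failure Abstraction Layer: raw log -> structured records."""
    return [Diagnostic(*m.groups()) for m in _FAILED_LINE.finditer(raw_log)]


def run_edit_loop(
    source: str,
    instruction: str,
    planner: Callable[[str, str], str],
    editor: Callable[[str, str, List[Diagnostic]], str],
    verifier: Callable[[str], Tuple[bool, str]],
    max_rounds: int = 3,  # assumed limit; the abstract does not state one
) -> EditResult:
    """Plan once, then alternate Editor and Verifier until tests pass."""
    plan = planner(source, instruction)
    feedback: List[Diagnostic] = []
    code = source
    for round_no in range(1, max_rounds + 1):
        # Assumption: the Editor revises the latest candidate, not the original.
        code = editor(code, plan, feedback)
        passed, raw_log = verifier(code)
        if passed:
            return EditResult(code, True, round_no)
        # Only structured diagnostics, not the raw log, go back to the Editor.
        feedback = abstract_failures(raw_log)
    return EditResult(code, False, max_rounds)
```

Keeping the Verifier's raw output away from the Editor is the design point this sketch tries to capture: the Editor only ever sees the plan and a short list of structured diagnostics.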

RESULT

ScienceToStartup currently rates this 7.0/10 on the public viability pass. To address this, we propose SAFEdit, a multi-agent framework for instructed code editing that decomposes the editing process into specialized roles to improve reliability…
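The margins quoted in the abstract imply the baseline scores, which the page does not list directly. The sketch below recovers them, assuming each margin is a simple difference in TSR percentage points. Treating the 17.4-point refinement contribution as the gap between the final TSR and the TSR without the loop is an interpretation, not something the abstract states. All variable names are illustrative.

```python
# Figures quoted in the abstract (percent and percentage points).
SAFEDIT_TSR = 68.6
MARGIN_OVER_SINGLE_MODEL = 3.8
MARGIN_OVER_REACT = 8.6
REFINEMENT_CONTRIBUTION = 17.4


def implied_score(tsr: float, margin: float) -> float:
    """Subtract a percentage-point margin and round to one decimal place."""
    return round(tsr - margin, 1)


single_model_tsr = implied_score(SAFEDIT_TSR, MARGIN_OVER_SINGLE_MODEL)
react_tsr = implied_score(SAFEDIT_TSR, MARGIN_OVER_REACT)
# Assumption: "contributes 17.4 points" means TSR drops by that much without the loop.
without_refinement_tsr = implied_score(SAFEDIT_TSR, REFINEMENT_CONTRIBUTION)

print(f"single-model baseline: {single_model_tsr:.1f}%")      # 64.8%
print(f"ReAct baseline:        {react_tsr:.1f}%")             # 60.0%
print(f"without refinement:    {without_refinement_tsr:.1f}%")  # 51.2%
```

Under that reading, the first-pass pipeline alone would score below both baselines, and the iterative loop accounts for the whole reported advantage.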

WHY NOW

Code Editing Agents moved forward this cycle; last verified April 2026. Public score 7.0/10. Production flags indicate code availability.


Opportunity summary

Score: 7.0

Pain: A multi-agent framework that decomposes instructed code editing into specialized roles to improve reliability and reduce unintended changes.

Evidence: 0 refs | 3 sources | 50% coverage

Blocker: no shell-level blocker reported


Competitive landscape

A multi-agent framework that decomposes instructed code editing into specialized roles to improve reliability and reduce unintended changes.

Segment: Code Editing Agents

Adoption evidence: No public code link in the paper record yet

Commercial read: 7.0/10 public viability

Direct: not classified

Adjacent: not classified

Substitute: not classified

Unknown: not classified


{ "@context": "https://schema.org", "@graph": [ { "@type": "WebPage", "@id": "https://sciencetostartup.com/paper/safedit-does-multi-agent-decomposition-resolve-the-reliability-challenges-of-instructed-code-editing#webpage", "url": "https://sciencetostartup.com/paper/safedit-does-multi-agent-decomposition-resolve-the-reliability-challenges-of-instructed-code-editing", "name": "SAFEdit: Does Multi-Agent Decomposition Resolve the Reliability Challenges of Instructed Code Editing?", "description": "A multi-agent framework that decomposes instructed code editing into specialized roles to improve reliability and reduce unintended changes.", "isPartOf": { "@id": "https://sciencetostartup.com/#website" } }, { "@type": "ScholarlyArticle", "@id": "https://sciencetostartup.com/paper/safedit-does-multi-agent-decomposition-resolve-the-reliability-challenges-of-instructed-code-editing#scholarlyArticle", "headline": "SAFEdit: Does Multi-Agent Decomposition Resolve the Reliability Challenges of Instructed Code Editing?", "description": "A multi-agent framework that decomposes instructed code editing into specialized roles to improve reliability and reduce unintended changes.", "url": "https://sciencetostartup.com/paper/safedit-does-multi-agent-decomposition-resolve-the-reliability-challenges-of-instructed-code-editing", "sameAs": "https://arxiv.org/abs/2604.25737", "identifier": { "@type": "PropertyValue", "propertyID": "arXiv", "value": "2604.25737" }, "isAccessibleForFree": true, "isPartOf": { "@id": "https://sciencetostartup.com/#website" }, "datePublished": "2026-04-28T15:04:46.000Z", "author": [ { "@type": "Person", "name": "Noam Tarshish" }, { "@type": "Person", "name": "Nofar Selouk" }, { "@type": "Person", "name": "Daniel Hodisan" }, { "@type": "Person", "name": "Bar Ezra Gafniel" }, { "@type": "Person", "name": "Yuval Elovici" }, { "@type": "Person", "name": "Asaf Shabtai" }, { "@type": "Person", "name": "Eliya Nachmani" } ], "additionalProperty": [ { "@type": 
"PropertyValue", "propertyID": "viabilityScore", "value": 7 }, { "@type": "PropertyValue", "propertyID": "researchDomain", "value": "Code Editing Agents" }, { "@type": "PropertyValue", "propertyID": "commercialReadiness", "value": "code" } ] }, { "@type": "BreadcrumbList", "itemListElement": [ { "@type": "ListItem", "position": 1, "name": "Home", "item": "https://sciencetostartup.com" }, { "@type": "ListItem", "position": 2, "name": "Code Editing Agents", "item": "https://sciencetostartup.com/topics" }, { "@type": "ListItem", "position": 3, "name": "SAFEdit: Does Multi-Agent Decomposition Resolve the Reliabil", "item": "https://sciencetostartup.com/paper/safedit-does-multi-agent-decomposition-resolve-the-reliability-challenges-of-instructed-code-editing" } ] } ] }

