ARXIV:2604.24473 · MEDICAL AI · SUBMITTED 28 APR · 15:18 UTC · FRESHNESS STALE

Verified Source: PDF linked · Verified PaperPack: citation fields available · Partial Proof: unverified proof status

Agentic clinical reasoning over longitudinal myeloma records: a retrospective evaluation against expert consensus

Johannes Moll · Jannik Lübberstedt · Christoph Nuernbergk · Jacob Stroh · Luisa Mertens · Anna Purcarea · +18 at arXiv

An agentic reasoning system for clinical decision support in multiple myeloma, outperforming existing methods on complex patient histories.

Blocked on Code · Score 5.0 · Evidence unverified

Opportunity summary

Pain: An agentic reasoning system for clinical decision support in multiple myeloma, outperforming existing methods on complex patient histories.

Evidence: 0 refs | 3 sources | 50% coverage

Blocker: Evidence unverified

PROBLEM

Myeloma treatment decisions depend on cumulative disease history distributed across dozens to hundreds of heterogeneous clinical documents. Whether LLM-based systems can synthesise this evidence at a level approaching expert agreement has not been established.

METHOD

Full abstract

Multiple myeloma is managed through sequential lines of therapy over years to decades, with each decision depending on cumulative disease history distributed across dozens to hundreds of heterogeneous clinical documents. Whether LLM-based systems can synthesise this evidence at a level approaching expert agreement has not been established. A retrospective evaluation was conducted on longitudinal clinical records of 811 myeloma patients treated at a tertiary centre (2001-2026), covering 44,962 documents and 1,334,677 laboratory values, with external validation on MIMIC-IV. An agentic reasoning system was compared against single-pass retrieval-augmented generation (RAG), iterative RAG, and full-context input on 469 patient-question pairs from 48 templates at three complexity levels. Reference labels came from double annotation by four oncologists with senior haematologist adjudication. Iterative RAG and full-context input converged on a shared ceiling (75.4% vs 75.8%, p = 1.00). The agentic system reached 79.6% concordance (95% CI 76.4-82.8), exceeding both baselines (+3.8 and +4.2 pp; p = 0.006 and 0.007). Gains rose with question complexity, reaching +9.4 pp on criteria-based synthesis (p = 0.032), and with record length, reaching +13.5 pp in the top decile (n = 10). The system error rate (12.2%) was comparable to expert disagreement (13.6%), but severity was inverted: 57.8% of system errors were clinically significant versus 18.8% of expert disagreements. Agentic reasoning was the only approach to exceed the shared ceiling, with gains concentrated on the most complex questions and longest records. The greater clinical consequence of residual system errors indicates that prospective evaluation in routine care is required before these findings translate into patient benefit.
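The headline comparisons in the abstract are simple arithmetic on the reported percentages. As a minimal sketch, the Python below recomputes the percentage-point gains and derives a combined rate of clinically significant errors. The dictionary and function names are illustrative and do not come from the paper, and the combined rate assumes the severity share applies to the same denominator as the error rate, which the abstract does not state.

```python
# Percentages copied from the abstract.
CONCORDANCE = {"iterative_rag": 75.4, "full_context": 75.8, "agentic": 79.6}
ERROR_RATE = {"system": 12.2, "expert": 13.6}          # errors / disagreements
SIGNIFICANT_SHARE = {"system": 57.8, "expert": 18.8}   # share rated clinically significant


def gain_pp(method: str, baseline: str) -> float:
    """Concordance gain of `method` over `baseline`, in percentage points."""
    return round(CONCORDANCE[method] - CONCORDANCE[baseline], 1)


def significant_error_rate(source: str) -> float:
    """Clinically significant errors as a percentage of all evaluated items.

    Assumption (not stated in the abstract): the severity share and the
    error rate are measured over the same set of items.
    """
    return round(ERROR_RATE[source] * SIGNIFICANT_SHARE[source] / 100, 2)


if __name__ == "__main__":
    for baseline in ("full_context", "iterative_rag"):
        print(f"agentic vs {baseline}: +{gain_pp('agentic', baseline)} pp")
    for source in ("system", "expert"):
        print(f"{source}: {significant_error_rate(source)}% significant")
```

Under the stated assumption, clinically significant system errors would amount to roughly 7.1% of evaluated items against roughly 2.6% for expert disagreements, which is one way to read the abstract's point that similar error rates hide an inverted severity profile.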

RESULT

ScienceToStartup currently rates this 5.0/10 on the public viability pass. The greater clinical consequence of residual system errors indicates that prospective evaluation in routine care is required before these findings translate into patient benefit.

WHY NOW

Medical AI moved forward this cycle; last verified April 2026. Public score 5.0/10.

Competitive landscape

Segment: Medical AI
Adoption evidence: No public code link in the paper record yet
Commercial read: 5.0/10 public viability
Direct: not classified
Adjacent: not classified
Substitute: not classified
Unknown: not classified

Authors: Johannes Moll, Jannik Lübberstedt, Christoph Nuernbergk, Jacob Stroh, Luisa Mertens, Anna Purcarea, Christopher Zirn, Zeineb Benchaaben, Fabian Drexel, Hartmut Häntze, Anirudh Narayanan, Friedrich Puttkammer, Andrei Zhukov, Jacqueline Lammert, Sebastian Ziegelmayer, Markus Graf, Marion Högner, Marcus Makowski, Florian Bassermann, Lisa C. Adams, Jiazhen Pan, Daniel Rueckert, Krischan Braitsch, Keno K. Bressem

Published: 2026-04-27T13:41:18.000Z

arXiv: https://arxiv.org/abs/2604.24473

Page: https://sciencetostartup.com/paper/agentic-clinical-reasoning-over-longitudinal-myeloma-records-a-retrospective-evaluation-against-expert-consensus

Research domain: Medical AI · Viability score: 5
