ARXIV:2605.12239 · AGENTS · SUBMITTED 13 MAY · 21:04 UTC · FRESHNESS STALE

Verified · Source: PDF linked | Verified · PaperPack: citation fields available | Partial · Proof: unverified proof status

Harness Engineering as Categorical Architecture

Bogdan Banu · arXiv

Formalizes LLM agent harness engineering using categorical architecture, providing theoretical guarantees for composition and property preservation across frameworks.

Blocked on Code · Score 3.0 · Evidence unverified

Opportunity summary

Pain Formalizes LLM agent harness engineering using categorical architecture, providing theoretical guarantees for composition and property preservation across frameworks.

Evidence 0 refs | 3 sources | 50% coverage

Blocker Evidence unverified


PROBLEM

Formalizes LLM agent harness engineering using categorical architecture, providing theoretical guarantees for composition and property preservation across frameworks. Yet harness design remains ad hoc, with no formal theory governing composition, preservation of properties under…

METHOD

Full abstract

The agent harness, the system layer comprising prompts, tools, memory, and orchestration logic that surrounds the model, has emerged as the central engineering abstraction for LLM-based agents. Yet harness design remains ad hoc, with no formal theory governing composition, preservation of properties under compilation, or systematic comparison across frameworks. We show that the categorical Architecture triple (G, Know, Phi) from the ArchAgents framework provides exactly this formalization. The four pillars of agent externalization (Memory, Skills, Protocols, Harness Engineering) map onto the triple's components: Memory as coalgebraic state, Skills as operad-composed objects, Protocols as syntactic wiring G, and the full Harness as the Architecture itself. Structural guarantees (integrity gates, quality-based escalation, supported convergence checks) are Know-level certificates whose preservation is structural replay: our compiler checks identity and verifier replay, not output-layer correctness or model behavior. We validate this correspondence with a reference implementation featuring compiler functors targeting Swarms, DeerFlow, Ralph, Scion, and LangGraph: the four configuration compilers preserve three named certificate types by identity or replay, and LangGraph preserves the same certificates through its shared per-stage execution path. The LangGraph compiler creates one node per stage using the same per-stage method as the native runtime, providing LangGraph-native observability without reimplementing harness logic. An end-to-end escalation experiment with real LLM agents confirms that the quality-based escalation control path is model-parametric in this two-model, one-task experiment. The result positions categorical architecture as the formal theory behind harness engineering.
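To make "preservation by identity or replay" concrete, here is a minimal Python sketch. It is not the paper's reference implementation, and no public code is linked in the record. Every name in it (`Architecture`, `Certificate`, `compile_config`, `check_preservation`, the stage names, and the reading of Phi as stage-to-handler pairs) is a hypothetical stand-in chosen for illustration. It assumes a linear wiring G and certificates whose verifiers inspect only stage structure, which mirrors the abstract's point that the check is structural and says nothing about model output.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Wiring = Tuple[str, ...]  # G: ordered stage names (assumed linear for simplicity)


@dataclass(frozen=True)
class Certificate:
    """A Know-level certificate: a name plus a verifier that can be replayed."""
    name: str
    verifier: Callable[[Wiring], bool]  # inspects structure only, never model output


@dataclass(frozen=True)
class Architecture:
    """Hypothetical stand-in for the (G, Know, Phi) triple."""
    g: Wiring
    know: Tuple[Certificate, ...]
    phi: Tuple[Tuple[str, str], ...]  # assumption: (stage, handler name) pairs


def before(first: str, second: str) -> Callable[[Wiring], bool]:
    """Build a verifier that requires `first` to appear ahead of `second`."""
    def check(w: Wiring) -> bool:
        return first in w and second in w and w.index(first) < w.index(second)
    return check


def compile_config(arch: Architecture, target: str) -> Dict[str, object]:
    """Toy configuration compiler: copies the wiring and carries certificates over."""
    return {"target": target, "stages": list(arch.g), "certificates": list(arch.know)}


def check_preservation(arch: Architecture, compiled: Dict[str, object]) -> Dict[str, Tuple[bool, bool]]:
    """For each certificate, report (carried by identity, verifier replay passes)."""
    stages = tuple(compiled["stages"])
    carried = compiled["certificates"]
    return {
        cert.name: (any(c is cert for c in carried), cert.verifier(stages))
        for cert in arch.know
    }


arch = Architecture(
    g=("integrity_gate", "act", "quality_check", "escalate", "convergence_check"),
    know=(
        Certificate("integrity_gate", before("integrity_gate", "act")),
        Certificate("quality_escalation", before("quality_check", "escalate")),
        Certificate("convergence_check", before("act", "convergence_check")),
    ),
    phi=(("act", "worker_agent"), ("escalate", "stronger_agent")),
)

good = compile_config(arch, "Swarms")
report = check_preservation(arch, good)

# A faulty compile that moves the gate after the action still carries the
# certificate object, but replaying its verifier on the new wiring fails.
bad = dict(good, stages=["act", "integrity_gate", "quality_check", "escalate", "convergence_check"])
bad_report = check_preservation(arch, bad)
```

In this sketch a certificate counts as preserved only when the compiled artifact carries the same certificate object and its verifier passes again on the compiled wiring. The faulty compile shows why replay matters: identity alone would not notice that the stage order changed.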

RESULT

ScienceToStartup currently rates this 3.0/10 on the public viability pass. We show that the categorical Architecture triple (G, Know, Phi) from the ArchAgents framework provides exactly this formalization.

WHY NOW

Agents moved forward this cycle; last verified May 2026. Public score 3.0/10.


Opportunity summary

Score 3.0

Pain Formalizes LLM agent harness engineering using categorical architecture, providing theoretical guarantees for composition and property preservation across frameworks.

Evidence 0 refs | 3 sources | 50% coverage

Blocker no shell-level blocker reported

Analysis summary

Formalizes LLM agent harness engineering using categorical architecture, providing theoretical guarantees for composition and property preservation across frameworks.


Competitive landscape

Formalizes LLM agent harness engineering using categorical architecture, providing theoretical guarantees for composition and property preservation across frameworks.

Segment

Agents

Adoption evidence

No public code link in the paper record yet

Commercial read

3.0/10 public viability

Direct

not classified

Adjacent

not classified

Substitute

not classified

Unknown

not classified


Published 2026-05-12 15:09:46 UTC · arXiv: https://arxiv.org/abs/2605.12239 · Free to access


