ARXIV:2604.25369 · REINFORCEMENT LEARNING · SUBMITTED 29 APR · 02:31 UTC · FRESHNESS STALE

Verified Source: PDF linked · Verified PaperPack: citation fields available · Partial Proof: unverified proof status

Multi-action Tangled Program Graphs for Multi-task Reinforcement Learning with Continuous Control

Quentin Vacher · Nicolas Beuve · Mickaël Dardaillon · Karol Desnos · arXiv

A genetic programming algorithm for multi-task reinforcement learning in continuous control environments with interpretable decision flows.

Ship in 2-4 weeks · Score 4.0 · Evidence unverified

Opportunity summary

Pain A genetic programming algorithm for multi-task reinforcement learning in continuous control environments with interpretable decision flows.

Evidence 0 refs | 3 sources | 50% coverage

Blocker Evidence unverified

PROBLEM

A genetic programming algorithm for multi-task reinforcement learning in continuous control environments with interpretable decision flows.

METHOD

Full abstract

Over the past few decades, machine learning has been widely used to learn complex tasks. Reinforcement Learning (RL), inspired by human behavior, is a prime example, as it involves developing specific behaviors for specific tasks. To further challenge algorithms, Multi-Task RL (MTRL) environments have been introduced, requiring a single model to learn multiple behaviors. The Tangled Program Graph (TPG) algorithm is a Genetic Programming (GP) algorithm designed for discrete MTRL environments. Recently, the MAPLE algorithm was proposed as another GP algorithm that achieves strong results in single-task continuous RL environments. A variation of TPG named Multi-Action TPG (MATPG) was proposed alongside MAPLE; it aggregates MAPLE agents and creates a control flow to activate them. Initially tested only on single-task RL environments, MATPG achieved results similar to MAPLE. In this work, we present a new benchmark based on the MuJoCo Half Cheetah from Gymnasium. This benchmark features five distinct obstacles that are randomly positioned in front of the agent, each of which demands a unique behavior. It serves as a use case for MATPG, to prove its ability as a GP solution for continuous MTRL environments. Our experiments demonstrate its superiority in this multi-task use case when combined with lexicase selection. Furthermore, we examine the interpretability of the evolved graph, revealing that the decision flow of the model is fully interpretable.
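The abstract credits lexicase selection for the multi-task result but does not say how it is configured. As a rough illustration only, the sketch below shows the standard form of lexicase selection applied to per-task scores: tasks are visited in a random order, and at each task only the agents with the best score on that task survive. The function name `lexicase_select`, the `epsilon` tolerance, and the score table are assumptions made for this example, not details taken from the paper.

```python
import random
from typing import Optional, Sequence


def lexicase_select(
    scores: Sequence[Sequence[float]],
    rng: Optional[random.Random] = None,
    epsilon: float = 0.0,
) -> int:
    """Return the index of one selected parent.

    scores[i][t] is the score of agent i on task t (higher is better).
    epsilon is a hypothetical tolerance: agents within epsilon of the
    best score on a task are kept (0.0 gives plain lexicase selection).
    """
    rng = rng or random.Random()
    n_tasks = len(scores[0])

    # Visit the tasks in a fresh random order for every selection event.
    task_order = list(range(n_tasks))
    rng.shuffle(task_order)

    candidates = list(range(len(scores)))
    for t in task_order:
        best = max(scores[i][t] for i in candidates)
        # Keep only the agents that are (near-)best on this task.
        candidates = [i for i in candidates if scores[i][t] >= best - epsilon]
        if len(candidates) == 1:
            break

    # Break any remaining tie uniformly at random.
    return rng.choice(candidates)


if __name__ == "__main__":
    # Illustrative scores for 4 agents on 5 tasks (made-up numbers).
    table = [
        [1, 1, 1, 1, 1],  # mediocre on every task
        [5, 0, 0, 0, 0],  # specialist on task 0
        [0, 5, 5, 5, 5],  # strong on tasks 1-4
        [0, 0, 0, 0, 0],  # dominated everywhere
    ]
    rng = random.Random(0)
    parents = [lexicase_select(table, rng) for _ in range(10)]
    print(parents)
```

In this made-up table only agents 1 and 2 can ever be selected, because whichever task is visited first has a single best agent. The specialist on task 0 survives even though its average score is low, which is the property that makes lexicase selection attractive when each task demands a distinct behavior.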

RESULT

ScienceToStartup currently rates this 4.0/10 on the public viability pass. Recently, the MAPLE algorithm has been proposed, as another GP algorithm that achieves high results in single task continuous RL environments. Code availability is…

WHY NOW

Reinforcement Learning moved forward this cycle; last verified April 2026. Public score 4.0/10. Production flags indicate code availability.

Competitive landscape

A genetic programming algorithm for multi-task reinforcement learning in continuous control environments with interpretable decision flows.

Segment: Reinforcement Learning

Adoption evidence: No public code link in the paper record yet

Commercial read: 4.0/10 public viability

Direct: not classified

Adjacent: not classified

Substitute: not classified

Unknown: not classified

Source: https://arxiv.org/abs/2604.25369 · Published 2026-04-28T08:34:52Z · Canonical page: https://sciencetostartup.com/paper/multi-action-tangled-program-graphs-for-multi-task-reinforcement-learning-with-continuous-control
