ARXIV:2606.03965 · AGENTIC LLM REASONING · SUBMITTED 03 JUN · 20:32 UTC · FRESHNESS FRESH

Verified Source: PDF linked | Verified PaperPack: citation fields available | Partial Proof: unverified proof status

Agentic Chain-of-Thought Steering for Efficient and Controllable LLM Reasoning

Yu Xia · Zhouhang Xie · Xin Xu · Byungkyu Kang · Prarit Lamba · Xiang Gao · Julian McAuley

ACTS enables efficient and controllable LLM reasoning by formulating steering as a Markov decision process, matching full-thinking performance with token savings.

Ship in 2-4 weeks | Score 8.0 | Evidence unverified

Opportunity summary

Pain ACTS enables efficient and controllable LLM reasoning by formulating steering as a Markov decision process, matching full-thinking performance with token savings.

Evidence 0 refs | 4 sources | 83% coverage

Blocker Evidence unverified

PROBLEM

ACTS enables efficient and controllable LLM reasoning by formulating steering as a Markov decision process, matching full-thinking performance with token savings. Existing efficient reasoning methods control thinking length by shortening, early-stopping, or compressing traces,…

METHOD

Full abstract

Large language models improve final-answer accuracy through extended chain-of-thought reasoning, but often spend tokens inefficiently and offer little inference-time control. Existing efficient reasoning methods control thinking length by shortening, early-stopping, or compressing traces, leaving how the model thinks implicit. In this paper, we propose Agentic Chain-of-Thought Steering (ACTS), which formulates reasoning steering as a Markov decision process where a controller agent adaptively steers a frozen reasoner during inference. At each step, the controller observes the reasoning trace and remaining thinking budget, then issues a steering action consisting of a reasoning strategy and a steering phrase that initiates the next reasoner step. This enables budget-aware strategy control for efficient reasoning while preserving the reasoner's generation continuity. We initialize the controller agent from our constructed synthetic steering trajectories with multi-budget augmentation, and further optimize it via reinforcement learning with budget-conditioned reward shaping. Experiments across multiple benchmarks show that ACTS matches full-thinking performance with substantial token savings, and enables controllable accuracy-efficiency trade-offs across different reasoners and tasks. The code is available at https://github.com/Andree-9/ACTS.
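The controller loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the strategy labels (`decompose`, `verify`, `conclude`), the steering phrases, the whitespace token counter, and the toy controller and reasoner are all hypothetical stand-ins. In ACTS the controller is a learned agent and the reasoner is a frozen LLM; see the linked repository for the actual code.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class SteeringAction:
    strategy: str  # hypothetical strategy label chosen by the controller
    phrase: str    # steering phrase that opens the next reasoner step


@dataclass
class State:
    trace: List[str]        # reasoning steps produced so far
    remaining_budget: int   # thinking tokens still available


Controller = Callable[[State], SteeringAction]
# (question, trace, steering phrase, max tokens) -> text of the next step
Reasoner = Callable[[str, List[str], str, int], str]


def count_tokens(text: str) -> int:
    """Assumption: whitespace splitting stands in for a real tokenizer."""
    return len(text.split())


def steer(question: str, controller: Controller, reasoner: Reasoner,
          budget: int, max_step_tokens: int = 16) -> Tuple[State, List[SteeringAction]]:
    """Run one steering episode: observe the state, act, let the reasoner continue."""
    state = State(trace=[], remaining_budget=budget)
    actions: List[SteeringAction] = []
    while state.remaining_budget > 0:
        action = controller(state)  # sees the trace and the remaining budget
        limit = min(max_step_tokens, state.remaining_budget)
        step = reasoner(question, state.trace, action.phrase, limit)
        state.trace.append(step)
        # Charge at least one token per step so the loop always terminates.
        state.remaining_budget -= max(1, count_tokens(step))
        actions.append(action)
        if action.strategy == "conclude":
            break
    return state, actions


def toy_controller(state: State) -> SteeringAction:
    """Hand-written stand-in for the learned controller (illustrative only)."""
    if state.remaining_budget <= 8 or len(state.trace) >= 2:
        return SteeringAction("conclude", "Therefore, the final answer is")
    if not state.trace:
        return SteeringAction("decompose", "Let me break this down.")
    return SteeringAction("verify", "Let me double-check.")


def toy_reasoner(question: str, trace: List[str], phrase: str, max_tokens: int) -> str:
    """Stand-in for a frozen reasoner: continues from the phrase, truncated to the limit."""
    words = f"{phrase} step about {question}".split()
    return " ".join(words[:max_tokens])
```

Calling `steer("2+2", toy_controller, toy_reasoner, budget=40)` walks through a decompose, verify, conclude sequence, while a very small budget makes the toy controller conclude immediately. That is the kind of budget-aware behavior the abstract attributes to the trained controller, here hard-coded rather than learned.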

RESULT

ScienceToStartup currently rates this 8.0/10 on the public viability pass. Large language models improve final-answer accuracy through extended chain-of-thought reasoning, but often spend tokens inefficiently and offer little inference-time control. A public repository is…

WHY NOW

Agentic LLM Reasoning moved forward this cycle; last verified June 2026. Public score 8.0/10. Implementation evidence is present through a linked repository.

Opportunity summary

Score 8.0

Pain ACTS enables efficient and controllable LLM reasoning by formulating steering as a Markov decision process, matching full-thinking performance with token savings.

Evidence 0 refs | 4 sources | 83% coverage

Blocker no shell-level blocker reported

Analysis summary

ACTS enables efficient and controllable LLM reasoning by formulating steering as a Markov decision process, matching full-thinking performance with token savings.

Competitive landscape

ACTS enables efficient and controllable LLM reasoning by formulating steering as a Markov decision process, matching full-thinking performance with token savings.

Segment

Agentic LLM Reasoning

Adoption evidence

Public code linked for build inspection

Commercial read

8.0/10 public viability

Direct

not classified

Adjacent

not classified

Substitute

not classified

Unknown

not classified

{ "@context": "https://schema.org", "@graph": [ { "@type": "WebPage", "@id": "https://sciencetostartup.com/paper/agentic-chain-of-thought-steering-for-efficient-and-controllable-llm-reasoning#webpage", "url": "https://sciencetostartup.com/paper/agentic-chain-of-thought-steering-for-efficient-and-controllable-llm-reasoning", "name": "Agentic Chain-of-Thought Steering for Efficient and Controllable LLM Reasoning", "description": "ACTS enables efficient and controllable LLM reasoning by formulating steering as a Markov decision process, matching full-thinking performance with token savings.", "isPartOf": { "@id": "https://sciencetostartup.com/#website" } }, { "@type": "ScholarlyArticle", "@id": "https://sciencetostartup.com/paper/agentic-chain-of-thought-steering-for-efficient-and-controllable-llm-reasoning#scholarlyArticle", "headline": "Agentic Chain-of-Thought Steering for Efficient and Controllable LLM Reasoning", "description": "ACTS enables efficient and controllable LLM reasoning by formulating steering as a Markov decision process, matching full-thinking performance with token savings.", "url": "https://sciencetostartup.com/paper/agentic-chain-of-thought-steering-for-efficient-and-controllable-llm-reasoning", "sameAs": "https://arxiv.org/abs/2606.03965", "identifier": { "@type": "PropertyValue", "propertyID": "arXiv", "value": "2606.03965" }, "isAccessibleForFree": true, "isPartOf": { "@id": "https://sciencetostartup.com/#website" }, "datePublished": "2026-06-02T17:51:30.000Z", "author": [ { "@type": "Person", "name": "Yu Xia" }, { "@type": "Person", "name": "Zhouhang Xie" }, { "@type": "Person", "name": "Xin Xu" }, { "@type": "Person", "name": "Byungkyu Kang" }, { "@type": "Person", "name": "Prarit Lamba" }, { "@type": "Person", "name": "Xiang Gao" }, { "@type": "Person", "name": "Julian McAuley" } ], "codeRepository": "https://github.com/Andree-9/ACTS", "additionalProperty": [ { "@type": "PropertyValue", "propertyID": "viabilityScore", "value": 8 }, { 
"@type": "PropertyValue", "propertyID": "researchDomain", "value": "Agentic LLM Reasoning" }, { "@type": "PropertyValue", "propertyID": "commercialReadiness", "value": "code, repo url" } ] }, { "@type": "SoftwareSourceCode", "@id": "https://sciencetostartup.com/paper/agentic-chain-of-thought-steering-for-efficient-and-controllable-llm-reasoning#software", "name": "Agentic Chain-of-Thought Steering for Efficient and Controllable LLM Reasoning - Source Code", "description": "ACTS enables efficient and controllable LLM reasoning by formulating steering as a Markov decision process, matching full-thinking performance with token savings.", "codeRepository": "https://github.com/Andree-9/ACTS", "url": "https://github.com/Andree-9/ACTS" }, { "@type": "BreadcrumbList", "itemListElement": [ { "@type": "ListItem", "position": 1, "name": "Home", "item": "https://sciencetostartup.com" }, { "@type": "ListItem", "position": 2, "name": "Agentic LLM Reasoning", "item": "https://sciencetostartup.com/topics" }, { "@type": "ListItem", "position": 3, "name": "Agentic Chain-of-Thought Steering for Efficient and Controll", "item": "https://sciencetostartup.com/paper/agentic-chain-of-thought-steering-for-efficient-and-controllable-llm-reasoning" } ] } ] }
