172 papers - avg viability 5.6
Research on LLM reasoning is targeting both the efficiency and the accuracy of complex reasoning tasks. Techniques such as PathCal and SELFDOUBT refine the reasoning process itself by managing uncertainty and optimizing decision-making paths. Other methods, like CIKA and AdapTime, use causal intervention and adaptive strategies to improve mathematical and temporal reasoning. For builders, the appeal is practical: several of these approaches are training-free, work at test time, or target compact models, so they can be integrated into applications that need robust reasoning without extensive computational resources. As LLMs handle more intricate reasoning problems, these methods become usable across a wider range of domains.
A training-free decoding controller that calibrates LLM reasoning paths by distinguishing reflection marker types to improve efficiency and accuracy.
RefineRL enhances LLMs for competitive programming by enabling self-refinement through a skeptical agent and reinforcement learning, outperforming larger models.
A hint-assisted reasoning framework that uses cooperative small language models to improve mathematical problem-solving.
A method that controls and improves step-wise verification in LLMs by selectively steering latent states, giving efficient and accurate validation of reasoning steps.
A reinforcement learning framework that enables compact open LLMs to perform multi-hop knowledge graph reasoning in a single inference step, outperforming larger models.
A framework for LLMs to learn from external rewards during inference by using retrieved instances and pseudo-labels for iterative refinement, significantly improving performance on reasoning and knowledge-intensive tasks.
AdapTime enhances LLMs' temporal reasoning by dynamically executing adaptive reasoning steps, improving accuracy without external tools.
A framework that uses LLMs as simulators to discover causal relationships between concepts and mathematical problem-solving, significantly improving reasoning capabilities.
Cooperative policy optimization for LLMs that shifts training from competition to team cooperation, improving reasoning accuracy and solution diversity.
A test-time performance enhancement tool for recurrent neural networks that does not require explicit energy functions.