CoDe-R: Refining Decompiler Output with LLMs via Rationale Guidance and Adaptive Inference Build Now
CoDe-R refines decompiler output with LLMs using rationale guidance and adaptive inference, achieving state-of-the-art re-executability for lightweight models.
GitHub stars n/a Velocity flat History 1 snapshot Decompilation Apr 14 Pending High viability
KnowRL: Boosting LLM Reasoning via Reinforcement Learning with Minimal-Sufficient Knowledge Guidance Build Now
Boosting LLM reasoning performance through reinforcement learning with minimal-sufficient knowledge guidance, achieving state-of-the-art results at scale.
GitHub stars n/a Velocity flat History 1 snapshot LLM Reasoning with Knowledge Guidance Apr 14 Pending High viability
Domain-Specific Latent Representations Improve the Fidelity of Diffusion-Based Medical Image Super-Resolution Build Now
Domain-specific autoencoders significantly improve medical image super-resolution fidelity in diffusion models, outperforming generic VAEs.
GitHub stars n/a Velocity flat History pending Medical Image Super-Resolution Apr 14 Pending High viability
Beyond Output Correctness: Benchmarking and Evaluating Large Language Model Reasoning in Coding Tasks Build Now
A benchmark and evaluation framework for assessing the reasoning quality of large language models in coding tasks, offering improved accuracy and insights.
GitHub stars n/a Velocity flat History pending LLM Evaluation Apr 14 Pending High viability
MolMem: Memory-Augmented Agentic Reinforcement Learning for Sample-Efficient Molecular Optimization Build Now
A memory-augmented reinforcement learning agent for sample-efficient molecular optimization in drug discovery.
GitHub stars n/a Velocity flat History pending Drug Discovery AI Apr 14 Pending High viability
CLASP: Class-Adaptive Layer Fusion and Dual-Stage Pruning for Multimodal Large Language Models Build Now
A plug-and-play framework for reducing visual tokens in multimodal LLMs through class-adaptive layer fusion and dual-stage pruning, outperforming existing methods.
GitHub stars n/a Velocity flat History pending Multimodal LLMs Apr 14 Pending High viability
Learning Chain Of Thoughts Prompts for Predicting Entities, Relations, and even Literals on Knowledge Graphs Build Now
A prompt learning system that leverages LLMs to improve knowledge graph reasoning and predict missing information with high confidence.
GitHub stars n/a Velocity flat History pending Knowledge Graph Reasoning Apr 14 Pending High viability
Filtered Reasoning Score: Evaluating Reasoning Quality on a Model's Most-Confident Traces Build Now
A novel evaluation metric that assesses the quality of LLM reasoning by filtering for high-confidence traces, revealing deeper model capabilities.
GitHub stars n/a Velocity flat History pending LLM Evaluation Apr 13 Pending High viability
IDEA: An Interpretable and Editable Decision-Making Framework for LLMs via Verbal-to-Numeric Calibration Build Now
A framework to extract LLM decision knowledge into an interpretable model, enabling calibrated probabilities and direct parameter editing.
GitHub stars n/a Velocity flat History pending Interpretable LLM Decision Making Apr 14 Pending High viability
SCRIPT: A Subcharacter Compositional Representation Injection Module for Korean Pre-Trained Language Models Build Now
A module that injects subcharacter compositional knowledge into Korean language models to enhance their understanding and generation capabilities.
GitHub stars n/a Velocity flat History pending LLM Adaptation Apr 14 Pending High viability
GF-Score: Certified Class-Conditional Robustness Evaluation with Fairness Guarantees Build Now
GF-Score provides certified class-conditional robustness evaluation with fairness guarantees, decomposing aggregate scores into per-class profiles and eliminating the need for adversarial attacks.
GitHub stars n/a Velocity flat History pending AI Robustness Evaluation Apr 14 Pending High viability
LLM-Based Automated Diagnosis Of Integration Test Failures At Google Build Now
An LLM-powered tool integrated into Google's code review system that diagnoses integration test failures with 90% accuracy, deployed across thousands of tests.
GitHub stars n/a Velocity flat History pending LLM Agents Apr 13 Code High viability
SEATrack: Simple, Efficient, and Adaptive Multimodal Tracker Build Now
SEATrack is a simple, efficient, and adaptive multimodal tracker that uses AMG-LoRA and HMoE to improve cross-modal alignment and global relation modeling.
GitHub stars n/a Velocity flat History pending Multimodal Tracking Apr 14 Pending High viability
Identity as Attractor: Geometric Evidence for Persistent Agent Architecture in LLM Activation Space Build Now
Demonstrates that agent identity creates attractor-like geometry in LLM activation space, enabling persistent cognitive cores.
GitHub stars n/a Velocity flat History pending Agents Apr 13 Pending High viability
MVAdapt: Zero-Shot Multi-Vehicle Adaptation for End-to-End Autonomous Driving Build Now
Adapt autonomous driving models to different vehicle dynamics with a physics-conditioned framework for improved transferability.
GitHub stars n/a Velocity flat History pending Autonomous Driving Apr 13 Pending High viability
Topology-Aware Reasoning over Incomplete Knowledge Graph with Graph-Based Soft Prompting Build Now
A graph-based soft prompting framework for LLMs to perform multi-hop knowledge graph question answering by reasoning over subgraphs, reducing sensitivity to KG incompleteness.
GitHub stars n/a Velocity flat History pending Knowledge Graph QA Apr 14 Pending High viability
KG-Reasoner: A Reinforced Model for End-to-End Multi-Hop Knowledge Graph Reasoning Build Now
KG-Reasoner is an end-to-end framework using Reinforcement Learning to enable LLMs to perform precise multi-hop reasoning over Knowledge Graphs.
GitHub stars n/a Velocity flat History pending Knowledge Graph Reasoning Apr 14 Pending High viability
Modality-Native Routing in Agent-to-Agent Networks: A Multimodal A2A Protocol Extension Build Now
MMA2A extends Agent-to-Agent networks with modality-native routing, significantly improving cross-modal reasoning accuracy by preserving richer context for downstream agents.
GitHub stars n/a Velocity flat History pending Multi-Agent Systems Apr 14 Pending High viability
Spatial Atlas: Compute-Grounded Reasoning for Spatial-Aware Research Agent Benchmarks Build Now
A novel agent design paradigm that grounds reasoning in deterministic computation before querying LLMs for spatial-aware benchmarks, achieving competitive accuracy.
GitHub stars n/a Velocity flat History pending Agents Apr 13 Pending High viability
DocSeeker: Structured Visual Reasoning with Evidence Grounding for Long Document Understanding Build Now
DocSeeker is a multimodal LLM framework for understanding long documents by structuring analysis, localization, and reasoning with evidence grounding.
GitHub stars n/a Velocity flat History pending Agents Apr 14 Code High viability
X-VC: Zero-shot Streaming Voice Conversion in Codec Space Build Now
X-VC is a zero-shot, low-latency voice conversion tool that transforms voice characteristics without pre-training, ideal for real-time applications.
GitHub stars n/a Velocity flat History 1 snapshot Audio Technology Apr 14 Code High viability
LLM-HYPER: Generative CTR Modeling for Cold-Start Ad Personalization via LLM-Based Hypernetworks Build Now
Optimize cold-start ad personalization using generative CTR models powered by LLM-based hypernetworks.
GitHub stars n/a Velocity flat History 1 snapshot AdTech Apr 13 Code High viability
Parallax: Why AI Agents That Think Must Never Act Build Now
A novel security paradigm for AI agents that separates reasoning from execution to prevent malicious actions, with a highly effective open-source implementation.
GitHub stars n/a Velocity flat History pending AI Agent Security Apr 14 Pending High viability
HintMR: Eliciting Stronger Mathematical Reasoning in Small Language Models Build Now
A hint-assisted reasoning framework that uses two small language models to improve mathematical problem-solving capabilities.
GitHub stars n/a Velocity flat History pending LLM Reasoning Apr 14 Code High viability
FRTSearch: Unified Detection and Parameter Inference of Fast Radio Transients using Instance Segmentation Build Now
An end-to-end AI framework that unifies the detection and physical characterization of Fast Radio Transients, significantly reducing false positives and increasing speed.
GitHub stars n/a Velocity flat History pending Astronomy AI Apr 14 Code High viability
Beyond Factual Grounding: The Case for Opinion-Aware Retrieval-Augmented Generation Build Now
This paper introduces an Opinion-Aware Retrieval-Augmented Generation system that addresses the bias of current RAG models towards factual content, enabling synthesis of diverse perspectives and improving retrieval diversity.
GitHub stars n/a Velocity flat History pending Retrieval-Augmented Generation Apr 13 Code High viability
Every Picture Tells a Dangerous Story: Memory-Augmented Multi-Agent Jailbreak Attacks on VLMs Build Now
MemJack is a memory-augmented multi-agent framework that uses visual semantics to automate jailbreak attacks on VLMs, achieving high success rates and releasing a comprehensive benchmark dataset.
GitHub stars n/a Velocity flat History 1 snapshot VLM Security / Agents Apr 14 Code High viability
SIR-Bench: Evaluating Investigation Depth in Security Incident Response Agents Build Now
SIR-Bench is a benchmark for evaluating autonomous security incident response agents, measuring genuine investigation depth beyond simple alert parroting.
GitHub stars n/a Velocity flat History pending Security Agents Apr 13 Code High viability
Towards Long-horizon Agentic Multimodal Search Watch
Develop a multimodal search agent for complex, long-horizon reasoning tasks.
GitHub stars n/a Velocity flat History 1 snapshot Multimodal Search and Reasoning Apr 14 Pending
PromptEcho: Annotation-Free Reward from Vision-Language Models for Text-to-Image Reinforcement Learning Build Now
A novel, annotation-free reward system for text-to-image models that leverages existing vision-language models to significantly improve prompt adherence and image quality.
GitHub stars n/a Velocity flat History pending Text-to-Image Generation Apr 14 Code High viability
Leveraging Weighted Syntactic and Semantic Context Assessment Summary (wSSAS) Towards Text Categorization Using LLMs Build Now
A deterministic framework for enterprise-grade text categorization using LLMs, improving accuracy and reproducibility by focusing on high-value semantic features.
GitHub stars n/a Velocity flat History pending LLM Text Categorization Apr 13 Code High viability
Lightning OPD: Efficient Post-Training for Large Reasoning Models with Offline On-Policy Distillation Build Now
An offline distillation framework for large language models that significantly speeds up post-training for reasoning and code generation tasks without requiring a live teacher server.
GitHub stars n/a Velocity flat History pending LLM Post-Training Apr 14 Code High viability
KumoRFM-2: Scaling Foundation Models for Relational Learning Build Now
A foundation model for relational data that natively processes connected tables, outperforming supervised methods for predictive tasks.
GitHub stars n/a Velocity flat History pending Relational Foundation Models Apr 14 Code High viability
Human-Centric Topic Modeling with Goal-Prompted Contrastive Learning and Optimal Transport Build Now
A human-centric topic modeling system that uses LLMs and optimal transport to generate goal-oriented and interpretable topics.
GitHub stars n/a Velocity flat History pending LLM Applications Apr 14 Code High viability
LASA: Language-Agnostic Semantic Alignment at the Semantic Bottleneck for LLM Safety Build Now
A language-agnostic method for LLM safety that anchors alignment in the model's semantic bottleneck, significantly reducing attack success rates across languages.
GitHub stars n/a Velocity flat History pending LLM Safety Apr 13 Code High viability
GCA Framework: A Gulf-Grounded Dataset and Agentic Pipeline for Climate Decision Support Build Now
A Gulf-focused multimodal dataset and tool-augmented agent framework for climate decision support, improving LLM reliability through domain fine-tuning and geospatial integration.
GitHub stars n/a Velocity flat History pending Climate AI Apr 14 Code High viability
BEAM: Bi-level Memory-adaptive Algorithmic Evolution for LLM-Powered Heuristic Design Build Now
BEAM is a bi-level evolutionary algorithm for LLM-powered heuristic design that evolves high-level algorithmic structures, significantly outperforming existing methods on optimization problems and producing a novel solver for Maximum Independent Set.
GitHub stars n/a Velocity flat History pending LLM Optimization Apr 14 Code High viability
Socrates Loss: Unifying Confidence Calibration and Classification by Leveraging the Unknown Build Now
Socrates Loss unifies classification and confidence calibration for deep neural networks by leveraging an auxiliary unknown class, improving training stability and the accuracy-calibration trade-off.
GitHub stars n/a Velocity flat History pending Model Calibration Apr 14 Pending High viability
SpanKey: Dynamic Key Space Conditioning for Neural Network Access Control Watch
A lightweight neural network access control method that conditions activations on secret keys, enabling gated inference without encrypting weights.
GitHub stars n/a Velocity flat History pending AI Security Apr 14 Pending
The Second Challenge on Cross-Domain Few-Shot Object Detection at NTIRE 2026: Methods and Results Watch
A challenge and benchmark for cross-domain few-shot object detection, evaluating novel methods for unseen target domains.
GitHub stars n/a Velocity flat History pending Computer Vision Apr 13 Pending
Frugal Knowledge Graph Construction with Local LLMs: A Zero-Shot Pipeline, Self-Consistency and Wisdom of Artificial Crowds Watch
A zero-shot pipeline for frugal knowledge graph construction using local LLMs, achieving competitive results with consumer hardware.
GitHub stars n/a Velocity flat History pending Knowledge Graph Construction Apr 13 Pending
AutoSurrogate: An LLM-Driven Multi-Agent Framework for Autonomous Construction of Deep Learning Surrogate Models in Subsurface Flow Build Now
An LLM-driven multi-agent framework that enables domain scientists to autonomously build high-quality deep learning surrogate models for complex simulations using natural language.
GitHub stars n/a Velocity flat History pending AI for Scientific Simulation Apr 13 Code High viability
CIA: Inferring the Communication Topology from LLM-based Multi-Agent Systems Build Now
A novel attack infers communication topologies in LLM-based multi-agent systems from black-box queries, revealing significant privacy risks.
GitHub stars n/a Velocity flat History pending LLM Security Apr 14 Code High viability
Back to Basics: Let Conversational Agents Remember with Just Retrieval and Generation Build Now
A minimalist conversational memory framework using only retrieval and generation, addressing signal sparsity and redundancy for robust long-term dialogue management.
GitHub stars n/a Velocity flat History pending Conversational AI Apr 13 Code High viability
Towards grounded autonomous research: an end-to-end LLM mini research loop on published computational physics Build Now
An end-to-end LLM agent autonomously reproduces, critiques, and extends computational physics research, demonstrating a novel mini research loop capable of identifying significant scientific findings.
GitHub stars n/a Velocity flat History pending LLM Research Agents Apr 14 Code High viability
CascadeDebate: Multi-Agent Deliberation for Cost-Aware LLM Cascades Build Now
A multi-agent deliberation system for LLM cascades that uses debate on uncertain queries to improve accuracy and reduce costs.
GitHub stars n/a Velocity flat History pending Agents Apr 14 Code High viability
Frontier-Eng: Benchmarking Self-Evolving Agents on Real-World Engineering Tasks with Generative Optimization Build Now
A new benchmark for self-evolving AI agents in real-world engineering tasks, focusing on iterative optimization and feasibility constraints.
GitHub stars n/a Velocity flat History pending Agents Apr 14 Code High viability
Unveiling the Surprising Efficacy of Navigation Understanding in End-to-End Autonomous Driving Build Now
A new framework and dataset for autonomous driving that significantly improves navigation understanding by integrating global and local planning, achieving state-of-the-art results without auxiliary losses.
GitHub stars n/a Velocity flat History pending Autonomous Driving Apr 14 Code High viability
NTIRE 2026 The 3rd Restore Any Image Model (RAIM) Challenge: Professional Image Quality Assessment (Track 1) Build Now
A challenge and dataset for multimodal large language models to perform professional image quality assessment and provide expert-level reasoning.
GitHub stars n/a Velocity flat History pending Image Quality Assessment Apr 14 Pending High viability
Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning Build Now
An open-source, efficient hybrid Mamba-Transformer model for agentic reasoning that significantly outperforms existing models in inference throughput.
GitHub stars n/a Velocity flat History pending LLM Training Apr 14 Code High viability
Beyond Prompt: Fine-grained Simulation of Cognitively Impaired Standardized Patients via Stochastic Steering Build Now
StsPatient simulates cognitively impaired standardized patients with fine-grained control over impairment severity and domain specificity using steering vectors and stochastic token modulation.
GitHub stars n/a Velocity flat History pending Clinical Simulation Apr 14 Code High viability
Detecting and refurbishing ground truth errors during training of deep learning-based echocardiography segmentation models Build Now
A novel strategy to detect and correct errors in medical image segmentation training data, improving model performance under challenging conditions.
GitHub stars n/a Velocity flat History pending Medical AI Apr 14 Code High viability
RPRA: Predicting an LLM-Judge for Efficient but Performant Inference Build Now
Enabling smaller LLMs to predict their own performance limitations to improve efficiency and self-awareness in AI systems.
GitHub stars n/a Velocity flat History pending LLM Self-Awareness Apr 14 Code High viability
Policy Split: Incentivizing Dual-Mode Exploration in LLM Reinforcement with Dual-Mode Entropy Regularization Build Now
A novel reinforcement learning paradigm for LLMs that bifurcates policy into normal and high-entropy modes to improve exploration without sacrificing accuracy.
GitHub stars n/a Velocity flat History pending LLM Reinforcement Learning Apr 13 Code High viability
WiseOWL: A Methodology for Evaluating Ontological Descriptiveness and Semantic Correctness for Ontology Reuse and Ontology Recommendations Build Now
WiseOWL is a methodology and Streamlit app that scores ontologies on descriptiveness and semantic correctness to guide reuse and recommendations.
GitHub stars n/a Velocity flat History pending Ontology Engineering Apr 13 Code High viability
GoodPoint: Learning Constructive Scientific Paper Feedback from Author Responses Build Now
A novel training recipe and dataset that teach LLMs to generate constructive scientific paper feedback, significantly improving the feedback's validity and actionability as validated by expert authors.
GitHub stars n/a Velocity flat History pending LLM Feedback and Evaluation Apr 13 Code High viability
MultiDocFusion: Hierarchical and Multimodal Chunking Pipeline for Enhanced RAG on Long Industrial Documents Build Now
A multimodal chunking pipeline that leverages document structure and vision to significantly improve RAG performance on long industrial documents.
GitHub stars n/a Velocity flat History pending RAG Enhancement Apr 14 Code High viability
Beyond Scores: Diagnostic LLM Evaluation via Fine-Grained Abilities Build Now
A diagnostic framework for LLMs that moves beyond single scores to assess fine-grained abilities across scientific domains, enabling targeted model improvement and selection.
GitHub stars n/a Velocity flat History pending LLM Evaluation Apr 14 Code High viability
Modeling Co-Pilots for Text-to-Model Translation Build Now
A suite of LLM co-pilots and a novel dataset for translating natural language into formal models for optimization and satisfaction problems, with an interactive editor.
GitHub stars n/a Velocity flat History pending Combinatorial Optimization Modeling Apr 14 Code High viability
A Two-Stage LLM Framework for Accessible and Verified XAI Explanations Build Now
A two-stage LLM framework that verifies and refines AI explanations for accuracy and accessibility, making XAI systems more trustworthy.
GitHub stars n/a Velocity flat History pending XAI Apr 14 Code High viability
GeM-EA: A Generative and Meta-learning Enhanced Evolutionary Algorithm for Streaming Data-Driven Optimization Build Now
An evolutionary algorithm enhanced with generative replay and meta-learning for faster and more robust data-driven optimization on streaming data.
GitHub stars n/a Velocity flat History pending Optimization AI Apr 14 Code High viability
Intelligent ROI-Based Vehicle Counting Framework for Automated Traffic Monitoring Build Now
An intelligent framework automatically identifies optimal regions for vehicle counting in traffic videos, achieving high accuracy and up to 4x faster processing.
GitHub stars n/a Velocity flat History pending Traffic Monitoring Apr 14 Code High viability
IAD-Unify: A Region-Grounded Unified Model for Industrial Anomaly Segmentation, Understanding, and Generation Build Now
A region-grounded unified model for industrial anomaly segmentation, understanding, and generation, aimed at boosting manufacturing efficiency.
GitHub stars n/a Velocity flat History 1 snapshot Industrial Automation Apr 14 Code High viability
TRUST Agents: A Collaborative Multi-Agent Framework for Fake News Detection, Explainable Verification, and Logic-Aware Claim Reasoning Build Now
TRUST Agents is a multi-agent framework for explainable fake news detection and claim reasoning, offering improved interpretability and evidence transparency.
GitHub stars n/a Velocity flat History pending Fact Verification & LLM Agents Apr 14 Code High viability
Ride the Wave: Precision-Allocated Sparse Attention for Smooth Video Generation Build Now
Precision-Allocated Sparse Attention (PASA) accelerates video generation by dynamically allocating computation to critical transitions and reducing visual flickering.
GitHub stars n/a Velocity flat History pending Video Generation Apr 14 Code High viability
Long-Horizon Plan Execution in Large Tool Spaces through Entropy-Guided Branching Build Now
SLATE is a new benchmark for evaluating tool-integrated LLM agents, and EGB is an uncertainty-aware search algorithm that improves task success and efficiency in large tool spaces.
GitHub stars n/a Velocity flat History pending Agents Apr 13 Code High viability
Benchmarking Deflection and Hallucination in Large Vision-Language Models Build Now
A benchmark and methodology to evaluate deflection and hallucination in Vision-Language Models, ensuring reliability by filtering for retrieval-dependent samples and assessing behavior under conflicting evidence.
GitHub stars n/a Velocity flat History pending Multimodal LLMs Apr 13 Code High viability
RACF: A Resilient Autonomous Car Framework with Object Distance Correction Build Now
A resilient framework for autonomous cars that uses sensor fusion and an object distance correction algorithm to improve perception robustness in real-time.
GitHub stars n/a Velocity flat History pending Autonomous Vehicle Perception Apr 14 Code High viability
Decoding by Perturbation: Mitigating MLLM Hallucinations via Dynamic Textual Perturbation Build Now
A training-free framework that mitigates multimodal LLM hallucinations by dynamically perturbing text to stabilize visual grounding.
GitHub stars n/a Velocity flat History pending LLM Hallucination Mitigation Apr 14 Code High viability
Black-Box Optimization From Small Offline Datasets via Meta Learning with Synthetic Tasks Build Now
A meta-learning framework that generates synthetic tasks to improve black-box optimization from small offline datasets.
GitHub stars n/a Velocity flat History pending Optimization AI Apr 14 Code High viability
BID-LoRA: A Parameter-Efficient Framework for Continual Learning and Unlearning Build Now
BID-LoRA is a parameter-efficient framework for continual learning and unlearning that precisely deletes unwanted knowledge and efficiently integrates new knowledge while minimizing leakage.
GitHub stars n/a Velocity flat History pending Continual Learning Apr 14 Code High viability
From Plan to Action: How Well Do Agents Follow the Plan? Build Now
This research analyzes how well AI agents follow plans, revealing that explicit plan guidance and reminders improve performance, suggesting a need for models that adaptively follow instructions rather than memorizing workflows.
GitHub stars n/a Velocity flat History pending Agents Apr 13 Code High viability
MAST: Mask-Guided Attention Mass Allocation for Training-Free Multi-Style Transfer Build Now
A training-free framework for multi-style image transfer that uses mask-guided attention to prevent artifacts and preserve structure.
GitHub stars n/a Velocity flat History pending Generative Image Apr 14 Code High viability
Cognition-Inspired Dual-Stream Semantic Enhancement for Vision-Based Dynamic Emotion Modeling Build Now
A cognition-inspired dual-stream model that enhances dynamic emotion recognition by integrating semantic and contextual knowledge with facial dynamics, achieving state-of-the-art performance.
GitHub stars n/a Velocity flat History pending Emotion Recognition Apr 14 Code High viability
Visual Preference Optimization with Rubric Rewards Build Now
A framework for improving multimodal AI by using instance-specific rubrics to generate preference data, outperforming existing methods on benchmarks.
GitHub stars n/a Velocity flat History pending Multimodal AI Apr 14 Code High viability
OSC: Hardware Efficient W4A4 Quantization via Outlier Separation in Channel Dimension Build Now
A hardware-efficient framework for LLM quantization that suppresses activation outliers to maintain accuracy and achieve significant speedups on modern AI accelerators.
GitHub stars n/a Velocity flat History pending LLM Quantization Apr 14 Code High viability
MISID: A Multimodal Multi-turn Dataset for Complex Intent Recognition in Strategic Deception Games Build Now
MISID is a multimodal dataset, paired with the FRACTAM framework, for complex intent recognition in strategic deception games, addressing deficiencies of current MLLMs in long-context discourse and cross-modal synergy.
GitHub stars n/a Velocity flat History pending Multimodal AI Apr 14 Code High viability
Heuristic Classification of Thoughts Prompting (HCoT): Integrating Expert System Heuristics for Structured Reasoning into Large Language Models Build Now
A prompting schema that integrates expert system heuristics to improve structured reasoning and efficiency in large language models for complex problem-solving.
GitHub stars n/a Velocity flat History pending LLM Reasoning Apr 14 Code High viability
The Long-Horizon Task Mirage? Diagnosing Where and Why Agentic Systems Break Build Now
A diagnostic benchmark and methodology for analyzing and attributing failures in LLM agents on long-horizon tasks.
GitHub stars n/a Velocity flat History pending Agentic Systems Apr 13 Code High viability
EgoEsportsQA: An Egocentric Video Benchmark for Perception and Reasoning in Esports Build Now
A new benchmark and dataset for evaluating Video-LLMs in high-velocity esports environments, revealing significant performance gaps and guiding future development for virtual egocentric applications.
GitHub stars n/a Velocity flat History pending Video LLMs Apr 14 Code High viability
Euler-inspired Decoupling Neural Operator for Efficient Pansharpening Build Now
A physics-inspired neural operator synthesizes high-resolution multispectral images from panchromatic and low-resolution data with improved efficiency and accuracy.
GitHub stars n/a Velocity flat History pending Image Processing Apr 14 Code High viability
Preventing Safety Drift in Large Language Models via Coupled Weight and Activation Constraints Build Now
A novel approach that simultaneously constrains model weights and activations to prevent safety degradation during large language model fine-tuning.
GitHub stars n/a Velocity flat History pending LLM Safety Apr 14 Code High viability
Calibration-Aware Policy Optimization for Reasoning LLMs Build Now
Improving LLM reasoning accuracy and calibration by optimizing for uncertainty-aware advantage estimation, reducing overconfidence and hallucinations.
GitHub stars n/a Velocity flat History pending LLM Reasoning Calibration Apr 14 Code High viability
TimeSAF: Towards LLM-Guided Semantic Asynchronous Fusion for Time Series Forecasting Build Now
A novel LLM-guided framework for time series forecasting that asynchronously fuses semantic and temporal features to achieve state-of-the-art performance and strong generalization.
GitHub stars n/a Velocity flat History pending Time Series Forecasting Apr 14 Code High viability
Transferable Expertise for Autonomous Agents via Real-World Case-Based Learning Build Now
A case-based learning framework for LLM agents that transfers prior task experience to new, complex real-world settings, improving structured analysis and performance.
GitHub stars n/a Velocity flat History pending Agents Apr 14 Code High viability
Coding-Free and Privacy-Preserving MCP Framework for Clinical Agentic Research Intelligence System Build Now
An AI system that automates clinical research workflows, from planning to report generation, without requiring coding or direct patient data access.
GitHub stars n/a Velocity flat History pending Clinical AI Agents Apr 14 Code High viability
A longitudinal health agent framework Build Now
A framework for building longitudinal health agents that maintain user engagement and adapt to evolving goals over time.
GitHub stars n/a Velocity flat History pending Agents Apr 13 Code High viability
Interpretable DNA Sequence Classification via Dynamic Feature Generation in Decision Trees Build Now
A framework that uses LLMs to generate interpretable, high-level DNA sequence features for improved classification accuracy.
GitHub stars n/a Velocity flat History pending Genomic AI Apr 13 Code High viability
FastGrasp: Learning-based Whole-body Control method for Fast Dexterous Grasping with Mobile Manipulators Build Now
FastGrasp is a learning-based framework for fast, dexterous grasping with mobile manipulators, integrating grasp guidance, whole-body control, and tactile feedback to achieve robust manipulation across diverse scenarios.
GitHub stars n/a Velocity flat History pending Robotics Apr 14 Code High viability
Policy-Invisible Violations in LLM-Based Agents Build Now
A framework for LLM agents that uses graph simulation to enforce policies by detecting violations hidden in action outcomes.
GitHub stars n/a Velocity flat History pending Agents Apr 14 Code High viability
Fully Homomorphic Encryption on Llama 3 model for privacy preserving LLM inference Build Now
Enabling privacy-preserving LLM inference by integrating fully homomorphic encryption with Llama 3 for secure text generation.
GitHub stars n/a Velocity flat History pending Privacy Preserving AI Apr 14 Code High viability
INDOTABVQA: A Benchmark for Cross-Lingual Table Understanding in Bahasa Indonesia Documents Build Now
A new benchmark and fine-tuning approach for cross-lingual table understanding in Bahasa Indonesia documents, significantly improving VLM performance on complex tables and low-resource languages.
GitHub stars n/a Velocity flat History pending Multilingual Document Understanding Apr 13 Code High viability
RePAIR: Interactive Machine Unlearning through Prompt-Aware Model Repair Watch
RePAIR offers interactive, prompt-based machine unlearning for language models, allowing users to autonomously control data erasure at inference time.
GitHub stars n/a Velocity flat History 1 snapshot Interactive AI Apr 14 Code
A hierarchical spatial-aware algorithm with efficient reinforcement learning for human-robot task planning and allocation in production Build Now
An efficient reinforcement learning algorithm for human-robot task planning and allocation in manufacturing, considering spatial awareness and dynamic environments.
GitHub stars n/a Velocity flat History pending Robotics Apr 14 Code High viability
Chain-of-Models Pre-Training: Rethinking Training Acceleration of Vision Foundation Models Watch
A novel method to accelerate vision foundation model training by creating a chain of models that transfer knowledge sequentially, reducing computational cost without performance loss.
GitHub stars n/a Velocity flat History pending Vision Foundation Model Training Apr 14 Pending
Security and Resilience in Autonomous Vehicles: A Proactive Design Approach Build Now
A proactive design approach for autonomous vehicles integrating redundancy, diversity, and anomaly detection to ensure security and resilience against cyberattacks.
GitHub stars n/a Velocity flat History pending Autonomous Vehicle Security Apr 14 Code High viability
Towards Platonic Representation for Table Reasoning: A Foundation for Permutation-Invariant Retrieval Build Now
This paper introduces the Platonic Representation Hypothesis for tables, advocating for permutation-invariant representations to improve robustness in table reasoning and retrieval systems, and proposes a novel structure-aware encoder.
GitHub stars n/a Velocity flat History pending Table Representation Learning Apr 13 Code High viability
The A-R Behavioral Space: Execution-Level Profiling of Tool-Using Language Model Agents in Organizational Deployment Build Now
The A-R Behavioral Space provides an execution-layer profiling method for tool-using LLM agents, characterizing their behavior across different autonomy levels and risk contexts.
GitHub stars n/a Velocity flat History pending Agents Apr 13 Code High viability
Distorted or Fabricated? A Survey on Hallucination in Video LLMs Ignore
A survey and taxonomy of hallucinations in Video LLMs, identifying dynamic distortion and content fabrication as key issues.
GitHub stars n/a Velocity flat History pending Video LLMs Apr 14 Pending
Efficient Adversarial Training via Criticality-Aware Fine-Tuning Build Now
A parameter-efficient fine-tuning method that significantly reduces the computational cost of adversarial training for Vision Transformers by focusing on robustness-critical parameters.
GitHub stars n/a Velocity flat History pending Robustness Training Apr 14 Code High viability
Clustering-Enhanced Domain Adaptation for Cross-Domain Intrusion Detection in Industrial Control Systems Build Now
A clustering-enhanced domain adaptation method for industrial control systems that significantly improves unknown attack detection and reduces performance degradation.
GitHub stars n/a Velocity flat History pending Cybersecurity AI Apr 14 Code High viability
Cooperative Memory Paging with Keyword Bookmarks for Long-Horizon LLM Conversations Build Now
A novel LLM memory system that uses keyword bookmarks and a recall tool to significantly improve long-conversation answer quality, outperforming existing methods.
GitHub stars n/a Velocity flat History pending LLM Memory Management Apr 14 Code High viability
Operationalising the Right to be Forgotten in LLMs: A Lightweight Sequential Unlearning Framework for Privacy-Aligned Deployment in Politically Sensitive Environments Build Now
A lightweight sequential unlearning framework for LLMs that enables privacy-aligned deployment by suppressing sensitive data with minimal impact on performance.
GitHub stars n/a Velocity flat History pending LLM Privacy Apr 14 Code High viability
Drawing on Memory: Dual-Trace Encoding Improves Cross-Session Recall in LLM Agents Build Now
LLM agents with dual-trace memory encoding significantly improve cross-session recall and temporal reasoning by pairing facts with contextual scene traces.
GitHub stars n/a Velocity flat History pending Agents Apr 14 Code High viability
Curvelet-Based Frequency-Aware Feature Enhancement for Deepfake Detection Build Now
A novel deepfake detection method using Curvelet transforms with wedge-level attention and scale-aware spatial masking to enhance frequency-domain features for improved robustness.
GitHub stars n/a Velocity flat History pending Deepfake Detection Apr 13 Code High viability
VISTA: Validation-Informed Trajectory Adaptation via Self-Distillation Build Now
VISTA is a self-distillation framework that improves model robustness and generalization by enforcing consistency along the optimization trajectory using validation-informed anchors.
GitHub stars n/a Velocity flat History pending Model Training Optimization Apr 13 Code High viability
VFA: Relieving Vector Operations in Flash Attention with Global Maximum Pre-computation Watch
VFA is a hardware-friendly method to optimize FlashAttention by reducing vector operations, achieving significant speedups without performance loss.
GitHub stars n/a Velocity flat History pending LLM Optimization Apr 14 Code
AnyPoC: Universal Proof-of-Concept Test Generation for Scalable LLM-Based Bug Detection Watch
Generates universal proof-of-concept tests to make LLM-based bug detection scalable across software projects.
GitHub stars n/a Velocity flat History 1 snapshot AI-based Testing Apr 13 Code
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe Ignore
Investigating the dynamics and mechanisms of on-policy distillation for large language models, proposing strategies to improve its success.
GitHub stars n/a Velocity flat History pending LLM Training Apr 14 Pending
One Token Away from Collapse: The Fragility of Instruction-Tuned Helpfulness Watch
Reveals that instruction-tuned LLMs are fragile to simple lexical constraints, leading to significant response collapse, and proposes a two-pass generation method to recover performance.
GitHub stars n/a Velocity flat History pending LLM Robustness Apr 14 Code
Cycle-Consistent Search: Question Reconstructability as a Proxy Reward for Search Agent Training Watch
A gold-supervision-free framework for training search agents using cycle-consistency to reconstruct original questions from search trajectories.
GitHub stars n/a Velocity flat History pending Search Agent Training Apr 14 Code
Is Vibe Coding the Future? An Empirical Assessment of LLM Generated Codes for Construction Safety Watch
Empirical assessment of LLM-generated code for construction safety reveals significant silent failure rates, highlighting the need for deterministic AI wrappers and governance.
GitHub stars n/a Velocity flat History pending LLM Safety Apr 14 Code
PR-MaGIC: Prompt Refinement Via Mask Decoder Gradient Flow For In-Context Segmentation Build Now
PR-MaGIC is a training-free framework that refines prompts for in-context segmentation using gradient flow from SAM's mask decoder, improving segmentation quality without additional training.
GitHub stars n/a Velocity flat History pending Computer Vision Apr 13 Code High viability
TEMPLATEFUZZ: Fine-Grained Chat Template Fuzzing for Jailbreaking and Red Teaming LLMs Watch
A fuzzing framework to find vulnerabilities in LLM chat templates for jailbreaking and red teaming.
LLM Security Apr 14 High viability
When Does Data Augmentation Help? Evaluating LLM and Back-Translation Methods for Hausa and Fongbe NLP Watch
Evaluates LLM and back-translation data augmentation for low-resource African languages, revealing task-specific effectiveness for NLP tasks like NER and POS tagging.
GitHub stars n/a Velocity flat History pending Low-Resource NLP Apr 14 Code
Narrative-Driven Paper-to-Slide Generation via ArcDeck Watch
A multi-agent framework that reconstructs the narrative flow of academic papers to generate coherent and logically structured presentations.
GitHub stars n/a Velocity flat History pending AI Agents for Content Generation Apr 13 Code
Mining Large Language Models for Low-Resource Language Data: Comparing Elicitation Strategies for Hausa and Fongbe Watch
Extracting low-resource language data from LLMs using strategic prompting, with code and corpora released.
LLM Data Mining Apr 14 High viability
Scaffold-Conditioned Preference Triplets for Controllable Molecular Optimization with Large Language Models Build Now
LLMs for molecular optimization that preserve scaffold integrity and improve properties, trained on chemistry-grounded preference data.
GitHub stars n/a Velocity flat History pending Drug Discovery AI Apr 14 Code High viability
Neural Dynamic GI: Random-Access Neural Compression for Temporal Lightmaps in Dynamic Lighting Environments Build Now
A novel neural compression technique for temporal lightmaps, enabling high-quality dynamic global illumination in real-time rendering with significantly reduced storage.
GitHub stars n/a Velocity flat History pending Real-time Rendering Compression Apr 14 Code High viability
QuarkMedSearch: A Long-Horizon Deep Search Agent for Exploring Medical Intelligence Build Now
QuarkMedSearch is a long-horizon deep search agent for Chinese medical intelligence, achieving state-of-the-art performance through novel data construction and training strategies.
GitHub stars n/a Velocity flat History pending Medical AI Agents Apr 14 Code High viability
Rethinking Satellite Image Restoration for Onboard AI: A Lightweight Learning-Based Approach Build Now
A lightweight convolutional network for satellite image restoration, demonstrating competitive quality and significant latency reduction for onboard AI applications.
GitHub stars n/a Velocity flat History pending Computer Vision Apr 14 Code High viability
Thermodynamic Liquid Manifold Networks: Physics-Bounded Deep Learning for Solar Forecasting in Autonomous Off-Grid Microgrids Watch
A physics-bounded deep learning model for solar forecasting in autonomous microgrids that eliminates spurious nocturnal generation predictions and achieves zero-lag synchronization during weather transients.
Solar Forecasting Apr 13 High viability
ARGOS: Who, Where, and When in Agentic Multi-Camera Person Search Build Now
ARGOS is a benchmark and framework for agentic multi-camera person search, requiring agents to reason, question, and use tools to identify individuals under information asymmetry.
GitHub stars n/a Velocity flat History pending Agentic Multi-Camera Person Search Apr 14 Code High viability
MODIX: A Training-Free Multimodal Information-Driven Positional Index Scaling for Vision-Language Models Build Now
MODIX is a training-free framework that scales positional indices in Vision-Language Models based on information density, improving multimodal reasoning.
GitHub stars n/a Velocity flat History pending Vision-Language Models Apr 14 Code High viability
Not All Forgetting Is Equal: Architecture-Dependent Retention Dynamics in Fine-Tuned Image Classifiers Watch
Analyzing architecture-dependent sample forgetting in fine-tuned image classifiers to inform data pruning and curriculum design.
GitHub stars n/a Velocity flat History pending Model Forgetting Analysis Apr 13 Code
Round-Trip Translation Reveals What Frontier Multilingual Benchmarks Miss Watch
Uses round-trip translation to expose limitations in current multilingual benchmarks and proposes a more realistic evaluation method with the Lost in Translation benchmark.
GitHub stars n/a Velocity flat History pending Multilingual LLMs Apr 14 Code
Lit2Vec: A Reproducible Workflow for Building a Legally Screened Chemistry Corpus from S2ORC for Downstream Retrieval and Text Mining Ignore
A reproducible workflow for building and validating a legally screened chemistry corpus from S2ORC, enriched with embeddings and annotations for downstream text mining.
GitHub stars n/a Velocity flat History pending Data Corpus Generation Apr 14 Code
Human-Inspired Context-Selective Multimodal Memory for Social Robots Watch
A human-inspired multimodal memory system for social robots that selectively stores and retrieves emotional and novel experiences for personalized interactions.
GitHub stars n/a Velocity flat History pending Robotics Apr 13 Code
LLMs Struggle with Abstract Meaning Comprehension More Than Expected Watch
A bidirectional attention classifier that improves LLMs' ability to comprehend abstract meanings in text.
GitHub stars n/a Velocity flat History pending LLM Comprehension Apr 13 Code
SpecBound: Adaptive Bounded Self-Speculation with Layer-wise Confidence Calibration Watch
SpecBound accelerates LLM inference by adaptively bounding self-speculation length and calibrating confidence layer-wise, achieving up to 2.33x speedup without modifying base LLM parameters.
LLM Inference Apr 14
DeepTest Tool Competition 2026: Benchmarking an LLM-Based Automotive Assistant Watch
Reports on the first LLM testing competition, which benchmarked an LLM-based automotive assistant for failures in retrieving information from car manuals.
GitHub stars n/a Velocity flat History pending LLM Testing / Automotive Apr 14 Code
Evaluating Relational Reasoning in LLMs with REL Ignore
A new benchmark framework to evaluate relational reasoning in LLMs by systematically varying the complexity of entity binding.
GitHub stars n/a Velocity flat History pending LLM Evaluation Apr 14 Code
Artificial Intelligence for Modeling and Simulation of Mixed Automated and Human Traffic Ignore
A survey of AI methods for modeling mixed automated and human traffic simulation, proposing a taxonomy and identifying gaps in existing tools.
GitHub stars n/a Velocity flat History pending AI for Traffic Simulation Apr 14 Code
Information-Theoretic Optimization for Task-Adapted Compressed Sensing Magnetic Resonance Imaging Ignore
An information-theoretic framework for task-adapted compressed sensing MRI that enables probabilistic inference for uncertainty prediction and adaptive sampling for clinical tasks.
GitHub stars n/a Velocity flat History pending Medical AI Apr 14 Code
How Transformers Learn to Plan via Multi-Token Prediction Ignore
Explores multi-token prediction as a training objective for language models on complex reasoning tasks such as planning, where it outperforms standard next-token prediction.
GitHub stars n/a Velocity flat History pending LLM Reasoning Apr 13 Code
LogicEval: A Systematic Framework for Evaluating Automated Repair Techniques for Logical Vulnerabilities in Real-World Software Ignore
Introduces LogicEval, a framework and dataset for evaluating automated repair techniques for logical vulnerabilities in software, highlighting challenges for LLM-based approaches.
GitHub stars n/a Velocity flat History pending Automated Program Repair Apr 14 Code
Local-Splitter: A Measurement Study of Seven Tactics for Reducing Cloud LLM Token Usage on Coding-Agent Workloads Watch
A measurement study of seven tactics to reduce cloud LLM token usage for coding agents, offering workload-dependent strategies for significant cost savings.
LLM Optimization Apr 14
Observing the unobserved confounding through its effects: toward randomized trial-like estimates from real-world survival data Ignore
This research proposes a framework to infer and balance a latent prognostic factor from observational survival data, aiming to reduce unobserved confounding and improve treatment-effect estimation.
GitHub stars n/a Velocity flat History pending Causal Inference Apr 13 Code
DoseRAD2026 Challenge dataset: AI accelerated photon and proton dose calculation for radiotherapy Ignore
A new benchmark dataset and challenge for AI-accelerated photon and proton dose calculation in radiotherapy, enabling the development of faster and more accurate dose prediction methods.
GitHub stars n/a Velocity flat History pending Medical Imaging AI Apr 14 Code
Latent patterns of urban mixing in mobility analysis across five global cities Ignore
Leveraging large-scale travel surveys and graph neural networks, this research uncovers latent patterns of urban social mixing across five global cities, revealing how mobility shapes social interactions.
GitHub stars n/a Velocity flat History pending Urban Mobility Analysis Apr 14 Code
ReflectCAP: Detailed Image Captioning with Reflective Memory Watch
A multi-agent system that uses structured reflection notes to improve the factuality and coverage of image captions generated by large vision-language models.
Image Captioning Apr 14
Memory as Metabolism: A Design for Companion Knowledge Systems Ignore
A novel design for companion knowledge systems that addresses entrenchment and drift in personal LLM wikis by implementing a metabolism-like process for memory updates.
GitHub stars n/a Velocity flat History pending Agents Apr 13 Code
Bilevel Late Acceptance Hill Climbing for the Electric Capacitated Vehicle Routing Problem Ignore
A bilevel optimization algorithm for the Electric Capacitated Vehicle Routing Problem that separates routing and charging decisions to accelerate convergence and achieve near-optimal solutions.
GitHub stars n/a Velocity flat History pending Optimization Algorithms Apr 14 Code
Designing Reliable LLM-Assisted Rubric Scoring for Constructed Responses: Evidence from Physics Exams Watch
Designing reliable LLM-assisted rubric scoring for constructed responses in STEM exams, focusing on rubric design and LLM configurations.
AI Education Apr 14
CycloneMAE: A Scalable Multi-Task Learning Model for Global Tropical Cyclone Probabilistic Forecasting Ignore
CycloneMAE is a scalable multi-task learning model for global tropical cyclone probabilistic forecasting, outperforming NWP systems in pressure, wind, and track forecasting.
Weather Forecasting AI Apr 14
PAL: Personal Adaptive Learner Ignore
A personalized learning tool that adapts educational content to individual student needs.
GitHub stars n/a Velocity flat History 1 snapshot Adaptive Learning Apr 14 Code
ARGen: Affect-Reinforced Generative Augmentation towards Vision-based Dynamic Emotion Perception Ignore
A framework that uses affect-reinforced generative augmentation to create realistic dynamic facial expressions for improved emotion recognition.
Generative AI for Vision Apr 14
BayMOTH: Bayesian optiMizatiOn with meTa-lookahead -- a simple approacH Ignore
A novel Bayesian optimization approach that selectively uses related-task information to improve sample efficiency.
GitHub stars n/a Velocity flat History pending Optimization Apr 13 Code
LLM-Guided Prompt Evolution for Password Guessing Ignore
Automated prompt evolution using LLM agents to significantly improve password guessing rates for security auditing.
LLM Security Apr 14
GAM: Hierarchical Graph-based Agentic Memory for LLM Agents Ignore
A hierarchical graph-based memory framework for LLM agents that separates memory encoding and consolidation to improve long-term coherence.
LLM Agents Apr 14
Efficient Semantic Image Communication for Traffic Monitoring at the Edge Ignore
Develops semantic image communication pipelines for edge traffic monitoring that drastically reduce data transmission costs by replacing full images with compact representations.
Edge AI / Computer Vision Apr 14
Development, Evaluation, and Deployment of a Multi-Agent System for Thoracic Tumor Board Ignore
An AI system automates patient summary generation for thoracic tumor boards, improving efficiency and accuracy in clinical practice.
Medical AI Apr 14
SOAR: Self-Correction for Optimal Alignment and Refinement in Diffusion Models Ignore
Introduces SOAR, a novel post-training method for diffusion models that corrects exposure bias and improves alignment without requiring reward models.
Diffusion Models Apr 14
Evaluating the Limitations of Protein Sequence Representations for Parkinson's Disease Classification Ignore
An evaluation of protein sequence representations for Parkinson's disease classification finds limited discriminative power, indicating that more informative biological features are needed.
GitHub stars n/a Velocity flat History pending Medical AI Apr 13 Code
Representation geometry shapes task performance in vision-language modeling for CT enterography Ignore
Investigating representation geometry in vision-language models for CT enterography to improve disease assessment and report generation.
Medical AI Apr 14
Cross-Cultural Simulation of Citizen Emotional Responses to Bureaucratic Red Tape Using LLM Agents Ignore
An interactive interface for simulating citizen emotional responses to bureaucratic red tape across cultures to improve policymaking.
Agent Simulation Apr 14
Audio Source Separation in Reverberant Environments using $β$-divergence based Nonnegative Factorization Ignore
Improving audio source separation in reverberant environments using beta-divergence based nonnegative factorization.
Audio AI Apr 14
Can AI Tools Transform Low-Demand Math Tasks? An Evaluation of Task Modification Capabilities Ignore
An evaluation of AI tools' capability to upgrade low-demand math tasks, finding moderate success rates and highlighting the distinct capability of task modification versus classification.
AI for Curriculum Adaptation Apr 14
AISafetyBenchExplorer: A Metric-Aware Catalogue of AI Safety Benchmarks Reveals Fragmented Measurement and Weak Benchmark Governance Ignore
AISafetyBenchExplorer is a catalogue of AI safety benchmarks that reveals fragmentation in measurement and weak governance, providing a structured analysis of the current landscape.
GitHub stars n/a Velocity flat History pending AI Safety Apr 14 Code
Elastic Net Regularization and Gabor Dictionary for Classification of Heart Sound Signals using Deep Learning Ignore
Optimizing deep learning models with elastic net regularization and Gabor dictionaries for accurate heart sound signal classification.
Medical AI Apr 14
Narrative over Numbers: The Identifiable Victim Effect and its Amplification Under Alignment and Reasoning in Large Language Models Ignore
This research investigates the Identifiable Victim Effect in large language models, revealing how alignment and reasoning training impact their ethical decision-making.
LLM Alignment & Ethics Apr 13
EMBER: Autonomous Cognitive Behaviour from Learned Spiking Neural Network Dynamics in a Hybrid LLM Architecture Ignore
A hybrid cognitive architecture that uses a spiking neural network to modulate LLM behavior and enable autonomous actions.
Cognitive Architectures Apr 14
Contextual Multi-Task Reinforcement Learning for Autonomous Reef Monitoring Ignore
Develops a contextual multi-task reinforcement learning approach to enable autonomous underwater vehicles to monitor marine ecosystems more effectively.
Autonomous Systems Apr 14
ROSE: An Intent-Centered Evaluation Metric for NL2SQL Ignore
A new metric for evaluating Natural Language to SQL systems that focuses on answering the user's intent rather than just matching ground truth SQL.
NL2SQL Evaluation Apr 14
When to Forget: A Memory Governance Primitive Ignore
Introduces Memory Worth, a lightweight primitive for governing agent memory quality by tracking co-occurrence with successful outcomes.
Agents Apr 13
Record-Remix-Replay: Hierarchical GPU Kernel Optimization using Evolutionary Search Ignore
A hierarchical optimization framework using LLM-driven evolutionary search to efficiently tune GPU kernels across various dimensions.
GPU Optimization Apr 13
LLM-Guided Semantic Bootstrapping for Interpretable Text Classification with Tsetlin Machines Ignore
A framework to transfer LLM knowledge into interpretable Tsetlin Machines for text classification, improving accuracy and transparency without runtime LLM calls.
Interpretable Text Classification Apr 14
Orthogonal Subspace Projection for Continual Machine Unlearning via SVD-Based LoRA Ignore
A novel method for continual machine unlearning that uses SVD-guided orthogonal subspace projection to prevent parameter collision and interference between tasks.
Machine Unlearning Apr 14
Enhancing Clustering: An Explainable Approach via Filtered Patterns Ignore
A theoretical framework to reduce redundancy in explainable clustering by formally characterizing and removing duplicate pattern representations.
GitHub stars n/a Velocity flat History pending Explainable AI Apr 14 Code
How memory can affect collective and cooperative behaviors in an LLM-Based Social Particle Swarm Ignore
This research explores how memory length impacts cooperative behavior in LLM-based agents within a social particle swarm model, using Gemini and Gemma to analyze emergent social dynamics.
Agents Apr 14
Continuous Knowledge Metabolism: Generating Scientific Hypotheses from Evolving Literature Ignore
Continuous Knowledge Metabolism (CKM) is a framework for generating scientific hypotheses by incrementally processing evolving literature and updating a structured knowledge base.
Scientific Discovery Apr 14
Safe reinforcement learning with online filtering for fatigue-predictive human-robot task planning and allocation in production Ignore
A safe reinforcement learning approach for human-robot task planning and allocation that predicts worker fatigue and applies online filtering based on it.
Robotics Apr 14
ResBM: Residual Bottleneck Models for Low-Bandwidth Pipeline Parallelism Ignore
Introduces Residual Bottleneck Models (ResBM) for efficient low-bandwidth decentralized training of transformer architectures, achieving significant activation compression.
LLM Training Optimization Apr 13
Aethon: A Reference-Based Replication Primitive for Constant-Time Instantiation of Stateful AI Agents Ignore
Aethon introduces a reference-based replication primitive for near-constant-time instantiation of stateful AI agents, shifting instantiation from duplication to reference for improved scalability and governance.
AI Infrastructure Apr 13
LIFE -- an energy efficient advanced continual learning agentic AI framework for frontier systems Ignore
A novel framework for energy-efficient continual learning in HPC systems, focusing on agentic AI and brain-inspired architectures for self-evolving network management.
Agentic AI Frameworks Apr 14
From Kinematics to Dynamics: Learning to Refine Hybrid Plans for Physically Feasible Execution Ignore
Reinforcement learning refines robotic motion plans to ensure physical feasibility, bridging the gap between planning and real-world execution.
Robotics Planning Apr 14
Use of AI Tools: Guidelines to Maintain Academic Integrity in Computing Colleges Ignore
Guidelines and a formal model to maintain academic integrity in computing colleges amidst the rise of AI tools like ChatGPT.
Academic Integrity Apr 13
Deep Learning for Sequential Decision Making under Uncertainty: Foundations, Frameworks, and Frontiers Ignore
A tutorial bridging operations research with deep learning for sequential decision-making under uncertainty.
Decision Making AI Apr 13
Disposition Distillation at Small Scale: A Three-Arc Negative Result Ignore
This paper investigates methods for training behavioral dispositions into small language models, ultimately reporting negative results across multiple experimental arcs.
LLM Training Apr 13
Self-Monitoring Benefits from Structural Integration: Lessons from Metacognition in Continuous-Time Multi-Timescale Agents Ignore
This paper demonstrates that self-monitoring capabilities in RL agents only provide benefits when structurally integrated into the decision pathway, not as auxiliary add-ons.
Reinforcement Learning Agents Apr 13
Social Learning Strategies for Evolved Virtual Soft Robots Ignore
Developing social learning strategies for virtual soft robots to accelerate brain optimization through peer knowledge sharing.
Robotics Apr 14
Characterizing Resource Sharing Practices on Underground Internet Forum Synthetic Non-Consensual Intimate Image Content Creation Communities Ignore
This paper analyzes resource sharing and knowledge transfer within underground forums for synthetic non-consensual intimate imagery creation, identifying intervention points for deterrence.
AI Ethics & Safety Apr 14
From edges to meaning: Semantic line sketches as a cognitive scaffold for ancient pictograph invention Ignore
A biologically inspired digital twin of the visual hierarchy that generates contour sketches and refines them with semantic representations, mirroring human visual cortex for symbol invention.
Cognitive AI for Symbol Invention Apr 14
Can AI Detect Life? Lessons from Artificial Life Ignore
This research highlights the significant risk of false positives when using current AI methods for extraterrestrial life detection due to their susceptibility to out-of-distribution samples.
AI for Astrobiology Apr 13
Algorithmic Analysis of Dense Associative Memory: Finite-Size Guarantees and Adversarial Robustness Ignore
This paper provides theoretical guarantees for the convergence and robustness of Dense Associative Memory, a generalization of Hopfield networks, with potential applications in memory systems.
Theoretical AI Apr 14
Broadening the Applicability of Conditional Syntax Splitting for Reasoning from Conditional Belief Bases Ignore
A theoretical generalization of conditional syntax splitting for nonmonotonic reasoning from belief bases, broadening applicability.
Reasoning Apr 14
Technical Report -- A Context-Sensitive Multi-Level Similarity Framework for First-Order Logic Arguments: An Axiomatic Study Ignore
A theoretical framework for measuring similarity in First-Order Logic arguments, accounting for structured content and contextual weights.
Formal Argumentation Apr 14
Efficiency of Proportional Mechanisms in Online Auto-Bidding Advertising Ignore
This paper analyzes the efficiency of proportional mechanisms in online advertising auctions, focusing on theoretical bounds for pure Nash equilibria.
Algorithmic Game Theory Apr 14
A Scoping Review of Large Language Model-Based Pedagogical Agents Ignore
A review of Large Language Model-based pedagogical agents in educational settings, analyzing design dimensions and emerging trends.
AI in Education Apr 14
Deepfakes at Face Value: Image and Authority Ignore
Argues that deepfakes are wrong because they usurp our authority over the permissible uses of our image and identity by exploiting biometric features as a generative resource.
AI Ethics Apr 14
Latent Planning Emerges with Scale Ignore
Investigating latent planning abilities in LLMs, showing that these capabilities emerge with scale and can be measured through internal planning representations.
LLM Reasoning Apr 14
Loop Corrections to the Training and Generalization Errors of Random Feature Models Ignore
Theoretical analysis of loop corrections to training and generalization errors in random feature models.
LLM Training Apr 14