KnowRL: Boosting LLM Reasoning via Reinforcement Learning with Minimal-Sufficient Knowledge Guidance Build Now
A reinforcement learning framework that boosts LLM reasoning by intelligently guiding training with minimal, sufficient knowledge points.
GitHub 42 stars Velocity flat History 1 snapshot LLM Reasoning Apr 14 Pending High viability
The Second Challenge on Cross-Domain Few-Shot Object Detection at NTIRE 2026: Methods and Results Build Now
This paper presents the NTIRE 2026 Cross-Domain Few-Shot Object Detection Challenge and reports the participating methods and their results.
GitHub 18 stars Velocity flat History 1 snapshot Object Detection Apr 13 Pending High viability
Domain-Specific Latent Representations Improve the Fidelity of Diffusion-Based Medical Image Super-Resolution Build Now
Domain-specific autoencoders significantly improve medical image super-resolution fidelity in diffusion models, with code and weights available.
GitHub 0 stars Velocity flat History 1 snapshot Medical Image Super-Resolution Apr 14 Pending High viability
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe Build Now
A systematic investigation into on-policy distillation for large language models, identifying key conditions for success and proposing strategies to improve failing distillations.
GitHub 44 stars Velocity flat History 1 snapshot LLM Training Apr 14 Pending High viability
CoDe-R: Refining Decompiler Output with LLMs via Rationale Guidance and Adaptive Inference Build Now
CoDe-R refines decompiler output using LLMs with rationale guidance and adaptive inference, achieving state-of-the-art re-executability for lightweight models.
GitHub stars n/a Velocity flat History 1 snapshot Decompilation Apr 14 Pending High viability
Towards Long-horizon Agentic Multimodal Search Build Now
An agent-based multimodal search platform for resolving complex, long-horizon queries.
GitHub 6 stars Velocity flat History 1 snapshot Multimodal Search Agents Apr 14 Pending High viability
Distorted or Fabricated? A Survey on Hallucination in Video LLMs Watch
A survey and taxonomy of hallucinations in Video LLMs, analyzing root causes and proposing future research directions for more reliable systems.
GitHub 24 stars Velocity flat History 1 snapshot Video LLMs Apr 14 Pending
Learning Chain Of Thoughts Prompts for Predicting Entities, Relations, and even Literals on Knowledge Graphs Build Now
A prompt learning framework that uses chain-of-thought prompts to predict missing entities, relations, and literals in knowledge graphs, outperforming traditional embedding models.
GitHub 0 stars Velocity flat History 1 snapshot Knowledge Graphs Apr 14 Pending High viability
Beyond Output Correctness: Benchmarking and Evaluating Large Language Model Reasoning in Coding Tasks Build Now
A benchmark and evaluator for LLM reasoning in coding tasks that improves accuracy and identifies limitations in existing methods.
GitHub stars n/a Velocity flat History 1 snapshot LLM Evaluation Apr 14 Pending High viability
SCRIPT: A Subcharacter Compositional Representation Injection Module for Korean Pre-Trained Language Models Build Now
A module that injects subcharacter compositional knowledge into Korean LLMs to improve linguistic understanding and generation without architectural changes.
GitHub 1 star Velocity flat History 1 snapshot LLM Adaptation Apr 14 Pending High viability
MolMem: Memory-Augmented Agentic Reinforcement Learning for Sample-Efficient Molecular Optimization Build Now
A memory-augmented reinforcement learning agent for sample-efficient molecular optimization in drug discovery.
GitHub stars n/a Velocity flat History 1 snapshot Drug Discovery AI Apr 14 Pending High viability
Filtered Reasoning Score: Evaluating Reasoning Quality on a Model's Most-Confident Traces Build Now
A new evaluation metric for LLMs that assesses reasoning quality beyond simple accuracy, with open-source code available.
GitHub 0 stars Velocity flat History 1 snapshot LLM Evaluation Apr 13 Pending High viability
DocSeeker: Structured Visual Reasoning with Evidence Grounding for Long Document Understanding Build Now
A structured reasoning framework for multimodal LLMs to improve understanding of long documents by localizing and grounding evidence.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 14 Pending High viability
Neural Dynamic GI: Random-Access Neural Compression for Temporal Lightmaps in Dynamic Lighting Environments Build Now
A novel neural compression technique for temporal lightmaps that enables high-quality dynamic global illumination with significantly reduced storage and memory.
GitHub 713 stars Velocity flat History 1 snapshot Generative Graphics Apr 14 Pending High viability
Chain-of-Models Pre-Training: Rethinking Training Acceleration of Vision Foundation Models Build Now
A novel training acceleration method for vision foundation models that creates a model family chain, enabling efficient sequential knowledge transfer and reducing training costs by up to 72%.
GitHub 5 stars Velocity flat History 1 snapshot LLM Training Acceleration Apr 14 Pending High viability
CLASP: Class-Adaptive Layer Fusion and Dual-Stage Pruning for Multimodal Large Language Models Build Now
A plug-and-play framework for reducing visual token redundancy in multimodal LLMs through class-adaptive layer fusion and dual-stage pruning.
GitHub 0 stars Velocity flat History 1 snapshot Multimodal LLMs Apr 14 Pending High viability
Spatial Atlas: Compute-Grounded Reasoning for Spatial-Aware Research Agent Benchmarks Build Now
A novel agent paradigm that grounds reasoning in deterministic computation before LLM generation for spatial-aware tasks, improving accuracy and interpretability.
GitHub 0 stars Velocity flat History 1 snapshot Agents Apr 13 Pending High viability
Efficient Adversarial Training via Criticality-Aware Fine-Tuning Build Now
A criticality-aware fine-tuning method that achieves robust Vision Transformer models by selectively updating only the most critical parameters, significantly reducing computational cost.
GitHub 713 stars Velocity flat History 1 snapshot Robustness AI Apr 14 Pending High viability
LLM-HYPER: Generative CTR Modeling for Cold-Start Ad Personalization via LLM-Based Hypernetworks Build Now
Uses LLMs as hypernetworks to generate CTR models for ad personalization, addressing the cold-start problem; deployed in production.
GitHub stars n/a Velocity flat History 1 snapshot Generative AI for Advertising Apr 13 Code High viability
IDEA: An Interpretable and Editable Decision-Making Framework for LLMs via Verbal-to-Numeric Calibration Build Now
A framework that extracts LLM decision knowledge into an interpretable parametric model, enabling calibrated probabilities and quantitative human-AI collaboration.
GitHub 0 stars Velocity flat History 1 snapshot Interpretable LLM Decision Making Apr 14 Pending High viability
Identity as Attractor: Geometric Evidence for Persistent Agent Architecture in LLM Activation Space Build Now
This research provides evidence that agent identity induces attractor-like geometry in LLM activation space, offering a new way to understand and potentially control LLM behavior.
GitHub 0 stars Velocity flat History 1 snapshot LLM Analysis Apr 13 Pending High viability
AnyPoC: Universal Proof-of-Concept Test Generation for Scalable LLM-Based Bug Detection Build Now
An LLM-powered multi-agent system that autonomously generates executable proofs-of-concept to validate bug reports in software, significantly improving bug detection accuracy and reducing false positives.
GitHub stars n/a Velocity flat History 1 snapshot LLM Agents Apr 13 Code High viability
KG-Reasoner: A Reinforced Model for End-to-End Multi-Hop Knowledge Graph Reasoning Build Now
KG-Reasoner is an end-to-end framework using Reinforcement Learning to enable LLMs to perform dynamic, multi-hop reasoning over Knowledge Graphs.
GitHub 0 stars Velocity flat History 1 snapshot Knowledge Graph Reasoning Apr 14 Pending High viability
Modality-Native Routing in Agent-to-Agent Networks: A Multimodal A2A Protocol Extension Build Now
An extension to Agent-to-Agent networks that enables modality-native routing for richer multimodal context, improving task accuracy in vision-dependent scenarios.
GitHub 0 stars Velocity flat History 1 snapshot Multi-Agent Systems Apr 14 Pending High viability
Lightning OPD: Efficient Post-Training for Large Reasoning Models with Offline On-Policy Distillation Build Now
An offline distillation framework for large language models that significantly speeds up post-training for reasoning and code generation tasks without requiring a live teacher server.
GitHub stars n/a Velocity flat History 1 snapshot LLM Post-Training Apr 14 Pending High viability
NTIRE 2026 The 3rd Restore Any Image Model (RAIM) Challenge: Professional Image Quality Assessment (Track 1) Build Now
A challenge and dataset for professional image quality assessment using multimodal large language models to provide comparative selection and expert-level reasoning.
GitHub 13 stars Velocity flat History 1 snapshot Image Quality Assessment Apr 14 Pending High viability
GeM-EA: A Generative and Meta-learning Enhanced Evolutionary Algorithm for Streaming Data-Driven Optimization Build Now
An evolutionary algorithm enhanced with meta-learning and generative replay for robust and fast optimization of streaming data with concept drift.
GitHub 1 star Velocity flat History 1 snapshot Optimization AI Apr 14 Pending High viability
PR-MaGIC: Prompt Refinement Via Mask Decoder Gradient Flow For In-Context Segmentation Build Now
A training-free framework that refines prompts for in-context image segmentation using gradient flow, significantly improving accuracy without additional training.
GitHub 713 stars Velocity flat History 1 snapshot Image Segmentation Apr 13 Pending High viability
Parallax: Why AI Agents That Think Must Never Act Build Now
A novel security paradigm for AI agents that separates reasoning from execution to prevent unauthorized actions, with a 98.9% attack blocking rate.
GitHub 2 stars Velocity flat History 1 snapshot AI Agent Security Apr 14 Pending High viability
CIA: Inferring the Communication Topology from LLM-based Multi-Agent Systems Build Now
A novel attack infers communication topologies in LLM-based multi-agent systems, revealing significant privacy risks and system vulnerabilities.
GitHub 0 stars Velocity flat History 1 snapshot LLM Security Apr 14 Pending High viability
Socrates Loss: Unifying Confidence Calibration and Classification by Leveraging the Unknown Build Now
A unified loss function that improves classification accuracy and confidence calibration by leveraging uncertainty.
GitHub 2 stars Velocity flat History 1 snapshot Model Calibration Apr 14 Pending High viability
Scaffold-Conditioned Preference Triplets for Controllable Molecular Optimization with Large Language Models Build Now
A pipeline for controllable molecular optimization using LLMs that preserves scaffold integrity and improves drug discovery efficiency.
GitHub 1866 stars Velocity flat History 1 snapshot Drug Discovery AI Apr 14 Pending High viability
MODIX: A Training-Free Multimodal Information-Driven Positional Index Scaling for Vision-Language Models Build Now
MODIX is a training-free framework that scales positional indices in Vision-Language Models based on information density, improving multimodal reasoning.
GitHub 713 stars Velocity flat History 1 snapshot Vision-Language Models Apr 14 Pending High viability
GF-Score: Certified Class-Conditional Robustness Evaluation with Fairness Guarantees Build Now
GF-Score provides certified class-conditional robustness evaluation with fairness guarantees, enabling attack-free auditing of neural network vulnerabilities.
GitHub 0 stars Velocity flat History 1 snapshot Robustness Evaluation Apr 14 Pending High viability
LLM-Based Automated Diagnosis Of Integration Test Failures At Google Build Now
AutoDebug leverages LLMs to diagnose integration test failures, enhancing developer productivity at scale.
GitHub stars n/a Velocity flat History 1 snapshot AI-powered Software Development Tools Apr 13 Code High viability
MVAdapt: Zero-Shot Multi-Vehicle Adaptation for End-to-End Autonomous Driving Build Now
A physics-conditioned framework that adapts autonomous driving models to different vehicle dynamics for improved transferability.
GitHub stars n/a Velocity flat History pending Autonomous Driving Apr 13 Pending High viability
QuarkMedSearch: A Long-Horizon Deep Search Agent for Exploring Medical Intelligence Build Now
QuarkMedSearch is a long-horizon deep search agent for Chinese medical intelligence, achieving state-of-the-art performance with a novel data construction and training strategy.
GitHub 1866 stars Velocity flat History 1 snapshot Agents Apr 14 Pending High viability
Modeling Co-Pilots for Text-to-Model Translation Build Now
A suite of LLM co-pilots and a unified dataset for translating natural language into formal models for optimization and satisfaction problems.
GitHub stars n/a Velocity flat History 1 snapshot Combinatorial Optimization Modeling Apr 14 Pending High viability
SEATrack: Simple, Efficient, and Adaptive Multimodal Tracker Build Now
SEATrack is a multimodal tracker that uses AMG-LoRA and HMoE to achieve a balance between performance and efficiency in cross-modal fusion for tracking tasks.
GitHub 1 star Velocity flat History 1 snapshot Multimodal Tracking Apr 14 Pending High viability
Rethinking Satellite Image Restoration for Onboard AI: A Lightweight Learning-Based Approach Build Now
A lightweight convolutional network for onboard satellite image restoration achieves competitive quality and significantly reduces latency, enabling real-time AI applications in space.
GitHub 713 stars Velocity flat History 1 snapshot Computer Vision Apr 14 Pending High viability
SpanKey: Dynamic Key Space Conditioning for Neural Network Access Control Watch
A lightweight method to control AI model access by conditioning activations on secret keys, without encrypting model weights.
GitHub 0 stars Velocity flat History 1 snapshot AI Security Apr 14 Pending
HintMR: Eliciting Stronger Mathematical Reasoning in Small Language Models Build Now
A hint-assisted reasoning framework that uses cooperative small language models to improve mathematical problem-solving.
GitHub stars n/a Velocity flat History 1 snapshot LLM Reasoning Apr 14 Code High viability
FRTSearch: Unified Detection and Parameter Inference of Fast Radio Transients using Instance Segmentation Build Now
An end-to-end AI framework for real-time detection and characterization of cosmic radio signals, significantly reducing false positives and increasing speed.
GitHub stars n/a Velocity flat History 1 snapshot Astronomy AI Apr 14 Code High viability
WiseOWL: A Methodology for Evaluating Ontological Descriptiveness and Semantic Correctness for Ontology Reuse and Ontology Recommendations Build Now
WiseOWL is a methodology and Streamlit app for evaluating and recommending ontologies based on descriptiveness and semantic correctness.
GitHub stars n/a Velocity flat History 1 snapshot Ontology Evaluation Apr 13 Code High viability
GoodPoint: Learning Constructive Scientific Paper Feedback from Author Responses Build Now
A novel training methodology for LLMs that learns to generate constructive scientific paper feedback by leveraging author responses, significantly improving feedback quality and author perception.
GitHub stars n/a Velocity flat History 1 snapshot LLM Agents Apr 13 Code High viability
IAD-Unify: A Region-Grounded Unified Model for Industrial Anomaly Segmentation, Understanding, and Generation Build Now
A unified model for industrial anomaly detection, offering enhanced segmentation, understanding, and generation in manufacturing processes.
GitHub stars n/a Velocity flat History 1 snapshot AI-powered Industrial Solutions Apr 14 Code High viability
PAL: Personal Adaptive Learner Build Now
PAL is an AI platform that transforms lecture videos into interactive learning experiences, dynamically adapting to learners' understanding with personalized summaries.
GitHub stars n/a Velocity flat History 1 snapshot AI Education Apr 14 Code High viability
ARGOS: Who, Where, and When in Agentic Multi-Camera Person Search Build Now
ARGOS is a benchmark and framework for agentic multi-camera person search, enabling agents to reason, question, and utilize spatio-temporal tools under information asymmetry.
GitHub 713 stars Velocity flat History 1 snapshot Agentic Multi-Camera Person Search Apr 14 Pending High viability
Topology-Aware Reasoning over Incomplete Knowledge Graph with Graph-Based Soft Prompting Watch
A graph-based soft prompting framework for multi-hop Knowledge Graph Question Answering that uses GNNs to reason over subgraphs and reduce sensitivity to incomplete KGs.
GitHub 0 stars Velocity flat History 1 snapshot Knowledge Graph QA Apr 14 Pending
Human-Centric Topic Modeling with Goal-Prompted Contrastive Learning and Optimal Transport Build Now
Human-centric topic modeling that uses LLM-based prompting and contrastive learning with optimal transport to produce goal-oriented topics.
GitHub stars n/a Velocity flat History 1 snapshot LLM Applications Apr 14 Code High viability
LASA: Language-Agnostic Semantic Alignment at the Semantic Bottleneck for LLM Safety Build Now
A language-agnostic method for LLM safety that anchors alignment in the model's semantic bottleneck, significantly reducing attack success rates across languages.
GitHub stars n/a Velocity flat History pending LLM Safety Apr 13 Code High viability
GCA Framework: A Gulf-Grounded Dataset and Agentic Pipeline for Climate Decision Support Build Now
A Gulf-focused multimodal dataset and agentic pipeline for climate decision support, integrating geospatial tools and domain-specific knowledge to improve LLM reliability.
GitHub stars n/a Velocity flat History 1 snapshot Climate AI Apr 14 Code High viability
BEAM: Bi-level Memory-adaptive Algorithmic Evolution for LLM-Powered Heuristic Design Build Now
BEAM is a bi-level evolutionary algorithm that designs high-level algorithmic structures for LLM-powered heuristic design, significantly outperforming existing methods in complex optimization problems.
GitHub stars n/a Velocity flat History 1 snapshot LLM Optimization Apr 14 Code High viability
INDOTABVQA: A Benchmark for Cross-Lingual Table Understanding in Bahasa Indonesia Documents Build Now
A cross-lingual table understanding benchmark for Bahasa Indonesia documents, with fine-tuning insights and open-source models.
GitHub stars n/a Velocity flat History 1 snapshot Multilingual Document Understanding Apr 13 Code High viability
Frugal Knowledge Graph Construction with Local LLMs: A Zero-Shot Pipeline, Self-Consistency and Wisdom of Artificial Crowds Watch
A zero-shot pipeline for frugal knowledge graph construction using local LLMs, achieving competitive results with consumer hardware.
GitHub stars n/a Velocity flat History pending Knowledge Graph Construction Apr 13 Pending
AutoSurrogate: An LLM-Driven Multi-Agent Framework for Autonomous Construction of Deep Learning Surrogate Models in Subsurface Flow Build Now
An LLM-driven multi-agent framework that enables domain scientists to autonomously build high-quality deep learning surrogate models for subsurface flow simulations using natural language.
GitHub stars n/a Velocity flat History 1 snapshot AI for Science Apr 13 Code High viability
Back to Basics: Let Conversational Agents Remember with Just Retrieval and Generation Build Now
A minimalist conversational memory framework using only retrieval and generation, addressing signal sparsity and redundancy for robust long-term dialogue management.
GitHub stars n/a Velocity flat History pending Conversational AI Apr 13 Code High viability
OpenTME: An Open Dataset of AI-powered H&E Tumor Microenvironment Profiles from TCGA Build Now
OpenTME is a large open dataset of AI-generated tumor microenvironment profiles from histopathology images, released for research.
GitHub stars n/a Velocity flat History 1 snapshot Medical Imaging Datasets Apr 13 Code High viability
X-VC: Zero-shot Streaming Voice Conversion in Codec Space Build Now
X-VC enables real-time zero-shot voice conversion to recreate any voice instantly using a neural codec.
GitHub stars n/a Velocity flat History 1 snapshot Voice Conversion Technology Apr 14 Code High viability
CascadeDebate: Multi-Agent Deliberation for Cost-Aware LLM Cascades Build Now
A multi-agent deliberation system for LLM cascades that uses consensus-driven ensembles to resolve ambiguities internally, reducing costs and improving accuracy.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 14 Code High viability
Frontier-Eng: Benchmarking Self-Evolving Agents on Real-World Engineering Tasks with Generative Optimization Build Now
A new benchmark for self-evolving AI agents that iteratively optimize engineering designs using simulators and verifiers.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 14 Code High viability
Unveiling the Surprising Efficacy of Navigation Understanding in End-to-End Autonomous Driving Build Now
A new framework and dataset for autonomous driving that significantly improves navigation understanding by fusing global and local planning, achieving state-of-the-art results without auxiliary losses.
GitHub stars n/a Velocity flat History 1 snapshot Autonomous Driving Apr 14 Code High viability
Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning Build Now
An open-source, efficient hybrid Mamba-Transformer model for agentic reasoning that significantly outperforms existing models in inference throughput.
GitHub stars n/a Velocity flat History pending LLM Training Apr 14 Code High viability
Beyond Prompt: Fine-grained Simulation of Cognitively Impaired Standardized Patients via Stochastic Steering Build Now
A system for fine-grained simulation of cognitively impaired standardized patients using steering vectors and stochastic token modulation for precise severity control.
GitHub stars n/a Velocity flat History 1 snapshot Clinical Simulation Apr 14 Code High viability
Detecting and refurbishing ground truth errors during training of deep learning-based echocardiography segmentation models Build Now
A novel strategy to detect and correct errors in medical image segmentation training data, improving model performance under challenging conditions.
GitHub stars n/a Velocity flat History 1 snapshot Medical AI Apr 14 Code High viability
RPRA: Predicting an LLM-Judge for Efficient but Performant Inference Build Now
Enables smaller LLMs to predict their own performance limitations, paving the way for more efficient, self-aware AI systems.
GitHub stars n/a Velocity flat History 1 snapshot LLM Efficiency Apr 14 Code High viability
Policy Split: Incentivizing Dual-Mode Exploration in LLM Reinforcement with Dual-Mode Entropy Regularization Build Now
A novel reinforcement learning paradigm for LLMs that bifurcates policy into normal and high-entropy modes to improve exploration without sacrificing accuracy.
GitHub stars n/a Velocity flat History pending LLM Reinforcement Learning Apr 13 Code High viability
Beyond Factual Grounding: The Case for Opinion-Aware Retrieval-Augmented Generation Build Now
An Opinion-Aware RAG architecture enhances LLM synthesis of subjective content by preserving opinion diversity, addressing limitations of current factual RAG systems.
GitHub stars n/a Velocity flat History 1 snapshot RAG Apr 13 Code High viability
Every Picture Tells a Dangerous Story: Memory-Augmented Multi-Agent Jailbreak Attacks on VLMs Build Now
MemJack is a memory-augmented multi-agent framework that uses visual semantics to orchestrate automated jailbreak attacks on VLMs, achieving high success rates and releasing a comprehensive benchmark dataset.
GitHub stars n/a Velocity flat History 1 snapshot VLM Security / Agents Apr 14 Code High viability
SIR-Bench: Evaluating Investigation Depth in Security Incident Response Agents Build Now
A benchmark and framework for evaluating the investigation depth of security incident response agents, distinguishing genuine forensic analysis from simple alert parroting.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 13 Code High viability
MultiDocFusion: Hierarchical and Multimodal Chunking Pipeline for Enhanced RAG on Long Industrial Documents Build Now
A multimodal chunking pipeline that leverages document structure and vision to significantly improve RAG performance on long industrial documents.
GitHub stars n/a Velocity flat History pending RAG Enhancement Apr 14 Code High viability
Beyond Scores: Diagnostic LLM Evaluation via Fine-Grained Abilities Build Now
A cognitive diagnostic framework for LLMs that moves beyond single scores to provide fine-grained ability assessments across multiple scientific domains, enabling targeted improvement and selection.
GitHub stars n/a Velocity flat History 1 snapshot LLM Evaluation Apr 14 Code High viability
How Transformers Learn to Plan via Multi-Token Prediction Build Now
This paper reveals that multi-token prediction in Transformers enables more robust planning by inducing a reverse reasoning process, outperforming next-token prediction on various reasoning benchmarks.
GitHub stars n/a Velocity flat History 1 snapshot LLM Reasoning Apr 13 Code High viability
Round-Trip Translation Reveals What Frontier Multilingual Benchmarks Miss Watch
Round-trip translation reveals limitations of current multilingual benchmarks and introduces 'Lost in Translation' (LiT) for realistic evaluation of multilingual LLMs.
GitHub stars n/a Velocity flat History 1 snapshot Multilingual LLMs Apr 14 Pending
A Two-Stage LLM Framework for Accessible and Verified XAI Explanations Build Now
A two-stage LLM framework that verifies and refines AI explanations for accuracy and accessibility, making XAI systems more trustworthy.
GitHub stars n/a Velocity flat History 1 snapshot XAI Apr 14 Code High viability
PromptEcho: Annotation-Free Reward from Vision-Language Models for Text-to-Image Reinforcement Learning Build Now
A novel reward mechanism for text-to-image models that eliminates the need for human annotation or reward model training, improving prompt following capabilities.
GitHub stars n/a Velocity flat History 1 snapshot Generative AI Apr 14 Code High viability
Intelligent ROI-Based Vehicle Counting Framework for Automated Traffic Monitoring Build Now
An intelligent framework automatically identifies optimal regions for vehicle counting in traffic videos, achieving near-perfect accuracy and four times faster processing.
GitHub stars n/a Velocity flat History 1 snapshot Traffic Monitoring Apr 14 Code High viability
Leveraging Weighted Syntactic and Semantic Context Assessment Summary (wSSAS) Towards Text Categorization Using LLMs Build Now
A deterministic framework that enhances LLM text categorization accuracy and reproducibility by prioritizing high-value semantic features.
GitHub stars n/a Velocity flat History 1 snapshot LLM for Text Categorization Apr 13 Code High viability
TRUST Agents: A Collaborative Multi-Agent Framework for Fake News Detection, Explainable Verification, and Logic-Aware Claim Reasoning Build Now
TRUST Agents is a multi-agent framework for explainable fake news detection and claim reasoning, offering improved interpretability and evidence transparency.
GitHub stars n/a Velocity flat History 1 snapshot Fact Verification & Reasoning Apr 14 Code High viability
Ride the Wave: Precision-Allocated Sparse Attention for Smooth Video Generation Build Now
A training-free framework for efficient and temporally smooth video generation by dynamically allocating computation budget and using stochastic attention routing.
GitHub stars n/a Velocity flat History 1 snapshot Video Generation Apr 14 Code High viability
Long-Horizon Plan Execution in Large Tool Spaces through Entropy-Guided Branching Build Now
A new benchmark and entropy-guided search algorithm to enable LLM agents to execute long-horizon plans in large tool spaces.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 13 Code High viability
Benchmarking Deflection and Hallucination in Large Vision-Language Models Build Now
A benchmark and data curation pipeline for evaluating deflection and hallucination in large vision-language models, focusing on retrieval-dependent samples and insufficient evidence scenarios.
GitHub stars n/a Velocity flat History 1 snapshot Vision-Language Models Apr 13 Code High viability
RACF: A Resilient Autonomous Car Framework with Object Distance Correction Build Now
A resilient framework for autonomous cars that uses sensor fusion and a novel correction algorithm to ensure accurate object distance estimation in real-time.
GitHub stars n/a Velocity flat History 1 snapshot Autonomous Vehicle Perception Apr 14 Code High viability
Decoding by Perturbation: Mitigating MLLM Hallucinations via Dynamic Textual Perturbation Build Now
A training-free framework that mitigates multimodal LLM hallucinations by dynamically perturbing text to stabilize visual grounding.
GitHub stars n/a Velocity flat History 1 snapshot LLM Hallucination Mitigation Apr 14 Code High viability
Black-Box Optimization From Small Offline Datasets via Meta Learning with Synthetic Tasks Build Now
A meta-learning framework that generates synthetic tasks to enable black-box optimization from small, offline datasets.
GitHub stars n/a Velocity flat History 1 snapshot Optimization AI Apr 14 Code High viability
BID-LoRA: A Parameter-Efficient Framework for Continual Learning and Unlearning Build Now
BID-LoRA is a parameter-efficient framework for continual learning and unlearning that enables precise knowledge deletion and efficient integration of new knowledge with minimal leakage.
GitHub stars n/a Velocity flat History 1 snapshot Continual Learning Apr 14 Code High viability
From Plan to Action: How Well Do Agents Follow the Plan? Build Now
This research systematically analyzes how well AI agents follow instructed plans, revealing critical insights for improving autonomous reasoning and task completion.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 13 Code High viability
Robust Explanations for User Trust in Enterprise NLP Systems Build Now
A framework for evaluating the robustness of LLM explanations in black-box enterprise systems, enabling better user trust and compliance.
GitHub stars n/a Velocity flat History 1 snapshot LLM Explainability Apr 13 Code High viability
MAST: Mask-Guided Attention Mass Allocation for Training-Free Multi-Style Transfer Build Now
A training-free framework for multi-style image transfer that uses mask-guided attention to seamlessly blend multiple styles without artifacts or structural inconsistencies.
GitHub stars n/a Velocity flat History 1 snapshot Generative Image Apr 14 Code High viability
Visual Preference Optimization with Rubric Rewards Build Now
A framework for improving visual preference optimization in multimodal AI using instance-specific rubrics, outperforming existing methods and approaching GPT-5.4 quality.
GitHub stars n/a Velocity flat History 1 snapshot Generative AI Apr 14 Code High viability
KumoRFM-2: Scaling Foundation Models for Relational Learning Build Now
A foundation model for relational data that natively processes connected tables, outperforming supervised methods in few-shot learning and scaling to billion-scale datasets.
GitHub stars n/a Velocity flat History 1 snapshot Relational Foundation Models Apr 14 Code High viability
OSC: Hardware Efficient W4A4 Quantization via Outlier Separation in Channel Dimension Build Now
A hardware-efficient framework for LLM quantization that suppresses activation outliers to maintain accuracy and achieve significant speedups on modern AI accelerators.
GitHub stars n/a Velocity flat History 1 snapshot LLM Optimization Apr 14 Code High viability
MISID: A Multimodal Multi-turn Dataset for Complex Intent Recognition in Strategic Deception Games Build Now
MISID, a multimodal dataset and FRACTAM framework for complex intent recognition in strategic deception games, addressing deficiencies in current MLLMs for long-context discourse and cross-modal synergy.
GitHub stars n/a Velocity flat History 1 snapshot Multimodal AI Apr 14 Code High viability
Heuristic Classification of Thoughts Prompting (HCoT): Integrating Expert System Heuristics for Structured Reasoning into Large Language Models Build Now
A novel prompting schema that integrates expert system heuristics to improve LLM reasoning and problem-solving efficiency.
GitHub stars n/a Velocity flat History 1 snapshot LLM Reasoning Apr 14 Code High viability
BayMOTH: Bayesian optiMizatiOn with meTa-lookahead -- a simple approacH Build Now
BayMOTH is a novel meta-Bayesian optimization approach that intelligently uses related-task information or falls back to lookahead for efficient sequential optimization.
GitHub stars n/a Velocity flat History 1 snapshot Bayesian Optimization Apr 13 Code High viability
Narrative-Driven Paper-to-Slide Generation via ArcDeck Build Now
A multi-agent framework for generating presentations from academic papers by reconstructing narrative flow, with a new benchmark.
GitHub stars n/a Velocity flat History 1 snapshot Document Summarization Apr 13 Code High viability
The Long-Horizon Task Mirage? Diagnosing Where and Why Agentic Systems Break Build Now
A diagnostic benchmark and evaluation pipeline for understanding and improving LLM agent performance on long-horizon tasks.
GitHub stars n/a Velocity flat History 1 snapshot Agentic Systems Apr 13 Code High viability
EgoEsportsQA: An Egocentric Video Benchmark for Perception and Reasoning in Esports Build Now
A new benchmark and dataset for evaluating Video-LLMs in high-velocity esports environments, revealing significant performance gaps and guiding future development.
GitHub stars n/a Velocity flat History 1 snapshot Video LLMs Apr 14 Code High viability
Euler-inspired Decoupling Neural Operator for Efficient Pansharpening Build Now
A physics-inspired neural operator efficiently synthesizes high-resolution multispectral images from panchromatic and low-resolution multispectral inputs.
GitHub stars n/a Velocity flat History 1 snapshot Image Enhancement Apr 14 Code High viability
Preventing Safety Drift in Large Language Models via Coupled Weight and Activation Constraints Build Now
A new method that simultaneously constrains LLM weights and activations to prevent safety degradation during fine-tuning.
GitHub stars n/a Velocity flat History 1 snapshot LLM Safety Apr 14 Code High viability
Calibration-Aware Policy Optimization for Reasoning LLMs Build Now
A novel policy optimization method that jointly improves LLM reasoning accuracy and calibration, mitigating overconfidence and hallucination.
GitHub stars n/a Velocity flat History 1 snapshot LLM Reasoning Apr 14 Code High viability
TimeSAF: Towards LLM-Guided Semantic Asynchronous Fusion for Time Series Forecasting Build Now
A novel time series forecasting framework that uses LLM-guided semantic asynchronous fusion to improve accuracy and generalization by decoupling semantic guidance from temporal dynamics.
GitHub stars n/a Velocity flat History 1 snapshot Time Series Forecasting Apr 14 Code High viability
Transferable Expertise for Autonomous Agents via Real-World Case-Based Learning Build Now
A case-based learning framework for LLM agents that transfers prior task experience to new, complex real-world scenarios, improving structured analysis and performance.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 14 Code High viability
Coding-Free and Privacy-Preserving MCP Framework for Clinical Agentic Research Intelligence System Build Now
An AI system that automates clinical research workflows, from planning to report generation, without requiring coding or direct patient data access.
GitHub stars n/a Velocity flat History 1 snapshot Clinical AI Agents Apr 14 Code High viability
LLMs Struggle with Abstract Meaning Comprehension More Than Expected Watch
A bidirectional attention classifier that improves LLMs' abstract meaning comprehension by mimicking human cognitive strategies.
GitHub 1 star Velocity flat History 1 snapshot LLM Comprehension Apr 13 Pending
Interpretable DNA Sequence Classification via Dynamic Feature Generation in Decision Trees Build Now
A framework that adaptively generates biologically-informed features for decision trees, enabling interpretable and predictive DNA sequence analysis.
GitHub stars n/a Velocity flat History 1 snapshot Interpretable AI for Genomics Apr 13 Code High viability
FastGrasp: Learning-based Whole-body Control method for Fast Dexterous Grasping with Mobile Manipulators Build Now
FastGrasp is a learning-based framework for fast, dexterous grasping with mobile manipulators, integrating grasp guidance, whole-body control, and tactile feedback for robust manipulation.
GitHub stars n/a Velocity flat History 1 snapshot Robotics Apr 14 Code High viability
Policy-Invisible Violations in LLM-Based Agents Build Now
A framework for LLM agents that enforces organizational policies by simulating world-state changes, significantly outperforming content-only baselines.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 14 Code High viability
RePAIR: Interactive Machine Unlearning through Prompt-Aware Model Repair Watch
Interactive machine unlearning for language models to empower users with personal data control.
GitHub stars n/a Velocity flat History 1 snapshot AI Ethics and Privacy Apr 14 Code
A hierarchical spatial-aware algorithm with efficient reinforcement learning for human-robot task planning and allocation in production Build Now
A hierarchical spatial-aware algorithm with efficient reinforcement learning for human-robot task planning and allocation in production.
GitHub stars n/a Velocity flat History 1 snapshot Robotics Apr 14 Code High viability
Human-Inspired Context-Selective Multimodal Memory for Social Robots Build Now
Develops a human-inspired multimodal memory system for social robots to enable personalized, context-aware interactions.
GitHub stars n/a Velocity flat History 1 snapshot Robotics Memory Systems Apr 13 Code High viability
Security and Resilience in Autonomous Vehicles: A Proactive Design Approach Build Now
A proactive design approach for autonomous vehicles integrating redundancy and anomaly detection to ensure security and operational continuity against cyberattacks.
GitHub stars n/a Velocity flat History 1 snapshot Autonomous Vehicle Security Apr 14 Code High viability
The A-R Behavioral Space: Execution-Level Profiling of Tool-Using Language Model Agents in Organizational Deployment Build Now
A new framework for evaluating LLM agents by profiling their execution-level behavior and refusal signals, crucial for safe organizational deployment.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 13 Code High viability
Clustering-Enhanced Domain Adaptation for Cross-Domain Intrusion Detection in Industrial Control Systems Build Now
A clustering-enhanced domain adaptation method for industrial control systems that significantly improves unknown attack detection with limited labeled data.
GitHub stars n/a Velocity flat History 1 snapshot Cybersecurity AI Apr 14 Code High viability
Cooperative Memory Paging with Keyword Bookmarks for Long-Horizon LLM Conversations Build Now
A novel LLM memory system using keyword bookmarks and a recall tool to dramatically improve long-conversation recall, outperforming existing methods.
GitHub stars n/a Velocity flat History 1 snapshot LLM Memory Management Apr 14 Code High viability
Towards grounded autonomous research: an end-to-end LLM mini research loop on published computational physics Build Now
An end-to-end LLM research loop autonomously reproduces, critiques, and extends computational physics papers, demonstrating significant potential for accelerating scientific discovery.
GitHub stars n/a Velocity flat History 1 snapshot AI Research Automation Apr 14 Code High viability
Operationalising the Right to be Forgotten in LLMs: A Lightweight Sequential Unlearning Framework for Privacy-Aligned Deployment in Politically Sensitive Environments Build Now
A lightweight sequential unlearning framework for LLMs that enables privacy-aligned deployment by selectively suppressing sensitive data without compromising general language capabilities.
GitHub stars n/a Velocity flat History 1 snapshot LLM Privacy Apr 14 Code High viability
Drawing on Memory: Dual-Trace Encoding Improves Cross-Session Recall in LLM Agents Build Now
LLM agents with dual-trace memory encoding significantly improve cross-session recall and temporal reasoning by pairing facts with contextual scene traces.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 14 Code High viability
Curvelet-Based Frequency-Aware Feature Enhancement for Deepfake Detection Build Now
A novel deepfake detection method using Curvelet Transform and attention mechanisms to enhance feature robustness against compression.
GitHub stars n/a Velocity flat History 1 snapshot Deepfake Detection Apr 13 Code High viability
VFA: Relieving Vector Operations in Flash Attention with Global Maximum Pre-computation Watch
Vector Relieved Flash Attention (VFA) optimizes attention computation by reducing vector operations, achieving significant speedups on modern hardware without performance loss.
GitHub stars n/a Velocity flat History 1 snapshot LLM Optimization Apr 14 Code
One Token Away from Collapse: The Fragility of Instruction-Tuned Helpfulness Watch
Reveals that instruction-tuned LLMs are fragile to simple lexical constraints, leading to significant response collapse, and identifies a planning failure as the root cause.
GitHub stars n/a Velocity flat History 1 snapshot LLM Robustness Apr 14 Code
Evaluating Relational Reasoning in LLMs with REL Ignore
A new benchmark framework that measures relational reasoning in LLMs by varying the complexity of entity binding, revealing consistent performance degradation.
GitHub stars n/a Velocity flat History 1 snapshot LLM Evaluation Apr 14 Pending
Cycle-Consistent Search: Question Reconstructability as a Proxy Reward for Search Agent Training Watch
A gold-supervision-free framework for training search agents using cycle-consistency to reconstruct questions from search trajectories.
GitHub stars n/a Velocity flat History 1 snapshot Search Agent Training Apr 14 Code
Is Vibe Coding the Future? An Empirical Assessment of LLM Generated Codes for Construction Safety Watch
Empirical assessment of LLM-generated code for construction safety reveals significant silent failure rates, highlighting the need for deterministic wrappers.
GitHub stars n/a Velocity flat History 1 snapshot LLM Safety Apr 14 Code
TEMPLATEFUZZ: Fine-Grained Chat Template Fuzzing for Jailbreaking and Red Teaming LLMs Watch
A fuzzing framework to find vulnerabilities in LLM chat templates for jailbreaking and red teaming.
GitHub stars n/a Velocity flat History 1 snapshot LLM Security Apr 14 High viability
When Does Data Augmentation Help? Evaluating LLM and Back-Translation Methods for Hausa and Fongbe NLP Watch
Evaluates LLM and back-translation data augmentation for Hausa and Fongbe NLP, finding task-specific effectiveness for NER and POS tagging.
GitHub stars n/a Velocity flat History 1 snapshot Low-Resource NLP Apr 14 Code
Mining Large Language Models for Low-Resource Language Data: Comparing Elicitation Strategies for Hausa and Fongbe Watch
Extracting low-resource language data from LLMs using strategic prompting, with released corpora and code.
GitHub stars n/a Velocity flat History 1 snapshot LLM Data Mining Apr 14 High viability
Fully Homomorphic Encryption on Llama 3 model for privacy preserving LLM inference Watch
Integrates post-quantum homomorphic encryption into Llama 3 inference for privacy-preserving LLM applications with high accuracy and reasonable latency.
GitHub stars n/a Velocity flat History 1 snapshot Privacy-Preserving AI Apr 14 Code
Artificial Intelligence for Modeling and Simulation of Mixed Automated and Human Traffic Watch
A survey of AI methods for modeling mixed automated and human traffic in simulation, identifying gaps and future directions.
GitHub stars n/a Velocity flat History 1 snapshot Simulation Apr 14 Code
Latent patterns of urban mixing in mobility analysis across five global cities Watch
Leveraging large-scale travel surveys and graph neural networks, this research uncovers nuanced patterns of urban social mixing and place exposure across five global cities, revealing that mobility has a greater influence than sociodemographics.
GitHub stars n/a Velocity flat History 1 snapshot Urban Mobility Analysis Apr 14 Code
Not All Forgetting Is Equal: Architecture-Dependent Retention Dynamics in Fine-Tuned Image Classifiers Watch
Analyzing architecture-dependent sample forgetting in fine-tuned image classifiers to inform data pruning and curriculum design.
GitHub stars n/a Velocity flat History pending Model Forgetting Analysis Apr 13 Code
Lit2Vec: A Reproducible Workflow for Building a Legally Screened Chemistry Corpus from S2ORC for Downstream Retrieval and Text Mining Ignore
A reproducible workflow for building and validating a legally screened chemistry corpus from S2ORC, enriched with embeddings and annotations for downstream text mining.
GitHub stars n/a Velocity flat History 1 snapshot Data Corpus Construction Apr 14 Code
Towards Platonic Representation for Table Reasoning: A Foundation for Permutation-Invariant Retrieval Watch
A hypothesis and framework for permutation-invariant table representation learning to build robust table retrieval systems.
GitHub stars n/a Velocity flat History 1 snapshot Table Reasoning Apr 13 Code
SpecBound: Adaptive Bounded Self-Speculation with Layer-wise Confidence Calibration Watch
Accelerating LLM inference with adaptive bounded self-speculation and layer-wise confidence calibration.
GitHub stars n/a Velocity flat History 1 snapshot LLM Inference Apr 14
DeepTest Tool Competition 2026: Benchmarking an LLM-Based Automotive Assistant Watch
Reports on the first LLM testing competition, focused on benchmarking an LLM-based automotive assistant for failures in retrieving car manual information.
GitHub stars n/a Velocity flat History 1 snapshot LLM Testing / Automotive Apr 14 Code
Information-Theoretic Optimization for Task-Adapted Compressed Sensing Magnetic Resonance Imaging Ignore
An information-theoretic framework for task-adapted compressed sensing MRI that enables probabilistic inference for uncertainty prediction and adaptive sampling for clinical tasks.
GitHub stars n/a Velocity flat History 1 snapshot Medical AI Apr 14 Code
LogicEval: A Systematic Framework for Evaluating Automated Repair Techniques for Logical Vulnerabilities in Real-World Software Ignore
Introduces LogicEval, a framework and dataset for evaluating automated repair techniques for logical vulnerabilities in real-world software, highlighting limitations of current LLM-based approaches.
GitHub stars n/a Velocity flat History 1 snapshot Automated Program Repair Apr 14 Code
Thermodynamic Liquid Manifold Networks: Physics-Bounded Deep Learning for Solar Forecasting in Autonomous Off-Grid Microgrids Watch
This research introduces a thermodynamically consistent deep learning network for solar forecasting in off-grid microgrids, eliminating nocturnal generation anomalies and achieving zero-lag synchronization.
GitHub stars n/a Velocity flat History 1 snapshot Forecasting Apr 13
Local-Splitter: A Measurement Study of Seven Tactics for Reducing Cloud LLM Token Usage on Coding-Agent Workloads Watch
A measurement study of seven tactics to reduce cloud LLM token usage for coding agents, offering workload-specific strategies for cost savings.
GitHub stars n/a Velocity flat History 1 snapshot LLM Cost Optimization Apr 14
Observing the unobserved confounding through its effects: toward randomized trial-like estimates from real-world survival data Ignore
A framework to infer and balance latent prognostic factors from real-world survival data to improve treatment-effect estimation.
GitHub stars n/a Velocity flat History 1 snapshot Medical AI Apr 13 Code
Cognition-Inspired Dual-Stream Semantic Enhancement for Vision-Based Dynamic Emotion Modeling Ignore
A cognition-inspired dual-stream model that enhances dynamic emotion recognition by integrating semantic and contextual knowledge with facial dynamics, achieving state-of-the-art performance.
GitHub stars n/a Velocity flat History 1 snapshot Emotion AI Apr 14 Code
A longitudinal health agent framework Ignore
A longitudinal health agent framework designed for sustained user engagement and personalized decision-making in health tasks.
GitHub stars n/a Velocity flat History 1 snapshot Health Agents Apr 13 Code
DoseRAD2026 Challenge dataset: AI accelerated photon and proton dose calculation for radiotherapy Ignore
A new dataset and challenge for AI-accelerated photon and proton dose calculation in radiotherapy, enabling development of faster and more accurate dose estimation methods.
GitHub stars n/a Velocity flat History 1 snapshot Medical AI Apr 14 Code
ReflectCAP: Detailed Image Captioning with Reflective Memory Watch
A multi-agent system that uses structured reflection notes to improve the factuality and coverage of image captions generated by large vision-language models.
Image Captioning Apr 14
Memory as Metabolism: A Design for Companion Knowledge Systems Ignore
A design for companion knowledge systems that proposes a governance profile for personal LLM memory wikis to prevent entrenchment and mirror user operational dimensions.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 13 Code
VISTA: Validation-Informed Trajectory Adaptation via Self-Distillation Ignore
A self-distillation framework that improves model robustness and generalization by enforcing consistency along the optimization trajectory.
GitHub stars n/a Velocity flat History 1 snapshot LLM Training Apr 13 Code
Bilevel Late Acceptance Hill Climbing for the Electric Capacitated Vehicle Routing Problem Ignore
A bilevel optimization algorithm for the Electric Capacitated Vehicle Routing Problem that separates routing and charging decisions to accelerate convergence and achieve near-optimal solutions.
GitHub stars n/a Velocity flat History 1 snapshot Optimization Algorithms Apr 14 Code
Designing Reliable LLM-Assisted Rubric Scoring for Constructed Responses: Evidence from Physics Exams Watch
Designing reliable LLM-assisted rubric scoring for constructed responses in STEM exams, focusing on rubric design.
GitHub stars n/a Velocity flat History 1 snapshot AI Education Apr 14
CycloneMAE: A Scalable Multi-Task Learning Model for Global Tropical Cyclone Probabilistic Forecasting Ignore
CycloneMAE is a scalable multi-task learning model for global tropical cyclone probabilistic forecasting that outperforms NWP systems.
GitHub stars n/a Velocity flat History 1 snapshot Weather Forecasting AI Apr 14
When to Forget: A Memory Governance Primitive Watch
This paper introduces Memory Worth, a lightweight primitive for agent memory governance that tracks memory success/failure co-occurrence to improve decision-making.
GitHub stars n/a Velocity flat History 1 snapshot Agent Memory Apr 13
ARGen: Affect-Reinforced Generative Augmentation towards Vision-based Dynamic Emotion Perception Ignore
A framework that generates realistic facial expressions for training AI models to better perceive emotions from video.
GitHub stars n/a Velocity flat History 1 snapshot Generative AI for Vision Apr 14
LLM-Guided Prompt Evolution for Password Guessing Ignore
Optimizing LLM prompts using evolutionary computation to enhance password guessing capabilities for security auditing.
GitHub stars n/a Velocity flat History 1 snapshot LLM Security Apr 14
Efficient Semantic Image Communication for Traffic Monitoring at the Edge Ignore
Develops semantic image communication pipelines for edge traffic monitoring that drastically reduce data transmission costs by replacing full images with compact representations.
GitHub stars n/a Velocity flat History 1 snapshot Edge AI / Computer Vision Apr 14
Development, Evaluation, and Deployment of a Multi-Agent System for Thoracic Tumor Board Ignore
An AI system automates patient summary generation for thoracic tumor boards, improving efficiency and accuracy in clinical practice.
GitHub stars n/a Velocity flat History 1 snapshot Medical AI Apr 14
SOAR: Self-Correction for Optimal Alignment and Refinement in Diffusion Models Ignore
Introduces SOAR, a novel post-training method for diffusion models that corrects exposure bias and improves alignment without relying on reward models.
GitHub stars n/a Velocity flat History 1 snapshot Diffusion Models Apr 14
Evaluating the Limitations of Protein Sequence Representations for Parkinson's Disease Classification Ignore
An evaluation of protein sequence representations for Parkinson's disease classification finds limited discriminative power, suggesting that more informative biological features are needed.
GitHub stars n/a Velocity flat History pending Medical AI Apr 13 Code
Representation geometry shapes task performance in vision-language modeling for CT enterography Ignore
Investigating representation geometry in vision-language models for CT enterography, finding that mean pooling and per-slice contrast are key for disease assessment and retrieval.
GitHub stars n/a Velocity flat History 1 snapshot Medical AI Apr 14
Cross-Cultural Simulation of Citizen Emotional Responses to Bureaucratic Red Tape Using LLM Agents Ignore
An interactive interface for simulating citizen emotional responses to bureaucratic red tape across cultures, aiming to improve policymaking.
GitHub stars n/a Velocity flat History 1 snapshot LLM Agents for Social Simulation Apr 14
Audio Source Separation in Reverberant Environments using $β$-divergence based Nonnegative Factorization Ignore
Improving audio source separation in reverberant environments using beta-divergence based nonnegative factorization.
GitHub stars n/a Velocity flat History 1 snapshot Audio AI Apr 14
Can AI Tools Transform Low-Demand Math Tasks? An Evaluation of Task Modification Capabilities Ignore
An evaluation of AI tools' capabilities in transforming low-demand mathematics tasks, revealing moderate success rates and distinct generative versus classification abilities.
GitHub stars n/a Velocity flat History 1 snapshot AI for Education Apr 14
AISafetyBenchExplorer: A Metric-Aware Catalogue of AI Safety Benchmarks Reveals Fragmented Measurement and Weak Benchmark Governance Ignore
AISafetyBenchExplorer is a catalogue of AI safety benchmarks that reveals fragmented measurement and weak governance, providing a structured approach for benchmark discovery and meta-evaluation.
GitHub stars n/a Velocity flat History 1 snapshot AI Safety Apr 14 Code
Elastic Net Regularization and Gabor Dictionary for Classification of Heart Sound Signals using Deep Learning Ignore
Optimizing deep learning models with elastic net regularization and Gabor dictionaries for improved heart sound signal classification.
GitHub stars n/a Velocity flat History 1 snapshot Medical AI Apr 14
Self-Monitoring Benefits from Structural Integration: Lessons from Metacognition in Continuous-Time Multi-Timescale Agents Ignore
This work shows that self-monitoring modules in reinforcement learning agents only provide benefits when structurally integrated into the decision-making pathway, not as auxiliary add-ons.
GitHub stars n/a Velocity flat History 1 snapshot Reinforcement Learning Agents Apr 13
Latent Planning Emerges with Scale Ignore
Investigating how latent planning abilities emerge and scale in Large Language Models through analysis of their internal representations.
GitHub 1866 stars Velocity flat History 1 snapshot LLM Reasoning Apr 14 Pending
EMBER: Autonomous Cognitive Behaviour from Learned Spiking Neural Network Dynamics in a Hybrid LLM Architecture Ignore
A hybrid cognitive architecture using spiking neural networks to enable autonomous, emergent reasoning and LLM actions without explicit prompting.
GitHub stars n/a Velocity flat History 1 snapshot Cognitive Architectures Apr 14
Contextual Multi-Task Reinforcement Learning for Autonomous Reef Monitoring Ignore
Exploring contextual multi-task reinforcement learning for autonomous underwater reef monitoring to improve policy reusability and robustness.
GitHub stars n/a Velocity flat History 1 snapshot Robotics Apr 14
ROSE: An Intent-Centered Evaluation Metric for NL2SQL Ignore
A new metric for evaluating Natural Language to SQL systems that focuses on user intent rather than exact SQL syntax.
GitHub stars n/a Velocity flat History 1 snapshot NL2SQL Evaluation Apr 14
Record-Remix-Replay: Hierarchical GPU Kernel Optimization using Evolutionary Search Ignore
A hierarchical optimization framework using LLM-driven evolutionary search to efficiently tune GPU kernels across various dimensions.
GPU Optimization Apr 13
Social Learning Strategies for Evolved Virtual Soft Robots Ignore
Developing social learning strategies for virtual soft robots to accelerate brain optimization by leveraging peer experience.
GitHub stars n/a Velocity flat History 1 snapshot Robotics Apr 14
LLM-Guided Semantic Bootstrapping for Interpretable Text Classification with Tsetlin Machines Ignore
A framework that transfers LLM knowledge into symbolic Tsetlin Machines for interpretable and accurate text classification without runtime LLM calls.
GitHub stars n/a Velocity flat History 1 snapshot Interpretable Text Classification Apr 14
Orthogonal Subspace Projection for Continual Machine Unlearning via SVD-Based LoRA Ignore
A novel method for continual machine unlearning that uses SVD-guided orthogonal subspace projection to prevent parameter collision and maintain model performance across sequential deletion requests.
GitHub stars n/a Velocity flat History 1 snapshot Machine Unlearning Apr 14
GAM: Hierarchical Graph-based Agentic Memory for LLM Agents Ignore
A hierarchical graph-based memory framework for LLM agents that decouples memory encoding and consolidation to improve long-term coherence and reduce noise interference.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 14
Enhancing Clustering: An Explainable Approach via Filtered Patterns Ignore
A theoretical framework for reducing redundancy in explainable clustering by formally characterizing and removing duplicate pattern representations.
GitHub stars n/a Velocity flat History 1 snapshot Explainable AI Apr 14 Code
How memory can affect collective and cooperative behaviors in an LLM-Based Social Particle Swarm Ignore
Investigating how memory length impacts cooperative behavior in LLM agents within a social particle swarm model.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 14
Continuous Knowledge Metabolism: Generating Scientific Hypotheses from Evolving Literature Ignore
A framework for continuously updating a knowledge base and generating scientific hypotheses from evolving literature.
GitHub stars n/a Velocity flat History 1 snapshot Scientific Discovery Apr 14
Safe reinforcement learning with online filtering for fatigue-predictive human-robot task planning and allocation in production Ignore
Safe reinforcement learning with online filtering for fatigue-predictive human-robot task planning and allocation in production.
GitHub stars n/a Velocity flat History 1 snapshot Robotics Apr 14
Mathematics Teachers' Interactions with a Multi-Agent System for Personalized Problem Generation Ignore
Examining teacher interactions with a multi-agent system for personalized math problem generation, highlighting areas for improvement in authenticity and fit.
GitHub stars n/a Velocity flat History 1 snapshot AI in Education Apr 13
ResBM: Residual Bottleneck Models for Low-Bandwidth Pipeline Parallelism Ignore
A novel neural network architecture designed for efficient low-bandwidth pipeline parallelism in large-scale decentralized training.
GitHub stars n/a Velocity flat History 1 snapshot LLM Training Apr 13
Aethon: A Reference-Based Replication Primitive for Constant-Time Instantiation of Stateful AI Agents Ignore
Aethon introduces a reference-based replication primitive for near-constant-time instantiation of stateful AI agents.
GitHub stars n/a Velocity flat History 1 snapshot AI Infrastructure Apr 13
From edges to meaning: Semantic line sketches as a cognitive scaffold for ancient pictograph invention Ignore
A biologically inspired digital twin of the visual hierarchy that generates contour sketches resembling ancient pictographs.
GitHub stars n/a Velocity flat History 1 snapshot Generative AI Apr 14
Narrative over Numbers: The Identifiable Victim Effect and its Amplification Under Alignment and Reasoning in Large Language Models Ignore
Investigates the Identifiable Victim Effect in LLMs, revealing how alignment and reasoning training modulate moral biases.
GitHub stars n/a Velocity flat History 1 snapshot LLM Ethics and Bias Apr 13
LIFE -- an energy efficient advanced continual learning agentic AI framework for frontier systems Ignore
A framework for energy-efficient continual learning in HPC systems using agentic AI and brain-inspired architectures.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 14
Loop Corrections to the Training and Generalization Errors of Random Feature Models Ignore
Theoretical analysis of loop corrections to training and generalization errors in random feature models.
GitHub 1866 stars Velocity flat History 1 snapshot LLM Training Apr 14 Pending
From Kinematics to Dynamics: Learning to Refine Hybrid Plans for Physically Feasible Execution Ignore
Reinforcement learning refines robotic motion plans to ensure physical feasibility, bridging the gap between planning and real-world execution.
GitHub stars n/a Velocity flat History 1 snapshot Robotics Planning Apr 14
Use of AI Tools: Guidelines to Maintain Academic Integrity in Computing Colleges Ignore
Guidelines and a formal model to maintain academic integrity in computing colleges amidst the rise of AI tools like ChatGPT.
Academic Integrity Apr 13
Can AI Detect Life? Lessons from Artificial Life Ignore
This research demonstrates that current AI methods for detecting extraterrestrial life are prone to significant false positives due to their susceptibility to out-of-distribution samples.
GitHub stars n/a Velocity flat History 1 snapshot AI Safety Apr 13
Deep Learning for Sequential Decision Making under Uncertainty: Foundations, Frameworks, and Frontiers Ignore
A tutorial bridging operations research with deep learning for sequential decision-making under uncertainty.
Decision Making AI Apr 13
Disposition Distillation at Small Scale: A Three-Arc Negative Result Ignore
This paper investigates methods for training behavioral dispositions into small language models, ultimately reporting negative results across multiple experimental arcs.
LLM Training Apr 13
Characterizing Resource Sharing Practices on Underground Internet Forum Synthetic Non-Consensual Intimate Image Content Creation Communities Ignore
This paper analyzes resource sharing practices in underground forums for synthetic non-consensual intimate imagery creation and dissemination to identify intervention points for deterrence.
GitHub stars n/a Velocity flat History 1 snapshot AI Ethics & Safety Apr 14
Algorithmic Analysis of Dense Associative Memory: Finite-Size Guarantees and Adversarial Robustness Ignore
This paper provides theoretical guarantees for the convergence and robustness of Dense Associative Memory, a generalization of Hopfield networks, with potential applications in memory systems.
GitHub stars n/a Velocity flat History 1 snapshot Theoretical AI Apr 14
Broadening the Applicability of Conditional Syntax Splitting for Reasoning from Conditional Belief Bases Ignore
A theoretical generalization of conditional syntax splitting for nonmonotonic reasoning from conditional belief bases.
GitHub stars n/a Velocity flat History 1 snapshot Reasoning Apr 14
Technical Report -- A Context-Sensitive Multi-Level Similarity Framework for First-Order Logic Arguments: An Axiomatic Study Ignore
A theoretical framework for context-sensitive, multi-level similarity in First-Order Logic arguments, extending axiomatic foundations and parametric models.
GitHub stars n/a Velocity flat History 1 snapshot Formal Argumentation Apr 14
Efficiency of Proportional Mechanisms in Online Auto-Bidding Advertising Ignore
This paper analyzes the efficiency of proportional mechanisms in online advertising auto-bidding, proposing a modified mechanism with improved price of anarchy bounds.
GitHub stars n/a Velocity flat History 1 snapshot Algorithmic Game Theory Apr 14
A Scoping Review of Large Language Model-Based Pedagogical Agents Ignore
A review of how Large Language Models are being used to create AI agents for teaching and learning across different educational contexts.
GitHub stars n/a Velocity flat History 1 snapshot AI in Education Apr 14
Deepfakes at Face Value: Image and Authority Ignore
Deepfakes are wrong because they usurp our authority over the permissible uses of our image and identity by exploiting biometric features as a generative resource.
GitHub stars n/a Velocity flat History 1 snapshot AI Ethics Apr 14