UI-Zoomer: Uncertainty-Driven Adaptive Zoom-In for GUI Grounding Build Now
A training-free adaptive zoom-in framework for GUI grounding that zooms in on regions where element localization is uncertain.
GitHub 25 stars Velocity flat History 1 snapshot GUI Grounding Apr 15 Pending High viability
Training-Free Test-Time Contrastive Learning for Large Language Models Build Now
A training-free framework that enables frozen LLMs to adapt to distribution shifts by distilling supervision from their own inference experiences through an 'Explore-Reflect-Steer' loop.
GitHub 4 stars Velocity flat History 1 snapshot LLM Test-Time Adaptation Apr 15 Pending High viability
From $P(y|x)$ to $P(y)$: Investigating Reinforcement Learning in Pre-train Space Build Now
A novel reinforcement learning approach that optimizes LLM reasoning by directly updating the pre-train space, leading to significant improvements in pruning incorrect reasoning and fostering reflective behaviors.
GitHub 8 stars Velocity flat History 1 snapshot LLM Training Apr 15 Pending High viability
TIP: Token Importance in On-Policy Distillation Build Now
A novel method for on-policy knowledge distillation that significantly reduces memory usage and training time by intelligently selecting informative tokens, validated on multiple LLM architectures.
GitHub 14 stars Velocity flat History 1 snapshot LLM Training Apr 15 Pending High viability
MCPThreatHive: Automated Threat Intelligence for Model Context Protocol Ecosystems Build Now
An automated platform for generating and visualizing threat intelligence for agentic systems, addressing critical security gaps.
GitHub 4 stars Velocity flat History 1 snapshot Security Apr 15 Pending High viability
SemiFA: An Agentic Multi-Modal Framework for Autonomous Semiconductor Failure Analysis Report Generation Build Now
An agentic multi-modal framework that autonomously generates semiconductor failure analysis reports from inspection images in under a minute by fusing vision, telemetry, and historical data.
GitHub 1 star Velocity flat History 1 snapshot Semiconductor AI Apr 14 Pending High viability
Hessian-Enhanced Token Attribution (HETA): Interpreting Autoregressive LLMs Build Now
HETA is a novel attribution framework for decoder-only LLMs that provides context-aware, causally faithful, and semantically grounded explanations of token contributions.
GitHub 1866 stars Velocity flat History 1 snapshot LLM Interpretability Apr 14 Pending High viability
MIND: AI Co-Scientist for Material Research Build Now
MIND is an LLM-driven co-scientist for material research, automating hypothesis validation through integrated in-silico experiments and a user interface.
GitHub 1 star Velocity flat History 1 snapshot AI for Science Apr 15 Pending High viability
Gaslight, Gatekeep, V1-V3: Early Visual Cortex Alignment Shields Vision-Language Models from Sycophantic Manipulation Build Now
Shielding vision-language models from manipulation by aligning their early visual cortex representations with human neural processing.
GitHub 0 stars Velocity flat History 1 snapshot AI Safety / Vision-Language Models Apr 15 Pending High viability
MaMe & MaRe: Matrix-Based Token Merging and Restoration for Efficient Visual Perception and Synthesis Build Now
A GPU-friendly, training-free method for accelerating Vision Transformers and enhancing image synthesis by merging and restoring tokens using matrix operations.
GitHub 0 stars Velocity flat History 1 snapshot Vision Transformers Apr 15 Pending High viability
WebXSkill: Skill Learning for Autonomous Web Agents Build Now
WebXSkill enables web agents to autonomously perform complex browser tasks with improved success rates through executable skills.
GitHub 0 stars Velocity flat History 1 snapshot AI Agents Apr 14 Pending High viability
UHR-BAT: Budget-Aware Token Compression Vision-Language model for Ultra-High-Resolution Remote Sensing Build Now
A budget-aware vision-language model for ultra-high-resolution remote sensing that efficiently compresses visual tokens using query-guided importance estimation and region-wise strategies.
GitHub 0 stars Velocity flat History 1 snapshot Remote Sensing Vision-Language Models Apr 15 Pending High viability
How Can We Synthesize High-Quality Pretraining Data? A Systematic Study of Prompt Design, Generator Model, and Source Data Build Now
A systematic study and open dataset for synthesizing high-quality pretraining data for LLMs, reducing generation costs by up to 30x.
GitHub 4 stars Velocity flat History 1 snapshot LLM Training Apr 15 Pending High viability
Bridging MARL to SARL: An Order-Independent Multi-Agent Transformer via Latent Consensus Build Now
A Transformer-based framework that bridges multi-agent reinforcement learning to single-agent reinforcement learning for improved coordination and performance.
GitHub 0 stars Velocity flat History 1 snapshot Multi-Agent Reinforcement Learning Apr 15 Pending High viability
Event-Adaptive State Transition and Gated Fusion for RGB-Event Object Tracking Build Now
An efficient RGB-Event object tracking framework with adaptive state transitions and gated fusion for robust cross-modal integration.
GitHub 1866 stars Velocity flat History 1 snapshot Event-Based Vision Apr 15 Pending High viability
Chain of Uncertain Rewards with Large Language Models for Reinforcement Learning Build Now
A novel framework using LLMs to automate and optimize reward function design in reinforcement learning, reducing evaluation costs and improving performance.
GitHub 713 stars Velocity flat History 1 snapshot Reinforcement Learning Apr 15 Pending High viability
Beyond Voxel 3D Editing: Learning from 3D Masks and Self-Constructed Data Build Now
A framework for efficient 3D asset editing that leverages self-constructed datasets and lightweight modules to enhance foundational generative models.
GitHub 713 stars Velocity flat History 1 snapshot 3D Generative AI Apr 15 Pending High viability
MAny: Merge Anything for Multimodal Continual Instruction Tuning Build Now
MAny merges task-specific knowledge in multimodal LLMs through cross-modal and low-rank parameter merging to prevent catastrophic forgetting without additional training.
GitHub 1866 stars Velocity flat History 1 snapshot Multimodal LLMs Apr 15 Pending High viability
AI-Assisted Peer Review at Scale: The AAAI-26 AI Review Pilot Build Now
An AI system that generates technically sound peer reviews, preferred by authors and reviewers over human reviews, for large-scale scientific conferences.
GitHub stars n/a Velocity flat History 1 snapshot AI for Scientific Review Apr 15 Code High viability
A KL Lens on Quantization: Fast, Forward-Only Sensitivity for Mixed-Precision SSM-Transformer Models Build Now
A fast, forward-only sensitivity analysis using KL divergence for mixed-precision SSM-Transformer models, enabling efficient LLM deployment on edge devices.
GitHub 0 stars Velocity flat History 1 snapshot LLM Optimization Apr 15 Pending High viability
BenGER: A Collaborative Web Platform for End-to-End Benchmarking of German Legal Tasks Build Now
BenGER is an open-source web platform for end-to-end benchmarking of German legal LLM tasks, integrating task creation, annotation, execution, and evaluation.
GitHub stars n/a Velocity flat History 1 snapshot LLM Evaluation Apr 15 Pending High viability
Listening Alone, Understanding Together: Collaborative Context Recovery for Privacy-Aware AI Build Now
CONCORD is a privacy-aware A2A framework enabling socially deployable proactive conversational agents through collaborative context recovery.
GitHub stars n/a Velocity flat History 1 snapshot Privacy-Aware AI Apr 14 Code High viability
ReSS: Learning Reasoning Models for Tabular Data Prediction via Symbolic Scaffold Build Now
ReSS bridges symbolic and neural reasoning for tabular data by using decision trees to scaffold LLMs, generating faithful and consistent natural-language explanations for high-stakes domains.
GitHub stars n/a Velocity flat History 1 snapshot Tabular AI Apr 15 Code High viability
Diffusion Language Models for Speech Recognition Build Now
Diffusion language models enhance speech recognition accuracy by integrating acoustic and language information for improved hypothesis rescoring and joint decoding.
GitHub stars n/a Velocity flat History 1 snapshot Speech Recognition Apr 15 Code High viability
A Unified Conditional Flow for Motion Generation, Editing, and Intra-Structural Retargeting Build Now
A unified generative framework using flow matching to perform text-driven motion editing and structural retargeting with a single model.
GitHub stars n/a Velocity flat History 1 snapshot Motion Generation Apr 15 Code High viability
Giving Voice to the Constitution: Low-Resource Text-to-Speech for Quechua and Spanish Using a Bilingual Legal Corpus Build Now
A bilingual TTS system synthesizes speech for the Peruvian Constitution in Quechua and Spanish.
GitHub stars n/a Velocity flat History 1 snapshot Text-to-Speech Apr 14 Code High viability
C-voting: Confidence-Based Test-Time Voting without Explicit Energy Functions Watch
Confidence-based voting (C-voting) enhances test-time scaling for recurrent models by selecting latent trajectories based on prediction confidence, outperforming existing methods without requiring explicit energy functions.
GitHub 1866 stars Velocity flat History 1 snapshot LLM Reasoning Apr 15 Pending
Lazy or Efficient? Towards Accessible Eye-Tracking Event Detection Using LLMs Build Now
A code-free, LLM-driven pipeline for accessible eye-tracking event detection that converts natural language instructions into end-to-end analysis, reducing technical overhead.
GitHub stars n/a Velocity flat History 1 snapshot Human-Computer Interaction Apr 14 Code High viability
GeoAgentBench: A Dynamic Execution Benchmark for Tool-Augmented Agents in Spatial Analysis Build Now
A dynamic benchmark and agent architecture for evaluating and improving tool-augmented LLMs in complex spatial analysis tasks.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 15 Code High viability
LongCoT: Benchmarking Long-Horizon Chain-of-Thought Reasoning Build Now
LongCoT is a benchmark for evaluating long-horizon chain-of-thought reasoning in LLMs; it reveals a significant gap in current model capabilities and provides a rigorous measure for future progress.
GitHub stars n/a Velocity flat History 1 snapshot LLM Evaluation Apr 15 Code High viability
From Anchors to Supervision: Memory-Graph Guided Corpus-Free Unlearning for Large Language Models Build Now
A framework for unlearning sensitive data from LLMs using memory-graph guided synthesis of supervision, minimizing user input and corpus reliance.
GitHub stars n/a Velocity flat History 1 snapshot LLM Unlearning Apr 15 Code High viability
Leveraging LLM-GNN Integration for Open-World Question Answering over Knowledge Graphs Build Now
A hybrid LLM-GNN system for advanced question answering over incomplete knowledge graphs.
GitHub stars n/a Velocity flat History 1 snapshot AI-driven Knowledge Management Apr 15 Code High viability
HINTBench: Horizon-agent Intrinsic Non-attack Trajectory Benchmark Build Now
A benchmark and evaluation framework for intrinsic risk in AI agents, revealing significant capability gaps in existing models.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 15 Code High viability
TREX: Automating LLM Fine-tuning via Agent-Driven Tree-based Exploration Build Now
TREX automates and optimizes the lifecycle of LLM fine-tuning using agents and a tree-based exploration approach.
GitHub stars n/a Velocity flat History 1 snapshot AI Automation Apr 15 Code High viability
KV Packet: Recomputation-Free Context-Independent KV Caching for LLMs Build Now
A recomputation-free KV caching framework for LLMs that uses trainable adapters to significantly reduce inference latency and computational overhead.
GitHub stars n/a Velocity flat History 1 snapshot LLM Inference Optimization Apr 14 Code High viability
SafeHarness: Lifecycle-Integrated Security Architecture for LLM-based Agent Deployment Build Now
SafeHarness is a lifecycle-integrated security architecture for LLM agents that significantly reduces unsafe behavior and attack success rates.
GitHub stars n/a Velocity flat History 1 snapshot LLM Security Apr 15 Code High viability
TokenFormer: Unify the Multi-Field and Sequential Recommendation Worlds Build Now
TokenFormer unifies multi-field and sequential recommendation models, overcoming sequential collapse propagation with a novel attention scheme and representation method.
GitHub stars n/a Velocity flat History 1 snapshot Recommendation Systems Apr 15 Code High viability
MERRIN: A Benchmark for Multimodal Evidence Retrieval and Reasoning in Noisy Web Environments Build Now
A benchmark for evaluating AI agents' ability to retrieve multimodal evidence and reason over noisy web environments.
GitHub stars n/a Velocity flat History 1 snapshot Multimodal Reasoning Apr 15 Code High viability
HiVLA: A Visual-Grounded-Centric Hierarchical Embodied Manipulation System Build Now
A hierarchical robotic manipulation system that decouples planning from execution to preserve VLM reasoning while enabling independent component improvement.
GitHub stars n/a Velocity flat History 1 snapshot Robotics Apr 15 Code High viability
Peer-Predictive Self-Training for Language Model Reasoning Build Now
Peer-Predictive Self-Training (PST) enables collaborative self-improvement of language models for reasoning without external supervision.
GitHub stars n/a Velocity flat History 1 snapshot LLM Reasoning Apr 14 Code High viability
Numerical Instability and Chaos: Quantifying the Unpredictability of Large Language Models Build Now
Quantifies numerical instability in LLMs, identifying a chaotic 'avalanche effect' and three distinct regimes of unpredictability based on floating-point precision.
GitHub stars n/a Velocity flat History 1 snapshot LLM Reliability Apr 14 Code High viability
Do We Still Need Humans in the Loop? Comparing Human and LLM Annotation in Active Learning for Hostility Detection Build Now
LLM-generated annotations achieve comparable F1-Macro to human annotations for hostility detection at a fraction of the cost, with nuanced error profiles.
GitHub stars n/a Velocity flat History 1 snapshot LLM Annotation for Active Learning Apr 15 Code High viability
RiskWebWorld: A Realistic Interactive Benchmark for GUI Agents in E-commerce Risk Management Build Now
A realistic benchmark for evaluating GUI agents in e-commerce risk management, revealing a significant capability gap in current models and demonstrating agentic RL improvements.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 15 Code High viability
From Prediction to Justification: Aligning Sentiment Reasoning with Human Rationale via Reinforcement Learning Build Now
A reinforcement learning framework that aligns sentiment analysis with human-like reasoning, improving interpretability and prediction accuracy.
GitHub stars n/a Velocity flat History 1 snapshot Explainable AI Apr 15 Code High viability
DF3DV-1K: A Large-Scale Dataset and Benchmark for Distractor-Free Novel View Synthesis Build Now
A large-scale dataset and benchmark for distractor-free novel view synthesis, enabling robust radiance field development and improving image enhancement.
GitHub stars n/a Velocity flat History 1 snapshot Novel View Synthesis Apr 15 Code High viability
A Mechanistic Analysis of Sim-and-Real Co-Training in Generative Robot Policies Build Now
This research provides a mechanistic analysis of sim-and-real co-training for generative robot policies, identifying key effects and proposing a method to improve performance.
GitHub stars n/a Velocity flat History 1 snapshot Robotics AI Apr 15 Code High viability
Reward Design for Physical Reasoning in Vision-Language Models Build Now
This research systematically investigates reward design for improving physical reasoning in vision-language models, demonstrating accuracy gains through targeted reward signals.
GitHub stars n/a Velocity flat History 1 snapshot Vision-Language Models Apr 15 Code High viability
Jump-Start Reinforcement Learning with Vision-Language-Action Regularization Build Now
VLAJS jump-starts reinforcement learning for robotics by using vision-language-action models to bias exploration and improve learning efficiency, outperforming baselines by over 50%.
GitHub stars n/a Velocity flat History 1 snapshot Robotics RL Apr 15 Code High viability
ASTER: Latent Pseudo-Anomaly Generation for Unsupervised Time-Series Anomaly Detection Build Now
A framework that generates latent pseudo-anomalies for unsupervised time-series anomaly detection, outperforming state-of-the-art with LLM enrichment.
GitHub stars n/a Velocity flat History 1 snapshot Unsupervised Time-Series Anomaly Detection Apr 15 Code High viability
Rhetorical Questions in LLM Representations: A Linear Probing Study Build Now
Analyzing how LLMs internally represent rhetorical questions using linear probing, revealing that these signals emerge early and are encoded by multiple, context-dependent directions.
GitHub stars n/a Velocity flat History 1 snapshot LLM Representations Apr 15 Code High viability
IndicDB -- Benchmarking Multilingual Text-to-SQL Capabilities in Indian Languages Build Now
A benchmark and framework for evaluating multilingual Text-to-SQL capabilities in Indian languages, addressing the 'Indic Gap' in LLM performance.
GitHub stars n/a Velocity flat History 1 snapshot Multilingual Text-to-SQL Apr 15 Code High viability
InfiniteScienceGym: An Unbounded, Procedurally-Generated Benchmark for Scientific Analysis Build Now
InfiniteScienceGym is a procedurally generated benchmark for evaluating LLMs' scientific reasoning capabilities, offering a controlled and unbounded environment without large static datasets.
GitHub stars n/a Velocity flat History 1 snapshot AI Benchmarking Apr 14 Code High viability
Learning from Change: Predictive Models for Incident Prevention in a Regulated IT Environment Build Now
An interpretable machine learning model that predicts IT incident risk for regulated environments, outperforming rule-based systems.
GitHub stars n/a Velocity flat History 1 snapshot IT Operations AI Apr 15 Code High viability
Pareto-Optimal Offline Reinforcement Learning via Smooth Tchebysheff Scalarization Build Now
STOMP is a novel offline RL algorithm that extends direct preference optimization to the multi-objective setting, enabling principled alignment of models for multi-attribute tasks like protein engineering.
GitHub stars n/a Velocity flat History 1 snapshot Multi-Objective RL Apr 14 Code High viability
SFT-GRPO Data Overlap as a Post-Training Hyperparameter for Autoformalization Build Now
Optimizing data overlap in SFT-GRPO post-training for LLMs significantly improves autoformalization accuracy without additional compute.
GitHub stars n/a Velocity flat History 1 snapshot LLM Training Apr 15 Code High viability
Quantifying and Understanding Uncertainty in Large Reasoning Models Build Now
A novel methodology quantifies uncertainty in Large Reasoning Models with statistical guarantees and provides interpretable explanations by identifying key training examples and reasoning steps.
GitHub stars n/a Velocity flat History 1 snapshot LLM Reasoning Apr 15 Code High viability
A 3D SAM-Based Progressive Prompting Framework for Multi-Task Segmentation of Radiotherapy-induced Normal Tissue Injuries in Limited-Data Settings Build Now
A 3D SAM-based framework for multi-task segmentation of radiotherapy-induced normal tissue injuries in limited-data settings, outperforming state-of-the-art methods.
GitHub stars n/a Velocity flat History 1 snapshot Medical AI Apr 15 Code High viability
UMI-3D: Extending Universal Manipulation Interface from Vision-Limited to 3D Spatial Perception Build Now
A multimodal extension for robotic manipulation data collection that integrates LiDAR for robust 3D spatial perception.
GitHub stars n/a Velocity flat History 1 snapshot Robotics Data Collection Apr 15 Code High viability
Syn-TurnTurk: A Synthetic Dataset for Turn-Taking Prediction in Turkish Dialogues Build Now
Syn-TurnTurk is a synthetic dataset for Turkish dialogue turn-taking prediction, enabling more natural human-machine interaction in Turkish chatbots.
GitHub stars n/a Velocity flat History 1 snapshot Dialogue AI Apr 15 Code High viability
Rethinking Uncertainty in Segmentation: From Estimation to Decision Build Now
A new method for medical image segmentation that uses uncertainty estimates to guide decisions, significantly reducing errors with minimal deferral.
GitHub stars n/a Velocity flat History 1 snapshot Medical AI Apr 14 Code High viability
Hierarchical Reinforcement Learning with Runtime Safety Shielding for Power Grid Operation Build Now
A safety-constrained hierarchical reinforcement learning framework for power grid operation that ensures runtime safety and robust generalization.
GitHub stars n/a Velocity flat History 1 snapshot Reinforcement Learning for Energy Apr 15 Code High viability
Explainable Fall Detection for Elderly Care via Temporally Stable SHAP in Skeleton-Based Human Activity Recognition Build Now
An explainable fall detection system for elderly care using temporally stable SHAP to provide reliable insights for clinicians.
GitHub stars n/a Velocity flat History 1 snapshot Healthcare AI Apr 14 Code High viability
Outperforming Self-Attention Mechanisms in Solar Irradiance Forecasting via Physics-Guided Neural Networks Build Now
A physics-guided hybrid CNN-BiLSTM model for solar irradiance forecasting that outperforms complex attention models, enabling efficient renewable energy management.
GitHub stars n/a Velocity flat History 1 snapshot Renewable Energy AI Apr 15 Code High viability
Asymmetric-Loss-Guided Hybrid CNN-BiLSTM-Attention Model for Industrial RUL Prediction with Interpretable Failure Heatmaps Build Now
A hybrid CNN-BiLSTM-Attention model for industrial Remaining Useful Life prediction that provides interpretable failure heatmaps, improving safety and maintenance decisions.
GitHub stars n/a Velocity flat History 1 snapshot Predictive Maintenance AI Apr 15 Code High viability
Automatically Inferring Teachers' Geometric Content Knowledge: A Skills Based Approach Watch
An automated system for assessing teachers' geometric content knowledge using LLMs and a fine-grained skills dictionary.
GitHub stars n/a Velocity flat History 1 snapshot Educational AI Apr 15 Code
Feed-Forward 3D Scene Modeling: A Problem-Driven Perspective Ignore
A survey proposing a new taxonomy for feed-forward 3D scene modeling, focusing on model design strategies agnostic to output format.
GitHub 70 stars Velocity flat History 1 snapshot 3D Reconstruction Apr 15 Pending
Finetuning-Free Diffusion Model with Adaptive Constraint Guidance for Inorganic Crystal Structure Generation Watch
A diffusion model with adaptive constraint guidance for generating thermodynamically plausible inorganic crystal structures with targeted properties.
GitHub stars n/a Velocity flat History 1 snapshot Materials Science AI Apr 14 Code
SparseBalance: Load-Balanced Long Context Training with Dynamic Sparse Attention Watch
A novel algorithm-system co-design framework for load-balanced long context LLM training that improves accuracy and efficiency.
GitHub stars n/a Velocity flat History 1 snapshot LLM Training Apr 15 Code
Multitasking Embedding for Embryo Blastocyst Grading Prediction (MEmEBG) Watch
A multitask embedding approach using a pretrained ResNet-18 to automate blastocyst grading for IVF, improving consistency and reducing subjectivity.
GitHub stars n/a Velocity flat History 1 snapshot Medical AI Apr 14 Code
The cognitive companion: a lightweight parallel monitoring architecture for detecting and recovering from reasoning degradation in LLM agents Watch
A parallel monitoring architecture for LLM agents that detects and recovers from reasoning degradation, offering both LLM-based and zero-overhead probe-based companions.
GitHub stars n/a Velocity flat History 1 snapshot LLM Agents Apr 15 Code
A Study of Failure Modes in Two-Stage Human-Object Interaction Detection Ignore
A study analyzing failure modes in two-stage human-object interaction detection models to provide insights for future research.
GitHub 713 stars Velocity flat History 1 snapshot Computer Vision AI Apr 15 Pending
Memory Transfer Learning: How Memories are Transferred Across Domains in Coding Agents Watch
A study of how memories transfer across domains in coding agents, aimed at improving agent adaptability across diverse tasks.
GitHub stars n/a Velocity flat History 1 snapshot AI/ML Applications Apr 15 Code
Towards Scalable Lightweight GUI Agents via Multi-role Orchestration Watch
LAMO framework enables lightweight LLMs to perform complex GUI automation through multi-role orchestration, balancing cost and scalability.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 15
Beyond Uniform Sampling: Synergistic Active Learning and Input Denoising for Robust Neural Operators Watch
A defense mechanism for neural operators that combines active learning with input denoising to improve robustness against adversarial attacks.
GitHub stars n/a Velocity flat History 1 snapshot Robustness Apr 14 Code
From Alignment to Prediction: A Study of Self-Supervised Learning and Predictive Representation Learning Watch
Introduces Predictive Representation Learning (PRL) as a new paradigm for self-supervised learning, demonstrating its potential through comparative analysis of BYOL, MAE, and I-JEPA.
GitHub stars n/a Velocity flat History 1 snapshot Self-Supervised Learning Apr 15 Code
Identifiability of Potentially Degenerate Gaussian Mixture Models With Piecewise Affine Mixing Ignore
A theoretical framework and method for identifying latent variables from degenerate Gaussian mixture models transformed by piecewise affine functions.
GitHub 0 stars Velocity flat History 1 snapshot Causal Representation Learning Apr 14 Pending
Free Lunch for Unified Multimodal Models: Enhancing Generation via Reflective Rectification with Inherent Understanding Watch
A training-free framework that enhances unified multimodal model generation by leveraging their inherent understanding for reflective rectification, inspired by human 'Thinking-While-Drawing'.
GitHub stars n/a Velocity flat History 1 snapshot Unified Multimodal Models Apr 15 Code
[Emerging Ideas] Artificial Tripartite Intelligence: A Bio-Inspired, Sensor-First Architecture for Physical AI Ignore
A bio-inspired, sensor-first architecture for physical AI that improves end-to-end accuracy and reduces remote inference calls.
GitHub 0 stars Velocity flat History 1 snapshot Physical AI Apr 15 Pending
Towards Fine-grained Temporal Perception: Post-Training Large Audio-Language Models with Audio-Side Time Prompt Watch
A post-training approach for large audio-language models that targets precise temporal event detection using audio-side time prompts and reinforcement learning.
GitHub stars n/a Velocity flat History 1 snapshot Audio AI Apr 15 Code
Design Space Exploration of Hybrid Quantum Neural Networks for Chronic Kidney Disease Watch
A comprehensive exploration of Hybrid Quantum Neural Networks for Chronic Kidney Disease diagnosis, benchmarking 625 models to find optimal design choices.
GitHub stars n/a Velocity flat History 1 snapshot Medical AI Apr 15 Code
Med-CAM: Minimal Evidence for Explaining Medical Decision Making Ignore
Med-CAM generates minimal, sharp activation maps for interpretable medical AI decisions, improving clinician trust and understanding.
GitHub 713 stars Velocity flat History 1 snapshot Medical AI Apr 15 Pending
Sentiment analysis for software engineering: How far can zero-shot learning (ZSL) go? Watch
Leveraging zero-shot learning to perform sentiment analysis in software engineering, reducing the need for extensive annotated datasets.
GitHub stars n/a Velocity flat History 1 snapshot NLP Apr 15 Code
L2D-Clinical: Learning to Defer for Adaptive Model Selection in Clinical Text Classification Watch
Adaptive model selection framework for clinical text classification that intelligently defers to LLMs for improved accuracy and cost-efficiency.
GitHub stars n/a Velocity flat History 1 snapshot Clinical AI Apr 14
Inclusive Kitchen Design for Older Adults: Generative AI Visualizations to Support Mild Cognitive Impairment Watch
An AI system that transforms standard kitchen photos into MCI-friendly designs, offering a low-cost, scalable solution for older adults and caregivers to visualize and implement DIY kitchen modifications.
GitHub stars n/a Velocity flat History 1 snapshot Generative AI for Accessibility Apr 14
Out of Context: Reliability in Multimodal Anomaly Detection Requires Contextual Inference Ignore
Reframing multimodal anomaly detection as a cross-modal contextual inference problem to improve reliability in dynamic environments.
GitHub stars n/a Velocity flat History 1 snapshot Anomaly Detection Apr 14 Code
First-See-Then-Design: A Multi-Stakeholder View for Optimal Performance-Fairness Trade-Offs Ignore
A new framework for algorithmic decision-making that explicitly models multi-stakeholder utilities to achieve optimal performance-fairness trade-offs.
GitHub stars n/a Velocity flat History 1 snapshot Fairness in AI Apr 15 Code
Minimax Optimality and Spectral Routing for Majority-Vote Ensembles under Markov Dependence Ignore
A theoretical framework and adaptive algorithm for minimax optimal majority-vote ensembles in Markov-dependent data, improving time-series and RL applications.
GitHub stars n/a Velocity flat History 1 snapshot Ensemble Methods Apr 15 Code
Towards Multi-Object-Tracking with Radar on a Fast Moving Vehicle: On the Potential of Processing Radar in the Frequency Domain Ignore
Processing radar data in the frequency domain for robust multi-object tracking on fast-moving vehicles, demonstrating radar-only odometry.
GitHub stars n/a Velocity flat History 1 snapshot Radar Perception Apr 15 Code
Beyond Conservative Automated Driving in Multi-Agent Scenarios via Coupled Model Predictive Control and Deep Reinforcement Learning Watch
An integrated MPC-RL framework for automated driving that balances safety and efficiency in multi-agent scenarios, outperforming standalone methods and showing improved generalization.
GitHub stars n/a Velocity flat History 1 snapshot Autonomous Driving Control Apr 15
CLIP Architecture for Abdominal CT Image-Text Alignment and Zero-Shot Learning: Investigating Batch Composition and Data Scaling Ignore
Investigating the impact of batch composition and data scaling on CLIP-like architectures for abdominal CT image-text alignment and zero-shot learning.
GitHub stars n/a Velocity flat History 1 snapshot Medical Vision-Language Models Apr 15 Code
The Cognitive Circuit Breaker: A Systems Engineering Framework for Intrinsic AI Reliability Ignore
A framework to detect LLM hallucinations by analyzing internal model states, reducing latency and computational overhead for mission-critical applications.
GitHub stars n/a Velocity flat History 1 snapshot LLM Reliability Apr 15 Code
Evaluating Supervised Machine Learning Models: Principles, Pitfalls, and Metric Selection Ignore
A framework for robustly evaluating supervised machine learning models by addressing common pitfalls in metric selection and validation.
GitHub stars n/a Velocity flat History 1 snapshot ML Evaluation Apr 15 Code
From Feelings to Metrics: Understanding and Formalizing How Users Vibe-Test LLMs Ignore
Developing a method to formalize user-centric evaluations of LLMs by translating subjective 'vibe-testing' into quantitative metrics.
GitHub stars n/a Velocity flat History 1 snapshot AI Evaluation and Testing Apr 15 Code
Representation over Routing: Overcoming Surrogate Hacking in Multi-Timescale PPO Ignore
Proposes a Target Decoupling architecture for multi-timescale PPO that overcomes surrogate hacking and myopic degeneration by isolating short-term signals for policy updates.
GitHub stars n/a Velocity flat History 1 snapshot Reinforcement Learning Apr 15 Code
4th Workshop on Maritime Computer Vision (MaCVi): Challenge Overview Ignore
A workshop overview and challenge report for maritime computer vision, focusing on predictive accuracy and real-time feasibility with benchmark challenges and datasets.
GitHub stars n/a Velocity flat History 1 snapshot Computer Vision Apr 14 Code
Beyond Arrow's Impossibility: Fairness as an Emergent Property of Multi-Agent Collaboration Ignore
Fairness in AI agents emerges from multi-agent collaboration, not single-model optimization, offering a new perspective on ethical AI.
GitHub stars n/a Velocity flat History 1 snapshot AI Agents Apr 15 Code
On the Use of Evolutionary Optimization for the Dynamic Chance Constrained Open-Pit Mine Scheduling Problem Ignore
A bi-objective evolutionary algorithm with a diversity-based change response mechanism optimizes open-pit mine scheduling under dynamic economic values and resource capacities.
GitHub stars n/a Velocity flat History 1 snapshot Optimization Apr 15 Code
On the Creativity of AI Agents Ignore
An analysis of the creativity of AI agents, exploring functionalist and ontological perspectives and discussing the desirability and risks of artificial creativity.
GitHub stars n/a Velocity flat History 1 snapshot AI Agents Apr 14 Code
Creo: From One-Shot Image Generation to Progressive, Co-Creative Ideation Ignore
A multi-stage text-to-image system that supports progressive, co-creative ideation with user control and decision locking.
GitHub stars n/a Velocity flat History 1 snapshot Generative AI Apr 15
FRAGATA: Semantic Retrieval of HPC Support Tickets via Hybrid RAG over 20 Years of Request Tracker History Ignore
Fragata is a semantic search system for HPC support tickets that uses hybrid RAG to improve knowledge reuse and overcome limitations of traditional search engines.
GitHub stars n/a Velocity flat History 1 snapshot Semantic Search Apr 15
English is Not All You Need: Systematically Exploring the Role of Multilinguality in LLM Post-Training Ignore
A systematic exploration of how multilingual data affects LLM performance across different scales and tasks, revealing benefits for low-resource languages and for overall cross-lingual generalization.
GitHub stars n/a Velocity flat History 1 snapshot LLM Training Apr 14
AlphaCNOT: Learning CNOT Minimization with Model-Based Planning Ignore
A model-based reinforcement learning framework for minimizing CNOT gates in quantum circuits.
GitHub stars n/a Velocity flat History 1 snapshot Quantum Computing Optimization Apr 15
GeoVision-Enabled Digital Twin for Hybrid Autonomous-Teleoperated Medical Responses Ignore
A Digital Twin architecture for hybrid autonomous-teleoperated medical response systems, enhancing situational awareness and decision-making.
GitHub stars n/a Velocity flat History 1 snapshot Robotics Apr 14
Adaptive Conformal Prediction for Improving Factuality of Generations by Large Language Models Ignore
An adaptive conformal prediction approach that aims to improve the factuality of large language model generations by providing prompt-dependent uncertainty estimates.
GitHub stars n/a Velocity flat History 1 snapshot LLM Factuality Apr 15
Monthly Diffusion v0.9: A Latent Diffusion Model for the First AI-MIP Ignore
A latent diffusion model for monthly climate emulation using a spherical Fourier neural operator-inspired architecture.
GitHub stars n/a Velocity flat History 1 snapshot Climate AI Apr 15
Cognitive Offloading in Agile Teams: How Artificial Intelligence Reshapes Risk Assessment and Planning Quality Ignore
An investigation of how AI affects cognitive offloading in Agile sprint planning, leading to a proposed hybrid AI-human framework.
GitHub stars n/a Velocity flat History 1 snapshot AI for Project Management Apr 15
SciFi: A Safe, Lightweight, User-Friendly, and Fully Autonomous Agentic AI Workflow for Scientific Applications Ignore
A safe, lightweight, and user-friendly agentic framework for the autonomous execution of well-defined scientific tasks, enabling researchers to offload routine workloads.
GitHub stars n/a Velocity flat History 1 snapshot Agentic AI Workflows Apr 14
Golden Handcuffs make safer AI agents Ignore
A Bayesian mitigation strategy for reinforcement learning agents to prevent unintended high-reward strategies by incorporating a large negative penalty and a mentor override.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 15
Optimizing Earth Observation Satellite Schedules under Unknown Operational Constraints: An Active Constraint Acquisition Approach Ignore
An active constraint acquisition approach for Earth Observation satellite scheduling that learns operational constraints interactively from an oracle.
GitHub stars n/a Velocity flat History 1 snapshot Optimization Apr 14
Large Language Models to Enhance Business Process Modeling: Past, Present, and Future Trends Ignore
A literature review on using LLMs to automate business process modeling, highlighting current trends and future research directions.
GitHub stars n/a Velocity flat History 1 snapshot Business Process Modeling Apr 15
Comparison of window shapes and lengths in short-time feature extraction for classification of heart sound signals Ignore
An experimental evaluation of window shapes and lengths for feature extraction in classifying heart sound signals using bidirectional LSTMs.
GitHub stars n/a Velocity flat History 1 snapshot Medical AI Apr 15
Secure and Privacy-Preserving Vertical Federated Learning Ignore
A privacy-preserving framework for vertically split federated learning using secure multiparty computation and differential privacy.
GitHub stars n/a Velocity flat History 1 snapshot Federated Learning Apr 15
Can Cross-Layer Transcoders Replace Vision Transformer Activations? An Interpretable Perspective on Vision Ignore
An exploration of Vision Transformer interpretability that asks whether Cross-Layer Transcoders can replace the model's activations.
GitHub stars n/a Velocity flat History 1 snapshot Interpretable AI Apr 14
Weight Patching: Toward Source-Level Mechanistic Localization in LLMs Ignore
A novel parameter-space intervention method for localizing specific behaviors within Large Language Models.
GitHub stars n/a Velocity flat History 1 snapshot LLM Interpretability Apr 15
Ordinary Least Squares is a Special Case of Transformer Ignore
A theoretical demonstration that Ordinary Least Squares is a special case of the Transformer architecture, revealing a decoupled slow and fast memory mechanism.
GitHub stars n/a Velocity flat History 1 snapshot LLM Theory Apr 15
Rethinking AI Hardware: A Three-Layer Cognitive Architecture for Autonomous Agents Ignore
A novel three-layer cognitive architecture for autonomous agents that decomposes intelligence across heterogeneous hardware to reduce latency and energy consumption.
GitHub stars n/a Velocity flat History 1 snapshot AI Hardware Architecture Apr 15
Soft $Q(λ)$: A multi-step off-policy method for entropy regularised reinforcement learning using eligibility traces Ignore
A theoretical framework for multi-step off-policy soft Q-learning using eligibility traces for improved credit assignment in reinforcement learning.
GitHub stars n/a Velocity flat History 1 snapshot Reinforcement Learning Apr 15
Young people's perceptions and recommendations for conversational generative artificial intelligence in youth mental health Ignore
Young people's perceptions and co-designed recommendations for conversational generative AI in youth mental health, focusing on humanizing care, transparency, and personalized integration.
GitHub stars n/a Velocity flat History 1 snapshot AI Ethics & UX Apr 15
From Order to Distribution: A Spectral Characterization of Forgetting in Continual Learning Ignore
A theoretical characterization of forgetting in continual learning by analyzing task distributions rather than random orderings.
GitHub stars n/a Velocity flat History 1 snapshot Continual Learning Theory Apr 15
A Dynamic-Growing Fuzzy-Neuro Controller, Application to a 3PSP Parallel Robot Ignore
A dynamic-growing fuzzy-neuro controller with an adaptive strategy for precise and stable control of parallel robots.
GitHub stars n/a Velocity flat History 1 snapshot Robotics Control Apr 15