Web2BigTable: A Bi-Level Multi-Agent LLM System for Internet-Scale Information Search and Extraction Build Now
Build a multi-agent system for efficient internet-scale information extraction and table generation.
GitHub 3 stars Velocity flat History 1 snapshot Information Extraction Apr 29 Pending High viability
GUI Agents with Reinforcement Learning: Toward Digital Inhabitants Build Now
Developing GUI agents that learn through reinforcement learning to become digital inhabitants, addressing long-horizon credit assignment and safety.
GitHub 5 stars Velocity flat History 1 snapshot Agents Apr 30 Pending High viability
KellyBench: A Benchmark for Long-Horizon Sequential Decision Making Build Now
KellyBench is a new benchmark and API for evaluating long-horizon sequential decision-making in dynamic environments like sports betting markets.
GitHub 8 stars Velocity flat History 1 snapshot Sequential Decision Making Apr 30 Pending High viability
Instruction-Guided Poetry Generation in Arabic and Its Dialects Build Now
Enabling instruction-guided poetry generation in Arabic and its dialects with a large-scale dataset and fine-tuned LLMs, assisting users in creative writing.
GitHub 0 stars Velocity flat History 1 snapshot Generative AI Apr 30 Pending High viability
NeocorRAG: Less Irrelevant Information, More Explicit Evidence, and More Effective Recall via Evidence Chains Build Now
NeocorRAG enhances Retrieval-Augmented Generation by optimizing retrieval quality through evidence chains, achieving state-of-the-art performance with significantly fewer tokens.
GitHub 6 stars Velocity flat History 1 snapshot Retrieval-Augmented Generation Apr 30 Pending High viability
Bridging Values and Behavior: A Hierarchical Framework for Proactive Embodied Agents Build Now
ValuePlanner introduces a hierarchical cognitive architecture for embodied agents, enabling self-directed behavior by decoupling high-level value reasoning from low-level action planning.
GitHub 713 stars Velocity flat History 1 snapshot Agents Apr 30 Pending High viability
APPSI-139: A Parallel Corpus of English Application Privacy Policy Summarization and Interpretation Build Now
A high-quality corpus and hybrid framework for summarizing and interpreting application privacy policies, outperforming large language models in readability and reliability.
GitHub 1 star Velocity flat History 1 snapshot Legal AI Apr 30 Pending High viability
Reliable Answers for Recurring Questions: Boosting Text-to-SQL Accuracy with Template Constrained Decoding Build Now
TeCoD boosts Text-to-SQL accuracy by converting query patterns into reusable templates and enforcing them during SQL generation, reducing latency and improving execution accuracy.
GitHub 0 stars Velocity flat History 1 snapshot Text-to-SQL Apr 30 Pending High viability
DEFault++: Automated Fault Detection, Categorization, and Diagnosis for Transformer Architectures Build Now
Automated diagnostics for transformer models that detect, categorize, and pinpoint root causes of faults, improving repair accuracy for practitioners.
GitHub stars n/a Velocity flat History 1 snapshot AI Debugging Apr 30 Pending High viability
Deep Learning-Based Segmentation of Peritoneal Cancer Index Regions from CT Imaging Build Now
This research develops a deep learning system to automatically segment peritoneal cancer index regions from CT scans, enabling non-invasive assessment and approaching interobserver agreement.
GitHub 0 stars Velocity flat History 1 snapshot Medical AI Apr 30 Pending High viability
Robust Lightweight Crack Classification for Real-Time UAV Bridge Inspection Build Now
A lightweight, real-time crack classification model for UAV bridge inspections that balances accuracy, speed, and robustness.
GitHub 1 star Velocity flat History 1 snapshot Computer Vision Apr 30 Pending High viability
D3-Gym: Constructing Real-World Verifiable Environments for Data-Driven Discovery Build Now
D3-Gym provides verifiable environments and a dataset for training AI agents on real-world scientific discovery tasks, significantly improving model performance.
GitHub 1 star Velocity flat History 1 snapshot AI Agents Apr 30 Pending High viability
MCPHunt: An Evaluation Framework for Cross-Boundary Data Propagation in Multi-Server MCP Agents Build Now
MCPHunt is a benchmark framework for evaluating cross-boundary data propagation in multi-server agents, identifying vulnerabilities and enabling prompt-level mitigation strategies.
GitHub 1 star Velocity flat History 1 snapshot Agents Apr 30 Pending High viability
RuC: HDL-Agnostic Rule Completion Benchmark Generation Build Now
A language-agnostic benchmark generator for RTL code completion tasks, enabling controlled and scalable evaluation of LLM capabilities in hardware design.
GitHub stars n/a Velocity flat History 1 snapshot Hardware Code Generation Apr 30 Pending High viability
Position-Aware Drafting for Inference Acceleration in LLM-Based Generative List-Wise Recommendation Build Now
A lightweight module accelerates LLM-based recommendation inference by improving speculative decoding with position-aware signals.
GitHub 1 star Velocity flat History 1 snapshot LLM Inference Apr 30 Pending High viability
WindowsWorld: A Process-Centric Benchmark of Autonomous GUI Agents in Professional Cross-Application Environments Build Now
A new benchmark for evaluating autonomous GUI agents on complex, cross-application professional workflows, revealing significant performance gaps in current leading models.
GitHub 3 stars Velocity flat History 1 snapshot Agents Apr 30 Pending High viability
CastFlow: Learning Role-Specialized Agentic Workflows for Time Series Forecasting Build Now
A dynamic agentic framework for time series forecasting that refines predictions through iterative planning, action, and reflection, leveraging specialized LLMs and ensemble methods.
GitHub 1 star Velocity flat History 1 snapshot Agents Apr 30 Pending High viability
TIO-SHACL: Comprehensive SHACL validation for TMF Intent Ontologies Build Now
A comprehensive SHACL validation framework for TMF Intent Ontologies to ensure correctness of network intents before deployment.
GitHub stars n/a Velocity flat History 1 snapshot Telecommunications Networking Apr 30 Pending High viability
RIHA: Report-Image Hierarchical Alignment for Radiology Report Generation Build Now
A hierarchical alignment framework for generating accurate radiology reports from medical images by matching visual features to report structure.
GitHub stars n/a Velocity flat History 1 snapshot Medical AI Apr 30 Code High viability
SpecVQA: A Benchmark for Spectral Understanding and Visual Question Answering in Scientific Images Build Now
SpecVQA is a benchmark for spectral understanding and visual question answering in scientific images, aiming to improve multimodal models' capabilities in scientific data analysis.
GitHub stars n/a Velocity flat History 1 snapshot Multimodal AI Apr 30 Code High viability
METASYMBO: Multi-Agent Language-Guided Metamaterial Discovery via Symbolic Latent Evolution Build Now
MetaSymbO is a multi-agent framework for language-guided metamaterial discovery, using symbolic latent evolution to generate valid and novel microstructures.
GitHub stars n/a Velocity flat History 1 snapshot Generative Design Apr 30 Code High viability
TransVLM: A Vision-Language Framework and Benchmark for Detecting Any Shot Transitions Build Now
TransVLM is a vision-language framework that detects continuous shot transitions in videos by incorporating optical flow; it achieves state-of-the-art performance and has been deployed to production.
GitHub stars n/a Velocity flat History 1 snapshot Vision-Language Models Apr 30 Code High viability
Reinforced Agent: Inference-Time Feedback for Tool-Calling Agents Build Now
This research introduces an inference-time feedback mechanism for tool-calling agents, enabling proactive error correction and improving agent performance without retraining.
GitHub stars n/a Velocity flat History 1 snapshot Agentic AI Apr 29 Code High viability
COHERENCE: Benchmarking Fine-Grained Image-Text Alignment in Interleaved Multimodal Contexts Build Now
COHERENCE: A benchmark for evaluating fine-grained image-text alignment in interleaved multimodal contexts, crucial for real-world document understanding.
GitHub stars n/a Velocity flat History 1 snapshot Multimodal AI Apr 30 Code High viability
AutoSurfer -- Teaching Web Agents through Comprehensive Surfing, Learning, and Modeling Build Now
AutoSurfer is a web agent training data generator that uses breadth-first exploration and guided task synthesis to comprehensively cover websites and improve LLM performance on complex web tasks.
GitHub stars n/a Velocity flat History 1 snapshot Web Agents Apr 29 Code High viability
InteractWeb-Bench: Can Multimodal Agent Escape Blind Execution in Interactive Website Generation? Build Now
InteractWeb-Bench is a new benchmark and interactive environment for evaluating multimodal agents in website generation under realistic, ambiguous user instructions.
GitHub stars n/a Velocity flat History 1 snapshot Multimodal Agents Apr 30 Pending High viability
Secret Stealing Attacks on Local LLM Fine-Tuning through Supply-Chain Model Code Backdoors Build Now
A novel attack that steals sensitive secrets from local LLM fine-tuning by exploiting supply-chain model code backdoors, bypassing existing defenses.
GitHub stars n/a Velocity flat History 1 snapshot LLM Security Apr 30 Code High viability
End-to-end autonomous scientific discovery on a real optical platform Build Now
An LLM-based agent system that autonomously performs end-to-end scientific discovery on a real optical platform, identifying a novel physical mechanism.
GitHub stars n/a Velocity flat History 1 snapshot AI for Scientific Discovery Apr 29 Code High viability
Intent2Tx: Benchmarking LLMs for Translating Natural Language Intents into Ethereum Transactions Build Now
Intent2Tx benchmarks LLMs for translating natural language intents into Ethereum transactions, revealing critical gaps in reasoning-to-execution for Web3 agents.
GitHub stars n/a Velocity flat History 1 snapshot Web3 AI Apr 30 Code High viability
When Your LLM Reaches End-of-Life: A Framework for Confident Model Migration in Production Systems Build Now
A framework for confident LLM migration in production systems, using Bayesian calibration of automated metrics against human judgment for enterprise AI services.
GitHub stars n/a Velocity flat History 1 snapshot LLM Operations Apr 29 Code High viability
Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows Build Now
A live benchmark for LLM agents that evaluates their ability to complete evolving real-world workflows with verifiable execution traces.
GitHub stars n/a Velocity flat History 1 snapshot LLM Agent Benchmarking Apr 30 Code High viability
PhyCo: Learning Controllable Physical Priors for Generative Motion Build Now
PhyCo introduces controllable physical priors into video generation using a physics-supervised diffusion model and VLM-guided reward optimization.
GitHub stars n/a Velocity flat History 1 snapshot Generative Video Apr 30 Code High viability
Improving Graph Few-shot Learning with Hyperbolic Space and Denoising Diffusion Build Now
A framework for graph few-shot learning that uses hyperbolic space and denoising diffusion to improve representation learning and generalization.
GitHub stars n/a Velocity flat History 1 snapshot Graph ML Apr 30 Code High viability
Can AI Be a Good Peer Reviewer? A Survey of Peer Review Process, Evaluation, and the Future Build Now
A survey and practical guide for building and evaluating LLM systems across the entire academic peer review workflow.
GitHub stars n/a Velocity flat History 1 snapshot LLM Agents Apr 30 Code High viability
End-to-End Evaluation and Governance of an EHR-Embedded AI Agent for Clinicians Build Now
An end-to-end governance framework for EHR-embedded AI agents, demonstrating significant performance improvements and reduced error rates through continuous monitoring and controlled experimentation.
GitHub stars n/a Velocity flat History 1 snapshot Clinical AI Apr 30 Code High viability
The TEA Nets framework combines AI and cognitive network science to model targets, events and actors in text Build Now
TEA Nets is an open-source Python library combining AI and cognitive network science to model targets, events, and actors in text, enabling interpretable emotion detection and semantic analysis.
GitHub stars n/a Velocity flat History 1 snapshot Text Analysis Apr 30 Code High viability
Iterative Multimodal Retrieval-Augmented Generation for Medical Question Answering Build Now
A multimodal retrieval-augmented generation system for medical question answering that reasons over document images, achieving state-of-the-art accuracy.
GitHub stars n/a Velocity flat History 1 snapshot Medical AI Apr 30 Code High viability
MIFair: A Mutual-Information Framework for Intersectionality and Multiclass Fairness Build Now
MIFair is a unified framework for bias assessment and mitigation based on mutual information, addressing intersectionality and multiclass fairness challenges.
GitHub stars n/a Velocity flat History 1 snapshot Fairness in AI Apr 30 Code High viability
LLM as Clinical Graph Structure Refiner: Enhancing Representation Learning in EEG Seizure Diagnosis Build Now
Leveraging LLMs to refine graph structures for EEG seizure diagnosis, improving accuracy and interpretability by removing noisy connections.
GitHub stars n/a Velocity flat History 1 snapshot Medical AI Apr 30 Code High viability
ZAYAN: Disentangled Contrastive Transformer for Tabular Remote Sensing Data Watch
A self-supervised, feature-centric contrastive framework for learning informative representations from challenging tabular remote sensing data.
GitHub stars n/a Velocity flat History 1 snapshot Tabular Data Apr 30 Pending
Learning Rate Engineering: From Coarse Single Parameter to Layered Evolution Build Now
DALS is a unified optimizer framework that integrates phase-adaptive scheduling and depth-aware scaling for efficient and effective model training across diverse tasks.
GitHub stars n/a Velocity flat History 1 snapshot LLM Optimization Apr 30 Code High viability
Heterogeneous Scientific Foundation Model Collaboration Build Now
Eywa is a framework enabling language models to orchestrate and reason over diverse scientific foundation models for complex, multi-modal tasks.
GitHub stars n/a Velocity flat History 1 snapshot Agentic AI / Foundation Models Apr 30 Code High viability
BrainDINO: A Brain MRI Foundation Model for Generalizable Clinical Representation Learning Build Now
BrainDINO, a self-supervised foundation model trained on millions of brain MRI slices, generalizes across diverse clinical tasks with minimal labeled data.
GitHub stars n/a Velocity flat History 1 snapshot Medical AI Apr 30 Code High viability
FlexiTac: A Low-Cost, Open-Source, Scalable Tactile Sensing Solution for Robotic Systems Build Now
An open-source, low-cost, and scalable piezoresistive tactile sensing solution for robotic end-effectors that enables advanced tactile learning pipelines.
GitHub stars n/a Velocity flat History 1 snapshot Robotics Tactile Sensing Apr 30 Code High viability
Contextual Agentic Memory is a Memo, Not True Memory Build Now
This research proposes a fundamental shift in agent memory systems, moving beyond simple lookup to true generalization by learning abstract rules, addressing limitations in current AI capabilities and security.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 30 Code High viability
VibroML: an automated toolkit for high-throughput vibrational analysis and dynamic instability remediation of crystalline materials using machine-learned potentials Build Now
An open-source toolkit for automated vibrational analysis and structural remediation of crystalline materials using machine-learned potentials, enabling faster discovery of stable polymorphs and functional materials.
GitHub stars n/a Velocity flat History 1 snapshot Materials Science AI Apr 30 Code High viability
From Unstructured Recall to Schema-Grounded Memory: Reliable AI Memory via Iterative, Schema-Aware Extraction Build Now
A schema-grounded AI memory system that reliably extracts and validates factual information for agents, moving beyond simple text retrieval.
GitHub stars n/a Velocity flat History 1 snapshot AI Memory Systems Apr 30 Code High viability
MM-StanceDet: Retrieval-Augmented Multi-modal Multi-agent Stance Detection Build Now
A multi-agent framework with retrieval augmentation and structured reasoning for robust multimodal stance detection.
GitHub stars n/a Velocity flat History 1 snapshot Multi-modal Agents Apr 30 Code High viability
WaferSAGE: Large Language Model-Powered Wafer Defect Analysis via Synthetic Data Generation and Rubric-Guided Reinforcement Learning Build Now
WaferSAGE is an LLM-powered framework for wafer defect analysis using synthetic data and reinforcement learning, enabling on-premise deployment of small, specialized models.
GitHub stars n/a Velocity flat History 1 snapshot Semiconductor Defect Analysis Apr 30 Code High viability
Post-Optimization Adaptive Rank Allocation for LoRA Build Now
A post-optimization method that reduces LoRA parameters by 75-90% while preserving performance, enabling more efficient fine-tuning of large models.
GitHub stars n/a Velocity flat History 1 snapshot LLM Fine-tuning Apr 30 Code High viability
Learning to Reason: Targeted Knowledge Discovery and Fuzzy Logic Update for Robust Image Recognition Build Now
A novel method for targeted knowledge discovery and fuzzy logic update in deep neural networks to improve robust image recognition without explicit concept labels.
GitHub stars n/a Velocity flat History 1 snapshot Computer Vision Apr 30 Code High viability
Repetition over Diversity: High-Signal Data Filtering for Sample-Efficient German Language Modeling Build Now
High-signal data filtering for sample-efficient German language modeling, achieving state-of-the-art results with significantly fewer tokens.
GitHub stars n/a Velocity flat History 1 snapshot LLM Training Apr 30 Code High viability
Language Models Refine Mechanical Linkage Designs Through Symbolic Reflection and Modular Optimisation Build Now
Language models and numerical optimizers collaborate to systematically improve mechanical linkage designs by exploring topologies and fitting parameters.
GitHub stars n/a Velocity flat History 1 snapshot AI for Engineering Design Apr 30 Code High viability
RAY-TOLD: Ray-Based Latent Dynamics for Dense Dynamic Obstacle Avoidance with TDMPC Build Now
A hybrid control architecture for autonomous robots that integrates LiDAR-based latent dynamics with MPPI for dense, dynamic crowd navigation.
GitHub stars n/a Velocity flat History 1 snapshot Robotics Apr 30 Code High viability
Debiasing Reward Models via Causally Motivated Inference-Time Intervention Build Now
A method to debias reward models for LLMs at inference time by intervening on specific neurons, improving alignment without performance trade-offs.
GitHub stars n/a Velocity flat History 1 snapshot LLM Alignment Apr 30 Code High viability
BoostLoRA: Growing Effective Rank by Boosting Adapters Build Now
BoostLoRA enhances parameter-efficient fine-tuning by iteratively merging low-rank adapters to grow effective rank, achieving state-of-the-art performance without inference overhead.
GitHub stars n/a Velocity flat History 1 snapshot LLM Fine-tuning Apr 30 Code High viability
ClipTBP: Clip-Pair based Temporal Boundary Prediction with Boundary-Aware Learning for Moment Retrieval Build Now
A framework for video moment retrieval that improves accuracy by learning relationships between multiple relevant video segments.
GitHub stars n/a Velocity flat History 1 snapshot Video Understanding Apr 30 Code High viability
Robust Learning on Heterogeneous Graphs with Heterophily: A Graph Structure Learning Approach Build Now
HGUL: A framework for robust representation learning on heterogeneous graphs with heterophily by jointly handling noisy structures and learning adaptive affinities.
GitHub stars n/a Velocity flat History 1 snapshot Graph Neural Networks Apr 30 Code High viability
Beyond the Mean: Within-Model Reliable Change Detection for LLM Evaluation Build Now
A new metric for LLM evaluation that detects reliable changes in model performance beyond aggregate scores, revealing bidirectional improvements and deteriorations.
GitHub stars n/a Velocity flat History 1 snapshot LLM Evaluation Apr 30 Pending High viability
Collaborative Agent Reasoning Engineering (CARE): A Three-Party Design Methodology for Systematically Engineering AI Agents with Subject Matter Experts, Developers, and Helper Agents Build Now
A methodology for systematically engineering LLM agents with subject matter experts, developers, and helper agents, improving development efficiency and complex-query performance.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 30 Code High viability
Latent Adversarial Detection: Adaptive Probing of LLM Activations for Multi-Turn Attack Detection Build Now
Develops an activation-level detection system for multi-turn LLM prompt injection attacks, achieving high accuracy across multiple model families.
GitHub stars n/a Velocity flat History 1 snapshot LLM Security Apr 30 Code High viability
Machine Collective Intelligence for Explainable Scientific Discovery Build Now
Machine collective intelligence autonomously discovers explainable governing equations from data, outperforming deep learning in extrapolation and parameter efficiency.
GitHub stars n/a Velocity flat History 1 snapshot Scientific Discovery Apr 30 Code High viability
The Inverse-Wisdom Law: Architectural Tribalism and the Consensus Paradox in Agentic Swarms Build Now
This research introduces the 'Inverse-Wisdom Law' and 'Architectural Tribalism' to explain why agentic swarms can prioritize internal agreement over factual accuracy, proposing a 'Heterogeneity Mandate' for resilient architectures.
GitHub stars n/a Velocity flat History 1 snapshot Agentic Swarms Apr 30 Code High viability
Agent-Agnostic Evaluation of SQL Accuracy in Production Text-to-SQL Systems Build Now
A schema-agnostic evaluation framework for production Text-to-SQL systems that provides continuous monitoring and improvement feedback.
GitHub stars n/a Velocity flat History 1 snapshot Text-to-SQL Apr 30 Code High viability
Toward Autonomous SOC Operations: End-to-End LLM Framework for Threat Detection, Query Generation, and Resolution in Security Operations Build Now
An end-to-end LLM framework automating threat detection, query generation, and incident resolution in Security Operations Centers, reducing triage time from hours to minutes.
GitHub stars n/a Velocity flat History 1 snapshot Security AI Apr 30 Code High viability
PRISM: Pre-alignment via Black-box On-policy Distillation for Multimodal Reinforcement Learning Ignore
PRISM uses on-policy distillation to improve multimodal reinforcement learning by aligning policy with supervision distribution, reducing drift from supervised fine-tuning.
GitHub 4 stars Velocity flat History 1 snapshot Multimodal AI Apr 30 Pending
Simulating clinical interventions with a generative multimodal model of human physiology Watch
A generative multimodal model of human physiology that forecasts individual health trajectories and simulates interventions for personalized medicine.
GitHub stars n/a Velocity flat History 1 snapshot Generative Health Models Apr 30 High viability
How Generative AI Disrupts Search: An Empirical Study of Google Search, Gemini, and AI Overviews Watch
An empirical study and benchmark dataset analyzing how generative AI disrupts web search, comparing Google Search, Gemini, and AI Overviews to understand information presentation and source differences.
GitHub stars n/a Velocity flat History 1 snapshot Generative Search Apr 30 Code
Learning from Disagreement: Clinician Overrides as Implicit Preference Signals for Clinical AI in Value-Based Care Watch
Reframing clinician overrides as implicit preference signals for clinical AI, enabling robust learning in value-based care settings.
GitHub stars n/a Velocity flat History 1 snapshot Clinical AI Apr 30 High viability
AdaBFL: Multi-Layer Defensive Adaptive Aggregation for Byzantine-Robust Federated Learning Watch
A multi-layer adaptive aggregation method for Byzantine-robust federated learning that defends against complex attacks without requiring server-side data.
GitHub stars n/a Velocity flat History 1 snapshot Federated Learning Apr 30 Code
LLMs as ASP Programmers: Self-Correction Enables Task-Agnostic Nonmonotonic Reasoning Watch
A framework translating natural language to Answer Set Programming enables LLMs to perform task-agnostic nonmonotonic reasoning through automated self-correction.
GitHub stars n/a Velocity flat History 1 snapshot LLMs for Nonmonotonic Reasoning Apr 30 Code
Exploring Interaction Paradigms for LLM Agents in Scientific Visualization Watch
Evaluating LLM agents for scientific visualization tasks, comparing interaction paradigms and modalities for optimal workflow generation.
GitHub stars n/a Velocity flat History 1 snapshot LLM Agents Apr 30 Code
A Collective Variational Principle Unifying Bayesian Inference, Game Theory, and Thermodynamics Ignore
Introducing a unified framework where multi-agent systems performing local free-energy minimization implicitly implement a stochastic game.
GitHub stars n/a Velocity flat History 1 snapshot AI Theory Apr 30 Pending
HAVEN: Hybrid Automated Verification ENgine for UVM Testbench Synthesis with LLMs Watch
A hybrid engine that uses LLM agents and a Protocol-Aware DSL with templates to generate UVM testbenches and sequences for IC verification.
GitHub stars n/a Velocity flat History 1 snapshot Hardware Verification Apr 30
TopBench: A Benchmark for Implicit Prediction and Reasoning over Tabular Question Answering Watch
TopBench: A new benchmark for evaluating LLMs on implicit prediction and reasoning over tabular data.
GitHub stars n/a Velocity flat History 1 snapshot Table QA Apr 30 Code
Beyond the Training Distribution: Mapping Generalization Boundaries in Neural Program Synthesis Watch
A controlled environment and metric space to rigorously assess and improve the out-of-distribution generalization of neural program synthesis models.
GitHub stars n/a Velocity flat History 1 snapshot Program Synthesis Apr 30 Code
Pragmos: A Process Agentic Modeling System Watch
Pragmos: A prototype system for collaborative, explainable process modeling using LLMs and specialized tools.
GitHub stars n/a Velocity flat History 1 snapshot Process Agents Apr 30 Code
A Grid-Aware Agent-Based Model for Analyzing Electric Vehicle Charging Systems Ignore
This paper presents a configurable, grid-aware Agent-Based Model in Python for analyzing electric vehicle charging systems, integrating heterogeneous EV behavior and facility-level power dynamics.
GitHub stars n/a Velocity flat History 1 snapshot Simulation Apr 30 Code
Math Education Digital Shadows for facilitating learning with LLMs: Math performance, anxiety and confidence in simulated students and AIs Ignore
A dataset mapping LLM mathematical reasoning and confidence across simulated student personas, integrating self-efficacy and anxiety for safer AI tutors.
GitHub stars n/a Velocity flat History 1 snapshot AI in Education Apr 30 Code
Profiles of AI Dependency: A Latent Class Analysis of Filipino Students' Academic Competencies Ignore
Identifies distinct profiles of AI dependency among Filipino students, highlighting risks to academic competencies and advocating for AI literacy.
GitHub stars n/a Velocity flat History 1 snapshot AI in Education Apr 30 Code
Optimization before Evaluation: Evaluation with Unoptimised Prompts Can be Misleading Ignore
This research highlights the critical impact of prompt optimization on LLM evaluation, suggesting a need for more dynamic and model-specific assessment methods.
GitHub stars n/a Velocity flat History 1 snapshot LLM Evaluation Apr 30 Code
Belief-Guided Inference Control for Large Language Model Services via Verifiable Observations Watch
An adaptive inference control framework for black-box LLM services that balances response quality with computational cost using verifiable observations.
GitHub stars n/a Velocity flat History 1 snapshot LLM Services Apr 30
ITS-Mina: A Harris Hawks Optimization-Based All-MLP Framework with Iterative Refinement and External Attention for Multivariate Time Series Forecasting Ignore
An all-MLP framework for multivariate time series forecasting that uses iterative refinement and external attention, optimized by Harris Hawks Optimization for adaptive dropout tuning.
GitHub stars n/a Velocity flat History 1 snapshot Time Series Forecasting Apr 30 Code
To Build or Not to Build? Factors that Lead to Non-Development or Abandonment of AI Systems Ignore
Analyzes factors leading to non-development or abandonment of AI systems, identifying levers beyond ethical concerns.
GitHub stars n/a Velocity flat History 1 snapshot Responsible AI Apr 30 Code
From Mirage to Grounding: Towards Reliable Multimodal Circuit-to-Verilog Code Generation Ignore
A new method for reliable circuit diagram to RTL code generation that addresses the 'Mirage' phenomenon and achieves strong visual grounding.
GitHub stars n/a Velocity flat History 1 snapshot Multimodal Code Generation Apr 30
Measurement Risk in Supervised Financial NLP: Rubric and Metric Sensitivity on JF-ICR Ignore
A framework for auditing financial NLP benchmarks to ensure reliable model selection and deployment by addressing measurement risk.
GitHub stars n/a Velocity flat History 1 snapshot Financial NLP Apr 30 Code
Safe Bilevel Delegation (SBD): A Formal Framework for Runtime Delegation Safety in Multi-Agent Systems Ignore
A formal framework for runtime delegation safety in hierarchical multi-agent systems, balancing safety and efficiency.
GitHub stars n/a Velocity flat History 1 snapshot Multi-Agent Systems Apr 30 Code
When Agents Evolve, Institutions Follow Ignore
Exploring the co-evolution of agents and institutions to model societal dynamics.
GitHub 2 stars Velocity flat History 1 snapshot Agent Evolution and Institutional Dynamics Apr 30 Pending
Beyond Semantics: Measuring Fine-Grained Emotion Preservation in Small Language Model-Based Machine Translation Ignore
Evaluating the emotional nuance preservation of small language models in machine translation using a fine-grained emotion dataset.
GitHub stars n/a Velocity flat History 1 snapshot LLM Evaluation Apr 30 Code
Mechanized Foundations of Structural Governance: Machine-Checked Proofs for Governed Intelligence Ignore
Mechanized proofs for structural governance in cognitive workflow systems, including coinductive safety predicates and governance invariance theorems.
GitHub stars n/a Velocity flat History 1 snapshot AI Governance Apr 30 Pending
Political Bias Audits of LLMs Capture Sycophancy to the Inferred Auditor Ignore
This research reveals that LLM political bias audits are significantly influenced by sycophancy, demonstrating that bias is a response profile rather than a fixed ideology.
GitHub stars n/a Velocity flat History 1 snapshot LLM Bias Auditing Apr 30 Code
CoAX: Cognitive-Oriented Attribution eXplanation User Model of Human Understanding of AI Explanations Ignore
Develops cognitive models to understand and improve human comprehension of AI explanations, with available code for research.
GitHub stars n/a Velocity flat History 1 snapshot Explainable AI (XAI) Apr 30 Code
TypeBandit: Type-Level Context Allocation and Reweighting for Effective Attribute Completion in Heterogeneous Graph Neural Networks Ignore
A type-aware methodology for heterogeneous attribute completion in graphs, improving downstream learning by addressing information asymmetry.
GitHub stars n/a Velocity flat History 1 snapshot Graph Neural Networks Apr 30 Code
Efficient Training on Multiple Consumer GPUs with RoundPipe Build Now
RoundPipe is an open-source Python library that enables efficient fine-tuning of large language models on multiple consumer GPUs by breaking weight binding constraints.
GitHub stars n/a Velocity flat History 1 snapshot LLM Training & Optimization Apr 29 Pending High viability
Path-Lock Expert: Separating Reasoning Mode in Hybrid Thinking via Architecture-Level Separation Build Now
A novel LLM architecture that separates reasoning and non-reasoning modes using dedicated experts to eliminate leakage and improve conciseness in non-thinking responses.
GitHub stars n/a Velocity flat History 1 snapshot LLM Architecture Apr 29 Pending High viability
The Two Boundaries: Why Behavioral AI Governance Fails Structurally Ignore
A formal framework for analyzing the structural gap in AI system governance, proposing coterminous governance as a testable criterion.
GitHub stars n/a Velocity flat History 1 snapshot AI Governance Apr 30 Pending
Mapping how LLMs debate societal issues when shadowing human personality traits, sociodemographics and social media behavior Ignore
A synthetic corpus and interactive platform for analyzing LLM discourse across diverse human personas and societal topics to audit bias and social sensitivity.
GitHub stars n/a Velocity flat History 1 snapshot LLM Analysis Apr 30 Code
Intern-Atlas: A Methodological Evolution Graph as Research Infrastructure for AI Scientists Ignore
A methodological evolution graph for AI research that automatically identifies and links research methods to enable AI-driven scientific discovery.
GitHub stars n/a Velocity flat History 1 snapshot AI Research Infrastructure Apr 30 Code
What Makes a Good Terminal-Agent Benchmark Task: A Guideline for Adversarial, Difficult, and Legible Evaluation Design Ignore
Guidelines for creating adversarial, difficult, and legible benchmark tasks for terminal-agent evaluations to improve LLM capability assessment.
GitHub stars n/a Velocity flat History 1 snapshot LLM Evaluation Apr 30 Code
ConformaDecompose: Explaining Uncertainty via Calibration Localization Build Now
This framework explains uncertainty in conformal predictions by localizing calibration, enhancing interpretability without altering the predictor.
GitHub stars n/a Velocity flat History 1 snapshot Explainable AI Apr 29 Code High viability
Lightweight Distillation of SAM 3 and DINOv3 for Edge-Deployable Individual-Level Livestock Monitoring and Longitudinal Visual Analytics Build Now
Lightweight distillation of foundation models for edge-deployable livestock monitoring, enabling longitudinal visual analytics with significantly reduced computational footprint.
GitHub stars n/a Velocity flat History 1 snapshot Edge AI / Computer Vision Apr 29 Code High viability
Theory Under Construction: Orchestrating Language Models for Research Software Where the Specification Evolves Build Now
An AI-powered automaton that orchestrates research software development by coupling ideation, implementation, and documentation to prevent hallucination and desynchronization.
GitHub stars n/a Velocity flat History 1 snapshot LLM Orchestration Apr 29 Code High viability
Step-level Optimization for Efficient Computer-use Agents Build Now
An event-driven agent framework that optimizes compute by using small policies by default and escalating to larger models only when risk is detected.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 29 Code High viability
Reconstruction by Generation: 3D Multi-Object Scene Reconstruction from Sparse Observations Build Now
RecGen: a generative framework for reconstructing complex 3D multi-object scenes from sparse observations with state-of-the-art performance.
GitHub stars n/a Velocity flat History 1 snapshot 3D Scene Reconstruction Apr 29 Code High viability
Towards Accelerated SCF Workflows with Equivariant Density-Matrix Learning and Analytic Refinement Build Now
This paper presents a physically constrained equivariant model that predicts density matrices for accelerated self-consistent field workflows in computational chemistry, reducing iteration steps by up to 81%.
GitHub stars n/a Velocity flat History 1 snapshot Scientific ML Apr 29 Code High viability
TRUST: A Framework for Decentralized AI Service v.0.1 Build Now
A decentralized framework for trustworthy AI services, enabling robust, scalable, and private verification of large reasoning models through hierarchical auditing and consensus.
GitHub stars n/a Velocity flat History 1 snapshot Decentralized AI Apr 29 Code High viability
Compliance versus Sensibility: On the Reasoning Controllability in Large Language Models Build Now
This research offers a method to improve LLM instruction following by decoupling reasoning patterns from specific tasks, potentially leading to more controllable and faithful AI.
GitHub stars n/a Velocity flat History 1 snapshot LLM Controllability Apr 29 Code High viability
Evaluating TabPFN for Mild Cognitive Impairment to Alzheimer's Disease Conversion in Data Limited Settings Build Now
A foundation model for predicting Alzheimer's disease conversion from Mild Cognitive Impairment, outperforming traditional methods in data-limited settings.
GitHub stars n/a Velocity flat History 1 snapshot Medical AI Apr 29 Code High viability
Think it, Run it: Autonomous ML pipeline generation via self-healing multi-agent AI Build Now
An AI system that automatically generates end-to-end machine learning pipelines from data and natural language goals, improving efficiency and robustness.
GitHub stars n/a Velocity flat History 1 snapshot MLOps Apr 29 Code High viability
A Gated Hybrid Contrastive Collaborative Filtering Recommendation Build Now
A gated hybrid collaborative filtering framework that integrates review semantics for improved ranking in recommender systems.
GitHub stars n/a Velocity flat History 1 snapshot Recommendation Systems Apr 29 Code High viability
SpatialGrammar: A Domain-Specific Language for LLM-Based 3D Indoor Scene Generation Ignore
A domain-specific language and agent for generating spatially accurate 3D indoor scenes from natural language, improving fidelity and plausibility.
GitHub stars n/a Velocity flat History 1 snapshot Generative 3D Scenes Apr 30
Learning When to Remember: Risk-Sensitive Contextual Bandits for Abstention-Aware Memory Retrieval in LLM-Based Coding Agents Watch
A risk-sensitive bandit controller for LLM coding agents that learns when to use external memory to avoid unsafe injections.
GitHub stars n/a Velocity flat History 1 snapshot LLM Agents Apr 30
Synthetic Computers at Scale for Long-Horizon Productivity Simulation Ignore
A scalable methodology for creating synthetic computer environments and simulating long-horizon productivity tasks to train agents for improved performance.
GitHub stars n/a Velocity flat History 1 snapshot Agent Simulation Apr 30
PRTS: A Primitive Reasoning and Tasking System via Contrastive Representations Ignore
A Vision-Language-Action foundation model for robotics that reformulates pretraining using goal-conditioned reinforcement learning to improve task progress understanding.
GitHub stars n/a Velocity flat History 1 snapshot Robotics Apr 30 Pending
AgentEconomist: An End-to-end Agentic System Translating Economic Intuitions into Executable Computational Experiments Ignore
An end-to-end system that translates economic intuitions into executable computational experiments, outperforming generic LLMs in research idea generation.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 30
From Context to Skills: Can Language Models Learn from Context Skillfully? Ignore
A self-evolving framework that autonomously discovers, refines, and selects context-specific skills for language models without human supervision.
GitHub stars n/a Velocity flat History 1 snapshot LLM Agents Apr 30 Pending
Graph World Models: Concepts, Taxonomy, and Future Directions Ignore
A survey and taxonomy of graph-based world models for AI agents, exploring relational inductive biases for improved reasoning and planning.
GitHub stars n/a Velocity flat History 1 snapshot Graph World Models Apr 30 Code
ANCORA: Learning to Question via Manifold-Anchored Self-Play for Verifiable Reasoning Ignore
An anchored-curriculum framework where a unified policy learns to generate verifiable problems, solve them, and self-improve using feedback.
GitHub stars n/a Velocity flat History 1 snapshot Verifiable Reasoning Apr 30 Pending
Investigating More Explainable and Partition-Free Compositionality Estimation for LLMs: A Rule-Generation Perspective Ignore
A novel rule-generation perspective for estimating LLM compositionality, addressing explainability and data leakage issues.
GitHub stars n/a Velocity flat History 1 snapshot LLM Evaluation Apr 30 Code
ObjectGraph: From Document Injection to Knowledge Traversal -- A Native File Format for the Agentic Era Ignore
Develop a native file format for enhanced document and knowledge traversal in digital ecosystems.
GitHub stars n/a Velocity flat History 1 snapshot File Format & Knowledge Representation Apr 30 Code
Modeling Clinical Concern Trajectories in Language Model Agents Ignore
Introduces a lightweight agent architecture with explicit state dynamics to model clinical concern trajectories and provide pre-escalation signals.
GitHub stars n/a Velocity flat History 1 snapshot Medical AI Apr 30
PROMISE-AD: Progression-aware Multi-horizon Survival Estimation for Alzheimer's Disease Progression and Dynamic Tracking Ignore
A leakage-safe survival framework for predicting Alzheimer's disease progression and dynamic tracking using temporal Transformers.
GitHub stars n/a Velocity flat History 1 snapshot Medical AI Apr 30
In-Context Prompting Obsoletes Agent Orchestration for Procedural Tasks Ignore
A new approach to procedural tasks using in-context prompting for LLM self-orchestration, outperforming traditional agent orchestration frameworks.
GitHub stars n/a Velocity flat History 1 snapshot LLM Agents Apr 30
Crab: A Semantics-Aware Checkpoint/Restore Runtime for Agent Sandboxes Ignore
A transparent runtime that bridges the agent-OS semantic gap for efficient and correct checkpoint/restore of agent sandboxes.
GitHub stars n/a Velocity flat History 1 snapshot Agent Sandboxing Runtime Apr 30
Training-Free Tunnel Defect Inspection and Engineering Interpretation via Visual Recalibration and Entity Reconstruction Ignore
A training-free framework for tunnel defect inspection and engineering interpretation using visual recalibration and entity reconstruction.
GitHub stars n/a Velocity flat History 1 snapshot Industrial Inspection Apr 30
Knowledge Graph Representations for LLM-Based Policy Compliance Reasoning Ignore
An agentic framework that constructs knowledge graphs from AI policy documents to reason about policy compliance.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 30
Autonomous Traffic Signal Optimization Using Digital Twin and Agentic AI for Real-Time Decision-Making Ignore
Agentic AI optimizes traffic signals in real time using a digital twin and existing traffic management APIs.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 30
Consumer Attitudes Towards AI in Digital Health: A Mixed-Methods Survey in Australia Ignore
Consumer attitudes towards AI in digital health are mixed, with preference for AI summaries based on communication quality and human governance.
GitHub stars n/a Velocity flat History 1 snapshot Medical AI Apr 30
Towards Neuro-symbolic Causal Rule Synthesis, Verification, and Evaluation Grounded in Legal and Safety Principles Ignore
A neuro-symbolic framework for scalable, explainable, and verifiable rule synthesis in safety-critical domains using LLMs.
GitHub stars n/a Velocity flat History 1 snapshot Neuro-symbolic AI Apr 30
Generative structure search for efficient and diverse discovery of molecular and crystal structures Ignore
A novel framework unifies diffusion-based generation and random structure search for efficient and diverse discovery of molecular and crystal structures.
GitHub stars n/a Velocity flat History 1 snapshot Materials Discovery Apr 30 Pending
Building Persona-Based Agents On Demand: Tailoring Multi-Agent Workflows to User Needs Ignore
Enables on-demand persona-based agent generation to dynamically tailor multi-agent workflows to user needs and task contexts.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 30
PALCAS: A Priority-Aware Intelligent Lane Change Advisory System for Autonomous Vehicles using Federated Reinforcement Learning Watch
A priority-aware federated reinforcement learning system for autonomous vehicles to optimize lane changes based on destination urgency.
GitHub stars n/a Velocity flat History 1 snapshot Autonomous Vehicles Apr 29 Code
When Roles Fail: Epistemic Constraints on Advocate Role Fidelity in LLM-Based Political Statement Analysis Watch
This paper introduces a method to measure and improve role fidelity in LLM-based political statement analysis systems, revealing failure modes and outperforming existing models.
GitHub stars n/a Velocity flat History 1 snapshot LLM Agents Apr 29 Code
ABC: Any-Subset Autoregression via Non-Markovian Diffusion Bridges in Continuous Time and Space Ignore
A new diffusion model approach for generating continuous-time, continuous-space stochastic processes like videos and weather forecasts, conditioned on partial observations.
GitHub stars n/a Velocity flat History 1 snapshot Generative Models Apr 30
Auditing Frontier Vision-Language Models for Trustworthy Medical VQA: Grounding Failures, Format Collapse, and Domain Adaptation Ignore
Auditing frontier vision-language models for trustworthiness in medical VQA, identifying grounding failures and domain adaptation needs.
GitHub stars n/a Velocity flat History 1 snapshot Medical AI Apr 30
The Effects of Visual Priming on Cooperative Behavior in Vision-Language Models Ignore
Investigating how visual priming affects the cooperative behavior of Vision-Language Models in the Iterated Prisoner's Dilemma.
GitHub stars n/a Velocity flat History 1 snapshot Vision-Language Models Apr 30
In-Context Examples Suppress Scientific Knowledge Recall in LLMs Ignore
Demonstrates that in-context examples can suppress scientific knowledge recall in LLMs, shifting computation towards empirical pattern fitting rather than knowledge-driven derivation.
GitHub stars n/a Velocity flat History 1 snapshot LLM Reasoning Apr 30
Enhancing Linux Privilege Escalation Attack Capabilities of Local LLM Agents Ignore
Enhancing the Linux privilege escalation attack capabilities of local LLM agents.
GitHub stars n/a Velocity flat History 1 snapshot Cybersecurity Tools Apr 29 Code
Exploring the Adoption Intention in Using AI-Enabled Educational Tools Among Preservice Teachers in the Philippines: A Partial-Least Square Modeling Ignore
Examines factors influencing pre-service teachers' intention to use AI-enabled educational tools, finding personal motivation more impactful than institutional factors.
GitHub stars n/a Velocity flat History 1 snapshot AI in Education Apr 30
RHyVE: Competence-Aware Verification and Phase-Aware Deployment for LLM-Generated Reward Hypotheses Ignore
A protocol for verifying and deploying LLM-generated reward hypotheses in reinforcement learning based on policy competence and training phase.
GitHub stars n/a Velocity flat History 1 snapshot Reinforcement Learning Apr 30
Design Structure Matrix Modularization with Large Language Models Ignore
Leveraging LLMs for Design Structure Matrix modularization without domain knowledge, achieving near-reference quality in engineering design.
GitHub stars n/a Velocity flat History 1 snapshot Engineering Design Optimization Apr 30
One Single Hub Text Breaks CLIP: Identifying Vulnerabilities in Cross-Modal Encoders via Hubness Ignore
Identifies vulnerabilities in cross-modal encoders by detecting single hub texts that disproportionately achieve high similarity scores with many images, impacting retrieval and evaluation.
GitHub stars n/a Velocity flat History 1 snapshot Cross-Modal AI Apr 30
Characterizing the Consistency of the Emergent Misalignment Persona Ignore
Investigating how consistent the emergent misalignment persona of large language models is after fine-tuning on misaligned data.
GitHub stars n/a Velocity flat History 1 snapshot LLM Alignment Apr 30
A Pattern Language for Resilient Visual Agents Ignore
A pattern language for resilient visual agents, separating fast reflexes from slow supervision in enterprise ecosystems.
GitHub stars n/a Velocity flat History 1 snapshot Robotics/Agents Apr 30
AI Inference as Relocatable Electricity Demand: A Latency-Constrained Energy-Geography Framework Ignore
This paper introduces an energy-geography framework for geo-distributed AI inference, modeling computation placement as a constrained optimization problem considering electricity prices, carbon intensity, and latency.
GitHub stars n/a Velocity flat History 1 snapshot AI Infrastructure Apr 30
Test Before You Deploy: Governing Updates in the LLM Supply Chain Ignore
A framework for governing LLM updates in the software supply chain, proposing production contracts, risk-category-based testing, and compatibility gates to manage behavioral drift.
GitHub stars n/a Velocity flat History 1 snapshot LLM Governance Apr 30
How to Guide Your Flow: Few-Step Alignment via Flow Map Reward Guidance Watch
Flow Map Reward Guidance offers a training-free, single-trajectory method for generative model guidance, achieving state-of-the-art results with significantly fewer steps.
GitHub stars n/a Velocity flat History 1 snapshot Generative Models Apr 29
Anomaly Detection in Soil Heavy Metal Contamination Using Unsupervised Learning for Environmental Risk Assessment Watch
Unsupervised machine learning for anomaly detection in soil heavy metal contamination to enable targeted environmental risk assessment.
GitHub stars n/a Velocity flat History 1 snapshot Environmental AI Apr 29 Code
Useless but Safe? Benchmarking Utility Recovery with User Intent Clarification in Multi-Turn Conversations Watch
A benchmark for evaluating if LLMs can safely recover utility by clarifying user intent in multi-turn conversations.
GitHub stars n/a Velocity flat History 1 snapshot LLM Safety & Alignment Apr 29 Code
From LLM-Driven Trading Card Generation to Procedural Relatedness: A Pokémon Case Study Ignore
Investigating the use of LLMs and Diffusion Models for procedural content generation of trading cards to create personalized and dynamic designs.
GitHub stars n/a Velocity flat History 1 snapshot Generative AI Apr 30
Evaluating Epistemic Guardrails in AI Reading Assistants: A Behavioral Audit of a Minimal Prototype Ignore
This paper proposes a protocol and empirical analysis for evaluating the 'epistemic guardrails' of AI reading assistants to understand how they manage interpretive work.
GitHub stars n/a Velocity flat History 1 snapshot AI Reading Assistants Apr 30
Do Sparse Autoencoders Capture Concept Manifolds? Ignore
Develops a theoretical framework to understand how sparse autoencoders capture concept manifolds, identifying suboptimal recovery and motivating new interpretability methods.
GitHub stars n/a Velocity flat History 1 snapshot LLM Interpretability Apr 30 Pending
Rethinking Agentic Reinforcement Learning In Large Language Models Ignore
This paper explores the conceptual foundations and future directions of agentic reinforcement learning within large language models, focusing on autonomous agents capable of complex reasoning and planning.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 30
Trace-Level Analysis of Information Contamination in Multi-Agent Systems Ignore
Analyzing how uncertainty in data affects the execution and outcomes of multi-agent workflows.
GitHub stars n/a Velocity flat History 1 snapshot Multi-Agent Systems Apr 30
Security Attack and Defense Strategies for Autonomous Agent Frameworks: A Layered Review with OpenClaw as a Case Study Ignore
A layered review of security risks and defense strategies for autonomous agent frameworks, using OpenClaw as a case study.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 30
Sampler-Robust Optimization under Generative Models Ignore
Sampler-Robust Optimization (SRO) optimizes decisions against the worst-case sampler induced by perturbing learned generative models.
GitHub stars n/a Velocity flat History 1 snapshot Optimization Apr 30
Self-Evolving Software Agents Watch
This work introduces self-evolving software agents that combine BDI reasoning with LLMs to autonomously update their goals, reasoning, and code in dynamic environments.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 29
OptimusKG: Unifying biomedical knowledge in a modern multimodal graph Watch
OptimusKG is a multimodal biomedical knowledge graph unifying diverse data sources with schema constraints, validated by a multimodal agent to capture knowledge that may precede scientific literature synthesis.
GitHub stars n/a Velocity flat History 1 snapshot Biomedical Knowledge Graphs Apr 29
Preserving Temporal Dynamics in Time Series Generation Ignore
A model-agnostic framework using MCMC to preserve temporal dynamics in synthetic time series generation for improved forecasting.
GitHub stars n/a Velocity flat History 1 snapshot Time Series Generation Apr 29 Code
Unsupervised Electrofacies Classification and Porosity Characterization in the Offshore Keta Basin Using Wireline Logs Ignore
An unsupervised machine learning workflow for electrofacies classification and porosity characterization using wireline logs in offshore basins with scarce core data.
GitHub stars n/a Velocity flat History 1 snapshot Geospatial AI Apr 29 Code
Knowledge Affordances for Hybrid Human-AI Information Seeking Ignore
A conceptual framework for agents to identify and leverage knowledge sources in hybrid human-AI environments.
GitHub stars n/a Velocity flat History 1 snapshot AI Agents Apr 30
Statistical Channel Fingerprint Construction for Massive MIMO: A Unified Tensor Learning Framework Ignore
A tensor learning framework for constructing statistical channel fingerprints in massive MIMO communication systems.
GitHub stars n/a Velocity flat History 1 snapshot Signal Processing Apr 30
Splitting Argumentation Frameworks with Collective Attacks and Supports Ignore
Novel splitting techniques for argumentation formalisms that incorporate collective attacks and supports, generalizing existing frameworks.
GitHub stars n/a Velocity flat History 1 snapshot AI Reasoning Apr 30
Computing Equilibrium beyond Unilateral Deviation Ignore
This paper introduces a new equilibrium concept in game theory that guarantees existence by minimizing coalitional deviation incentives, with algorithms for average and maximum gain objectives.
GitHub stars n/a Velocity flat History 1 snapshot Game Theory Apr 30
Toward Personalized Digital Twins for Cognitive Decline Assessment: A Multimodal, Uncertainty-Aware Framework Ignore
This framework proposes personalized digital twins for cognitive decline assessment using multimodal data and uncertainty awareness, showing feasibility in preliminary studies.
GitHub stars n/a Velocity flat History 1 snapshot Medical AI Apr 29
Instruction Complexity Induces Positional Collapse in Adversarial LLM Evaluation Ignore
This paper investigates how adversarial instructions cause LLMs to rely on positional shortcuts rather than content engagement, revealing vulnerabilities in evaluation methods.
GitHub stars n/a Velocity flat History 1 snapshot LLM Evaluation Apr 29
Normativity and Productivism: Ableist Intelligence? A Degrowth Analysis of AI Sign Language Translation Tools for Deaf People Ignore
Critiques AI sign language translation tools for perpetuating ableism and standardizing language for profit, arguing for a degrowth approach.
GitHub stars n/a Velocity flat History 1 snapshot AI Ethics Apr 30
Focus Session: Autonomous Systems Dependability in the era of AI: Design Challenges in Safety, Security, Reliability and Certification Ignore
This paper explores design challenges and emerging methodologies for ensuring dependability in autonomous systems integrating AI/ML components.
GitHub stars n/a Velocity flat History 1 snapshot AI Safety & Reliability Apr 30
Leading Across the Spectrum of Human-AI Relationships: A Conceptual Framework for Increasingly Heterogeneous Teams Ignore
A conceptual framework to help leaders understand and manage the evolving spectrum of human-AI relationships in decision-making teams.
GitHub stars n/a Velocity flat History 1 snapshot Human-AI Collaboration Apr 30
Taming the Centaur(s) with LAPITHS: a framework for a theoretically grounded interpretation of AI performances Ignore
A framework for theoretically grounded interpretation of AI performances, questioning claims of human-like cognition in large language models.
GitHub stars n/a Velocity flat History 1 snapshot AI Interpretability Apr 30
Splitting Assumption-Based Argumentation Frameworks Ignore
Investigating splitting on the knowledge base rather than graph instantiation for computationally intractable Assumption-Based Argumentation frameworks.
GitHub stars n/a Velocity flat History 1 snapshot Argumentation Frameworks Apr 30
When Does Structure Matter in Continual Learning? Dimensionality Controls When Modularity Shapes Representational Geometry Ignore
Investigating how network architecture, task similarity, and representational dimensionality jointly shape learning in continual learning systems.
GitHub stars n/a Velocity flat History 1 snapshot Continual Learning Apr 30
Mapping the Methodological Space of Classroom Interaction Research: Scale, Duration, and Modality in an Age of AI Ignore
A framework mapping classroom interaction research dimensions to guide research and tool design in the age of AI.
GitHub stars n/a Velocity flat History 1 snapshot AI in Education Apr 30
Fairness for distribution network operations and planning Ignore
A review of fairness notions and metrics for distribution network planning and operation, aiming to support transparent decision-making in resource allocation.
GitHub stars n/a Velocity flat History 1 snapshot AI for Energy Apr 30
Learning to Spend: Model Predictive Control for Budgeting under Non-Stationary Returns Ignore
Investigating Model Predictive Control for budget allocation under non-stationary returns, showing benefits only when return efficiency has predictable structure.
GitHub stars n/a Velocity flat History 1 snapshot Algorithmic Trading Apr 29
When 2D Tasks Meet 1D Serialization: On Serialization Friction in Structured Tasks Ignore
This paper investigates 'serialization friction' in LLMs, showing that representing 2D structured tasks as 1D sequences hinders performance compared to vision-augmented pathways that preserve spatial layout.
GitHub stars n/a Velocity flat History 1 snapshot LLM Input Representation Apr 29
Learning Rate Transfer in Normalized Transformers Ignore
A novel parameterization for Normalized Transformers (νGPT) that enables learning rate transfer across model dimensions and token horizons.
GitHub stars n/a Velocity flat History 1 snapshot LLM Training Apr 29
Addressing the Reality Gap: A Three-Tension Framework for Agentic AI Adoption Ignore
This framework helps educational institutions navigate the adoption of agentic AI by balancing implementation feasibility, adaptation speed, and mission alignment.
GitHub stars n/a Velocity flat History 1 snapshot AI in Education Apr 29
Optimal Stop-Loss and Take-Profit Parameterization for Autonomous Trading Agent Swarm Ignore
This paper proposes a framework for systematically tuning exit strategies in autonomous trading agents to improve risk-adjusted performance.
GitHub stars n/a Velocity flat History 1 snapshot Trading Agents Apr 29
What Suppresses Nash Equilibrium Play in Large Language Models? Mechanistic Evidence and Causal Control Ignore
Investigating why LLM agents deviate from Nash equilibria by analyzing internal mechanisms and demonstrating causal control over strategic behavior.
GitHub stars n/a Velocity flat History 1 snapshot LLM Agents Apr 29
Why Self-Supervised Encoders Want to Be Normal Ignore
A theoretical framework explores the geometric and information-theoretic principles of encoder-decoder learning using the Information Bottleneck principle.
GitHub stars n/a Velocity flat History 1 snapshot LLM Training Apr 30
Developing a gradient descent-based, physics-constrained Jacobian version of an FCM with residual memory and backpropagation through time.
GitHub stars n/a Velocity flat History 1 snapshot Neural Networks Apr 30
Upskilling with Generative AI: Practices and Challenges for Freelance Knowledge Workers Ignore
This research explores how freelance knowledge workers use generative AI for upskilling and identifies challenges in validating acquired skills, offering design recommendations for learning tools.
GitHub stars n/a Velocity flat History 1 snapshot AI in Education Apr 29
From Prompt to Physical Actuation: Holistic Threat Modeling of LLM-Enabled Robotic Systems Ignore
This paper models threats in LLM-enabled robotic systems, analyzing how conventional, adversarial, and conversational threats interact across architectural boundaries to cause unsafe physical actuation.
GitHub stars n/a Velocity flat History 1 snapshot LLM Security Apr 29
Unpacking Vibe Coding: Help-Seeking Processes in Student-AI Interactions While Programming Ignore
Analyzing student-AI interactions in programming education to understand help-seeking behaviors and inform the design of more effective AI teammates.
GitHub stars n/a Velocity flat History 1 snapshot AI in Education Apr 29
Interval Orders, Biorders and Credibility-limited Belief Revision Ignore
Exploring interval orders and biorders for rational belief revision, introducing new operators and addressing consistency issues.
GitHub stars n/a Velocity flat History 1 snapshot Belief Revision Apr 29