Bian Que: An Agentic Framework with Flexible Skill Arrangement for Online System Operations Build Now
Bian Que automates system operation tasks using LLMs, improving efficiency and accuracy in e-commerce environments.
GitHub 1 star Velocity flat History 1 snapshot AI System Operations Apr 29 Pending High viability
MappingEvolve: LLM-Driven Code Evolution for Technology Mapping Build Now
An open-source framework using LLMs to evolve technology mapping code for logic synthesis, significantly outperforming existing methods in area reduction.
GitHub 100 stars Velocity flat History 1 snapshot Code Generation Apr 29 Pending High viability
OMEGA: Optimizing Machine Learning by Evaluating Generated Algorithms Build Now
An end-to-end framework that automates AI research by generating novel ML classifiers with executable code, outperforming baselines and available as a Python package.
GitHub 1870 stars Velocity flat History 1 snapshot AI Research Automation Apr 29 Pending High viability
Entropy Centroids as Intrinsic Rewards for Test-Time Scaling Build Now
Entropy Centroids offer an intrinsic reward for LLM test-time scaling by identifying temporal patterns of model uncertainty, enabling selection of higher quality responses without external reward models.
GitHub 3 stars Velocity flat History 1 snapshot LLM Inference Optimization Apr 28 Pending High viability
Turning the TIDE: Cross-Architecture Distillation for Diffusion Large Language Models Build Now
A framework for cross-architecture knowledge distillation in diffusion large language models, significantly improving performance on code generation tasks.
GitHub 58 stars Velocity flat History 1 snapshot LLM Training Apr 29 Pending High viability
Seeking Consensus: Geometric-Semantic On-the-Fly Recalibration for Open-Vocabulary Remote Sensing Semantic Segmentation Build Now
SeeCo is a plug-and-play framework that recalibrates open-vocabulary models on-the-fly for improved semantic segmentation in remote sensing images.
GitHub 713 stars Velocity flat History 1 snapshot Remote Sensing AI Apr 29 Pending High viability
Test-Time Safety Alignment Build Now
Test-Time Safety Alignment optimizes input word embeddings using a text-moderation API to minimize semantic harmfulness in aligned LLM responses, neutralizing safety-flagged outputs.
GitHub 0 stars Velocity flat History 1 snapshot LLM Safety Apr 28 Pending High viability
Progressive Semantic Communication for Efficient Edge-Cloud Vision-Language Models Build Now
A progressive semantic communication framework for efficient edge-cloud vision-language model inference, reducing latency and communication costs.
GitHub 0 stars Velocity flat History 1 snapshot Edge AI Apr 29 Pending High viability
ATLAS: An Annotation Tool for Long-horizon Robotic Action Segmentation Build Now
An annotation tool for long-horizon robotic action segmentation that synchronizes multi-modal data and streamlines the annotation process.
GitHub 2 stars Velocity flat History 1 snapshot Robotics Apr 29 Pending High viability
LATTICE: Evaluating Decision Support Utility of Crypto Agents Build Now
LATTICE is a benchmark for evaluating crypto agents' decision support utility in real-world scenarios, using LLM judges for scalable assessment.
GitHub 1 star Velocity flat History 1 snapshot Agents Apr 29 Pending High viability
reward-lens: A Mechanistic Interpretability Library for Reward Models Build Now
An open-source library for mechanistic interpretability of reward models, enabling deeper understanding of RLHF-trained LLMs.
GitHub 1 star Velocity flat History 1 snapshot LLM Interpretability Apr 28 Pending High viability
Option-Order Randomisation Reveals a Distributional Position Attractor in Prompted Sandbagging Build Now
Reveals a stable, content-invariant distributional attractor in LLM response positions under sandbagging, offering a behavioral signature for this mode of operation.
GitHub 0 stars Velocity flat History 1 snapshot LLM Behavior Analysis Apr 29 Pending High viability
DepthPilot: From Controllability to Interpretability in Colonoscopy Video Generation Build Now
DepthPilot is an interpretable framework for generating realistic and clinically accurate colonoscopy videos, enabling 3D reconstruction and surgical navigation.
GitHub stars n/a Velocity flat History 1 snapshot Medical Video Generation Apr 29 Code High viability
CheXthought: A global multimodal dataset of clinical chain-of-thought reasoning and visual attention for chest X-ray interpretation Build Now
CheXthought provides a multimodal dataset of clinical reasoning and visual attention for chest X-rays, enabling more transparent and interpretable AI models.
GitHub stars n/a Velocity flat History 1 snapshot Medical AI Apr 29 Code High viability
SecMate: Multi-Agent Adaptive Cybersecurity Troubleshooting with Tri-Context Personalization Build Now
SecMate: An AI-powered multi-agent VCA for adaptive cybersecurity troubleshooting in SMBs, integrating personalized tri-context assistance.
GitHub stars n/a Velocity flat History 1 snapshot Cybersecurity Automation Apr 29 Code High viability
Operating-Layer Controls for Onchain Language-Model Agents Under Real Capital Build Now
Operating-layer controls for autonomous language-model agents that trade on-chain with real capital, aimed at reliable transaction validation.
GitHub stars n/a Velocity flat History 1 snapshot Finance & Trading Apr 28 Code High viability
HalluCiteChecker: A Lightweight Toolkit for Hallucinated Citation Detection and Verification in the Era of AI Scientists Build Now
HalluCiteChecker is a lightweight, offline toolkit for detecting and verifying hallucinated citations in scientific papers, reducing reviewer workload.
GitHub 0 stars Velocity flat History 1 snapshot AI Writing Tools Apr 29 Pending High viability
Translating Under Pressure: Domain-Aware LLMs for Crisis Communication Build Now
A domain-adaptive pipeline for improving multilingual crisis communication through fine-tuning language models.
GitHub stars n/a Velocity flat History pending Crisis Communication Apr 29 Code High viability
Lyapunov-Guided Self-Alignment: Test-Time Adaptation for Offline Safe Reinforcement Learning Watch
A transformer-based framework for offline safe reinforcement learning that uses Lyapunov-guided imagination for test-time adaptation to ensure safety without retraining.
GitHub 0 stars Velocity flat History 1 snapshot Safe Reinforcement Learning Apr 29 Pending
FruitProM-V2: Robust Probabilistic Maturity Estimation and Detection of Fruits and Vegetables Watch
A probabilistic approach to fruit maturity estimation that models maturity as a continuous variable, improving robustness to label noise.
GitHub 713 stars Velocity flat History 1 snapshot Computer Vision Apr 28 Pending
Evaluating Strategic Reasoning in Forecasting Agents Build Now
A new benchmark and forecaster that evaluates strategic reasoning in agents by analyzing their research and judgment processes.
GitHub stars n/a Velocity flat History 1 snapshot Forecasting Agents Apr 28 Code High viability
Atomic-Probe Governance for Skill Updates in Compositional Robot Policies Build Now
An atomic-quality probe and hybrid selector for governing skill updates in compositional robot policies, improving reliability and reducing costs.
GitHub stars n/a Velocity flat History 1 snapshot Robotics Apr 29 Code High viability
Training Computer Use Agents to Assess the Usability of Graphical User Interfaces Build Now
Trains computer-use agents to assess the usability of graphical user interfaces, reducing costly and time-intensive manual testing.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 28 Code High viability
Evergreen: Efficient Claim Verification for Semantic Aggregates Build Now
Evergreen optimizes LLM claim verification for semantic aggregates by recasting it as a semantic query processing task, reducing cost and latency with tailored optimizations and provenance capture.
GitHub stars n/a Velocity flat History 1 snapshot LLM Optimization Apr 28 Code High viability
ClawGym: A Scalable Framework for Building Effective Claw Agents Watch
ClawGym is a scalable framework designed to advance the development of environment-grounded autonomous agents with verifiable training and evaluation tools.
GitHub stars n/a Velocity flat History 1 snapshot AI Framework Apr 29 Pending
SynSur: An end-to-end generative pipeline for synthetic industrial surface defect generation and detection Build Now
An end-to-end pipeline for generating realistic synthetic industrial surface defects to overcome data scarcity in AI-powered detection systems.
GitHub stars n/a Velocity flat History 1 snapshot Synthetic Data Generation for Industrial AI Apr 29 Code High viability
Tree-of-Text: A Tree-based Prompting Framework for Table-to-Text Generation in the Sports Domain Build Now
Tree-of-Text: A tree-structured prompting framework for efficient and accurate sports report generation from tables.
GitHub stars n/a Velocity flat History 1 snapshot Table-to-Text Generation Apr 29 Code High viability
Domain-Adapted Small Language Models for Reliable Clinical Triage Build Now
Fine-tuned small language models offer a privacy-preserving, accurate solution for clinical triage decision support, outperforming proprietary LLMs.
GitHub stars n/a Velocity flat History 1 snapshot Medical AI Apr 29 Code High viability
When to Retrieve During Reasoning: Adaptive Retrieval for Large Reasoning Models Build Now
A reasoning-aware retrieval framework that intelligently injects evidence during multi-step inference for large language models, improving accuracy and efficiency.
GitHub stars n/a Velocity flat History 1 snapshot Retrieval-Augmented Generation Apr 29 Code High viability
Delineating Knowledge Boundaries for Honest Large Vision-Language Models Build Now
Enables large vision-language models to refuse queries beyond their knowledge boundaries, improving trustworthiness in specialized domains.
GitHub stars n/a Velocity flat History 1 snapshot Vision-Language Models Apr 29 Code High viability
Unified 4D World Action Modeling from Video Priors with Asynchronous Denoising Build Now
A unified 4D world model for robots that synthesizes high-fidelity video and 3D reconstructions while enabling real-time action execution.
GitHub stars n/a Velocity flat History 1 snapshot Robotics Apr 29 Code High viability
Naamah: A Large Scale Synthetic Sanskrit NER Corpus via DBpedia Seeding and LLM Generation Build Now
Naamah is a large-scale, high-quality synthetic Sanskrit NER dataset generated using DBpedia and LLMs, addressing a critical gap for classical Sanskrit literature digitization.
GitHub stars n/a Velocity flat History 1 snapshot NLP Data Generation Apr 29 Code High viability
A Data-Centric Framework for Intraoperative Fluorescence Lifetime Imaging for Glioma Surgical Guidance Build Now
A data-centric AI framework improves intraoperative fluorescence lifetime imaging for glioma surgical guidance by enhancing data reliability and model robustness.
GitHub stars n/a Velocity flat History 1 snapshot Medical AI Apr 28 Code High viability
DSIPA: Detecting LLM-Generated Texts via Sentiment-Invariant Patterns Divergence Analysis Build Now
A training-free framework for detecting LLM-generated text by analyzing sentiment distributional stability, offering robust and interpretable content identification.
GitHub stars n/a Velocity flat History 1 snapshot LLM Security Apr 29 Code High viability
FutureWorld: A Live Environment for Training Predictive Agents with Real-World Outcome Rewards Build Now
FutureWorld is a live reinforcement learning environment for training predictive agents that learn from real-world outcomes.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 29 Code High viability
Hierarchical Multi-Persona Induction from User Behavioral Logs: Learning Evidence-Grounded and Truthful Personas Build Now
A hierarchical framework for inducing evidence-grounded and truthful user personas from behavioral logs, improving interaction prediction.
GitHub stars n/a Velocity flat History 1 snapshot User Modeling Apr 28 Code High viability
MedSynapse-V: Bridging Visual Perception and Clinical Intuition via Latent Memory Evolution Build Now
MedSynapse-V bridges visual perception and clinical intuition in medical AI by evolving latent diagnostic memories for improved diagnostic accuracy.
GitHub stars n/a Velocity flat History 1 snapshot Medical AI Apr 29 Code High viability
Human-in-the-Loop Benchmarking of Heterogeneous LLMs for Automated Competency Assessment in Secondary Level Mathematics Build Now
A Human-in-the-Loop framework for automating secondary-level mathematics assessment using multiple LLMs.
GitHub stars n/a Velocity flat History pending Education AI Apr 29 Code High viability
QYOLO: Lightweight Object Detection via Quantum Inspired Shared Channel Mixing Build Now
QYOLO is a lightweight object detection framework that uses quantum-inspired channel mixing to significantly reduce parameters and GFLOPs with minimal accuracy loss.
GitHub stars n/a Velocity flat History 1 snapshot Computer Vision Apr 29 Code High viability
Star-Fusion: A Multi-modal Transformer Architecture for Discrete Celestial Orientation via Spherical Topology Build Now
A multi-modal transformer architecture for accurate and efficient celestial attitude determination in spacecraft, outperforming traditional methods with low latency.
GitHub stars n/a Velocity flat History 1 snapshot Robotics & Navigation Apr 29 Code High viability
ACPO: Anchor-Constrained Perceptual Optimization for Diffusion Models with No-Reference Quality Guidance Build Now
Anchor-constrained perceptual optimization for diffusion models to improve image generation quality without reference.
GitHub stars n/a Velocity flat History 1 snapshot Diffusion Models Apr 29 Code High viability
STLGT: A Scalable Trace-Based Linear Graph Transformer for Tail Latency Prediction in Microservices Build Now
STLGT is a scalable, trace-based linear graph transformer for accurate and efficient tail-latency prediction in microservice systems, enabling proactive SLO management.
GitHub stars n/a Velocity flat History 1 snapshot MLOps Apr 29 Code High viability
A self-evolving agent for explainable diagnosis of DFT-experiment band-gap mismatch Build Now
XDFT is a self-evolving agent that automatically diagnoses and explains discrepancies between DFT calculations and experimental results for materials science.
GitHub stars n/a Velocity flat History 1 snapshot Materials Science AI Apr 29 Code High viability
Preserving Disagreement: Architectural Heterogeneity and Coherence Validation in Multi-Agent Policy Simulation Build Now
A framework for multi-agent LLM systems that uses architectural diversity and coherence validation to prevent artificial consensus and improve policy simulation fidelity.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 29 Code High viability
Uncertainty-Aware Reward Discounting for Mitigating Reward Hacking Build Now
An uncertainty-aware reward framework for reinforcement learning that mitigates reward hacking and improves alignment.
GitHub stars n/a Velocity flat History 1 snapshot Reinforcement Learning Apr 29 Code High viability
When to Vote, When to Rewrite: Disagreement-Guided Strategy Routing for Test-Time Scaling Watch
A training-free framework that dynamically routes large reasoning models to different scaling strategies based on output disagreement, improving accuracy and reducing cost.
GitHub stars n/a Velocity flat History 1 snapshot LLM Reasoning Apr 29 Code
StratMem-Bench: Evaluating Strategic Memory Use in Virtual Character Conversation Beyond Factual Recall Watch
A benchmark and framework to evaluate strategic memory use in virtual character conversations, identifying limitations in current LLMs.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 29 Code
From Black-Box Confidence to Measurable Trust in Clinical AI: A Framework for Evidence, Supervision, and Staged Autonomy Watch
A framework for trustworthy clinical AI that engineers trust through evidence, supervision, and staged autonomy, moving beyond black-box confidence.
GitHub stars n/a Velocity flat History 1 snapshot Clinical AI Apr 29 Code
Tatemae: Detecting Alignment Faking via Tool Selection in LLMs Watch
Detecting LLM alignment faking by analyzing tool selection patterns, with a new dataset and evaluation of frontier models.
GitHub stars n/a Velocity flat History 1 snapshot LLM Alignment Apr 29 Code
TimeMM: Time-as-Operator Spectral Filtering for Dynamic Multimodal Recommendation Watch
TimeMM is a dynamic multimodal recommendation framework that uses time-as-operator spectral filtering to model evolving user interests and modality-specific temporal sensitivity.
GitHub stars n/a Velocity flat History 1 snapshot Recommendation Systems Apr 29 Code
Breaking the Autoregressive Chain: Hyper-Parallel Decoding for Efficient LLM-Based Attribute Value Extraction Watch
A novel decoding algorithm that accelerates LLM-based attribute value extraction by parallelizing independent sequence generation, reducing inference costs by up to 13.8×.
GitHub stars n/a Velocity flat History 1 snapshot LLM Inference Optimization Apr 29
Correcting Performance Estimation Bias in Imbalanced Classification with Minority Subconcepts Watch
A practical utility-weighted evaluation metric (pBA) to correct performance estimation bias in imbalanced classification across minority subconcepts.
GitHub stars n/a Velocity flat History 1 snapshot Fairness and Evaluation Apr 28 Code
Evaluating the Alignment Between GeoAI Explanations and Domain Knowledge in Satellite-Based Flood Mapping Watch
A framework to evaluate the alignment between GeoAI model explanations and domain knowledge for satellite-based flood mapping.
GitHub stars n/a Velocity flat History 1 snapshot GeoAI Apr 28 Code
A Toolkit for Detecting Spurious Correlations in Speech Datasets Watch
A toolkit to detect spurious correlations in speech datasets, preventing overestimation of performance in critical applications.
GitHub stars n/a Velocity flat History 1 snapshot Speech AI Apr 29 Code
Privacy-Preserving Federated Learning Framework for Distributed Chemical Process Optimization Watch
A privacy-preserving federated learning framework for optimizing chemical processes across distributed plants without sharing raw data.
GitHub stars n/a Velocity flat History 1 snapshot Federated Learning Apr 28 Code
Text-Utilization for Encoder-dominated Speech Recognition Models Watch
Efficient methods for utilizing text-only data to improve encoder-dominated speech recognition models, showing larger encoders with smaller decoders can match or surpass performance.
GitHub stars n/a Velocity flat History 1 snapshot Speech Recognition Apr 29 Code
Grounding vs. Compositionality: On the Non-Complementarity of Reasoning in Neuro-Symbolic Systems Watch
An iterative logic tensor network that empirically demonstrates reasoning is a distinct capability required for generalization, not an emergent property of symbol grounding.
GitHub stars n/a Velocity flat History 1 snapshot Neuro-Symbolic AI Apr 29 Code
Calibrated Surprise: An Information-Theoretic Account of Creative Quality Watch
A framework for evaluating creative writing quality based on information theory, using mutual information to quantify calibrated surprise and laying groundwork for a professional benchmark.
GitHub stars n/a Velocity flat History 1 snapshot LLM Evaluation Apr 29 Code
ImproBR: Bug Report Improver Using LLMs Watch
ImproBR uses LLMs to automatically improve low-quality bug reports by filling in missing steps to reproduce, observed behavior, and expected behavior.
GitHub stars n/a Velocity flat History 1 snapshot Software Engineering Apr 28
RaMP: Runtime-Aware Megakernel Polymorphism for Mixture-of-Experts Watch
RaMP optimizes Mixture-of-Experts inference by dynamically selecting kernel configurations based on runtime expert routing for significant speedups.
GitHub stars n/a Velocity flat History 1 snapshot LLM Inference Optimization Apr 28
DreamProver: Evolving Transferable Lemma Libraries via a Wake-Sleep Theorem-Proving Agent Ignore
An agentic framework that evolves transferable lemma libraries for formal theorem proving through a wake-sleep program induction paradigm.
GitHub stars n/a Velocity flat History 1 snapshot Formal Theorem Proving Apr 29 Code
Auto-Relational Reasoning Ignore
A theoretical framework for automated relational reasoning integrated with ANNs, achieving high IQ test scores.
GitHub stars n/a Velocity flat History 1 snapshot Reasoning AI Apr 29 Code
ViCrop-Det: Spatial Attention Entropy Guided Cropping for Training-Free Small-Object Detection Ignore
A training-free framework that uses spatial attention entropy to guide cropping for improved small-object detection, outperforming baselines with marginal latency overhead.
GitHub stars n/a Velocity flat History 1 snapshot Object Detection Apr 29 Code
MemOVCD: Training-Free Open-Vocabulary Change Detection via Cross-Temporal Memory Reasoning and Global-Local Adaptive Rectification Ignore
A training-free framework for open-vocabulary change detection that uses cross-temporal memory reasoning and global-local adaptive rectification to identify semantic changes in bi-temporal images.
GitHub stars n/a Velocity flat History 1 snapshot Change Detection Apr 29 Code
DUAL-BLADE: Dual-Path NVMe-Direct KV-Cache Offloading for Edge LLM Inference Watch
A dual-path KV cache offloading framework for edge LLM inference that uses NVMe-direct access to reduce latency and improve SSD utilization.
GitHub stars n/a Velocity flat History 1 snapshot LLM Inference Apr 29
SciHorizon-DataEVA: An Agentic System for AI-Readiness Evaluation of Heterogeneous Scientific Data Ignore
An agentic system for evaluating the AI-readiness of heterogeneous scientific data across governance, quality, compatibility, and adaptability.
GitHub stars n/a Velocity flat History 1 snapshot AI for Science Apr 29 Code
Language Diffusion Models are Associative Memories Capable of Retrieving Unseen Data Ignore
This research demonstrates that language diffusion models function as associative memories, capable of retrieving unseen data and exhibiting a transition from memorization to generalization.
GitHub stars n/a Velocity flat History 1 snapshot LLM Internals Apr 29 Code
Causal Learning with Neural Assemblies Ignore
A biologically plausible mechanism for neural assemblies to learn causal directionality using local plasticity, offering auditable explanations.
GitHub stars n/a Velocity flat History 1 snapshot Causal AI Apr 29 Code
Graph Construction and Matching for Imperative Programs using Neural and Structural Methods Ignore
A pipeline for converting imperative programs into typed, attributed graphs using neural and structural methods to enable verification artefact reuse.
GitHub stars n/a Velocity flat History 1 snapshot Software Engineering Apr 29 Code
Distill-Belief: Closed-Loop Inverse Source Localization and Characterization in Physical Fields Watch
A teacher-student framework for efficient and accurate inverse source localization and characterization in physical fields.
GitHub stars n/a Velocity flat History 1 snapshot Robotics & Agents Apr 28
Ceci n'est pas une explication: Evaluating Explanation Failures as Explainability Pitfalls in Language Learning Systems Ignore
A benchmark that evaluates AI language-learning feedback for 'explainability pitfalls' to prevent misconceptions and improve learning outcomes.
GitHub stars n/a Velocity flat History 1 snapshot AI Education Apr 28 Code
AMMA: A Multi-Chiplet Memory-Centric Architecture for Low-Latency 1M Context Attention Serving Ignore
A novel memory-centric chiplet architecture designed to significantly reduce latency and energy consumption for LLM attention serving.
GitHub stars n/a Velocity flat History 1 snapshot LLM Serving Architecture Apr 28 Code
Structural Generalization on SLOG without Hand-Written Rules Ignore
A neural cellular automaton learns compositional rules for semantic parsing without hand-written rules, achieving state-of-the-art structural generalization.
GitHub stars n/a Velocity flat History 1 snapshot Semantic Parsing Apr 28 Code
Apriori-based Analysis of Learned Helplessness in Mathematics Tutoring: Behavioral Patterns by Level, Intervention, and Outcome Ignore
An Apriori-based analysis of learned helplessness in math tutoring logs, revealing behavioral patterns linked to intervention and outcomes.
GitHub stars n/a Velocity flat History 1 snapshot Educational AI Apr 29 Code
TLPO: Token-Level Policy Optimization for Mitigating Language Confusion in Large Language Models Ignore
A novel approach to reduce language confusion in multilingual LLMs using token-level optimization.
GitHub stars n/a Velocity flat History 1 snapshot Multilingual AI Optimization Apr 29 Code
Random Cloud: Finding Minimal Neural Architectures Without Training Ignore
A training-free method for discovering minimal neural network architectures by progressively reducing random topologies, outperforming pruning baselines with significant parameter reduction and faster execution.
GitHub stars n/a Velocity flat History 1 snapshot Neural Architecture Search Apr 29 Code
SG-UniBuc-NLP at SemEval-2026 Task 6: Multi-Head RoBERTa with Chunking for Long-Context Evasion Detection Ignore
A system for political question evasion detection using a multi-head RoBERTa with chunking for long contexts.
GitHub 0 stars Velocity flat History 1 snapshot NLP Classification Apr 29 Pending
MetaSR: Content-Adaptive Metadata Orchestration for Generative Super-Resolution Ignore
A Diffusion Transformer framework that adaptively orchestrates metadata for generative super-resolution, improving quality and reducing transmission bitrate.
GitHub stars n/a Velocity flat History 1 snapshot Generative Video Apr 29
Lifting Embodied World Models for Planning and Control Ignore
A framework that trains a lightweight policy to map high-level actions to low-level joint actions, enabling more efficient and effective planning for embodied agents.
GitHub stars n/a Velocity flat History 1 snapshot Embodied AI Planning Apr 28
Quantum Gatekeeper: Multi-Factor Context-Bound Image Steganography with VQC Based Key Derivation on Quantum Hardware Ignore
A quantum-based steganography system that uses multi-factor authentication and variational quantum circuits for secure data embedding.
GitHub stars n/a Velocity flat History 1 snapshot Quantum Cryptography Apr 29 Code
Benchmarking the Safety of Large Language Models for Robotic Health Attendant Control Ignore
Benchmarking the safety of LLMs for robotic health attendant control reveals significant violation rates and highlights the need for robust safety evaluation.
GitHub stars n/a Velocity flat History 1 snapshot AI Safety Apr 29 Code
Enforcing Benign Trajectories: A Behavioral Firewall for Structured-Workflow AI Agents Ignore
A behavioral firewall for AI agents that uses a deterministic finite automaton to enforce benign tool-call sequences and parameter bounds, reducing attack success rates and latency.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 29
Text Style Transfer with Machine Translation for Graphic Designs Ignore
Exploring new methods for text style transfer in graphic designs by improving word alignment in machine translation.
GitHub stars n/a Velocity flat History 1 snapshot Machine Translation Apr 29
Benchmarking Complex Multimodal Document Processing Pipelines: A Unified Evaluation Framework for Enterprise AI Ignore
A unified framework for benchmarking complex multimodal document processing pipelines in enterprise settings.
GitHub stars n/a Velocity flat History 1 snapshot Document AI Evaluation Apr 29
QERNEL: a Scalable Large Electron Model Ignore
A foundational neural wavefunction model for solving parameterized many-electron Hamiltonians in quantum materials.
GitHub stars n/a Velocity flat History 1 snapshot Scientific AI Apr 28
TDD Governance for Multi-Agent Code Generation via Prompt Engineering Ignore
An AI-native TDD framework that operationalizes classical TDD principles for reliable LLM-assisted development.
Software Development Apr 29
Rule-based High-Level Coaching for Goal-Conditioned Reinforcement Learning in Search-and-Rescue UAV Missions Under Limited-Simulation Training Ignore
This paper proposes a hierarchical decision-making framework combining rule-based coaching with reinforcement learning for UAV search-and-rescue missions under limited simulation training.
GitHub stars n/a Velocity flat History 1 snapshot Reinforcement Learning Apr 29
AGEL-Comp: A Neuro-Symbolic Framework for Compositional Generalization in Interactive Agents Ignore
A neuro-symbolic framework for interactive agents to improve compositional generalization by grounding actions with a dynamic causal program graph and inductive logic programming.
GitHub stars n/a Velocity flat History 1 snapshot Agents Apr 29
Culturally Aware GenAI Risks for Youth: Perspectives from Youth, Parents, and Teachers in a Non-Western Context Ignore
This research explores culturally specific Generative AI risks for youth in Saudi Arabia, providing design implications for context-sensitive parental controls.
GitHub stars n/a Velocity flat History 1 snapshot Responsible AI Apr 29
Momentum-Conserving Graph Neural Networks for Deformable Objects Ignore
A novel graph neural network architecture that conserves momentum for more accurate simulation of deformable materials.
GitHub stars n/a Velocity flat History 1 snapshot Physics Simulation Apr 28
Persuadability and LLMs as Legal Decision Tools Ignore
This research explores how Large Language Models respond to legal arguments, investigating their persuadability and implications for legal decision-making.
GitHub stars n/a Velocity flat History 1 snapshot Legal AI Apr 29
Resume-ing Control: (Mis)Perceptions of Agency Around GenAI Use in Recruiting Workflows Ignore
This paper explores how generative AI subtly influences control and agency in recruiting workflows, leading to deskilling despite marginal efficiency gains.
GitHub stars n/a Velocity flat History 1 snapshot AI Ethics Apr 29
Co-Learning Port-Hamiltonian Systems and Optimal Energy-Shaping Control Ignore
A physics-informed learning framework co-learns port-Hamiltonian system models and optimal energy-shaping controllers from trajectory data using alternating optimization.
GitHub stars n/a Velocity flat History 1 snapshot Control Systems Apr 28
Exploring the Potential of Probabilistic Transformer for Time Series Modeling: A Report on the ST-PT Framework Ignore
This paper explores the theoretical potential of a probabilistic transformer framework for time series modeling through three research questions.
GitHub stars n/a Velocity flat History 1 snapshot Time Series Modeling Apr 29
Qvine: Vine Structured Quantum Circuits for Loading High Dimensional Distributions Ignore
Qvine introduces a vine-structured ansatz for quantum circuits designed to efficiently load high-dimensional distributions, showing promise for machine learning and finance applications.
GitHub stars n/a Velocity flat History 1 snapshot Quantum Machine Learning Apr 29
Multi-Stage Bi-Atrial Segmentation Framework from 3D Late Gadolinium-Enhanced MRI using V-Net Family Models Ignore
A multi-stage framework for segmenting bi-atrial regions in 3D cardiac MRI using V-Net models and asymmetric loss, with a preprocessing step for contrast enhancement.
GitHub stars n/a Velocity flat History 1 snapshot Medical AI Apr 29
Fundamental Physics, Existential Risks and Human Futures Ignore
Explores theoretical connections between fundamental physics, existential risks, and human futures, suggesting potential transformative impacts on AI.
GitHub stars n/a Velocity flat History 1 snapshot AI Safety Apr 29
Recent Advances in mm-Wave and Sub-THz/THz Oscillators for FutureG Technologies Ignore
A review of recent advancements in mm-wave and sub-THz/THz oscillators for future communication and computing systems.
GitHub stars n/a Velocity flat History 1 snapshot Hardware Acceleration Apr 29