ScienceToStartup Papers

ScienceToStartup Papers https://sciencetostartup.com/papers Latest research papers with startup viability analysis. en-us LLM REgression with a Latent Iterative State Head https://sciencetostartup.com/paper/llm-regression-with-a-latent-iterative-state-head https://sciencetostartup.com/paper/llm-regression-with-a-latent-iterative-state-head Wed, 01 Apr 2026 17:50:32 GMT We present RELISH (REgression with a Latent Iterative State Head), a novel, lightweight architecture designed for text regression with large language models. Rather than decoding numeric targets as text or aggregating multiple generated outputs, RELISH predicts scalar values directly from frozen LLM representations by iteratively refining a learned latent state through cross-attention over token-level representations, and then mapping the final state to a point estimate with a linear regressor.… Machine Learning Regression CliffSearch: Structured Agentic Co-Evolution over Theory and Code for Scientific Algorithm Discovery https://sciencetostartup.com/paper/cliffsearch-structured-agentic-co-evolution-over-theory-and-code-for-scientific-algorithm-discovery https://sciencetostartup.com/paper/cliffsearch-structured-agentic-co-evolution-over-theory-and-code-for-scientific-algorithm-discovery Wed, 01 Apr 2026 17:51:26 GMT Scientific algorithm discovery is iterative: hypotheses are proposed, implemented, stress-tested, and revised. Current LLM-guided search systems accelerate proposal generation, but often under-represent scientific structure by optimizing code-only artifacts with weak correctness/originality gating. 
We present CliffSearch, an agentic evolutionary framework in which the core evolution operators (pair selection, crossover, mutation, and review) are implemented as LLM agents, and the loop is design… Scientific Algorithm Discovery HippoCamp: Benchmarking Contextual Agents on Personal Computers https://sciencetostartup.com/paper/hippocamp-benchmarking-contextual-agents-on-personal-computers https://sciencetostartup.com/paper/hippocamp-benchmarking-contextual-agents-on-personal-computers Wed, 01 Apr 2026 17:58:33 GMT We present HippoCamp, a new benchmark designed to evaluate agents' capabilities on multimodal file management. Unlike existing agent benchmarks that focus on tasks like web interaction, tool use, or software automation in generic settings, HippoCamp evaluates agents in user-centric environments to model individual user profiles and search massive personal files for context-aware reasoning. Our benchmark instantiates device-scale file systems over real-world profiles spanning diverse modalities,… Personal Computing Agents Risk-Aware Batch Testing for Performance Regression Detection https://sciencetostartup.com/paper/risk-aware-batch-testing-for-performance-regression-detection https://sciencetostartup.com/paper/risk-aware-batch-testing-for-performance-regression-detection Tue, 31 Mar 2026 20:39:46 GMT Performance regression testing is essential in large-scale continuous-integration (CI) systems, yet executing full performance suites for every commit is prohibitively expensive. Prior work on performance regression prediction and batch testing has shown independent benefits, but each faces practical limitations: predictive models are rarely integrated into CI decision-making, and conventional batching strategies ignore commit-level heterogeneity. 
We unify these strands by introducing a risk-… Risk Management & CI Optimization In harmony with gpt-oss https://sciencetostartup.com/paper/in-harmony-with-gpt-oss https://sciencetostartup.com/paper/in-harmony-with-gpt-oss Wed, 01 Apr 2026 01:16:13 GMT No one has independently reproduced OpenAI's published scores for gpt-oss-20b with tools, because the original paper discloses neither the tools nor the agent harness. We reverse-engineered the model's in-distribution tools: when prompted without tool definitions, gpt-oss still calls tools from its training distribution with high statistical confidence -- a strong prior, not a hallucination. We then built a native harmony agent harness (https://github.com/borislavmavrin/harmonyagent.git) that e… AI/ML Model Tools Learning Humanoid Navigation from Human Data https://sciencetostartup.com/paper/learning-humanoid-navigation-from-human-data https://sciencetostartup.com/paper/learning-humanoid-navigation-from-human-data Wed, 01 Apr 2026 02:59:42 GMT We present EgoNav, a system that enables a humanoid robot to traverse diverse, unseen environments by learning entirely from 5 hours of human walking data, with no robot data or finetuning. A diffusion model predicts distributions of plausible future trajectories conditioned on past trajectory, a 360 deg visual memory fusing color, depth, and semantics, and video features from a frozen DINOv3 backbone that capture appearance cues invisible to depth sensors. 
A hybrid sampling scheme achieves rea… Humanoid Robotics BloClaw: An Omniscient, Multi-Modal Agentic Workspace for Next-Generation Scientific Discovery https://sciencetostartup.com/paper/bloclaw-an-omniscient-multi-modal-agentic-workspace-for-next-generation-scientific-discovery https://sciencetostartup.com/paper/bloclaw-an-omniscient-multi-modal-agentic-workspace-for-next-generation-scientific-discovery Wed, 01 Apr 2026 06:47:40 GMT The integration of Large Language Models (LLMs) into life sciences has catalyzed the development of "AI Scientists." However, translating these theoretical capabilities into deployment-ready research environments exposes profound infrastructural vulnerabilities. Current frameworks are bottlenecked by fragile JSON-based tool-calling protocols, easily disrupted execution sandboxes that lose graphical outputs, and rigid conversational interfaces inherently ill-suited for high-dimensional scientifi… AI4S Operating Systems Internal APIs Are All You Need: Shadow APIs, Shared Discovery, and the Case Against Browser-First Agent Architectures https://sciencetostartup.com/paper/internal-apis-are-all-you-need-shadow-apis-shared-discovery-and-the-case-against-browser-first-agent-architectures https://sciencetostartup.com/paper/internal-apis-are-all-you-need-shadow-apis-shared-discovery-and-the-case-against-browser-first-agent-architectures Wed, 01 Apr 2026 09:51:46 GMT Autonomous agents increasingly interact with the web, yet most websites remain designed for human browsers -- a fundamental mismatch that the emerging "Agentic Web" must resolve. Agents must repeatedly browse pages, inspect DOMs, and reverse-engineer callable routes -- a process that is slow, brittle, and redundantly repeated across agents.
We observe that every modern website already exposes internal APIs (sometimes called shadow APIs) behind its user interface -- first-party endpoint… AI Middleware & Tools YieldSAT: A Multimodal Benchmark Dataset for High-Resolution Crop Yield Prediction https://sciencetostartup.com/paper/yieldsat-a-multimodal-benchmark-dataset-for-high-resolution-crop-yield-prediction https://sciencetostartup.com/paper/yieldsat-a-multimodal-benchmark-dataset-for-high-resolution-crop-yield-prediction Wed, 01 Apr 2026 14:13:23 GMT Crop yield prediction requires substantial data to train scalable models. However, creating yield prediction datasets is constrained by high acquisition costs, heterogeneous data quality, and data privacy regulations. Consequently, existing datasets are scarce, low in quality, or limited to regional levels or single crop types, hindering the development of scalable data-driven solutions. In this work, we release YieldSAT, a large, high-quality, and multimodal dataset for high-resolution crop yi… Agriculture Technology OmniMem: Autoresearch-Guided Discovery of Lifelong Multimodal Agent Memory https://sciencetostartup.com/paper/omnimem-autoresearch-guided-discovery-of-lifelong-multimodal-agent-memory https://sciencetostartup.com/paper/omnimem-autoresearch-guided-discovery-of-lifelong-multimodal-agent-memory Wed, 01 Apr 2026 15:06:23 GMT AI agents increasingly operate over extended time horizons, yet their ability to retain, organize, and recall multimodal experiences remains a critical bottleneck. Building effective lifelong memory requires navigating a vast design space spanning architecture, retrieval strategies, prompt engineering, and data pipelines; this space is too large and interconnected for manual exploration or traditional AutoML to explore effectively.
We deploy an autonomous research pipeline to discover OmniMem,… AI Memory Systems Generalization Bounds for Spectral GNNs via Fourier Domain Analysis https://sciencetostartup.com/paper/generalization-bounds-for-spectral-gnns-via-fourier-domain-analysis https://sciencetostartup.com/paper/generalization-bounds-for-spectral-gnns-via-fourier-domain-analysis Wed, 01 Apr 2026 13:58:50 GMT Spectral graph neural networks learn graph filters, but their behavior with increasing depth and polynomial order is not well understood. We analyze these models in the graph Fourier domain, where each layer becomes an element-wise frequency update, separating the fixed spectrum from trainable parameters and making depth and order explicit. In this setting, we show that Gaussian complexity is invariant under the Graph Fourier Transform, which allows us to derive data-dependent, depth, and order… Graph Neural Networks Go Big or Go Home: Simulating Mobbing Behavior with Braitenbergian Robots https://sciencetostartup.com/paper/go-big-or-go-home-simulating-mobbing-behavior-with-braitenbergian-robots https://sciencetostartup.com/paper/go-big-or-go-home-simulating-mobbing-behavior-with-braitenbergian-robots Wed, 01 Apr 2026 00:56:50 GMT We used the Webots robotics simulation platform to simulate a dyadic avoiding and mobbing predator behavior in a group of Braitenbergian robots. Mobbing is an antipredator adaptation used by some animals in which the individuals cooperatively attack or harass a predator to protect themselves. One way of coordinating a mobbing attack is using mobbing calls to summon other individuals of the mobbing species. 
We imitated this mechanism and simulated Braitenbergian robots that use mobbing calls whe… Robotics Simulation Predicting Wave Reflection and Transmission in Heterogeneous Media via Fourier Operator-Based Transformer Modeling https://sciencetostartup.com/paper/predicting-wave-reflection-and-transmission-in-heterogeneous-media-via-fourier-operator-based-transformer-modeling https://sciencetostartup.com/paper/predicting-wave-reflection-and-transmission-in-heterogeneous-media-via-fourier-operator-based-transformer-modeling Tue, 31 Mar 2026 18:37:56 GMT We develop a machine learning (ML) surrogate model to approximate solutions to Maxwell's equations in one dimension, focusing on scenarios involving a material interface that reflects and transmits electro-magnetic waves. Derived from high-fidelity Finite Volume (FV) simulations, our training data includes variations of the initial conditions, as well as variations in one material's speed of light, allowing for the model to learn a range of wave-material interaction behaviors. The ML model auto… Scientific ML Softmax gradient policy for variance minimization and risk-averse multi armed bandits https://sciencetostartup.com/paper/softmax-gradient-policy-for-variance-minimization-and-risk-averse-multi-armed-bandits https://sciencetostartup.com/paper/softmax-gradient-policy-for-variance-minimization-and-risk-averse-multi-armed-bandits Tue, 31 Mar 2026 21:08:14 GMT Algorithms for the Multi-Armed Bandit (MAB) problem play a central role in sequential decision-making and have been extensively explored both theoretically and numerically. While most classical approaches aim to identify the arm with the highest expected reward, we focus on a risk-aware setting where the goal is to select the arm with the lowest variance, favoring stability over potentially high but uncertain returns. 
To model the decision process, we consider a softmax parameterization of the… Reinforcement Learning Reconsidering Dependency Networks from an Information Geometry Perspective https://sciencetostartup.com/paper/reconsidering-dependency-networks-from-an-information-geometry-perspective https://sciencetostartup.com/paper/reconsidering-dependency-networks-from-an-information-geometry-perspective Wed, 01 Apr 2026 16:40:47 GMT Dependency networks (Heckerman et al., 2000) provide a flexible framework for modeling complex systems with many variables by combining independently learned local conditional distributions through pseudo-Gibbs sampling. Despite their computational advantages over Bayesian and Markov networks, the theoretical foundations of dependency networks remain incomplete, primarily because their model distributions -- defined as stationary distributions of pseudo-Gibbs sampling -- lack closed-form expres… Probabilistic Modeling Valency Classification of Mapudungun Verbal Roots. Established by the language's own morphotactics https://sciencetostartup.com/paper/valency-classification-of-mapudungun-verbal-roots-established-by-the-language-s-own-morphotactics https://sciencetostartup.com/paper/valency-classification-of-mapudungun-verbal-roots-established-by-the-language-s-own-morphotactics Wed, 01 Apr 2026 11:54:06 GMT In the previous work, a lexical (re)categorisation -- or confirmation of the given category -- of roots identified as verbal was undertaken to determine their original category accurately. 
Building on this, the present paper offers an account of the valency classification of those Mapudungun roots confirmed to be verbal, using the language's own morphotactics; specifically, by examining the permissible and restricted combinations of various suffixes with roots or verbal stems in the Mapuche ver… NLP Linguistics Convergence of projected stochastic natural gradient variational inference for various step size and sample or batch size schedules https://sciencetostartup.com/paper/convergence-of-projected-stochastic-natural-gradient-variational-inference-for-various-step-size-and-sample-or-batch-siz https://sciencetostartup.com/paper/convergence-of-projected-stochastic-natural-gradient-variational-inference-for-various-step-size-and-sample-or-batch-siz Wed, 01 Apr 2026 09:39:50 GMT Stochastic natural gradient variational inference (NGVI) is a popular and efficient algorithm for Bayesian inference. Despite empirical success, the convergence of this method is still not fully understood. In this work, we define and study a projected stochastic NGVI when variational distributions form an exponential family. Stochasticity arises when either gradients are intractable expectations or large sums. We prove new non-asymptotic convergence results for combinations of constant or decr… Bayesian Inference Breaking Data Symmetry is Needed For Generalization in Feature Learning Kernels https://sciencetostartup.com/paper/breaking-data-symmetry-is-needed-for-generalization-in-feature-learning-kernels https://sciencetostartup.com/paper/breaking-data-symmetry-is-needed-for-generalization-in-feature-learning-kernels Tue, 31 Mar 2026 23:28:33 GMT Grokking occurs when a model achieves high training accuracy but generalization to unseen test points happens long after that. This phenomenon was initially observed on a class of algebraic problems, such as learning modular arithmetic (Power et al., 2022). 
We study grokking on algebraic tasks in a class of feature learning kernels via the Recursive Feature Machine (RFM) algorithm (Radhakrishnan et al., 2024), which iteratively updates feature matrices through the Average Gradient Outer Product… Machine Learning Theory LAtent Phase Inference from Short time sequences using SHallow REcurrent Decoders (LAPIS-SHRED) https://sciencetostartup.com/paper/latent-phase-inference-from-short-time-sequences-using-shallow-recurrent-decoders-lapis-shred https://sciencetostartup.com/paper/latent-phase-inference-from-short-time-sequences-using-shallow-recurrent-decoders-lapis-shred Wed, 01 Apr 2026 17:55:10 GMT Reconstructing full spatio-temporal dynamics from sparse observations in both space and time remains a central challenge in complex systems, as measurements can be spatially incomplete and can be also limited to narrow temporal windows. Yet approximating the complete spatio-temporal trajectory is essential for mechanistic insight and understanding, model calibration, and operational decision-making. We introduce LAPIS-SHRED (LAtent Phase Inference from Short time sequence using SHallow REcurren… Scientific AI Measuring the Representational Alignment of Neural Systems in Superposition https://sciencetostartup.com/paper/measuring-the-representational-alignment-of-neural-systems-in-superposition https://sciencetostartup.com/paper/measuring-the-representational-alignment-of-neural-systems-in-superposition Tue, 31 Mar 2026 20:23:07 GMT Comparing the internal representations of neural networks is a central goal in both neuroscience and machine learning. Standard alignment metrics operate on raw neural activations, implicitly assuming that similar representations produce similar activity patterns. However, neural systems frequently operate in superposition, encoding more features than they have neurons via linear compression. 
We derive closed-form expressions showing that superposition systematically deflates Representational S… Neural Representation Analysis On the Necessity of Pre-agreed Secrets for Thwarting Last-minute Coercion: Vulnerabilities and Lessons From the Loki E-voting Protocol https://sciencetostartup.com/paper/on-the-necessity-of-pre-agreed-secrets-for-thwarting-last-minute-coercion-vulnerabilities-and-lessons-from-the-loki-e-vo https://sciencetostartup.com/paper/on-the-necessity-of-pre-agreed-secrets-for-thwarting-last-minute-coercion-vulnerabilities-and-lessons-from-the-loki-e-vo Tue, 31 Mar 2026 19:40:55 GMT Coercion-resistance (CR) is a crucial security property in e-voting systems. It ensures that an attacker cannot compel a voter to vote in a specific way by using threats or rewards. The Loki e-voting protocol, proposed by Giustolisi et al. at IEEE S&P (2024), introduces a novel design that mitigates last-minute coercion through a re-voting mechanism. It also aims to address the usability issues of the seminal JCJ e-voting protocol, specifically: i) the requirement that voters can store… Security Frege in the Flesh: Biolinguistics and the Neural Enforcement of Syntactic Structures https://sciencetostartup.com/paper/frege-in-the-flesh-biolinguistics-and-the-neural-enforcement-of-syntactic-structures https://sciencetostartup.com/paper/frege-in-the-flesh-biolinguistics-and-the-neural-enforcement-of-syntactic-structures Tue, 31 Mar 2026 22:32:51 GMT Biolinguistics is the interdisciplinary scientific study of the biological foundations, evolution, and genetic basis of human language. It treats language as an innate biological organ or faculty of the mind, rather than a cultural tool, and it challenges a behaviorist conception of human language acquisition as being based on stimulus-response associations.
Extracting its most essential component, it takes seriously the idea that mathematical, algebraic models of language capture something nat… Linguistics Model-Based Learning of Near-Optimal Finite-Window Policies in POMDPs https://sciencetostartup.com/paper/model-based-learning-of-near-optimal-finite-window-policies-in-pomdps https://sciencetostartup.com/paper/model-based-learning-of-near-optimal-finite-window-policies-in-pomdps Wed, 01 Apr 2026 15:32:47 GMT We study model-based learning of finite-window policies in tabular partially observable Markov decision processes (POMDPs). A common approach to learning under partial observability is to approximate unbounded history dependencies using finite action-observation windows. This induces a finite-state Markov decision process (MDP) over histories, referred to as the superstate MDP. Once a model of this superstate MDP is available, standard MDP algorithms can be used to compute optimal policies, mot… Reinforcement Learning Towards Initialization-dependent and Non-vacuous Generalization Bounds for Overparameterized Shallow Neural Networks https://sciencetostartup.com/paper/towards-initialization-dependent-and-non-vacuous-generalization-bounds-for-overparameterized-shallow-neural-networks https://sciencetostartup.com/paper/towards-initialization-dependent-and-non-vacuous-generalization-bounds-for-overparameterized-shallow-neural-networks Wed, 01 Apr 2026 05:42:40 GMT Overparameterized neural networks often show a benign overfitting property in the sense of achieving excellent generalization behavior despite the number of parameters exceeding the number of training examples. A promising direction to explain benign overfitting is to relate generalization to the norm of distance from initialization, motivated by the empirical observations that this distance is often significantly smaller than the norm itself. 
However, the existing initialization-dependent comp… Machine Learning Theory Reachability-Aware Time Scaling for Path Tracking https://sciencetostartup.com/paper/reachability-aware-time-scaling-for-path-tracking https://sciencetostartup.com/paper/reachability-aware-time-scaling-for-path-tracking Wed, 01 Apr 2026 03:34:06 GMT This paper studies tracking of collision-free waypoint paths produced by an offline planner for a planar double-integrator system with bounded speed and acceleration. Because sampling-based planners must route around obstacles, the resulting waypoint paths can contain sharp turns and high-curvature regions, so one-step reachability under acceleration limits becomes critical even when the path geometry is collision-free. We build on a pure-pursuit-style, reachability-guided quadratic-program (QP… Robotics Path Tracking Play-Testing REMind: Evaluating an Educational Robot-Mediated Role-Play Game https://sciencetostartup.com/paper/play-testing-remind-evaluating-an-educational-robot-mediated-role-play-game https://sciencetostartup.com/paper/play-testing-remind-evaluating-an-educational-robot-mediated-role-play-game Tue, 31 Mar 2026 22:51:53 GMT This paper presents REMind, an innovative educational robot-mediated role-play game designed to support anti-bullying bystander intervention among children. REMind invites players to observe a bullying scenario enacted by social robots, reflect on the perspectives of the characters, and rehearse defending strategies by puppeteering a robotic avatar. We evaluated REMind through a mixed-methods play-testing study with 18 children aged 9-10.
The findings suggest that the experience supported key… Educational Robotics Rapid mixing in positively weighted restricted Boltzmann machines https://sciencetostartup.com/paper/rapid-mixing-in-positively-weighted-restricted-boltzmann-machines https://sciencetostartup.com/paper/rapid-mixing-in-positively-weighted-restricted-boltzmann-machines Wed, 01 Apr 2026 14:38:35 GMT We show polylogarithmic mixing time bounds for the alternating-scan sampler for positively weighted restricted Boltzmann machines. This is done via analysing the same chain and the Glauber dynamics for ferromagnetic two-spin systems, where we obtain new mixing time bounds up to the critical thresholds. Machine Learning Theory Hierarchical Discrete Flow Matching for Graph Generation https://sciencetostartup.com/paper/hierarchical-discrete-flow-matching-for-graph-generation https://sciencetostartup.com/paper/hierarchical-discrete-flow-matching-for-graph-generation Tue, 31 Mar 2026 20:58:12 GMT Denoising-based models, including diffusion and flow matching, have led to substantial advances in graph generation. Despite this progress, such models remain constrained by two fundamental limitations: a computational cost that scales quadratically with the number of nodes and a large number of function evaluations required during generation. In this work, we introduce a novel hierarchical generative framework that reduces the number of node pairs that must be evaluated and adopts discrete flo… Graph Generation A Survey of On-Policy Distillation for Large Language Models https://sciencetostartup.com/paper/a-survey-of-on-policy-distillation-for-large-language-models https://sciencetostartup.com/paper/a-survey-of-on-policy-distillation-for-large-language-models Wed, 01 Apr 2026 08:32:34 GMT Knowledge distillation has become a primary mechanism for transferring reasoning and domain expertise from frontier Large Language Models (LLMs) to smaller, deployable students. 
However, the dominant paradigm remains off-policy: students train on static teacher-generated data and never encounter their own errors during learning. This train-test mismatch, an instance of exposure bias, causes prediction errors to compound autoregressively at inference time. On-Policy Distillati… LLM Training Representation choice shapes the interpretation of protein conformational dynamics https://sciencetostartup.com/paper/representation-choice-shapes-the-interpretation-of-protein-conformational-dynamics https://sciencetostartup.com/paper/representation-choice-shapes-the-interpretation-of-protein-conformational-dynamics Wed, 01 Apr 2026 07:38:52 GMT Molecular dynamics simulations provide detailed trajectories at the atomic level, but extracting interpretable and robust insights from these high-dimensional data remains challenging. In practice, analyses typically rely on a single representation. Here, we show that representation choice is not neutral: it fundamentally shapes the conformational organization, similarity relationships, and apparent transitions inferred from identical simulation data. To complement existing representations, w… Scientific AI Multimodal Language Models Cannot Spot Spatial Inconsistencies https://sciencetostartup.com/paper/multimodal-language-models-cannot-spot-spatial-inconsistencies https://sciencetostartup.com/paper/multimodal-language-models-cannot-spot-spatial-inconsistencies Wed, 01 Apr 2026 12:06:54 GMT Spatial consistency is a fundamental property of the visual world and a key requirement for models that aim to understand physical reality. Despite recent advances, multimodal large language models (MLLMs) often struggle to reason about 3D geometry across multiple views. Rather than asking models to describe scene attributes, we introduce a more challenging task: given two views of the same scene, identify the object that violates 3D motion consistency.
We propose a simple and scalable method f… Multimodal AI Multicentric thrombus segmentation using an attention-based recurrent network with gradual modality dropout https://sciencetostartup.com/paper/multicentric-thrombus-segmentation-using-an-attention-based-recurrent-network-with-gradual-modality-dropout https://sciencetostartup.com/paper/multicentric-thrombus-segmentation-using-an-attention-based-recurrent-network-with-gradual-modality-dropout Wed, 01 Apr 2026 12:25:00 GMT Detecting and delineating tiny targets in 3D brain scans is a central yet under-addressed challenge in medical imaging. In ischemic stroke, for instance, the culprit thrombus is small, low-contrast, and variably expressed across modalities (e.g., susceptibility-weighted T2 blooming, diffusion restriction on DWI/ADC), while real-world multi-center data introduce domain shifts, anisotropy, and frequent missing sequences. We introduce a methodology that couples an attention-based recurrent segmentati… Medical Imaging Segmentation Fluently Lying: Adversarial Robustness Can Be Substrate-Dependent https://sciencetostartup.com/paper/fluently-lying-adversarial-robustness-can-be-substrate-dependent https://sciencetostartup.com/paper/fluently-lying-adversarial-robustness-can-be-substrate-dependent Wed, 01 Apr 2026 08:10:56 GMT The primary tools used to monitor and defend object detectors under adversarial attack assume that when accuracy degrades, detection count drops in tandem. This coupling was assumed, not measured. We report a counterexample observed on a single model: under standard PGD, EMS-YOLO, a spiking neural network (SNN) object detector, retains more than 70% of its detections while mAP collapses from 0.528 to 0.042.
We term this count-preserving accuracy collapse Quality Corruption (QC), to distinguish… Adversarial Robustness PRISM: Differentiable Analysis-by-Synthesis for Fixel Recovery in Diffusion MRI https://sciencetostartup.com/paper/prism-differentiable-analysis-by-synthesis-for-fixel-recovery-in-diffusion-mri https://sciencetostartup.com/paper/prism-differentiable-analysis-by-synthesis-for-fixel-recovery-in-diffusion-mri Tue, 31 Mar 2026 21:22:49 GMT Diffusion MRI microstructure fitting is nonconvex and often performed voxelwise, which limits fiber peak recovery in narrow crossings. This work introduces PRISM, a differentiable analysis-by-synthesis framework that fits an explicit multi-compartment forward model end-to-end over spatial patches. The model combines cerebrospinal fluid (CSF), gray matter, up to K white-matter fiber compartments (stick-and-zeppelin), and a restricted compartment, with explicit fiber directions and soft model sel… Medical AI Deep Networks Favor Simple Data https://sciencetostartup.com/paper/deep-networks-favor-simple-data https://sciencetostartup.com/paper/deep-networks-favor-simple-data Wed, 01 Apr 2026 02:23:21 GMT Estimated density is often interpreted as indicating how typical a sample is under a model. Yet deep models trained on one dataset can assign higher density to simpler out-of-distribution (OOD) data than to in-distribution test data. We refer to this behavior as the OOD anomaly. Prior work typically studies this phenomenon within a single architecture, detector, or benchmark, implicitly assuming certain canonical densities.
We instead separate the trained network from the density estimat… LLM Analysis A Safety-Aware Role-Orchestrated Multi-Agent LLM Framework for Behavioral Health Communication Simulation https://sciencetostartup.com/paper/a-safety-aware-role-orchestrated-multi-agent-llm-framework-for-behavioral-health-communication-simulation https://sciencetostartup.com/paper/a-safety-aware-role-orchestrated-multi-agent-llm-framework-for-behavioral-health-communication-simulation Tue, 31 Mar 2026 21:21:31 GMT Single-agent large language model (LLM) systems struggle to simultaneously support diverse conversational functions and maintain safety in behavioral health communication. We propose a safety-aware, role-orchestrated multi-agent LLM framework designed to simulate supportive behavioral health dialogue through coordinated, role-differentiated agents. Conversational responsibilities are decomposed across specialized agents, including empathy-focused, action-oriented, and supervisory roles, while a… Agents Trust and Reliance on AI in Education: AI Literacy and Need for Cognition as Moderators https://sciencetostartup.com/paper/trust-and-reliance-on-ai-in-education-ai-literacy-and-need-for-cognition-as-moderators https://sciencetostartup.com/paper/trust-and-reliance-on-ai-in-education-ai-literacy-and-need-for-cognition-as-moderators Wed, 01 Apr 2026 16:38:47 GMT As generative AI systems are integrated into educational settings, students often encounter AI-generated output while working through learning tasks, either by requesting help or through integrated tools. Trust in AI can influence how students interpret and use that output, including whether they evaluate it critically or exhibit overreliance. 
We investigate how students' trust relates to their appropriate reliance on an AI assistant during programming problem-solving tasks, and whether this re… AI Literacy in Education MF-QAT: Multi-Format Quantization-Aware Training for Elastic Inference https://sciencetostartup.com/paper/mf-qat-multi-format-quantization-aware-training-for-elastic-inference https://sciencetostartup.com/paper/mf-qat-multi-format-quantization-aware-training-for-elastic-inference Wed, 01 Apr 2026 06:12:19 GMT Quantization-aware training (QAT) is typically performed for a single target numeric format, while practical deployments often need to choose numerical precision at inference time based on hardware support or runtime constraints. We study multi-format QAT, where a single model is trained to be robust across multiple quantization formats. We find that multi-format QAT can match single-format QAT at each target precision, yielding one model that performs well overall across different formats, eve… LLM Training A Cross-graph Tuning-free GNN Prompting Framework https://sciencetostartup.com/paper/a-cross-graph-tuning-free-gnn-prompting-framework https://sciencetostartup.com/paper/a-cross-graph-tuning-free-gnn-prompting-framework Wed, 01 Apr 2026 02:34:23 GMT GNN prompting aims to adapt models across tasks and graphs without requiring extensive retraining. However, most existing graph prompt methods still require task-specific parameter updates and face the issue of generalizing across graphs, limiting their performance and undermining the core promise of prompting. 
In this work, we introduce a Cross-graph Tuning-free Prompting Framework (CTP), which supports both homogeneous and heterogeneous graphs, can be directly deployed to unseen graphs withou… Graph Neural Networks Narrative Fingerprints: Multi-Scale Author Identification via Novelty Curve Dynamics https://sciencetostartup.com/paper/narrative-fingerprints-multi-scale-author-identification-via-novelty-curve-dynamics https://sciencetostartup.com/paper/narrative-fingerprints-multi-scale-author-identification-via-novelty-curve-dynamics Wed, 01 Apr 2026 16:07:58 GMT We test whether authors have characteristic "fingerprints" in the information-theoretic novelty curves of their published works. Working with two corpora -- Books3 (52,796 books, 759 qualifying authors) and PG-19 (28,439 books, 1,821 qualifying authors) -- we find that authorial voice leaves measurable traces in how novelty unfolds across a text. The signal is multi-scale: at book level, scalar dynamics (mean novelty, speed, volume, circuitousness) identify 43% of authors significantly above ch… Author Identification Beyond Latency: A System-Level Characterization of MPC and FHE for PPML https://sciencetostartup.com/paper/beyond-latency-a-system-level-characterization-of-mpc-and-fhe-for-ppml https://sciencetostartup.com/paper/beyond-latency-a-system-level-characterization-of-mpc-and-fhe-for-ppml Tue, 31 Mar 2026 19:18:52 GMT Privacy protection has become an increasing concern in modern machine learning applications. Privacy-preserving machine learning (PPML) has attracted growing research attention, with approaches such as secure multiparty computation (MPC) and fully homomorphic encryption (FHE) being actively explored. 
However, existing evaluations of these approaches have frequently been conducted on narrow, fragmented setups and have focused on only a single performance metric, such as the online inference latency of… Privacy-Preserving ML Speech LLMs are Contextual Reasoning Transcribers https://sciencetostartup.com/paper/speech-llms-are-contextual-reasoning-transcribers https://sciencetostartup.com/paper/speech-llms-are-contextual-reasoning-transcribers Wed, 01 Apr 2026 08:13:50 GMT Despite extensions to speech inputs, effectively leveraging the rich knowledge and contextual understanding of large language models (LLMs) in automatic speech recognition (ASR) remains non-trivial, as the task primarily involves direct speech-to-text mapping. To address this, this paper proposes chain-of-thought ASR (CoT-ASR), which constructs a reasoning chain that enables LLMs to first analyze the input speech and generate contextual analysis, thereby fully exploiting their generative capabi… Speech AI UniMixer: A Unified Architecture for Scaling Laws in Recommendation Systems https://sciencetostartup.com/paper/unimixer-a-unified-architecture-for-scaling-laws-in-recommendation-systems https://sciencetostartup.com/paper/unimixer-a-unified-architecture-for-scaling-laws-in-recommendation-systems Wed, 01 Apr 2026 07:57:40 GMT In recent years, the scaling laws of recommendation models, which govern the relationship between performance and the parameters/FLOPs of recommenders, have attracted increasing attention. Currently, there are three mainstream architectures for achieving scaling in recommendation models, namely attention-based, TokenMixer-based, and factorization-machine-based methods, which exhibit fundamental differences in both design philosophy and architectural structure.
In this paper, we propose a unified sca… Recommendation Systems Sampling-based Task and Kinodynamic Motion Planning under Semantic Uncertainty https://sciencetostartup.com/paper/sampling-based-task-and-kinodynamic-motion-planning-under-semantic-uncertainty https://sciencetostartup.com/paper/sampling-based-task-and-kinodynamic-motion-planning-under-semantic-uncertainty Wed, 01 Apr 2026 02:36:14 GMT This paper tackles the problem of integrated task and kinodynamic motion planning in uncertain environments. We consider a robot with nonlinear dynamics tasked with a Linear Temporal Logic over finite traces (LTL$_f$) specification operating in a partially observable environment. Specifically, the uncertainty is in the semantic labels of the environment. We show how the problem can be modeled as a Partially Observable Stochastic Hybrid System that captures the robot dynamics, LTL$_f$ task, and… Robotics Not My Truce: Personality Differences in AI-Mediated Workplace Negotiation https://sciencetostartup.com/paper/not-my-truce-personality-differences-in-ai-mediated-workplace-negotiation https://sciencetostartup.com/paper/not-my-truce-personality-differences-in-ai-mediated-workplace-negotiation Wed, 01 Apr 2026 04:26:26 GMT AI-driven conversational coaching is increasingly used to support workplace negotiation, yet prior work assumes uniform effectiveness across users. We challenge this assumption by examining how individual differences, particularly personality traits, moderate coaching outcomes. We conducted a between-subjects experiment (N=267) comparing theory-driven AI (Trucey), general-purpose AI (Control-AI), and a traditional negotiation handbook (Control-NoAI).
Participants were clustered into three profi… AI Coaching A Physical Imitation Learning Pipeline for Energy-Efficient Quadruped Locomotion Assisted by Parallel Elastic Joint https://sciencetostartup.com/paper/a-physical-imitation-learning-pipeline-for-energy-efficient-quadruped-locomotion-assisted-by-parallel-elastic-joint https://sciencetostartup.com/paper/a-physical-imitation-learning-pipeline-for-energy-efficient-quadruped-locomotion-assisted-by-parallel-elastic-joint Wed, 01 Apr 2026 08:13:54 GMT Due to brain-body co-evolution, animals' intrinsic body dynamics play a crucial role in energy-efficient locomotion, which shares control effort between active muscles and passive body dynamics -- a principle known as Embodied Physical Intelligence. In contrast, robot bodies are often designed with one centralised controller that typically suppresses the intrinsic body dynamics instead of exploiting them. We introduce Physical Imitation Learning (PIL), which distils a Reinforcement Learning (RL) co… Robotics Convergence of Byzantine-Resilient Gradient Tracking via Probabilistic Edge Dropout https://sciencetostartup.com/paper/convergence-of-byzantine-resilient-gradient-tracking-via-probabilistic-edge-dropout https://sciencetostartup.com/paper/convergence-of-byzantine-resilient-gradient-tracking-via-probabilistic-edge-dropout Wed, 01 Apr 2026 03:55:42 GMT We study distributed optimization over networks with Byzantine agents that may send arbitrary adversarial messages. We propose Gradient Tracking with Probabilistic Edge Dropout (GT-PD), a stochastic gradient tracking method that preserves the convergence properties of gradient tracking under adversarial communication.
GT-PD combines two complementary defense layers: a universal self-centered projection that clips each incoming message to a ball of radius $τ$ around the receiving agent, a… Distributed Optimization A CEFR-Inspired Classification Framework with Fuzzy C-Means To Automate Assessment of Programming Skills in Scratch https://sciencetostartup.com/paper/a-cefr-inspired-classification-framework-with-fuzzy-c-means-to-automate-assessment-of-programming-skills-in-scratch https://sciencetostartup.com/paper/a-cefr-inspired-classification-framework-with-fuzzy-c-means-to-automate-assessment-of-programming-skills-in-scratch Wed, 01 Apr 2026 10:42:07 GMT Context: Schools, training platforms, and technology firms increasingly need to assess programming proficiency at scale with transparent, reproducible methods that support personalized learning pathways. Objective: This study introduces a pedagogical framework for Scratch project assessment, aligned with the Common European Framework of Reference (CEFR), providing universal competency levels for students and teachers alongside actionable insights for curriculum design. Method: We apply Fuzzy C-… Educational AI Logarithmic Scores, Power-Law Discoveries: Disentangling Measurement from Coverage in Agent-Based Evaluation https://sciencetostartup.com/paper/logarithmic-scores-power-law-discoveries-disentangling-measurement-from-coverage-in-agent-based-evaluation https://sciencetostartup.com/paper/logarithmic-scores-power-law-discoveries-disentangling-measurement-from-coverage-in-agent-based-evaluation Wed, 01 Apr 2026 04:44:21 GMT LLM-based agent judges are an emerging approach to evaluating conversational AI, yet a fundamental uncertainty remains: can we trust their assessments, and if so, how many are needed? Through 960 sessions with two model pairs across 15 tasks, we show that persona-based agent judges produce evaluations indistinguishable from human raters in a Turing-style validation. 
We then identify a score-coverage dissociation: quality scores improve logarithmically with panel size, while unique issue discove… Agent Evaluation Phase transition on a context-sensitive random language model with short range interactions https://sciencetostartup.com/paper/phase-transition-on-a-context-sensitive-random-language-model-with-short-range-interactions https://sciencetostartup.com/paper/phase-transition-on-a-context-sensitive-random-language-model-with-short-range-interactions Wed, 01 Apr 2026 14:19:11 GMT Since the random language model was proposed by E. DeGiuli [Phys. Rev. Lett. 122, 128301], language models have been investigated intensively from the viewpoint of statistical mechanics. Recently, the existence of a Berezinskii--Kosterlitz--Thouless transition was numerically demonstrated in models with long-range interactions between symbols. In statistical mechanics, it has long been known that long-range interactions can induce phase transitions. Therefore, it has remained unclear whether ph… Language Model Theory