
Product / Daily Dashboard

Daily operator view for ranked research movement

This is the canonical Daily Dashboard surface. It restores the research map, paper previews, GitHub velocity, prediction markets, filters, stats, and sidebars under one product route.

Product Hub · Proof Layer · Trends

Canonical Snapshot Truth

Canonical Daily Dashboard health

This route, the homepage preview, the developer status block, and the public manifests now read the same dashboard snapshot record.

Status: ready
Observed snapshot: 2026-04-22
Last landed snapshot: 2026-04-22
Fresh until: 2026-04-23
Expected recovery: Unknown
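
This page does not publish the snapshot record's schema, so the following is a minimal TypeScript sketch of a shared record consistent with the fields above; the interface name, field names, and the freshness check are assumptions, not the actual contract.

```typescript
// Hypothetical shape of the shared dashboard snapshot record. Field names
// and the freshness rule are illustrative assumptions, not the real schema.
interface DashboardSnapshotRecord {
  status: "ready" | "degraded" | "recovering";
  observedSnapshot: string;    // ISO date, e.g. "2026-04-22"
  lastLandedSnapshot: string;  // ISO date of the last fully landed snapshot
  freshUntil: string;          // ISO date after which the record goes stale
  expectedRecovery?: string;   // undefined renders as "Unknown"
}

// A snapshot stays fresh until "now" passes its freshUntil date.
function isFresh(record: DashboardSnapshotRecord, now = new Date()): boolean {
  return now <= new Date(record.freshUntil);
}
```

Because every surface reads the one record, a consumer holding a `DashboardSnapshotRecord` never has to reconcile conflicting dates across routes.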

ScienceToStartup Research Intelligence

Apr 21 · Papers: 127 · Opportunities: 102

Research Map

Daily cluster surface for the landed snapshot.

Daily Brief

127 ranked papers landed for 2026-04-21. 95 are high-potential, 29 are quick builds, and opportunity share is 80.3%. VLA Foundry: A Unified Framework for Training Vision-Language-Action Models ranked #1 on a Signal Fusion score of 69.5 with fresh evidence, 0 references, and 67% evidence coverage.

Built entirely from persisted dashboard metric snapshots and canonical opportunity kernels.
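
The opportunity share in the brief is straightforward arithmetic over the snapshot counts: 102 opportunities out of 127 ranked papers. A one-function TypeScript sketch (the helper name is ours; the dashboard computes this server-side):

```typescript
// Opportunity share = opportunities / ranked papers, rounded to one decimal.
// Illustrative helper; the canonical value ships inside the snapshot itself.
function opportunityShare(opportunities: number, papers: number): number {
  return Math.round((opportunities / papers) * 1000) / 10;
}

opportunityShare(102, 127); // => 80.3, matching the Daily Brief above
```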

Top Anomalies

No anomaly crossed the dashboard trust threshold versus the last landed snapshot.
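
The page does not define the trust threshold, but the comparison it describes is simple to state. A hedged TypeScript sketch, where the 25% relative-change threshold and all names are assumptions for illustration:

```typescript
// Hypothetical anomaly test: flag a metric only when its relative change
// versus the last landed snapshot exceeds a trust threshold. The 25%
// figure is an assumed placeholder, not the dashboard's actual setting.
const TRUST_THRESHOLD = 0.25;

function crossesTrustThreshold(current: number, lastLanded: number): boolean {
  if (lastLanded === 0) return current !== 0;
  return Math.abs(current - lastLanded) / Math.abs(lastLanded) > TRUST_THRESHOLD;
}
```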

Action Rail

Sign in to hydrate workspace memory. Until then, the rail shows snapshot-backed quick actions.

Sources counted: 0

Sign in

Daily Snapshot

Every card below renders the canonical `MetricContract`, including freshness, provenance, and formula labels.
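
The `MetricContract` schema is not shown on this page; the TypeScript sketch below only mirrors the three labels named above (freshness, provenance, formula) and should be read as a guess at the minimum shape, not the published type.

```typescript
// Speculative sketch of a MetricContract. Field names and types are
// assumptions inferred from the labels this page says each card renders.
interface MetricContract {
  value: number;                  // the metric itself, e.g. a signal score
  freshness: "fresh" | "stale";   // freshness label rendered on the card
  provenance: string[];           // identifiers of the contributing sources
  formulaLabel: string;           // human-readable formula label
  computedAt: string;             // ISO timestamp of server-side computation
}
```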

Trending Today

The highest-ranked papers in the canonical dashboard snapshot, rendered without client-side score recomputation.

Rank #1 · Signal 69.5 · fresh
VLA Foundry: A Unified Framework for Training Vision-Language-Action Models

We present VLA Foundry, an open-source framework that unifies LLM, VLM, and VLA training in a single codebase. Most open-source VLA efforts specialize in the action training stage, often stitching together incompatible pretraining pipelines. VLA Foundry instead provides a shared training stack with end-to-end control, from language pretraining to action-expert fine-tuning. VLA Foundry supports both from-scratch training and pretrained backbones from Hugging Face. To demonstrate the utility of our framework, we train and release two types of models: the first trained fully from scratch through our LLM → VLM → VLA pipeline and the second built on the pretrained Qwen3-VL backbone. We evaluate closed-loop policy performance of both models on LBM Eval, an open-data, open-source simulator. We also contribute usability improvements to the simulator and the STEP analysis tools for easier public use. In the nominal evaluation setting, our fully-open from-scratch model is on par with our prior closed-source work, and substituting in the Qwen3-VL backbone leads to a strong multi-task tabletop manipulation policy outperforming our baseline by a wide margin. The VLA Foundry codebase is available at https://github.com/TRI-ML/vla_foundry and all multi-task model weights are released on https://huggingface.co/collections/TRI-ML/vla-foundry. Additional qualitative videos are available on the project website https://tri-ml.github.io/vla_foundry.

Why This Ranked Here

Signal Fusion 69.5 with fresh evidence, 0 references, and 67% evidence coverage.

Evidence Receipt

proof unverified · repo active · 0 refs · 4 sources

Unresolved: proof verification has not been recorded yet.

Figure Preview

Top extracted figure from the paper figures store.

Score Breakdown

Overall: 8.0 · Technical: 1.4 · Commercial: 5.0 · Market: 5.5 · Team: 5.9 · Method: 4.6

Tech stack: Hugging Face · Qwen3-VL

GitHub Velocity: 141 stars · +1/wk · Health C

Repository stars tracked from cached pulse or recent historical snapshots.
Rank #2 · Signal 69.2 · fresh
A-MAR: Agent-based Multimodal Art Retrieval for Fine-Grained Artwork Understanding

Understanding artworks requires multi-step reasoning over visual content and cultural, historical, and stylistic context. While recent multimodal large language models show promise in artwork explanation, they rely on implicit reasoning and internalized knowledge, limiting interpretability and explicit evidence grounding. We propose A-MAR, an Agent-based Multimodal Art Retrieval framework that explicitly conditions retrieval on structured reasoning plans. Given an artwork and a user query, A-MAR first decomposes the task into a structured reasoning plan that specifies the goals and evidence requirements for each step. Retrieval is then conditioned on this plan, enabling targeted evidence selection and supporting step-wise, grounded explanations. To evaluate agent-based multimodal reasoning within the art domain, we introduce ArtCoT-QA. This diagnostic benchmark features multi-step reasoning chains for diverse art-related queries, enabling a granular analysis that extends beyond simple final answer accuracy. Experiments on SemArt and Artpedia show that A-MAR consistently outperforms static, non-planned retrieval and strong MLLM baselines in final explanation quality, while evaluations on ArtCoT-QA further demonstrate its advantages in evidence grounding and multi-step reasoning ability. These results highlight the importance of reasoning-conditioned retrieval for knowledge-intensive multimodal understanding and position A-MAR as a step toward interpretable, goal-driven AI systems, with particular relevance to cultural industries. The code and data are available at: https://github.com/ShuaiWang97/A-MAR.

Why This Ranked Here

Signal Fusion 69.2 with fresh evidence, 0 references, and 67% evidence coverage.

Evidence Receipt

proof unverified · repo active · 0 refs · 4 sources

Unresolved: proof verification has not been recorded yet.

Figure Preview

Top extracted figure from the paper figures store.

Score Breakdown

Overall: 8.0 · Technical: 1.4 · Commercial: 5.0 · Market: 5.5 · Team: 6.5 · Method: 3.8

GitHub Velocity: 0 stars · 0/wk · Health C

Repository stars tracked from cached pulse or recent historical snapshots.
Rank #3 · Signal 68.7 · fresh
DT2IT-MRM: Debiased Preference Construction and Iterative Training for Multimodal Reward Modeling

Multimodal reward models (MRMs) play a crucial role in aligning Multimodal Large Language Models (MLLMs) with human preferences. Training a good MRM requires high-quality multimodal preference data. However, existing preference datasets face three key challenges: lack of granularity in preference strength, textual style bias, and unreliable preference signals. Besides, existing open-source multimodal preference datasets suffer from substantial noise, yet there is a lack of effective and scalable curation methods to enhance their quality. To address these limitations, we propose DT2IT-MRM, which integrates a Debiased preference construction pipeline, a novel reformulation of text-to-image (T2I) preference data, and an Iterative Training framework that curates existing multimodal preference datasets for Multimodal Reward Modeling. Our experimental results show that DT2IT-MRM achieves new state-of-the-art overall performance on three major benchmarks: VL-RewardBench, Multimodal RewardBench, and MM-RLHF-RewardBench.

Why This Ranked Here

Signal Fusion 68.7 with fresh evidence, 0 references, and 67% evidence coverage.

Evidence Receipt

proof unverified · repo active · 0 refs · 4 sources

Unresolved: proof verification has not been recorded yet.

Figure Preview

Top extracted figure from the paper figures store.

Score Breakdown

Overall: 8.0 · Technical: 1.4 · Commercial: 5.0 · Market: 5.5 · Team: 5.5 · Method: 4.3

GitHub Velocity: 0 stars · 0/wk · Health C

Repository stars tracked from cached pulse or recent historical snapshots.
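Each card's GitHub Velocity figure is described as coming from a cached pulse or recent historical snapshots. One plausible reading of that fallback chain, sketched in TypeScript; the function and both store shapes are assumptions, not the product's code:

```typescript
// Hypothetical cache-first star lookup: prefer the cached pulse, fall back
// to the latest historical snapshot, otherwise report "no repo".
function trackedStars(
  repo: string,
  cachedPulse: Map<string, number>,
  history: Map<string, number[]>
): number | "no repo" {
  const pulse = cachedPulse.get(repo);
  if (pulse !== undefined) return pulse;

  const past = history.get(repo);
  if (past !== undefined && past.length > 0) return past[past.length - 1];

  return "no repo";
}
```

This would explain why some cards show a star count with a health grade while others fall through to "no repo".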

Ranked Opportunities

Canonical score, evidence, and direct execution links. Filtered locally for discovery only; rank, score, and freshness remain server-owned.
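
To make "filtered locally" concrete: the client may narrow which rows are visible, but it never re-sorts or re-scores them. A minimal TypeScript sketch under that assumption (the type and function names are ours, not the product's):

```typescript
// Server-owned fields arrive precomputed; the client only narrows visibility.
interface RankedOpportunity {
  rank: number;                  // server-owned, never recomputed client-side
  signal: number;                // server-owned Signal Fusion score
  freshness: "fresh" | "stale";  // server-owned freshness label
  title: string;
}

// Discovery filter: preserves server order because filter() keeps the
// original relative ordering and nothing here sorts or rescores.
function filterForDiscovery(
  rows: RankedOpportunity[],
  query: string
): RankedOpportunity[] {
  const q = query.toLowerCase();
  return rows.filter((row) => row.title.toLowerCase().includes(q));
}
```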

Rank #4 · Signal 68.6 · fresh
SCURank: Ranking Multiple Candidate Summaries with Summary Content Units for Enhanced Summarization

Small language models (SLMs), such as BART, can achieve summarization performance comparable to large language models (LLMs) via distillation. However, existing LLM-based ranking strategies for summary candidates suffer from instability, while classical metrics (e.g., ROUGE) are insufficient to rank high-quality summaries. To address these issues, we introduce SCURank, a framework that enhances summarization by leveraging Summary Content Units (SCUs). Instead of relying on unstable comparisons or surface-level overlap, SCURank evaluates summaries based on the richness and semantic importance of information content. We investigate the effectiveness of SCURank in distilling summaries from multiple diverse LLMs. Experimental results demonstrate that SCURank outperforms traditional metrics and LLM-based ranking methods across evaluation measures and datasets. Furthermore, our findings show that incorporating diverse LLM summaries enhances model abstractiveness and overall distilled model performance, validating the benefits of information-centric ranking in multi-LLM distillation. The code for SCURank is available at https://github.com/IKMLab/SCURank.

Why This Ranked Here

Signal Fusion 68.6 with fresh evidence, 0 references, and 67% evidence coverage.

Evidence Receipt

proof unverified · repo active · 0 refs · 4 sources

Unresolved: proof verification has not been recorded yet.

Figure Preview

Top extracted figure from the paper figures store.

Score Breakdown

Overall: 8.0 · Technical: 1.4 · Commercial: 5.0 · Market: 5.5 · Team: 5.3 · Method: 4.3

Tech stack: BART

GitHub Velocity: no repo · 0/wk

Repository stars tracked from cached pulse or recent historical snapshots.
Rank #5 · Signal 67.2 · fresh
AutoAWG: Adverse Weather Generation with Adaptive Multi-Controls for Automotive Videos

Perception robustness under adverse weather remains a critical challenge for autonomous driving, with the core bottleneck being the scarcity of real-world video data in adverse weather. Existing weather generation approaches struggle to balance visual quality and annotation reusability. We present AutoAWG, a controllable Adverse Weather video Generation framework for Autonomous driving. Our method employs a semantics-guided adaptive fusion of multiple controls to balance strong weather stylization with high-fidelity preservation of safety-critical targets; leverages a vanishing point-anchored temporal synthesis strategy to construct training sequences from static images, thereby reducing reliance on synthetic data; and adopts masked training to enhance long-horizon generation stability. On the nuScenes validation set, AutoAWG significantly outperforms prior state-of-the-art methods: without first-frame conditioning, FID and FVD are relatively reduced by 50.0% and 16.1%; with first-frame conditioning, they are further reduced by 8.7% and 7.2%, respectively. Extensive qualitative and quantitative results demonstrate advantages in style fidelity, temporal consistency, and semantic–structural integrity, underscoring the practical value of AutoAWG for improving downstream perception in autonomous driving. Our code is available at: https://github.com/higherhu/AutoAWG

Why This Ranked Here

Signal Fusion 67.2 with fresh evidence, 0 references, and 67% evidence coverage.

Evidence Receipt

proof unverified · repo active · 0 refs · 4 sources

Unresolved: proof verification has not been recorded yet.

Figure Preview

Top extracted figure from the paper figures store.

Score Breakdown

Overall: 8.0 · Technical: 1.4 · Commercial: 5.0 · Market: 5.5 · Team: 4.7 · Method: 3.6

GitHub Velocity: no repo · 0/wk

Repository stars tracked from cached pulse or recent historical snapshots.
Rank #6 · Signal 65.7 · fresh
From Experience to Skill: Multi-Agent Generative Engine Optimization via Reusable Strategy Learning

Generative engines (GEs) are reshaping information access by replacing ranked links with citation-grounded answers, yet current Generative Engine Optimization (GEO) methods optimize each instance in isolation, unable to accumulate or transfer effective strategies across tasks and engines. We reframe GEO as a strategy learning problem and propose MAGEO, a multi-agent framework in which coordinated planning, editing, and fidelity-aware evaluation serve as the execution layer, while validated editing patterns are progressively distilled into reusable, engine-specific optimization skills. To enable controlled assessment, we introduce a Twin Branch Evaluation Protocol for causal attribution of content edits and DSV-CF, a dual-axis metric that unifies semantic visibility with attribution accuracy. We further release MSME-GEO-Bench, a multi-scenario, multi-engine benchmark grounded in real-world queries. Experiments on three mainstream engines show that MAGEO substantially outperforms heuristic baselines in both visibility and citation fidelity, with ablations confirming that engine-specific preference modeling and strategy reuse are central to these gains, suggesting a scalable learning-driven paradigm for trustworthy GEO. Code is available at https://github.com/Wu-beining/MAGEO

Why This Ranked Here

Signal Fusion 65.7 with fresh evidence, 0 references, and 67% evidence coverage.

Evidence Receipt

proof unverified · repo active · 0 refs · 4 sources

Unresolved: proof verification has not been recorded yet.

Figure Preview

Top extracted figure from the paper figures store.

Score Breakdown

Overall: 7.0 · Technical: 1.4 · Commercial: 5.0 · Market: 5.5 · Team: 6.5 · Method: 4.7

GitHub Velocity: 2 stars · 0/wk · Health C

Repository stars tracked from cached pulse or recent historical snapshots.
Rank #7 · Signal 65.5 · fresh
Four-Axis Decision Alignment for Long-Horizon Enterprise AI Agents

Long-horizon enterprise agents make high-stakes decisions (loan underwriting, claims adjudication, clinical review, prior authorization) under lossy memory, multi-step reasoning, and binding regulatory constraints. Current evaluation reports a single task-success scalar that conflates distinct failure modes and hides whether an agent is aligned with the standards its deployment environment requires. We propose that long-horizon decision behavior decomposes into four orthogonal alignment axes, each independently measurable and failable: factual precision (FRP), reasoning coherence (RCS), compliance reconstruction (CRR), and calibrated abstention (CAR). CRR is a novel regulatory-grounded axis; CAR is a measurement axis separating coverage from accuracy. We exercise the decomposition on a controlled benchmark (LongHorizon-Bench) covering loan qualification and insurance claims adjudication with deterministic ground-truth construction. Running six memory architectures, we find structure that aggregate accuracy cannot see: retrieval collapses on factual precision; schema-anchored architectures pay a scaffolding tax; plain summarization under a fact-preservation prompt is a strong baseline on FRP, RCS, EDA, and CRR; and all six architectures commit on every case, exposing a decisional-alignment axis the field has not targeted. The decomposition also surfaced a pre-registered prediction of our own, that summarization would fail factual recall, which the data reversed at large magnitude: an axis-level reversal that aggregate accuracy would have hidden. Institutional alignment (regulatory reconstruction) and decisional alignment (calibrated abstention) are under-represented in the alignment literature and become load-bearing once decisions leave the laboratory. The framework transfers to any regulated decisioning domain via two steps: build a fact schema, and calibrate the CRR auditor prompt.

Why This Ranked Here

Signal Fusion 65.5 with fresh evidence, 0 references, and 67% evidence coverage.

Evidence Receipt

proof unverified · repo active · 0 refs · 4 sources

Unresolved: proof verification has not been recorded yet.

Figure Preview

Top extracted figure from the paper figures store.

Score Breakdown

Overall: 8.0 · Technical: 1.4 · Commercial: 5.0 · Market: 5.5 · Team: 2.7 · Method: 3.9

Tech stack: PyTorch · Hugging Face

GitHub Velocity: 0 stars · 0/wk · Health C

Repository stars tracked from cached pulse or recent historical snapshots.
Rank #8 · Signal 64.8 · fresh
FASTER: Value-Guided Sampling for Fast RL

Some of the most performant reinforcement learning algorithms today can be prohibitively expensive as they use test-time scaling methods such as sampling multiple action candidates and selecting the best one. In this work, we propose FASTER, a method for getting the benefits of sampling-based test-time scaling of diffusion-based policies without the computational cost by tracing the performance gain of action samples back to earlier in the denoising process. Our key insight is that we can model the denoising of multiple action candidates and selecting the best one as a Markov Decision Process (MDP) where the goal is to progressively filter action candidates before denoising is complete. With this MDP, we can learn a policy and value function in the denoising space that predicts the downstream value of action candidates in the denoising process and filters them while maximizing returns. The result is a method that is lightweight and can be plugged into existing generative RL algorithms. Across challenging long-horizon manipulation tasks in online and batch-online RL, FASTER consistently improves the underlying policies and achieves the best overall performance among the compared methods. Applied to a pretrained VLA, FASTER achieves the same performance while substantially reducing training and inference compute requirements. Code is available at https://github.com/alexanderswerdlow/faster

Why This Ranked Here

Signal Fusion 64.8 with fresh evidence, 0 references, and 67% evidence coverage.

Evidence Receipt

proof unverified · repo active · 0 refs · 4 sources

Unresolved: proof verification has not been recorded yet.

Figure Preview

Top extracted figure from the paper figures store.

Score Breakdown

Overall: 7.0 · Technical: 1.4 · Commercial: 5.0 · Market: 5.5 · Team: 6.5 · Method: 3.9

Tech stack: PyTorch

GitHub Velocity: 0 stars · 0/wk · Health C

Repository stars tracked from cached pulse or recent historical snapshots.
Rank #9 · Signal 64.4 · fresh
Chat2Workflow: A Benchmark for Generating Executable Visual Workflows with Natural Language

At present, executable visual workflows have emerged as a mainstream paradigm in real-world industrial deployments, offering strong reliability and controllability. However, in current practice, such workflows are almost entirely constructed through manual engineering: developers must carefully design workflows, write prompts for each step, and repeatedly revise the logic as requirements evolve, making development costly, time-consuming, and error-prone. To study whether large language models can automate this multi-round interaction process, we introduce Chat2Workflow, a benchmark for generating executable visual workflows directly from natural language, and propose a robust agentic framework to mitigate recurrent execution errors. Chat2Workflow is built from a large collection of real-world business workflows, with each instance designed so that the generated workflow can be transformed and directly deployed to practical workflow platforms such as Dify and Coze. Experimental results show that while state-of-the-art language models can often capture high-level intent, they struggle to generate correct, stable, and executable workflows, especially under complex or changing requirements. Although our agentic framework yields up to 5.34% resolve rate gains, the remaining real-world gap positions Chat2Workflow as a foundation for advancing industrial-grade automation. Code is available at https://github.com/zjunlp/Chat2Workflow.

Why This Ranked Here

Signal Fusion 64.4 with fresh evidence, 0 references, and 67% evidence coverage.

Evidence Receipt

proof unverified · repo active · 0 refs · 4 sources

Unresolved: proof verification has not been recorded yet.

Figure Preview

Top extracted figure from the paper figures store.

Score Breakdown

Overall: 7.0 · Technical: 1.4 · Commercial: 5.0 · Market: 5.5 · Team: 5.5 · Method: 4.5

Tech stack: PyTorch

GitHub Velocity: 10 stars · 0/wk · Health C

Repository stars tracked from cached pulse or recent historical snapshots.
Rank #10 · Signal 63.5 · fresh
SAMoRA: Semantic-Aware Mixture of LoRA Experts for Task-Adaptive Learning

The combination of Mixture-of-Experts (MoE) and Low-Rank Adaptation (LoRA) has shown significant potential for enhancing the multi-task learning capabilities of Large Language Models. However, existing methods face two primary challenges: (1) Imprecise Routing in the current MoE-LoRA method fails to explicitly match input semantics with expert capabilities, leading to weak expert specialization. (2) Uniform weight fusion strategies struggle to provide adaptive update strengths, overlooking the varying complexity of different tasks. To address these limitations, we propose SAMoRA (Semantic-Aware Mixture of LoRA Experts), a novel parameter-efficient fine-tuning framework tailored for task-adaptive learning. Specifically, a Semantic-Aware Router is proposed to explicitly align textual semantics with the most suitable experts for precise routing. A Task-Adaptive Scaling mechanism is designed to regulate expert contributions based on specific task requirements dynamically. In addition, a novel regularization objective is proposed to jointly promote expert specialization and effective scaling. Extensive experiments on multiple multi-task benchmarks demonstrate that SAMoRA significantly outperforms the state-of-the-art methods and holds excellent task generalization capabilities. Code is available at https://github.com/boyan-code/SAMoRA

Why This Ranked Here

Signal Fusion 63.5 with fresh evidence, 0 references, and 67% evidence coverage.

Evidence Receipt

proof unverified · repo active · 0 refs · 4 sources

Unresolved: proof verification has not been recorded yet.

Figure Preview

Top extracted figure from the paper figures store.

Score Breakdown

Overall: 7.0 · Technical: 1.4 · Commercial: 5.0 · Market: 5.5 · Team: 4.7 · Method: 4.3

Tech stack: PyTorch · LoRA

GitHub Velocity: 1 star · 0/wk · Health D

Repository stars tracked from cached pulse or recent historical snapshots.
Snapshot ready · Snapshot 2026-04-21

Canonical dashboard metrics and ranked papers are current.

Computed: Apr 22, 2:18 AM · Coverage: 52% · Sources counted: 1377 · Last landed snapshot: 2026-04-21 · Last known good: 2026-04-21
missing: trend_points.opportunity_share
Live News

Last updated Apr 13, 2026, 3:15 PM

Feed delayed, retrying.

Feed delayed, retrying.