AI Agents Learn Expertise, News Retrieval Accelerates | ScienceToStartup

🧠 AI Agents

COLLEAGUE.SKILL: Distilling Expert AI Capabilities

The Rundown

Researchers introduced COLLEAGUE.SKILL, a system that automates the generation of AI skills by distilling expert knowledge from heterogeneous traces. The approach goes beyond prompt engineering: it aims to give AI agents bounded representations of human expertise, judgment, and interaction styles. The system produces versioned skill packages with two coordinated tracks: a capability track for practices and decision heuristics, and a bounded behavior track for communication style and interaction rules. The resulting skills are inspectable, correctable, and agent-usable. The open-source system has approximately 18.5k GitHub stars and has cataloged 215 skills from 165 contributors, with more than 100k stars in total across the listed skill cards. The work is a step toward person-grounded AI agents that can reliably mimic and extend human professional capabilities, and it offers a structured way to package and deploy specialized AI expertise.

The details

COLLEAGUE.SKILL distills expert knowledge from heterogeneous traces into inspectable and correctable AI skills.
The system generates versioned skill packages with separate tracks for capabilities and bounded behaviors.
The open-source system has garnered approximately 18.5k GitHub stars.
It currently lists 215 skills contributed by 165 individuals.
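
To make the two-track idea concrete, here is a minimal sketch of what a versioned skill package could look like as a data structure. The source does not publish a schema, so the class name `SkillPackage`, its fields, and the helper `apply_correction` are all hypothetical; the sketch only illustrates keeping the capability and behavior tracks separate and bumping the version when a correction is applied.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class SkillPackage:
    """Hypothetical container for one versioned skill (not the project's real schema)."""
    name: str
    version: int
    # Capability track: practices and decision heuristics.
    capabilities: Tuple[str, ...] = ()
    # Bounded behavior track: communication style and interaction rules.
    behaviors: Tuple[str, ...] = ()


def apply_correction(pkg: SkillPackage, track: str, old: str, new: str) -> SkillPackage:
    """Return a new package version with one entry on one track replaced."""
    if track not in ("capabilities", "behaviors"):
        raise ValueError(f"unknown track: {track}")
    entries = getattr(pkg, track)
    if old not in entries:
        raise ValueError(f"entry not found on {track} track: {old}")
    updated = tuple(new if entry == old else entry for entry in entries)
    # The original package is left untouched, so earlier versions stay inspectable.
    return replace(pkg, version=pkg.version + 1, **{track: updated})


v1 = SkillPackage(
    name="code-reviewer",
    version=1,
    capabilities=("Check error handling before style",),
    behaviors=("Give feedback as questions",),
)
v2 = apply_correction(
    v1, "behaviors", "Give feedback as questions", "Give direct, specific feedback"
)
print(v2.version, v2.behaviors)
```

Keeping the package immutable and returning a new version on each correction is one simple way to get the inspectable, correctable history the article describes; the real system may handle versioning differently.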

Why it matters

Startups can leverage COLLEAGUE.SKILL to rapidly develop specialized AI agents that embody unique company expertise or replicate high-performing employee workflows, accelerating product development and customer service.

📰 AI for Media

DynaTree: Faster, Smarter News Retrieval

The Rundown

DynaTree, a two-stage framework, improves time-sensitive news retrieval by decoupling planning from retrieval. Existing agentic RAG methods often couple the two, which raises inference cost and slows responses, a particular problem for news that must surface quickly. In the offline stage, coordinated agents construct a reusable retrieval tree that maps out the semantic space of a query topic. In the online stage, DynaTree performs lightweight daily subtree selection based on a time-localized evaluation proxy, with no further agentic reasoning or tree modification. Deployed in the Syft production system, DynaTree's dynamically adapted variant improved survival rates from 0.32-0.53 to 0.59-0.73 in A/B testing from Jan. 28 to Feb. 6, 2026. It consistently outperformed existing production recallers on multiple evaluation days, demonstrating the value of persistent, structure-aware semantic expansion for real-world news applications.

The details

DynaTree uses a two-stage framework to improve efficiency in time-sensitive news retrieval.
An offline stage builds a reusable retrieval tree, materializing the semantic space of a query topic.
The online stage performs lightweight daily subtree selection without further agentic reasoning.
In production A/B testing, DynaTree improved survival rates from 0.32-0.53 to 0.59-0.73.
It consistently outperformed existing production recallers on multiple evaluation days.
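
The split between the two stages can be sketched in a few lines. This is an illustration under stated assumptions, not DynaTree's implementation: the `Node` class, the `select_subtrees` function, the example topic tree, and the `daily_score` values (standing in for the time-localized evaluation proxy) are all hypothetical. The tree is written by hand here; in the described system it is built offline by coordinated agents.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    query: str
    children: List["Node"] = field(default_factory=list)


def leaves(node: Node) -> List[str]:
    """Collect the retrieval queries at the leaves of a subtree."""
    if not node.children:
        return [node.query]
    out: List[str] = []
    for child in node.children:
        out.extend(leaves(child))
    return out


def select_subtrees(root: Node, daily_score: Dict[str, float], k: int) -> List[Node]:
    """Online step: pick the k top-level subtrees with the best score for today.

    The tree is only read, never modified, and no agent or LLM is called.
    """
    ranked = sorted(
        root.children, key=lambda n: daily_score.get(n.query, 0.0), reverse=True
    )
    return ranked[:k]


# Offline stage (assumed output): a reusable tree for one query topic.
tree = Node("electric vehicles", [
    Node("battery technology", [Node("solid-state batteries"), Node("battery recycling")]),
    Node("charging infrastructure"),
    Node("policy", [Node("subsidies"), Node("tariffs")]),
])

# Online stage: hypothetical proxy scores computed for today's news.
daily_score = {"battery technology": 0.2, "charging infrastructure": 0.1, "policy": 0.9}
chosen = select_subtrees(tree, daily_score, k=2)
queries = [q for node in chosen for q in leaves(node)]
print([n.query for n in chosen], queries)
```

Because the online step is a sort over precomputed nodes, its cost is independent of how expensive the offline tree construction was, which is the efficiency argument the article makes.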

Why it matters

For media startups and content platforms, DynaTree offers a pathway to deliver fresher, more relevant news faster. This efficiency gain can translate to better user engagement and a competitive edge in the fast-paced news cycle.

🖼️ Vision-Language Models

FBHM: Tackling Hateful Memes with VLMs

The Rundown

Detecting hateful memes remains a significant challenge for vision-language models (VLMs). Existing benchmarks often confound rhetorical mechanisms with target community features, which hinders causal evaluation of model vulnerabilities. To address this, researchers introduced FBHM (Functionality Based Hateful Memes), a benchmark of 5,000 memes structured along 25 distinct rhetorical functionalities and 10 target communities. Benchmarking the best current VLMs revealed a severe generalization gap, with models dropping to near-random performance on FBHM. To close this gap efficiently, the researchers propose LSV (learnable steering vectors), a strategy designed for the ultra-low-data regime. Using as few as 500 steering samples, LSV boosted FBHM performance by approximately 30 Macro-F1 points, outperforming in-context learning and PEFT without degrading source-domain performance. The benchmark itself supports more robust, causal evaluation of VLM capabilities in sensitive content detection.
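
The source does not describe how LSV is implemented, so the following is only a generic toy illustration of the idea behind a learnable steering vector: the model's weights stay frozen, and a single vector added to the hidden state is trained on a handful of labeled samples. Everything here is an assumption made for illustration, including the two-dimensional "hidden states", the frozen linear head `w`, `b`, and the plain logistic loss (the FAQ mentions a causal intervention objective, which this sketch does not attempt to reproduce).

```python
import math
from typing import List, Tuple

Sample = Tuple[List[float], int]  # (hidden state, binary label)


def predict(h: List[float], v: List[float], w: List[float], b: float) -> float:
    """Frozen linear head applied to the steered hidden state h + v."""
    steered = [hi + vi for hi, vi in zip(h, v)]
    logit = sum(wi * si for wi, si in zip(w, steered)) + b
    return 1.0 / (1.0 + math.exp(-logit))


def train_steering_vector(samples: List[Sample], w: List[float], b: float,
                          lr: float = 0.5, steps: int = 200) -> List[float]:
    """Gradient descent on v only; w and b are never updated."""
    v = [0.0] * len(w)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for h, y in samples:
            err = predict(h, v, w, b) - y  # d(logistic loss)/d(logit)
            for i, wi in enumerate(w):
                grad[i] += err * wi / len(samples)
        v = [vi - lr * gi for vi, gi in zip(v, grad)]
    return v


def accuracy(samples: List[Sample], v: List[float], w: List[float], b: float) -> float:
    hits = sum((predict(h, v, w, b) >= 0.5) == bool(y) for h, y in samples)
    return hits / len(samples)


# Frozen head that over-predicts the positive class on the new domain.
w, b = [1.0, -1.0], 0.0
steering_samples: List[Sample] = [
    ([2.0, 1.5], 0), ([3.0, 1.0], 1), ([1.5, 1.0], 0), ([2.5, 0.5], 1),
]
zero = [0.0, 0.0]
v = train_steering_vector(steering_samples, w, b)
before = accuracy(steering_samples, zero, w, b)
after = accuracy(steering_samples, v, w, b)
print(before, after)
```

With a linear head the learned vector amounts to a shift of the decision threshold; in a deep model, activation steering is typically applied at an intermediate layer, where the effect is less trivial. The appeal in a low-data setting is that only one vector is trained rather than the model's weights.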

The details

FBHM is a new benchmark for hateful meme detection, featuring 25 rhetorical functionalities and 10 target communities.
The best current VLMs show a severe generalization gap, performing poorly on FBHM compared to standard datasets.
LSV (learnable steering vectors) is proposed to efficiently close this performance gap.
LSV improves FBHM performance by ~30 Macro-F1 points using only 500 steering samples.
The method outperforms in-context learning and PEFT without degrading source-domain performance.
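
For readers unfamiliar with the metric, Macro-F1 is the unweighted mean of the per-class F1 scores, so a gain of about 30 points corresponds to roughly 0.30 on a 0 to 1 scale. The sketch below computes it from scratch; the function name `macro_f1` and the example labels are illustrative and are not data from the FBHM paper.

```python
from typing import Hashable, Sequence


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 over all labels seen in either sequence."""
    labels = set(y_true) | set(y_pred)
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


y_true = [1, 1, 1, 1, 0, 0, 0, 0]
always_hateful = [1, 1, 1, 1, 1, 1, 1, 1]   # a degenerate classifier
mostly_right = [1, 1, 1, 0, 0, 0, 0, 1]     # one error per class

degenerate_score = macro_f1(y_true, always_hateful)
better_score = macro_f1(y_true, mostly_right)
print(round(degenerate_score, 3), better_score)
```

Because every class counts equally, a model that collapses to a single prediction is penalized heavily, which is why Macro-F1 is a common choice for detection tasks like this one.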

Why it matters

Startups developing content moderation tools can use FBHM and LSV to build more robust VLM-based systems. This capability is crucial for platforms aiming to combat online hate speech effectively and responsibly.

Frequently Asked Questions

COLLEAGUE.SKILL is a system that automates the generation of AI skills by distilling expert knowledge from various sources into reusable formats.

DynaTree uses a two-stage process, building a retrieval tree offline and then performing lightweight daily selections online, making news retrieval faster and more adaptive.

FBHM is a benchmark designed to evaluate vision-language models on hateful meme detection, focusing on rhetorical functionalities and target communities.

LSV is a strategy using minimal data to improve VLM performance on challenging tasks like hateful meme detection.

COLLEAGUE.SKILL aims to imbue AI agents with bounded representations of human expertise, judgment, and interaction styles.

For breaking news and rapidly evolving events, fast and accurate retrieval is crucial for timely information dissemination and decision-making.

Existing benchmarks often confound rhetorical mechanisms with target community features, preventing causal evaluation of model vulnerabilities.

Startups can use AI skill generation to create specialized agents that embody unique company expertise or replicate efficient workflows.

The offline stage in DynaTree constructs a reusable retrieval tree, mapping the semantic space of a query topic for efficient online use.

AutoSci aims to be a memory-centric agentic system that manages the full scientific research lifecycle, from literature review to manuscript preparation.

LSV operates in an ultra-low-data regime and uses a causal intervention objective, outperforming methods like PEFT and in-context learning.

Agents that learn interaction styles can provide more natural and effective human-AI collaboration, improving user experience.

DynaTree's principles of offline semantic mapping and online adaptive selection could potentially be adapted for other retrieval-intensive tasks.

A structured benchmark like FBHM allows for more precise evaluation of VLM capabilities and vulnerabilities in detecting harmful content.

COLLEAGUE.SKILL produces inspectable and correctable skill packages, allowing for natural language feedback and updates.

Related Articles

Jun 8
AI Agents Automate Work, TTS Models Shrink
AI agents slash knowledge work time, new TTS models hit low latency.

Jun 5
Ultralytics YOLO26: Real-Time Vision Gets Smarter
Ultralytics YOLO26, EvoDS agents, and Humanoid-GPT push real-time vision and autonomous data science.

Jun 1
AI Agents Get Personal Skills, News Retrieval Gets Faster
Automated skill generation, dynamic news retrieval, and hateful meme detection.