DPO | Glossary | ScienceToStartup
DPO
Definition
DPO (Direct Preference Optimization) is a research field in our research taxonomy.
Related papers
From Baselines to Preferences: A Comparative Study of LoRA/QLoRA and Preference Optimization for Mental Health Text Classification
ARIADNE: A Perception-Reasoning Synergy Framework for Trustworthy Coronary Angiography Analysis
Stabilizing Iterative Self-Training with Verified Reasoning via Symbolic Recursive Self-Alignment
GameTalk: Training LLMs for Strategic Conversation
From Oracle to Noisy Context: Mitigating Contextual Exposure Bias in Speech-LLMs
Retrieval Improvements Do Not Guarantee Better Answers: A Study of RAG for AI Policy QA
IntroSVG: Learning from Rendering Feedback for Text-to-SVG Generation via an Introspective Generator-Critic Framework
QualiTeacher: Quality-Conditioned Pseudo-Labeling for Real-World Image Restoration
wDPO: Winsorized Direct Preference Optimization for Robust LLM Alignment
DARC: Disagreement-Aware Alignment via Risk-Constrained Decoding
The Alignment Tax: Response Homogenization in Aligned LLMs and Its Implications for Uncertainty Estimation
AdaRubric: Task-Adaptive Rubrics for LLM Agent Evaluation
It's Time to Get It Right: Improving Analog Clock Reading and Clock-Hand Spatial Reasoning in Vision-Language Models
Learning Latent Proxies for Controllable Single-Image Relighting
LLMOrbit: A Circular Taxonomy of Large Language Models -From Scaling Walls to Agentic AI Systems
From Prompting to Preference Optimization: A Comparative Study of LLM-based Automated Essay Scoring