Causal Prompt Optimization (CPO) is a framework that recasts LLM prompt design as a causal estimation problem in order to reduce performance instability. It uses Double Machine Learning (DML) to learn an unbiased causal reward model that isolates the effect of the prompt, then uses that model to guide a resource-efficient search for query-specific prompts.
Causal Prompt Optimization (CPO) helps make large language models (LLMs) more reliable by automatically creating better prompts. It works out which parts of a prompt actually cause better results, rather than merely being correlated with them, and then uses this understanding to find the best prompt for each specific question without expensive trial and error.
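As a rough illustration of the Double Machine Learning idea named in the definition, the sketch below estimates the effect of a single prompt feature on a reward using cross-fitted residual-on-residual regression. It is not CPO's actual implementation: the synthetic data, the function names (`fit_linear`, `predict_linear`, `dml_effect`, `simulate`), the partially linear model, and the use of plain least squares for the nuisance models are all assumptions made for the example.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares fit with an intercept (stand-in for any ML nuisance model)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef


def dml_effect(X, t, y, n_folds=2, seed=0):
    """Cross-fitted DML estimate of the effect of t on y, adjusting for X."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    t_res = np.empty(len(y))
    y_res = np.empty(len(y))
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        # Nuisance models are fit on the training folds only, then used
        # to residualize the held-out fold (cross-fitting).
        t_res[test] = t[test] - predict_linear(fit_linear(X[train], t[train]), X[test])
        y_res[test] = y[test] - predict_linear(fit_linear(X[train], y[train]), X[test])
    # Regress outcome residuals on treatment residuals.
    return float(t_res @ y_res / (t_res @ t_res))


def simulate(n=4000, true_effect=0.5, seed=1):
    """Hypothetical data: query features X influence both prompt choice and reward."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 3))                 # query features (confounders)
    t = 0.8 * X[:, 0] + rng.normal(size=n)      # prompt feature, depends on the query
    y = true_effect * t + 1.5 * X[:, 0] + rng.normal(size=n)  # observed reward
    return X, t, y


X, t, y = simulate()
naive = float(np.cov(t, y)[0, 1] / np.var(t, ddof=1))  # ignores confounding
dml = dml_effect(X, t, y)
print(f"naive slope: {naive:.2f}, DML estimate: {dml:.2f}, true effect: 0.50")
```

In this synthetic setup the naive slope absorbs the influence of the query features and overstates the prompt feature's effect, while the cross-fitted estimate should land close to the simulated value. A real reward model would likely use more flexible learners for the two nuisance regressions.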
CPO, Causal APO, Causal Prompt Engineering