How are new benchmarks like InterveneBench driving progress in causal inference for LLMs?

Reviewed by ScienceToStartup Editorial
Updated 4/16/2026

Answer not yet generated.