data
Hand-labelled evaluation set with known commercial outcomes. New scoring models must beat the previous generation on it before shipping.
Golden Corpus is the regression test for scoring. It is a stratified sample of papers with hand-labelled commercial outcomes, signed by a human reviewer. Every new model iteration replays the Golden Corpus and the result is stamped into the Model Iteration record. A model that does not beat the previous generation cannot ship to the daily pipeline.
Source: curated glossary catalog. Freshness: git_versioned_curated_catalog.
Term API · apps/web/data/glossary/terms.ts