o1 represents a paradigm in LLM development that improves performance by increasing computational effort during inference, known as test-time compute. It enables models to reach capabilities on par with GPT-4 by leveraging more compute at runtime rather than relying solely on larger model sizes.
o1 is a strategy for large language models that boosts performance by using more computing power while the model is generating responses, rather than just by making the model bigger. This lets a model reach top-tier, GPT-4-level performance by investing in smarter inference rather than sheer scale.
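One simple way to picture test-time compute is best-of-N sampling: draw several candidate answers from the same model and keep the highest-scoring one, so quality scales with inference budget rather than model size. The sketch below is illustrative only; `generate_candidate` is a hypothetical stand-in for an LLM call plus a scoring function, not any real o1 API.

```python
import random

def generate_candidate(prompt, rng):
    # Hypothetical stand-in for one LLM sample; returns (answer, quality score).
    return rng.choice([("answer-a", 0.3), ("answer-b", 0.9), ("answer-c", 0.6)])

def best_of_n(prompt, n, seed=0):
    # Spend more inference compute: draw n samples, keep the best-scoring one.
    rng = random.Random(seed)
    candidates = [generate_candidate(prompt, rng) for _ in range(n)]
    return max(candidates, key=lambda c: c[1])

# A larger n costs more compute at runtime but makes a strong answer more likely,
# without touching the model's parameters at all.
answer, score = best_of_n("example prompt", n=16)
```

Real systems replace the toy scorer with a verifier or reward model, but the trade-off is the same: more runtime samples, better expected output.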
test-time compute, inference-time optimization