LongCoT: Benchmarking Long-Horizon Chain-of-Thought Reasoning | ScienceToStartup