When simulations look right but causal effects go wrong: Large language models as behavioral simulators