How can NLP evaluation frameworks like Omanic be made more a | ScienceToStartup