Recent enterprise AI research increasingly focuses on how well large language models (LLMs) operate inside complex organizational environments. Two new benchmarks, World of Workflows and OurBench, expose LLM limitations in understanding intricate workflows and in debugging SQL, respectively, and reveal performance gaps that must be closed before practical deployment. Work on routing natural language queries across multi-database systems underscores the need for structured reasoning to improve retrieval accuracy. Architectures such as REGAL aim to provide deterministic grounding for agentic AI over enterprise telemetry, addressing challenges of model context and semantic alignment. Together, these efforts signal a shift toward more reliable, context-aware AI systems that can manage the complexity of enterprise operations, potentially reducing coordination costs and reshaping organizational structures.
Frontier large language models (LLMs) excel as autonomous agents in many domains, yet they remain untested in complex enterprise systems where hidden workflows create cascading effects across intercon...
SQL is central to enterprise data engineering, yet generating fully correct SQL code in a single attempt remains difficult, even for experienced developers and advanced text-to-SQL LLMs, often requiri...
We address the task of routing natural language queries in multi-database enterprise environments. We construct realistic benchmarks by extending existing NL-to-SQL datasets. Our study shows that rout...
Enterprise engineering organizations produce high-volume, heterogeneous telemetry from version control systems, CI/CD pipelines, issue trackers, and observability platforms. Large Language Models (LLM...
The boundary of the firm is determined by coordination cost. We argue that agentic AI induces a structural change in how coordination costs scale: in prior modular systems, integration cost grew with ...