Recent theoretical work in AI increasingly aims to explain the mechanics of model behavior and to make models more efficient. Work on spectral superposition examines how neural networks fit more features than they have dimensions into a shared representational space, emphasizing the geometric relationships between features; this could improve interpretability and diagnostics in complex models. Research on rectified flow models reports improved sample complexity, pointing to a more efficient alternative to traditional generative models such as diffusion models for data generation and simulation. Analyses of self-rewarding language models provide theoretical guarantees that help explain how iterative alignment improves performance without external feedback. Together, these frameworks clarify existing methods and suggest routes to AI systems that are both more efficient and more interpretable, which matters for commercial applications that need reliable, understandable models.
Self-Rewarding Language Models (SRLMs) achieve notable success in iteratively improving alignment without external feedback. Yet, despite their striking empirical progress, the core mechanisms driving...
Neural networks represent more features than they have dimensions via superposition, forcing features to share representational space. Current methods decompose activations into sparse linear features...
Recently, flow-based generative models have shown superior efficiency compared to diffusion models. In this paper, we study rectified flow models, which constrain transport trajectories to be linear f...
Can language models improve their accuracy without external supervision? Methods such as debate, bootstrap, and internal coherence maximization achieve this surprising feat, even matching golden finet...
For each axiom of KM belief update we provide a corresponding axiom in a modal logic containing three modal operators: a unimodal belief operator $B$, a bimodal conditional operator $>$ and the unimod...
The encounter between human reasoning and generative artificial intelligence (GenAI) cannot be adequately described by inherited metaphors of tool use, augmentation, or collaborative partnership. This...
Contrary to common belief, common belief is not KD4. If individual belief is KD45, common belief does indeed lose the 5 property and keep the D and 4 properties -- and it has none of the other commo...
This paper formalizes religious epistemology through the mathematics of Variational Autoencoders. We model religious traditions as distinct generative mappings from a shared, low-dimensional latent sp...