33 papers - avg viability 3.8
Current research on large language models (LLMs) increasingly focuses on understanding their internal mechanisms and improving their outputs for practical use. Recent work on persona discovery highlights the role of discourse coherence, suggesting that deeper semantic structure, rather than surface lexical patterns, governs LLM behavior; this has implications for building more relatable AI interactions. The emergence of verbal tics in model outputs raises concerns about the authenticity of LLM-generated text and suggests that alignment techniques may carry a cost in naturalness. Researchers are also testing whether different LLMs behave independently of one another, and are uncovering behavioral entanglements that could compromise multi-model systems. New frameworks for analyzing representation drift and reasoning trajectories aim to improve model reliability and performance. Together, these efforts target practical obstacles to deploying LLMs in sectors ranging from healthcare to social media, with the goal of more nuanced, coherent, and independent AI systems that engage better with human users.
A method for discovering and analyzing LLM personas through bridging inference techniques.
This research provides evidence that agent identity induces attractor-like geometry in LLM activation space, offering a new way to understand and potentially control LLM behavior.
A systematic analysis and metric for quantifying and reducing repetitive verbal tics in LLMs, addressing the 'alignment tax' for more natural human-AI interaction.
A statistical framework to audit and mitigate behavioral entanglement in large language models, improving ensemble verification accuracy.
A geometric risk bound that decomposes LLM variant drift into scale, shape, and head components, enabling targeted remediation and acting as a regularizer against forgetting.
A protocol for probing the efficiency of reasoning trajectories in language models, aimed at safer deployment.
This paper probes BERT embeddings to test whether they encode narrative dimensions of fiction such as time, space, causality, and character, and reports 94% accuracy with a linear probe.
A framework for conditional hypothesis generation that incorporates researcher-specified covariates to discover interpretable language differences within relevant subgroups.
This research analyzes how LLMs represent human word associations, revealing differences in variability and typicality based on model size and temperature settings.
A synthetic corpus and interactive platform for analyzing LLM discourse across diverse human personas and societal topics to audit bias and social sensitivity.
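For readers unfamiliar with the linear probe mentioned in the BERT entry above, here is a minimal sketch of the general technique: a logistic regression trained on frozen vectors to see whether a label is linearly decodable from them. It is not the paper's code. The function names are hypothetical, and the "embeddings" are synthetic Gaussian clusters standing in for real BERT outputs.

```python
import math
import random


def make_synthetic_embeddings(n, dim, seed):
    """Return n (vector, label) pairs drawn from two Gaussian clusters.

    Hypothetical stand-in for frozen model embeddings; not real BERT output.
    """
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2                      # alternate the two classes
        center = 1.0 if label == 1 else -1.0
        data.append(([rng.gauss(center, 1.0) for _ in range(dim)], label))
    return data


def score(w, b, x):
    """Linear score w.x + b; the probe has no hidden layers."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_probe(data, epochs=20, lr=0.05):
    """Fit logistic regression by stochastic gradient descent."""
    w = [0.0] * len(data[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            z = max(-30.0, min(30.0, score(w, b, x)))  # clamp to avoid overflow in exp
            grad = 1.0 / (1.0 + math.exp(-z)) - y      # d(log loss)/dz
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b


def accuracy(data, w, b):
    """Fraction of examples whose predicted class matches the label."""
    hits = sum(1 for x, y in data if (score(w, b, x) > 0) == (y == 1))
    return hits / len(data)


if __name__ == "__main__":
    train = make_synthetic_embeddings(200, 8, seed=0)
    held_out = make_synthetic_embeddings(200, 8, seed=1)
    w, b = train_linear_probe(train)
    print(f"held-out probe accuracy: {accuracy(held_out, w, b):.2f}")
```

High held-out accuracy from a probe like this is usually read as evidence that the property is linearly decodable from the representation. The clusters here are separable by construction, so the number this script prints says nothing about BERT itself.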