Current research in information retrieval increasingly targets robustness and adaptability under real-world conditions, such as noisy user queries and evolving corpora. Recent work emphasizes modeling query uncertainty and temporal drift, with frameworks like QUARK improving retrieval performance by aggregating multiple interpretations of user intent. This is complemented by investigations into how temporal changes in data affect benchmark reliability, which suggest that retrieval models can remain effective even as the underlying corpora evolve. Denoising diffusion models are also being explored as a route to more robust ranking systems, moving beyond the discriminative methods that learning-to-rank has traditionally relied on. The field is addressing neuro-symbolic reasoning as well, proposing new retrieval languages and algorithms that enable efficient evaluation of intricate logical queries. Together, these lines of work address commercially relevant problems, such as improving user satisfaction and maintaining relevance in dynamic information environments.
Document retrieval identifies relevant documents but does not provide fine-grained evidence cues, such as specific relevant spans. A possible solution is to apply an LLM after retrieval; however, this...
User queries in real-world retrieval are often non-faithful (noisy, incomplete, or distorted), causing retrievers to fail when key semantics are missing. We formalize this as retrieval under recall no...
Information retrieval (IR) benchmarks typically follow the Cranfield paradigm, relying on static and predefined corpora. However, temporal changes in technical corpora, such as API deprecations and co...
We present ReFormeR, a pattern-guided approach for query reformulation. Instead of prompting a language model to generate reformulations of a query directly, ReFormeR first elicits short reformulation...
While Late Interaction models exhibit strong retrieval performance, many of their underlying dynamics remain understudied, potentially hiding performance bottlenecks. In this work, we focus on two top...
In information retrieval (IR), learning-to-rank (LTR) methods have traditionally limited themselves to discriminative machine learning approaches that model the probability of the document being relev...
Modern information retrieval is transitioning from simple document filtering to complex, neuro-symbolic reasoning workflows. However, current retrieval architectures face a fundamental efficiency dile...
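To make the idea of aggregating multiple interpretations of a query more concrete, the sketch below fuses ranked lists retrieved for several query variants. It is not QUARK's method, which the excerpts above do not specify. It uses reciprocal rank fusion, a standard rank-aggregation technique, as a stand-in. The function name, the document IDs, and the smoothing constant `k=60` are all illustrative assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document receives 1 / (k + rank) from every list it appears in
    (rank starts at 1). `k` is a smoothing constant; 60 is a commonly
    used default, assumed here rather than taken from the source.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending fused score; break ties by document ID.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical rankings, one per interpretation of the same noisy query.
rankings_per_interpretation = [
    ["d1", "d2", "d3"],
    ["d2", "d1", "d4"],
    ["d2", "d3", "d1"],
]

fused = reciprocal_rank_fusion(rankings_per_interpretation)
print(fused)
```

Documents that rank well under several interpretations rise to the top, so a single misleading interpretation has limited influence on the fused list.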