Recent research in human-computer interaction increasingly focuses on improving user experience through predictive modeling and adaptive interfaces. One notable trend is the development of systems that anticipate user actions by analyzing multimodal interaction data, which could streamline workflows in applications ranging from mobile devices to enterprise software. New approaches to natural language querying emphasize pragmatic repair to clarify user intent and improve interaction efficiency. Complementary work integrates theory of mind capabilities into AI so that systems can better infer users' mental states and adapt accordingly. Advances in visual attention modeling and augmented reading systems offer more resource-efficient design strategies that support real-time personalization and optimization. As generative AI becomes more prevalent, understanding how user trust develops in these interactions is critical, particularly where they intersect with emotional support roles. Together, these developments signal a shift toward more intuitive, context-aware, and user-centered technology.
Truly proactive AI systems must anticipate what we will do next. This foresight demands far richer information than the sparse signals we type into our prompts -- it demands reasoning over the entire ...
In this paper, we introduce a new task, Reactive Listener Motion Generation from Speaker Utterance, which aims to generate naturalistic listener body motions that appropriately respond to a speaker's ...
User performance is crucial in interactive systems, capturing how effectively users engage with task execution. Prospectively predicting performance enables the timely identification of users struggli...
Natural language database interfaces broaden data access, yet they remain brittle under input ambiguity. Standard approaches often collapse uncertainty into a single query, offering little support for...
Existing neural network calibration methods often treat calibration as a static, post-hoc optimization task. However, this neglects the dynamic and temporal nature of real-world inference. Moreover, e...
Inferring human engagement from gameplay video is important for game design and player-experience research, yet it remains unclear whether vision-language models (VLMs) can infer such latent psycholo...
Understanding how people allocate visual attention is central to Human-Computer Interaction (HCI), yet existing computational models of attention are often either descriptive, task-specific, or diffic...
Augmented reading systems aim to adapt text presentation to improve comprehension and task performance, yet existing approaches rely heavily on heuristics, opaque data-driven models, or repeated human...
Theory of Mind (ToM) -- the ability to infer what others are thinking (e.g., intentions) from observable cues -- is traditionally considered fundamental to human social interactions. This has sparked ...
Locating a target based on auditory and visual cues -- such as finding a car in a crowded parking lot or identifying a speaker in a virtual meeting -- requires balancing effo...