GazeMoE: Perception of Gaze Target with Mixture-of-Experts. GazeMoE is an end-to-end framework that selectively leverages gaze-target-related cues from a frozen foundation model through MoE modules, achieving state-of-the-art performance in gaze estimation. Commercial viability score: 8/10 in Gaze Estimation.
6mo ROI: 2-4x
3yr ROI: 10-20x
Lightweight AI tools can reach profitability quickly: at a $500/mo average contract, 20 customers yields $10K MRR by month 6, and 200+ customers by year 3.
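The revenue projection above is simple arithmetic; a minimal sketch, assuming the page's illustrative figures ($500/mo average contract, 20 customers at 6 months, 200 at 3 years):

```python
# Illustrative revenue math from the ROI figures above (assumed values, not data).
avg_contract = 500              # assumed average contract, $/month per customer
customers_6mo = 20              # assumed customer count at month 6
customers_3yr = 200             # assumed customer count at year 3 (lower bound)

mrr_6mo = customers_6mo * avg_contract   # monthly recurring revenue at 6 months
mrr_3yr = customers_3yr * avg_contract   # monthly recurring revenue at 3 years

print(f"6mo MRR: ${mrr_6mo:,}")   # 6mo MRR: $10,000
print(f"3yr MRR: ${mrr_3yr:,}")   # 3yr MRR: $100,000
```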
Zhongxi Lu
University of Leicester
Vincent G. Zakka
Aston University
Luis J. Manso
Aston University
High Potential: 2/4 signals
Quick Build: 4/4 signals
Series A Potential: 4/4 signals
Sources used for this analysis:
arXiv Paper: full-text PDF analysis of the research paper.
GitHub Repository: code availability, stars, and contributor activity.
Citation Network: Semantic Scholar citations and co-citation patterns.
Community Predictions: crowd-sourced unicorn probability assessments.
Analysis model: GPT-4o · Last scored: 4/2/2026
This research addresses the need for accurate gaze target estimation in real-world scenarios, enabling improved human-computer interaction and understanding of human cognition through non-invasive means.
The solution can be packaged as a SaaS tool for companies needing to integrate gaze tracking in their robotics, augmented reality, or customer analytics platforms.
It could replace less accurate gaze tracking solutions that do not leverage multi-modal cues or advanced Mixture-of-Experts architectures, offering higher performance and versatility across various deployment scenarios.
The market is substantial, involving sectors like robotics, automotive (for driver monitoring), retail (consumer analytics), and healthcare (autism research), where accurate gaze tracking is crucial. Companies in these sectors would likely pay for such a technology to enhance their products and services.
Deploy GazeMoE in retail environments to analyze customer interest on shelves or products in real-time, aiding in consumer behavior analytics and shelf management.
The paper proposes GazeMoE, a model that uses Mixture-of-Experts layers to dynamically route and analyze visual cues such as eye landmarks, head pose, gestures, and scene context, estimating gaze targets from images with DINOv2 as a frozen foundation model for feature extraction.
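The core routing idea can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a generic top-level design in which a gating network softmax-weights a set of expert MLPs over feature vectors (which in GazeMoE would come from the frozen DINOv2 backbone); class and parameter names here are hypothetical.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax along the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

class MoELayer:
    """Sketch of a Mixture-of-Experts layer: a learned gating network
    produces per-input weights over expert MLPs, and the layer output is
    the gate-weighted sum of expert outputs."""

    def __init__(self, dim, n_experts, hidden, seed=0):
        rng = np.random.default_rng(seed)
        # Gating network: projects features to one logit per expert.
        self.gate_w = rng.standard_normal((dim, n_experts)) * 0.02
        # Each expert is a small two-layer MLP (weights only, for brevity).
        self.experts = [
            (rng.standard_normal((dim, hidden)) * 0.02,
             rng.standard_normal((hidden, dim)) * 0.02)
            for _ in range(n_experts)
        ]

    def forward(self, x):
        # x: (batch, dim) feature vectors, e.g. from a frozen backbone
        # such as DINOv2 (here replaced by random features).
        gates = softmax(x @ self.gate_w)            # (batch, n_experts)
        out = np.zeros_like(x)
        for i, (w1, w2) in enumerate(self.experts):
            h = np.maximum(x @ w1, 0.0)             # expert MLP with ReLU
            out += gates[:, i:i + 1] * (h @ w2)     # gate-weighted expert output
        return out, gates

# Usage: stand-in features for a batch of 4 images, 16-dim each.
feats = np.random.default_rng(1).standard_normal((4, 16))
layer = MoELayer(dim=16, n_experts=4, hidden=32)
fused, gates = layer.forward(feats)
```

In the full model, separate experts could specialize in different cues (eye region, head pose, scene context), with the gate deciding per image which cues to emphasize; real implementations typically add sparse top-k routing and a load-balancing loss, omitted here.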
The model was tested on several benchmark datasets, showing superior performance in terms of prediction accuracy and robustness in diverse and out-of-distribution visual environments compared to existing methods.
The model requires task-specific fine-tuning and may be less effective on low-quality input data. In addition, reliance on a large pre-trained backbone such as DINOv2 ties the system to that model's continued availability and update cycle.