A view selection agent is a component of active perception systems, used in particular for Embodied Question Answering (EQA), that chooses which viewpoints in a 3D environment to observe. It filters out redundant frames and identifies anchor views aligned with the question, so the system gathers relevant context and overcomes the limitations of vision-language models that reason from fixed views.
A view selection agent helps AI models explore 3D environments more effectively by choosing the camera angles most likely to reveal the information needed. This lets the model gather the context required to answer a question or perform a task, especially when that information is hidden or spread across the scene.
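The two steps named in the definition, filtering redundant frames and picking question-aligned anchor views, can be sketched roughly as follows. This is an illustrative sketch only, not the method of any particular system: it assumes each frame and the question have already been mapped to embedding vectors in a shared space (for example by a vision-language encoder), and the function names, the `max_sim` threshold, and the toy vectors are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def filter_redundant(frames, max_sim=0.95):
    """Return indices of frames that are not near-duplicates of an earlier kept frame."""
    kept = []
    for i, emb in enumerate(frames):
        if all(cosine(emb, frames[j]) < max_sim for j in kept):
            kept.append(i)
    return kept


def select_anchor_views(frames, question, k=2, max_sim=0.95):
    """Drop redundant frames, then return the k frame indices most similar to the question."""
    candidates = filter_redundant(frames, max_sim)
    ranked = sorted(candidates, key=lambda i: cosine(frames[i], question), reverse=True)
    return ranked[:k]


# Toy embeddings (assumed, hand-made): frame 1 is a near-duplicate of frame 0.
frames = [
    [1.0, 0.0, 0.0],
    [0.99, 0.01, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.6, 0.8],
]
question = [0.0, 0.2, 1.0]

print(select_anchor_views(frames, question, k=2))
```

In this toy run the near-duplicate frame is dropped first, and the remaining frames are ranked by similarity to the question embedding. A real agent would likely also decide where to move the camera next, which this sketch does not attempt.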
viewpoint selection agent, active view selector, context view selector