Think Twice, Click Once: Enhancing GUI Grounding via Fast and Slow Systems
Authors: Fei Tang, Yongliang Shen, Hang Zhang, Siqi Chen, Guiyang Hou, Wenqi Zhang, Wenqiao Zhang, Kaitao Song, Weiming Lu, Yueting Zhuang
Research area: Graphical user interface (GUI) automation combined with vision-language models (VLMs)
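FOCUS is a GUI grounding model that combines fast prediction with in-depth analysis. It aims to improve the performance of GUI automation systems by locating and interpreting interface elements more accurately from natural-language instructions.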