OmniParser is a screen parsing tool that converts user interface screenshots into structured elements, enhancing the ability of vision-based GUI agents to generate accurate actions grounded in interface regions.
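As a rough illustration of what "structured elements" and "actions grounded in interface regions" could mean in practice, the sketch below models a parsed screen as a list of elements with bounding boxes and resolves a textual target to a click coordinate. The `UIElement` class, its fields, and the `ground_click` helper are hypothetical names chosen for this example. They are not OmniParser's actual output schema or API, and the element data is written by hand rather than produced by the tool.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class UIElement:
    """Hypothetical structured element; not OmniParser's real output format."""
    element_id: int
    kind: str                         # assumed categories, e.g. "icon" or "text"
    bbox: Tuple[int, int, int, int]   # (left, top, right, bottom) in pixels
    description: str
    interactable: bool = True

    def center(self) -> Tuple[int, int]:
        """Return the integer midpoint of the bounding box."""
        left, top, right, bottom = self.bbox
        return ((left + right) // 2, (top + bottom) // 2)


def ground_click(elements: List[UIElement], query: str) -> Optional[Tuple[int, int]]:
    """Return a click point for the first interactable element whose
    description contains the query (case-insensitive), or None if no match."""
    needle = query.lower()
    for element in elements:
        if element.interactable and needle in element.description.lower():
            return element.center()
    return None


# Hand-written example data standing in for a parsed screenshot.
elements = [
    UIElement(0, "text", (20, 10, 220, 40), "Settings", interactable=False),
    UIElement(1, "icon", (300, 100, 340, 140), "Save button"),
]

if __name__ == "__main__":
    print(ground_click(elements, "save"))
```

The point of the sketch is that an agent choosing among labeled regions only has to pick an element, and the coordinate follows from that element's bounding box, rather than predicting raw pixel positions from the image directly.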