AI Agents Computer Interface
- Authors
- Name
- AbnAsia.org
- @steven_n_t
A new class of AI Agents is emerging that can understand and navigate a graphical computer interface the way a human would.
Recent advances in Foundation Models, especially Large Language Models (LLMs) and Multimodal Language Models (MLMs), have enabled AI Agents to complete complex tasks.
Some of these AI Agents have vision capabilities and use MLMs to interpret Graphical User Interfaces (GUIs). They emulate how a human would interact with a GUI, performing actions such as clicking and typing to fulfil user requests.
This study reviews and maps the progress in AI Agent Computer Interfaces (ACI), focusing on innovations in data, frameworks, and applications.
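The click-and-type interaction described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `FakeScreen` is a toy stand-in for a real GUI, `stub_policy` is a hypothetical placeholder for the multimodal model's decision, and the `Click`/`Type` action types are not taken from the study or from any specific framework.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Click:
    x: int
    y: int


@dataclass
class Type:
    text: str


Action = Union[Click, Type]


@dataclass
class FakeScreen:
    """Toy stand-in for a real GUI: a single text box at a fixed position."""
    box_x: int = 100
    box_y: int = 50
    focused: bool = False
    content: str = ""

    def apply(self, action: Action) -> None:
        if isinstance(action, Click):
            # Clicking exactly on the box gives it keyboard focus.
            self.focused = (action.x, action.y) == (self.box_x, self.box_y)
        elif isinstance(action, Type) and self.focused:
            # Typed text only lands in the box when it has focus.
            self.content += action.text


def stub_policy(screen: FakeScreen, goal: str) -> Optional[Action]:
    """Hypothetical placeholder for the model: pick the next GUI action."""
    if screen.content == goal:
        return None  # goal reached, nothing left to do
    if not screen.focused:
        return Click(screen.box_x, screen.box_y)
    return Type(goal)


def run_agent(screen: FakeScreen, goal: str, max_steps: int = 5) -> list:
    """Observe, decide, act: repeat until the policy stops or steps run out."""
    history = []
    for _ in range(max_steps):
        action = stub_policy(screen, goal)
        if action is None:
            break
        screen.apply(action)
        history.append(action)
    return history


screen = FakeScreen()
history = run_agent(screen, "hello")
print(history)
print(screen.content)
```

In a real agent, the policy would receive a screenshot and the user request, and the actions would be dispatched to the operating system or browser rather than to a toy object. The observe, decide, act loop is the part this sketch is meant to show.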
Author
AiUTOMATING PEOPLE, ABN ASIA was founded by people with deep roots in academia, with work experience in the US, Holland, Hungary, Japan, South Korea, Singapore, and Vietnam. ABN Asia is where academia and technology meet opportunity. With our cutting-edge solutions and competent software development services, we're helping businesses level up and take on the global scene. Our commitment: Faster. Better. More reliable. In most cases: Cheaper as well.
Feel free to reach out to us whenever you require IT services, digital consulting, off-the-shelf software solutions, or if you'd like to send us requests for proposals (RFPs). You can contact us at [email protected]. We're ready to assist you with all your technology needs.
© ABN ASIA