179 results for “topic:computer-use”
The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra
Open-source infrastructure for Computer-Use Agents. Sandboxes, SDKs, and benchmarks to train and evaluate AI agents that can control full desktops (macOS, Linux, Windows).
UI automation for all platforms, driven by a vision-based model
RobotGo: Go-native cross-platform RPA, GUI automation, automated testing, and computer use @vcaesar
Bytebot is a self-hosted AI desktop agent that automates computer tasks through natural language commands, operating within a containerized Linux desktop environment.
Agent S: an open agentic framework that uses computers like a human
Agent Framework For Fintech and Banks
Fara-7B: An Efficient Agentic Model for Computer Use
Ui.Vision Open-Source RPA Software with Computer Vision, OCR, Anthropic Computer Use/LLM. Selenium IDE import/export.
AI computer use powered by open source LLMs and E2B Desktop Sandbox
[CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.
A curated list of resources about AI agents for Computer Use, including research papers, projects, frameworks, and tools.
Open Source Generative Process Automation (i.e. Generative RPA). AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs) / Multimodal (LMMs)] / Visual Language (VLMs)) Models
This is the official website for TuriX Computer-use-Agent
An open-source, end-to-end VLM-based GUI Agent
A fork of Anthropic Computer Use that you can run on Mac computers to give Claude and other AI models autonomous access to your computer.
Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.
CUGA is an open-source generalist agent for the enterprise, supporting complex task execution on web and APIs, OpenAPI/MCP integrations, composable architecture, reasoning modes, and policy-aware features.
A framework to enable autonomous Android and computer use using any LLM (local or remote)
Desktop app to control your computer with AI using your terminal, browser, mouse & keyboard
Browser Operator - The AI browser with a built-in multi-agent platform! Open-source alternative to ChatGPT Atlas, Perplexity Comet, Dia, and Microsoft Copilot Edge Browser
The only general AI agent that does NOT require an extra API key, giving you full control of your local and remote macOS machines from the Claude Desktop App
This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use" (ACL 2025 Oral).
The Open Framework for autonomous virtual computer agents at scale, fully open-source, safe, auditable, and production-ready.
Open source virtual desktops for AI agents
AI-powered login automation. Uses Claude to classify login pages and Playwright to interact with them.
Computer-Use SDK for E2E QA Testing
A fully featured, GUI-powered local LLM Agent sandbox with complete MCP support. Features both a CLI and a full desktop environment, enabling AI agents to operate browsers, terminals, and other desktop applications just like humans. Based on E2B OSS code.
Spongecake is the easiest way to launch computer use agents.
A general AI agent framework that can be adapted to various tasks and environments.