Computer-use agents take actions in a GUI rather than calling structured APIs. They see a screenshot, decide what to click or type, observe the result, and loop. Anthropic launched this capability in public beta with the upgraded Claude 3.5 Sonnet (October 2024), and it is now available from several providers and open-source projects.
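The screenshot, decide, act, observe loop can be sketched in a few lines of Python. This is a minimal illustration, not any provider's API: `Action`, `run_agent`, and the three callables (`take_screenshot`, `decide`, `execute`) are hypothetical names, and the model call is assumed to sit behind `decide`.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One GUI action. The field names are illustrative assumptions."""
    kind: str          # "click", "type", or "done"
    x: int = 0         # screen coordinates, used by "click"
    y: int = 0
    text: str = ""     # text to enter, used by "type"


def run_agent(
    take_screenshot: Callable[[], bytes],
    decide: Callable[[bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 20,
) -> List[Action]:
    """Run the observe/decide/act loop until the model says "done"
    or the step budget is exhausted. Returns the actions that were executed."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = take_screenshot()          # observe the current screen
        action = decide(screenshot, history)    # hypothetical model call
        if action.kind == "done":
            break
        execute(action)                         # click or type in the GUI
        history.append(action)
    return history
```

The `max_steps` cap is a design choice worth keeping in any real version: without it, an agent that never emits "done" would loop indefinitely.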
The use cases are real but narrow right now: automating tasks in legacy software that has no API, testing UIs at scale, assisted data entry, and research tasks that require browsing. Success rates on complex multi-step web tasks are still around 50-70% on benchmarks, which is useful for supervised assistants but not yet reliable enough for unattended automation.
The security surface is massive: a computer-use agent that can click and type has access to everything the user has. Isolate it in a sandboxed VM, log every action, and require human approval before anything irreversible (sending emails, submitting forms, making purchases).
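The logging and approval steps might look like the sketch below. It assumes each action arrives as a dict with a semantic `kind` label (in practice, deciding whether a raw click is irreversible is the hard part), and `guarded_execute`, `IRREVERSIBLE`, `approve`, and `log_line` are hypothetical names. Sandboxing happens at the VM level and is not shown here.

```python
import json
import time
from typing import Callable

# Assumed labels for actions that cannot be undone; adjust to your workflow.
IRREVERSIBLE = {"send_email", "submit_form", "purchase"}


def guarded_execute(
    action: dict,
    execute: Callable[[dict], None],
    approve: Callable[[dict], bool],
    log_line: Callable[[str], None] = print,
) -> bool:
    """Log the action, ask a human before anything irreversible,
    and execute only if approved. Returns True if the action ran."""
    needs_approval = action["kind"] in IRREVERSIBLE
    approved = approve(action) if needs_approval else True

    # Write the log record before executing, so a crash mid-action
    # still leaves a trace of what was attempted.
    log_line(json.dumps({
        "ts": time.time(),
        "action": action,
        "needs_approval": needs_approval,
        "approved": approved,
    }))

    if approved:
        execute(action)
    return approved
```

In an interactive setting, `approve` could be as simple as a function that prompts the operator and returns True on "y"; `log_line` could append to a file or forward to an audit service.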