Anthropic launches an AI that can take control of your computer and do tasks for you

The wave of generative artificial intelligence has given rise to many start-ups offering related products and services. Anthropic, founded in 2021 by former OpenAI employees dissatisfied with that company's direction, is one of them. It is the developer of a chatbot called Claude, in the style of ChatGPT or Gemini, and it has just announced an important new capability for this technology: taking control of the computer's mouse and carrying out, on its own, the tasks the user specifies.

Computer Use is a capability that arrives as a public beta for the language models Claude 3.5 Sonnet, which has been updated and is now available, and Claude 3.5 Haiku, which is new to the company’s catalog and will arrive before the end of the month. ‘Available today in the API, developers can tell Claude to use computers like people do: looking at a screen, moving a cursor, clicking buttons and typing text’, Anthropic explains on its website.

In the company’s demo, one of its researchers uses this capability to have Claude search the computer and the web for the information needed to fill out a form, complete all the fields and submit it. In a side column, Claude explains the steps it takes and its interactions with the computer. In other examples that have circulated, Claude fills out job applications, programs a website or orders food delivery.

‘When a developer entrusts Claude with the use of computer software and gives it the necessary access, Claude looks at screenshots of what is visible to the user, then counts how many pixels vertically or horizontally it needs to move the cursor to click in the right place. Training Claude to accurately count pixels was essential. Without this ability, the model finds it difficult to give commands to the mouse,’ notes Anthropic.
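The pixel counting Anthropic describes comes down to simple arithmetic on screen coordinates. The sketch below is only an illustration of that idea, not Anthropic's implementation: the `Point` and `Box` types, the function names and the button position are all hypothetical, and it assumes the target element has already been located in the screenshot.

```python
from typing import NamedTuple


class Point(NamedTuple):
    x: int
    y: int


class Box(NamedTuple):
    left: int
    top: int
    width: int
    height: int


def center_of(box: Box) -> Point:
    """Return the pixel at the center of a rectangular screen element."""
    return Point(box.left + box.width // 2, box.top + box.height // 2)


def cursor_offset(cursor: Point, target: Point) -> tuple[int, int]:
    """Horizontal and vertical pixel distance from the cursor to the target."""
    return target.x - cursor.x, target.y - cursor.y


# Hypothetical values: a "Submit" button found in a screenshot,
# with the cursor currently near the top-left corner.
submit_button = Box(left=400, top=600, width=120, height=40)
cursor = Point(100, 100)

dx, dy = cursor_offset(cursor, center_of(submit_button))
print(dx, dy)  # 360 520
```

A positive `dx` means moving right and a positive `dy` means moving down, following the usual screen convention of an origin at the top-left corner.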

Instructions are given to the AI in writing, laying out the steps to follow to complete the task, and a single task can involve dozens or even hundreds of steps.

Computer Use has a number of limitations. Anthropic describes this capability as experimental and ‘sometimes cumbersome and error-prone’. Because it works from screenshots, it can miss short-lived notifications or other changes. It still cannot perform an action as common as drag and drop.
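Why screenshots can miss a short-lived notification is easy to see with a small simulation. This is a minimal sketch under invented assumptions: the two-second screenshot interval and the notification timings are hypothetical, and the article does not say how often Claude actually captures the screen.

```python
def screenshot_times(interval: float, duration: float) -> list[float]:
    """Moments (in seconds) at which a screenshot is taken."""
    count = int(duration / interval) + 1
    return [i * interval for i in range(count)]


def is_captured(appears_at: float, visible_for: float, shots: list[float]) -> bool:
    """True if at least one screenshot falls while the notification is visible."""
    return any(appears_at <= t < appears_at + visible_for for t in shots)


# Hypothetical cadence: one screenshot every 2 seconds for 10 seconds.
shots = screenshot_times(interval=2.0, duration=10.0)

# A notification visible from 2.5 s to 3.5 s falls between two screenshots.
print(is_captured(appears_at=2.5, visible_for=1.0, shots=shots))  # False

# The same notification shown from 3.5 s to 4.5 s overlaps the 4.0 s screenshot.
print(is_captured(appears_at=3.5, visible_for=1.0, shots=shots))  # True
```

Anything that appears and disappears entirely between two captures is invisible to a system that only sees still images.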

Anthropic itself gives an example of Claude going wrong with Computer Use: the AI abandoned a coding task before finishing it and began ‘examining photos of Yellowstone National Park’.

Despite these errors and however immature the technology may be, it has the potential to automate many tasks, both in the office and for individual users.