Google unveils Project Mariner: AI agents to use the web for you

by admin

On Wednesday, Google unveiled its first AI agent that can take actions on the web: a research prototype from the company’s DeepMind division called Project Mariner. The Gemini-powered agent takes control of your Chrome browser, moves the cursor on your screen, clicks buttons, and fills out forms, allowing it to use and navigate websites much like a human would.

Google says it is starting by releasing the agent to a small group of pre-selected testers on Wednesday.

Google is continuing to experiment with new ways for Gemini to read, summarize, and, now, use websites. A Google executive tells TechCrunch this is part of a “fundamentally new UX paradigm shift”: moving users away from directly interacting with websites, and instead, interacting with a generative AI system that does it for you.

A first look at Project Mariner (Image Credit: Google)

These shifts could affect millions of businesses – from publishers like TechCrunch, to retailers like Walmart – which have historically relied on Google to send real people to visit and use their websites.

In a demo with TechCrunch, Google Labs Director Jaclyn Konzelmann showed how Project Mariner works.

After setting up the AI agent with an extension in Chrome, a chat window pops up to the right of your browser. You can instruct the agent to do things like “create a shopping cart from a grocery store based on this list.”

Here’s what Project Mariner looks like when in use (Image Credit: Google)

From there, the AI agent navigated to a grocery store’s website – in this case, Safeway – and then searched for items and added them to a virtual shopping cart. One thing that’s immediately evident is how slow the agent is: there were about 5 seconds of delay between each cursor movement. At times, the agent stopped its task and returned to the chat window, asking for clarification about certain items (how many carrots, etc.).

Google’s agent cannot check out, as it’s not supposed to fill out credit card numbers or billing information. Project Mariner also won’t accept cookies for users, or sign a terms of service agreement. Google says it purposefully doesn’t allow the agent to do these things, in order to give users more control.

Behind the scenes, Google’s agent is taking screenshots of your browser window, something users must agree to in the terms of service, and sending them to Gemini in the cloud for processing. Gemini then sends instructions back to your computer to navigate the webpage.
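The screenshot-and-instruction cycle described above can be pictured as a simple loop. The Python below is only an illustrative sketch, not Google's implementation: the `Action` type, the `run_agent_loop` function, and the three callables passed into it are hypothetical names invented for this example, and the real system's interfaces have not been published.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """Hypothetical instruction sent back by the model."""
    kind: str          # e.g. "click", "type", "ask_user", "done"
    target: str = ""   # what on the page the action applies to
    text: str = ""     # text to type, or a question for the user


def run_agent_loop(
    capture_screenshot: Callable[[], bytes],
    ask_model: Callable[[bytes, str], Action],
    perform: Callable[[Action], None],
    task: str,
    max_steps: int = 20,
) -> List[Action]:
    """Repeat: screenshot the active tab, ask the model, act locally."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = capture_screenshot()      # taken on the user's machine
        action = ask_model(screenshot, task)   # processed remotely by the model
        history.append(action)
        if action.kind in ("done", "ask_user"):
            break                              # finished, or needs clarification
        perform(action)                        # move the cursor, click, type
    return history


if __name__ == "__main__":
    # Scripted stand-in for the model so the sketch runs without any network.
    scripted = iter([
        Action("click", "search box"),
        Action("type", "search box", "carrots"),
        Action("ask_user", text="How many carrots?"),
    ])
    performed: List[Action] = []
    history = run_agent_loop(
        capture_screenshot=lambda: b"fake-screenshot",
        ask_model=lambda shot, task: next(scripted),
        perform=performed.append,
        task="create a shopping cart from this list",
    )
    print([a.kind for a in history])
```

In this sketch the loop pauses on an `ask_user` action, mirroring how the agent in the demo returned to the chat window to ask for clarification.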

Project Mariner can also be used to find flights and hotels, shop for household items, find recipes, and other tasks that currently require users to click through the web.

One major caveat is that Project Mariner only works on a Chrome browser’s foremost active tab, which means you can’t use your computer for other things while the agent works in the background – you need to watch Gemini slowly click around. Google DeepMind’s Chief Technology Officer, Koray Kavukcuoglu, says this was a very intentional decision so that users know what Google’s AI agent is doing.

“Because [Gemini] is now taking actions on a user’s behalf, it’s important to take this step-by-step,” said Kavukcuoglu in an interview with TechCrunch. “It’s complementary. You, as an individual, can use websites, and now your agent can do everything that you do on a website as well.”

Website owners may be relieved to hear that Google’s AI agent works on your computer screen, because that means publishers and retailers still get your eyeballs on their pages. However, Google’s AI agent could mean that users are less engaged with the websites they visit, and one day, it may not require users to use these websites at all.

“[Project Mariner] is a fundamentally new UX paradigm shift that we’re seeing right now,” Konzelmann told TechCrunch. “We need to figure out what is the right way for all of this to change the way users interact with the web, and the way publishers can create experiences for users, as well as for agents, in the future.”

Besides Project Mariner, Google also unveiled several other AI agents for more specific tasks on Wednesday.

One AI agent, Deep Research, aims to help users explore complex topics by creating multi-step research plans. It seems to compete with OpenAI’s o1, which can also do multi-step reasoning. However, a Google spokesperson notes the agent is not designed to solve math and logical reasoning problems, write code, or do data analysis. The AI agent is rolling out in Gemini Advanced today, and will come to the Gemini app in 2025.

When prompted with a difficult or large question, Deep Research will create a multi-step action plan to answer it. After the user approves the plan, Deep Research takes a few minutes to search the web and answer the question, and then generates a lengthy report on its findings.
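The plan, approve, then report flow can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not how Deep Research is actually built: `deep_research`, `make_plan`, `approve`, and `run_step` are hypothetical names, and the planning and web-search steps are passed in as plain functions.

```python
from typing import Callable, List, Optional


def deep_research(
    question: str,
    make_plan: Callable[[str], List[str]],
    approve: Callable[[List[str]], bool],
    run_step: Callable[[str], str],
) -> Optional[str]:
    """Draft a plan, wait for user approval, then run each step and report."""
    plan = make_plan(question)                   # multi-step research plan
    if not approve(plan):                        # nothing runs without approval
        return None
    findings = [run_step(step) for step in plan]  # e.g. one web search per step
    return "\n\n".join(findings)                 # combined into a single report
```

The point of the sketch is the ordering: no research step runs until the user has approved the plan.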

Another new AI agent from Google, Jules, aims to help developers with coding tasks. It integrates directly into GitHub workflows, allowing Jules to view your existing work and make changes directly in GitHub. Jules is rolling out to a select group of beta testers today, and will be available later in 2025.

Finally, Google DeepMind says it’s working on an AI agent to help you navigate video games, building on its long history of creating game-playing AI. Google is working with game developers, like Supercell, to test Gemini’s ability to interpret gaming worlds such as “Clash of Clans.” Google didn’t offer any release date for this prototype, but says this work is helping it build AI agents that can navigate physical worlds as well as virtual ones.

It’s unclear when Project Mariner will roll out to Google’s massive user base, but when it does, these agents will have a significant impact on the broader web. The web is designed for humans to use, but Google’s AI agents could change that standard.
