00:00:00AI agents have started to integrate into every part of our lives.
00:00:03And one of the biggest areas where that's happening is the browser.
00:00:06Every major AI company has realized that the browser is the one tool everyone uses every
00:00:11single day.
00:00:12So why not put AI into that?
00:00:14But the truth is they all suck.
00:00:15And it's not a matter of optimization.
00:00:17There's a fundamental problem that no amount of tuning is going to fix.
00:00:20But Google in collaboration with Microsoft just released something called WebMCP.
00:00:24And instead of trying to make agents better at using websites, it makes websites better
00:00:29at talking to agents.
00:00:30That's a completely different approach.
00:00:32And what it enables is something we haven't seen before.
00:00:35So this is a simple HTML page running on a local server.
00:00:38Opening the extensions tab, we have the WebMCP extension.
00:00:41Opening it, below the name of this site, we have one tool, BookTable.
00:00:45We connected this WebMCP bridge to Claude Code and told it that we had a restaurant booking
00:00:49form open with WebMCP tools available.
00:00:52We gave it the task of booking a table for two with a date, a name and a special request.
00:00:57All of those fields are there in the form.
00:00:59It confirmed the date, used the WebMCP tool that the site provided, filled out the fields
00:01:03and successfully made the reservation.
00:01:06Right now, an agent has two ways to figure out what's on screen.
00:01:09The first way is vision-based.
00:01:11The agent takes a screenshot of the entire page, annotates every element it can see and
00:01:15feeds that image to a model that tries to figure out what to click.
00:01:19The second way is DOM parsing.
00:01:21The agent pulls the raw HTML of the page.
00:01:24And if you've ever opened Inspect Element on any website, you know what that looks like.
00:01:28Thousands of lines of code.
00:01:29The agent reads through all of that and tries to identify the right button.
00:01:33Both of these approaches have the same fundamental problem.
00:01:35They're non-deterministic.
00:01:36The agent is making its best guess every single time.
00:01:39The reason none of this works consistently is because the entire internet was built for
00:01:43human eyes.
00:01:45Every website assumes a person is looking at it.
00:01:47There's no structure for machines.
00:01:48So every agent, no matter how good the model is, is stuck trying to interpret something
00:01:53that was never designed to be interpreted by a machine.
00:01:55With WebMCP, instead of the agent trying to figure out your website, your website registers
00:02:00its available actions as tools.
00:02:01When an agent lands on a page, it doesn't guess.
00:02:04It just reads the available tools and calls them directly.
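To make that concrete, here's a hedged sketch of what site-side tool registration could look like. The entry point and field names (registerTool, inputSchema, execute) are assumptions modeled on MCP-style tool definitions, not the confirmed WebMCP surface, which is why the registration hook is passed in as a parameter instead of hardcoding a browser API:

```javascript
// Hypothetical sketch -- the exact WebMCP API names are still in flux, so this
// takes the registration entry point and the site's own submit handler as
// parameters rather than assuming a specific navigator-level API.
function registerBookingTool(modelContext, submitReservation) {
  modelContext.registerTool({
    name: "book_table",
    description: "Book a table at the restaurant for a given date, party size, and name.",
    inputSchema: {
      type: "object",
      properties: {
        date: { type: "string", description: "Reservation date, YYYY-MM-DD" },
        partySize: { type: "integer", description: "Number of guests" },
        name: { type: "string", description: "Name the reservation is under" },
        specialRequest: { type: "string", description: "Optional special request" },
      },
      required: ["date", "partySize", "name"],
    },
    // The agent calls this directly -- no screenshots, no DOM guessing.
    // It reuses the same code path the human-facing form submits to.
    execute: (args) => submitReservation(args),
  });
}
```

The key design point is that the tool wraps the site's existing submission logic, so the agent path and the human path stay in sync.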
00:02:07Right now, WebMCP is available for early preview only.
00:02:10As the agentic web evolves, websites also need to evolve with it.
00:02:13And as you already saw, by defining those tools, we give these agents a much better way to interact
00:02:18with our sites.
00:02:19The demo worked because it was a simple HTML form.
00:02:21But most real websites aren't that simple.
00:02:23So WebMCP actually has two different approaches depending on what you're working with.
00:02:28These are the two APIs a website can use to expose its tools to agents.
00:02:31The declarative API is for simple workflows like the HTML forms you just saw.
00:02:35The imperative API is for full scale web apps with multiple pages and those require some
00:02:40extra implementation that we'll get into further on.
00:02:43As of right now, there's no official documentation, but they have a repository of WebMCP tools
00:02:48in Google Chrome Labs with two demos, only one of which is actually hosted.
00:02:52There's a simple flight search demo and an official Model Context Tool Inspector extension.
00:02:56After you install that, whatever websites have WebMCP implemented, you'll be able to detect
00:03:01those tools via the extension and you'll be able to do some other cool stuff as well.
00:03:05The input schema for the tools shows up right there.
00:03:07Right now, there's only one tool on this page, the search flights tool.
00:03:10They've given two options to use this.
00:03:12You can either give custom input arguments that the AI model has to fill out or you can
00:03:16set your Gemini API key, give a user prompt in simple English and the page will be controlled
00:03:21according to that.
00:03:22So right now it has these default inputs.
00:03:24We swapped them out and it actually searched for flights and got a bunch of results.
00:03:28I went back and this time the WebMCP travel site had four tools available where three of
00:03:32them are now filters that can be applied.
00:03:35The input arguments for the page had also changed.
00:03:37I added another argument and it gave us a notification that the filter settings were updated.
00:03:41No flights matched those filter settings, but all of them were applied.
00:03:44We switched between Zen Browser and Chrome throughout this, and that's because while they've
00:03:48released WebMCP as an open protocol that any browser could adopt, right now it only works
00:03:54in Chrome's Canary build.
00:03:55That's until the standard is finalized, so that every browser can implement it.
00:03:58So that's as far as the official tooling goes right now.
00:04:01No documentation, only two demos, it only works on Chrome Canary, and you can't use it
00:04:05with Claude Code because it's actually intended to be used by browser agents.
00:04:09So we found this custom WebMCP bridge that you can install on your system and it gives
00:04:14you an MCP and an extension as well.
00:04:16This is what allows Claude Code to use WebMCP, navigating and calling the tools that any website
00:04:22offers.
00:04:23To show how sites actually implement this, we'll start with the simpler approach.
00:04:27In the declarative API, which you saw with the HTML form, all you really have to do is
00:04:31declare three things inside the HTML form: the tool name, the tool description, and the tool
00:04:36parameter descriptions.
00:04:37You don't need to dive deep into them.
00:04:39You just need to make sure your agent adds them in.
00:04:41We had two guides made, reverse-engineered from the demos in the WebMCP repo, and we gave Claude
00:04:46Code access to those.
00:04:47Now during that process, we actually ran into some common problems and had to fix these
00:04:51guides along the way.
00:04:53Both of them are available in AI Labs Pro, which is our community where you get
00:04:57ready-to-use templates you can plug directly into your projects, for this video and all previous ones.
00:05:01The main teaching is all here in the video, but if you want the actual files, the link's
00:05:05in the description.
00:05:06If your agent adds in these declarations, the rest is up to the browser, which reads
00:05:10them from the HTML.
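As a rough sketch, a declaratively annotated booking form might look like the fragment below. The attribute names here are illustrative guesses based on the demo in this video; the declarative API is experimental and undocumented, so check the current WebMCP explainer before copying anything:

```html
<!-- Hypothetical attribute names -- verify against the current WebMCP
     explainer. The browser reads these annotations and exposes the form
     as a callable tool; no JavaScript is required on the site's side. -->
<form toolname="book_table"
      tooldescription="Book a table at the restaurant for a given date and name.">
  <input name="date" type="date"
         tooldescription="Reservation date" required>
  <input name="name" type="text"
         tooldescription="Name the reservation is under" required>
  <textarea name="specialRequest"
            tooldescription="Any special requests, e.g. a window seat"></textarea>
  <button type="submit">Book</button>
</form>
```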
00:05:12The second way was the imperative API for cases where you need more complex interactions and
00:05:17JavaScript execution.
00:05:18We had a Next.js app initialized, gave Claude Code the Next.js guide, and that was all it
00:05:23needed to implement it.
00:05:24In React apps, it creates a new file in the library folder where it declares all the tools
00:05:29the site needs.
00:05:30These are all the functions and these are their definitions.
00:05:33But since these web apps can become so big and potentially have more than 100 tools,
00:05:38we get the same problem we see in Claude Code, where too many tools overload the context
00:05:41and break the whole thing.
00:05:43So instead of loading all the tools a website has, it's better to load only the tools a single
00:05:47page has.
00:05:48This concept is called contextual loading.
00:05:50So this is the Next.js app we had Claude Code make.
00:05:53It's a fully functional small demo app with the backend implemented.
00:05:57Right now we're on the main homepage and this site only has 3 tools available.
00:06:01I went into the cart page and this time we had 4 tools and the names had also changed.
00:06:05The availability of tools changes based on the page you're on.
00:06:09This is where the registration functions come in.
00:06:11Whenever you land on a page, like the homepage, it runs the register home tools function and
00:06:15when you leave it runs unregister home tools.
00:06:18Based on which tools belong to that page, it just registers and then unregisters them.
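The register/unregister pattern can be sketched as a plain-JavaScript simulation. This is not the real WebMCP API, just a Map-based registry showing how navigating between pages swaps the visible tool set (in a React app, the unregister function would be the useEffect cleanup; the tool names are invented for illustration):

```javascript
// Simulation of contextual loading: only the current page's tools are
// visible to the agent at any time.
const toolRegistry = new Map();

function registerTools(tools) {
  for (const tool of tools) toolRegistry.set(tool.name, tool);
  // Return an unregister function -- in React, this is the useEffect cleanup.
  return () => {
    for (const tool of tools) toolRegistry.delete(tool.name);
  };
}

const homeTools = [
  { name: "search_products", description: "Search the catalog by keyword." },
  { name: "view_deals", description: "List current promotions." },
];

const cartTools = [
  { name: "update_quantity", description: "Change the quantity of a cart item." },
  { name: "checkout", description: "Complete the purchase." },
];

// Landing on the homepage registers its tools...
const unregisterHome = registerTools(homeTools);
// ...and navigating away unregisters them before the cart page registers its own.
unregisterHome();
const unregisterCart = registerTools(cartTools);
```

This keeps the agent's tool list small and page-specific, which is exactly the context-overload fix described above.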
00:06:23This is why, in this case, it doesn't depend on the browser alone; the site's own code
00:06:27also handles the integration.
00:06:28We're not actually using WebMCP with a browser agent, which is what Google wants and what
00:06:32each browser would implement themselves.
00:06:34We're actually using a bridge that connects Claude Code to WebMCP, and this is how we control
00:06:39websites.
00:06:40If you want to get more out of Claude Code itself, we actually have a video on the 10
00:06:44most up-to-date ways to gain an advantage with it.
00:06:46This bridge is a community project and with the imperative API, it has a problem where
00:06:51tool switching doesn't really work with this MCP server.
00:06:54When I opened the site, we were on the checkout page and initialized the Claude Code session
00:06:58there.
00:06:59When we asked it to navigate back to the homepage, it couldn't see the tools available on the
00:07:03homepage.
00:07:04From the homepage, I went into the product page, which exposed an add-to-cart tool.
00:07:08But while the agent was on the product page, it couldn't actually see that tool.
00:07:11So we had to manually add an item to the cart to demo this.
00:07:14But when we asked it to complete the checkout, it automatically filled in the details, placed
00:07:18the order and completed the whole shopping flow.
00:07:21So that's one limitation of this MCP, which brings us to another point.
00:07:25WebMCP is open source with major browser vendors and tech companies listed as participants.
00:07:30But right now, the only browser that supports it is Chrome Canary and the intended agent
00:07:34is Gemini, Google's own AI built directly into the browser.
00:07:38If you're a website owner and you implement WebMCP today, the only agent that can use
00:07:42your tools natively is Gemini.
00:07:44Claude Code needs a community-built bridge that breaks when contextual loading kicks in.
00:07:49Every non-Google agent is at a disadvantage.
00:07:51Now could Claude catch up?
00:07:52Sure, they have their own browser extension.
00:07:55And since that's also a browser agent, it could potentially discover these tools the same way
00:07:59Gemini does.
00:08:00But the question is how many people are going to deliberately install a Claude browser extension
00:08:04versus just using the Gemini that's already built into Chrome.
00:08:08Chrome has billions of users, they don't need to install anything.
00:08:11In our opinion, Google isn't locking anyone out.
00:08:13They're just taking advantage of the architecture and the audience they already have.
00:08:17An open standard that works best inside the browser they already own with the agent they
00:08:21already ship.
00:08:22That doesn't mean you shouldn't implement it.
00:08:23The standard itself is genuinely useful and making your site agent accessible is smart
00:08:28regardless of which agent benefits first.
00:08:30There are a few things worth knowing if you implement this.
00:08:33The spec recommends no more than 50 tools per page.
00:08:36This isn't meant to expose your entire application.
00:08:38It's meant for focused, specific actions, the things someone would actually want to do on
00:08:42that page.
00:08:43Tool descriptions also matter more than you'd think.
00:08:46Agents read those descriptions to decide which tool to call.
00:08:49Vague descriptions mean the agent picks the wrong tool or skips it entirely.
00:08:53Write them like you're explaining the action to someone who's never seen your site.
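As a hypothetical illustration (both tool names and descriptions are invented for this example), compare a vague description with one an agent can actually act on:

```javascript
// A description an agent can't do much with: what does it filter, and by what?
const vague = {
  name: "apply",
  description: "Applies filters.",
};

// A description that tells the agent what the tool acts on, what each
// parameter means, and what comes back.
const specific = {
  name: "apply_flight_filters",
  description:
    "Filter the currently displayed flight results by maximum price (USD), " +
    "number of stops (0, 1, or 2+), and airline. Leave a field unset to keep " +
    "its current value. Returns the number of matching flights.",
};
```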
00:08:57And this is still experimental.
00:08:58The API surface will change.
00:09:00Chrome 146 ships in March with broader support.
00:09:03But until then, this is a dev trial.
00:09:05Don't ship it to production yet.
00:09:06If you follow this channel, you know that keeping up with AI requires a strong technical foundation.
00:09:11That is why I love Brilliant.
00:09:13It's an interactive platform with hands-on lessons crafted by world-class teachers from
00:09:17MIT, Harvard, and Stanford.
00:09:19I highly recommend their Clustering and Classification and How AI Works courses.
00:09:23They teach you to uncover hidden patterns and understand the logic behind large language
00:09:27models interactively.
00:09:28As you can see in the catalog on screen, they offer a massive variety of courses covering
00:09:33everything from foundational math to advanced data science and computer science.
00:09:37Brilliant is also giving our viewers 20% off an annual premium subscription, providing unlimited
00:09:42daily access to everything on the platform.
00:09:44To learn for free on Brilliant for a full 30 days, go to brilliant.org/ailabs, scan the
00:09:50QR code on screen, or click the link in the description.
00:09:53Build a real learning habit today and take your skills to the next level by heading over
00:09:56to Brilliant.
00:09:57That brings us to the end of this video.
00:09:59If you'd like to support the channel and help us keep making videos like this, you can do
00:10:03so by using the super thanks button below.
00:10:06As always, thank you for watching and I'll see you in the next one.