Docker Just Fixed 90% of AI Coding By Releasing This

AAI LABS
AI / Future Tech · Computers / Software

Transcript

00:00:00 One of the most important things in AI has been the MCP protocol,
00:00:03 but six months later it has become a huge problem for us.
00:00:06 When we started, people only had two to three MCP servers running locally,
00:00:11 but MCP has evolved into so much more.
00:00:13 Now people literally have hundreds of MCP servers with thousands of tools at a time,
00:00:18 and it has become a huge problem.
00:00:20 As you know, Cloudflare noticed this first,
00:00:22 and Anthropic followed along by publishing a post about this problem.
00:00:26 But Docker actually came up with a solution for this
00:00:28 and solved one of the most critical problems for MCPs
00:00:32 by coming up with an entirely new way to use them:
00:00:34 a dynamic mode which allows you to save so many tokens,
00:00:37 speed up your agents, and make entirely new sorts of automations
00:00:41 that I personally am really looking forward to.
00:00:43 So Docker actually released an article on this
00:00:46 in which they basically urge us to stop hardcoding our agents' environment.
00:00:50 Now what do they mean by that?
00:00:51 First of all, which MCP servers do we actually trust?
00:00:54 The second one is, how do we avoid filling our context with tool definitions
00:00:59 that we might not even use?
00:01:00 For example, if you have a thousand tools, you might only use two or three in a single chat.
00:01:05 The third one is, how do agents discover, configure,
00:01:09 and then use these tools efficiently and autonomously?
00:01:12 But I want you to focus on the second one, which is,
00:01:14 how do we avoid filling our context with tool definitions that we might not even use?
00:01:19 Again, if you have a thousand tools, you might only use two or three in a single chat.
00:01:23 Anthropic also released a post about this, which we covered in one of our previous videos,
00:01:28 and we got a really positive response from people who wanted the implementation.
00:01:32 And Docker actually went ahead and implemented this.
00:01:34 Now before we move further, you need to know that Docker actually set up
00:01:37 the whole infrastructure for this way before it even became a problem.
00:01:41 And for that, you need to know about their MCP catalog,
00:01:44 in which they've listed verified MCP servers that you can actually trust.
00:01:48 And it's really easy to connect to them; you just connect them here in Docker.
00:01:52 For example, I've connected Notion here; you can see that right now I have two servers,
00:01:56 and my MCP client, which most of the time is Claude Code, only connects to Docker,
00:02:01 and then Docker basically manages all my MCP servers.
00:02:04 So this entirely solves the first problem of which MCP servers we actually trust.
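As a rough sketch of this setup (the file location and key names depend on your MCP client, so treat the details as an assumption rather than Docker's documented configuration), pointing a client like Claude Code at Docker usually means registering a single server entry that shells out to the Docker MCP gateway:

```json
{
  "mcpServers": {
    "MCP_DOCKER": {
      "command": "docker",
      "args": ["mcp", "gateway", "run"]
    }
  }
}
```

With only this one entry, the client sees a single MCP server, and Docker fans out behind it to whichever catalog servers you have enabled.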
00:02:09 Now to actually enable our agents to use these MCPs dynamically,
00:02:14 they've implemented this MCP gateway that already has pre-built tools
00:02:18 to use the MCP servers inside the catalog autonomously.
00:02:22 So essentially, what happens is you only connect one MCP,
00:02:26 and this MCP has all the context of which tools it's connected to in the catalog.
00:02:31 I've connected two, and it knows which tool definitions
00:02:34 to actually bring into the context window, so your context window does not get bloated.
00:02:38 Now for this to actually work, they added some new tools, including MCP find,
00:02:43 add, and remove, which find MCP servers in the catalog by name or description
00:02:48 and, as I'll show you, guide the agent through adding them correctly.
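To make the find-and-add flow concrete, here is a hypothetical, self-contained sketch of the discovery loop from the agent's side. The function names, catalog entries, and signatures are my assumptions for illustration, not Docker's actual API:

```javascript
// Stand-in for the gateway's catalog (entries are made up for illustration).
const catalog = [
  { name: "github-official", description: "Search repos, issues, and PRs on GitHub" },
  { name: "notion", description: "Read and write Notion pages and databases" },
];

// Find servers by name or description, as the video describes.
function mcpFind(query) {
  const q = query.toLowerCase();
  return catalog.filter(
    (s) => s.name.toLowerCase().includes(q) || s.description.toLowerCase().includes(q)
  );
}

// Enable a server so that only its tool definitions enter the context.
const enabled = new Set();
function mcpAdd(serverName) {
  enabled.add(serverName);
  return `enabled ${serverName}`;
}

// Agent-side flow: discover by description, then enable just that server.
const matches = mcpFind("github");
console.log(mcpAdd(matches[0].name));
console.log([...enabled]);
```

The important property is the last line: only the servers the agent explicitly added contribute tool definitions to the context window.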
00:02:51 So for example, I'm using my GitHub MCP right here,
00:02:54 and I'm telling it that I want to search for some interesting repos.
00:02:57 After specifying what kind of repos, it doesn't actually call the tool itself,
00:03:01 but rather uses the Docker MCP server, which then calls the tool with the correct information
00:03:06 and obviously returns all the results. Now, I want you to notice one thing:
00:03:10 the LLM is returning everything about the repos. It's returning the link, it's returning the stars,
00:03:16 it's returning the description, and it even knows the date on which these repos were posted.
00:03:21 I just want you to remember this, because it's going to be an important thing moving forward.
00:03:25 Now, moving on to dynamic tool selection. This is the most important part of the article,
00:03:30 and this is what I was talking about when I mentioned a new way of using MCP servers.
00:03:35 Again, referencing the Anthropic article, they talk about where Claude, or any AI agent,
00:03:39 actually uses the most tokens. One is the tool definitions in the context window,
00:03:43 and the second is the intermediate tool results. This is where we talk about the raw results that
00:03:48 are actually returned from MCP tool calls. So all of the detail that we searched using the
00:03:53 GitHub tool was returned into the context window. That's why Claude knows every small detail about
00:03:59 the repos, while I only wanted the description and the link of each repo. In this way, it only
00:04:04 takes a few tool calls, like in my case maybe 20 tool calls, before the whole context window
00:04:09 is actually filled. This is one thing they've improved in the MCP gateway project, where they
00:04:14 only give the tools that are actually useful. So for example, in my case, one way that context
00:04:19 could be saved is to only give me the search repo tool and not the other 40 tools that come inside
00:04:24 this GitHub MCP, because in this session I only want to use the search repo tool. But again, once
00:04:30 you do start selecting tools this way, it also opens up a new range of possibilities. And that
00:04:34 leads us into code mode. Cloudflare basically outlined how we've been using MCP wrong, and that
00:04:40 the current approach is not the way to go. And this is where Docker is actually the first one to implement this new
00:04:44 solution. I've played around with it a lot, and I must say I'm really surprised by how the execution
00:04:49 turned out. So they say that by making it possible for agents to code directly using MCP tools,
00:04:55 meaning they take the tools and implement them in code, they can provide the agents with these code
00:05:00 mode tools that use the tools in a completely new way. So what does code mode do? It creates a
00:05:05 JavaScript-enabled tool that can call other MCP tools. This might seem really simple, but the
00:05:11 examples I'm going to show you will hopefully clear this up. Now, before we dive into an implementation,
00:05:16 there are other things to consider. First of all, since this is code written by an agent,
00:05:21 it's obviously not tested and not secure. So Docker has planned for this to actually run in
00:05:26 a sandbox, and since they already provide Docker containers, this was pretty much a no-brainer for
00:05:31 them. This approach ends up offering three key benefits. First of all, it's completely secure,
00:05:36 because that's the main benefit of sandboxing: it doesn't do any actual damage to your system. Then
00:05:41 there's all the token and tool efficiency that we've been talking about, where the tools it uses do not
00:05:46 have to be sent to the model on every request; the model just needs to know about one new code mode
00:05:52 tool. So without code mode, if you're only using, let's say, these three tools, and it's running them
00:05:57 repeatedly, the definitions of those 47 other tools also go alongside the three tools that we're
00:06:03 actually using. But with code mode, what happens is the agent writes a custom "analyze my repos" tool
00:06:09 using only the tools that we actually need, and every time it just references that one code
00:06:15 mode tool. In this way, it saves all that other context by not sending the tools that we don't
00:06:21 actually need. And then we have state persistence as well, in which volumes manage
00:06:26 how data is saved between these tool calls without being sent to the model. A very
00:06:30 simple example of this can be a data processing pipeline. So let's just say that we want to
00:06:35 download a dataset: the dataset is downloaded and returned, but it's actually saved to the volume,
00:06:41 and the model only gets to know that it was downloaded successfully. The model doesn't get
00:06:46 flooded with five gigabytes of data. Then if we want to process the first 10,000 rows, the tool
00:06:51 can just read from the volume where the data is stored and return the actual summary. In this way,
00:06:56 only the data that should go to the model, such as final results, summaries, any error messages,
00:07:02 or answers to questions, is transferred to the model, and the context window remains clean.
00:07:07 Now, the reason I was searching these GitHub repositories is so that I can discover new
00:07:11 open-source tools to actually put into my videos. And what I normally do is run multiple calls using
00:07:17 the find GitHub repo tool; I just write different keywords to search for tools. So I presented this
00:07:22 to Claude Code, and it combined all those different tool calls into a single tool that searches repos
00:07:28 based on whatever keywords I give it. You can see that even here, without code mode, Docker actually
00:07:32 runs multiple queries, and that's what I wanted to fix. The tool it made was called multi search repos.
00:07:37 And after creating the tool, it used the MCP exec tool to actually run it. It basically gave me 29
00:07:43 unique repos by searching with six different keywords, but the results were just returned
00:07:48 directly in the response and in the terminal, meaning that all of the results were being
00:07:52 returned inside the context window. To fix this, I told it that it should write everything to a file,
00:07:58 and the model should just get the description of the repos, no need to give any stars or anything
00:08:02 else for that matter. And it changed the tool and wrote all of those results to a text file in my
00:08:07 repository, so that if I wanted to look up something specific about one repository, I could do so by
00:08:13 referencing the text file. Now there is one thing that I'd like to see implemented: a way to save and
00:08:18 reuse this tool. Right now, the only option is to manually save it as a file. After that, I asked it
00:08:24 to search for the Notion MCP. Once connected, I asked if it could make a tool that outputs the
00:08:29 GitHub search results directly into Notion instead of using a text file. And again, using code mode,
00:08:35 it actually made the GitHub-to-Notion tool that would allow me to paste the results into Notion.
00:08:40 And after it ran, there were basically some little problems that I had to fix. But essentially,
00:08:45 I now have this database in Notion. It's basically hardcoded, so whatever query I provide,
00:08:50 it'll just go ahead and input the results into this database according to the different fields.
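A hypothetical sketch of that GitHub-to-Notion step; the `notionCreatePage` stub and the property names are assumptions standing in for the Notion MCP's actual page-creation tool:

```javascript
// Stand-in for the Notion MCP call; a real run would create a database row.
const createdPages = [];
async function notionCreatePage(properties) {
  createdPages.push(properties);
  return { ok: true };
}

// Code-mode tool: map each repo onto the database fields and return only
// a short confirmation to the model.
async function githubToNotion(repos) {
  for (const repo of repos) {
    await notionCreatePage({
      Name: repo.full_name,
      Description: repo.description,
      URL: repo.url,
      Date: new Date().toISOString().slice(0, 10), // enables date filtering later
    });
  }
  return `added ${repos.length} repos to Notion`;
}

githubToNotion([
  { full_name: "example/mcp-agent", description: "An agent for MCP", url: "https://example.com" },
]).then(console.log);
```

The full record goes into the database; the model only sees the confirmation string.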
00:08:55 And it'll even include the date, so that I can filter through them easily and only search for
00:09:00 the results that I actually want. The model only gets to know the name and the description of each
00:09:04 GitHub repository it receives at a time. It doesn't get anything else; the rest of the information
00:09:08 is all saved here. Honestly, if you just go through this catalog, you'll get at least a single idea of
00:09:14 the MCPs that you could chain together to make these amazing workflows. And at the same time,
00:09:19 you're saving tokens and preserving the performance of your own AI agent. Getting started with them is
00:09:24 honestly pretty easy. You do need to update your Docker version, and if you still don't see these features,
00:09:29 they might be disabled under beta features, so do make sure that the Docker MCP toolkit is enabled.
00:09:34 Other than that, you'll have your catalog, and these new features are enabled by default. So
00:09:39 all you have to do is connect your client and you can pretty much start with them. That brings us to
00:09:44 the end of this video. If you'd like to support the channel and help us keep making videos like this,
00:09:49 you can do so by using the Super Thanks button below. As always, thank you for watching,
00:09:53 and I'll see you in the next one.

Key Takeaway

Docker's new dynamic mode and Code Mode for the MCP protocol significantly enhance AI agent efficiency, security, and token management by enabling dynamic tool selection, custom tool creation, and state persistence.

Highlights

Docker introduced a "dynamic mode" for the MCP protocol to address scaling issues with AI agents managing numerous tools.

The new dynamic mode significantly reduces token usage and speeds up AI agents by dynamically selecting and providing only necessary tool definitions.

Docker's MCP catalog and gateway manage trusted MCP servers and autonomously provide tools, effectively solving context bloat.

"Code Mode" allows AI agents to write custom JavaScript tools that can call other MCP tools, enhancing security and efficiency.

Key benefits of Code Mode include secure sandboxed execution, improved token efficiency by abstracting multiple tool calls, and state persistence via volumes.

Practical examples demonstrate creating custom tools like "multi search repos" and "GitHub to Notion" to streamline workflows and manage data efficiently.

Timeline

Problem with MCP Protocol Scaling

The video begins by highlighting the evolution of the MCP protocol from handling a few local servers to hundreds with thousands of tools, creating a significant problem for AI agents. This scaling issue leads to inefficient token usage and agent performance degradation, a challenge first identified by Cloudflare and further detailed in a post by Anthropic. Docker has developed a solution, introducing a "dynamic mode" that promises to save tokens, accelerate agents, and enable new automation possibilities. This new approach aims to fix critical problems associated with the widespread use of MCPs in AI coding.

Docker's Solution: Addressing Key Problems

Docker's solution addresses three core problems: identifying trustworthy MCP servers, preventing context windows from being filled with unused tool definitions, and enabling agents to efficiently discover, configure, and use tools autonomously. The speaker emphasizes the second problem, illustrating how an agent might only use 2-3 tools out of a thousand, yet all definitions bloat the context. Anthropic also previously discussed this issue, and Docker has now implemented a practical solution. This section sets the stage for understanding the specific mechanisms Docker employs.

MCP Catalog and Gateway for Trust and Management

Docker's solution leverages its existing infrastructure, specifically the MCP catalog and MCP gateway. The MCP catalog lists verified and trusted MCP servers, making it easy for users to connect them within Docker. The MCP client (e.g., Claude code) connects only to Docker, which then centrally manages all connected MCP servers. This setup entirely solves the initial problem of trusting MCP servers by providing a curated and managed environment, ensuring that agents interact only with reliable tools.

Dynamic Tool Selection and Context Window Optimization

To enable dynamic MCP usage, Docker implemented an MCP gateway with pre-built tools that autonomously utilize servers from the catalog. This means an agent connects to just one MCP, which then intelligently determines and brings only the necessary tool definitions into the context window, preventing bloat. New tools like MCP find, add, and remove allow agents to locate and manage servers dynamically. An example demonstrates searching GitHub repos, where the LLM initially returns excessive detail, highlighting the need for further context optimization.

Improving Token Efficiency with Dynamic Tool Provisioning

This section delves into dynamic tool selection as a crucial aspect of token efficiency, referencing Anthropic's post on how tokens are consumed by tool definitions and intermediate results. The previous GitHub example showed how raw, detailed results unnecessarily fill the context window. Docker's MCP gateway addresses this by only providing the specific tools useful for a given session (e.g., just the 'search repo' tool instead of all 40 GitHub tools). This targeted provisioning significantly reduces the context window size, leading to substantial token savings and improved agent performance.

Introducing "Code Mode" for Advanced Agent Capabilities

The video introduces "Code Mode," a groundbreaking feature implemented by Docker, based on Cloudflare's insights into the inefficiencies of traditional MCP usage. Code Mode allows AI agents to directly write and implement code using MCP tools, providing them with "code mode tools" that operate in novel ways. Essentially, it creates a JavaScript-enabled tool capable of calling other MCP tools. This innovative approach promises to unlock new possibilities for agent automation and efficiency, moving beyond static tool definitions.

Benefits of Code Mode: Security, Efficiency, and State Persistence

Code Mode offers three significant benefits for AI agents. Firstly, it ensures complete security by running agent-generated code in a Docker sandbox, preventing any damage to the system. Secondly, it drastically improves token and tool efficiency; the model only needs to be aware of one new code mode tool, rather than all underlying tools, saving substantial context. Thirdly, it provides state persistence through volumes, allowing data to be saved between tool calls without flooding the model with large datasets, as exemplified by a data processing pipeline where only summaries or final results are sent to the model.

Practical Application: Multi-Search Repos Tool

The speaker demonstrates Code Mode by creating a "multi search repos" tool. Initially, the agent combined multiple `find GitHub repo` calls into one, but the results were returned directly to the context window. To optimize this, the speaker instructed the agent to write all results to a text file, with the model only receiving a description of the repos, not the full details. This example showcases how Code Mode can streamline repetitive tasks and manage output efficiently, although the speaker notes the current limitation of manually saving and reusing such custom tools.

Practical Application: GitHub to Notion Workflow

Building on the previous example, the speaker further demonstrates Code Mode by creating a "GitHub to Notion" tool. After connecting to the Notion MCP, the agent developed a tool to directly output GitHub search results into a Notion database, complete with fields for name, description, and date. This automation streamlines the process of organizing discovered repositories, ensuring that the model only receives minimal information (name and description) while the comprehensive data is stored externally. This highlights the power of chaining different MCPs to create complex, context-efficient workflows.

Getting Started and Conclusion

The video concludes by encouraging viewers to explore Docker's MCP catalog to discover and chain together various MCPs for creating powerful workflows, emphasizing the benefits of saving tokens and preserving AI agent performance. Getting started is straightforward: users need to update their Docker version and ensure the Docker MCP toolkit is enabled in beta features. Once enabled, the catalog and new features are active by default, allowing users to connect their clients and immediately begin leveraging these advanced capabilities.
