00:00:00The Claude Code team have just fixed the biggest issue with MCP by adding tool search, a way
00:00:05to reduce context by up to 95% simply by searching for a tool name before using it instead of
00:00:11preloading all available tools into context, which could be tens of thousands of tokens
00:00:16used up even before writing your first prompt.
00:00:18But why wasn't this the way it worked before?
00:00:21And did they steal this technique from Cloudflare?
00:00:24Hit subscribe and let's get into it.
00:00:26MCP servers are absolutely everywhere, there's one for GitHub, Docker, Notion, there's
00:00:32even a Better Stack one, which I've heard is really good.
00:00:35And with people using Claude Code and LLMs for everything other than code, it seems like
00:00:40MCP isn't going anywhere anytime soon.
00:00:43But it has its problems: naming collisions, command injections, and the biggest of all,
00:00:48token inefficiency, because all the tools from a connected server typically get preloaded
00:00:53into the model's context window to give the model complete visibility.
00:00:57So tool names, tool descriptions, the full JSON schema documentation that contains optional
00:01:02and required parameters, their types, any constraints, basically a lot of data.
00:01:07The Redis team used 167 tools from four different servers, which took up over 60,000 tokens even
00:01:14before writing a prompt.
00:01:15That's roughly 30% of Opus' 200k context window, and this is even outside of skills and plugins.
00:01:21So if you have a lot of servers, that could take up a substantial amount of tokens.
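To put numbers on that, here's a rough sketch of how you could estimate what your tool definitions cost before you type anything. The four-characters-per-token rule is only a common rule of thumb, not a real tokenizer, and the function names are made up for this example.

```python
import json


def estimate_tokens(text: str) -> int:
    """Very rough estimate. Assumption: about 4 characters per token."""
    return max(1, len(text) // 4)


def tool_definition_tokens(tools: list[dict]) -> int:
    """Estimate the cost of sending every tool definition as JSON."""
    return sum(estimate_tokens(json.dumps(tool)) for tool in tools)


def context_share(tokens: int, window: int = 200_000) -> float:
    """Fraction of the context window taken up by `tokens`."""
    return tokens / window


# The figure quoted in the video: 60,000 tokens of tools in a 200k window.
print(f"{context_share(60_000):.0%}")  # prints 30%
```

For real numbers you would swap the estimate for the model's own tokenizer, but even the rough version shows how quickly a few big servers add up.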
00:01:25Yes, I know there are models out there, like Gemini, that have a 1 million token window,
00:01:31but models tend to perform worse the more things you add to their context.
00:01:35So what's the best way to fix this?
00:01:37Well, I've seen two popular paths online: the programmatic approach, which is what Cloudflare
00:01:42have done, and the search approach, which is what the Claude Code team have done.
00:01:46I'll talk about the programmatic approach a bit later, but first, let's talk about the search process,
00:01:52which works like this.
00:01:53First, Claude checks if preloaded MCP tools are more than 10% of the context.
00:01:59So that's 20k tokens if the context window is 200k tokens.
00:02:04If not, then no change happens, and the model uses the MCP tools as normal.
00:02:10But if yes, then Claude dynamically discovers the correct tools to use using natural language
00:02:17and loads in three to five of the most relevant tools based on the prompt.
00:02:22It will fully load just these tools into context for the model to use as normal.
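The flow just described can be sketched in a few lines. This is only an illustration: the 10% threshold and the cap of five tools come from the description above, but the keyword-overlap ranking is a stand-in I've assumed for whatever search Claude Code actually runs, and the tool names are hypothetical.

```python
import re

CONTEXT_WINDOW = 200_000
THRESHOLD_PERCENT = 10  # the 10% cut-off mentioned above


def should_use_tool_search(tool_tokens: int, window: int = CONTEXT_WINDOW) -> bool:
    """True when preloaded tool definitions exceed 10% of the window."""
    return tool_tokens * 100 > window * THRESHOLD_PERCENT


def _words(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search_tools(prompt: str, tools: list[dict], limit: int = 5) -> list[str]:
    """Rank tools by keyword overlap with the prompt (assumed stand-in)."""
    query = _words(prompt)
    scored = []
    for tool in tools:
        score = len(query & _words(tool["name"] + " " + tool["description"]))
        if score:
            scored.append((score, tool["name"]))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]


# Hypothetical tool catalogue: only names and descriptions are searched.
TOOLS = [
    {"name": "github_create_issue", "description": "Create an issue in a GitHub repository"},
    {"name": "docker_list_containers", "description": "List running Docker containers"},
    {"name": "notion_search_pages", "description": "Search pages in Notion workspaces"},
]

if should_use_tool_search(tool_tokens=60_000):
    matches = search_tools("create a GitHub issue for this bug", TOOLS)
    print(matches)  # only these tools would then be loaded in full
```

The point is that the full JSON schemas only get loaded for whatever comes back from the search, not for the whole catalogue.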
00:02:27This was actually their most requested feature on GitHub, and it works similarly to Agent Skills,
00:02:32which only loads skill names and descriptions into context, and when it finds a skill it
00:02:37thinks is relevant or a skill that was mentioned in the prompt, then it goes ahead and loads
00:02:42all of that specific skill into the context window.
00:02:46Progressive disclosure in a nutshell.
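Here's a minimal sketch of that idea, assuming a made-up `Skill` class rather than the real Agent Skills format: the initial context only gets names and descriptions, and the expensive body is read only when a skill is actually picked.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Skill:
    name: str
    description: str
    load_body: Callable[[], str]  # called only when the skill is needed


def initial_context(skills: list[Skill]) -> str:
    """The cheap part: one line per skill, no bodies loaded."""
    return "\n".join(f"{s.name}: {s.description}" for s in skills)


def expand_skill(skills: list[Skill], name: str) -> str:
    """The expensive part: load the full body of one chosen skill."""
    for skill in skills:
        if skill.name == name:
            return skill.load_body()
    raise KeyError(f"unknown skill: {name}")


# Hypothetical skills; in practice the body might be read from a file.
skills = [
    Skill("pdf-report", "Build a PDF report", lambda: "full PDF instructions..."),
    Skill("db-migrate", "Run database migrations", lambda: "full migration steps..."),
]

print(initial_context(skills))             # what the model sees up front
print(expand_skill(skills, "db-migrate"))  # loaded only once it's relevant
```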
00:02:47Both Anthropic and Cursor have seen great benefits when it comes to using this approach for MCP tools.
00:02:53But what about the programmatic approach?
00:02:55This works by models orchestrating tools through code instead of making API calls.
00:03:01So for these three tools that need to work one after the other based on the previous response,
00:03:06instead of making individual API tool calls, Claude in particular can write a Python script
00:03:11to do all of this orchestration, then execute the code and present the result back to the model.
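As a rough sketch, the kind of script the model might write looks like this. The three tools here are stand-ins I've invented with fake return values; the thing to notice is that the intermediate results never go back through the model, only the final one does.

```python
# Hypothetical tools, each depending on the previous tool's output.
def get_user(user_id: int) -> dict:
    return {"id": user_id, "team": "platform"}


def list_team_repos(team: str) -> list[str]:
    return [f"{team}-api", f"{team}-web"]


def count_open_issues(repo: str) -> int:
    return len(repo)  # fake number, just so the example runs


def orchestrate(user_id: int) -> dict[str, int]:
    """Chain the three tools in code instead of three separate tool calls."""
    user = get_user(user_id)
    repos = list_team_repos(user["team"])
    return {repo: count_open_issues(repo) for repo in repos}


# Only this final dictionary would be handed back to the model.
print(orchestrate(1))
```

With individual tool calls, each of those intermediate responses would land in context before the next call could be made.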
00:03:16Cloudflare have taken this one step further by getting the model to write TypeScript definitions
00:03:21for all the available tools and then running the code in a sandbox, which is usually a Worker.
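To give a feel for what a TypeScript definition for a tool could look like, here's a small sketch that turns an MCP-style tool schema into a declaration string. It's not Cloudflare's implementation: the type mapping and the `to_typescript` helper are assumptions for illustration, and it only handles flat parameters.

```python
# Assumed mapping from JSON Schema primitive types to TypeScript types.
TS_TYPES = {
    "string": "string",
    "integer": "number",
    "number": "number",
    "boolean": "boolean",
}


def to_typescript(tool: dict) -> str:
    """Render one tool schema as a TypeScript function declaration."""
    schema = tool.get("inputSchema", {})
    required = set(schema.get("required", []))
    fields = []
    for name, prop in schema.get("properties", {}).items():
        optional = "" if name in required else "?"
        ts_type = TS_TYPES.get(prop.get("type"), "unknown")
        fields.append(f"{name}{optional}: {ts_type}")
    params = "{ " + "; ".join(fields) + " }"
    return f"declare function {tool['name']}(input: {params}): Promise<unknown>;"


# Hypothetical tool with one required and one optional parameter.
tool = {
    "name": "create_issue",
    "inputSchema": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "priority": {"type": "integer"},
        },
        "required": ["title"],
    },
}

print(to_typescript(tool))
# prints: declare function create_issue(input: { title: string; priority?: number }): Promise<unknown>;
```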
00:03:27The Claude Code team actually tried the programmatic approach but found search to work better, which
00:03:32I find really hard to believe considering Claude is very good at writing code.
00:03:37And also, the agent-browser CLI headless Chromium thing that Vercel have released works very well
00:03:44in Claude Code and I'm sure if you could convert all MCP tools into CLI commands using
00:03:50something like MCPorter, it would be much easier and context efficient for models to run a specific
00:03:56CLI command for a tool instead of loading things into context, but hey, that's just my opinion.
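Just to show what I mean, here's a sketch of exposing tool schemas as CLI subcommands with Python's `argparse`. This isn't how MCPorter works internally; the `build_parser` helper and the example tool are made up, and every parameter is treated as a plain string flag.

```python
import argparse


def build_parser(tools: list[dict]) -> argparse.ArgumentParser:
    """Create one subcommand per tool, with a flag for each parameter."""
    parser = argparse.ArgumentParser(prog="tools")
    subcommands = parser.add_subparsers(dest="tool", required=True)
    for tool in tools:
        command = subcommands.add_parser(tool["name"], help=tool.get("description", ""))
        schema = tool.get("inputSchema", {})
        required = set(schema.get("required", []))
        for name in schema.get("properties", {}):
            command.add_argument(f"--{name}", required=name in required)
    return parser


# Hypothetical tool definition in the MCP shape (name, description, inputSchema).
TOOL_SCHEMAS = [
    {
        "name": "create_issue",
        "description": "Create an issue",
        "inputSchema": {
            "properties": {"title": {"type": "string"}, "priority": {"type": "integer"}},
            "required": ["title"],
        },
    }
]

args = build_parser(TOOL_SCHEMAS).parse_args(["create_issue", "--title", "Login bug"])
print(args.tool, args.title)  # prints: create_issue Login bug
```

The model would then only need to know a command name and its `--help` output, rather than carrying every schema in context.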
00:04:01Overall, I'm glad the issues with MCP servers are being looked into and maybe it might just
00:04:07convince me to have more than one server installed.