00:00:00These past two months, the AI community has already realized that MCPs have a huge problem.
00:00:04And due to this, the community has actually come up with some solutions.
00:00:08But all of the solutions have huge gaps.
00:00:10A while back we made a video on Docker's solution,
00:00:12which we considered the best solution to the MCP problem until now.
00:00:16Docker released Code Mode, which lets agents write JavaScript code that calls MCP tools directly.
00:00:21And this solved the problem where MCP tools consume a lot of context
00:00:24by having their names and descriptions exposed in the context window.
00:00:27So if you are working with a lot of MCPs, your context window will be bloated with
00:00:32unnecessary tools, most of which aren't even needed most of the time.
00:00:36But with the Docker MCP Gateway, you were locked into the MCPs Docker had configured
00:00:41and there were limits on local and remote MCPs.
00:00:43Also, you weren't able to save those custom tools as functions.
00:00:47All of this was triggered when Cloudflare identified this issue and proposed a solution
00:00:51to have these tools exist as executable code rather than having them sit in the context window.
00:00:56Anthropic, who were the original architects of this protocol, acknowledged this gap in
00:01:00their product and followed up by releasing a paper highlighting this exact issue.
00:01:04After this, people started taking this problem seriously and started exploring solutions.
00:01:09But their solution of converting every tool into a TypeScript file also has gaps.
00:01:13With a lot of MCPs connected, you have to convert each one to code individually and
00:01:18you also need to spend a lot of time to make sure that none of them fail in the process.
00:01:22But since this became an acknowledged problem,
00:01:24people are still trying to bring out better solutions.
00:01:26And this is when we found this new tool called MCP to CLI.
00:01:30MCP to CLI solves the context bloat problem that MCPs have by actually turning all of the
00:01:36MCP servers into CLI tools that you can run through bash commands.
00:01:40Now, we primarily use Claude Code on our team, and it actually has a CLI flag that aims to
00:01:45solve part of this problem. That flag addresses the initial MCP context bloat by not exposing
00:01:50all of the tools up front in the context window, instead allowing Claude Code to dynamically load
00:01:55each tool as needed. But that still leaves out the other issue in Claude Code. As you probably know,
00:02:00MCPs return their outputs directly in the context window. And in case there is a large output
00:02:05returned by the MCP tool, it remains in the context window anyway, leading to unnecessary
00:02:10context window bloat. Now, you might also have heard about other open source tools such as CLI
00:02:15hub that target the same problem, but they are inefficient because they convert the tools at build time
00:02:20and not at runtime. So what does runtime conversion actually mean? It means the tool gets converted
00:02:25into a bash command at the moment it's actually called. Now, this might seem all right, but what
00:02:29happens when the original MCP itself gets updated? Because this tool builds up its MCP tools at
00:02:34runtime, any change in the actual MCP is automatically reflected in the converted tool.
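The runtime approach can be sketched in a few lines of Python. This is an illustrative sketch, not MCP to CLI's actual code: `fetch_tool_schemas` and the tool names are hypothetical stand-ins for a live `tools/list` response from an MCP server. The point is that the CLI is rebuilt from the current schemas on every invocation, so server-side changes show up automatically.

```python
import argparse

# Hypothetical stand-in for querying a live MCP server's tools/list
# endpoint at call time; in reality this would be an RPC to the server.
def fetch_tool_schemas():
    return [
        {"name": "get_user", "params": {"id": "User ID to look up"}},
        {"name": "list_repos", "params": {"org": "GitHub organization"}},
    ]

def build_cli():
    # Rebuild the CLI from the *current* schemas on every invocation,
    # so server-side tool changes are reflected without a rebuild step.
    parser = argparse.ArgumentParser(prog="mcp")
    subparsers = parser.add_subparsers(dest="tool", required=True)
    for tool in fetch_tool_schemas():
        sub = subparsers.add_parser(tool["name"])
        for param, help_text in tool["params"].items():
            sub.add_argument(f"--{param}", help=help_text, required=True)
    return parser

parser = build_cli()
args = parser.parse_args(["get_user", "--id", "42"])
print(args.tool, args.id)  # prints: get_user 42
```

If the server adds a parameter or a whole new tool, the next invocation simply picks it up, which is exactly what a build-time converter can't do without a manual refresh.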
00:02:39This wouldn't have been possible if we were building tools at build time. In that case,
00:02:43we would have had to manually fetch and update the tool ourselves every single time. But you might
00:02:48think that converting the same tool every time it's called would make repeated calls slow. That's where
00:02:53the caching mechanism comes in, which they've built into the tool. It saves all of the MCP tools in a
00:02:58cache with a one hour time to live by default. So all of the frequently used MCP tools go right into
00:03:03a cache and stay there for one hour. And from there the agent can get the tools with faster retrieval
00:03:08without sacrificing the runtime flexibility. Now this tool is built right on top of the MCP Python
00:03:13SDK, the same one every MCP server actually uses underneath. So with all of the MCP tool calls it
00:03:19runs, it simply executes them as bash commands and only injects the response into the agent's
00:03:24context window when it's asked to. It also handles OpenAPI and REST APIs through the same CLI
00:03:30interface, meaning any existing API that doesn't have an MCP server can still be used the exact
00:03:35same way. Without this tool, you're limited in what type of MCPs you can actually connect to.
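The one-hour TTL cache described a moment ago can be sketched like this. This is an illustrative Python sketch of the mechanism, not the tool's actual cache internals; the `fetch` callback stands in for a real fetch of a tool definition from an MCP server.

```python
import time

TTL_SECONDS = 3600          # one-hour time to live, as in the tool's default
_cache = {}                 # tool name -> (timestamp, definition)

def get_tool(name, fetch):
    """Return a cached tool definition, refetching only after the TTL expires."""
    now = time.time()
    if name in _cache:
        stamp, definition = _cache[name]
        if now - stamp < TTL_SECONDS:
            return definition      # fresh entry: skip the expensive fetch
    definition = fetch(name)       # stale or missing: fetch at runtime
    _cache[name] = (now, definition)
    return definition

# Demo: the second call is served from cache, so fetch runs only once.
calls = []
def fetch(name):
    calls.append(name)
    return {"name": name, "description": "demo"}

get_tool("get_user", fetch)
get_tool("get_user", fetch)
print(len(calls))  # prints: 1
```

This is why repeated calls stay fast without giving up runtime flexibility: within the TTL window the definition comes straight from the cache, and after it expires the next call refetches the current version.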
00:03:39Other similar solutions don't usually give you flexibility to work with all types of MCPs all
00:03:44in one place. To back up their claims on token efficiency, they ran an automated test suite using
00:03:49tiktoken, the Python library for counting tokens. When they tested it, the tool was much cheaper and
00:03:54had much faster execution. So you don't just have to take our word for this one. This one actually
00:03:59had the numbers behind it. You can either install it on your system using pip or run it without
00:04:03installing. We chose to run it without installing because it keeps the working environment clean.
00:04:07And they've also provided a skill that helps agents work with this tool better. It lays out the core
00:04:13workflow and gives examples of the bash commands for different tasks like authentication and caching
00:04:18that your agent doesn't have the context for. But before we move forward, let's hear a word
00:04:22from our sponsor, Orchids. Most AI builders handle simple mockups well, but they fail when you need
00:04:27complex logic or multi-file structures. That's where Orchids comes in: the first AI agent that
00:04:32can build and deploy any app on any stack directly from your environment. You can bring your own
00:04:36subscription to run models at cost using your existing ChatGPT, Claude, or Gemini accounts,
00:04:41even GitHub Copilot. It is built to handle any app on any stack. You aren't limited to just the web.
00:04:47You can build and deploy everything from mobile apps and Chrome extensions to complex AI agents
00:04:52and Slack bots. Check out these builds. A fully working OpenClaw setup managing complex hardware
00:04:57level logic, a functional Bloomberg terminal processing massive live data feeds in real time
00:05:02and native mobile apps like this building identifier that leverages your device's camera
00:05:07feed directly. Click the link in the pinned comment and start building. Plus use the code March 15 for
00:05:1215% off your plan. Just like you, we also want to get rich, and one way is to notice a gap in the
00:05:17market. That's when we came across this golden idea: grinder, but for horses. But jokes aside,
00:05:22building large scale products requires a lot of MCP tools because they have a lot of dependencies
00:05:27and they blow the context window fast. We connected the agent to the Supabase MCP using MCP to CLI,
00:05:34because that was the backend infrastructure we were using. Now you don't have to configure
00:05:38anything manually because of the skill we installed earlier. That skill handles everything on its own
00:05:43and configures the MCPs for you. But before diving in head first, you need to get the
00:05:47access tokens for whichever MCPs you are using. If you don't, you'll run into
00:05:52errors like we did, after which we generated our access token and gave it to Claude to add.
00:05:57Once configured correctly, you should be seeing the tools available for use. Now you might think
00:06:01that if this tool runs as a bash command, it's not safe to have sensitive data like API keys and
00:06:06access tokens in it because they could be exposed when processes are listed. But this tool adds a
00:06:11protection layer. It doesn't put sensitive data in the command line arguments. Instead, it handles
00:06:15them through environment variables, or it references a file path where the access tokens are saved,
00:06:21or it uses a secret manager that injects them at runtime. So it's secure to run. Similar to the
00:06:26Supabase connection, we connected the GitHub MCP for version control, the Puppeteer MCP for browser
00:06:32testing, and the Context7 MCP for grounding the agent with proper documentation so that it
00:06:37works with updated docs. Once all the MCPs were connected, we asked Claude to verify everything.
00:06:42It confirmed that we had all four MCPs connected, 78 tools in total in our case. Also, if you are
00:06:47enjoying our content, consider pressing the hype button because it helps us create more content
00:06:52like this and reach out to more people. Now once we were actually connected, it was time to start
00:06:57implementing the app incrementally. We started with connecting the client-side code to the Supabase
00:07:02backend. When Claude ran the MCP to CLI command to create the project, we noticed that it did not put
00:07:07the access token directly into the tool call. Instead, it referred to our .env.local file at
00:07:12the project level for the token. It created the project, set everything up, and added the logic
00:07:17for connecting in the code. But we noticed that it used the middleware file for the session refresh
00:07:22logic, which it shouldn't have, because that approach is deprecated. The new version of Next.js uses
00:07:27a proxy instead, and we knew this would give us an error when we actually ran the app. This just shows that
00:07:31connecting tools is not enough to make the agent listen to the tools and actually use them when
00:07:36needed. So we created a Claude.md and told it to use the Context7 MCP before writing any
00:07:42code so this wouldn't happen again. That way it knows it should refer to the Context7 MCP
00:07:47before implementation. Once it had finished adding tables and setting up authentication on Supabase,
00:07:52we pointed out the deprecated middleware warning to Claude so that it could correct it. After we
00:07:57told it to, it finally used the Context7 MCP to pull in the documentation and resolve the issue
00:08:03properly. But when we were exploring this tool further, we found out that there was a better
00:08:07way to handle these issues than creating a Claude.md file. Skills are better because their
00:08:11descriptions are loaded directly into the agent's context. So it already knows what tools are
00:08:16available and when to use them, rather than us just dumping instructions into Claude.md and hoping
00:08:21it reads them. So we asked it to create a skill for all of the MCPs we had connected. Claude then
00:08:26created skills for each MCP. Each one detailed what tools it had, how to use them, and when to use them.
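For reference, a skill is just a folder with a SKILL.md file whose frontmatter description gets loaded into the agent's context. A minimal sketch of what such a skill might look like for the Supabase MCP follows; the name, description, and sections are illustrative stand-ins, not the exact files Claude generated for us.

```markdown
---
name: supabase-mcp
description: Use when the task involves the Supabase backend, such as
  creating projects, tables, or auth. Tools run as bash commands via MCP to CLI.
---

# Supabase MCP

## When to use
- Creating or migrating database tables
- Setting up authentication

## How to call tools
Run the MCP to CLI command for the tool you need, passing arguments as flags.
Redirect large outputs to a file instead of printing them into context.
```

The frontmatter description is what the agent sees up front; the body is only loaded when the skill is actually triggered, which is exactly why this beats dumping everything into Claude.md.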
00:08:32With that in place, we moved to the next problem. But what we had was still far from functional.
00:08:36The feedback from the horses told us that they were getting impatient because they were unable to chat
00:08:41directly on the platform. So we asked Claude to make the chat functional for the project on top of
00:08:46the UI. When we tested it ourselves, the messages didn't load and only showed a loading screen. So we
00:08:51asked it to use the Puppeteer MCP to test the message flow. We had it check itself because an
00:08:56agent that can click and scroll and interact with its own UI catches things that static code review
00:09:01never can. For testing purposes, it created two users. But it couldn't maintain session data across
00:09:06tool calls since each one spun up a new browser instance. The number of tools it used and the time
00:09:10it took to work in a headless browser made us realize something: a better option would be to not
00:09:15let the MCP handle this at all. We prefer using Claude's own browser extension, which has more
00:09:21capabilities and retains sessions better for end-to-end testing like this; it was much faster
00:09:25than the seven minutes we wasted on such a simple task. And
00:09:30MCPs run as persistent processes which is why they're able to maintain state across the entire
00:09:35session. This tool also provides control over the output format like JSON and raw output. It also
00:09:40supports TOON, the token-efficient encoding format for LLM consumption. When we work with MCPs like
00:09:46Context7, they usually return a huge chunk of output directly into the context window. To
00:09:51prevent that, we added in the Claude.md file that whenever it uses the Context7 MCP, it should use
00:09:57the TOON format for output. It's an efficient format because it combines indentation and CSV-style
00:10:02lists, compacting large information into much smaller chunks compared to JSON and YAML. This
00:10:07way you don't waste any tokens unnecessarily. But the biggest unlock came from something that
00:10:12wasn't even possible when MCPs were handled natively by agents. If you remember, Cursor
00:10:16released a context editing workflow inside its product. It treated MCP results as files and
00:10:22let the agent use bash tools like grep for pattern matching to extract data. We covered that in our
00:10:27previous video. We tried to implement this idea in other coding agents but since MCPs are handled
00:10:32natively by agents we weren't able to get much out of it. But now with this CLI it's possible because
00:10:37MCPs are treated as bash command tools. So we added an instruction in the Claude.md file that whenever
00:10:43any MCP tool produces a large output instead of loading it into the context window it should
00:10:49redirect it to a file at the path we specified. We were tracking this project's progress through
00:10:54a progress.json file. After adding the instruction we asked Claude to implement one feature from the
00:10:59list. It then used the Context7 MCP for tool calls, but instead of dumping the output to the context
00:11:05window it piped it to a file and used grep to extract the data and complete the implementation.
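The pattern is easy to reproduce: instead of returning a large tool result into the context window, write it to a file and search it. A minimal Python sketch of the idea follows; the file name, the simulated output, and the search pattern are made up for illustration.

```python
import pathlib
import re
import tempfile

# Simulate a large MCP tool output, e.g. documentation pulled by Context7.
large_output = "\n".join(
    ["Next.js guide line %d" % i for i in range(1000)]
    + ["createServerClient: use the proxy for session refresh"]
)

# Step 1: redirect the output to a file instead of the context window.
out_path = pathlib.Path(tempfile.mkdtemp()) / "context7_output.txt"
out_path.write_text(large_output)

# Step 2: grep-style extraction, so only matching lines reach the agent.
pattern = re.compile(r"session refresh")
matches = [line for line in out_path.read_text().splitlines()
           if pattern.search(line)]
print(matches)
```

Out of a thousand-line output, only the single matching line ever has to enter the agent's context, which is the whole point of treating MCP results as files.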
00:11:10The Claude.md file with all the best practices for using this tool is available in AI Labs Pro. For
00:11:16those who don't know, it's our recently launched community where you get ready-to-use templates
00:11:20for this video and all previous ones that you can plug directly into your projects. If you've found
00:11:25value in what we do and want to support the channel this is the best way to do it. The link's in the
00:11:29description. That brings us to the end of this video. If you'd like to support the channel and
00:11:33help us keep making videos like this you can do so by using the super thanks button below.
00:11:38As always, thank you for watching and I'll see you in the next one.