This Just Fixed The Greatest Problem Of AI Coding

AAI LABS
Computing/Software · Small Business/Startups · Internet Technology

Transcript

00:00:00These past two months, the AI community has already realized that MCPs have a huge problem.
00:00:04And due to this, the community has actually come up with some solutions.
00:00:08But all of the solutions have huge gaps.
00:00:10A while back we made a video on Docker's solution,
00:00:12which we considered the best solution to the MCP problem until now.
00:00:16Docker released code mode which lets agents write JavaScript code that calls MCP tools directly.
00:00:21And this solved the problem where MCP tools consume a lot of context
00:00:24by having the tool and description exposed in the context window.
00:00:27So if you are working with a lot of MCPs, your context window will be bloated with
00:00:32unnecessary tools, most of which aren't even needed most of the time.
00:00:36But with the Docker MCP gateway, you were locked into the MCPs Docker had configured
00:00:41and there were limits on local and remote MCPs.
00:00:43Also, you weren't able to save those custom tools as functions.
00:00:47All of this was triggered when Cloudflare identified this issue and proposed a solution
00:00:51to have these tools exist as executable code rather than having them sit in the context window.
00:00:56Anthropic, who were the original architects of this protocol, acknowledged this gap in
00:01:00their product and followed up by releasing a paper highlighting this exact issue.
00:01:04After this, people started taking this problem seriously and started exploring solutions.
00:01:09But their solution of converting every tool into a TypeScript file also has gaps.
00:01:13With a lot of MCPs connected, you have to convert each one to code individually and
00:01:18you also need to spend a lot of time to make sure that none of them fail in the process.
00:01:22But since this became an acknowledged problem,
00:01:24people are still trying to bring out better solutions.
00:01:26And this is when we found this new tool called MCP to CLI.
00:01:30MCP to CLI solves the context bloat problem that MCPs have by actually turning all of the
00:01:36MCP servers into CLI tools that you can run through bash commands.
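To make the core idea concrete, here is a minimal, hypothetical sketch (not MCP to CLI's actual interface; the tool names and flags are illustrative) of what exposing tools behind a bash-invocable command can look like in Python:

```python
import argparse
import json

# Hypothetical in-process stand-ins for MCP tools; the real tool proxies
# these calls to live MCP servers instead of hardcoding them.
TOOLS = {
    "search_docs": lambda query: {"results": [f"doc matching {query!r}"]},
    "get_issue": lambda query: {"issue": query, "state": "open"},
}

def main(argv=None):
    # Tool name and arguments arrive as ordinary command-line arguments,
    # not as tool schemas sitting in the agent's context window.
    parser = argparse.ArgumentParser(description="Run an MCP tool as a bash command")
    parser.add_argument("tool", choices=sorted(TOOLS))
    parser.add_argument("--query", required=True)
    args = parser.parse_args(argv)
    # Only the final result is printed; the agent decides whether to read it.
    print(json.dumps(TOOLS[args.tool](args.query)))

if __name__ == "__main__":
    main()
```

An agent would then invoke something like `mcp-tool search_docs --query auth` from bash, with none of the tool descriptions pre-loaded into its context.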
00:01:40Now, we primarily use Claude Code in our team, and it actually has a CLI flag that aims to
00:01:45solve a part of this problem. That flag solves the initial MCP context bloat problem by not exposing
00:01:50all of the tools up front in the context window, instead allowing Claude Code to dynamically load
00:01:55each tool as needed. But that still leaves the other issue in Claude Code. As you probably know,
00:02:00MCPs return their outputs directly in the context window. And in case there is a large output
00:02:05returned by the MCP tool, it remains in the context window anyway, leading to unnecessary
00:02:10context window bloat. Now, you might also have heard about other open-source tools such as CLI
00:02:15Hub that target the same problem, but they are inefficient because they convert them at build time
00:02:20and not at runtime. So what does runtime conversion actually mean? It means the tool gets converted
00:02:25into a bash command at the moment it's actually called. Now, this might seem all right, but what
00:02:29happens when the original MCP itself gets updated? Because this tool builds up its MCP tools at
00:02:34runtime, any change in the actual MCP is automatically reflected in the converted tool.
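The contrast can be sketched in a few lines of Python; `SERVER_TOOLS` here is an illustrative stand-in for a live MCP server's tool listing, not the tool's real code:

```python
# Stand-in for a live MCP server's advertised tools (name -> schema version).
SERVER_TOOLS = {"create_project": "v1"}

def build_time_snapshot():
    # Build-time conversion: freeze the tool list once; later upstream
    # changes stay invisible until someone manually rebuilds.
    return dict(SERVER_TOOLS)

def runtime_lookup(name):
    # Runtime conversion: consult the live server on every call, so
    # upstream updates are reflected automatically.
    return SERVER_TOOLS[name]

snapshot = build_time_snapshot()
SERVER_TOOLS["create_project"] = "v2"  # the upstream MCP tool gets updated
```

After the update, the snapshot still reports `v1` while the runtime lookup returns `v2`, which is the staleness problem runtime conversion avoids.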
00:02:39This wouldn't have been possible if we were building tools at build time. In that case,
00:02:43we would have had to manually fetch and update the tool ourselves every single time. But you might
00:02:48think that converting the same tool every time it's called would make repeated calls slow. That's where
00:02:53the caching mechanism comes in, which they've built into the tool. It saves all of the MCP tools in a
00:02:58cache with a one hour time to live by default. So all of the frequently used MCP tools go right into
00:03:03a cache and stay there for one hour. And from there the agent can get the tools with faster retrieval
00:03:08without sacrificing the runtime flexibility. Now this tool is built right on top of the MCP Python
00:03:13SDK, the same one every MCP server actually uses underneath. So with all of the MCP tool calls it
00:03:19runs, it simply executes them as bash commands and only injects the response into the agent's
00:03:24context window when it's asked to. It also handles OpenAPI and REST APIs through the same CLI
00:03:30interface, meaning any existing API that doesn't have an MCP server can still be used the exact
00:03:35same way. Without this tool, you're limited in what type of MCPs you can actually connect to.
00:03:39Other similar solutions don't usually give you flexibility to work with all types of MCPs all
00:03:44in one place. To back up their claims on token efficiency, they ran an automated test suite using
00:03:49tiktoken, the Python library for counting tokens. When they tested it, the tool was much cheaper and
00:03:54had much faster execution. So you don't just have to take our word for this one. This one actually
00:03:59had the numbers behind it. You can either install it on your system using pip or run it without
00:04:03installing. We chose to run it without installing because it keeps the working environment clean.
00:04:07And they've also provided a skill that helps agents work with this tool better. It lays out the core
00:04:13workflow and gives examples of the bash commands for different tasks like authentication and caching
00:04:18that your agent doesn't have the context for. But before we move forward, let's have a word
00:04:22by our sponsor Orchids. Most AI builders handle simple mockups well, but they fail when you need
00:04:27complex logic or multi file structures. That's where Orchids comes in. The first AI agent that
00:04:32can build and deploy any app on any stack directly from your environment. You can bring your own
00:04:36subscription to run models at cost using your existing ChatGPT, Claude, or Gemini accounts,
00:04:41even GitHub Copilot. It is built to handle any app on any stack. You aren't limited to just the web.
00:04:47You can build and deploy everything from mobile apps and Chrome extensions to complex AI agents
00:04:52and Slack bots. Check out these builds. A fully working OpenClaw setup managing complex hardware
00:04:57level logic, a functional Bloomberg terminal processing massive live data feeds in real time
00:05:02and native mobile apps like this building identifier that leverages your device's camera
00:05:07feed directly. Click the link in the pinned comment and start building. Plus use the code March 15 for
00:05:1215% off your plan. Just like you, we also want to get rich. And one way is to notice a gap in the
00:05:17market. And that's when we came across this golden idea: Grindr, but for horses. But jokes aside,
00:05:22building large scale products requires a lot of MCP tools because they have a lot of dependencies
00:05:27and they blow the context window fast. We connected the agent to the Supabase MCP using MCP to CLI,
00:05:34because that was the backend infrastructure we were using. Now you don't have to configure
00:05:38anything manually because of the skill we installed earlier. That skill handles everything on its own
00:05:43and configures the MCPs for you. But before diving in head first, you need to get the
00:05:47access tokens of whichever MCP you are using. Because if you don't, you'll run into
00:05:52errors like we did, after which we generated our access token and gave it to Claude to add it.
00:05:57Once configured correctly, you should be seeing the tools available for use. Now you might think
00:06:01that if this tool runs as a bash command, it's not safe to have sensitive data like API keys and
00:06:06access tokens in it because they could be exposed when processes are listed. But this tool adds a
00:06:11protection layer. It doesn't put sensitive data in the command line arguments. Instead, it handles
00:06:15them through environment variables, or it references a file path where the access tokens are saved,
00:06:21or it uses a secret manager that injects them at runtime. So it's secure to run. Similar to the
00:06:26Supabase connection, we connected the GitHub MCP for version control, the Puppeteer MCP for browser
00:06:32testing, and the Context 7 MCP for grounding the agent with proper documentation so that it
00:06:37works with updated docs. Once all the MCPs were connected, we asked Claude to verify everything.
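The credential-handling pattern described just above — an environment variable or a token file instead of command-line arguments — can be sketched like this; the variable and file names are illustrative assumptions, not the tool's real configuration keys:

```python
import os
from pathlib import Path

def resolve_token(env_var="SUPABASE_ACCESS_TOKEN", token_file=None):
    """Resolve a secret without ever placing it on the command line.

    Command-line arguments are visible to other users via `ps`, so the
    token is read from an environment variable or a file path instead.
    """
    token = os.environ.get(env_var)
    if token:
        return token
    if token_file:
        return Path(token_file).read_text().strip()
    raise RuntimeError(f"set {env_var} or provide a token file")
```

The resolved token is then attached to the outgoing request (for example as an HTTP header) at the moment of execution, so it never appears in a process listing.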
00:06:42It confirmed that we had all four MCPs connected, 78 tools in total in our case. Also, if you are
00:06:47enjoying our content, consider pressing the hype button because it helps us create more content
00:06:52like this and reach out to more people. Now once we were actually connected, it was time to start
00:06:57implementing the app incrementally. We started with connecting the client-side code to the Supabase
00:07:02backend. When Claude ran the MCP to CLI command to create the project, we noticed that it did not put
00:07:07the access token directly into the tool call. Instead, it referred to our .env.local at
00:07:12the project level for the token. It created the project, set everything up, and added the logic
00:07:17for connecting in the code. But we noticed that it used the middleware file for the session refresh
00:07:22logic, and it shouldn't have been using that because it's deprecated. The new version of Next.js uses
00:07:27the proxy, and we knew this would give us an error when we actually ran the app. This just shows that
00:07:31connecting tools is not enough to make the agent listen to the tools and actually use them when
00:07:36needed. So we created a Claude.md and told it to use the Context 7 MCP before writing any
00:07:42code so this wouldn't happen again. That way it knows it should refer to the Context 7 MCP
00:07:47before implementation. Once it had finished adding tables and setting up authentication on Supabase,
00:07:52we pointed out the deprecated middleware warning to Claude so that it could correct it. After we
00:07:57told it to, it finally used the context seven MCP to pull in the documentation and resolve the issue
00:08:03properly. But when we were exploring this tool further, we found out that there was a better
00:08:07way to handle these issues than creating a Claude.md file. Skills are better because their
00:08:11descriptions are loaded directly into the agent's context. So it already knows what tools are
00:08:16available and when to use them, rather than us just dumping instructions into a Claude.md and hoping
00:08:21it reads them. So we asked it to create a skill for all of the MCPs we had connected. Claude then
00:08:26created skills for each MCP. Each one detailed what tools it had, how to use them, and when to use them.
00:08:32With that in place, we moved to the next problem. But what we had was pretty far from functional.
00:08:36The feedback from the horses told us that they were getting impatient because they were unable to chat
00:08:41directly on the platform. So we asked Claude to make the chat functional for the project on top of
00:08:46the UI. When we tested it ourselves, the messages didn't load and only showed a loading screen. So we
00:08:51asked it to use the Puppeteer MCP to test the message flow. We had it check itself because an
00:08:56agent that can click and scroll and interact with its own UI catches things that static code review
00:09:01never can. For testing purposes, it created two users. But it couldn't maintain session data across
00:09:06tool calls since each one spun up a new browser instance. The number of tools it used and the time
00:09:10it took to work in a headless browser made us realize something. A better option would be to just
00:09:15let the MCP handle it. It was much faster and took far less time than the seven minutes we wasted on
00:09:21such a simple task. We prefer using Claude's own browser extension, which has more capabilities
00:09:25and is able to retain sessions better for end-to-end testing like this. And
00:09:30MCPs run as persistent processes which is why they're able to maintain state across the entire
00:09:35session. This tool also provides control over the output format, like JSON and raw output. It also
00:09:40supports TOON, the token-efficient encoding format for LLM consumption. When we work with MCPs like
00:09:46Context 7, they usually return a huge chunk of output directly into the context window. To
00:09:51prevent that we added in the Claude.md file that whenever it uses the Context 7 MCP it should use
00:09:57the TOON format for output. It's an efficient format because it combines indentation and CSV-style
00:10:02lists, compacting large information into much smaller chunks as compared to JSON and YAML. This
00:10:07way you don't waste any tokens unnecessarily. But the biggest unlock came from something that
00:10:12wasn't even possible when MCPs were handled natively by agents. If you remember, Cursor
00:10:16released a context editing workflow inside its product. They treated MCP results as files and
00:10:22let the agent use bash scripts like grep for pattern matching to extract data. We covered that in our
00:10:27previous video. We tried to implement this idea in other coding agents but since MCPs are handled
00:10:32natively by agents we weren't able to get much out of it. But now with this CLI it's possible because
00:10:37MCPs are treated as bash command tools. So we added an instruction in the Claude.md file that whenever
00:10:43any MCP tool produces a large output instead of loading it into the context window it should
00:10:49redirect it to a file at the path we specified. We were tracking this project's progress through
00:10:54a progress.json file. After adding the instruction we asked Claude to implement one feature from the
00:10:59list. It then used the Context 7 MCP for tool calls but instead of dumping the output to the context
00:11:05window it piped it to a file and used grep to extract the data and complete the implementation.
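The redirect-and-extract pattern can be sketched in Python, with `re` standing in for grep; the file path and the sample docs payload are illustrative, not what the tool actually produces:

```python
import os
import re
import tempfile
from pathlib import Path

def save_large_output(output: str, path: str) -> str:
    # Write the full MCP result to disk instead of the context window;
    # only the short file path needs to reach the agent.
    Path(path).write_text(output)
    return path

def extract(path: str, pattern: str) -> list[str]:
    # grep-style line filter: pull out only the lines that matter.
    regex = re.compile(pattern)
    return [line for line in Path(path).read_text().splitlines()
            if regex.search(line)]

# Illustrative: a large docs dump reduced to the one relevant line.
docs = "intro\nuse createBrowserClient for client components\nappendix"
path = save_large_output(docs, os.path.join(tempfile.gettempdir(), "mcp_output.txt"))
matches = extract(path, r"createBrowserClient")
```

Only the handful of matching lines ever re-enters the agent's context, which is what keeps large documentation pulls from bloating the window.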
00:11:10The Claude.md file with all the best practices to use this tool is available in AI Labs Pro. For
00:11:16those who don't know, it's our recently launched community where you get ready-to-use templates
00:11:20that you can plug directly into your projects, for this video and all previous ones. If you've found
00:11:25value in what we do and want to support the channel this is the best way to do it. The link's in the
00:11:29description. That brings us to the end of this video. If you'd like to support the channel and
00:11:33help us keep making videos like this, you can do so by using the Super Thanks button below.
00:11:38As always, thank you for watching and I'll see you in the next one.

Key Takeaway

MCP to CLI solves the critical issue of AI context bloat by transforming MCP tools into bash commands, enabling more efficient, secure, and scalable AI coding workflows.

Highlights

The Model Context Protocol (MCP) faces a 'context bloat' problem where tool descriptions consume excessive tokens in the context window.

A new tool called 'MCP to CLI' addresses this by converting MCP servers into executable bash commands at runtime.

Runtime conversion allows for automatic updates and flexibility, while a built-in caching mechanism ensures performance for frequent calls.

The tool supports secure handling of sensitive data like API keys through environment variables and secret managers instead of command-line arguments.

Using the 'TOON' format and file piping allows agents to process massive outputs without flooding the context window, similar to Cursor's workflow.

The integration of 'Skills' provides a more reliable way to inform agents of available tools compared to static instructions in a Claude.md file.

Timeline

The Problem of MCP Context Bloat

The speaker introduces the primary challenge with the Model Context Protocol: the way tool descriptions and outputs consume vast amounts of the context window. While Docker released a 'code mode' to mitigate this, it remains limited by fixed configurations and a lack of custom function saving. Anthropic and Cloudflare have both acknowledged this gap, noting that having tools sit idle in the context window is inefficient. Previous attempts to convert every tool into individual TypeScript files were cumbersome and prone to failure during the conversion process. This section establishes the urgent need for a more dynamic and automated solution to manage AI tool integration.

Introduction to MCP to CLI and Runtime Conversion

MCP to CLI is presented as a superior alternative that turns MCP servers into CLI tools executable via bash commands. Unlike 'CLI Hub' which uses build-time conversion, this tool operates at runtime, meaning any updates to the original MCP are immediately reflected. To prevent speed issues during these dynamic calls, the developers implemented a caching mechanism with a default one-hour 'Time to Live' (TTL). This architecture ensures that frequently used tools are retrieved quickly without sacrificing the flexibility of live updates. The speaker emphasizes that this approach keeps the agent's context clean by only injecting data when specifically requested.

Technical Architecture and Security Features

The tool is built directly on top of the MCP Python SDK, ensuring native compatibility with existing servers. It extends beyond standard MCPs to handle OpenAPI and REST APIs through the same command-line interface, providing a unified workflow for developers. For security, the tool avoids placing sensitive API keys or access tokens directly into command-line arguments where they could be exposed in process lists. Instead, it utilizes environment variables, file paths, or secret managers to inject credentials safely at the moment of execution. The developers also provide a 'Skill' to help agents understand the core workflow and handle complex tasks like authentication and caching automatically.

Real-World Implementation: Supabase and Security

The video transitions into a practical demonstration by building an app for 'Grindr for Horses' using Supabase as the backend. The speaker demonstrates how to connect multiple MCPs, including GitHub for version control and Puppeteer for browser testing, totaling 78 tools in their specific case. A crucial setup step involves generating access tokens and providing them to the AI agent to avoid configuration errors. The demonstration highlights how the agent refers to a ".env.local" file for tokens rather than hardcoding them, maintaining a high security standard. This section proves that the CLI-based approach can handle complex, multi-dependency professional projects without overwhelming the LLM.

Advanced Workflows: Skills, Documentation, and Testing

The speaker addresses the challenge of making the agent actually 'listen' to the tools by using a 'Claude.md' file and the Context 7 MCP for documentation. They explain that 'Skills' are more effective than simple instruction files because their descriptions are loaded directly into the agent's active context. When a session refresh logic error occurred due to deprecated Next.js middleware, the agent used the documentation tool to find the correct proxy-based solution. The section also covers UI testing with Puppeteer, though it notes that Claude's native browser extension is often better for maintaining state across sessions. This illustrates the importance of choosing the right tool for specific tasks like end-to-end testing versus static code review.

Optimizing Token Efficiency with TOON and File Piping

To further optimize performance, the tool supports the 'TOON' format, which is more token-efficient than JSON or YAML by using compact indentation and CSV-style lists. A major 'unlock' is demonstrated: treating MCP outputs as files and using bash scripts like 'grep' to extract only the necessary data. This allows the agent to handle massive documentation outputs by piping them to a specified path instead of dumping the full text into the chat. The speaker concludes by highlighting that these best practices are available in their 'AI Labs Pro' community for users who want to implement these workflows. This final section underscores how the CLI approach allows for sophisticated data manipulation that was previously impossible with native MCP handling.
