Claude Code's HIDDEN Agent Swarm (Better Than Kimi K2.5?)

Better Stack

Transcript

00:00:00People are going crazy over Kimi K2.5. It's an open-source model that has some better benchmarks
00:00:05than Opus and an insanely clever Agent Swarm mode where an orchestrator can spawn up to 100
00:00:11specialized agents for a complex task. But did you know that this feature also exists in Claude Code
00:00:17behind a hidden flag and was discovered by a user on Twitter? How did someone discover this? And did
00:00:23Anthropic just steal this idea from Kimi? Hit subscribe and let's get into it. Anthropic announced
00:00:30custom sub-agents in July last year and since then people have been using them for all kinds of
00:00:35specialized tasks. We actually made a video about it back then too. But sub-agents themselves only
00:00:41have a snippet of the wider context since they're designed for a specialized task. So they do that
00:00:48task, return the data and have a fresh slate of memory. So people kind of implemented memory
00:00:54by getting the sub-agents to output their findings to a markdown file and also update a main context
00:01:01file. So if the same or a different sub-agent was asked to make an update they could just read those
00:01:06files and see where the other sub-agents had left off. But you still have to manually create a sub-agent
00:01:12giving it a role, access to specific skills, tools, permissions and so on. And this is why Kimi's new
00:01:19agent Swarm takes things to the next level because the orchestrator is the one that dynamically
00:01:25creates a specialized sub-agent for a specific task so you don't have to do anything. These sub-agents
00:01:31can work in parallel to complete an overall task and when they're done with their bit they can give
00:01:36it to the orchestrator who can decide if new sub-agents need to be spun up with that data in
00:01:42order to complete the complex task. Kimi's agent Swarm is still a research project but it's already
00:01:48showing great improvements compared to a single-agent workflow. I mean, look at this graph: as tasks
00:01:53get more complex, performance stays pretty much consistent because the agents work in parallel to
00:01:58complete the same thing. Now, if I'm being honest, you can kind of already do this in Claude Code
00:02:04so with the new-ish task feature you can create a list of tasks and fan them out to individual
00:02:10sub-agents. The problem is that these sub-agents are general purpose and not specialized for the
00:02:15specific task. I'm also not sure if Claude is automatically able to assign tasks to the correct
00:02:21custom sub-agent. Let me know in the comments if you've tried this already. But it looks like
00:02:25the Claude team have been working on a way for an orchestrator to automatically create sub-agents on
00:02:31the fly based on the task and this feature has been hidden behind a flag which was found by Mike Kelly
00:02:37who shows how it works in this tweet. And in the same tweet shares a link to a repo which is a fork
00:02:42of CC Mirror called Claude Sneak Peek. Let's try it. So this is a plan written by AI to create a
00:02:48web front-end for a tool called XDL that allows you to download videos from X or Twitter in the
00:02:55terminal. I've already installed and have Claude Sneak Peek running which you can see looks like
00:03:00a minimal version of Claude Code. I'm going to ask it to read the plan.md file and create tasks that
00:03:05can be executed by a swarm of sub-agents. Then I'll leave it to create the tasks and now it's finished
00:03:11creating the tasks I'm going to ask it to execute the tasks using sub-agents. Now before I do that
00:03:16just to confirm I don't have any custom sub-agents in place I'm going to run the agent slash command
00:03:21and you can see that there are no specialized or custom sub-agents in place. So now it's
00:03:26executing the tasks and here it automatically added a front-end builder sub-agent for the front-end
00:03:32tasks. And you can see here we have a team. If you press down to view the team, we can see we have five
00:03:37agents in place a team lead, QA tester, back-end builder, component builder and front-end builder
00:03:42all working on tasks at the same time. And we can also see what each agent on our team is working on.
00:03:48So we can see the QA tester is searching for patterns the back-end builder is also searching
00:03:53for patterns and reading files and so is the component builder and front-end builder. If we
00:03:57want to see exactly what our agent is doing we can hit enter and now we're in the agent's view
00:04:02and we can see its system prompt. If we go back we can see we now have eight agents so a component
00:04:07creator, an API server, someone doing the Vite setup, someone integrating the API and now we have
00:04:13someone doing CSS and our team of agents just seems to keep growing. If we hit enter on the team lead
00:04:18we can see we're back in the main Claude Code view, so the team lead is the main Claude Code orchestrator.
00:04:24We can also see in the main view that each sub-agent is giving us its current status
00:04:29and if I zoom out a tiny bit and scroll up we can see the messages sent previously from all
00:04:34the different agents. And now that all the tasks are complete we get a swarm project complete file
00:04:41which tells us everything that was done but we also get a swarm execution report which gives us
00:04:47the number of specialised agents that were used, their role and if they completed the task. We can
00:04:52also scroll down to see in detail exactly what each agent did. Now based on how much work the
00:04:59Claude team have already put into this feature, I don't think they copied Kimi. I think they saw
00:05:04community implementations online and wanted to add this to Claude Code natively, but I can
00:05:10totally understand why they haven't released it. Firstly, I don't think this feature has had the many
00:05:16hours of training that the Kimi K2.5 orchestrator has, and also things get really complicated for a
00:05:22user that already has some or even many sub-agents. For example if a user wants to complete a complex
00:05:28task how does the orchestrator know to create a brand new front-end sub-agent or use the user's
00:05:35existing sub-agent? What metrics or data is it using to make that judgment? And skills add even
00:05:42more complication. If a user already has a bunch of downloaded skills how would the orchestrator know
00:05:49to use them for a new agent or to download its own ones which may even be more appropriate for the
00:05:56task at hand? I mean this orchestrator if anthropic ever release it will have to go through a bunch of
00:06:02user data already, agents, tools, skills just before it can decide if it needs to make its own sub-agent
00:06:10and what things it should add to it. I actually don't know if the team are working on this feature
00:06:16right now as I speak or if they've decided it's too complicated and won't release it. I don't know.
00:06:22Speaking of features if you're using an AI or a human to rapidly add features to a project and you
00:05:28want to make sure things don't break, then you really need to check out Better Stack, because it can
00:05:33monitor logs on your servers and use anomaly detection to tell you if anything goes wrong
00:05:38before it does. It also has AI-native error tracking to let you know if anything goes wrong
00:05:44on your front end. So go and check out Better Stack today.

Key Takeaway

A hidden feature in Claude Code reveals that Anthropic is developing a dynamic 'Agent Swarm' orchestrator capable of automatically spawning specialized sub-agents to solve complex tasks in parallel, similar to the highly-touted Kimi K2.5 model.

Highlights

Discovery of a hidden "Agent Swarm" feature in Claude Code

Timeline

The Rise of Agent Swarms and the Claude Discovery

The speaker introduces the hype surrounding Kimi K2.5, an open-source model featuring an orchestrator that can spawn up to 100 specialized agents. This trend is contrasted with a recent discovery by a Twitter user who found a similar feature behind a hidden flag in Claude Code. The video explores whether Anthropic is following a similar path to Kimi's swarm architecture or developing its own unique approach. This section sets the stage by questioning whether Anthropic 'stole' the idea or had it in development. It establishes the importance of multi-agent orchestration in modern AI benchmarks.

Evolution of Claude's Sub-Agent Architecture

The video revisits Anthropic's July announcement of custom sub-agents, explaining how users previously managed specialized tasks. Early implementations required manual creation of agents, where users had to define specific roles, skills, and permissions. To maintain context, developers used markdown files to pass data between agents with 'fresh slates' of memory. This manual workflow served as a precursor to the current automated swarm experiments. The speaker explains the limitations of this 'snippet-based' context management compared to full dynamic orchestration.
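The markdown-based memory pattern described above can be sketched in a few lines. This is a hypothetical illustration of the workflow users built by hand; the file names (`context.md`, per-agent findings files) and structure are assumptions, not any actual Anthropic convention.

```python
# Hypothetical sketch of the manual markdown-memory pattern: each sub-agent
# writes its findings to its own file and appends to a shared context file,
# so a later agent with a fresh slate can see where others left off.
from pathlib import Path

CONTEXT_FILE = Path("context.md")  # assumed name for the shared "main context" file

def record_findings(agent_name: str, findings: str) -> None:
    """Persist a sub-agent's output and append it to the shared context."""
    Path(f"{agent_name}-findings.md").write_text(findings)
    with CONTEXT_FILE.open("a") as ctx:
        ctx.write(f"\n## {agent_name}\n{findings}\n")

def load_context() -> str:
    """A freshly spawned sub-agent reads the shared context on startup."""
    return CONTEXT_FILE.read_text() if CONTEXT_FILE.exists() else ""

record_findings("frontend-builder", "Scaffolded the UI; API routes still pending.")
print(load_context())
```

The key limitation the video points out still applies: the user has to define each agent's role, tools, and permissions by hand; only the hand-off of findings is automated.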

Kimi vs. Claude: Orchestration and Task Fanning

The narrator breaks down why Kimi's swarm mode is revolutionary, highlighting how the orchestrator dynamically creates agents so the user doesn't have to. While Claude Code currently has a 'task' feature to fan out work, these agents remain general-purpose rather than specialized for specific sub-tasks. A Twitter user named Mike Kelly is credited with finding the hidden orchestrator flag and creating the 'Claude Sneak Peek' repository. This section highlights the technical gap between research projects and consumer-ready features. It emphasizes that parallel processing allows performance to stay consistent even as task complexity increases.
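The fan-out idea can be illustrated with a minimal sketch: an orchestrator creates one specialized worker per task and runs them in parallel. The roles and tasks below are invented for illustration; the real orchestrator's internals are not public.

```python
# Conceptual sketch of dynamic fan-out: the orchestrator spins up a
# role-specific worker for each task and collects results in parallel.
from concurrent.futures import ThreadPoolExecutor

def make_agent(role: str):
    # In the real feature this would spawn a sub-agent with its own
    # system prompt; here it is just a closure tagged with its role.
    def run(task: str) -> str:
        return f"[{role}] completed: {task}"
    return run

tasks = {
    "frontend-builder": "build the web UI",
    "backend-builder": "wire up the API",
    "qa-tester": "write smoke tests",
}

with ThreadPoolExecutor() as pool:
    results = list(pool.map(lambda kv: make_agent(kv[0])(kv[1]), tasks.items()))

for line in results:
    print(line)
```

Because the workers run concurrently, total wall-clock time tracks the slowest task rather than the sum of all tasks, which is the mechanism behind the "performance stays consistent as complexity grows" claim in the graph.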

Live Demonstration of Claude's Hidden Swarm

A live demo showcases the 'Claude Sneak Peek' tool attempting to build a web front-end for a video downloader. The orchestrator automatically creates a team including a Team Lead, QA Tester, Back-end Builder, and CSS specialist without user intervention. Viewers see real-time updates as the team grows from five to eight agents, each working on distinct parts of the project like API integration and component creation. Upon completion, the system generates a 'Swarm Execution Report' listing every agent's role and success status. This segment proves the functional reality of the hidden feature and its ability to handle complex, multi-layered coding projects.
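The "Swarm Execution Report" from the demo could plausibly be modeled as a per-agent record of role and completion status. The field names and report format below are guesses for illustration, not the tool's actual output schema.

```python
# Illustrative shape of a swarm execution report: one record per agent,
# summarized into the kind of roll-up shown at the end of the demo.
from dataclasses import dataclass

@dataclass
class AgentRecord:
    name: str
    role: str
    completed: bool

def summarize(records: list[AgentRecord]) -> str:
    done = sum(r.completed for r in records)
    lines = [f"Swarm execution report: {done}/{len(records)} agents completed"]
    lines += [
        f"- {r.name} ({r.role}): {'done' if r.completed else 'incomplete'}"
        for r in records
    ]
    return "\n".join(lines)

report = summarize([
    AgentRecord("team-lead", "orchestrator", True),
    AgentRecord("qa-tester", "testing", True),
    AgentRecord("frontend-builder", "UI", True),
])
print(report)
```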

Technical Hurdles and Future Outlook

The speaker analyzes why Anthropic hasn't officially released the swarm feature yet, citing a lack of extensive training compared to Kimi's orchestrator. Significant complications arise when trying to integrate automated swarms with a user's existing custom sub-agents and downloaded skills. The orchestrator must decide whether to use a pre-existing agent or create a new one, which requires complex judgment based on user data. The video concludes with a sponsorship mention for Better Stack, a tool for monitoring server logs and error tracking using AI. Ultimately, the future of this native Claude feature remains uncertain due to these integration complexities.
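The reuse-vs-create decision raised here can be sketched as a toy heuristic: match the incoming task against descriptions of the user's existing sub-agents and only create a new specialized agent when nothing matches well enough. This word-overlap scoring is purely hypothetical; how the hidden orchestrator actually makes this judgment is unknown.

```python
# Hypothetical reuse-vs-create heuristic: score each existing sub-agent's
# description against the task and fall back to creating a new agent when
# no existing one clears the threshold.
def pick_agent(task: str, existing: dict[str, str], threshold: float = 0.5):
    def score(desc: str) -> float:
        task_words = set(task.lower().split())
        desc_words = set(desc.lower().split())
        return len(task_words & desc_words) / max(len(task_words), 1)

    best = max(existing.items(), key=lambda kv: score(kv[1]), default=None)
    if best and score(best[1]) >= threshold:
        return ("reuse", best[0])
    return ("create", f"new agent specialized for: {task}")

agents = {"frontend-builder": "build web UI components and pages"}
print(pick_agent("build the web UI", agents))
print(pick_agent("deploy to kubernetes", agents))
```

A real orchestrator would also have to weigh tools, permissions, and installed skills, which is exactly the complexity the video suggests is delaying release.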
