00:00:00 People are going crazy over Kimi 2.5. It's an open-source model that has some better benchmarks
00:00:05 than Opus and an insanely clever Agent Swarm mode where an orchestrator can spawn up to 100
00:00:11 specialized agents for a complex task. But did you know that this feature also exists in Claude Code
00:00:17 behind a hidden flag and was discovered by a user on Twitter? How did someone discover this? And did
00:00:23 Anthropic just steal this idea from Kimi? Hit subscribe and let's get into it. Anthropic announced
00:00:30 custom sub-agents in July last year, and since then people have been using them for all kinds of
00:00:35 specialized tasks. We actually made a video about it back then too. But sub-agents themselves only
00:00:41 see a snippet of the wider context, since they're designed for one specialized task: they do that
00:00:48 task, return the data, and then start fresh with a clean slate of memory. So people kind of implemented memory
00:00:54 by getting the sub-agents to output their findings to a markdown file and also update a main context
00:01:01 file. That way, if the same or a different sub-agent was asked to make an update, it could just read those
00:01:06 files and see where the other sub-agents had left off. But you still have to manually create a sub-agent,
00:01:12 giving it a role, access to specific skills, tools, permissions and so on. And this is why Kimi's new
00:01:19 Agent Swarm takes things to the next level: the orchestrator is the one that dynamically
00:01:25 creates a specialized sub-agent for a specific task, so you don't have to do anything. These sub-agents
00:01:31 can work in parallel to complete an overall task, and when they're done with their bit they can hand
00:01:36 it to the orchestrator, which can decide if new sub-agents need to be spun up with that data in
00:01:42 order to complete the complex task. Kimi's Agent Swarm is still a research project, but it's already
00:01:48 showing great improvements compared to a single-agent workflow. I mean, look at this graph: as tasks
00:01:53 get more complex, performance stays pretty much consistent, because the agents are working in parallel
00:01:58 to complete the same thing. Now, if I'm being honest, you can kind of already do this in Claude Code.
00:02:04 So with the new-ish task feature, you can create a list of tasks and fan them out to individual
00:02:10 sub-agents. The problem is that these sub-agents are general-purpose and not specialized for the
00:02:15 specific task. I'm also not sure if Claude is automatically able to assign tasks to the correct
00:02:21 custom sub-agent. Let me know in the comments if you've tried this already. But it looks like
00:02:25 the Claude team have been working on a way for an orchestrator to automatically create sub-agents on
00:02:31 the fly based on the task. This feature has been hidden behind a flag, which was found by Mike Kelly,
00:02:37 who shows how it works in this tweet. In the same tweet he shares a link to a repo, which is a fork
00:02:42 of CC Mirror called Claude Sneak Peek. Let's try it. So this is a plan written by AI to create a
00:02:48 web front-end for a tool called XDL that allows you to download videos from X, or Twitter, in the
00:02:55 terminal. I've already installed Claude Sneak Peek and have it running, which as you can see looks like
00:03:00 a minimal version of Claude Code. I'm going to ask it to read the plan.md file and create tasks that
00:03:05 can be executed by a swarm of sub-agents. Then I'll leave it to create the tasks. Now that it's finished
00:03:11 creating the tasks, I'm going to ask it to execute them using sub-agents. But before I do that,
00:03:16 just to confirm I don't have any custom sub-agents in place, I'm going to run the /agents slash command,
00:03:21 and you can see that there are no specialized or custom sub-agents in place. So now it's
00:03:26 executing the tasks, and here it automatically added a front-end builder sub-agent for the front-end
00:03:32 tasks. And you can see here we have a team. If you press down to view the team, we can see we have five
00:03:37 agents in place: a team lead, QA tester, back-end builder, component builder and front-end builder,
00:03:42 all working on tasks at the same time. And we can also see what each agent on our team is working on.
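What's happening on screen, an orchestrator writing a role-specific system prompt for each task and running the workers in parallel, can be sketched roughly like this in Python. To be clear, every name here (`run_agent`, `make_specialist_prompt`) is a hypothetical stand-in, not Claude Code's real API:

```python
import asyncio

# Hypothetical stand-in for a model call; a real system would hit an LLM API here.
async def run_agent(system_prompt: str, task: str) -> str:
    await asyncio.sleep(0.01)  # simulate model latency
    return f"[{system_prompt.split(':')[0]}] finished: {task}"

def make_specialist_prompt(role: str) -> str:
    # The orchestrator writes a role-specific system prompt on the fly,
    # instead of the user pre-defining custom sub-agents.
    return f"{role}: You are a specialized sub-agent. Complete only your assigned task."

async def orchestrate(tasks: dict[str, str]) -> list[str]:
    # Fan the tasks out to dynamically created specialists, all running in parallel.
    coros = [
        run_agent(make_specialist_prompt(role), task)
        for role, task in tasks.items()
    ]
    return await asyncio.gather(*coros)

results = asyncio.run(orchestrate({
    "front-end builder": "scaffold the web UI",
    "back-end builder": "wrap the XDL CLI in an API",
    "QA tester": "write smoke tests",
}))
for line in results:
    print(line)
```

The point of the sketch is just the shape of the pattern: roles and prompts are generated from the tasks, and the workers run concurrently rather than one after another.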
00:03:48 So we can see the QA tester is searching for patterns, the back-end builder is also searching
00:03:53 for patterns and reading files, and so are the component builder and front-end builder. If we
00:03:57 want to see exactly what an agent is doing, we can hit enter, and now we're in the agent's view,
00:04:02 where we can see its system prompt. If we go back, we can see we now have eight agents: a component
00:04:07 creator, someone building the API server, someone doing the Vite setup, someone integrating the API, and now
00:04:13 someone doing CSS. Our team of agents just seems to keep growing. If we hit enter on the team lead,
00:04:18 we can see we're back in the main Claude Code view, so the team lead is the main Claude Code orchestrator.
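That growing team matches the loop described earlier: each sub-agent hands its result back to the orchestrator, which decides whether new specialists need to be spun up. Here's a minimal mock of that idea (again, my own illustration, not Claude Code's actual internals):

```python
def run_team(initial_tasks):
    # The team lead holds a queue of tasks; finishing one task can reveal
    # follow-up work that needs a brand-new specialist.
    queue = list(initial_tasks)
    completed = []
    while queue:
        role, task = queue.pop(0)
        result = f"{role} did: {task}"  # stand-in for an actual sub-agent run
        completed.append(result)
        # Decide, based on what came back, whether new specialists are needed.
        if task == "scaffold UI":
            queue.append(("css-stylist", "style components"))
            queue.append(("api-integrator", "wire UI to API"))
    return completed

done = run_team([("front-end-builder", "scaffold UI"), ("qa-tester", "run tests")])
print(len(done))  # 4: two initial tasks plus two spawned along the way
```

Two tasks go in, but four agents end up running, which is exactly the "team keeps growing" behaviour in the demo.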
00:04:24 We can also see in the main view that each sub-agent is giving us its current status,
00:04:29 and if I zoom out a tiny bit and scroll up, we can see the messages sent previously from all
00:04:34 the different agents. And now that all the tasks are complete, we get a swarm project complete file,
00:04:41 which tells us everything that was done, but we also get a swarm execution report, which gives us
00:04:47 the number of specialized agents that were used, their roles and whether they completed their tasks. We can
00:04:52 also scroll down to see in detail exactly what each agent did. Now, based on how much work the
00:04:59 Claude team have already put into this feature, I don't think they copied Kimi. I think they saw
00:05:04 implementations online, like BMAD, and wanted to add it to Claude Code natively, but I can
00:05:10 totally understand why they haven't released it. Firstly, I don't think this feature has had the many
00:05:16 hours of training that the Kimi 2.5 orchestrator has, and also things get really complicated for a
00:05:22 user that already has some, or even many, sub-agents. For example, if a user wants to complete a complex
00:05:28 task, how does the orchestrator know whether to create a brand-new front-end sub-agent or use the user's
00:05:35 existing sub-agent? What metrics or data is it using to make that judgment? Skills add
00:05:42 more complication too. If a user already has a bunch of downloaded skills, how would the orchestrator know
00:05:49 whether to use them for a new agent or to download its own, which may even be more appropriate for the
00:05:56 task at hand? I mean, this orchestrator, if Anthropic ever release it, will have to go through a bunch of
00:06:02 the user's existing data (agents, tools, skills) before it can decide if it needs to make its own sub-agent
00:06:10 and what it should add to it. I actually don't know if the team are working on this feature
00:06:16 right now as I speak, or if they've decided it's too complicated and won't release it. I don't know.
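For reference, the markdown-file memory trick mentioned at the start, where sub-agents persist their findings so later agents can read where others left off, can be sketched like this (the file names and helper functions are made up purely for illustration):

```python
from pathlib import Path

FINDINGS_DIR = Path("findings")          # one markdown file per sub-agent run
CONTEXT_FILE = Path("main-context.md")   # shared context every agent reads first

def save_findings(agent_name: str, findings: str) -> None:
    # Persist this agent's findings, then append a summary line to the
    # shared context so the next agent can see where things left off.
    FINDINGS_DIR.mkdir(exist_ok=True)
    (FINDINGS_DIR / f"{agent_name}.md").write_text(f"# {agent_name}\n\n{findings}\n")
    with CONTEXT_FILE.open("a") as ctx:
        ctx.write(f"- {agent_name}: {findings.splitlines()[0]}\n")

def load_context() -> str:
    # A freshly spawned sub-agent starts with no memory; it rebuilds context
    # by reading the shared file before doing its own task.
    return CONTEXT_FILE.read_text() if CONTEXT_FILE.exists() else ""

save_findings("front-end-builder", "Scaffolded the UI.\nUsed a component library.")
save_findings("qa-tester", "Smoke tests pass.")
print(load_context())
```

Each agent gets its own findings file for detail, while the shared context file stays short, which is why a brand-new agent with zero memory can still pick up mid-project.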
00:06:22 Speaking of features: if you're using an AI, or a human, to rapidly add features to a project, and you
00:06:28 want to make sure things don't break, then you really need to check out Better Stack, because it's able to
00:06:33 monitor logs on your servers and use anomaly detection to warn you about problems
00:06:38 before anything goes wrong. And it also has AI-native error tracking to let you know if anything breaks
00:06:44 on your front end. So go and check out Better Stack today.