00:00:00If you've been following the channel you must be familiar with the different types of context
00:00:04engineering workflows that we've covered here. Well, Google also released another one. I wish
00:00:08I could say that it's better than other workflows. But the truth is that it's not. And there are many
00:00:13problems with this. Even if you argue that it's better for the Gemini ecosystem, it's still not
00:00:17good. Before we dive into why there was no need to release this, let's take a quick break to talk
00:00:22about Automata. After teaching millions of people how to build with AI, we started implementing these
00:00:27workflows ourselves. We discovered we could build better products faster than ever before. We help
00:00:32bring your ideas to life, whether it's apps or websites. Maybe you've watched our videos thinking
00:00:37I have a great idea but I don't have a tech team to build it. That's exactly where we come in.
00:00:42Think of us as your technical co-pilot. We apply the same workflows we've taught millions directly
00:00:48to your project, turning concepts into real working solutions without the headaches of hiring or
00:00:53managing a dev team. Ready to accelerate your idea into reality? Reach out at hello@automata.dev.
00:00:59Now, before I explain why this is just another poor attempt at a context engineering
00:01:04workflow, let's first dive into how Conductor actually works. So this is the article and I'll
00:01:09have a link for this down in the description below. At the end, you'll get a command to actually install
00:01:13this as an extension in Gemini CLI. For those of you who don't know, extensions are sets of commands,
00:01:18MCPs, and other rules that are bundled together and made into a package that people can then
00:01:23host and share with others. Claude also has something similar called plugins. So to actually
00:01:27start the workflow, you run that command and it installs. After installation, you can use
00:01:32Conductor's slash commands. You'll get these five commands that control Conductor and how
00:01:37you use the workflow. Now the very first command that you're going to use is the setup command.
00:01:41What this command does first is check whether Conductor's existing files, such as the setup state
00:01:46and the other files that tell it a project has already been initialized, are present.
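To make that setup check concrete, here is a minimal sketch of an "already initialized?" test. The folder name `conductor` comes from the video, but the marker file names below are assumptions for illustration, not Conductor's actual files.

```python
from pathlib import Path

# Hypothetical marker files; Conductor's real file names may differ.
STATE_FILES = ("setup_state.json", "product.md", "tech-stack.md")


def is_initialized(project_root: Path, folder: str = "conductor") -> bool:
    """Return True only if every assumed marker file exists in the folder."""
    base = project_root / folder
    return all((base / name).is_file() for name in STATE_FILES)
```

If the check returns False, a setup step would start from scratch; if True, it could resume from the saved state instead of asking everything again.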
00:01:51Instead of stories, it makes up these files called tracks and completes those one by one.
00:01:56After that, it initialized a new GitHub repo and asked what to build. To test it out, I created a
00:02:02simple project, but I did want to test whether the architecture it made would actually be good. So, to
00:02:07see whether it would recommend the things that I would actually need, I told it that the app should
00:02:11be production ready and scalable to a large number of users. After that, it created the product.md file
00:02:17which contained the actual concept of what I wanted to build. To actually refine and craft it, it
00:02:22started asking me questions and at the end, because the questions weren't actually leading anywhere and
00:02:27they were really simplistic, I just had it auto generate everything. After it approved and saved
00:02:32the product guide, it wanted to create another file which was the product guidelines which were mainly
00:02:36focused on the styling of the product and some design principles. It also approved that and saved
00:02:41the product guidelines as well. After that, it defined the technology stack and this is one of the
00:02:45reasons the workflow was not good. It messed up the tech stack that it was offering me: it knew
00:02:50what my whole project was, and it still didn't recommend what was appropriate. After I had that
00:02:55corrected, it also approved the tech stack and updated that MD file as well. It also has these
00:03:00files called code style guides. If I go into the actual folder, these are the only languages that it
00:03:05has and if it thinks we are going to be using any of these in the project, it adds them to our current
00:03:10project's code style guides during the initialization. The default workflow that it's using is actually
00:03:15pretty good. By default, it requires 80% test coverage, and while it was setting stuff up and
00:03:20writing the base components, it was making sure that the tests were being written as well, and after
00:03:25completing tasks, it was running them as well. At the same time, it was committing changes after every
00:03:30task and also using git notes so that we could track where and when something went
00:03:36wrong. After completing the initial setup, it created some high level product requirements so
00:03:40that we could get on the initial track. This is the first track that it was trying to implement.
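The default workflow described above (tests plus an 80% coverage bar, then a commit and a git note per task) can be sketched roughly like this. It is an illustration under assumptions, not Conductor's actual logic: `TaskResult`, `can_complete`, and `git_commands` are hypothetical names, and the function only builds the git commands rather than running them.

```python
from dataclasses import dataclass

COVERAGE_THRESHOLD = 0.80  # the 80% default mentioned in the video


@dataclass
class TaskResult:
    name: str
    tests_passed: bool
    covered_lines: int
    total_lines: int


def coverage(result: TaskResult) -> float:
    """Fraction of lines covered; 0.0 when there is nothing to measure."""
    if result.total_lines == 0:
        return 0.0
    return result.covered_lines / result.total_lines


def can_complete(result: TaskResult) -> bool:
    """A task only counts as done if tests pass and coverage meets the bar."""
    return result.tests_passed and coverage(result) >= COVERAGE_THRESHOLD


def git_commands(result: TaskResult, note: str) -> list[list[str]]:
    """Build (but do not run) the commit and git-notes commands for a finished task."""
    if not can_complete(result):
        return []
    return [
        ["git", "commit", "-am", f"task: {result.name}"],
        ["git", "notes", "add", "-m", note, "HEAD"],
    ]
```

Attaching a note to each task commit is what makes it possible to trace later which task introduced a problem.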
00:03:45Again, this was too broad and needed to be broken into smaller tracks. Doing this much in
00:03:50one track leaves a lot of room for the agent to mess up. So
00:03:55after you complete that, you can start your work by running the implement command and in the tracks
00:03:59folder, you have different tracks that it implements one by one. Each track has two files, a plan.md
00:04:05and a spec.md. The spec.md contains the objective and the technical details extracted from the tech
00:04:11stack and the information that we inputted at the start. The plan.md actually contains the tasks
00:04:16that it needs to implement one by one. When you run the implement command, it reads
00:04:20tracks.md and checks the status of each track, which tells it what
00:04:25to do. So if it's empty, it's not started. This means that it's in progress and this means that
00:04:30the track has been completed. And as you can see, this current track is in progress. As for the other
00:04:34commands, the status command gives you a status report of what is currently going on and which
00:04:39tracks are being followed and which ones are not complete. If you use the new track command, it's
00:04:43going to ask you the different questions again for the new task. I also tried it in a pre-existing
00:04:48repository and it went pretty much the same way. It was a little different because it would look at
00:04:52the existing files and just ask me clarifying questions and it didn't ask for a new track.
00:04:57I had to create a new track myself for the new feature. And then there's revert, another really
00:05:02clever feature that actually mitigates any damage and is git aware. So it uses git to help out if the
00:05:08agent messes up anywhere. Now, currently the file management and structure isn't that bad. The way
00:05:13it implements new features or existing tasks into tracks and then keeps track of them is actually
00:05:18pretty good. But the way the instructions in these command files have been
00:05:22written does need work, because they don't properly manage the context loop: checking
00:05:27everything and, if something changes, working out what needs to be updated. Even during
00:05:31this initial process, there were a lot of mistakes. The first mistake is that while it was asking for
00:05:36the creation of each document, it didn't really dissect my idea properly. And I had to guide it
00:05:41through a lot of the stuff. When I thought it was adequate, I just let it auto generate the rest of
00:05:46the content. And again, as I mentioned before, while defining the technology stack, it also missed a lot
00:05:50of things. Option B was good, but given that I had told it I wanted a fully scalable app with a large
00:05:55number of users, it left out a lot. I had to explicitly tell it what else was
00:06:00needed, and only then did it modify the plan. When the initial track was generated, I actually went in
00:06:05and looked at the plan and the specs that it had generated and the database schema was totally
00:06:10incomplete. It had missed a lot of things that were crucial to setting up the app and I had to
00:06:14guide it again and steer it in the right direction. Now, Gemini is actually a really good model. So I
00:06:19have to suspect that the way the commands have been written is what's making it behave this way.
00:06:23The biggest reason I believe that, even though the setup itself is actually good, there
00:06:27are a lot of problems in the main slash commands, and especially in workflow.md, is that it
00:06:33badly messed up after I told it that I wanted to switch from npm to
00:06:38pnpm, since I had forgotten to mention it earlier. For some reason, it tried to make a backup first.
00:06:43And while doing that, it stated that it needed to remove the files made with npm. But it ended
00:06:48up removing the entire conductor folder itself, which contained all the planning files. After
00:06:52deleting it, it kept looking for the folder. And when it couldn't find it, it said that
00:06:57it would reconstruct the conductor folder using its context and everything that it had in its memory.
00:07:02So basically, it had to rewrite everything, as opposed to what a normal context workflow should
00:07:07do, where a change should only affect the main context files and the files related to that
00:07:12specific task, which is what BMAD does to operate efficiently. Now, if I hadn't asked it to abruptly
00:07:17change something, maybe it would have gone well. But still, once all the tasks were set up
00:07:21and I asked it to start implementing the first track, it began initializing the project and the
00:07:26other core services that I needed. Now, when it came to configuring the environment variables for the
00:07:31Supabase connection, for some reason, it automatically marked the task as completed while
00:07:36clearly putting a dummy key in there. It didn't even ask me to set up the Supabase project or
00:07:40provide it with an actual key. And it automatically tried to push the database schema. Since there was
00:07:45no actual key, it failed. And then it asked me to double check the string. So even the tasks aren't
00:07:49being properly updated, and it wasn't really following them correctly. I honestly wouldn't use
00:07:54this right now for end-to-end spec development. BMAD is a much better option. And for small projects,
00:07:59I still make my own context files. That brings us to the end of this video. If you'd like to support
00:08:03the channel and help us keep making videos like this, you can do so by using the super thanks
00:08:08button below. As always, thank you for watching and I'll see you in the next one.