Transcript
00:00:00The main problem with AI agents is the limited context window,
00:00:03which restricts what they remember from previous actions.
00:00:06When we give Claude Code a larger task,
00:00:08it compacts its context multiple times while attempting a single feature,
00:00:11forgetting the main task it was asked to implement,
00:00:14which makes it less effective for long-running tasks.
00:00:17Anthropic just released a solution that is based on how real teams work
00:00:20in an actual engineering environment.
00:00:22They identified two key reasons why agents fail on long tasks.
00:00:26The first is doing too much at once: many of us have tried to one-shot entire applications
00:00:29or some big features,
00:00:30and that causes the model to run out of context.
00:00:34After repeated compaction,
00:00:35the context window is refreshed while the feature is only half implemented,
00:00:39with no memory of the feature's progress,
00:00:41which leads to an incomplete implementation.
00:00:43The second issue is that, due to limited testing capabilities,
00:00:46Claude marks untested features as completed.
00:00:49It assumes the feature is complete, even if it doesn't actually work properly.
00:00:53Their solution was using an initializing agent and a coding agent in harmony,
00:00:57inspired by how real software teams work.
00:00:59This workflow is originally meant for agents you build yourself,
00:01:02but I realized it could apply to Claude Code instances as well.
00:01:06The first agent focuses on properly initializing your coding agent,
00:01:09and you have to be patient here because it takes a little time.
00:01:12I have an empty Next.js project and I want to build an online Python compiler.
00:01:16Before starting, create a CLAUDE.md file using the /init command.
00:01:20This file documents your codebase and sits at the root of your project,
00:01:24containing an overview and all important information.
00:01:27Next, generate the feature list JSON in the project root.
00:01:30It should list all features and their corresponding testing steps as well,
00:01:33with all tests marked as initially failing, so Claude is forced to test them.
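As a rough illustration of the feature list described here, the following Python sketch builds one and prints it as JSON. The field names (`id`, `description`, `steps`, `passes`) and the two sample features are assumptions made for this example; the video does not show the actual schema.

```python
import json


def make_feature(feature_id, description, steps):
    """Build one feature entry. Field names are hypothetical, not from the video."""
    return {
        "id": feature_id,
        "description": description,
        "steps": list(steps),  # testing steps for this feature
        "passes": False,       # every feature starts out marked as failing
    }


features = [
    make_feature(
        "run-code",
        "User can run Python code and see its output",
        [
            "Open the app in the browser",
            "Type print('hi') into the editor",
            "Click Run",
            "Check that the output panel shows hi",
        ],
    ),
    make_feature(
        "show-errors",
        "Runtime errors are shown to the user",
        [
            "Run the code 1/0",
            "Check that the output panel shows ZeroDivisionError",
        ],
    ),
]

# Serialize the list; in a real project this text would be saved in the project root.
feature_list_json = json.dumps(features, indent=2)
print(feature_list_json)
```

Starting every entry at `passes: False` mirrors the point made above: nothing counts as done until it has been tested.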
00:01:38We use JSON instead of Markdown
00:01:40because JSON files are easier to manage in the context.
00:01:43Since Claude can only test the code, not the interface we see in the browser,
00:01:46I connected Puppeteer for browser testing.
00:01:49After that, create an init script for starting the dev server
00:01:52and a progress-tracking file so Claude can keep track of the project's completion status.
00:01:57As guidelines, Claude needs to update progress.md after each run
00:02:02and test each feature after implementation.
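One way to picture the progress.md update is the Python sketch below. The entry layout, the helper names `format_progress_entry` and `append_progress`, and the sample feature are all assumptions for illustration; the video only says that progress.md is updated after each run.

```python
from datetime import date
from pathlib import Path


def format_progress_entry(feature_id, summary, tested, when):
    """Render one progress entry as Markdown text (layout is hypothetical)."""
    status = "tested, passing" if tested else "implemented, NOT yet tested"
    return (
        f"## {when.isoformat()} - {feature_id}\n"
        f"- Status: {status}\n"
        f"- Notes: {summary}\n"
    )


def append_progress(path, entry):
    """Append an entry to the progress file, creating the file if needed."""
    with Path(path).open("a", encoding="utf-8") as handle:
        handle.write(entry + "\n")


# Example entry; nothing is written to disk until append_progress is called.
entry = format_progress_entry(
    "run-code", "Run button sends code to the backend", True, date.today()
)
print(entry)
```

Appending, rather than rewriting the file, keeps a running history that a fresh session can read from top to bottom.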
00:02:04The most important practice is committing to Git.
00:02:07We underestimate how crucial it is to commit in a mergeable state.
00:02:10Git commits with clear logs show what's completed
00:02:13and let you revert if an implementation fails.
00:02:15Finally, Claude should not change the feature list
00:02:18beyond marking features as implemented.
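The rule that the feature list may only change in its completion flag could be checked mechanically. The sketch below is one possible check, not something shown in the video; it assumes each feature is a dictionary with an `id` key and a boolean `passes` key, both hypothetical names.

```python
def illegal_changes(before, after):
    """List edits to the feature list other than changes to the 'passes' flag."""
    problems = []
    before_by_id = {feature["id"]: feature for feature in before}
    after_by_id = {feature["id"]: feature for feature in after}

    for feature_id in before_by_id.keys() - after_by_id.keys():
        problems.append(f"feature removed: {feature_id}")
    for feature_id in after_by_id.keys() - before_by_id.keys():
        problems.append(f"feature added: {feature_id}")

    for feature_id in before_by_id.keys() & after_by_id.keys():
        # Compare everything except the completion flag.
        old = {k: v for k, v in before_by_id[feature_id].items() if k != "passes"}
        new = {k: v for k, v in after_by_id[feature_id].items() if k != "passes"}
        if old != new:
            problems.append(f"feature edited: {feature_id}")

    return sorted(problems)


before = [{"id": "run-code", "steps": ["Click Run"], "passes": False}]
allowed = [{"id": "run-code", "steps": ["Click Run"], "passes": True}]
tampered = [{"id": "run-code", "steps": [], "passes": True}]

print(illegal_changes(before, allowed))
print(illegal_changes(before, tampered))
```

A check like this could run before each commit, so that a session which rewrote or deleted test steps is caught early.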
00:02:20With the environment ready, we move to the coding part.
00:02:23The idea was to implement the features one by one from the feature list JSON.
00:02:27Claude wrote descriptive commit messages after each tested feature
00:02:31and launched the browser when needed.
00:02:33Once it verified the app was working,
00:02:35it updated the JSON fields from false to true
00:02:37and updated progress.md with what had been completed so far.
00:02:41Finally, it committed the changes and verified the commit was successful.
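The end of one iteration, flipping the flag and committing, might look roughly like the Python sketch below. The `passes` field, the helper names, and the commit message format are assumptions for this example. The git commands are returned as argument lists rather than executed, so the sketch has no side effects.

```python
import copy


def mark_passed(features, feature_id):
    """Return a copy of the feature list with one feature flipped to passing."""
    updated = copy.deepcopy(features)
    for feature in updated:
        if feature["id"] == feature_id:
            feature["passes"] = True
            return updated
    raise KeyError(f"unknown feature: {feature_id}")


def commit_commands(feature_id, summary):
    """Build the git commands for one tested feature (message format is hypothetical)."""
    message = f"feat({feature_id}): {summary}"
    return [
        ["git", "add", "-A"],                # stage code, feature list, and progress file
        ["git", "commit", "-m", message],    # descriptive commit message
        ["git", "log", "-1", "--oneline"],   # confirm the commit landed
    ]


features = [{"id": "run-code", "passes": False}]
updated = mark_passed(features, "run-code")
for command in commit_commands("run-code", "run Python code and show output"):
    print(" ".join(command))
```

Each list could be passed to `subprocess.run(command, check=True)` inside a real repository.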
00:02:45The advantage of this incremental approach is that even if the session terminates,
00:02:49you can resume exactly where you left off.
00:02:51Everything is tracked in the Git logs,
00:02:53so you don't have to worry about breaking code.
00:02:55Claude can understand the project from the Git logs and progress file,
00:02:59not from the code itself, so you can resume the session easily.
00:03:02Your next prompt is simply to implement the next feature that is still marked as not done.
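Picking the next feature to work on can be sketched in Python as follows. The `passes`, `description`, and `steps` fields and the prompt wording are assumptions for illustration, not taken from the video.

```python
def next_unfinished(features):
    """Return the first feature whose 'passes' flag is not True, or None if all are done."""
    for feature in features:
        if feature.get("passes") is not True:
            return feature
    return None


def build_prompt(feature):
    """Turn a feature entry into a short prompt for the next session."""
    steps = "\n".join(
        f"{number}. {step}" for number, step in enumerate(feature["steps"], start=1)
    )
    return (
        f"Implement the feature '{feature['id']}': {feature['description']}\n"
        f"Then verify it with these steps:\n{steps}"
    )


sample = [
    {"id": "run-code", "description": "Run code", "steps": ["Click Run"], "passes": True},
    {
        "id": "show-errors",
        "description": "Show runtime errors",
        "steps": ["Run the code 1/0", "Check the output panel"],
        "passes": False,
    },
]

todo = next_unfinished(sample)
if todo is not None:
    print(build_prompt(todo))
```

When `next_unfinished` returns `None`, every feature is marked true and the cycle can stop.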
00:03:06This approach also reduces Claude's tendency
00:03:09to mark features complete without proper testing.
00:03:11Each iteration ensures the app is built end-to-end with real testing,
00:03:15helping identify bugs that are not obvious from code alone.
00:03:19We repeat this cycle until all features are marked true.
00:03:22You might think this is similar to the BMAD method.
00:03:24It shares similarities, but I think Claude's workflow is better in some ways.
00:03:28It was easier since I didn't have to call agents separately,
00:03:31and context utilization was better too.
00:03:33After implementing so many features,
00:03:35it had used only 84% of the context window,
00:03:37where BMAD would have already compacted twice
00:03:40because of the large stories that it generates.
00:03:42That said, BMAD is still an out-of-the-box full system
00:03:45while this is still an idea that needs to be implemented.
00:03:48But BMAD could use some things from this, such as the Git system.
00:03:51After teaching millions of people how to build with AI,
00:03:54we started implementing these workflows ourselves.
00:03:57We discovered we could build better products faster than ever before.
00:04:00We help bring your ideas to life, whether they're apps or websites.
00:04:04Maybe you've watched our videos thinking,
00:04:06"I have a great idea, but I don't have a tech team to build it."
00:04:08That's exactly where we come in.
00:04:10Think of us as your technical co-pilot.
00:04:12We apply the same workflows we've taught millions directly to your project,
00:04:17turning concepts into real, working solutions
00:04:19without the headaches of hiring or managing a dev team.
00:04:22Ready to accelerate your idea into reality?
00:04:25Reach out at hello@autometer.dev
00:04:27That brings us to the end of this video.
00:04:29If you'd like to support the channel and help us keep making videos like this,
00:04:33you can do so by using the Super Thanks button below.
00:04:36As always, thank you for watching, and I'll see you in the next one.