Claude Code Just Got a Brain Upgrade Inspired By Beads

Better Stack
Computing/Software, Small Business/Startups, Internet Technology

Transcript

00:00:00The Claude Code team have recently upgraded to-dos to tasks, which is huge news because
00:00:05it means each task has its own JSON file that can be updated and committed to GitHub.
00:00:11These tasks can run in parallel with sub-agents and multiple Claude Code sessions can share
00:00:16the same task list.
00:00:17Perfect for complex projects that have multiple tasks and need lots of sessions.
00:00:22But what does this mean for the super popular Ralph Wiggum loop?
00:00:26Does this make it obsolete?
00:00:28Not quite.
00:00:29Subscribe and let's get into it.
00:00:32Opus 4.5 changed the game in many ways.
00:00:35One thing it can do that you might not know about is its ability to run autonomously for
00:00:39much longer, keeping track of its state better than other models.
00:00:44Which means the classic to-do list you've seen in Claude Code before is pretty much
00:00:48not needed for small tasks.
00:00:50But for longer tasks, it still has a 200k context window, meaning it does have a smart zone and
00:00:56a dumb zone so it will output dumb results over the 80% mark.
00:01:02Check out my Ralph Wiggum video to learn more about the smart and dumb area for a model based
00:01:07on its context.
00:01:08Now at this stage you may grab a tool like Beads which stores tasks inside an SQLite
00:01:14database and puts them in a JSONL file to be committed for version control.
00:01:19The Beads tool is what greatly inspired the Claude Code team to upgrade to-dos to this
00:01:24new task management system which does a bunch of things from storing tasks in JSON files
00:01:30to letting you run them in multiple sessions and much more.
00:01:34But as cool as this upgrade is, it does work a bit differently from Beads and Ralph Wiggum.
00:01:39In fact, let me show you.
00:01:41So here is a plan file written by Claude Code containing three major changes I want
00:01:46to add to a tool called XDL to help you download videos from X or Twitter using a CLI.
00:01:54And for tasks to work you need to be on Claude Code version 2.1.6 or higher which contains
00:02:00these task management related tools.
00:02:03So I'm going to ask Claude to turn the plan file into a set of tasks to complete.
00:02:08And you can see here it's created the tasks, it's added some dependencies so tasks that
00:02:13are blocked by other tasks and it's put them here so it's highlighted in yellow the tasks
00:02:18that block the specific tasks.
00:02:20And if we go to the .claude directory on the root of our machine, we can see a tasks folder
00:02:26with another folder for our project.
00:02:29And if we open it, we can see all the tasks that have been created with an ID, subject, description,
00:02:36and what other tasks block this task, as well as all the tasks that are blocked by this task.
00:02:41And now we're going to ask Claude to run each task in a sub-agent which it's gone
00:02:45ahead and done.
00:02:46So task 1 is being done and so are tasks 8, 9 and 10 since they're not blocked by other
00:02:52tasks.
00:02:53And we can also see up here the different sub-agents working on different tasks.
00:02:57And now that all the tasks are done, I can check how much context was used and we can
00:03:01see only 18% was used because all the tasks were done in sub-agents.
00:03:06But here's something else you can do with the new task management system.
00:03:09If I wanted to run multiple sessions of Claude, in this case in different split panes but you
00:03:14could have them in different tabs or different servers, having access to the same task list,
00:03:19what I could do is set this environment variable, CLAUDE_CODE_TASK_LIST_ID, and give it the ID
00:03:26that matches the directory of the task list I want to use.
00:03:30So here Claude should have access to all the tasks in that directory and I could do the
00:03:34same in this session.
00:03:36So here I could ask one session to go through the tasks and another session to verify the
00:03:41task has been completed.
00:03:43If I run the session on the left, the session on the right should be able to see the progress
00:03:48of each task.
00:03:49And now that it's done on this side, this session over here can go ahead to verify that
00:03:53the task has been completed.
00:03:55This is actually really cool because you could start working on a task on one machine, stop,
00:04:00commit those tasks to GitHub or whatever version control system you prefer and then on another
00:04:06machine pull those tasks and continue from exactly where you left off.
00:04:10Which if you have experience with Beads, then you'll know this is similar to how it works,
00:04:15but not exactly because Beads stores the tasks in an SQLite database for very fast retrieval
00:04:23and it also syncs database tasks to a single JSONL file, not multiple JSON files.
00:04:29So you can add this single file to your project and share it with your team members.
00:04:33This is also a bit different from the Ralph Wiggum loop purely because of the philosophy.
00:04:39So with the Ralph loop, you have a single prompt and you have a list of tasks and these
00:04:43tasks are supposed to help you achieve that prompt, which you send to the model over and
00:04:48over again.
00:04:49But with this new task management system, you have a list of tasks and you ask the model
00:04:54to go ahead and pick the next one that it needs to do.
00:04:57So it reads through all the tasks to find out which one is next.
00:05:02This is somewhat alleviated if you have a sub-agent working on a single task, but if you
00:05:07want an autonomous loop that can go for as long as you want, where the model follows a
00:05:12North Star, which is in your prompt.md file, to continuously improve the project, even with
00:05:17tasks you haven't added, then the new task management system isn't for you.
00:05:22There's also the issue of documentation because at the time of recording, all the information
00:05:27about this feature is inside a single tweet.
00:05:30And compared to Beads, there isn't much in the way of a visualisation tool or a Kanban
00:05:34type tool to see the progress of each task, but I'm sure the Claude code community is
00:05:40working on this right now.
00:05:42And with all these new task management systems creating new software, you're going to need
00:05:47a way to make sure you're not shipping errors to your users.
00:05:50And this is where Better Stack comes in, which gives you a way to track errors on the backend
00:05:56and front end of your project using an AI native error tracker, as well as a status page to
00:06:02inform your users if your site goes down and a great incident management system.
00:06:08So go ahead and check out Better Stack today.

Key Takeaway

Claude Code's transition from simple to-do lists to structured, multi-session JSON tasks enables scalable, autonomous software development while optimizing context window efficiency.

Highlights

Claude Code version 2.1.6+ replaces simple to-do lists with a robust JSON-based task management system.

The update allows for parallel task execution using sub-agents.

Timeline

Introduction to Claude Code Task Management

The speaker introduces a major upgrade in Claude Code where traditional to-do lists are replaced by task-based JSON files. These files are significant because they can be committed to GitHub for version control and shared across multiple development sessions. The update allows for complex projects to run tasks in parallel using sub-agents, improving workflow speed. The speaker poses a question about whether this makes the popular "Ralph Wiggum loop" obsolete. This section sets the stage for a deep dive into the technical evolution of the tool.

Model Capabilities and Inspiration from Beads

This segment discusses the role of Opus 4.5 and its ability to handle autonomous tasks for longer durations through superior state tracking. The speaker mentions the "smart zone" and "dumb zone" of context windows, noting that Claude stays effective until about the 80% mark of its 200k capacity. A tool called Beads is highlighted as the primary inspiration for the Claude Code team due to its SQLite and JSONL task storage. Beads allows for efficient version control by syncing database tasks into a single shareable file. This context explains the design philosophy behind the new task architecture.
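
The storage pattern attributed to Beads here (tasks kept in SQLite, then synced to one JSONL file for version control) can be sketched in Python. This is only an illustration of the general idea, not Beads's actual schema or code: the table name `tasks`, its columns, the sample rows, and the helpers `demo_database` and `export_jsonl` are all assumptions made for the example.

```python
import json
import sqlite3


def demo_database() -> sqlite3.Connection:
    """Build an in-memory database with a hypothetical tasks table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tasks (id TEXT PRIMARY KEY, title TEXT, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO tasks VALUES (?, ?, ?)",
        [
            ("t-1", "First task", "open"),
            ("t-2", "Second task", "closed"),
        ],
    )
    conn.commit()
    return conn


def export_jsonl(conn: sqlite3.Connection) -> str:
    """Serialize every task row as one JSON object per line (JSONL)."""
    conn.row_factory = sqlite3.Row  # lets each row be converted to a dict
    rows = conn.execute("SELECT id, title, status FROM tasks ORDER BY id")
    return "".join(json.dumps(dict(row), sort_keys=True) + "\n" for row in rows)


if __name__ == "__main__":
    # The resulting text is what would be written to a single file and committed.
    print(export_jsonl(demo_database()), end="")
```

Because each task occupies exactly one line, a change to one task shows up as a one-line diff in version control, which is one plausible reason to prefer a single JSONL file over a binary database file for sharing.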

Demonstration of Task Creation and Sub-Agents

The speaker demonstrates how to turn a plan file into a set of actionable tasks using Claude Code version 2.1.6. The system automatically handles dependencies, marking blocked tasks in yellow and creating a structured directory in the .claude folder. Each task is assigned a unique ID, description, and metadata regarding its relationship to other tasks. By running these tasks through sub-agents, the system completed the work while only consuming 18% of the context window. This practical example showcases the efficiency gains provided by the new parallel execution model.
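
The dependency handling described above can be sketched in Python. This is a minimal illustration, not Claude Code's documented file format: the field names `id`, `subject`, `description`, `blocks`, and `blockedBy` are guesses based on what the speaker reads off the screen, and the `status` field, the sample tasks, and the helper functions are assumptions added for the example.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical demo data; field names are assumptions, not a documented schema.
DEMO_TASKS = [
    {"id": "1", "subject": "First change", "description": "",
     "status": "pending", "blocks": ["2"], "blockedBy": []},
    {"id": "2", "subject": "Follow-up change", "description": "",
     "status": "pending", "blocks": [], "blockedBy": ["1"]},
    {"id": "3", "subject": "Independent change", "description": "",
     "status": "pending", "blocks": [], "blockedBy": []},
]


def write_tasks(task_dir: Path, tasks: list[dict]) -> None:
    """Write one JSON file per task, mirroring the one-file-per-task layout."""
    for task in tasks:
        path = task_dir / f"{task['id']}.json"
        path.write_text(json.dumps(task, indent=2), encoding="utf-8")


def load_tasks(task_dir: Path) -> dict[str, dict]:
    """Read every *.json file in task_dir into a dict keyed by task id."""
    tasks = {}
    for path in task_dir.glob("*.json"):
        task = json.loads(path.read_text(encoding="utf-8"))
        tasks[task["id"]] = task
    return tasks


def ready_tasks(tasks: dict[str, dict]) -> list[str]:
    """Return ids of pending tasks whose blockers are all completed."""
    done = {tid for tid, t in tasks.items() if t["status"] == "completed"}
    return sorted(
        tid
        for tid, t in tasks.items()
        if t["status"] == "pending" and set(t["blockedBy"]) <= done
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        task_dir = Path(tmp)
        write_tasks(task_dir, DEMO_TASKS)
        # Tasks 1 and 3 have no blockers, so they could start in parallel.
        print(ready_tasks(load_tasks(task_dir)))
```

Tasks returned by `ready_tasks` are the ones that could be handed to separate sub-agents at the same time, which matches the behaviour in the demo where tasks 1, 8, 9 and 10 start together because nothing blocks them.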

Multi-Session Collaboration and Git Integration

This section explains how to use environment variables like CLAUDE_CODE_TASK_LIST_ID to share task lists between different terminal panes or machines. One session can be dedicated to executing tasks while another session is used simultaneously for verification and quality control. The speaker emphasizes the benefit of committing these JSON tasks to GitHub, allowing developers to pull the project on a different machine and resume exactly where they left off. This creates a seamless handover process for teams or individual developers moving between environments. The comparison to Beads is revisited to note differences in file formats and retrieval speeds.
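
A small Python sketch of the shared-list setup follows. The variable name `CLAUDE_CODE_TASK_LIST_ID` comes from the video; everything else is an assumption for illustration: the `~/.claude/tasks/<id>` layout is inferred from the directory the speaker opens, and `task_list_dir`, `session_env`, and the example ID `demo-list` are hypothetical.

```python
import os
from pathlib import Path
from typing import Optional

TASK_LIST_ENV_VAR = "CLAUDE_CODE_TASK_LIST_ID"


def task_list_dir(task_list_id: str, home: Optional[Path] = None) -> Path:
    """Directory assumed to hold the task files for the given list id."""
    base = home if home is not None else Path.home()
    return base / ".claude" / "tasks" / task_list_id


def session_env(
    task_list_id: str, base_env: Optional[dict[str, str]] = None
) -> dict[str, str]:
    """Return a copy of the environment with the shared task list id set."""
    env = dict(os.environ if base_env is None else base_env)
    env[TASK_LIST_ENV_VAR] = task_list_id
    return env


if __name__ == "__main__":
    env = session_env("demo-list")
    print(env[TASK_LIST_ENV_VAR])
    print(task_list_dir("demo-list"))
    # Each session would then be started with this environment, for example:
    #   subprocess.run(["claude"], env=env)
    # (left commented out because it requires the Claude Code CLI to be installed)
```

Starting two sessions with the same value is what lets one act as the executor and the other as the verifier, since both read and update the same set of task files.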

Comparison with Ralph Wiggum and Conclusion

The speaker clarifies the philosophical difference between the new task system and the Ralph Wiggum loop, noting that the latter follows a "North Star" prompt for continuous improvement. While the new system is excellent for specific, pre-defined lists, the Ralph loop remains relevant for autonomous projects that need to evolve beyond a static task list. Current limitations are mentioned, such as the lack of visual Kanban tools and documentation being limited to social media. The video concludes with an endorsement of Better Stack for tracking errors in projects built with these AI tools. This summary helps users choose the right methodology for their specific software development needs.
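
The difference in control flow can be sketched in Python. This is a simplified model of the two approaches as described in the video, not an implementation of either tool: the `agent` and `pick_next` callables stand in for a model call and are hypothetical, as are the function names.

```python
from typing import Callable


def ralph_loop(agent: Callable[[str], str], prompt: str, iterations: int) -> list[str]:
    """Send the same prompt on every pass; the agent decides what to improve."""
    return [agent(prompt) for _ in range(iterations)]


def task_list_run(pick_next: Callable[[list[str]], str], tasks: list[str]) -> list[str]:
    """Let the agent pick from the remaining tasks until none are left."""
    pending = list(tasks)
    completed = []
    while pending:
        chosen = pick_next(pending)  # the agent reads the whole list each time
        pending.remove(chosen)
        completed.append(chosen)
    return completed


if __name__ == "__main__":
    # Stub agents so the sketch runs without any model access.
    print(ralph_loop(lambda prompt: f"worked on: {prompt}", "Improve the project", 2))
    print(task_list_run(lambda pending: pending[0], ["a", "b", "c"]))
```

In this model the task-list run stops when the list is empty, while the Ralph-style loop keeps going for as many iterations as it is given, which is the distinction the speaker draws between a pre-defined list and an open-ended goal.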
