The Blueprint For The Complete Claude Operating System

AAI LABS
Computing/Software, Internet Technology

Transcript

00:00:00Imagine you're a medieval king. You've got a whole kingdom to run but you'd rather do absolutely
00:00:04nothing while other people handle it for you. The problem is you can't because your staff are used
00:00:10to being spoon-fed. What you really need instead is a system that runs the entire kingdom on its
00:00:15own and that is exactly what Claude Code has become. Ever since Anthropic has been shipping
00:00:19updates, it stopped being just a coding agent and turned into a full operating system, one that
00:00:25coordinates everything on your machine. But dynamic workflows are what actually tie it all together.
00:00:30So before our king hands his whole kingdom to an agent, let's see how this thing actually works.
00:00:35Ever since Anthropic started shipping new ways for us to waste tokens, which is really just their
00:00:40excuse to make more money off Claude Code, it's become way more than just a coding agent. It's
00:00:44basically a full operating system now. Just like how an operating system forms the foundation of every
00:00:50task and coordinates what you do on your machine, Claude Code now plays that same role. It coordinates
00:00:55and controls everything you do on it. But before we dive into how dynamic workflows complete this
00:01:00system, you need to know about the other components. The only difference between a
00:01:04computer operating system and the Claude Code operating system is that you don't have to work
00:01:08that hard on the setup unless you're actually using Arch. And no, you won't be installing a
00:01:12shitload of drivers just to make the microphone work so that you can voice prompt like a vibes god.
00:01:17And just like a real OS, it's made up of multiple parts. Each one is important enough that the
00:01:22system isn't complete without it. In an OS, the kernel is the most important layer and forms the core and
00:01:28controls all operations. The equivalent in Claude Code is the CLAUDE.md file and your context files.
00:01:33We already made a complete video talking about how to structure your CLAUDE.md file so that your agent
00:01:39performs at its best. That matters here because the kernel is the driving program of your whole agent.
00:01:44If it isn't set up properly, the agent can't figure out what your project actually wants. And the other
00:01:48parts fall apart with it. Kind of like how your whole life falls apart when you get married. Then there
00:01:53are the drivers, the pieces that let the system interact with external devices. The equivalent in
00:01:58Claude Code is MCP. So whenever Claude needs an external tool, it reaches for it through MCP and calls
00:02:04that tool to do the job. After that come the everyday programs, which in Claude Code are the skills and
00:02:09other commands. These hold structured instructions for repeatable tasks and you can invoke them whenever you
00:02:14need them. Every OS also needs a scheduler or cron job that runs a specific task at a scheduled time.
00:02:20In the same way Claude Code recently added loops and routines. These are basically its cron jobs and
00:02:25they remove the need for you to monitor it through a task. They automate the repetitive work you would
00:02:29otherwise do by hand. So even if your system goes off, the tasks keep running on their own. So you can
00:02:34sleep peacefully knowing that your B2B SaaS application that literally no one is using is being looked
00:02:40after. And last but most importantly, there's the one piece that ties all of them together into a
00:02:45complete operating system. That piece is the dynamic workflow, the new feature that shipped with Opus
00:02:504.8. You might already know that Claude Code has dynamic workflows. Basically, they're another attempt
00:02:55by Anthropic at simplifying long-running tasks. They work as repeatable instructions that spawn multiple
00:03:01agents to perform the task they're designed for. So how is it different from the other architectures you
00:03:06already have? To compare them, the first and simplest one is skills. Skills are repeatable instructions for
00:03:11tasks that need guided steps. But a skill is spawned by one agent and that same agent reads the instructions
00:03:17from it. It just guides the agent to do a task it already knows in a better way and doesn't help with
00:03:22long-running tasks. It's just one agent doing the whole thing. Then there is the goal command. It
00:03:27iterates toward a predefined end goal and the agent loops until the end condition is reached. This was an
00:03:32exceptional attempt at making long-running tasks better. We've been using it a lot in our own workflows ever
00:03:38since it was released. Both goal and workflow can coordinate multiple agents, but they're different.
00:03:43The core thing that separates them is determinism. Goal is non-deterministic, meaning the system decides
00:03:48what to do next. A workflow is deterministic and the code decides exactly what happens. You create your
00:03:54first workflow just by using the keyword workflow. From that word in your prompt, Claude identifies the
00:03:59dynamic workflow needed for the task, but this is a word we use all the time in prompts, so you might
00:04:04think it would trigger every time. It won't though, unless the prompt actually expresses the intention
00:04:09to create one. This is where workflows are actually different. Instead of the usual markdown that others
00:04:14use, it creates JavaScript code. It lives inside the workflow directory within the .claude folder,
00:04:19and it uses that entire script to control the whole thing. So instead of your plan living in the
00:04:23context window, that plan is written down in code, defining how the sub-agents will work step by
00:04:28step. It defines strict schemas, which are basically forms for the sub-agents, so that they give the
00:04:33output in a strict format. Each agent is called with the prompt and the form it has to satisfy. It keeps
00:04:39working until the output matches that schema, then returns its findings. You invoke them with the slash
00:04:44command with the workflow name, then you can hand it the plan you want to stress test. It runs in the
00:04:49background so you can carry on with your own work or give it another prompt, so that your project manager
00:04:53feels proud about your AI productivity for once. To check the progress, you just run the workflow
00:04:58command. There you can see every stage of each workflow and all the models each agent has invoked,
00:05:03and see how many tokens each task has burnt through. And if your session ends while a workflow is running,
00:05:08you don't have to worry about losing progress. It persists after you run the resume command. Each workflow
00:05:14keeps its own ID. And when you resume, it pulls all the cached agent work back from memory and picks up
00:05:19where it left off. Unlike my grandma, it doesn't just forget to pay the Claude AI bill and actually
00:05:24remembers what it needs to do. One thing to note before you use a workflow. Since this is in research
00:05:29preview, dynamic workflows consume way more tokens than a typical Claude Code session. That's because
00:05:35they use multiple sub-agents under the hood and each one runs in its own separate context window. You need
00:05:40to carefully consider when you actually need them, or else you'll run out of your $200 plan in a few
00:05:45hours. There are a few key metrics that tell you whether a workflow is the best option. The first
00:05:50is that the task can be split into independent units. If the agents depend on each other's work,
00:05:55they end up waiting around, and there's no point spawning a workflow because you lose all the
00:06:00parallelism. This is why, if the tasks are less dependent on one another, you get better parallelism and
00:06:05faster results. Which your startup should learn from as it's still dependent on your parents' money
00:06:10to survive. The next reason to use dynamic workflows is if the task needs more than a single context
00:06:15window to run and needs to be divided into chunks. Workflows use multiple sub-agents, each with its
00:06:21own context window, so the task should be big enough to actually need those separate windows. Otherwise,
00:06:26you'll just be wasting time and tokens. Each sub-agent runs in its own fresh context and returns
00:06:31only the result. The rest of its reasoning stays in the code file and never enters the main context window
00:06:36unless you need it. The next reason is that the task is worth verifying. Use a workflow when a wrong answer
00:06:41is expensive enough that it needs cross-verification before you move forward. That includes things like
00:06:46security findings, bug claims, and migrations. But that verification costs extra agents which burn
00:06:52tokens and time. So make sure the task is actually worth it and you're not just spawning five agents
00:06:57because you recently heard an AI tech CEO say that more tokens equals more money. The last reason is that
00:07:03your task is deterministic. A workflow uses code to call agents in a fixed structure. So if the task is
00:07:09deterministic, go for it. If the task is not deterministic and needs an agent to evaluate what
00:07:14the next task would be at runtime, workflows are not for that. So when you choose between workflow and
00:07:20goal, think about the shape of the task. A task can be wide or deep. Wide means it can be broken into many
00:07:25subtasks that can run at the same time. Deep means one task at a time, going step by step further into it.
00:07:32A workflow is wide, so instead of going deeper, it just calls the agents and lets them iterate. For deep
00:07:37tasks, the goal command takes one task at a time and does not run things in parallel the way workflows
00:07:43do. Only reach for a workflow once the task genuinely fits, so you don't waste tokens.
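
One way to read the checklist above, sketched in Python. `TaskShape`, its field names, and `pick_tool` are hypothetical names made up for illustration and not anything Claude Code exposes. The "single session" fallback is also an assumption: the video only says that a task which doesn't fit wastes tokens.

```python
from dataclasses import dataclass


@dataclass
class TaskShape:
    """Hypothetical description of a task; the field names are illustrative."""
    independent_units: bool    # can the work be split into parts that don't wait on each other?
    exceeds_one_context: bool  # is it too big for a single context window?
    worth_verifying: bool      # is a wrong answer expensive enough to cross-check?
    fixed_structure: bool      # are the steps known up front (deterministic)?


def pick_tool(task: TaskShape) -> str:
    """Return 'workflow', 'goal', or 'single session' for a task."""
    if not task.fixed_structure:
        # The next step has to be decided at runtime, which the video assigns to goal.
        return "goal"
    if task.independent_units and (task.exceeds_one_context or task.worth_verifying):
        # Wide, parallelizable, and big or risky enough to justify extra agents.
        return "workflow"
    # Assumed fallback: a plain session is cheaper than spawning sub-agents.
    return "single session"
```

For example, `pick_tool(TaskShape(True, True, False, True))` picks a workflow, while the same task with `fixed_structure=False` falls back to goal.
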
00:07:48Claude Code already ships with a built-in dynamic workflow called Deep Research. It's basically the
00:07:53multi-step research pipeline we used to build by hand with multiple context files and CLAUDE.md. Now
00:07:58it's just a workflow you can invoke from any project. This research forms a key part of the whole OS you
00:08:04build. It makes sure the information sources behind that OS are trustworthy, so your mom can't feed you
00:08:09fake info from her boomer Facebook group and then scold you when you fact check her. It runs in five
00:08:14parts and each one leads into the next. First, it searches for information, then fetches the details
00:08:19from the sources it finds. After that comes adversarial verification to cross-validate the claims,
00:08:24and it synthesizes whatever survives into one final document. You can watch it work from the
00:08:29workflows command, where each sub-agent inherits its tools from the parent, and it's really token
00:08:34intensive, so it can burn through your whole limit in no time. This one run took a million tokens on a
00:08:39small topic. Aside from multi-step research, you can build other research workflows that become part of
00:08:45your research system. One we made for ourselves researches competitors, checks how they're performing,
00:08:49and finds the competitive edge they have. This is an important piece if you're a product builder. You
00:08:54need to know how your competitors are performing in the market so that you can build something better.
00:08:59This one is split into four phases, like the research workflow, and once it finishes, it reports back
00:09:04the findings. Our run used 679,000 tokens and 34 agents and wrote a full markdown report with its findings.
00:09:11It also improves itself as it goes. When it hits an issue, it applies a fix, so the next time you run it,
00:09:17it doesn't run into the same issues it did the first time. The report comes with clearly defined
00:09:21comparison metrics and all of its findings, so when you build your product, you can use it as a source
00:09:26for analyzing the market before launching it. Also, if you are enjoying our content, consider pressing
00:09:30the hype button because it helps us create more content like this and reach out to more people.
00:09:35Every operating system needs its kernel, its drivers, and the pieces that make it complete. Together,
00:09:41they let it run without your input. One example of such a system is a second brain setup. This is
00:09:45definitely useful if your first one like ours got completely f***ed from sitting unused ever since
00:09:50our devices got blessed with LLMs. The kernel of this second brain becomes your CLAUDE.md,
00:09:55which holds the information on how to navigate the whole system. The everyday programs, the repeatable
00:10:01things are your skills. They carry the instructions for the tasks you do over and over. Here is the best
00:10:06way to set one up. When you are deep in a long session and realize this is something you will do often,
00:10:11just ask Claude to combine the learnings from that session into a skill. The memory of this OS is all
00:10:16the files you create and maintain in your vault. They record what you do and how you do it. That
00:10:21means it knows more about you than you do yourself and they give Claude context on everything you are
00:10:25working on. We often need the second brain to reach external sources, so we've configured the
00:10:29Google Calendar and Notion MCPs. That way it can access the project files in Notion and sync the data,
00:10:35read the schedule on the calendar, and create and update entries so that it can fit some touching grass in
00:10:41between your already busy schedule. We've documented the exact formats it should follow in the CLAUDE.md
00:10:46file. The most important part is creating the workflows for your setup. These let you parallelize
00:10:51your repeatable tasks and hand them to sub-agents. The morning brief workflow we built spins up sub-agents
00:10:57to gather information across multiple sources and returns a brief to start our day. Once all this is set
00:11:02up, you just give it a prompt. It loads the right skill and context, creates the files in the right places,
00:11:07and connects the information to the relevant parts on its own. If you've been using the second brain
00:11:12for a while, you should build an audit workflow. It checks for broken links and exposes every issue
00:11:17in the setup and reports them back. From there you can run the fixes and keep your second brain in top
00:11:22shape, but knowing what kind of man you are, you'll also be paying for its therapy sessions by next week.
00:11:27Similar to how you can set up a whole operating system for non-coding projects, you can do the same for
00:11:32your coding projects as well. You set up your CLAUDE.md as the kernel and put all the project
00:11:37information inside it. You configure the agents for your project which act as your everyday programs.
00:11:42You also set up hooks for different cases, like formatting a file after an agent finishes editing
00:11:46it, so that between the f***ing mess you call your relationship and your code, at least one thing
00:11:51is organized. You create skills for different tasks, like adding a new endpoint. That way every endpoint
00:11:56follows the exact schema you want, and you can create workflows for things like reviewing changes before
00:12:01shipping, migrating the code base or the database, and running end-to-end tests to confirm the whole
00:12:07app works, instead of being woken up by your manager calling at 2am because prod is down again. The
00:12:12context for this OS becomes the files in your docs folder and the code itself. Workflows are exceptionally
00:12:17helpful for project migrations. You can build one that converts your whole project from one library to
00:12:22another and let the individual agents handle the conversion. We tested this before, and without a
00:12:27workflow it took more than an hour, but with a workflow it took just 21 minutes. So the time saved
00:12:32with workflows can go towards more important things, like scrolling through Dario's inappropriate deep
00:12:37fakes. This is how our operating system extends into coding use cases, so when you are building projects,
00:12:43you don't have to handle everything by hand. You let the operating system do it for you. If you want to
00:12:47found the next big AI B2B SaaS company but don't know where to start, you should be in AI Labs Pro.
00:12:53That's where you'll find the workflows used in this video, along with all the other resources,
00:12:57guides, and goodies we've put together. You'll also get to meet a bunch of like-minded nerds,
00:13:01including our team. The link's in the description, and you can check that out.
00:13:05That brings us to the end of this video. If you'd like to support the channel and help us keep making
00:13:09videos like this, you can do so by using the super thanks button below. As always, thank you for
00:13:14watching and I'll see you in the next one.

Key Takeaway

Claude Code acts as a comprehensive operating system that uses dynamic workflows and sub-agent architectures to automate complex, multi-step tasks at scale.

Highlights

  • Claude Code functions as an operating system by coordinating project files, external tools, and task automation through a centralized architecture.

  • Dynamic workflows enable parallel processing by spawning multiple sub-agents, reducing task completion time significantly, such as a code migration finishing in 21 minutes versus over an hour.

  • The core components of the Claude Code OS include the CLAUDE.md file as the kernel, MCP for external tool interaction, and loops for recurring automated tasks.

  • Dynamic workflows use JavaScript-based scripts within the .claude directory to define strict execution schemas, ensuring sub-agents output data in required formats.

  • Token usage for dynamic workflows is significantly higher than standard sessions because each sub-agent operates within its own dedicated context window.

  • Effective dynamic workflows require tasks that are deterministic, decomposable into independent units, and large enough to justify separate context windows for sub-agents.

Timeline

Components of the Claude Code Operating System

  • The CLAUDE.md file acts as the kernel, driving the agent's understanding of project requirements.
  • MCP integration allows the system to interact with external tools and devices.
  • Loops and routines automate repetitive tasks by functioning like cron jobs.

Claude Code has evolved from a simple coding agent into an operating system that coordinates machine operations. The system's architecture mimics a standard OS with a kernel for core operations, drivers via MCP for external tools, and programs for specific skills. Automated loops remove the need for manual monitoring of long-running tasks.

Dynamic Workflows and Deterministic Execution

  • Dynamic workflows are deterministic, using code to define exactly how sub-agents process tasks.
  • These workflows use JavaScript-based scripts to manage step-by-step execution and enforce strict output schemas.
  • Parallelization occurs when tasks are broken into independent units processed by separate sub-agents.

Dynamic workflows, introduced in Opus 4.8, differ from standard skills or goal-based commands by using programmatic control rather than non-deterministic agent decision-making. These workflows operate in the background and persist across sessions. They are specifically suited for tasks that can be broken into independent, wide-reaching subtasks rather than deep, sequential processes.
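
The schema-and-retry idea can be sketched in a few lines. This is a conceptual illustration in Python, not the JavaScript that Claude Code generates, and every name in it (`FINDING_SCHEMA`, `run_agent`, `run_workflow`, `make_stub_agent`) is hypothetical. The stub agent stands in for a real model call, and the cap of three tries is an assumption; the video only says an agent keeps working until its output matches the schema.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical "form" each sub-agent must fill in: field name -> required type.
FINDING_SCHEMA = {"file": str, "issue": str, "severity": int}


def matches_schema(output: dict, schema: dict) -> bool:
    """True when output has exactly the schema's keys with the right types."""
    return output.keys() == schema.keys() and all(
        isinstance(output[key], expected) for key, expected in schema.items()
    )


def run_agent(agent: Callable[[str], dict], prompt: str, schema: dict, max_tries: int = 3) -> dict:
    """Call one sub-agent until its output satisfies the schema."""
    for _ in range(max_tries):
        output = agent(prompt)
        if matches_schema(output, schema):
            return output
    raise ValueError(f"no schema-valid output after {max_tries} tries: {prompt!r}")


def run_workflow(agent: Callable[[str], dict], prompts: list, schema: dict) -> list:
    """Fan the prompts out in parallel; results come back in prompt order."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda p: run_agent(agent, p, schema), prompts))


def make_stub_agent() -> Callable[[str], dict]:
    """Stand-in for a model call: its first answer per prompt is malformed."""
    seen = set()

    def agent(prompt: str) -> dict:
        if prompt not in seen:
            seen.add(prompt)
            return {"file": prompt}  # missing fields, so it gets retried
        return {"file": prompt, "issue": "unused import", "severity": 1}

    return agent
```

Calling `run_workflow(make_stub_agent(), ["a.py", "b.py"], FINDING_SCHEMA)` returns one schema-valid finding per file. The control flow lives in ordinary code, so the order and shape of the work are fixed before any agent runs, which is the deterministic part.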

Implementation Strategies and Constraints

  • Dynamic workflows consume substantial tokens because each sub-agent utilizes a fresh, separate context window.
  • Tasks involving verification or migration benefit most from the structure and parallelization of workflows.
  • The built-in Deep Research workflow automates information gathering through a five-part, multi-step pipeline.

Due to high token consumption, workflows should be reserved for complex tasks requiring cross-verification or significant context capacity. When a task requires multiple agents that do not depend on each other's immediate output, parallelism yields faster results. The built-in Deep Research tool demonstrates this by automating search, fetch, verification, and synthesis processes.
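
The search, fetch, verification, and synthesis stages named above can be pictured as a fixed pipeline. The sketch below is a toy Python model of that shape only. The four stage functions are hypothetical stubs with made-up behaviour, whereas the built-in workflow runs each stage with sub-agents.

```python
def search(topic: str) -> list:
    """Stub: pretend to find two sources for the topic."""
    return [f"{topic}/source-1", f"{topic}/source-2"]


def fetch(source: str) -> list:
    """Stub: pretend each source yields one solid and one shaky claim."""
    return [f"claim from {source}", f"unsupported claim from {source}"]


def verify(claim: str) -> bool:
    """Stub for adversarial verification: reject claims marked unsupported."""
    return "unsupported" not in claim


def synthesize(topic: str, claims: list) -> str:
    """Combine the surviving claims into one final document."""
    return f"# {topic}\n" + "\n".join(f"- {claim}" for claim in claims)


def deep_research(topic: str, search, fetch, verify, synthesize) -> str:
    """Run the stages in a fixed order; only verified claims reach the report."""
    sources = search(topic)
    claims = [claim for source in sources for claim in fetch(source)]
    surviving = [claim for claim in claims if verify(claim)]
    return synthesize(topic, surviving)
```

Running `deep_research("cron", search, fetch, verify, synthesize)` yields a short report containing only the two claims that pass `verify`. Each stage feeds the next, so the parallelism sits inside a stage (many sources fetched or many claims checked at once) and not between stages.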

Building Custom Operating Systems for Projects

  • Second brain systems utilize CLAUDE.md and Notion or Google Calendar MCPs to manage personal or professional workflows.
  • Coding projects can implement specific workflows for tasks like reviewing changes, migrating databases, or running end-to-end tests.
  • Workflow automation can reduce migration times significantly, exemplified by a reduction from over 60 minutes to 21 minutes.

Users can architect custom operating systems for both coding and non-coding projects by defining kernels, skills, and workflows. For coding, this includes setting up hooks for file formatting and migration tasks. Consistent use of these systems creates an environment where the agent manages complex maintenance, drastically reducing manual effort.
