You already know about AI coding frameworks like BMAD, Spec Kit and others, but these are not the only ones. Hundreds of people are experimenting with and launching their own workflows, but when you try them out, you'll notice that they often fail to deliver on their promise. It's not because their methods are bad; it's because they don't fit your specific use case. When we build apps, most of the time we create our own workflows instead of relying on pre-made ones. That's because a workflow should be built around your specific use case, and it only works if it aligns with the project you're trying to build.
So how do you build a workflow for your own process? For that, you need to know certain principles: the principles that every framework uses in one way or another. Before discussing them, it's essential to know what's inside the context window of these AI tools, because managing context is basically what these frameworks do. The context window is the amount of information the model can hold at once. Anything that falls out of the context window falls out of the model's working memory, and it has no way to recall it.
Models have a limited context window. For example, Anthropic models have a 200k-token context window, and Gemini models have 1 million. These might look like really big numbers in terms of the messages you send, but in practice they are not that huge, because in these AI tools the context window doesn't consist only of your system prompt and user messages; it also includes your past messages, memory files, tool definitions, MCP calls and so on. You need to learn how to make the most of this limited working space, so that when you build your workflows, the model does exactly what you want it to do.
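To get a feel for how fast the window fills up, here is a rough sketch that budgets a 200k-token window using the common "about 4 characters per token" heuristic. All the sizes below are illustrative assumptions, not measurements from any real tool.

```python
# Rough sketch: how much of a 200k-token window is already spent before your
# actual request. Sizes are made-up but plausible; the ~4 chars/token rule is
# only a heuristic for English text.

def estimate_tokens(num_chars: int) -> int:
    """Very rough token estimate: ~4 characters per token."""
    return num_chars // 4

context_items_chars = {
    "system_prompt": 12_000,        # tool instructions, safety rules
    "memory_files": 20_000,         # CLAUDE.md / notes files
    "tool_definitions": 30_000,     # built-in tools plus MCP schemas
    "conversation_history": 80_000, # past messages in this session
}

window = 200_000  # tokens, e.g. an Anthropic model
used = sum(estimate_tokens(c) for c in context_items_chars.values())
print(f"~{used:,} tokens used, ~{window - used:,} left for your task")
```

Even with these modest assumptions, a sizable slice of the window is gone before you type a single request, which is why the principles below matter.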
I will be using Claude Code as my primary coding tool throughout the video, but you can build your workflow on any platform, as they all have the tools needed for these principles.
The most important principle, and the key to any workflow design, is progressive disclosure: revealing to the LLM only what matters, and keeping the model's attention focused on what is actually needed right now, rather than filling the context window with everything it might need in the future. More advanced models like Sonnet 4.5 have a context editing feature built right in, where they can recognize what's noise and try to filter it out on their own, and they use grep commands to narrow down what you want. But that alone is not enough. When we give vague instructions, even these newer models load a lot of things that are not needed and pollute the window. Instead of asking Claude to "fix the error in your backend", it is better to ask it to check the endpoints one by one rather than fix everything at once.
The Skills feature in Claude is now open source, and all tools can use it. Skills are pretty much the embodiment of progressive disclosure: their description gives your AI coding platform just enough information to know when each skill should be used, without loading everything into the context.
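As a sketch, a skill is just a folder containing a `SKILL.md` file; only the frontmatter `name` and `description` are loaded up front, and the body is read only when the skill is actually invoked. The field names follow Anthropic's published format, but the content below is a made-up example:

```markdown
---
name: commit-helper
description: Use when the user asks to commit changes. Writes conventional
  commit messages and runs pre-commit checks first.
---

# Commit Helper

1. Run `git status` and `git diff --staged` to see what changed.
2. Run the project's lint and test commands before committing.
3. Write the message as `type(scope): summary` in under 72 characters.
```

The two frontmatter lines are the only tokens the model pays for until the skill fires, which is progressive disclosure in its purest form.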
A huge mistake people make is using MCPs for everything. You should only use MCPs when external data is required, and use skills for everything else.
The second, equally important principle is that information not needed right now does not belong in the context window. To achieve this, the tools use structured note-taking, and we can turn it to our advantage by providing the AI tool with external files it can use to document decisions, issues or technical debt. This allows your agent to maintain critical context that might otherwise be lost when building something really complex. These tools also have a compaction feature to manage the context window, and when the context resets, you don't have to rely solely on the compaction summary: your agent can use these notes to regain context on what has already been done and what still needs to be done. This is particularly helpful for long-horizon tasks, which are inherently complex.
You might be familiar with AGENTS.md. It's a standard context file that agents read before starting a session. Some agents don't follow this standard and have their own, such as CLAUDE.md. I use these files to guide the agent on how the external files are structured and what to write in each one of them.
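As a sketch, this is the kind of section I mean; the file names and structure here are my own convention for illustration, not any standard:

```markdown
## Project notes (read before starting, update as you work)

- `notes/decisions.md`: one bullet per architectural decision, with the date
  and the reason. Never delete entries; mark reversed ones as superseded.
- `notes/progress.md`: checklist of tasks (done, in progress, blocked).
  Update it after every completed step so a fresh session can catch up.
- `notes/tech-debt.md`: shortcuts taken and what the proper fix would be.
```

A section like this in CLAUDE.md means any session, before or after a context reset, knows exactly where the durable memory lives.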
Sometimes these agents randomly pause in the middle of a long-running task. A lot of the time this happens because the context has gone above 70% of its limit. This is where the concept of an attention budget comes in. Your context window is what the model pays attention to while generating output. When usage goes over roughly 70%, the model's attention gets stretched thin and there's a higher chance of hallucinations. For AI agents, it stops them from using their tools effectively, and oftentimes they just choose to ignore them.
To solve this, there are several built-in tools you can use. As you already know, compaction lets the model start afresh with a proper summary of what has happened as the starting prompt and a reduced context window. So instead of letting it fill up to 90% and triggering the auto-compact feature, keep an eye on the context window and compact it yourself. If you're experimenting, use Claude's built-in rewind so you can delete the unnecessary parts instead of continuing them and asking Claude for changes. You should also clear the context, or start a new one, for any new task, so that the previous context doesn't slow down the model.
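In Claude Code specifically, these map to built-in commands roughly as follows; availability and exact behavior may vary by version:

```text
/compact focus on the API refactor   # summarize now, steering what to keep
/clear                               # wipe the context for a fresh task
(press Esc twice)                    # rewind to an earlier point
```

Manually running `/compact` with a short instruction tends to produce a more useful summary than waiting for the automatic trigger, because you decide what the next phase of work actually needs.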
Another thing that stems from progressive disclosure is the ability of these agents to run tasks in the background without polluting the main context window. Sub-agents work in their own isolated context window and only report their output back to the main agent. This is particularly helpful for tasks that are isolated from each other, because your main context window is protected from being bloated with the tool calls and searches the sub-agent makes; that information stays in its dedicated working zone. Since sub-agents run in the background, you can keep interacting with your main agent and let it work on something that actually requires your attention. Whenever I want something researched, such as the rules of a new framework I'm working with, I just use these sub-agents. Their tool calls and searches stay isolated, and they return only the answer to the main agent.
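In Claude Code, for instance, a sub-agent can be defined as a markdown file under `.claude/agents/`. The frontmatter fields follow Anthropic's documented format; the content itself is a made-up sketch:

```markdown
---
name: researcher
description: Use for research tasks, e.g. looking up the conventions of an
  unfamiliar framework. Returns a short summary, not raw search results.
tools: WebSearch, WebFetch, Read
---

You are a research assistant. Search and read as much as you need, but reply
to the main agent with only a concise summary and the key references.
```

Limiting the `tools` list is itself context engineering: the sub-agent can search and fetch freely, while the main window only ever sees the final summary.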
If you follow the principle of note-taking, you should also know which file format to use for which task, because different formats affect the token count and hence the efficiency of your workflow. YAML is the most token-efficient, so I mainly use it for database schemas, security configs and API details; its indentation helps models structure information properly. Markdown is better for documentation like your CLAUDE.md, because the heading levels make it easy for the model to navigate between sections. XML is specifically optimized for Claude models: Anthropic states that their models are fine-tuned to recognize these tags as containers and separators, which is useful when you have distinct sections like constraints, summaries or visual details. Other models generally prefer Markdown and YAML over XML. And lastly, JSON. It's the least token-efficient because of all the extra braces and quotes, so I only use it for small things like task states, and I don't really recommend it for the most part.
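A quick way to see the difference is to serialize the same record both ways and compare sizes. Character count is only a rough proxy for tokens (exact counts depend on the tokenizer), and the record below is a made-up example:

```python
import json

# The same API detail expressed as JSON and as hand-written YAML.
record = {
    "endpoint": "/users",
    "method": "POST",
    "auth": "bearer",
    "fields": ["name", "email", "password"],
}

as_json = json.dumps(record, indent=2)

as_yaml = """\
endpoint: /users
method: POST
auth: bearer
fields:
  - name
  - email
  - password
"""

# YAML drops the braces, quotes and commas that JSON needs,
# so the same content costs noticeably fewer characters.
print(len(as_json), len(as_yaml))
```

The gap grows with nesting depth, which is why structured notes like schemas and configs are the formats where the choice matters most.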
Git is one of the most basic things you're taught when starting programming. We've seen another trend with these context workflows in which people use the git commit history as a reminder to the model of the progress that's been made, whether across the whole project or on a single task. Even if you don't want to use it to store progress, you should generally run these context engineering workflows in a git-initialized repository. Having a context engineering workflow means that you don't let the model do everything at once, but have it act on planned steps one by one. If at any stage you encounter a problem, git lets you control which version to revert to and helps you evaluate which change is causing problems. People have also implemented parallelism with git worktrees, and I've shown plenty of workflows where sub-agents work in dedicated worktrees for parallel work.
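The worktree setup itself is a few commands. This is a minimal sketch with illustrative paths and branch names: one checkout per parallel task, so each agent edits its own directory on its own branch.

```shell
# One worktree per parallel task: each agent gets an isolated checkout.
base=$(mktemp -d)
git init -q "$base/main"
cd "$base/main"
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"

# One worktree (and branch) per agent:
git worktree add "$base/agent-auth" -b feature-auth
git worktree add "$base/agent-ui"   -b feature-ui
git worktree list
```

Each agent then runs inside its own worktree directory, and you merge the branches back once the parallel tasks are done.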
Whatever workflow you end up making, there are always going to be cases where you repeat instructions for common procedures. A good example is how you ask the AI tool to do git commits or update your documentation. Almost all of these AI tools have ways to reuse your most repeated prompts. I often use custom /commands in my own projects, because they basically give Claude a reusable guide. For example, I use a /catchup command that contains instructions on how I structure memory outside the context window, so Claude knows how to catch up with the project instead of reading every file. They are also good at enforcing structure. For my commits and documentation to follow a defined format, I use a /commit command that specifies how commit messages should be written and what pre-commit checks should run before committing. This way the /commands keep everything standardized, and I don't have to instruct Claude again and again to perform tasks the way I prefer.
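In Claude Code, a custom command is just a markdown file in `.claude/commands/`, and the file name becomes the command name. The content below is a made-up sketch of such a /commit command:

```markdown
<!-- .claude/commands/commit.md, invoked as /commit -->
Before committing:
1. Run the linter and the test suite; stop if either fails.
2. Stage only files related to the current task.

Write the commit message as `type(scope): summary`, under 72 characters,
with a body explaining the why, not the what. Extra context: $ARGUMENTS
```

The `$ARGUMENTS` placeholder is filled with whatever you type after the command, so one file covers both the default procedure and one-off variations.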
As you know, MCPs should be used whenever external data is required. Jira is the most widely used team management software; if you want to pull information from tickets, you can use the Jira MCP so the agent can access them directly and start implementing changes. Similarly, I use the Figma MCP to provide Claude Code with the app's style guide, which it then uses to construct the design. For tasks where the model's built-in capabilities fall short, MCPs are essential for interacting with external sources efficiently. You can include these MCPs directly in your /commands so that they become part of your whole workflow.
That brings us to the end of this video. If you'd like to support the channel and help us keep making videos like this, you can do so by using the Super Thanks button below. As always, thank you for watching, and I'll see you in the next one.