00:00:00 Claude Code was made generally available on May 22nd of last year, along with the release of Claude 4.
00:00:06 But there was also a research preview before this, and so I've been using the tool
00:00:11 for a bit over a year now, and I actually did the math. If you count all the time
00:00:15 it took for me to prompt Claude, review the code, and monitor it, I have used the tool for over 2,000
00:00:21 hours now. So yeah, I have a thing or two to teach you. That's what I want to do in this video.
00:00:27 So right now, I want to share with you all of my battle-tested strategies that will take you
00:00:31 from a basic Claude Code user all the way to a power user. I've bundled everything up into what
00:00:37 I call the WISC framework. And here's the thing: these strategies are legit. I am not one of those
00:00:43 AI content creators who has just jumped on the Claude Code bandwagon in the past few months. I've been
00:00:48 using this tool, like I said, daily for over a year now. And so these strategies are going to work on
00:00:54 any code base, even massive ones, even projects that span multiple code bases. I've seen all of this
00:01:00 applied at an enterprise level, so no matter what you're working on, this is for you. This also
00:01:05 really works for any AI coding assistant; I'm just focused on Claude Code because it is the best right
00:01:10 now. And so I am assuming here that you have at least a basic understanding of Claude Code, and now
00:01:15 you want to take things to the next level. If you want the basics of building a system for AI coding,
00:01:21 I'll have a video that I'll link to right here. All of these strategies are for when we want to
00:01:25 work on real code bases that get messy, because we have a bunch of strategies here around context
00:01:32 management. This is important because context rot is the biggest problem with AI coding assistants
00:01:38 right now. It doesn't matter that we have the new one-million-token limit for Claude Code; we still
00:01:43 need to treat our context as the most precious resource, one that has to be engineered very carefully
00:01:49 with our AI coding assistants. And so the W, I, S, and C of the framework: all of these strategies
00:01:56 apply to that, and these are all things that you can take and apply to your projects immediately.
00:02:00 So I'm going to break it down nice and simple for you here. Now, the question you might be asking
00:02:05 yourself is, "Cole, why are we focusing so much on context management? Over 2,000 hours of using
00:02:11 Claude Code, and this is what you want to focus on?" And my answer is yes. I know this is very specific,
00:02:17 but we need to lean right now into context rot and how to avoid it. I would go so far as to say
00:02:23 that about 80% of the time when your coding agent messes up in your code base, it's because you
00:02:28 aren't managing your context well enough. And so I want to start with the problem of context rot,
00:02:33 and then we'll very quickly get practical, diving into every part of the WISC framework. But I want
00:02:38 to start with context rot as a precursor so you can really see why, once you apply the WISC framework,
00:02:45 you're going to immediately see jumps in reliability with your AI coding, even on messier
00:02:50 code bases. And I keep emphasizing larger, messier code bases because that's where we see context rot
00:02:56 becoming more and more of a problem. Now, there has been a lot of research in the industry on context
00:03:02 rot, but my favorite, and the most practical and probably most popular as well, is the Chroma
00:03:07 Technical Report covering how increasing input tokens impacts LLM performance. And the main idea
00:03:13 here is that just because you can fit a certain number of tokens into an LLM's context window doesn't mean
00:03:18 that you should. And yes, this applies to Claude Code with the new one-million-token limit as well.
00:03:24 That's because large language models get overwhelmed with information just like people do. This is the
00:03:30 needle-in-the-haystack problem. When you have a very specific piece of information, or, with coding
00:03:35 agents, a specific file that it has read that you need it to recall, it will do a good job recalling
00:03:41 that information from its short-term memory, but only if you don't have an overstuffed context window.
00:03:47 When you start to have a massive amount of context loaded, you start to get what are called
00:03:52 distractors. These are pieces of information that are close or similar to what you need the LLM
00:03:58 to recall, but not quite right. And we see this a lot with AI coding, especially with larger code
00:04:04 bases. We follow the same patterns for things throughout our code base; we have a lot of
00:04:09 similarity in how different parts of our code base are implemented. And so large language models will
00:04:14 pull the wrong information and be very confident about their fix or implementation. I'm sure you've
00:04:19 seen this all of the time. We have this needle-in-the-haystack problem applying constantly
00:04:24 to AI coding. This is the idea of context rot: the larger our window gets, the harder a time the large
00:04:30 language model has pulling out exactly what we need for the current turn with our coding
00:04:36 agent. So going back to the diagram, let me get super specific for you. What we're addressing with
00:04:42 all of these strategies is the question: how do we keep our context window as lean as possible
00:04:48 while still giving the coding agent all of the context it needs? That is the context engineering
00:04:53 that we are doing here. And so I'm going to go through every single strategy. And I even have
00:04:57 an example for each of them that I'll go through live with you on a complicated code base, and all
00:05:02 of the commands and rules and docs that I use as examples are in a folder that I'll link
00:05:06 to in the description. So you can use all of these strategies conceptually, but also with these
00:05:12 commands as examples, which I have in the .claude folder right here. All right, so let's get into the
00:05:17 individual strategies now. W stands for write, I for isolate, S for select, and C for compress. And
00:05:24 of course we will start with the W here, which is writing: externalizing our agent's memory.
00:05:30 As much as possible, we want to capture key decisions and what the agent has been working on
00:05:34 so that in future sessions we can catch our agent up to speed a lot faster and spend fewer
00:05:40 tokens upfront getting the agent to understand what we really need it to do. And so the first strategy
00:05:46 here is to use the git log as long-term memory. And I absolutely love this one, because there are so
00:05:52 many people who love to over-engineer and build super complicated memory frameworks for their
00:05:56 coding agents, but really, everyone's already using git and GitHub for version control. And so we can
00:06:01 take advantage of a tool that we're already using to provide long-term memory to our agent. Let's go
00:06:07 into our code base and I'll show you what I mean. So the code base I'm going to be using for all the
00:06:12 examples here is the new Archon. I've been working my butt off on this the last few months
00:06:18 behind the scenes. This is your AI command center, where you can create, manage, and execute longer-
00:06:23 running AI coding workflows. And we're even working on a workflow builder; it's going to be like the
00:06:28 n8n for AI coding. So we can kick off workflows, we can view the logs and monitor them in our
00:06:33 mission control, and we can look at past runs to see exactly what happened. Like, this is a very long
00:06:39 workflow that I have to validate entire pull requests in my code base. So yeah, you can tell
00:06:44 from looking at this, and there's a lot more in Archon coming soon, by the way, but you can tell from
00:06:47 looking at this that there are a lot of moving parts. This is a very complicated code base, so
00:06:51 it makes for a good example for everything I'm going to cover with you here, all of the strategies.
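To ground the git-log idea before the demo, here's a minimal sketch in plain shell. The repo, file names, and commit messages are hypothetical, purely to show the shape of a standardized log that an agent can read back as memory:

```shell
# Throwaway repo with hypothetical commit messages, just to show the shape
# of a standardized git log that doubles as long-term memory.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# Standardized, detailed messages are what make the log readable to an agent.
touch cli.py
git add . && git commit -qm "feat(cli): add run command with --verbose flag"
touch test_cli.py
git add . && git commit -qm "test(cli): cover error paths for run command"

# The one-liner view an agent (or a prime command) can read to catch up:
git log --oneline -n 5
```

The prefixes here (feat, fix, test) are just one common convention; whatever scheme you pick, the point is that a standardized, detailed log is cheap to generate and cheap for a future session to read.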
00:06:57 And so, on using the git log as long-term memory, I'll show you an example right here of a one-liner
00:07:03 for all of my recent commit messages. And what I want to point out here is that we have a very
00:07:09 standard way of creating these commit messages. So we have our merges, but we also have all these
00:07:13 feature implementations and fixes. And I keep things very standard because that way I can rely
00:07:19 on the commit messages to tell my coding agent what I've worked on recently, because a lot of
00:07:24 the time that will guide us in what we want to work on next. And the reason I have this so
00:07:29 standardized is because there is a commit command that I run. Now, running a git commit is very easy,
00:07:36 but if we want to standardize the message and have the coding agent help us with that,
00:07:40 having a specific command is very powerful. So I have this full implementation that I did here
00:07:46 in a single context window with the coding agent. I'm at the end now, where I am ready to run my
00:07:51 commit. And so I just run slash commit; that's all I have to do. It's running this command that
00:07:55 has the standardization for how I document any work that I did, and also anything I did to improve
00:08:01 my rules or commands. So it's a two-part command: here's what we built, and here's how we improved the
00:08:06 AI layer. And so it's going to make this commit and I'll show you what it looks like after.
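For reference, a custom slash command in Claude Code is just a markdown file under `.claude/commands`. Here's a hypothetical sketch of what a standardized commit command could look like; the numbered instructions are illustrative, not the exact command from the repository:

```shell
# Hypothetical sketch of a standardized /commit command for Claude Code.
# Custom slash commands live as markdown files under .claude/commands;
# the instruction text below is illustrative only.
dir=$(mktemp -d)
mkdir -p "$dir/.claude/commands"
cat > "$dir/.claude/commands/commit.md" <<'EOF'
Create a git commit for the work in this session.

1. Review the diff and summarize what was built.
2. Write the subject as <type>(<scope>): <summary>, where type is one of
   feat, fix, test, refactor, or docs.
3. In the body, add a "What we built" section with details, plus an
   "AI layer" section noting any improvements made to rules or commands.
4. Make the commit and show the final message.
EOF
cat "$dir/.claude/commands/commit.md"
```

With a file like this in place, typing `/commit` in a session has the agent follow those steps instead of improvising a message format each time.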
00:08:10 All right. So now, looking at our commit message, we can see that we made some test improvements
00:08:14 to the CLI. So a really nice prefix, then getting into the details. And then, so the coding
00:08:19 agent knows how its own rules and commands are evolving over time, we include that in the commit
00:08:23 message whenever we find an opportunity to improve, let's say, our plan command, for example. And of
00:08:29 course, this commit command is one of the resources that I have for you in the repository if you want
00:08:33 to use it as a starting point, but I also encourage you to customize what your commit
00:08:37 messages look like. The important thing here is that we standardize the messages and make them very
00:08:41 detailed so we can use them as long-term memory. All right. So the second write strategy is to
00:08:47 always start a brand-new context window whenever you are writing any code. No matter what I'm working
00:08:53 on, my workflow is always: I have one conversation to plan with the coding agent, I'll create some
00:08:57 kind of markdown file that has my structured plan, and then I'll send that in as the only context to a new
00:09:03 session going into the implementation. And so it's very important here that your spec has all of the
00:09:08 context the agent needs to write the code and do the validation. So for example, in this conversation,
00:09:14 I am just doing planning. So I run my prime command to start (more on this in a little bit),
00:09:18 I load in context, and then I create my plan with this command. It's another one that I have as
00:09:24 a resource for you. This essentially walks the coding agent through the exact structure
00:09:28 that we want for our single markdown document, going from our short-term memory
00:09:33 into a single document. And then we end the session here, we go to a brand-new context window,
00:09:38 and we proceed with our implementation. So I have my execute command, and this is where I can
00:09:42 specify the path to my structured plan. No other context, because the plan should have everything
00:09:48 the agent needs. This is very important because it keeps our coding agent extremely focused on the task at
00:09:53 hand. There can be a lot of research and other things that just muddle the context window
00:09:57 if we implement in the same place that we plan. So the last W strategy that I have for externalizing
00:10:03 agent memory is progress files and decision logs. You'll see this all the time with more elaborate
00:10:08 AI coding frameworks, where you have something like a handoff.md or a todo.md communicating between
00:10:13 different sub-agents or agent teams, or even just between different agent sessions. When you're
00:10:17 running low on context, a lot of times you want to create this summary of what was just done so
00:10:22 you can go to a fresh session, because you're starting to see that context rot, with the agent
00:10:27 hallucinating as you have these longer conversations. Now, obviously it's ideal to just avoid these longer
00:10:33 conversations, but sometimes you need to have them. For example, something I do with Archon a lot is
00:10:38 I'll have it use the Vercel agent browser CLI to perform end-to-end testing within the browser. And
00:10:44 so I have it go through a bunch of different user journeys and test edge cases. It takes a lot of
00:10:49 context. You can see at the bottom here, I ran a slash context and we're already at 200,000 out of
00:10:56 the new 1 million limit. This fills up so quickly. And once you start to have a few hundred thousand
00:11:01 tokens in the context window, that's when you see the performance start to degrade for the agent. So
00:11:05 I can simply run a slash handoff. This command is going to create a summary that it can pass into
00:11:11 another session so that agent can continue the work, but now without hundreds of thousands of
00:11:16 tokens of tool calls and things like that sitting in its window. And this handoff command is really
00:11:21 just walking through a process of exactly what we want to put in this document so the next
00:11:25 agent has what it needs. All right. So that wraps up our W, and each one of these strategies is very
00:11:31 important because we are logging key decisions for future agent sessions to quickly pick up on.
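As a concrete anchor for the progress-file idea, here's a hypothetical handoff template. The headings are my illustration of the kind of structure a handoff command would fill in from the current conversation, not the exact file from the video:

```shell
# Hypothetical handoff template; a /handoff command would fill these
# sections in from the current conversation before a fresh session starts.
dir=$(mktemp -d)
cat > "$dir/handoff.md" <<'EOF'
# Handoff

## What was just done
- completed work, with the file paths that were touched

## Key decisions and why
- decisions the next agent should not re-litigate

## Current state
- passing/failing tests, known issues

## Next steps
- the single next task, stated concretely
EOF
cat "$dir/handoff.md"
```

The fixed structure is the point: the next session gets decisions and state in a few hundred tokens instead of inheriting hundreds of thousands of tokens of tool calls.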
00:11:36 And I know I'm going quick here, so let me know in the comments if there's any one of these
00:11:40 strategies that you want me to make an entire video on, because I definitely could for each of these.
00:11:45 And so now we get into the I, for isolate, using sub-agents. I love using sub-agents for all things
00:11:52 research, using them pretty much every single session. The important thing here is keeping
00:11:56 your main context clean. We can use sub-agents to perform tens or even hundreds of thousands of
00:12:03 tokens of research across our code base or the web, and then give just the needed summary to our main
00:12:10 Claude Code context window. So instead of loading tens of thousands of tokens of research into our
00:12:16 main context window, it is now only something like 500 tokens. So we still get the core information
00:12:21 that we need, and we see something like the 90.2% improvement that Anthropic reported in its research on using sub-
00:12:28 agents to load in context upfront for research, instead of having our main agent take care of
00:12:33 everything. So let me give you an example of this really quick. It's always at the start of the
00:12:38 conversation, before that structured plan I covered earlier, while I'm in the planning process, that
00:12:43 I use sub-agents very heavily. Watch this: "I want to build a workflow builder into Archon.
00:12:50 So I want you to spin up two sub-agents: one to do extensive research in the code base to see how we
00:12:55 would build in a workflow builder and what that means for Archon, and then spin up another sub-agent
00:13:01 to do web research on best practices for the tech stack. Like, if I want to use React, what library
00:13:06 should we use? And generally, how do we build workflow builders like Dify or n8n?" So I'm just
00:13:12 using my text-to-speech tool here. Send off the prompt. There we go. And so not only do we get
00:13:16 the benefit of isolation, but also speed, because it's going to use these sub-agents in parallel,
00:13:21 come back with summaries, and then my main agent will synthesize all of that and give me the final say.
00:13:26 So there we go. Both of the sub-agents are running in parallel behind the scenes. We can go and view
00:13:31 the logs for each of them as well, and then it'll come back at the end, once they're done, with the
00:13:36 final report. All right, our sub-agents finished. And instead of using hundreds of thousands of tokens
00:13:41 in our main context window (which is how much the sub-agents used on their research),
00:13:46 we only used 44,000 tokens, only 4% of our window so far. That is the power of sub-agents. I don't
00:13:53 recommend them for implementation, because there you usually want all the context of what you did. But for
00:13:57 research, they are very powerful. So yeah, isolation and sub-agents are very important for your planning
00:14:04 process. The other way that we can use sub-agents is with what I like to call the scout pattern. You
00:14:09 want to send scouts ahead before you commit your main context. There might be parts of your code
00:14:14 base or documentation that you want sub-agents to explore to see if they are relevant to load into your
00:14:21 main Claude Code session, so it can make the decision ahead of time: like, yes, we should
00:14:25 bring this in for our larger planning, or no, we should skip it, it isn't relevant. For example,
00:14:30 with Archon, I have a few markdown documents that are very deep dives into certain parts of the code
00:14:36 base, not the kind of context we want in our rules, because we don't need it all the time. But sometimes
00:14:41 you might want to load this, and you can imagine it being something in Confluence or Google Drive,
00:14:45 wherever you store your context. And so, going back to this main conversation,
00:14:48 I can just say: "Spin up a sub-agent to research everything in my .claude/docs. Are there
00:14:54 any pieces of documentation here that we would care about loading into our main context for planning?"
00:14:59 And I can send this in; it'll make the decision and then load in what I care about. So right here,
00:15:04 we kicked off an explore sub-agent. It found all of our documentation and recommended loading one.
00:15:09 And then I said, yep, go ahead and load it; this is really important for what we're planning here.
00:15:13 So instead of just using sub-agents for research, sometimes we have entire pieces of documentation
00:15:18 that we think are crucial for our main context window. That's when we want to use the scouting
00:15:23 pattern. So that is everything for isolation. Remember to use sub-agents for your research
00:15:28 and planning very extensively. And now that brings us into the S, for select: load your context just in
00:15:34 time, not just in case. And what I mean by that is, if you're not 100% confident that a piece of
00:15:40 information is important to your coding agent right now, then you shouldn't bother loading it. And we
00:15:46 have a layered approach to help with this. We start with our global rules. These are the
00:15:51 constraints and conventions that we always want our coding agent to be aware of. And so you want this
00:15:57 file to be pretty concise; usually between 500 and 700 lines long is what I go for. A lot of people
00:16:02 advocate for even less, but you have things like your architecture, the commands to run, and things
00:16:08 like your testing and logging strategy. This is my example from Archon, but these are the things that
00:16:12 you want your coding agent to be aware of all of the time. And then we have our layer two, our
00:16:18 on-demand context, as I call it. These are rules that apply only to specific parts of the code base.
00:16:23 Like, if we're working on the front end (which you aren't always, but if you are), here are the global
00:16:28 rules for the front end, or here are the global rules for building API endpoints. So we add this
00:16:33 onto our global rules for specific task types, because we aren't always going to be working on
00:16:38 the front end, for example. To show you one example of this, we have the workflow YAML reference that
00:16:43 I pulled just a little bit ago with the explore sub-agent. So when we are working on the workflows,
00:16:48 then we care about this, but we don't want it in our global rules, because most of the time
00:16:52 when we're working on Archon, we're not actually working on this specific part of the code base. And
00:16:57 so it's on-demand context. Then the third layer that we have here is skills. This is very popular
00:17:05 with Claude Code and beyond right now. We have the different stages here, where the agent is going to
00:17:10 explore the instructions and capabilities in the skill as it decides that it actually needs it. So
00:17:15 we start with the description. This is a very small number of tokens loaded in upfront with our global
00:17:20 rules. If the agent decides it wants to use this skill, then it'll load the full SKILL.md,
00:17:25 which can also point to other scripts or reference documents that we'd want to load if we're going
00:17:29 even deeper into the skill. And so, as an example of that, I have my agent browser skill. This is
00:17:35 what I use for my browser automation, for all the end-to-end testing I was showing earlier. I use
00:17:40 this every single day. And so whenever I am doing my end-to-end testing, then I want to load this
00:17:46 instruction set so the agent understands how to use the agent browser. And then finally, for the fourth
00:17:52 layer here, I have prime commands. Everything else I've covered here is static documentation
00:17:57 that we're going to update every once in a while. But sometimes we need our agent to do exploration
00:18:02 of our live code base. We need to make sure that all of its information is completely up to date,
00:18:07 and we're willing to spend some tokens with sub-agents upfront to make that happen. That's
00:18:11 what the prime command does: we explore our code base at the start of our planning process
00:18:16 so the agent understands the code base going into whatever we want to build next. And as you can see in my
00:18:22 commands folder, I have many different prime commands, because there are different parts of the code base
00:18:27 I want the agent to understand depending on what I want to build. And so my generic prime command is
00:18:32 this one we're looking at right here. I just tell it to get an understanding of the Archon code base
00:18:36 at a high level. And so, step by step, here is what I want it to read through, including the git log,
00:18:41 because that is important for using our git log as long-term memory. I also have a specialized one,
00:18:47 prime workflows, for when I know that I'm working on the workflow engine in Archon. So it's a very similar
00:18:53 command, just more specialized. I use this at the start of the conversation so that my agent can
00:18:58 quickly load everything it needs. I can confirm it understands my code base, then I get into the
00:19:03 planning process that I was showing you earlier. So, as a super quick summary: global rules are always
00:19:09 loaded. On-demand context is for when you know you're about to work on a part of the code base that
00:19:13 is documented separately. Skills are for when you need different capabilities, like, okay, it's time to do
00:19:18 end-to-end testing, let's load the skill for the agent browser. And then prime commands I will
00:19:22 usually run at the very start of a conversation to set the stage for my planning. So that is
00:19:28 everything for select. Now we'll go to compress, and this is actually the fastest section to cover,
00:19:34 because you shouldn't need to compress often if you're doing write, isolate, and select
00:19:39 well. If we are doing all the other strategies to keep our context lean, we are avoiding this, and
00:19:46 that is good, because you want to avoid compressing as much as possible. If you must compress, then
00:19:52 there are a couple of strategies to cover here, and those two strategies are the handoff and a
00:19:56 focused compaction. So let's get into Claude Code and take a look at this. The handoff we already
00:20:02 covered; it's one of our write strategies. We summarize everything that we just did to hand
00:20:06 off to another agent, or to the same agent after memory compaction. And then we have the built-in
00:20:12 compact command in Claude Code. This is going to summarize our conversation, then wipe the
00:20:18 conversation and put the summary at the top of our context window. Now, the handoff is really
00:20:23 powerful because that's where we get to define our own workflow for how we remember information. But
00:20:28 the slash compact is very useful as well, especially because we can optionally provide summarization
00:20:34 instructions. When I absolutely have to compact, I will use this every single time. For example: "focus
00:20:41 on the edge cases that we just tested," right? So now, when it creates that summary, it's going to pay
00:20:48 more attention to that part of its short-term memory. I didn't spell it right, but that's totally
00:20:53 fine. It'll run the compaction here. And so the handoff and slash compact are kind of either-or.
00:20:58 But I definitely find times where I want to use both. The handoff, especially: when you run into a
00:21:03 compaction more than twice, usually that conversation is getting way too bloated, so you want to start a
00:21:09 fresh session with a handoff. But if it's just once, a lot of times I am okay running a
00:21:14 slash compact. Even then, after a compact I will still ask the agent to summarize what it
00:21:19 remembers so I can make sure that it truly understands. Like, "What do you remember
00:21:24 here?", something like that. And so, yeah, it really isn't ideal. Avoid compaction as much as possible.
00:21:30 The best compression strategy is not needing compression. All right, so that is the WISC
00:21:36 framework. I know it was a lot, so I hope that you found this helpful, and let me know if there's any
00:21:41 one strategy that you want me to dive into deeper, because I could make an entire video on any one of
00:21:46 these strategies. But this is the WISC framework. I hope that you can use it to take yourself to the
00:21:52 next level of Claude Code, or really any AI coding assistant. And so if you found this video helpful
00:21:59 and you're looking forward to more content on AI coding and being able to apply these kinds of
00:22:04 frameworks in practice, I would really appreciate a like and a subscribe. And with that, I will see you
00:22:09 in the next video. Psst! I've got one last thing for you really quick that you don't want to miss.
00:22:14 On April 2nd, I am hosting a free AI transformation workshop live on my YouTube channel, along with
00:22:20 Lior Weinstein, the founder of CTOX, and this is a big deal. Lior is going to teach us how to
00:22:27 restructure our entire organization for AI, and then I'll teach you how to master the AI coding
00:22:32 methodology that I use to build reliable and repeatable systems for my coding agents. And so
00:22:38 I'll have a link in the description to this page. It's going to be live on my YouTube channel, so you
00:22:42 can enable notifications for it by clicking on this button right here. I will see you there!