I've Used Claude Code for 2,000+ Hours - Here's How I Build Anything With It

Cole Medin
Computing/Software · Adult Education · Internet Technology

Transcript

00:00:00Claude Code was made generally available May 22nd of last year, along with the release of Claude 4.
00:00:06But there was also a research preview before this, and so I've been using the tool
00:00:11for a bit over a year now, and I actually did the math. If you count all the time
00:00:15it took for me to prompt Claude, review the code, monitor it, I have used the tool for over 2,000
00:00:21hours now. So yeah, I have a thing or two to teach you. That's what I want to do in this video.
00:00:27So right now, I want to share with you all of my battle-tested strategies that will take you
00:00:31from a basic Claude Code user all the way to a power user. I've bundled everything together into what
00:00:37I call the WISC framework. And here's the thing, these strategies are legit. I am not one of those
00:00:43AI content creators who have just jumped on the Claude Code bandwagon in the past few months. I've been
00:00:48using this tool, like I said, daily for over a year now. And so these strategies are going to work on
00:00:54any code base, even massive ones, even projects that have multiple code bases. I've seen all of this
00:01:00applied at an enterprise level, and so no matter what you're working on, this is for you. This also
00:01:05really works for any AI coding assistant. I'm just focused on Claude Code because it is the best right
00:01:10now. And so I am assuming here that you have at least a basic understanding of Claude Code, and now
00:01:15you want to take things to the next level. If you want the basics of building a system for AI coding,
00:01:21I'll have a video that I'll link to right here. All of these strategies, this is when we want to
00:01:25work on real code bases that get messy because we have a bunch of strategies here around context
00:01:32management. This is important because context rot is the biggest problem with AI coding assistants
00:01:38right now. It doesn't matter that we have the new one million token limit for Claude Code, we still
00:01:43need to treat our context as the most precious resource that has to be engineered very carefully
00:01:49with our AI coding assistants. And so the W, I, S, and C for the framework, all these strategies
00:01:56apply to that, and these are all things that you can take and apply to your projects immediately.
00:02:00So I'm going to break it down nice and simple for you here. Now, the question you might be asking
00:02:05yourself is, Cole, why are we focusing so much on context management? Over 2,000 hours of using
00:02:11Claude Code, and this is what you want to focus on? And my answer is yes. I know this is very specific,
00:02:17but we need to lean right now into context rot and how to avoid this. I would go so far as to say
00:02:23that about 80% of the time when your coding agent messes up in your code base, it's because you
00:02:28aren't managing your context well enough. And so I want to start with the problem of context rot,
00:02:33and then we'll very quickly get practical diving into every part of the WISC framework. But I want
00:02:38to start with context rot as a precursor so you can really see why. Once you apply the WISC framework,
00:02:45you're going to immediately see jumps in reliability with your AI coding, even on messier
00:02:50code bases. And I keep emphasizing larger, messier code bases because that's where we see context rot
00:02:56becoming more and more of a problem. Now, there has been a lot of research in the industry on context
00:03:02rot, but my favorite, this is the most practical and probably most popular as well, is the Chroma
00:03:07Technical Report covering how increasing input tokens impacts LLM performance. And the main idea
00:03:13here is just because you can fit a certain amount of tokens into an LLM's context window doesn't mean
00:03:18that you should. And yes, this applies to Claude Code with the new 1 million token limit as well.
00:03:24Because large language models get overwhelmed with information just like people do. It is called the
00:03:30needle in the haystack problem. So when you have a very specific piece of information, or, with coding
00:03:35agents, a specific file that it's read that you need it to recall, it will do a good job recalling
00:03:41that information in its short-term memory, but only if you don't have a super filled context window.
00:03:47When you start to have a massive amount of context loaded, you start to get what are called
00:03:52distractors. And so these are pieces of information that are close or similar to what you need the LLM
00:03:58to recall, but not quite right. And we see this a lot with AI coding, especially with larger code
00:04:04bases. We're following the same patterns for things throughout our code base. We have a lot of
00:04:09similarity in how different parts of our code base are implemented. And so large language models will
00:04:14pull the wrong information and be very confident about their fix or implementation. I'm sure you've
00:04:19seen this all of the time. We have this needle in the haystack problem applying all of the time
00:04:24to AI coding. This is the idea of context rot. The larger our window gets, the more the large
00:04:30language model has a hard time pulling out exactly what we need for the current turn with our coding
00:04:36agent. So going back to the diagram, let me get super specific for you. What we're addressing with
00:04:42all of these strategies is the question, how do we keep our context window as lean as possible
00:04:48while still giving the coding agent all of the context it needs? That is the context engineering
00:04:53that we are doing here. And so I'm going to go through every single strategy. And I even have
00:04:57an example for each of them that I'll go through live with you on a complicated code base and all
00:05:02of the commands and rules and docs that I use as an example, I have in this folder that I'll link
00:05:06to in the description. So you can use all of these strategies conceptually, but also with these
00:05:12commands as an example that I have in the .claude folder right here. All right, so let's get into the
00:05:17individual strategies now. So W stands for write, I for isolate, S for select and C for compress. And
00:05:24of course we will start with the W here, which is writing, externalizing our agent's memory.
00:05:30As much as possible, we want to capture key decisions and what the agent has been working on
00:05:34so that in future sessions we can catch our agent up to speed a lot faster and have to spend less
00:05:40tokens upfront, having the agent understand what we really need it to do. And so the first strategy
00:05:46here is to use the git log as long-term memory. And I absolutely love this because there are so
00:05:52many people that love to over engineer and have super complicated memory frameworks for their
00:05:56coding agents, but really everyone's already using git and GitHub for version control. And so we can
00:06:01take advantage of a tool that we're already using to provide long-term memory to our agent. Let's go
00:06:07into our code base and I'll show you what I mean. So the code base I'm going to be using for all the
00:06:12examples here is the new Archon. And I've been working my butt off on this the last few months
00:06:18behind the scenes. This is your AI command center where you can create, manage and execute longer
00:06:23running AI coding workflows. And we're even working on a workflow builder. It's going to be like the
00:06:28N8N for AI coding. And so we can kick off workflows. We can view the logs and monitor them in our
00:06:33mission control. We can look at past runs to see exactly what happened. Like this is a very long
00:06:39workflow that I have to validate entire pull requests in my code base. So yeah, you can tell
00:06:44from looking at this and a lot more in Archon coming soon, by the way, but you can tell from
00:06:47looking at this that there are a lot of moving parts. This is a very complicated code base. So
00:06:51it makes for a good example for everything I'm going to cover with you here, all of the strategies.
00:06:57And so going back to git as long-term memory, I'll show you an example right here of a one-liner
00:07:03for all of my recent commit messages. And what I want to point out here is that we have a very
00:07:09standard way of creating these commit messages. So we have our merges, but we also have all these
00:07:13feature implementations and fixes. And so I have things very standard because that way I can rely
00:07:19on the commit messages to tell my coding agent what I've worked on recently, because a lot of
00:07:24the time that will guide us for what we want to work on next. And the reason I have this so
00:07:29standard is because there is a commit command that I run. Now, running a git commit is very easy,
00:07:36but if we want to standardize the message and have the coding agent help us with that,
00:07:40having a specific command is very powerful. So I have this full implementation that I did here
00:07:46in a single context window with the coding agent. I'm at the end now where I am ready to run my
00:07:51commit. And so if I just run slash commit, that's all I have to do. It's running this command that
00:07:55has the standardization for how I document any work that I did. And then also anything I did to improve
00:08:01my rules or command. So it's a two-part command. Here's what we built. Here's how we improve the
00:08:06AI layer. And so it's going to make this commit and I'll show you what it looks like after.
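For reference, a custom slash command in Claude Code is just a markdown file in `.claude/commands/`. The exact contents of the `/commit` command aren't shown on screen, so this is only an illustrative sketch of what a two-part version could look like:

```markdown
<!-- .claude/commands/commit.md — hypothetical sketch, not the author's actual file -->
Stage and commit the work from this session.

1. Run `git status` and `git diff` to review what changed.
2. Write a commit message with:
   - A standard prefix: feat | fix | test | refactor | docs
   - A short summary line, then bullet points with details.
3. If any rules or commands in `.claude/` were improved this session,
   add a final section describing how the AI layer evolved.
```

The key idea is that the command file, not the human, enforces the message convention, so every commit becomes usable long-term memory.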
00:08:10All right. So now looking at our commit message, we can see that we made some test improvements
00:08:14to the CLI. So a really nice prefix then getting into the details. And then also, so the coding
00:08:19agent knows how its own rules and commands are evolving over time. We include that in the commit
00:08:23message whenever we find an opportunity to improve, let's say our plan command, for example. And of
00:08:29course this commit command is one of the resources that I have for you in the repository. If you want
00:08:33to use this as a starting point, but I also encourage you to customize what your commit
00:08:37messages look like. The important thing here is we standardize the messages. We make them very
00:08:41detailed so we can use it as long-term memory. All right. So the second Write strategy is to
00:08:47always start a brand new context window whenever you are writing any code. No matter what I'm working
00:08:53on, my workflow is always, I have one conversation to plan with the coding agent. I'll create some
00:08:57kind of markdown that has my structured plan. And then I'll send that in as the only context to a new
00:09:03session going into the implementation. And so it's very important here that your spec has all of the
00:09:08context the agent needs to write the code and do the validation. So for example, in this conversation,
00:09:14I am just doing planning. So I run my prime to start. I'll talk about this in a little bit.
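Sketching this plan-then-implement flow end to end (the `/prime`, `/plan`, and `/execute` command names come from the video; the plan file path is illustrative):

```markdown
<!-- Session 1: planning only -->
/prime
/plan add a workflow builder to Archon
<!-- agent writes a structured spec, e.g. plans/workflow-builder.md -->
<!-- end this session; its research never reaches the implementation window -->

<!-- Session 2: brand-new context window -->
/execute plans/workflow-builder.md
<!-- the spec is the only context passed in -->
```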
00:09:18I load in context and then I create my plan with this command. So it's another one that I have as
00:09:24the resource for you. This essentially walks through for the coding agent. Here's the exact structure
00:09:28that we want to create for our single markdown document. So going from our short-term memory
00:09:33into a single document. And then we end the session here. We go to a brand new context window
00:09:38and we go with our implementation. So I have my execute command. And then this is where I can
00:09:42specify the path to my structure plan. No other context because this should have everything that
00:09:48it needs. This is very important because it keeps our coding agent extremely focused on the task at
00:09:53hand. There can be a lot of research and other things that muddle the context window
00:09:57if we implement in the same place that we plan. So the last W strategy that I have for externalizing
00:10:03agent memory is progress files and decision logs. You'll see this all the time with more elaborate
00:10:08AI coding frameworks where you have like a handoff.md or a todo.md communicating between
00:10:13different sub-agents or agent teams, even just between different agent sessions. When you're
00:10:17running low on context, a lot of times you want to create this summary of what was just done. So
00:10:22you can go to a fresh session because you're starting to see that context rot with the agent
00:10:27hallucinating as you have these longer conversations. Now, obviously it's ideal to just avoid these longer
00:10:33conversations, but sometimes you need to have them. For example, something I do with Archon a lot is
00:10:38I'll have it use the Vercel agent browser CLI to perform end-to-end testing within the browser. And
00:10:44so I have it go through a bunch of different user journeys and testing edge cases. It takes a lot of
00:10:49context. You can see at the bottom here, I ran a slash context and we're already at 200,000 out of
00:10:56the new 1 million limit. This fills up so quickly. And once you start to have a few hundred thousand
00:11:01tokens in the context window, that's when you see the performance start to degrade for the agent. So
00:11:05I can simply run a slash handoff. This command is going to create a summary that it can now pass into
00:11:11another session so that agent can continue the work. But now it doesn't have hundreds of thousands of
00:11:16tokens of tool calls and things like that sitting in its window. And this handoff command is really
00:11:21just walking through a process of here's exactly what we want to put in this document. So the next
00:11:25agent has what it needs. All right. So that wraps up our W and each one of these strategies is very
00:11:31important because we are logging key decisions for future agent sessions to quickly pick up on.
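The handoff document itself can be simple. The full template isn't shown in the video, so this is an illustrative sketch of what a `/handoff` command might produce (the file names and paths here are made up):

```markdown
<!-- handoff.md — illustrative template -->
## What we were doing
End-to-end testing of the workflow editor with the browser CLI.

## Completed
- Happy-path user journey: all steps pass.

## In progress / next steps
- Edge case: deleting a node with active connections still errors.

## Key decisions and context
- Autosave is debounced at 500ms; see src/editor/autosave.ts (hypothetical path).
```

A fresh session primed with a few hundred tokens like this can continue work that previously occupied hundreds of thousands of tokens of tool calls.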
00:11:36And I know I'm going quick here. So let me know in the comments, if there's any one of these
00:11:40strategies that you want me to make an entire video on, cause I definitely could for each of these.
00:11:45And so now we get into the I for isolate using sub agents. I love using sub agents for all things
00:11:52research, using them pretty much every single session. The important thing here is keeping
00:11:56your main context clean. We can use sub agents to perform tens or even hundreds of thousands of
00:12:03tokens of research across our code base or the web. And then just giving the needed summary to our main
00:12:10Claude Code context window. So instead of loading in tens of thousands of tokens of research into our
00:12:16main context window, it is now only something like 500 tokens. So we still get the core information
00:12:21that we need, but we have a 90.2% improvement according to some Anthropic research on using sub
00:12:28agents to load in context upfront for our research, instead of having our main agent taking care of
00:12:33everything. So let me give you an example of this really quick. It's always at the start of the
00:12:38conversation or before that structure plan I covered earlier, like I'm in the planning process. That is
00:12:43when I use sub agents very heavily. Watch this. I want to build a workflow builder into Archon.
00:12:50So I want you to spin up two sub agents, one to do extensive research in the code base to see how we
00:12:55would build in a workflow builder and what that means for Archon, and then spin up another sub agent
00:13:01to do web research on best practices for the tech stack. Like if I want to use React, what library
00:13:06should we use? And generally, how do we build workflow builders like Dify or N8N? So I'm just
00:13:12using my text to speech tool here. Send off the prompt. There we go. And so not only do we get to
00:13:16the benefit of isolation, but also speed because it's going to use these sub agents in parallel,
00:13:21come back with a summary, and then my main agent will synthesize all that and give me the final say.
00:13:26So there we go. Both of the sub agents are running in parallel behind the scenes. We can go and view
00:13:31the logs for each of them as well. And then it'll come back at the end once they're done with the
00:13:36final report. All right, our sub agents finished. And instead of using hundreds of thousands of tokens
00:13:41in our main context window, which that is how much the sub agents did with their research,
00:13:46we only used 44,000 tokens, only 4% of our window so far. That is the power of sub agents. I don't
00:13:53recommend them for implementation because usually you want all the context of what you did. But for
00:13:57research, it is very powerful. So yeah, isolation and sub agents are very important for your planning
00:14:04process. The other way that we can use sub agents is with what I like to call the scout pattern. We
00:14:09want to send scouts ahead before you commit your main context. There might be parts of your code
00:14:14base or documentation that you want sub agents to explore to see if it is relevant to load into your
00:14:21main Claude code session. So it can kind of make the decision ahead of time. Like yes, we should
00:14:25bring this in for our larger planning or no, we should skip it. It isn't relevant. For example,
00:14:30with Archon, I have a few markdown documents that are very deep dives into certain parts of the code
00:14:36base, not the kind of context we want in our rules because we don't need it all the time. But sometimes
00:14:41you might want to load this and you can imagine this being something in Confluence or Google Drive,
00:14:45like wherever you store your context. And so going back to this main conversation,
00:14:48I can just say, spin up a sub agent to research everything in my dot Claude slash docs. Are there
00:14:54any pieces of documentation here that we would care about loading into our main context for planning?
00:14:59And I can send this in, it'll make the decision and then load in what I care about. So right here,
00:15:04we kicked off an explore sub agent. It found all of our documentation, recommended loading one.
00:15:09And then I said, yep, go ahead and load it. This is really important for what we're planning here.
00:15:13So instead of just doing sub agents for research, sometimes we have entire pieces of documentation
00:15:18that we think are crucial for our main context window. That's when we want to use the scouting
00:15:23pattern. So that is everything for isolation. Remember to use sub agents for your research
00:15:28and planning very extensively. And now that brings us into the S for select. Load your context just in
00:15:34time, not just in case. And what I mean by that is if you're not 100% confident that a piece of
00:15:40information is important to your coding agent right now, then you shouldn't bother loading it. And we
00:15:46have a layered approach to help with this. And so we start with our global rules. These are our
00:15:51constraints and conventions that we always want our coding agent to be aware of. And so you want this
00:15:57file to be pretty concise; usually between 500 and 700 lines is what I go for. A lot of people
00:16:02advocate for even less, but you have things like your architecture, the commands to run, things
00:16:08like your testing and logging strategy. This is my example from Archon, but these are the things that
00:16:12you want your coding agent to be aware of all of the time. And then we have our layer two. So our
00:16:18on-demand context, as I call it, these are rules that apply only to specific parts of the code base.
00:16:23Like if we're working on the front end, which you aren't always, but if you are, here are the global
00:16:28rules for the front end, or here are the global rules for building API endpoints. So we add this
00:16:33onto our global rules for specific task types, because we aren't always going to be working on
00:16:38the front end, for example. To show you one example of this, we have the workflow YAML reference that
00:16:43I pulled just a little bit ago with the Explorer sub-agent. So when we are working on the workflows,
00:16:48then we care about this, but we don't want this in our global rules because most of the time
00:16:52when we're working on Archon, we're not actually working on this specific part of the code base. And
00:16:57so it's on-demand context. Then the third layer that we have here is skills. This is very popular
00:17:05with Claude Code and beyond right now. We have the different stages here where the agent is going to
00:17:10explore the instructions and capabilities in the skill as it deems that it actually needs it. So
00:17:15we start with the description. This is a very small amount of tokens loaded in upfront with our global
00:17:20rules. If the agent decides it wants to use this skill, then it'll load the full skill.md,
00:17:25which can also point to other scripts or reference documents that we'd want to load if we're going
00:17:29even deeper into the skill. And so as an example of that, I have my agent browser skill. This is
00:17:35what I use for my browser automation for all my end-to-end testing I was showing earlier. I use
00:17:40this every single day. And so whenever I am doing my end-to-end testing, then I want to load this
00:17:46instruction set so the agent understands how to use the agent browser. And then finally for the fourth
00:17:52layer here, I have prime commands. So everything else I've covered here is static documentation
00:17:57that we're going to update every once in a while. But sometimes we need our agent to do exploration
00:18:02of our live code base. We need to make sure that all of its information is completely up to date
00:18:07and we're willing to spend some tokens with sub-agents upfront making that happen. That's
00:18:11what the prime command does is we are exploring our code base at the start of our planning process
00:18:16so it understands our code base going into what we want to build next. And as you can see in my
00:18:22commands folder I have many different prime commands because there are different parts of the code base
00:18:27I want the agent to understand depending on what I want to build. And so my generic prime command is
00:18:32this one we're looking at right here. I just tell it to get an understanding of the Archon code base
00:18:36at a high level. And so step by step here is what I want it to read through including the git log
00:18:41because that is important for using our git log as long-term memory. I also have a specialized one
00:18:47prime workflows for when I know that I'm working on the workflow engine in Archon. So a very similar
00:18:53command but just more specialized. So I use this at the start of the conversation so that my agent can
00:18:58quickly load everything it needs. I can confirm it understands my code base then I get into the
00:19:03planning process that I was showing you earlier. So as a super quick summary global rules are always
00:19:09loaded. On-demand context when you know you're about to work on a part of the code base that
00:19:13is documented separately. Skills when you need different capabilities like okay it's time to do
00:19:18end-to-end testing let's load the skill for the agent browser. And then prime commands I will
00:19:22usually run at the very start of a conversation to set the stage for my planning. So that is
00:19:28everything for select. Now we'll go to compress and this is actually the fastest section to cover
00:19:34because you shouldn't need to compress often if you're doing the right isolate and select
00:19:39well. If we are doing all the other strategies to keep our context lean we are avoiding this and
00:19:46this is good because you want to avoid compressing as much as possible. If you must compress then
00:19:52there are a couple of strategies to cover here. And those two strategies are the handoff and a
00:19:56focused compaction. So let's get into Claude Code and take a look at this. So the handoff we already
00:20:02covered it's one of our right strategies. We summarize everything that we just did to hand
00:20:06off to another agent or the same agent after memory compaction. And then we have the built-in
00:20:12compact command in Claude Code. This is going to summarize our conversation, then wipe the
00:20:18conversation and put the summary at the top of our context window. Now the handoff is really
00:20:23powerful because that's where we get to define our own workflow for how we remember information. But
00:20:28the slash compact is very useful as well especially because we can optionally provide summarization
00:20:34instructions. When I absolutely have to compact, I will use this every single time. For example, focus
00:20:41on the edge cases that we just tested. So now, when it creates that summary, it's going to pay
00:20:48more attention to that part of its short-term memory. I didn't spell it right, but that's totally
00:20:53fine. It'll run the compaction here. And so the handoff and slash compact are kind of either-or.
00:20:58But I definitely find times where I want to use both. The handoff especially when you run into a
00:21:03compaction more than twice usually that conversation is getting way too bloated so you want to start a
00:21:09fresh session with the handoff. But if I'm just doing it once a lot of times I am okay running a
00:21:14slash compact once. But usually after a compact I will still ask the agent to summarize what it
00:21:19remembers so I can make sure that it truly understands, right? Like, "what do you remember
00:21:24here?" Something like that. And so yeah, it really isn't ideal. Avoid compaction as much as possible.
00:21:30The best compression strategy is not needing compression. All right, so that is the WISC
00:21:36framework. I know it was a lot so I hope that you found this helpful and let me know if there's any
00:21:41one strategy that you want me to dive into deeper because I could make an entire video on any one of
00:21:46these strategies. But this is the WISC framework. I hope that you can use this to take you to the
00:21:52next level of Claude Code or really any AI coding assistant. And so if you found this video helpful
00:21:59and you're looking forward to more content on AI coding and being able to apply these kinds of
00:22:04frameworks in practice I would really appreciate a like and a subscribe. And with that I will see you
00:22:09in the next video. Psst! I've got one last thing for you really quick that you don't want to miss.
00:22:14On April 2nd I am hosting a free AI transformation workshop live on my YouTube channel along with
00:22:20Lior Weinstein the founder of CTOX and this is a big deal. Lior is going to teach us how to
00:22:27restructure our entire organization for AI and then I'll teach you how to master the AI coding
00:22:32methodology that I use to build reliable and repeatable systems for my coding agents. And so
00:22:38I'll have a link in the description to this page. It's going to be live on my YouTube channel so you
00:22:42can enable notifications for it by clicking on this button right here. I will see you there!

Key Takeaway

Mastering the WISC framework allows developers to maintain high AI reliability on complex codebases by treating context as a finite, engineered resource rather than just a large storage bin.

Highlights

The WISC framework (Write, Isolate, Select, Compress) is designed to combat "context rot," which causes 80% of AI coding errors.

Using Git logs as long-term memory and standardized commit messages allows agents to recall past project decisions efficiently.

The "Scout Pattern" and sub-agents can perform massive research in parallel, keeping the main context window lean and high-performing.

A layered context approach—Global Rules, On-Demand Context, Skills, and Prime Commands—ensures only relevant data is loaded.

Context engineering is more important than raw token limits, as performance degrades significantly once a window becomes bloated with distractors.

Timeline

Introduction to Claude Code and Context Rot

The speaker introduces his extensive experience with Claude Code, totaling over 2,000 hours since its research preview. He argues that the biggest hurdle in AI-assisted development is not the token limit but a phenomenon called "context rot," where an AI loses accuracy as its memory fills up. Research indicates that just because a model can hold 1 million tokens doesn't mean it should, as performance drops significantly under heavy loads. He introduces the WISC framework as a battle-tested strategy to help users transition from basic users to power users. This section establishes the foundational theory that 80% of agent failures stem from poor context management.

The 'Write' Strategy: Externalizing Agent Memory

The 'W' in WISC stands for Write, which focuses on documenting key decisions to save tokens in future sessions. The speaker demonstrates using Git logs as long-term memory by standardizing commit messages through a custom command that tracks both code changes and AI rule improvements. He emphasizes starting fresh context windows for implementation by passing in a structured Markdown plan created in a separate planning session. This isolation prevents the implementation phase from being muddled by the messy research and brainstorming phase. Additionally, using handoff files or progress logs allows developers to switch sessions without losing critical state information.
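The git-log-as-memory idea is easy to try in any repository. Here is a throwaway sketch (the repo path and the feat/fix prefixes are illustrative conventions, not the author's exact format):

```shell
# Throwaway demo: standardized commit messages become searchable long-term memory.
mkdir -p /tmp/wisc-demo && cd /tmp/wisc-demo
git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "feat(workflows): add parallel step execution"
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "fix(cli): handle missing config file on first run"

# One line per commit: the history a prime command can read to catch an agent up.
git log --oneline
```

Because every message follows the same prefix-plus-detail shape, a single `git log --oneline` gives the agent a dense, cheap summary of recent work.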

The 'Isolate' Strategy: Leveraging Sub-Agents

Isolation involves using sub-agents to perform heavy lifting and research outside the main conversation window. By spinning up parallel sub-agents for codebase exploration and web research, the user can reduce a 100,000-token research task into a 500-token summary for the main agent. This section introduces the "Scout Pattern," where sub-agents pre-screen documentation or files to determine if they are worth loading into the primary context. This method achieved a 90.2% improvement in context efficiency according to cited industry research. It ensures the main agent stays fast and focused on the core logic rather than being distracted by raw data.
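In Claude Code, a reusable sub-agent can also be defined as a markdown file with YAML frontmatter under `.claude/agents/`. The video spins up sub-agents ad hoc through prompting, so this named `code-explorer` agent is an illustrative sketch rather than the author's setup:

```markdown
---
name: code-explorer
description: Reads large parts of the codebase and returns a short summary
tools: Read, Grep, Glob
---
You are a research sub-agent. Explore the files you are asked about, then
reply with a concise summary (a few hundred tokens at most): what exists,
where it lives, and what is relevant to the main task. Do not modify files.
```

The restricted tool list and the "reply concisely" instruction are what keep the token-heavy exploration isolated from the main context window.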

The 'Select' Strategy: Just-in-Time Context

Selecting context requires a layered approach where information is loaded only when absolutely necessary. This starts with a concise Global Rules file (500-700 lines) for core architecture and conventions followed by On-Demand context for specific modules like front-end or API work. The speaker also details 'Skills,' which are specialized instruction sets for tools like browser automation that the agent only accesses when needed. Finally, 'Prime Commands' are used at the start of sessions to give the agent a high-level update on the live codebase state. This hierarchy prevents the AI from being overwhelmed by 'distractors' that are similar but irrelevant to the current task.
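The four layers map naturally onto a project layout. The specific file names below echo ones mentioned in the video (the workflow YAML reference, the agent browser skill, the prime commands), but the exact structure is an illustrative sketch:

```text
your-project/
├── CLAUDE.md                   # Layer 1: global rules, always loaded
└── .claude/
    ├── docs/
    │   └── workflow-yaml.md    # Layer 2: on-demand context, loaded when relevant
    ├── skills/
    │   └── agent-browser/
    │       └── SKILL.md        # Layer 3: description loaded upfront, body on demand
    └── commands/
        ├── prime.md            # Layer 4: explore the live codebase at session start
        └── prime-workflows.md  # Specialized prime for the workflow engine
```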

The 'Compress' Strategy and Final Framework Summary

The 'C' for Compress is described as a last resort because the best compression strategy is to avoid needing it entirely through the first three WISC steps. If compression is necessary, the speaker suggests using the built-in '/compact' command with specific instructions to prioritize certain memories, such as edge cases. He warns that repeated compaction leads to information loss and that starting a fresh session with a handoff file is usually superior. The video concludes with a summary of the framework and an invitation to a live AI transformation workshop. The speaker encourages viewers to treat their context window as a precious resource that requires active engineering to maintain productivity.
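The two compression moves look like this at the Claude Code prompt (`/compact` is built in; `/handoff` is the custom command from the video, and the focus text is just an example):

```text
# Option A: built-in compaction with summarization instructions
/compact focus on the edge cases we just tested and the fixes we applied

# Option B: custom handoff, then a fresh session
/handoff
# → writes a summary document; start a new session and pass that document in
```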
