This Is Why Claude Code Fails In AI Coding

AAI LABS

Transcript

00:00:00One single file decides whether the product you get is actually the right implementation
00:00:04you needed.
00:00:05For Claude Code users that file is CLAUDE.md. Other agents have their own files, but most commonly
00:00:10they use AGENTS.md.
00:00:11But no matter which one you are using, unless you set it up properly you will keep fighting your
00:00:15agent on every task.
00:00:17And if you think that running a simple init command is enough, you are
00:00:20wrong.
00:00:21You need to follow a structured pattern tailored to the project that actually makes your agent
00:00:26perform better.
00:00:27For that reason we compiled the best practices you need to follow, drawn from credible sources
00:00:31as well as from our own hours spent with Claude Code, so that you can plug them directly into
00:00:35your workflow.
00:00:36And the last one is important because it determines how well your agent actually follows your instructions,
00:00:41and if it is not followed, the rest of the instructions in your file won't be that impactful.
00:00:45The first thing that you need to add to your CLAUDE.md file comes straight
00:00:49out of Andrej Karpathy's skills repo, which contains the best CLAUDE.md patterns he talks about.
00:00:54You need to add an explicit instruction telling Claude to think before coding.
00:00:58It makes Claude state its assumptions explicitly.
00:01:01If multiple interpretations exist, it should present them all so that we can choose from
00:01:05the set of implementations.
00:01:07This pushes Claude to consider different perspectives before it actually dives into the
00:01:11solution.
00:01:12This ensures that the solution it implements is aligned with what you wanted.
00:01:15This line cuts a huge amount of course correction out of our workflow which is why we found it
00:01:20so helpful.
00:01:21With this instruction added, whenever you ask Claude to implement a feature it will
00:01:25ask a set of questions related to the task you gave it, so that your answers guide
00:01:29how it actually does the task.
00:01:31This part in particular is helpful because now Claude will not guess the implementation
00:01:35and dive straight in from patterns it has memorized from its training data.
00:01:39It will think thoroughly about what the right implementation is, confirm with you that it is
00:01:43the intended one, and only then work on the feature, instead of wildly
00:01:48guessing and forcing you to interrupt Claude because it did not follow the implementation
00:01:52you had in mind, which otherwise happens often and requires a lot of course
00:01:56correction.
00:01:57The next rule is choosing simplicity first.
00:01:59It's something so simple, yet it still needs to be stated explicitly in CLAUDE.md so that
00:02:04the agent is properly reminded of this principle.
00:02:07Claude, like any other agent, tends to write large solutions for problems that can be solved with
00:02:11simple ones.
00:02:12But this isn't only a problem because it causes delays.
00:02:15It also makes the code hard to refactor later on and even harder to extend, because
00:02:19the implementation is so verbose that it consumes a lot of tokens to implement simple things.
00:02:24So this line pushes Claude to iterate towards simplicity.
00:02:27We also add this on every project and especially when working on large scale applications because
00:02:32at that scale doing this becomes more important.
00:02:35This specifically tells Claude not to add any features beyond what is asked and to ensure
00:02:39proper error handling for the implementation.
00:02:41The discipline around it is basically a hard threshold.
00:02:44If the solution to a problem you ask about takes 200 lines and could be refactored
00:02:49to 50, then Claude needs to rewrite the solution because its approach is wrong.
00:02:54This prevents Claude from writing a lot of useless overhead code, with things
00:02:58that aren't even implementable and that reflect the wrong direction Claude chose.
00:03:03The third part of CLAUDE.md is making surgical changes or, in simple words, touching
00:03:08only those parts that the agent absolutely must touch.
00:03:11This addresses something we face a lot when Claude is writing a large amount of code at
00:03:15once, and the fix needs to be explicitly stated in the CLAUDE.md file.
00:03:19Claude, like agents in general, when asked to do one task tends to try to improve the things
00:03:24around that task too.
00:03:25These improvements might look like adjacent code changes or reformatting the codebase, which
00:03:29we don't actually want it to focus on at the moment.
00:03:32This is annoying because Claude's attention gets divided across the multiple things it's
00:03:36trying to implement at once.
00:03:37Letting these kinds of changes through is not good because Claude is basically including things
00:03:41we didn't want it to do right now.
00:03:43So we need to state explicitly in CLAUDE.md that it should not do that.
00:03:47If the agent notices any unrelated dead code, it should mention it instead of fixing it itself.
00:03:52At times these kinds of things are there for specific reasons that are to be addressed later,
00:03:56not at the stage the app is currently in.
00:03:58The mental framework that lets Claude decide how to act is to check every change
00:04:03and see if it actually traces back to what the user asked.
00:04:06If it does, then it should make that change.
00:04:08If it does not, it should not touch that code.
00:04:10With this line added, whenever Claude is asked to fix an issue in the implementation, it will
00:04:14change only the thing the user asked it to fix.
00:04:17It will then tell you about the other issues it found in the same file, and you can decide
00:04:21from there whether you actually want it to fix those.
00:04:24The last pattern that was extracted from Andrej Karpathy is goal-driven execution.
00:04:29Agents do not know what the correct output looks like, which is the core problem.
00:04:33They would work much more effectively if they did, and that's what this rule fixes.
00:04:36In the CLAUDE.md file, we need to explicitly tell Claude to define the success criteria
00:04:41for each task we give it.
00:04:43That way, for any task we hand over, Claude needs to convert it into a verifiable goal.
00:04:47For example, if you give it a task to add validation, it will write tests for the invalid inputs
00:04:52and make sure those tests actually pass, with the right return values for the right inputs.
00:04:57So the whole idea is to have the agent implement test cases and then iterate until all the test
00:05:01cases pass and at the end, the project has the same behavior we actually need from it.
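
The validation example can be sketched as a verifiable goal in Python. This is a minimal illustration only: `validate_age`, its 0 to 150 range, and the input lists are hypothetical and do not come from the video. The point is that the success criteria exist as checks the agent can run and iterate against.

```python
def validate_age(value):
    """Return the age unchanged, or raise ValueError for invalid input."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError("age must be an integer")
    if not 0 <= value <= 150:
        raise ValueError("age must be between 0 and 150")
    return value


def rejects(value):
    """True when validate_age raises ValueError for the given input."""
    try:
        validate_age(value)
    except ValueError:
        return True
    return False


# Success criteria stated up front (hypothetical values):
# every invalid input is rejected, every valid input is returned as-is.
INVALID_INPUTS = [-1, 151, "42", None, 3.5, True]
VALID_INPUTS = [0, 30, 150]


def goal_met():
    """The task counts as done only when this returns True."""
    all_rejected = all(rejects(v) for v in INVALID_INPUTS)
    all_accepted = all(validate_age(v) == v for v in VALID_INPUTS)
    return all_rejected and all_accepted
```

An agent working this way would keep changing `validate_age` until `goal_met()` returns True, rather than stopping once the function merely exists.
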
00:05:06If you give it any prompt on a task, it will set the verifiable goal and plan out the implementation.
00:05:11Then it will verify the work for you by adding all the test cases and showing how it will
00:05:15handle the whole app in essence.
00:05:17Now this works for logic, but if you want the agent to verify how your
00:05:21UI looks, the agent cannot write test cases for that.
00:05:23So for that you can add the Claude Chrome extension or the Puppeteer MCP so that it can verify
00:05:28how the UI looks using those tools.
00:05:30This helps because UI changes are hard to judge by looking at the code itself, and giving the
00:05:35agent a way to see the app's current visuals lets it use what it sees to fix
00:05:40the issues.
00:05:41You can therefore explicitly add a line so it knows that after a UI implementation it
00:05:45also needs to verify the result through the MCP.
00:05:48If you have created a CLAUDE.md file using Claude Code's own init command, you will have seen that
00:05:53it adds commands for running the dev server and the build server.
00:05:57But those are already in its training data. Claude already knows those commands, and
00:06:01we don't need to waste lines in CLAUDE.md telling it what it already knows.
00:06:05So in your file you only need to mention the tools you want Claude to use instead of the
00:06:09ones it defaults to.
00:06:11There are certain CLI tools that make the workflow faster but are not in Claude's training
00:06:16data or the patterns it already relies on.
00:06:18You have to add those explicitly so that Claude knows those tools are installed
00:06:22and doesn't fall back to whatever it uses on its own all the time.
00:06:26For example, if you have installed the GitHub CLI, you can
00:06:30add an instruction in CLAUDE.md to use it instead of the default git commands for
00:06:36the operations it covers.
00:06:37Similarly you can add in more commands which are not the default ones.
00:06:41You also need to add in the running instructions for the project in this file if they are different
00:06:45from the usual ones.
00:06:46For example most projects in the default setup are run by npm and if your project runs with
00:06:51pnpm, you need to add this information so the agent knows what commands are actually meant
00:06:56to be run.
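
One way to decide whether the file needs such a line is to look at the project's lockfile. The Python sketch below assumes the standard lockfile names each package manager writes. The helper names and the wording of the generated line are hypothetical, not something the video prescribes.

```python
from pathlib import Path

# Standard lockfile names; the lookup order is an assumption.
LOCKFILES = [
    ("pnpm-lock.yaml", "pnpm"),
    ("yarn.lock", "yarn"),
    ("package-lock.json", "npm"),
]


def detect_package_manager(project_dir):
    """Return the package manager implied by the first lockfile found, or None."""
    root = Path(project_dir)
    for lockfile, manager in LOCKFILES:
        if (root / lockfile).exists():
            return manager
    return None


def claude_md_line(manager):
    """Return an instruction line only when the manager is not the npm default."""
    if manager is None or manager == "npm":
        return None
    return f"- Use `{manager}` for all install and run commands, not `npm`."
```

For an npm project the function returns nothing, which matches the advice not to spend lines on what the agent already assumes.
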
00:06:57Commands that Claude already knows should not be included in the CLAUDE.md
00:07:01file.
00:07:02The next addition to CLAUDE.md is inspired by the creator of Claude Code and the workflow
00:07:07he revealed.
00:07:08He talked about how CLAUDE.md is not a write-once, use-forever file.
00:07:12It is something that constantly needs to be changed, updated and improved over the course
00:07:16of building, an ongoing process that you iterate on again and again.
00:07:20So you need to add an instruction that if the user has to tell Claude that its implementation
00:07:25was not correct, it should first apply the corrections the user pointed out.
00:07:29Once Claude has applied those corrections, it should also add those learnings to a dedicated
00:07:33file so that Claude can gradually build a knowledge base of what it should not do and what the
00:07:38correct way of doing things is, which it can reference later on as required.
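
The learnings file can be as simple as an append-only list of dated entries. The Python sketch below shows one possible shape. The file layout, the entry wording, and the `record_learning` helper are all assumptions made for illustration.

```python
from datetime import date
from pathlib import Path


def record_learning(path, mistake, correction, today=None):
    """Append one correction to the learnings file, creating the file if needed."""
    path = Path(path)
    today = today or date.today()
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("# Learnings\n\n", encoding="utf-8")
    # One line per correction keeps the file easy to scan and to trim later.
    entry = f"- {today.isoformat()}: Don't {mistake}. Instead: {correction}.\n"
    with path.open("a", encoding="utf-8") as handle:
        handle.write(entry)
    return entry
```

A call such as `record_learning("docs/learnings.md", "use npm here", "use pnpm")` would add one dated bullet. The path is a placeholder.
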
00:07:42But before we move forward, let's have a word from our sponsor,
00:07:45Klaus. You've probably heard about AI agents.
00:07:47Maybe you've tried setting one up yourself, and 15 minutes in you're staring at a terminal,
00:07:51pasting API keys into config files, wondering if you just leaked something important.
00:07:56Klaus skips all of that.
00:07:57Klaus runs OpenClaw, the open source AI agent on the cloud.
00:08:00You sign up, you get $15 in OpenRouter credits and you start prompting.
00:08:04No terminal, no docker, no API key scavenger hunt.
00:08:07I tested it by asking Klaus to scrape a startup directory, organize the results into a table
00:08:12and email it to me.
00:08:13One prompt in the chat window, done.
00:08:15No code, no browser extensions.
00:08:17It comes with built-in tools like Exa and Apollo and connects to Slack, WhatsApp, even
00:08:21iMessage.
00:08:22Everything runs on a firewalled machine, completely isolated from your personal accounts.
00:08:27If something breaks, their autofix agent clawbert patches it without you touching anything.
00:08:31Click the link in the pinned comment and try Klaus for free.
00:08:35Since most coding projects are managed with Git, you need to explicitly add an instruction
00:08:39in CLAUDE.md that Claude should not run irreversible commands without confirmation.
00:08:44If there is a need to run such a command, the agent must ask for permission first.
00:08:48These commands are dangerous because once they are executed, the consequences are irreversible
00:08:53and they can cause damage to production.
00:08:55Examples are force pushing, resetting HEAD, merging branches or running remove commands with the force
00:09:00flag.
00:09:01You also need to add an instruction that if Claude is unsure whether a command is destructive
00:09:04or not, it should ask instead of assuming.
00:09:07This will save you a lot of trouble.
00:09:08For example, if Claude tries to merge a branch that you do not want it to merge,
00:09:12it will ask for permission before doing so and you can then deny it so that your work
00:09:16stays safe.
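
The decision rule behind that instruction can be written down as a small classifier. The Python sketch below is an illustration of the rule, not of how Claude Code evaluates commands. The safe-command allowlist and the exact flags checked are assumptions, and "resetting HEAD" is read here as `git reset --hard`.

```python
import shlex

# Read-only git subcommands treated as safe; this allowlist is an assumption.
SAFE_GIT = {"status", "diff", "log", "show"}
FORCE_PUSH_FLAGS = {"--force", "-f", "--force-with-lease"}


def is_forced_rm(args):
    """True for flags such as -f, -rf, -fr or --force."""
    for arg in args:
        if arg == "--force":
            return True
        if arg.startswith("-") and not arg.startswith("--") and "f" in arg:
            return True
    return False


def classify(command):
    """Return 'run', 'confirm' or 'ask' for a shell command string."""
    parts = shlex.split(command)
    if not parts:
        return "ask"
    program, args = parts[0], parts[1:]
    if program == "rm":
        return "confirm" if is_forced_rm(args) else "ask"
    if program == "git" and args:
        subcommand = args[0]
        if subcommand == "push" and FORCE_PUSH_FLAGS.intersection(args):
            return "confirm"
        if subcommand == "reset" and "--hard" in args:
            return "confirm"
        if subcommand == "merge":
            return "confirm"
        if subcommand in SAFE_GIT:
            return "run"
    # Anything not recognised falls through to asking, never to assuming.
    return "ask"
```

The last line carries the second half of the instruction: a command the rule does not recognise is treated as a question for the user, not as safe by default.
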
00:09:17There is no need to put every piece of information into a single CLAUDE.md file, because that will
00:09:22just bloat it unnecessarily and distract the agent from what it actually needs to do.
00:09:27So you need to create path-scoped rule files that declare their scope on the first line
00:09:31and contain instructions tailored to exactly those files.
00:09:34You also need to mention the location of these files in CLAUDE.md so Claude knows they exist.
00:09:40For example, if you want Claude to follow specific instructions when writing APIs, you
00:09:44can put those in a rule file for the API code so that when Claude is working on it, it can load
00:09:48those instructions and use them directly.
00:09:50Just as importantly, this also ensures that API-related instructions do not interfere
00:09:55when Claude is not working on the API.
00:09:56You can have multiple rule files for different parts of the project, each containing instructions
00:10:00tailored to that specific area.
00:10:02This way, Claude only loads the relevant instructions when it is working on that part.
00:10:06That prevents context bloat and keeps the agent focused on its current task instead
00:10:11of being distracted by unrelated rules.
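
To make the scoping idea concrete, the Python sketch below selects the rule files whose declared scope matches the file being edited. The `scope:` first-line format, the file names, and the rule text are all hypothetical. The actual rule-file syntax depends on the agent, so check its documentation before copying this shape.

```python
from fnmatch import fnmatch

# Hypothetical rule files, keyed by name, with the scope on the first line.
RULE_FILES = {
    "rules/api.md": "scope: src/api/*\nValidate every request body.\n",
    "rules/ui.md": "scope: src/components/*\nUse the shared design tokens.\n",
}


def parse_scope(text):
    """Return the glob declared on the first line, or None if there is none."""
    first_line = text.splitlines()[0] if text else ""
    prefix = "scope:"
    if first_line.startswith(prefix):
        return first_line[len(prefix):].strip()
    return None


def rules_for(path, rule_files=RULE_FILES):
    """Return the names of the rule files whose scope matches the given path."""
    matches = []
    for name, text in rule_files.items():
        scope = parse_scope(text)
        # fnmatch's * also matches path separators, so nested files match too.
        if scope and fnmatch(path, scope):
            matches.append(name)
    return sorted(matches)
```

Editing `src/api/users/routes.ts` would pull in only `rules/api.md`, while a file outside both scopes loads no extra rules at all.
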
00:10:13Many large-scale applications live in a monorepo, which is a single large repository where
00:10:18all the different components are kept together, with each folder acting as a separate part
00:10:22of its own and each part being managed independently while contributing a different aspect of
00:10:27the main application.
00:10:28So if you are running a project from a monorepo, you need to make sure that each sub-project
00:10:32contains its own CLAUDE.md file with instructions specific
00:10:37to it, so it does not have to rely only on the instructions from the global CLAUDE.md.
00:10:42The global file should only contain instructions that are broadly applicable to all parts of
00:10:47the system.
00:10:48Scoped CLAUDE.md files work better because they can contain instructions that are specific
00:10:52to that particular app or module.
00:10:54This allows the agent to perform better because it has more focused guidance.
00:10:58Placing all the instructions for a large project in the main file is therefore the wrong move.
00:11:02It bloats the file, and when Claude passes over instructions
00:11:07that don't concern the current task, its attention can drift from what it
00:11:11actually needs to do.
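
The layering of a global file plus scoped files can be pictured as a walk from the repository root down to the folder being worked on. The Python sketch below illustrates that idea only. The `instruction_files` helper and the example folder names are hypothetical, and it is not a description of how Claude Code itself discovers files.

```python
from pathlib import Path


def instruction_files(working_dir, repo_root, filename="CLAUDE.md"):
    """List instruction files from the repo root down to working_dir, root first."""
    root = Path(repo_root).resolve()
    current = Path(working_dir).resolve()
    if current != root and root not in current.parents:
        raise ValueError("working_dir must be inside repo_root")
    chain = [current, *current.parents]
    # Keep only directories at or below the root.
    chain = [d for d in chain if d == root or root in d.parents]
    # Reverse so broad rules come first and the most specific file comes last.
    return [d / filename for d in reversed(chain) if (d / filename).is_file()]
```

Working in a hypothetical `apps/web` folder would yield the root file followed by `apps/web/CLAUDE.md`, while a sibling folder without its own file would see only the global one.
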
00:11:12Also if you are enjoying our content, consider pressing the hype button because it helps us
00:11:16create more content like this and reach out to more people.
00:11:19You also need to add the project description to your CLAUDE.md file and make sure that this
00:11:24description is placed at the very start of it, not buried among the rest of the
00:11:29instructions,
00:11:30because the agent gets the gist of what the whole app is about by reading it first.
00:11:33It then understands how the app is structured, what it does in general, what
00:11:38the different services and dependencies are and how the app runs.
00:11:41This way, it knows from the start, instead of having to read the code to deduce what the
00:11:45app does.
00:11:46Another section to add to your CLAUDE.md file states that Claude needs to verify
00:11:50not only that the feature exists but also that it functions as intended before reporting
00:11:55any task as complete.
00:11:57It should use all available verification mechanisms to confirm that the build and tests pass.
00:12:02The point of this section is making sure the task is actually complete by using real
00:12:07verification steps, not just by checking that the code for the feature exists.
00:12:11This instruction pushes Claude to report more faithfully and to use multiple
00:12:15types of checks, like unit tests, linting and type checks, to make sure that the app is implemented
00:12:20correctly and works as intended.
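
A completion gate of this kind can be sketched in a few lines of Python. The commands in `EXAMPLE_CHECKS` are placeholders that only exist so the sketch runs anywhere. A real project would list its own test, lint and type-check commands, and the helper names here are hypothetical.

```python
import subprocess
import sys


def run_checks(checks):
    """Run each (name, argv) check and return a {name: passed} mapping."""
    results = {}
    for name, argv in checks:
        completed = subprocess.run(argv, capture_output=True, text=True)
        results[name] = completed.returncode == 0
    return results


def is_complete(results):
    """A task counts as complete only if at least one check ran and all passed."""
    return bool(results) and all(results.values())


# Placeholder commands standing in for a project's real checks.
EXAMPLE_CHECKS = [
    ("unit tests", [sys.executable, "-c", "assert 1 + 1 == 2"]),
    ("lint", [sys.executable, "-c", "import sys; sys.exit(0)"]),
]
```

Treating an empty result set as incomplete is a deliberate choice in this sketch: having run no checks is not the same as having passed them.
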
00:12:23Last but not least, the way you order your instructions in the CLAUDE.md file is also
00:12:27very important for agent performance.
00:12:29You have to order them by priority.
00:12:31The first instructions should be hard rules, meaning non-negotiable, with no exceptions
00:12:36whatsoever.
00:12:37These hard rules should always come first, before any other rules.
00:12:40Then come the medium-priority rules, which are not as strict as the previous ones.
00:12:44They are somewhat negotiable but still important and should not be violated lightly.
00:12:48After that come the low-priority instructions, which mainly include references and conveniences
00:12:52that the agent can look up when needed but should not use as a core decision source.
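
The ordering can be enforced mechanically if the rules are kept as tagged entries. The Python sketch below renders them into Markdown sections in priority order. The three tier names, the section headings, and the sample rules in the usage note are assumptions for illustration.

```python
PRIORITY_ORDER = ["hard", "medium", "low"]
HEADINGS = {
    "hard": "Hard rules (no exceptions)",
    "medium": "Important rules",
    "low": "References and conveniences",
}


def render(rules):
    """Render (priority, text) pairs as Markdown sections, highest priority first."""
    lines = []
    for priority in PRIORITY_ORDER:
        items = [text for tier, text in rules if tier == priority]
        if not items:
            continue
        lines.append(f"## {HEADINGS[priority]}")
        lines.extend(f"- {item}" for item in items)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```

Given rules in any order, for example a low-priority documentation pointer listed before a hard rule about force pushing, the output still opens with the hard-rules section.
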
00:12:57One more important thing is to make sure the CLAUDE.md file is kept short.
00:13:01A best practice is to keep it under a strict limit of 300 lines, which is considered optimal
00:13:06for agent performance.
00:13:07Once it gets longer than that, performance starts to degrade.
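
The line budget is easy to check automatically, for instance before committing changes to the file. In the Python sketch below, the 300-line figure comes from the video, while the helper name and the choice to treat exactly 300 lines as acceptable are assumptions.

```python
def check_length(text, limit=300):
    """Return (line_count, within_limit) for the text of an instruction file."""
    line_count = len(text.splitlines())
    # Counting exactly `limit` lines as acceptable is an assumption.
    return line_count, line_count <= limit
```

Reading the file with `Path("CLAUDE.md").read_text()` and passing the result in would report the current count alongside a pass or fail flag.
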
00:13:10The CLAUDE.md file talked about here and all other resources mentioned are available
00:13:15in AI Labs Pro, for this video and for all our previous videos, where you can download
00:13:20them and use them in your own projects.
00:13:21If you found value in what we do and want to support the channel, this is the best way
00:13:25to do it.
00:13:26The link is in the description.
00:13:27That brings us to the end of this video.
00:13:29If you'd like to support the channel and help us keep making videos like this, you can do
00:13:33so by using the super thanks button below.
00:13:35As always, thank you for watching and I'll see you in the next one.

Key Takeaway

Optimizing agent performance requires a modular, priority-ordered CLAUDE.md file under 300 lines that enforces explicit thought processes, verifiable goals, and surgically scoped changes to prevent inefficient token consumption and guessed implementations.

Highlights

  • A properly configured CLAUDE.md file acts as a mandatory instruction set that prevents AI coding agents from guessing implementations or diverging from project goals.

  • Instruction sequences within CLAUDE.md must be ordered by priority, placing non-negotiable hard rules at the top to ensure consistent agent behavior.

  • Implementing 'think before coding' instructions forces the agent to explicitly state assumptions and present multiple implementation options before acting.

  • Defining success criteria as verifiable goals, such as mandatory test case implementation, forces agents to validate features functionally rather than just checking code existence.

  • Maintaining a CLAUDE.md file under 300 lines is optimal, as exceeding this length causes agent performance to degrade.

  • Scope-specific rule files prevent context bloat by loading only relevant instructions for specific project modules instead of cramming all details into a single global file.

Timeline

Structuring the CLAUDE.md Foundation

  • A structured CLAUDE.md file is necessary to prevent persistent friction with AI coding agents during tasks.
  • Explicit instructions to 'think before coding' force agents to state assumptions and present multiple implementation options.

Running an initial setup command is insufficient for high-performance AI agents. Directing agents to pause and articulate their reasoning before writing code reduces the need for constant course correction. This approach prevents agents from relying solely on training data patterns and encourages alignment with specific user requirements.

Efficiency Principles and Surgical Changes

  • Agents must be instructed to prioritize simple solutions over verbose, complex implementations that consume excess tokens.
  • Explicit instructions to perform only requested changes prevent agents from wasting capacity on unauthorized codebase formatting or unrelated improvements.

Default agent tendencies often lead to bloated code that is difficult to refactor. Setting a hard line in CLAUDE.md, such as requiring a rewrite when a 200-line solution could be refactored to 50, enforces simplicity. Additionally, directing agents to mention unrelated dead code instead of fixing it preserves context and focus.

Goal-Driven Execution and Verification

  • Agents should convert every task into verifiable goals by implementing test cases that must pass before a task is reported complete.
  • UI-specific verification can be achieved by integrating tools like the Claude Chrome extension or Puppeteer MCP.

The absence of predefined success criteria causes agents to guess what constitutes a finished task. Requiring the implementation of test cases ensures the final output matches intended behavior. For visual elements where code tests are insufficient, auxiliary tools provide the agent with the necessary visibility to verify results.

Tool Configuration and Iterative Knowledge

  • Instructions should only specify tools and commands that deviate from the agent's default training data.
  • Learning files allow agents to build a persistent knowledge base of corrections to avoid repeating previous errors.

Redundant instructions for standard operations like running dev servers only bloat the configuration. Instead, focus on defining specific CLI tools like the GitHub CLI or project-specific package managers like pnpm. Storing implementation corrections in a dedicated file helps the agent build an evolving, project-specific knowledge base.

Advanced Scaling and File Maintenance

  • Use path-scoped rule files and sub-project CLAUDE.md files to prevent global file bloating in monorepo environments.
  • Organize instructions by priority, starting with hard rules, followed by medium-priority rules, and ending with low-priority convenience references.
  • Maintain the total line count of CLAUDE.md under 300 to avoid performance degradation.

Managing complex projects requires moving away from a single, all-encompassing file. By scoping rules to specific directories, instructions remain relevant and lightweight. Strict adherence to a 300-line limit and priority-based ordering ensures the agent correctly interprets the most important constraints first.
