Is Your AI Code Producing CRAP? (Here's How To Fix It)

Better Stack
Computers/Software · Management/Leadership · AI/Future Technology

Transcript

00:00:00Today, I want to talk about CRAP. And no, it's not that kind of CRAP. I'm talking about the
00:00:05abbreviation, which stands for Change Risk Anti-Patterns Index. And it's designed to find
00:00:12risky functions in your code, which are highly complex, but poorly tested. It's not a particularly
00:00:18new concept, but one that has gotten my attention recently, thanks to a package released by
00:00:24Alexander Prokhoranko called Cargo CRAP, which identifies these key functions in Rust code.
00:00:31The original idea for the CRAP metric comes from Alberto Savoia and Bob Evans, who invented the
00:00:37metric back in 2007 while experimenting with automated developer testing tools. But Alexander
00:00:44recently reignited attention to this forgotten metric by writing this insightful blog post about
00:00:49it and how nowadays when almost all of the code is written by AI agents, it's more important than ever
00:00:55to scan your code base for these hidden issues. It's a very cool concept and we're going to explore it
00:01:01in more detail in today's video. So let's dive into it. To understand why this matters, let's look at
00:01:11this function on my screen. It handles a multi-step data transformation with deeply nested match
00:01:16statements, a few loops, and plenty of error handling paths. So its cyclomatic complexity is quite high, around 15.
00:01:24Now if you're not familiar with the term cyclomatic complexity, it's basically a fancy way of measuring
00:01:30how many different paths a piece of data can take through your code. Every time you write an if statement
00:01:36or a match or a while loop or a catch block, you're creating a fork in the road and the more forks you have,
00:01:43the higher the complexity score goes. And the harder it becomes for a human brain to map out every single possible outcome
00:01:51in a function. So that's why whenever possible, we try to split functions into smaller tasks.
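
To make the counting concrete, here is a minimal sketch of a small function with its decision points marked. The function `sum_valid_readings` is hypothetical and is not the one shown in the video. The counting convention assumed here (start at 1, add 1 per loop or branch) is a common one, but tools differ on what they count, so a real analyzer may report a slightly different number.

```rust
/// Hypothetical example: sums the readings that are within `limit`.
/// Assumed counting convention: start at 1, add 1 per decision point.
fn sum_valid_readings(readings: &[i32], limit: i32) -> Result<i32, String> {
    let mut total = 0;
    for &r in readings {
        // +1: the loop may run again or stop
        if r < 0 {
            // +1: early exit on invalid input
            return Err(format!("negative reading: {r}"));
        }
        if r > limit {
            // +1: skip readings above the limit
            continue;
        }
        total += r;
    }
    Ok(total)
}
// 1 (base) + 3 decision points = cyclomatic complexity of 4 under this convention.
```

Calling `sum_valid_readings(&[1, 2, 50, 3], 10)` skips the 50 and returns the sum of the rest, while any negative reading returns an error. Covering all of those forks takes several distinct tests, which is why the count matters.
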
00:01:57And for this function on my screen, a complexity of 15 means that there are 15 linearly independent paths
00:02:04this logic can take from start to finish. Now if this function is fully covered by unit tests,
00:02:09its CRAP score stays at 15. It's complex, but it's safe because we're validating its behavior.
00:02:16And this is what we should expect when we run Alexander's tool, Cargo CRAP. And here we see that the score is 13,
00:02:23not 15. It's probably because the library didn't take the error handlers into account.
00:02:27But anyway, 15 and 13 are quite close. So basically this tool is doing what we expect.
00:02:33But let's look at what happens if someone deletes those tests or if an AI agent generates this function
00:02:39from scratch and skips writing tests completely. So in this example, I'm just going to comment out my tests.
00:02:45And if we rerun Cargo CRAP, suddenly that score shoots past 100.
00:02:51So it uses a straightforward formula to calculate risk, balancing cyclomatic complexity,
00:02:57the number of linearly independent execution paths through your code, against test coverage.
00:03:03So if we look at this formula, C is the cyclomatic complexity of the function and COV is the test coverage
00:03:10expressed as a fraction between zero and one. And the math heavily penalizes complex code that lacks tests.
00:03:17So if your coverage is a hundred percent, the entire first part of the equation drops to zero
00:03:23and your CRAP score simply equals your cyclomatic complexity.
00:03:26But if your coverage drops, the cubic exponent in that first part causes the risk score to skyrocket.
00:03:33And a function with a complexity of 10 and zero coverage yields a CRAP score of 110.
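
The formula is shown on screen in the video but is not read out. As commonly published for the Savoia and Evans metric, and with coverage written as a fraction as described above, it can be stated as follows. The symbols C(f) and cov(f) are notation chosen here for illustration.

```latex
% CRAP score of a function f, with cov(f) in [0, 1]
\[
\mathrm{CRAP}(f) = C(f)^{2}\,\bigl(1 - \mathrm{cov}(f)\bigr)^{3} + C(f)
\]

% Full coverage: the first term vanishes and only the complexity remains
\[
\mathrm{cov} = 1:\quad \mathrm{CRAP} = C^{2}\cdot 0^{3} + C = C
\]

% The example from the transcript: complexity 10, no coverage
\[
C = 10,\ \mathrm{cov} = 0:\quad \mathrm{CRAP} = 10^{2}\cdot 1^{3} + 10 = 110
\]
```

The same arithmetic fits the demo: with the complexity of 13 that the tool reported and no coverage, the score would be 13² + 13 = 182, which matches the score shooting past 100.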
00:03:39And this is good because if you want a code base where let's say, for example,
00:03:43no function exceeds a cyclomatic complexity of five, then this is your base metric to watch out for.
00:03:49And if any function gets higher than five, you know that this is the area to pay attention to.
00:03:55Normally you would allow for a higher complexity. Cargo CRAP has a default threshold set at 30.
00:04:00But I'm just saying that you can set this threshold yourself however you please.
00:04:05And so basically this is a good middle ground metric to pay attention to.
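
As a rough sketch of what checking against a threshold amounts to, the snippet below filters a list of already computed scores. The `FunctionScore` struct and `over_threshold` function are hypothetical names for illustration and do not reflect Cargo CRAP's actual output format or API. Treating a score exactly equal to the threshold as acceptable is also an assumption.

```rust
/// Hypothetical per-function record; not Cargo CRAP's real output format.
#[derive(Debug, Clone, PartialEq)]
struct FunctionScore {
    name: String,
    crap: f64,
}

/// Returns the functions whose CRAP score is strictly above `threshold`.
/// Assumption: a score equal to the threshold still passes.
fn over_threshold(scores: &[FunctionScore], threshold: f64) -> Vec<&FunctionScore> {
    scores.iter().filter(|s| s.crap > threshold).collect()
}
```

With the default of 30 mentioned above, a function scoring 182 would be flagged, while functions scoring 4 or exactly 30 would pass under this sketch.
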
00:04:09And this becomes critical in the era of AI generated code.
00:04:13Because AI agents are incredibly good at spitting out these highly complex syntactically correct code
00:04:20blocks that handle edge cases you didn't even think of, but they're notoriously bad at writing
00:04:25meaningful, robust integration tests unless explicitly forced to.
00:04:30So tools like Cargo CRAP are meant to be run as a second check after running all the unit tests
00:04:37to assess the overall code quality.
00:04:39So basically it acts like a heat map for your technical debt, pointing you directly to the code
00:04:44that is most likely to break during a refactor.
00:04:47And this is also especially helpful if you want to keep your code base well structured
00:04:52in case you need to onboard a new engineer on your team.
00:04:56And we know from anecdotal accounts how insanely out of whack code bases can get with all this
00:05:02AI generated code nowadays if we don't pay attention to it.
00:05:06And sometimes these AI agents also have a tendency to duplicate the same function across multiple files.
00:05:13So I think tools like these and more importantly methodologies like these are important to be aware of
00:05:19and not to forget in our modern coding landscape to keep our code quality high.
00:05:24The same authors who came up with this methodology also released a CRAP metric tool for Java.
00:05:30And honestly, until recently reading Alexander's blog post, I wasn't even aware of this metric.
00:05:34So I'm thankful that his tool made me notice this long forgotten engineering practice.
00:05:40And I'm sure that other programming languages would also benefit from such a tool.
00:05:44So if you're thinking of a fun project to build in your free time,
00:05:48go read Alberto Savoia's original post and build one for another coding language
00:05:53because it could be a very useful utility for a lot of developers.
00:05:57So there you have it, folks.
00:05:58That is crap in a nutshell.
00:06:01That sounded funny.
00:06:02Anyway, what are some other long forgotten engineering practices or metrics you know of
00:06:08that we should be paying more attention to in this new age of agentic coding?
00:06:13Let us know in the comments section down below.
00:06:15And folks, if you like these types of technical breakdowns,
00:06:18please let me know by smashing that like button underneath the video.
00:06:21And also don't forget to subscribe to our channel.
00:06:24This has been Andrus from Better Stack and I will see you in the next videos.

Key Takeaway

Integrating the CRAP index into development workflows identifies high-risk, untested code paths and serves as a vital safeguard against technical debt caused by AI-generated functions.

Highlights

  • The CRAP (Change Risk Anti-Patterns) index quantifies code risk by balancing cyclomatic complexity against test coverage.

  • Cyclomatic complexity measures the number of linearly independent execution paths through a function; every if statement, match, loop, and catch block adds to it.

  • The formula penalizes complex, untested code by squaring the complexity and applying a cubic exponent to the uncovered fraction of the function, causing the risk index to skyrocket as coverage decreases.

  • Cargo CRAP is a tool for the Rust language that identifies high-risk functions by calculating their CRAP score.

  • AI-generated code is often syntactically correct and highly complex but frequently lacks meaningful integration tests, increasing technical debt.

Timeline

Defining the CRAP Metric

  • CRAP stands for Change Risk Anti-Patterns Index.
  • The metric highlights functions that are both highly complex and poorly tested.
  • Cargo CRAP is a specific implementation for Rust codebases.

Alberto Savoia and Bob Evans developed the CRAP index in 2007 while experimenting with automated developer testing tools. With AI agents now generating much of the code, often without adequate tests, tools like Cargo CRAP help scan a codebase for these hidden risks.

Complexity and Risk Calculation

  • Cyclomatic complexity counts the linearly independent paths through a function; each conditional, match, or loop adds to the count.
  • A function with a complexity of 15 contains 15 independent execution paths.
  • The CRAP formula heavily penalizes complex functions that lack sufficient unit test coverage.
  • A function with a complexity of 10 and zero test coverage yields a CRAP score of 110.

Cyclomatic complexity increases every time a developer introduces an if statement, match block, or loop. While a complex function is manageable if fully covered by tests, removing those tests causes the risk score to escalate rapidly due to the cubic exponent used in the calculation. Developers can set custom thresholds to flag areas of the code that require immediate attention.
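
The relationship between complexity, coverage, and a chosen threshold can be sketched in a few lines of Rust. This is a minimal illustration, assuming the commonly published form of the formula (complexity squared, times the cube of the uncovered fraction, plus complexity). The function names `crap_score` and `min_coverage_for` are hypothetical and are not part of Cargo CRAP.

```rust
/// CRAP score: complexity^2 * (1 - coverage)^3 + complexity.
/// `coverage` is a fraction; values outside [0, 1] are clamped (an assumption).
fn crap_score(complexity: u32, coverage: f64) -> f64 {
    let c = f64::from(complexity);
    let uncovered = 1.0 - coverage.clamp(0.0, 1.0);
    c * c * uncovered.powi(3) + c
}

/// Smallest coverage fraction that keeps the score at or below `threshold`.
/// Returns `None` when even full coverage is not enough, which happens
/// when the complexity alone already exceeds the threshold.
fn min_coverage_for(complexity: u32, threshold: f64) -> Option<f64> {
    let c = f64::from(complexity);
    if c > threshold {
        return None;
    }
    if complexity == 0 {
        return Some(0.0);
    }
    // Solve c^2 * u^3 + c <= threshold for the uncovered fraction u.
    let max_uncovered = ((threshold - c) / (c * c)).cbrt();
    Some((1.0 - max_uncovered).clamp(0.0, 1.0))
}
```

Under these assumptions, a function with a complexity of 15 needs roughly 59% coverage to stay at a threshold of 30, while a function with a complexity of 40 cannot get there through tests alone and has to be split up.
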

Managing AI-Generated Technical Debt

  • AI agents frequently produce complex, syntactically correct code without including necessary integration tests.
  • Running the CRAP tool after unit tests acts as a heat map for technical debt and refactoring risk.
  • The methodology helps maintain a well-structured codebase for future engineering onboarding.

Tools like Cargo CRAP serve as a secondary check to ensure code quality in an era where AI-generated output is prevalent. Because AI models often neglect robust testing, these metrics identify functions that are likely to break during future refactoring. Expanding these metrics to other programming languages remains a useful opportunity for developers looking to improve modern coding standards.
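
The heat map idea can be approximated by ranking functions from riskiest to safest. The sketch below assumes per-function complexity and coverage are already available from some other tool. `FunctionStats` and `rank_by_risk` are hypothetical names, and the scoring uses the commonly published form of the CRAP formula rather than Cargo CRAP's own code.

```rust
/// Hypothetical input record; field names are illustrative only.
struct FunctionStats {
    name: &'static str,
    complexity: u32,
    coverage: f64, // fraction between 0.0 and 1.0
}

/// CRAP score: complexity^2 * (1 - coverage)^3 + complexity.
fn crap_score(complexity: u32, coverage: f64) -> f64 {
    let c = f64::from(complexity);
    let uncovered = 1.0 - coverage.clamp(0.0, 1.0);
    c * c * uncovered.powi(3) + c
}

/// Returns (name, score) pairs sorted from highest to lowest score.
fn rank_by_risk(functions: &[FunctionStats]) -> Vec<(&'static str, f64)> {
    let mut ranked: Vec<(&'static str, f64)> = functions
        .iter()
        .map(|f| (f.name, crap_score(f.complexity, f.coverage)))
        .collect();
    // total_cmp gives a total order on f64, so the sort cannot panic.
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1));
    ranked
}
```

The top entries of the returned list are the functions to test or simplify first, since they combine high complexity with low coverage.
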
