00:00:00This is Shannon, an open-source autonomous AI pen tester that performs code analysis and executes
00:00:05live exploits using browser automation to find all kinds of security vulnerabilities from server-side
00:00:11request forgery to cross-site scripting to SQL injection and much much more, giving you a detailed
00:00:17comprehensive security report with zero false positives. But with the announcement of Claude
00:00:22Code Security and also the fact that Shannon is built on the Claude SDK, meaning you can't use
00:00:27your subscription, is there any point in learning a tool that might not be around for that long?
00:00:31Hit subscribe and let's get into it.
00:00:32In one of my previous jobs, we paid thousands of dollars to external pen testers before a
00:00:38major release only to find out there were bugs we needed to fix and then we'd get it tested again,
00:00:43costing us lots of time and of course money. But this is exactly what Shannon exists to fix.
00:00:48You can run Shannon as many times as you want. You can even put it in a CI/CD pipeline and run
00:00:53it automatically. And because it's open source, it's completely free. Well, there is a paid version,
00:00:58which we'll talk about later. But as someone who's not a security expert, I'd rather run my project
00:01:03through Shannon than boot up Kali Linux. In fact, let's see Shannon in action. So Shannon is built
00:01:08using the Anthropic Agent SDK. So you're going to need a Claude API key for it to work. Unfortunately,
00:01:13the subscription won't work either, but I've got it installed on a VPS using a non-root user and I'm
00:01:20going to run it against the OWASP Juice Shop, which is an app designed to have loads of vulnerabilities
00:01:25for testing reasons. Now I've already cloned the Shannon repo, which you'll need if you want to run
00:01:30it. And for this to work, you'll have to have your repository you want to test inside the Shannon
00:01:34repos directory. So I've got Juice Shop inside here. And with the Juice Shop project running,
00:01:39I'm going to run this command, which will connect to the app running locally for browser testing
00:01:44and connect to the repo inside the repos directory to scan through the code. Now, if this is the first
00:01:50time you're running Shannon, because it uses Docker Compose, it will first have to pull a bunch of
00:01:54images from Docker Hub. But because I've already gone through that process, it just jumps straight
00:01:58to here. We get a link to the Temporal workflow and we can view it using the web UI, which looks
00:02:03like this, showing all the steps that need to take place. Or we could run this command to view the logs
00:02:07in real time, which I sometimes prefer since the web UI doesn't always show the most information.
00:02:12But wait, what's Temporal? I thought we were talking about Shannon. Well,
00:02:16Shannon pen tests can take one or many hours depending on the size of the project, and Temporal
00:02:21ensures durable execution, no matter the scenario. So if your computer crashes midway through a pen
00:02:26test or you run out of Claude credits and need to top up, you don't lose any progress. Temporal
00:02:32remembers exactly where you left off and restarts Shannon from that checkpoint. Let me know in the
00:02:36comments if you want a dedicated video about Temporal, but it also orchestrates all of Shannon's phases
00:02:42and activities. And even though there are only five phases, a lot of things happen inside them.
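The checkpoint-and-resume idea described above can be illustrated with a minimal Python sketch. This is not Temporal's API and not Shannon's code: the phase names follow the video, while the `run_phases` helper and the JSON checkpoint file are hypothetical, chosen only to show how progress recorded after each phase lets a restarted run skip finished work.

```python
import json
import tempfile
from pathlib import Path
from typing import List, Optional

# Phase names follow the video; the checkpoint file format is a made-up illustration.
PHASES = ["pre-flight", "pre-recon", "recon", "vulnerability-and-exploit", "report"]


def run_phases(checkpoint: Path, fail_at: Optional[str] = None) -> List[str]:
    """Run phases in order, skipping any already recorded in the checkpoint file."""
    done = json.loads(checkpoint.read_text()) if checkpoint.exists() else []
    executed = []
    for phase in PHASES:
        if phase in done:
            continue  # finished in an earlier run, so it is not repeated
        if phase == fail_at:
            raise RuntimeError(f"simulated crash during {phase}")
        executed.append(phase)  # stand-in for the real work of the phase
        done.append(phase)
        checkpoint.write_text(json.dumps(done))  # persist progress after each phase
    return executed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ckpt = Path(tmp) / "progress.json"
        try:
            run_phases(ckpt, fail_at="recon")  # first run dies part-way through
        except RuntimeError as err:
            print(err)
        print(run_phases(ckpt))  # second run resumes at the phase that failed
```

A real durable-execution engine records far more than a list of finished phases, but the resume behaviour is the same in spirit: the second call starts from the phase that failed instead of from the beginning.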
00:02:47Let me show you. Starting with the pre-flight phase that makes sure API credentials are valid,
00:02:53Docker containers are ready and the repo actually exists. Then the pre-recon stage, which analyses
00:03:00the code to understand how the app works. So architecture, mapping entry points and security
00:03:05patterns. Next is the actual recon stage, which is very different from the pre-recon because here
00:03:12Playwright is used to navigate through the app. So it will click buttons, fill in forms and use
00:03:18that to observe network requests, take screenshots, look at cookies, basically map out all the
00:03:24functionality of the app. And then phase four does five pipelines in parallel. So here we have
00:03:31injection-related vulnerabilities and exploits, then cross-site scripting vulnerabilities and exploits,
00:03:38then authentication, server-side request forgery. And finally we have authorization. So accessing
00:03:45privileged data or other people's information. And all of this happens in parallel on five different
00:03:52agents for vulnerability and then another five for exploits. And finally we have phase five, which
00:03:59compiles everything into a comprehensive pen test report by combining the last five checks. Speaking
00:04:07of report, let's see how our pen test is coming along. So after almost two and a half hours, the
00:04:12full process is complete and we can see here it started with the pre-flight validation before
00:04:17moving on to the pre-recon and then the recon agent. And then here it runs all the vulnerability
00:04:25checks. So we've got the injection vulnerability agent, cross-site scripting, authorization, SSRF.
00:04:31And you can see for some of these, the green line is not solid. This is because it had to retry
00:04:36because I ran out of Claude credits. So you can see there's a two here and for these ones, there weren't
00:04:40any retries. So it may have been faster than two and a half hours if it wasn't for these retries, but
00:04:46I don't think it would have been less than two hours. Anyway, after it's done the five vulnerability
00:04:51checks, it then moves on to do the five exploit checks. So we can see SSRF here, we've got the
00:04:56auth exploit, we've got the injection and so on. And once it's done all of those, we can see the
00:05:02auth exploit takes the longest. It then wraps everything up using the report agent. Now, of
00:05:07course, if we wanted to, we could expand all of this to see more information about each stage, but
00:05:13I'm no expert on Temporal and I'm sure if we go through the documentation, there'll be a lot more
00:05:17on how to use the platform. But let's now take a look at the final report Shannon has generated.
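The ordering seen in the timeline above (five vulnerability agents running in parallel, then five exploit agents) can be sketched with Python's asyncio. This is only an illustration under assumptions: the pipeline names follow the video, while `run_agent` and `phase_four` are hypothetical stand-ins and not Shannon's or Temporal's actual code.

```python
import asyncio
from typing import Dict, List

# Pipeline names follow the video; the agent function is a hypothetical placeholder.
PIPELINES = ["injection", "xss", "auth", "ssrf", "authz"]


async def run_agent(kind: str, pipeline: str) -> str:
    """Pretend to run one agent and return a label for its result."""
    await asyncio.sleep(0)  # placeholder for a long-running agent call
    return f"{pipeline}-{kind}"


async def phase_four() -> Dict[str, List[str]]:
    # Stage 1: the five vulnerability agents run concurrently.
    vulns = await asyncio.gather(*(run_agent("vuln", p) for p in PIPELINES))
    # Stage 2: once they have all finished, the five exploit agents run concurrently.
    exploits = await asyncio.gather(*(run_agent("exploit", p) for p in PIPELINES))
    return {"vuln": list(vulns), "exploit": list(exploits)}


if __name__ == "__main__":
    print(asyncio.run(phase_four()))
```

Because each stage waits for its slowest agent, the total time is driven by the longest-running agent in each stage, which fits the observation that the auth exploit takes the longest.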
00:05:22So here in the deliverables directory of our Juice Shop project, we can see the list of all the
00:05:28reports it's generated. And it's a lot more than I thought it would do. So let's first take a look at
00:05:33this report, which is the auth analysis. And you can see it has a summary at the top. And over here,
00:05:37it's noted that there are 11 critical vulnerabilities identified and we can see what they are.
00:05:43So zero out of six authentication endpoints enforced HTTPS, which makes sense because I was
00:05:47running it locally. And then we also have proper cache control, which it was missing.
00:05:52And the authentication endpoints didn't have adequate rate limiting. This is really
00:05:56detailed. I mean, if you scroll down, we can see exactly what the problems were, where they were,
00:06:01and the endpoints that caused them. Now I'm not going to bore you and go through every single
00:06:05report, but let's go through the summary, which is called Comprehensive Security Assessment Report.
00:06:10And inside here, we have details about the model that was used, the scope of the project. And now
00:06:15if we scroll down, we can see that it found four critical auth vulnerabilities that were fully
00:06:21exploited and lists them down here. Wow, this is very thorough, but take a look at this. If we
00:06:26scroll down even further, it gives us a summary of the report. So this is the first idle one
00:06:31and scroll down even more. We can see exactly how an attacker could exploit this. So the exact curl
00:06:38command they could run with the details and the type of information they could extract. And this
00:06:43level of detail exists for every single vulnerability, which goes to show how much
00:06:48detail went into the assessment. Now, if you're interested, I'm going to leave a link to all the
00:06:54reports in the description. But two and a half hours is a really long time for Claude Sonnet to
00:06:59scan through a repo. Is there anything Shannon Pro could have done to help? So it doesn't look like
00:07:04Shannon Pro could help with the speed, but it does do some other things like provide CVSS scoring,
00:07:09which the basic or free version does not contain. It has CI/CD pipeline support, API access. And more
00:07:16importantly, if you're an enterprise user, you get all the things you'd expect, including OWASP
00:07:22compliance reporting, as well as SOC 2 and PCI DSS. So even though two and a half hours is a really
00:07:27long time, I've done some research and found out the first run of Shannon takes the longest and then
00:07:32subsequent runs are much faster. Now I know what you're thinking. Almost two and a half hours running
00:07:37Claude Sonnet 4.6 on a single pen test. How much did that cost in credits? Let's just say a lot.
00:07:43I topped up about $66 and ended up having this much left. So almost $60 in Claude credits was
00:07:50spent running this pen test, which is cheaper than hiring a human tester, but it's still a lot of
00:07:55money. And I would have loved to use my Claude Pro or Max subscription, which would have made the whole
00:08:00thing a lot cheaper, which is hopefully what Claude Code Security will allow you to do when
00:08:05it's properly released, unless the team at Keygraph end up rewriting Shannon in something like the OpenAI
00:08:10Agents SDK or use the Vercel AI SDK, which allows you to use many more models. But overall,
00:08:16if you're a startup and don't want to spend a lot of money on a human pen tester, then Shannon is a
00:08:21good alternative. If you're an indie hacker with even less money, then maybe hold off and just
00:08:26release the product to see if people actually use it. And while we're on the topic of AI and security,
00:08:30if you want to know how to safely install OpenClaw on a VPS,
00:08:34then check out the next video where I go through step-by-step exactly how to do that.