Nvidia's New Tool Just Fixed Agent Skills

Englishالعربية Deutsch Español Français हिन्दी Bahasa Indonesia 日本語 한국어 Русский 中文

Computing/SoftwareSmall Business/StartupsInternet Technology

Transcript

00:00:00Right now, AI agent skills are everywhere. Every agent runs them and you trust them without any

00:00:05checks. But here's the scary part. Researchers studied over 30,000 of these skills and more than

00:00:10a quarter of them had a security vulnerability. So NVIDIA built a tool called Skill Spectre that

00:00:15scans any skill before you install it and tells you exactly how dangerous it is. But here's where

00:00:20it gets interesting. One type of attack can slip right past it and the setting that actually

00:00:24catches it is off by default, so most people never even know it's there. Turning that on normally

00:00:29costs money, but we found a way around it. And by the end, we didn't just scan skills. We built a

00:00:34whole workflow that changes how you find and install them for good. Now before we get into the full

00:00:39workflow, let's give you a quick tour of the tool and what you need to use it. So these are the install

00:00:44commands in the GitHub repo. You can just copy them and hand them to Claude Code and it'll basically

00:00:49install and set up the whole thing for you. Claude Code's gonna install all the dependencies you can

00:00:54see right here. And once all that's done, you can start using Skill Spectre. Inside the GitHub repo,

00:00:59there's this test folder and inside that they've got some dangerous skills you can actually run it on to

00:01:04confirm the tool works. So we ran it on these skills and with every one of them, it tells you not to

00:01:09install. The higher the score, the more dangerous the skill. And with each test, it doesn't just give

00:01:14you a number. It shows you the exact line number, the exact location and the file name where the conflict

00:01:19is, which is basically what pushed the score up. Now this isn't the only way to use the tool, it's got

00:01:24another mode. But before you get why we even need that second mode, you need to know two things: how a skill

00:01:30even attacks you and how this tool actually catches that attack. Now there are 14 categories,

00:01:34but to keep it simple, we've grouped them into six similar ones. So the first way a skill can attack

00:01:39you is with hidden instructions. See, a skill is just a text file full of instructions and your agent reads

00:01:45the whole thing and treats it as orders. The problem is, a bad skill can hide extra instructions in there that

00:01:50you'll never see, but the agent does. They tuck them inside comments, or they use invisible characters,

00:01:55or they scramble the text into a code that looks like nonsense to you, but the AI reads it just fine.

00:02:01So the scanner is built specifically to hunt these hidden instructions down and find them. The second

00:02:06way is impersonation. So your agent has tools it trusts and reaches for by name. Say there's one just

00:02:12called "read" that reads a file for it. So a malicious skill gives its own tool that exact same name,

00:02:17and your agent grabs the bad one thinking it's the safe one it already knows. And the way they pull

00:02:22it off is sneaky. They swap one letter for a lookalike from another alphabet. So they name it "read",

00:02:27but the "A" is actually a Russian letter that looks identical to ours. To you and to your agent at a

00:02:33glance, it's the same word, but underneath it's a completely different tool. And the scanner catches

00:02:38this by checking the real identity of every single character, so it spots that one fake letter and

00:02:43flags it. The third way is when the skill just lies about what it does. The description says one thing,

00:02:48the code does another. So it calls itself a simple formatter and then quietly reaches out to the

00:02:53internet in the background. Or it says it only needs permission to read your files, but the code is

00:02:58actually writing files and running commands too. And this one's way harder to catch. This is where that

00:03:03second mode comes in, but we'll get to that later. The fourth way is the skill steals your credentials.

00:03:08This could be your API keys, your passwords. So a skill goes through all the keys saved on your

00:03:13machine, scoops them up, and sends them off to some server. The fifth way is the skill just runs

00:03:18straight up malware. This includes things like a reverse shell, which basically hands a stranger

00:03:23remote control of your whole computer. And because this kind of malware has known fingerprints,

00:03:28the scanner just matches the code against a big library of those fingerprints. And the sixth way is

00:03:32poison dependencies. So a skill will often use a CLI tool, basically a small outside program it runs in

00:03:39the terminal to handle part of its job. And a bad skill grabs a piece that's actually malicious.

00:03:44Maybe it's a fake package with a name that's one typo off a real popular one. So you pull the wrong

00:03:49one and it runs malware like the last type. So the scanner checks every package the skill pulls in

00:03:54against a live database of known bad ones. And it flags the fake names and those download and run

00:03:59commands to keep your system safe. So in that first mode, it's just matching patterns without any context,

00:04:05which means it ends up flagging stuff that's completely fine. And those are what we call false

00:04:09positives. So that's where the second mode comes in the AI scan and turning it on is simple. You just

00:04:14drop this no LLM flag and it does the second scan here. But if you look inside the code, you'll find out

00:04:20that to run an AI check on a skill, you need to plug in an open AI key. So to get around that cost,

00:04:26we just use Claude Code itself to run that AI check. Now the main agent in Claude Code doesn't actually

00:04:32do it itself. We use Claude's headless mode, which is basically Claude Code running in the background

00:04:38with no chat window, just executing commands on its own. And we're sure most of you know it isn't free,

00:04:43but you do get monthly credits for it with your Anthropic plans. And you can just ask Claude Code to

00:04:48make the change we just talked about and it'll do it for you. Of course you might hit a bug or two,

00:04:52but it's just a single line prompt Claude can set up for you. And if you're enjoying the video so far,

00:04:57subscribe to the channel and hit the hype button. This small gesture of support goes a long way for us.

00:05:03So they've also got dangerous skills in their test folder that actually need the AI check. When you

00:05:07run the no LLM check on one of them, the score comes out as zero, which means it's perfectly safe.

00:05:12But the second you run it with the AI check, the score jumps to 100, it tells you not to install,

00:05:17and it lays out exactly why. But what if instead of just detecting the problems in a skill,

00:05:22the scanner also helped you fix them. So that's exactly why we turned the scanner into a skill. And

00:05:27you might be wondering why is it called Discover Skills? Well, because we didn't just make one

00:05:31separate skill. We made a whole process that helps us discover more skills and make sure they're safe

00:05:36before we install them. So we've been using skills.sh to find new skills for a while now. It's basically a

00:05:42git repo built specifically for skills. So one big shared library you can pull from. And we think they

00:05:47recently shipped a CLI update. So now Claude can just run search queries straight through the command

00:05:53line and pull the best skills it needs before installing anything. And we wanted our scanner

00:05:57running on top of that. So in here, we've got scan.sh, which is the script that actually runs

00:06:02skill specter. Since skill specter is a CLI tool, it has to be run as a command. So we made a whole

00:06:08script and we baked the Claude headless mode fix right into it. So by default, it runs the normal

00:06:13check, but if you want, it'll run the AI check too. And if you open up skill.md, you can see the basic

00:06:19steps laid out. It identifies the target, then scans it, then it shows you the findings. And once it knows

00:06:24what the problems are, it goes ahead and fixes them, then runs the whole loop again after to make

00:06:28sure everything's clean. So for example, this folder we're showing you right now is our AI labs design

00:06:34folder. It's basically our whole design process compressed into one folder with a bunch of skills

00:06:39inside. We've got a whole video on this. And on top of that, the whole system's available in AI labs

00:06:44pro, which is our community. So if you want to support the channel and grab this whole design system,

00:06:49go check it out. And this discovery skill is going to be uploaded in there too. The link's going to be

00:06:54in the description, but we're building on top of this here. So we're adding a new make design.md skill,

00:06:59which lays out the fastest way to pull design tokens out of an app you've already built, basically the

00:07:04colors, fonts, and spacing rules, and merge them into a design.md file. So here we wanted to create

00:07:10the design.md file. So we told it that we wanted to improve it and that it should go search for other

00:07:15tools out there. So it used skills.sh, then we loaded the discovery skill and that pulled back a

00:07:21handful of skills. These are the skills it brought back and the first two looked interesting. So we wanted

00:07:26to dig in. We asked it to install and test both of them. And just like the discover skills workflow

00:07:31says, it won't install any skill without scanning it first. So it installed them and read through them

00:07:36and told us straight up that neither one was going to help with the make design.md skill. But from a

00:07:41safety point of view, the first one got a score of 10, which meant it was safe, and the second got a

00:07:46100, which meant don't install it. So we told it to run the AI check on that second skill. It ran it again

00:07:52through Claude's headless mode and this time the score came back as zero. This means that the skill

00:07:56was safe to use. And that's the whole point of this system. You're not just grabbing skills blindly off

00:08:01the internet. You have a whole process that you can kick off just by using a skill. Now let's have a

00:08:06word from our sponsor. Nimblist. If you use Claude code or codex, you know the problem. You've got multiple

00:08:12sessions running, files changing everywhere, and you're constantly switching between terminal, browser,

00:08:17and editor just to keep track of what your agents are doing. Nimblist is an open source visual workspace

00:08:23that puts everything in one place. I had three agents working on different parts of a project at

00:08:28the same time and instead of jumping across windows, I could see all of them on a Kanban board, jump into

00:08:33any session, review code changes as red and green diffs, and approve or reject them individually. I was

00:08:38editing markdown docs, UI mockups, and architecture diagrams visually right alongside my agent. When I was

00:08:45done, I didn't have to clean up commits manually because it generated git commit messages automatically

00:08:50based on what changed. Tasks stayed connected to the actual sessions and there's even a mobile app to

00:08:56continue the session while you're away from your desk. Nimblist is completely free and open source

00:09:00and you can check it out by using the link in the pinned comment. That brings us to the end of this

00:09:05video. If you'd like to support the channel and help us keep making videos like this, you can do so by

00:09:10using the super thanks button below. As always, thank you for watching and I'll see you in the next one.

Key Takeaway

Implementing a automated security scanning workflow like Skill Spectre, which combines pattern matching with AI intent analysis, prevents the installation of malicious AI agent skills that often harbor hidden instructions or deceptive file permissions.

Highlights

Security analysis of over 30,000 AI agent skills revealed that more than 25% contain security vulnerabilities.
Skill Spectre scans potential AI skills for risks before installation, flagging specific file locations and line numbers for identified conflicts.
Malicious skills often employ hidden instructions in comments, character spoofing using lookalike letters from other alphabets, and unauthorized API key exfiltration.
The tool utilizes a two-mode scanning system: pattern matching for known malware signatures and an AI-driven check for intent-based deception.
Claude Code's headless mode provides an alternative to paid OpenAI keys for performing the AI-driven security validation of skills.
A structured 'Discover Skills' workflow automates the search, verification, and installation process for AI agent skills to eliminate manual, blind reliance on third-party code.

Timeline

Vulnerability landscape of AI agent skills

Researchers identified that over 25% of 30,000 examined AI skills contained security vulnerabilities.
Skill Spectre functions as a scanning tool to evaluate the safety of AI skills prior to local installation.
The scanner provides granular feedback, including exact file names, line numbers, and danger scores for detected threats.

AI agents rely on external skills often installed without adequate security checks. Skill Spectre mitigates this risk by analyzing code structure and providing clear, actionable danger reports. The tool integrates easily via GitHub and command-line instructions, allowing users to verify potentially dangerous code before deployment.

Threat vectors and detection mechanisms

Hidden instructions embedded in comments or scrambled text allow malicious skills to manipulate AI agent behavior.
Impersonation attacks utilize character spoofing, replacing standard characters with lookalikes from other alphabets to deceive agents.
Malicious skills frequently perform unauthorized actions, such as stealing API keys, executing reverse shells, or pulling poisoned dependencies.

Skills attack users through deceptive methods ranging from hidden code execution to spoofing tool names. The scanner identifies these threats by validating character identity, matching code against known malware fingerprints, and checking external package requests against databases of known malicious sources.

AI-driven validation and automated workflows

Pattern matching alone can produce false positives, requiring a secondary AI-based scan to verify intent.
Claude Code's headless mode allows for AI-driven security checks without incurring additional OpenAI API costs.
Integrating the scanner into a 'Discover Skills' process ensures that every skill is verified before it is permitted to run in the environment.

Standard scanners can misinterpret safe code as malicious. By adding an AI-driven scan, the system can distinguish between safe functions and genuine threats. This entire process is automated into a workflow where skills are searched, scanned, and only approved if they pass safety protocols.

Community Posts

No posts yet. Be the first to write about this video!

Write about this video