Bumblebee: The Open-Source Scanner for Messy Dev Machines
Better Stack
Transcript
00:00:00You know what's annoying about supply chain attacks? By the time everyone is panicking,
00:00:04the question is not, is production safe? It's, did anyone install this thing locally?
00:00:09This is Bumblebee. It's a new open source tool from Perplexity that scans your dev machine for
00:00:15packages, extensions, and MCP configs without running your package managers or executing
00:00:21project code. So instead of looking around manually, you get a local inventory in seconds.
00:00:26I'm going to run it live. Then we'll talk about where it actually works and where it doesn't.
00:00:36Now, the old model was simple. Scan the repo, scan the container, scan production.
00:00:41But that's not how many of us work anymore. Today, one laptop can have package managers,
00:00:46browser extensions, editor extensions, AI coding tools, local agents, all of this living together.
00:00:53That is a lot of trust packed into one machine. Perplexity built Bumblebee internally for this
00:00:58exact reason, then open sourced it just a few days back. Bumblebee is a read-only single binary scanner
00:01:05that inventories packages, editor extensions, browser extensions, and AI tool configs from local
00:01:11metadata. No npm ls, no pip show, no running random project code, just metadata. Let's try running it.
00:01:19If you enjoy coding tools that speed up your workflow, be sure to subscribe. We have videos coming out all
00:01:24the time. All right. First up to the plate, we got to install this thing with go install from GitHub.
00:01:29That gives us a single Go binary, no daemon, no service. Now let's run the self test. All I got to
00:01:37do for this is run Bumblebee self test. And hopefully we get back self test. Okay. Good. The scanner can
00:01:46detect its known fixture data correctly. That's what this test did. Now let's run a baseline scan.
00:01:52All we're going to do is do Bumblebee scan profile. We're going to say baseline and we're going to write
00:01:57the output to an NDJSON file. This is the scan we use for regular developer endpoint inventory. It checks common
00:02:05global and user-level package roots, editor extensions, browser extensions, and supported MCP
00:02:10configs. Now let's look at the output. I'm going to run head here. And this is the big thing Bumblebee
00:02:17is doing now. Each line we get back is a structured record. So you get the ecosystem, package name,
00:02:25version, source file, confidence level, the metadata, and you get where Bumblebee found it. So now,
00:02:31instead of us asking, do I maybe have this installed somewhere in the system? We can actually now see it
00:02:36right here. And because this is read-only metadata parsing, Bumblebee is not calling npm. It's not
00:02:43importing any Python packages and it's not building your Go project. All it's doing is it's just reading
00:02:50files. And it's why this is useful during an incident. If you have Go installed, this is the
00:02:55point where I'd maybe pause the video, maybe try it on your own machine. It's super easy to spin up.
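To make those structured records a bit more concrete, here is a minimal Python sketch of reading that kind of NDJSON output and counting records per ecosystem. The field names (`ecosystem`, `name`, `version`, `source_file`) and the sample lines are assumptions made up for illustration, not Bumblebee's documented schema, so check the real output before relying on them.

```python
import json
from collections import Counter

# Hypothetical sample records. The field names are assumed for illustration
# and are not taken from Bumblebee's documented schema.
SAMPLE_NDJSON = """\
{"ecosystem": "npm", "name": "left-pad", "version": "1.3.0", "source_file": "/home/dev/app/package-lock.json"}
{"ecosystem": "go", "name": "golang.org/x/text", "version": "v0.14.0", "source_file": "/home/dev/svc/go.sum"}
{"ecosystem": "npm", "name": "lodash", "version": "4.17.21", "source_file": "/home/dev/app/package-lock.json"}
"""


def parse_ndjson(text):
    """Parse newline-delimited JSON: one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def count_by_ecosystem(records):
    """Count records per ecosystem, tolerating records that lack the field."""
    return Counter(r.get("ecosystem", "unknown") for r in records)


records = parse_ndjson(SAMPLE_NDJSON)
counts = count_by_ecosystem(records)
for ecosystem, n in sorted(counts.items()):
    print(f"{ecosystem}: {n}")
```

Because every line is its own JSON object, a reader like this can process the file line by line without loading one giant document, which is the main reason NDJSON is easy to script against.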
00:03:00Okay, cool. But why is this not just another security scanner? Because we already have these. Now,
00:03:06at first glance, you might think a few things. It's another SCA tool, but that's actually not what
00:03:12this is. SCA tools are mostly about your application dependencies. SBOM tools are about what you shipped.
00:03:19EDR is about what you executed. Bumblebee is about the local developer state. So imagine a compromised
00:03:26package advisory drops. You need to know which laptops might be exposed. The obvious move is
00:03:32to ask everyone to run package manager commands, but that's exactly the wrong thing here. If we're
00:03:38looking for something malicious, you don't want your command to accidentally execute the malicious
00:03:42behavior. So Bumblebee is straightforward. Read metadata, emit inventory, match known exposures,
00:03:49and then get out. It's done. It has three scan profiles. First is the baseline. This is your
00:03:55lightweight recurring scan. It looks at global packages, user level tool chains, extensions,
00:04:02and MCP configs. Basically, what normally exists on this developer machine? That's the question
00:04:09it's answering. Then there's the project profile. This is for known workspace
00:04:14directories like code, source, or work. Use this when you care about lockfiles across
00:04:20actual dev folders. And then we can even get it to go deeper. This is the incident response mode.
00:04:26You point it at explicit roots, even something broad like home, usually with an exposure catalog and a
00:04:32duration limit. So your normal workflow might be Bumblebee scan profile baseline. Okay. When something bad
00:04:38happens, you switch to a deeper scan. You can go deeper with this command right
00:04:44here. That's really the process for all this: baseline when things are calm, deep scan when there's smoke.
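The "match known exposures" step can be sketched in a few lines of Python. This is not Bumblebee's own matching logic or catalog format. The catalog layout, the package name `evil-pkg`, and the record fields below are all hypothetical, chosen only to show the idea of checking an inventory against a list of known-bad versions.

```python
import json

# Hypothetical exposure catalog: (ecosystem, package name) -> known-bad versions.
# This layout is an assumption, not Bumblebee's real catalog format.
EXPOSURES = {
    ("npm", "evil-pkg"): {"1.0.1", "1.0.2"},
}

# Hypothetical inventory lines with assumed field names.
INVENTORY_NDJSON = """\
{"ecosystem": "npm", "name": "evil-pkg", "version": "1.0.1", "source_file": "/home/dev/old-clone/package-lock.json"}
{"ecosystem": "npm", "name": "evil-pkg", "version": "0.9.0", "source_file": "/home/dev/app/package-lock.json"}
{"ecosystem": "go", "name": "golang.org/x/text", "version": "v0.14.0", "source_file": "/home/dev/svc/go.sum"}
"""


def find_exposed(ndjson_text, exposures):
    """Return inventory records whose exact version appears in the catalog."""
    hits = []
    for line in ndjson_text.splitlines():
        if not line.strip():
            continue  # skip blank lines between records
        record = json.loads(line)
        key = (record.get("ecosystem"), record.get("name"))
        if record.get("version") in exposures.get(key, set()):
            hits.append(record)
    return hits


hits = find_exposed(INVENTORY_NDJSON, EXPOSURES)
for hit in hits:
    print(f'{hit["name"]}@{hit["version"]} found in {hit["source_file"]}')
```

The sketch only does exact version matches. A real catalog would likely need version ranges too, which is one reason the quality of your advisory data matters so much here.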
00:04:51And the coverage is what makes this really interesting. Bumblebee can look across npm, pnpm, Yarn, Bun,
00:04:58Go modules, you name it. Plus it can look at supported MCP JSON configs. That one is a major feature because
00:05:06nowadays, MCP configs are becoming the new .env files. We have them all over our system. Bumblebee also
00:05:13outputs NDJSON. Now, some people are going to hate that. But another way to look at it is,
00:05:18it means you can pipe it into jq, ship it to a file, collect it through MDM, ingest it into a SIEM,
00:05:25or hand it to another agentic workflow. It's just trying to be boring, scriptable infrastructure. And for this
00:05:32kind of problem, boring is probably best anyways. Now, it's fast. It's really fast. It's a single Go
00:05:38binary with zero non-standard-library dependencies. That is a very dev-friendly starting point. It's
00:05:45also safe by design. The read-only approach is not a small detail. During a supply chain incident,
00:05:51"just run the package manager and see what happens" is not always the best plan. If the package you're
00:05:58looking at has malicious lifecycle scripts or weird plugin behavior, you don't want your scanner to be
00:06:03the thing that accidentally triggers it. Now, this also fills a real gap. Most teams have some visibility
00:06:10into CI, some visibility into containers and production, and some endpoint visibility. But the dev machine can
00:06:17get messy. It has half-finished projects, it has old clones, global packages, test virtual environments,
00:06:23AI tooling, all the stuff that never shows up in your clean official inventory. Bumblebee gives you a
00:06:30practical way to see that local state. And then finally, the AI config coverage is right on time. Local
00:06:36agents, MCP servers, and tool-calling workflows are moving fast. But keep this in mind too while you're
00:06:43using Bumblebee. This is brand new. Like I'm talking super, super new, as it just dropped. So
00:06:49expect changes. It is focused on macOS and Linux right now. The exposure catalog flow is nice, but it
00:06:54also means Bumblebee gets much more useful when you have good advisory data. And it is not EDR, right?
00:07:02It answers a narrower question. What packages, extensions, and dev tool configs are present on this
00:07:09machine, and do any match something that we already know is bad? That's the point. This is not replacing
00:07:14your security stack. It is filling the part your security stack probably doesn't see clearly. So
00:07:19should you actually use Bumblebee? My answer is yes, especially if your day-to-day work
00:07:24touches npm, Go, VS Code, Cursor, Claude, MCP servers, that kind of stuff. Run a baseline scan once a week,
00:07:32right? It's one single command, Bumblebee scan profile baseline, and it's going to do what I showed you here.
00:07:37Now you have a snapshot of what's on your machine. Dump the NDJSON somewhere central.
00:07:43Then when an incident hits, you can search across everything instead of asking everyone in Slack,
00:07:49hey, does anyone have this? Bumblebee tells you what dev machines currently expose through local
00:07:55package metadata, extension manifests, and supported AI tool configs. That is extremely useful in the first
00:08:02hour when anything goes wrong because nobody wants to debate. They want to know who is exposed, where
00:08:08is it, and how fast can you prove it? And for that, Bumblebee is pretty compelling. It's a pretty strong
00:08:14open source tool that we just got. If you enjoy coding tools and tips like this, be sure to subscribe to
the Better Stack channel.
00:08:20We'll see you in another video.