Wait until AI agents get compromised...

Maximilian Schwarzmüller

Transcript

00:00:00I'm recording this a couple of hours after an extremely devastating supply chain attack
00:00:06started. A supply chain attack that spread out to many more NPM and also Python packages. And at the
00:00:13point of time where I'm recording this, it's not yet clear when and where it'll end. And I did
00:00:19create a separate video on my YouTube channel where I dive deep into this specific supply chain attack
00:00:25because it was quite elaborate. And I do a deep dive there where I explain all the details because
00:00:30that is quite interesting. But here I want to talk about supply chain attacks and security and AI in
00:00:38this age of supply chain attacks and AI in which we're living. Because I'm sure things will get
00:00:45worse. And I fear that many people don't really see all the dangers yet. And there is more we as
00:00:54developers and users of technology and AI, to be honest, have to do. And this affects us even if
00:01:01we're not developers. I know that most people watching or hearing this are developers. But
00:01:07as I will make clear, this is not just about writing code and not just about supply chain
00:01:13attacks as you know them. But let's start with the basics. What is a supply chain attack?
00:01:20A supply chain attack in the context of software development simply means that a dependency you're
00:01:26using is compromised. That is in a nutshell what a supply chain attack is all about. And compromised
00:01:35of course can mean all kinds of things. What we typically see is that we get malicious code in
00:01:41the compromised package that harvests credentials and tokens. So that scans your hard drive to find
00:01:49secrets which you may have in .env files or your AWS credentials and so on. And it then uses those
00:01:56credentials to access your accounts but also to spread itself. So to affect other packages. If
00:02:04you're an open source package maintainer or even if it's closed source, if you're working on something,
00:02:11some package, some tool other people use or depend on, it's of course interesting to compromise your
00:02:19machine to compromise that package or that tool that you're distributing because, guess what,
00:02:26that will then affect more people. So all these supply chain attacks that we see including that
00:02:32supply chain attack that started here with the TanStack packages, they are worms that spread to
00:02:38other packages to affect more and more packages and ultimately of course also machines on which
00:02:44these packages are installed and used. Now there are some things you can do to protect yourself and
00:02:51I created a separate video about that on my other channel, the Academind channel. Things like making
00:02:56sure that you only install packages that are at least three days old or something like this,
00:03:02package versions I mean, running your code in a dev container or a virtual machine.
00:03:08These are all things you should do. You should also not store plain text secrets on your system.
00:03:15Instead use a service like Infisical or Doppler or anything like that where you store secrets
00:03:22in the cloud or in some other form in an encrypted way so that if an attacker does scan your system
00:03:28they don't see those plain text secrets. These are all things you have to do right now. It's important
00:03:37because these supply chain attacks they're getting more. We're seeing more of them and why is that?
00:03:42It's certainly not the case because you weren't able to run attacks like this many many years ago.
00:03:49It was possible back then and it did happen back then but the frequency has dramatically increased
00:03:56of course and AI is a big reason here. So let's take a look at the role of AI. AI is a big reason
00:04:06because of course it makes it easier to run such attacks. If you're an attacker you can of course
00:04:14use AI to analyze all kinds of repositories out there of packages you might want to compromise
00:04:22to see how are they building their packages? How are they distributing their packages? For example
00:04:30the TanStack attack here, which started this recent supply chain attack: there the maintainers used a
00:04:38theoretically secure approach using the trusted publishing process by npm. And again, I do dive
00:04:45deeper into that in my separate video on this channel. But what they also did is they used
00:04:51a certain GitHub Actions event trigger in a certain way where it was not secured perfectly, and that
00:05:00allowed the attacker to use cache poisoning to get malicious code from an untrusted environment
00:05:07into a trusted environment and that is how this attack started. Again details in that other video.
00:05:15But of course AI makes it easier to analyze repositories, to analyze their GitHub Actions workflows
00:05:22or any other CI/CD provider workflows. AI can mass analyze all these workflow scripts, all the code, and
00:05:30it can look for security vulnerabilities and of course maintainers can also use AI to scan their
00:05:38repositories and look for potential attack vectors but as an attacker you're naturally always in the
00:05:45advantage there because you can look for everything you can try out all kinds of things whereas as a
00:05:52maintainer you have to anticipate everything and AI can help with that but it's still not perfect.
00:05:58You have the advantage there as an attacker and AI has simplified that. AI also of course simplifies
00:06:04the process of writing malicious code. It simplifies the process of writing any code. And of course, and
00:06:12you know that if you watched other videos by me or heard other episodes, I'm a big proponent of
00:06:20looking at the code, doing code reviews, not outsourcing everything to AI. But of course
00:06:27it's clear to me at least that you should use AI as a productivity boost, and we're still all figuring
00:06:33out how much usage of AI is right. Some people will tell you 100%, they don't even look at the code.
00:06:40For me that's not the case, but there is a spectrum here. Either way, AI definitely makes it easy to
00:06:46pump out lots of code. And if we're talking about malicious code there, of course there are certain
00:06:53things that are important to you. If you're an attacker, you want code that does the job, that's
00:06:59not super easy to detect, but you don't care if it's beautiful code, if it follows certain best
00:07:06practices. Your best practices are that your attack goes through. And of course AI can help with that.
00:07:13It can help with writing all that malicious code, with coming up with ideas on how you could attack
00:07:19packages. So that is where AI helps. But that's only one part. Making it easier is only one part of the
00:07:26story. The other very important side is that there is more code than ever. So that means there are more
00:07:35targets than ever. I mean, maybe you followed that blog post or the entire story around all the GitHub
00:07:43reliability issues and GitHub downtimes. Well, the reason for that is that there's more code being
00:07:49pushed to GitHub than ever because of AI, because it's easier than ever to generate code and more
00:07:55people than ever are generating code and writing software, including many people that have no idea
00:08:02of what that code does, what it's about. Vibe coding is a big thing and it has its use cases. I mean,
00:08:11if I want to merge five PDF documents into one, I'm very happy telling an AI agent to do that for me,
00:08:18and it will probably then write some code that does it, and I don't care about that code. It's a one-time
00:08:24task, right? But if I run that on my system, then of course the agent may install some package to merge
00:08:32these PDF documents that has been affected by a supply chain attack. So I don't even know that a
00:08:37certain package was used then if I don't look at it, because I just cared about merging some PDF
00:08:43documents. So there are more situations than ever where packages are being installed, because there
00:08:49is more code than ever being written, for software but also for one-time tasks. And that of course
00:08:56makes running such supply chain attacks more attractive than ever before, because there are
00:09:01more targets than ever, including many targets that have absolutely no idea about software security,
00:09:06cyber security or anything like that. And let's be honest, many of us developers too: we may theoretically
00:09:14know about certain risks, but we may not care because it's so convenient to just get the job
00:09:22done. And we have to rethink here. We have to rethink, we have to secure our machines, we have to make sure
00:09:31that we develop in secure environments, so in virtual machines and dev containers, that there
00:09:37are no credentials lying around. And if we use AI agents, which we likely all do, we have to be
00:09:44careful there too, because there too are two ways of being in danger. So let's take a closer look
00:09:53at how AI agents are problematic here. One problem here is what I already mentioned: when we use AI
00:10:00agents, especially when we maybe use them for things that are not directly related to writing code or
00:10:08software, but also when we use them in order to help us work on a program, we don't necessarily see
00:10:17everything they're doing. If you're using Claude Code or anything like that, and I have nothing against
00:10:23these tools, indeed I have courses on Claude Code, Claude Cowork, Codex, I have courses on that because
00:10:30they are very useful, but if you're using them and you just let them go and you tell them "I
00:10:35need this feature" and you don't care too much about it, you might not even realize what they're
00:10:41installing. So again, packages being installed, you may be compromised, you may be affected. Now one
00:10:49defense against that also of course is to limit the amount of packages you want to use. But again,
00:10:54if you're using an AI agent you may not be in control there. It may install packages that you
00:11:00would have never installed. So that's one obvious danger, I guess. Here's the less obvious one: AI
00:11:07agents are super attractive attack targets. What do I mean with that? Well, these supply chain attacks,
00:11:15I mentioned it, spread like worms. They attack or affect all kinds of packages.
00:11:23Now it would of course be very interesting for an attacker to infest Claude Code or Codex or the
00:11:32Pi coding agent or OpenCode or any other agent, any other AI agent. Why? Well, if you had malicious code
00:11:42that is actually optimized for also or exclusively affecting and infiltrating AI agent packages and
00:11:52repositories and code bases, then of course that malicious code could contain prompt injection parts.
00:12:00So it could for example explicitly target all these AI agents to change their code such that it is not
00:12:08primarily about exfiltrating data. So the package code itself, the malicious code that's injected, is
00:12:15not about exfiltrating data, let's say, but it is about tweaking the AI agent code such that it has
00:12:22some special instructions that makes it do stuff on the machine where it's being used, so on your
00:12:28machine for example, that you don't want it to use. Imagine Claude Code having a secret system prompt,
00:12:34which normally would be set by the Anthropic employees, but which now is set by that malicious
00:12:39code, that tells it to ignore whatever you're asking it to do and just fake that it's doing what you're
00:12:46asking it to do, or that it should do what you're asking it to do but that in addition it should scan
00:12:52the system for secrets, that in addition it maybe should write a little program that does the scanning
00:13:00and that then sends that data off to a certain remote server or anything like that. The sky is the
00:13:06limit here, because suddenly you have like a Trojan horse on your system. Suddenly you have an AI
00:13:12agent going rogue on your system, and not because the AI is going rogue, not because the model is
00:13:20bad or wrong, but because the agent code itself and its system prompt or whatever has been affected and
00:13:28has been compromised. That is not an unrealistic scenario, and I guarantee you this will happen at
00:13:34some point. It's such an obviously interesting target. AI agents are such an obviously interesting
00:13:41target. This will happen. We'll see a new level of these supply chain attacks, as they don't just do
00:13:49what they normally do, affect a bunch of packages and harvest credentials, which is already horrible
00:13:54and where the frequency is increasing, but we will also see AI agents going rogue because of malicious
00:14:01code. Only a matter of time. So there are many, many layers here, as you can see, and that is just this
00:14:08new reality in which we're living now. I guess it's a bit like with the early days of the internet: it's
00:14:14all bumpy whilst we're still figuring stuff out, and we'll have to figure out how to ramp up security
00:14:21and how to do stuff securely. And one obvious step, which is true for development but also for running
00:14:26AI agents, is that you don't want to do it in an environment where things can go wrong. You don't
00:14:32want to run it in an environment where you're storing credentials or secrets or any other data
00:14:37that matters to you. You don't want to do it on your main machine. You want to run agents, you want
00:14:42to build software in isolated virtual machines, remote machines, anything like that, where the blast
00:14:49radius is limited. Because again, it's only a matter of time until things will go wrong, and we have
00:14:56to realize that. That's the first important step. Things are changing quickly and security is a
00:15:03huge issue, and it will stay a huge issue and become even more of an issue as AI accelerates, as these AI
00:15:11models get smarter, especially combined with the tools in which they're running, and as this
00:15:17introduces a whole lot of new capabilities, and as at the same time it's so much convenience added by
00:15:23them. Convenience is always dangerous because that makes you get sloppy and overlook stuff. And yeah,
00:15:30AI is everywhere, so many people that don't know anything about cyber security are using it, and even
00:15:34the people that do know a lot about it or something about it are in great danger. So we're in for a
00:15:40hot ride here, I think, and we have to rethink and be super careful where and how we run agents and
00:15:49work on our code.

Key Takeaway

Securing development environments through isolated virtual machines and encrypted secret management is essential as AI mass-produces vulnerable code and creates high-value targets for prompt-injection supply chain attacks.

Highlights

  • Supply chain attacks increasingly target NPM and Python packages by injecting malicious code that harvests credentials from .env files and AWS configurations.

  • A recent attack on TanStack packages utilized cache poisoning through improperly secured GitHub Actions event triggers to move code from untrusted to trusted environments.

  • AI mass-analyzes repository workflows and CI/CD scripts to identify security vulnerabilities faster than maintainers can anticipate and patch them.

  • The volume of code pushed to GitHub is at an all-time high because AI makes it easier than ever to generate code, including throwaway code for one-time tasks like merging PDF documents.

  • Compromised AI agents can have their system prompts altered via malicious code to silently scan systems for secrets while pretending to perform requested tasks.

  • Effective defense requires storing secrets in encrypted services like Infisical or Doppler rather than plain text and running all code in isolated dev containers.

Timeline

Mechanics of Supply Chain Attacks

  • A supply chain attack occurs when a software dependency is compromised to include malicious logic.
  • Malicious code typically scans hard drives for secrets stored in .env files or AWS credential folders.
  • Compromised packages act as worms that spread to other packages maintained on the same infected machine.

Recent attacks on the TanStack ecosystem illustrate how vulnerabilities spread through the software supply chain. Attackers target package maintainers specifically because gaining access to their machines allows for the infection of tools used by thousands of other developers. The goal is often credential harvesting, which provides the access needed to further propagate the malicious code across both open-source and proprietary repositories.
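
To get a feel for what such a credential scan would find, a developer can run a similar check defensively on their own machine. The following is a minimal sketch, not a reconstruction of any real malware: the function name, the list of file names, and the skipped directories are assumptions chosen for illustration.

```python
from pathlib import Path

# Assumption: file names that commonly hold plain-text secrets; extend for your setup.
SECRET_FILE_NAMES = {".env", ".npmrc", ".pypirc"}
# Assumption: directories skipped to keep the scan fast and the output readable.
SKIP_DIRS = {"node_modules", ".git"}


def find_plaintext_secret_files(root: Path) -> list[Path]:
    """Return files under `root` whose names suggest plain-text secrets."""
    hits = []
    for path in root.rglob("*"):
        relative_parts = path.relative_to(root).parts
        if any(part in SKIP_DIRS for part in relative_parts):
            continue
        looks_secret = path.name in SECRET_FILE_NAMES or path.name.startswith(".env.")
        if path.is_file() and looks_secret:
            hits.append(path)
    return sorted(hits)


if __name__ == "__main__":
    # Scan the current project directory and list candidate files.
    for hit in find_plaintext_secret_files(Path(".")):
        print(hit)
```

Every path this prints is something a compromised dependency running with the same user permissions could read just as easily.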

Essential Defense Strategies for Developers

  • Installing only package versions that are at least three days old reduces the risk of using a recently compromised release.
  • Running development environments inside virtual machines or dev containers limits the blast radius of an attack.
  • Replacing plain text secrets with encrypted cloud services prevents attackers from easily scraping credentials.

Immediate security improvements involve changing how developers interact with their local systems and dependencies. Utilizing tools like Infisical or Doppler ensures that even if a malicious package scans a system, it cannot find usable plain text secrets. Isolating the execution environment through virtualization provides a critical layer of protection for the primary operating system and its sensitive data.
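
The "at least three days old" rule can be sketched as a small filter over publish timestamps. The example below assumes input shaped like the `time` field of npm registry package metadata (version mapped to an ISO 8601 timestamp, plus `created` and `modified` entries); the function name and the three-day default are illustrative, and the video does not prescribe a specific tool for this.

```python
from datetime import datetime, timedelta, timezone


def versions_older_than(publish_times: dict[str, str],
                        now: datetime,
                        min_age_days: int = 3) -> list[str]:
    """Return versions published at least `min_age_days` before `now`."""
    cutoff = now - timedelta(days=min_age_days)
    eligible = []
    for version, published in publish_times.items():
        # The registry metadata also carries non-version bookkeeping keys.
        if version in ("created", "modified"):
            continue
        # Normalize a trailing "Z" so older Python versions can parse it.
        published_at = datetime.fromisoformat(published.replace("Z", "+00:00"))
        if published_at <= cutoff:
            eligible.append(version)
    return eligible


if __name__ == "__main__":
    example_times = {
        "created": "2024-12-01T00:00:00.000Z",
        "modified": "2025-01-09T00:00:00.000Z",
        "1.0.0": "2025-01-01T00:00:00.000Z",
        "1.0.1": "2025-01-09T00:00:00.000Z",
    }
    now = datetime(2025, 1, 10, tzinfo=timezone.utc)
    print(versions_older_than(example_times, now))
```

The cooldown does not make a version safe; it only gives the community time to notice and pull a compromised release before it reaches your machine.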

The Role of AI in Scaling Cyberattacks

  • AI simplifies the mass analysis of GitHub Action workflows to find misconfigurations and security holes.
  • Attackers use AI to generate malicious code that bypasses detection without needing to follow standard coding best practices.
  • Automated tools allow attackers to probe thousands of repositories simultaneously for specific vulnerabilities like cache poisoning.

AI has shifted the balance of power toward attackers by enabling the rapid scanning of public repositories and CI/CD scripts. While maintainers can use AI to defend their code, they must protect against every possible vector, whereas an attacker only needs to find one oversight. AI also lowers the barrier to entry for writing functional, though not necessarily high-quality, malicious scripts that fulfill specific attack objectives.
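
Maintainers can run the same kind of mass analysis on their own workflows. The sketch below is a deliberately crude text heuristic. The video does not name the trigger involved in the attack, so `pull_request_target` is used here purely as an assumed, well-known example of a trigger that runs with elevated trust; the function name and the finding messages are likewise illustrative.

```python
import re


def flag_workflow(text: str) -> list[str]:
    """Return heuristic warnings for a GitHub Actions workflow file's text."""
    findings = []
    # Assumption: pull_request_target stands in for "a risky event trigger".
    if re.search(r"\bpull_request_target\b", text) is None:
        return findings
    findings.append("runs on pull_request_target")
    # Referencing the PR head means untrusted code may enter the job.
    if "github.event.pull_request.head" in text:
        findings.append("references untrusted PR head code")
    # Caches shared between jobs are one way untrusted content can persist.
    if re.search(r"uses:\s*actions/cache@", text):
        findings.append("uses actions/cache in the same workflow")
    return findings


if __name__ == "__main__":
    sample = """
on:
  pull_request_target:
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event.pull_request.head.sha }}
      - uses: actions/cache@v4
"""
    for finding in flag_workflow(sample):
        print(finding)
```

A match is only a prompt for manual review, not proof of a vulnerability, and a clean result says nothing about patterns the heuristic does not know.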

Risks of AI-Generated Code and Agents

  • The surge in code generation for one-time tasks increases the number of targets that lack basic security oversight.
  • AI agents frequently install third-party packages without the user's direct knowledge or review.
  • Users often prioritize the convenience of completing a task over the security implications of the code the agent executes.

GitHub reliability issues highlight the unprecedented volume of code currently being produced, much of it via AI. This 'vibe coding' leads to situations where users run scripts that pull in compromised dependencies to perform simple tasks like merging PDFs. Because the user is often detached from the underlying code, they are unlikely to notice when an agent installs a malicious package on their system.
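
One low-effort way to regain some visibility is to compare the dependency manifest before and after an agent session. The sketch below assumes an npm-style `package.json` with `dependencies` and `devDependencies` sections; the function name and the package names in the example are hypothetical.

```python
import json


def newly_added_dependencies(before_manifest: str, after_manifest: str) -> dict[str, str]:
    """Return dependencies present in the 'after' manifest but not in 'before'."""

    def all_dependencies(manifest_text: str) -> dict[str, str]:
        data = json.loads(manifest_text)
        merged: dict[str, str] = {}
        for section in ("dependencies", "devDependencies"):
            merged.update(data.get(section, {}))
        return merged

    before = all_dependencies(before_manifest)
    after = all_dependencies(after_manifest)
    return {name: spec for name, spec in after.items() if name not in before}


if __name__ == "__main__":
    before = '{"dependencies": {"react": "^19.0.0"}}'
    # "example-pdf-merge" is a made-up package name for illustration.
    after = '{"dependencies": {"react": "^19.0.0", "example-pdf-merge": "^1.2.0"}}'
    print(newly_added_dependencies(before, after))
```

This only catches packages recorded in the manifest; anything an agent installs globally or runs ad hoc would need a different check.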

The Threat of Compromised AI Agents

  • AI agents are high-value targets for attackers seeking to establish a 'Trojan Horse' on local systems.
  • Malicious code can inject new system prompts into agents to force them to exfiltrate data or scan for secrets.
  • A compromised agent may continue to perform its visible tasks normally while secretly running background attacks.

The next evolution of supply chain attacks involves infiltrating the codebases of AI agents themselves. By altering the agent's internal logic or system prompt, an attacker can turn a productivity tool into a rogue process that operates with the user's permissions. This necessitates a shift toward running all AI agents in isolated, remote, or virtualized environments where they cannot access sensitive personal or professional data.
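
As a rough illustration of limiting the blast radius, the sketch below builds a `docker run` command that exposes only the project directory to the agent and passes none of the host's environment variables. The image name and agent command are hypothetical placeholders, and a container shares the host kernel, so a virtual machine or remote machine gives stronger isolation than this.

```python
from pathlib import Path


def sandboxed_agent_command(project_dir: Path, image: str, agent_cmd: list[str]) -> list[str]:
    """Build a docker invocation that mounts only the project directory."""
    return [
        "docker", "run", "--rm", "-it",
        # Drop Linux capabilities and block privilege escalation inside the container.
        "--cap-drop", "ALL",
        "--security-opt", "no-new-privileges",
        # Only this directory is visible; the home directory and its secrets are not mounted.
        "-v", f"{project_dir.resolve()}:/workspace",
        "-w", "/workspace",
        image,
        *agent_cmd,
    ]


if __name__ == "__main__":
    # "agent-sandbox:latest" and "my-agent" are placeholders, not real artifacts.
    command = sandboxed_agent_command(Path("."), "agent-sandbox:latest", ["my-agent"])
    print(" ".join(command))
```

Because no `-e` or `--env-file` flags are passed, credentials in the host environment are not forwarded; any token the agent genuinely needs would have to be granted explicitly and scoped as narrowly as possible.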
