00:00:00As a software developer and I guess in general as a human but especially as a software developer
00:00:06there is no way around Anthropic right now, whether you want to or not. And I don't think you should try to
00:00:12ignore it because it is relevant. It's relevant for our future as software developers I would say.
00:00:20And in this episode I won't talk about the Claude Code leak we had last week. I won't talk about
00:00:28their reinforced terms regarding the usage of their subscription offerings, Claude Max and so on
00:00:36and how they're cracking down on unauthorized use of those subscriptions. They are doing that right
00:00:43now because of course their subscription offerings just like the ones by OpenAI are heavily subsidized
00:00:50and they can't make any money if everybody maxes out their subscriptions. So yeah they're really
00:00:56restricting or they're trying to restrict the use of their subscriptions to just humans just on their
00:01:04website or in Claude Code or the Claude desktop app, I guess. But yeah, again, this is not the focus
00:01:11here and I won't even focus on their impressive revenue growth which is worth a short note though
00:01:19because Anthropic has reached annual recurring revenue of 30 billion dollars which is already
00:01:27impressive but especially impressive if you compare that to 9 billion dollars at the end of 2025. So
00:01:35they more than tripled their annual recurring revenue within just a few months. Which is
00:01:41really impressive. And therefore of course if you want to learn how to efficiently use Claude Code,
00:01:47how to get the most out of it, I got a course on it and it's also very popular which of course makes
00:01:53me happy and you find a link below if you want to join it and learn how to efficiently work with
00:01:59Claude Code. But as mentioned that's not even the main topic here. Instead I want to talk about
00:02:05Project Glasswing and their new model Mythos which they haven't released to the public and
00:02:14they also shared why. And I think this is important to understand and it's also important to try to
00:02:20look behind the scenes, behind their rationale, and at what this new model, how it works,
00:02:27and its capabilities mean for us developers. So what is Project Glasswing? What is their new model all
00:02:33about? Below of course you find a link to this article as well. This is an article on the official
00:02:39Anthropic site where they announced Project Glasswing and also talk about their new model.
00:02:44And if I scroll down a bit we can already see some summary benchmark statistics here where we can see
00:02:52that this new model, the Mythos preview version of the model, the model name is Mythos, performs
00:02:59way better than Opus 4.6. And depending on which exact benchmark you look at there is quite a big
00:03:07difference between Opus 4.6 and this new model. Now of course this is on its own not super
00:03:15impressive. Whenever a new model is announced no matter by which company it performs way better or
00:03:21at least a tiny bit better than all the competing models otherwise it wouldn't get released. And of
00:03:26course there are ways of gaming some of these benchmarks so I typically don't care too much
00:03:31about those benchmark numbers and that wouldn't really be different for this model here but there
00:03:39are interesting things about the new Mythos model. And that's the fact that Anthropic decided to not
00:03:46release it to the public because as they say it is too good at finding and exploiting vulnerabilities
00:03:56in operating systems, browsers, any other software, really in software in general. And in this article
00:04:05and also in a separate article which is also linked below they share some details and especially this
00:04:11separate article is extremely long and gives concrete examples for vulnerabilities and
00:04:19potentially exploits this new model found. For example they start in this article with a very
00:04:28serious exploit and vulnerability that was found in OpenBSD. OpenBSD is of course an operating system
00:04:38which is popular on certain pieces of networking software, for example, and Mythos, their new model
00:04:45running in an agentic harness like Claude Code I guess, was able to find and exploit, and that's the
00:04:53interesting part, a vulnerability that was related to integer overflow and memory access, unexpected
00:05:02memory access, that was able to crash machines that were running OpenBSD in a reproducible way, which
00:05:12could of course be leveraged to run very damaging denial-of-service attacks by over and over sending
00:05:20specific packets and requests to such machines, which exploited that vulnerability to bring down
00:05:27those machines and potentially bring down entire corporate networks. And this vulnerability was
00:05:34detected in a run that cost under fifty dollars, though the overall set of runs cost under twenty
00:05:43thousand dollars, and since you of course don't know in advance which run will find a vulnerability,
00:05:48it's that number that matters. Still, a model capable of finding
00:05:57such critical vulnerabilities for such a comparatively low cost (depending on who you are,
00:06:04if you're a nation for example or some serious bad actor, that may not be a lot of money to you)
00:06:13is of course a problem, because it's easy to imagine that if such a model were developed by
00:06:22a company or an organization that cares a bit less about security, or that maybe doesn't have to
00:06:31fear any consequences of abusing such vulnerabilities, it could be misused. And
00:06:42it seems as if we're entering a new age with AI, with these AI models, where nothing is secure
00:06:56and it's easier than ever to mass-deploy AI agents running models like this to scan all kinds of
00:07:05pieces of software and find and potentially exploit vulnerabilities. And of course, as a human on your
00:07:13own, there is no way of keeping up with it. I mean, the bug, the exploit that was found here, existed
00:07:19for, I think they said, 27 years or something like this. This shows that no human was able to find
00:07:29this bug in such a long period of time, including bad actors, which of course would have had an
00:07:35interest in being able to attack this operating system in the past too. Now this is just one,
00:07:41maybe the most prominent, finding this new model had. They are listing way more bugs and exploits
00:07:49the model was able to find and sometimes also able to abuse, and they also shared other stories on X, for
00:07:57example, like the model being able to escape a sandbox, or the AI agent running the model being
00:08:04able to escape a sandbox in which it was running. And that brings us back to Project Glasswing, which
00:08:11is an initiative created by Anthropic together with other big companies like AWS, Apple, Microsoft,
00:08:21the Linux Foundation and others to use this model to basically patch up their software before this
00:08:30is publicly released and before the public gets their hands on this model. That is the narrative
00:08:38of this article, that is the explanation of Anthropic, and I have some mixed thoughts here. Now for one,
00:08:48I have no strong reason to believe that this isn't true. But clearly Anthropic would also have some reasons to
00:08:56not release this model outside of what they're mentioning here. For example, I read that this
00:09:04model is a roughly 10-trillion-parameter model, which is way bigger than all the frontier models
00:09:11we've been able to use publicly thus far, and training it is said to have cost
00:09:20around 10 billion dollars. The token cost of this model, I read, is expected to be around this range:
00:09:3025 dollars and 125 dollars for input and output tokens. And of course those would also be reasons for not
00:09:39releasing this model, because they can't include it in their Claude subscriptions because it's just too
00:09:46expensive. They would have to ramp up the subscription price, probably to a price point that not a lot of
00:09:52people are willing to pay, and therefore there wouldn't really be a way of exposing it to the
00:09:59public, at least as part of Claude Code. Now of course they could still expose it through their API on a
00:10:05pay-per-use basis, and if it's expensive, who cares? If there are companies or people that would be
00:10:12willing to pay it, they could do that. And that of course is the part where the cyber security concerns
00:10:18might really come into play, because clearly that all is very likely not made up. I mean, it's definitely
00:10:26not made up. The FFmpeg team, for example (FFmpeg is also listed here as software in which they
00:10:36were able to find a vulnerability), confirmed on X that Anthropic
00:10:44sent a patch for a vulnerability in the FFmpeg software. So yeah, this is clearly not made
00:10:55up. These concerns are valid, cyber security concerns are valid, especially since of course, if money is
00:11:03not the main issue, you could deploy thousands of agents running simultaneously, using this or similar
00:11:11models which we may have in the future, to scan all kinds of software and exploit it. And of course
00:11:19the big problem is that using this model to find vulnerabilities and patch them up is possible, but
00:11:30it's only possible if the owner or the maintainer of a certain piece of software can afford the model
00:11:37or gets access for free or anything like that. And even if a vulnerability is patched up, we all know
00:11:46that not all computers out there, not all machines, not all users have up-to-date software running on
00:11:55them. If you were to take a look at all the various servers running out there on the World Wide Web,
00:12:04I would guess the vast majority of them are running outdated software. I mean, on our phones or our
00:12:12laptops we are often not running the latest software, the latest version of our operating system; the
00:12:20latest security patch may not be installed, and that is true for all layers of software. And in a world
00:12:28where it's easier than ever to find security vulnerabilities, that of course becomes an even
00:12:34bigger issue. Of course, the good thing about this AI model is that it can also be used for
00:12:43proactively looking for security vulnerabilities and patching them, so it's not just a tool for
00:12:48attackers. It can also make defense easier, because you now have a tool that can be run
00:12:56in parallel across thousands of agents to make your software secure. In theory this can be a
00:13:01very useful tool for defense, but of course, again, not every company or person that may be developing
00:13:09crucial software may be able to afford it or be interested in using it. And then even if it is used
00:13:16to find and patch vulnerabilities, still these latest versions will not be installed everywhere,
00:13:23and that of course gives attackers a nice window of opportunity where they know about way more
00:13:31vulnerabilities than before at some point, because way more vulnerabilities are detected, but not every
00:13:39machine, every user is protected against those vulnerabilities. And that is one real concern
00:13:46I have about this development. Now that is the bigger picture, which affects everybody,
00:13:52all companies, all humans in the end. Another question of course is: what does a model like
00:13:59this mean for us developers? I mean, clearly this seems to be a highly capable model that was able
00:14:08to look for vulnerabilities on its own and exploit vulnerabilities on its own. So yeah, what is the
00:14:16impact for developers? And I think when it comes to that, not a lot changes for now. I mean,
00:14:28we're already living in a world where AI agents like Claude Code and the underlying models (and of
00:14:34course the same is true for Codex and so on, whatever your favorite AI agent and model is)
00:14:39are able to generate most of our code. You may not be using them, you may not like them. I created a
00:14:46separate video where I share my feelings about that, and that this sucks the joy out of the
00:14:52software development part for me, but it is the reality nonetheless, whether you like it or not. And
00:14:57believe me, I don't necessarily like it, but yeah, this is the reality nonetheless. What a human brings
00:15:04to the table, or why humans still matter here and may matter more than ever, is of course that you
00:15:12definitely don't want an AI agent like this to go rogue and work totally on its own. Steering such
00:15:21models and agents, controlling them, giving them clear tasks, limiting the scope of the work they do:
00:15:29all these things are more important than ever. These models can, as it seems, do way more than
00:15:39the vast majority of developers can do, definitely way more than I can do,
00:15:43and yet when it comes to shipping products, when it comes to building software used by humans, the
00:15:54influence of a human is of course extremely important. What's changing of course
00:16:01is our role as software developers. We're changing from the people that are writing the code to the
00:16:08people that are steering the model, that are reviewing the code, that are understanding what
00:16:12it does, that are setting the scope. And yeah, again, I talked about this in that other video, how that
00:16:18is changing, and that this may not necessarily be what you like. It's definitely not
00:16:26why I got into software development in the first place, but yeah, this is the impact here. And the
00:16:31more capable these models get, the more important I think it gets to have that human voice in
00:16:39there as well, that human influence in there as well. So that is that changing role and
00:16:48our role in the future. But yeah, I mean, these are really interesting developments, and especially
00:16:58this model and its implications and that cyber security relevance it has
00:17:04make one think: what would happen if other actors, other nations or
00:17:16organizations in the world get their hands on this model or models that are similar in capability?
00:17:23Because of course it's only a matter of time until models with similar capabilities are accessible
00:17:33by the public, or certainly at least by other nations and actors. And yeah, I'm not sure if
00:17:44we are prepared for that new race in cyber security and that delay between bugs being found
00:17:52and being patched and people installing those patches. I think we'll enter a new era of cyber
00:18:00security, and we'll be able to adjust, I'm sure, but this definitely marks an interesting
00:18:08point in the history of model development, I would say.
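As a side note on the OpenBSD finding mentioned earlier in the episode, which was described only as being related to integer overflow and unexpected memory access: the sketch below illustrates that general class of bug, not the actual OpenBSD code or the actual exploit. It assumes a hypothetical bounds check on a 32-bit unsigned offset and length, and every name in it (`naive_in_bounds`, `safe_in_bounds`, the sample values) is made up for illustration.

```python
MASK32 = 0xFFFFFFFF  # emulate C-style 32-bit unsigned arithmetic in Python


def naive_in_bounds(offset: int, length: int, buf_size: int) -> bool:
    """Buggy check (hypothetical): offset + length can wrap around in 32 bits."""
    end = (offset + length) & MASK32  # wraps like a uint32_t addition would
    return end <= buf_size


def safe_in_bounds(offset: int, length: int, buf_size: int) -> bool:
    """Overflow-safe check: never adds the two untrusted values together."""
    return length <= buf_size and offset <= buf_size - length


# Hypothetical attacker-controlled values taken from a network packet.
offset, length, buf_size = 0xFFFFFFF0, 0x20, 1024

# The naive check accepts the request because the sum wraps around to 0x10.
print(naive_in_bounds(offset, length, buf_size))
# The safe check rejects it because the offset is far outside the buffer.
print(safe_in_bounds(offset, length, buf_size))
```

If a check like the naive one guards a memory read or write, a wrapped sum lets a far-out-of-range offset through, which is one way a single crafted packet can crash a machine. Comparing `offset` against `buf_size - length` avoids adding the two untrusted values in the first place.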