US Government Halts Fable 5 and Mythos 5: Security Concerns or Overreaction?

Some things are just too good to be true, and apparently Fable was one of those things, because it just got banned. Yes, really. All of us who have been trying to maximize our use of Fable over the last 3 days are no longer able to do so because of the United States government. I'm a US citizen and even I'm restricted here. It's actually unbelievable. I've never seen anything like this in my life. And this is another one of those terrible precedents being set. I'm tired of living in unprecedented times, man. I want [ __ ] to be a little more boring, because this is a lot. I'll read the tweet first because it's the quick summary, and then we're going to go into all the details for the rest of the video.
"The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees." So even Anthropic employees who don't live here are no longer allowed to access it. Especially ones who are here as citizens of other nations. That's crazy. A lot of the Anthropic employees I know are not US citizens. [ __ ] I have so many friends who work at Anthropic who now cannot use the models. What the [ __ ]
"The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all of our customers to ensure compliance," because they don't have a way to verify that we're citizens yet. They're going to figure that all out. "Other models are not affected. We apologize for the disruption." Yada yada.
We have a lot to talk about here, and I just lost a lot of money because I paid for a shitload of Claude subs for Fable, and most of those were for my employees who are not US-based. That's money I'm not going to get back. So I need to make some somewhere, and I'm going to have to go with today's sponsor.
I'm starting to get annoyed with all of the setup flows necessary to make my agents useful. They might be able to run fine in one codebase, but what happens when the changes affect another? What happens when I need to access data from a logging system that I haven't set up the MCP for yet? What happens when somebody's asking me for updates on Slack and I have to go copy and paste all the context between things? How do I even know how much it'll cost me when I'm running across all these different things? These are problems that have annoyed me, and probably you too, for a while, especially those of y'all working at big companies.
Thankfully, CodeRabbit has solved it and more with their new agent product. If you only know CodeRabbit for their code review, you need to catch up, because they're going so much further now with the agent. And I want to be really clear about this first: they didn't just build another tool to file slop PRs via Slack. They built something that understands your company, your codebase, your integrations, your data, and more. Their agent integrates with everything from Sentry and PostHog to Jira and Linear to Notion and Google Drive, because they know your data lives across all of these different places and the agent needs to access it to solve real problems. This also means it can pull from different sources as part of the same request.
Datadog threw an alert for this team. Somebody tagged in CodeRabbit. It went through the alerts from Datadog, then checked the Google Cloud logs separately and found a PR. That's three different data sources it was able to access and use to provide the useful information. This is the type of thing you normally needed your best dev, who had a horrible bus factor, to solve for you. So it's got to be super expensive, right? There are like 15 different ways they could bill. Well, they solved that, too.
Rather than having all these different pieces whose costs you have to worry about, they just charge you for how much time the agent spends doing things. This means no more runaway bills, no more utter chaos trying to figure out how much a task is going to cost, and no more trying to debug 15 different tools that aren't attached to your agent properly. Just ping it in Slack and go. CodeRabbit's reviews have already stopped me from shipping so much broken code. Now they're helping fix it. Check them out now at soy.codeb.
"The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the US, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable and Mythos for all of our customers to ensure compliance. Access to all other Anthropic models will not be affected. We received this directive from the government today at 5:21 p.m. Eastern time." It is currently 6:00 p.m. Pacific time, and there's a 3-hour difference there. So this would have been about 4 hours ago, roughly.
"The letter did not provide specific details of its national security concern. Our understanding is that the government believes it has become aware of a method of bypassing or jailbreaking Fable 5. We reviewed a demonstration of this specific technique being used to identify a small number of previously known minor vulnerabilities. These vulnerabilities all appear relatively simple, and we have found that other publicly available models are able to discover them as well without requiring a bypass."
To break down what they're describing here: the US government was able to get an example of jailbreaking Fable 5 to find vulnerabilities in software, and they don't want other governments being able to use it to do that.
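For anyone who wants to check the "about 4 hours" estimate, here is a minimal sketch of the time arithmetic in Python. It assumes both clock times fall on the same calendar day and that Eastern time is 3 hours ahead of Pacific time; the date is an arbitrary placeholder, since no date is given, and the variable names are made up for illustration.

```python
from datetime import datetime, timedelta

# Assumption: both clock times are on the same calendar day. The date below is
# an arbitrary placeholder, because the statement does not give one.
PLACEHOLDER_DAY = (2000, 1, 1)

# Eastern time runs 3 hours ahead of Pacific time.
EASTERN_AHEAD_OF_PACIFIC = timedelta(hours=3)

received_eastern = datetime(*PLACEHOLDER_DAY, 17, 21)  # 5:21 p.m. Eastern
recording_pacific = datetime(*PLACEHOLDER_DAY, 18, 0)  # 6:00 p.m. Pacific

# Convert the receipt time to Pacific, then subtract.
received_pacific = received_eastern - EASTERN_AHEAD_OF_PACIFIC
elapsed = recording_pacific - received_pacific

print(received_pacific.strftime("%H:%M"))  # 14:21
print(elapsed)  # 3:39:00
```

So the directive would have arrived 3 hours and 39 minutes before recording, which rounds to the "about 4 hours" mentioned above.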
Anthropic is disagreeing subtly here by saying that the vulnerabilities found with this jailbreak were simple and that other publicly available models can find those same vulnerabilities. Anthropic's posture on Fable's safeguards, as laid out in their launch blog post, is the following.
"First, we have instituted strong safeguards that greatly reduce the likelihood that Fable is misused for tasks related to cybersecurity, among others. In fact, our safeguards are so strong that many users have complained that they are overly broad." Yep.
In the weeks leading up to the launch of Fable, Anthropic worked with the US government, the UK AISI, multiple private third-party organizations, and internal teams to red team Fable's safeguards for thousands of hours in total. "These tests showed that Fable's safeguards are substantially more effective than those of any previously deployed model." I would agree. It's hard to get it to do [ __ ] anything.
"No testers have yet been able to find a universal jailbreak, a jailbreak method that can very broadly bypass the model's safeguards, unblocking a wide range of cyber capabilities. We suspect that perfect jailbreak resistance is not currently possible for any model provider. Every safeguard used in the industry is vulnerable to non-universal jailbreaks, which can elicit some cyber information in specific circumstances, and it is likely that universal jailbreaks will eventually be found in the future. We stated this clearly when we released Fable 5."
Given that perfect jailbreak resistance does not appear to be possible today, Anthropic adopted a defense-in-depth strategy with Fable 5. "We aimed to make the jailbreaks either narrow, in the case of non-universal jailbreaks, or very expensive to produce, in the case of universal ones, and to combine this with thorough monitoring to quickly detect and shut down any successful attacks."
Again, I talked about this in a video I recorded earlier today that now feels really out of date. They are monitoring for responses that shouldn't have been returned and updating their safeguards accordingly, because the model isn't trained to be safe. The model was trained to be really smart, and it happens to be so smart it can do some hacking. So they put a bunch of [ __ ] in front to make sure that it doesn't get a prompt that tells it to do things that are potentially unsafe and, most importantly, that it doesn't respond with instructions that tell you how to do unsafe things. They also made the change where all prompts now have a required 30-day retention, which was not the case before. That has made them lose a bunch of customers, and they even call that out here, but it allowed them to research and mitigate jailbreaks.
They stand by this defense-in-depth strategy. It reduces the risks posed by Fable, making them comparable to the risks of existing models already deployed across the industry. And they've not even received a disclosure of a concerning non-universal potential jailbreak that led to a harmful result. The potential jailbreaks that have been disclosed to them either produced benign responses or are minor findings that provide no Mythos-specific uplift.
Yep, I am on Anthropic's side on this one. They went really hard with the security, but it sounds like someone in the US government was able to get it to respond with a vulnerability in some software and immediately turned everything up to 11. Like, oh my god, foreign nationals are going to use this to hack us, even though other models are capable of doing the same vulnerability discovery that the US government demonstrated to Anthropic here.
"To date, the government has only given us verbal evidence of a potential narrow non-universal jailbreak, which essentially consists of asking the model to read a codebase and fix any software flaws. Our understanding is that one potential jailbreak was shared with the government."
"We have reviewed the report and validated that the level of capability displayed there is widely available from other models, including OpenAI's GPT-5.5." This is the student telling the teacher, "Hey, did you forget to give us homework?" But it is actually a reasonable callout, too. I am scared we might lose 5.5 access now too, though. The snitching is unreal. Oh god. Yeah. And it's also a good callout that 5.5 is used every day by the defenders who keep systems safe.
"We will share more details over the next 24 hours." And this is on a Friday, by the way. Friday evening. It's 6 p.m. on Friday. I'm supposed to be at a party right now, but I'm here filming this. I do like that they called out the specific strategy used here, which is not asking the model to find software flaws and find hacks. They told the model to go through the codebase and fix flaws, and by finding those, they can go reverse engineer them.
"We're complying with the government's legal directive, and we're removing access to Fable and Mythos for all users. However, we disagree that the findings of a narrow potential jailbreak should cause recalls for commercial models deployed to hundreds of millions of people. If this standard was applied across the industry, we believe it would essentially halt all new model deployment for all frontier model providers." So you're safe, Google, but everyone else is screwed.
As they have stated publicly, they believe the government should have the ability to block unsafe deployments as part of a statutory process that is transparent, fair, clear, and grounded in technical facts. "This action does not adhere to those principles. We apologize for the disruption to our customers. We believe this is a misunderstanding, and we are working to restore access as soon as possible."
Crazy that this policy blog post from Dario on the AI exponential has become so relevant and important to their business already.
I was making jokes about this just yesterday, but now it really matters. It is funny that in here they call out that "legislation moves very slowly. Often this is for good reasons. Governments have grave powers, and it's usually for the best that they aren't used too hastily, but the mismatch in time scale is nevertheless very painful. In the several years it can take Congress to act, AI can go from an amusing toy to the full country of geniuses.
Over the last few years, since AI has become a major commercial technology, those of us who wanted to handle it responsibly have faced a dilemma. We could see clearly where the exponential was going. We strongly suspected that within a few years, AI would be one of the rare technologies that fundamentally reshapes the entire policy landscape, in the same way that nuclear weapons reshaped geopolitics and the Industrial Revolution fundamentally reshaped every economic and social issue. But to those looking only at what AI could do at the time, it looked like a much more mundane technology, similar perhaps to the latest consumer app or cryptocurrency." Cryptocurrency needs a lot of legislation. Let's be real here.
"It was hard to convince most policymakers and companies that anything other than a laissez-faire attitude made sense. And to be fair, the fact that AI's radical effects were not yet present, and that we didn't know exactly what shape they might take, made it difficult to design the right policies even if there had been the will to act. Given the limitations imposed by the situation, many safety advocates, including Anthropic, have so far been focused on advocating for policy actions that preserve optionality, tee up a fast reaction in the future, or give the world better insights into what is coming down the pike. Things like transparency legislation, export controls on chips, and data collection on AI's labor effects. These are not enough, but they have felt like all that was possible."
They got their fast reaction, though. They did get that. Dario also calls out here that Mythos Preview, and the discovery that frontier models pose very real risks to cybersecurity, creating the potential for disruption of the financial sector, critical infrastructure, and national security, is indeed a big deal, and that the preview scrambled the global cybersecurity landscape. It really did. It was crazy to watch. "But its broader significance is that it proves beyond a doubt that AI models are now tools of global and national strategic consequence." Yep. And we are now suffering the consequences of exactly that.
"The cyber risks that Mythos-class models present will not be the last that we must face. I believe that biological risks may soon follow and that serious AI autonomy risks may not be far behind. We now globally and collectively need to activate a slow and rickety policy apparatus to deal with risks and opportunities that are going to compound surprisingly quickly from here. Many policymakers are showing increased openness to taking action, and it's been encouraging to see our peers come around to the same positions that we've been advocating for over the past few years. This is good, but I worry that these early actions are at least a year out of step with AI's rapid progress. This essay is an attempt to close the gap, to lay out where the exponential is now and the collective action needed to meet the moment."
He then goes on to give examples of the different policy types and categories he thinks should be implemented. It's a genuinely good read, and I would recommend taking a look yourself. The link will be in the description. But instead of continuing to read that, I'm going to pour one out for my beloved. Fable, I will miss you so. Our three days together were magical, unlike anything I've experienced before. Some things are just too good to be true. So good that the government interferes. I'm sorry we were one of those things. Until we meet again.
It's still routing. Still appears to be Fable. It did not tell me that it rerouted. It has not been disabled here yet. I'm scared: if I pick a different model, will I still be able to pick Fable? Okay, cool. I still can. Rest in peace. People are already making websites, like "Is Fable down or not.app", to track how long until Fable is removed from the API. They're doing tests every 5 minutes, I believe, to see if it is responding to probes or not.
Just two days ago: "Anthropic urges US to not block state AI laws without setting federal standards." This might be retaliatory. This might be because Anthropic was pushing for the US to not interfere with the restrictions certain states wanted to implement. So now they are getting restricted super hard. But with OpenAI certainly cooking new models that are coming very, very soon, I have a feeling the impact and implications of this are going to be absurd.
I don't want to waste your guys' time too much because this is still a developing story. But I've covered everything we have now, and I'll be sure to pin comments accordingly as more info comes in. For now, all we have is this official statement and access being cut off in the very near future, if not by the time this video is live. I'm going to wrap up ASAP because I want to get this out ASAP. I hope you guys understand. I will miss Fable, and I will be using as much of it as I possibly can up until it is formally banned. Fingers crossed that this gets resolved quickly. And sorry to all of my friends who are not in the US or are not US citizens, who won't get access for even longer. What a [ __ ] mess. This is a disaster. I did not think we were going to get intervention like this as quickly as we are. Everything is chaos. I'll do my best to cover it.
