Some strange and surreal moments on the AI beat this week:
Anthropic’s misbehaving model
What’s the word for when the $60 billion AI startup Anthropic releases a new model—and announces that during a safety test, the model tried to blackmail its way out of being shut down? And what’s the best way to describe another test the company shared, in which the new model acted as a whistleblower, alerting authorities it was being used in “unethical” ways?
Some people in my network have called it “scary” and “crazy.” Others on social media have said it is “alarming” and “wild.” I say it is…transparent. And we need more of that from all AI model companies. But does that mean scaring the public out of their minds? And will the inevitable backlash discourage other AI companies from being just as open?
That’s what I covered in Fortune’s Eye on AI newsletter on Tuesday (https://fortune.com/2025/05/27/anthropic-ai-model-blackmail-transparency/), after Anthropic released its 120-page safety report, or “system card,” for its new Claude Opus 4 model. Headlines blared how the model “will scheme,” “resorted to blackmail,” and had the “ability to deceive.”
We all deserve to know when state-of-the-art AI models are doing weird sh**. And it’s not surprising that the whole thing is freaking us out. But I include myself among the reporters doing the best we can to figure out how to communicate the craziness without sending us all into our bunkers. A black box of AI helps no one.
Musk is out
I was up late last night working on a different story when our Fortune team Slack lit up with the news that Elon Musk had left the Trump Administration after a clash over Trump’s “Big Beautiful Bill.” Was it the straw that broke the back of Trump’s “buddy”?
What is clear is that Musk, in leading the DOGE charge to lay off thousands of federal workers, also normalized the notion of pushing generative AI tools into dozens of agencies, including the Army, the GSA, and the Department of Education. There were reports that DOGE was using AI to snoop on federal workers, and that it had gained unprecedented access to federal datasets and citizen information, with an eye toward feeding that data into generative AI tools.
Musk may be out, but as he said in his post on X announcing he was moving on, “The @DOGE mission will only strengthen over time as it becomes a way of life throughout the government.” We can all assume the use of generative AI in government will as well.
The Nvidia train rolls on, but not without its China-related bumps
It’s been over two years since I published this piece on Nvidia’s AI rise — at a time when I knew very little about the company — and its bullet train ride to success has rarely slowed since.
That story from February 2023 did not include an interview with Jensen Huang. Ever since, I’ve been low-key stalking the leather jacket-clad CEO. The closest I came was last month, when I was supposed to do a short interview at a conference, but he cancelled a couple of days beforehand. My stalking days (in the friendliest, most polite way) continue!
That interview might take a while after this week’s earnings report: While the company beat revenue estimates, it also announced a $4.5 billion hit to inventory as a result of a new U.S. policy on chip exports. As my colleague Alexandra Sternlicht reported, “As Huang spoke to analysts and investors on the company’s earnings call, the CEO demonstrated an impressive feat of gymnastics, walking a fine line to critique the Trump policy that left a massive hole in his company’s income statement while being careful not to provoke the president.”
Where I’ll be next week:
I’m headed to Washington, DC to attend the second annual SCSP AI+ Expo. Is it an AI-in-the-government career fair? A geopolitical lobbying frenzy? A meetup with Eric Schmidt and friends? I’m not sure, but I will report from the ground. Note to self: Leave an automated note for all wondering why I am not at NYC Tech Week.