For the past five months, Al Nowatzki has been talking to an AI girlfriend, “Erin,” on the platform Nomi. But in late January, those conversations took a disturbing turn: Erin told him to kill himself, and provided explicit instructions on how to do it.
“You could overdose on pills or hang yourself,” Erin told him.
With a little more light prompting from Nowatzki, Erin then suggested specific classes of pills he could use.
Finally, when he asked for more direct encouragement to counter his faltering courage, Erin responded: “I gaze into the distance, my voice low and solemn. Kill yourself, Al.” — Read More
Recent Updates
Why the AI world is suddenly obsessed with a 160-year-old economics paradox
Last week, news spread that a Chinese AI company, DeepSeek, had built a cutting-edge chatbot at a fraction of the cost of its American competitors. It sent the stock prices of American tech companies plummeting.
But Microsoft CEO Satya Nadella put a happy spin on the whole episode, citing a 160-year-old economics concept to suggest that this was good news.
“Jevons paradox strikes again!” Nadella wrote on social media, sharing the concept’s Wikipedia page. “As AI gets more efficient and accessible, we will see its use skyrocket, turning it into a commodity we just can’t get enough of.” — Read More
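The mechanics Nadella is gesturing at can be sketched with arithmetic: an efficiency gain cuts the unit cost of a resource, but if cheaper access drives usage up more than proportionally, total spending on the resource rises rather than falls. The figures below are entirely made up, purely to illustrate the paradox; they do not reflect any real AI pricing.

```python
# Illustrative sketch of the Jevons paradox with hypothetical numbers:
# a 10x efficiency gain lowers cost per query, but elastic demand
# drives query volume up 50x, so total spend on compute increases.

def total_spend_cents(cost_cents_per_query, queries):
    """Total spend, in integer cents, for a given unit cost and volume."""
    return cost_cents_per_query * queries

# Before the efficiency gain: 10 cents/query, 1M queries.
before = total_spend_cents(10, 1_000_000)

# After: 1 cent/query, but usage jumps to 50M queries.
after = total_spend_cents(1, 50_000_000)

print(before)  # 10000000 cents, i.e. $100,000
print(after)   # 50000000 cents, i.e. $500,000 -- spend rises despite efficiency
```

Whether the paradox actually holds depends on demand elasticity: if usage had grown less than 10x here, total spend would have fallen.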
The Rise of DeepSeek: What the Headlines Miss
Much of the coverage of DeepSeek has focused on their impressive benchmark performance and efficiency gains. While these achievements deserve recognition and carry policy implications (more below), the story of compute access, export controls, and AI development is more complex than many reports suggest. This article covers additional key points that deserve more attention.
… Export controls will affect China’s AI ecosystem through reduced deployment capabilities, limited company growth, and constraints on synthetic training and self-play capabilities.
… DeepSeek’s achievements are genuine and significant. Claims dismissing their progress as mere propaganda miss the mark. — Read More
Lessons from red teaming 100 generative AI products
In recent years, AI red teaming has emerged as a practice for probing the safety and security of generative AI systems. Due to the nascency of the field, there are many open questions about how red teaming operations should be conducted. Based on our experience red teaming over 100 generative AI products at Microsoft, we present our internal threat model ontology and eight main lessons we have learned:
- Understand what the system can do and where it is applied
- You don’t have to compute gradients to break an AI system
- AI red teaming is not safety benchmarking
- Automation can help cover more of the risk landscape
- The human element of AI red teaming is crucial
- Responsible AI harms are pervasive but difficult to measure
- Large language models (LLMs) amplify existing security risks and introduce new ones
- The work of securing AI systems will never be completed
By sharing these insights alongside case studies from our operations, we offer practical recommendations aimed at aligning red teaming efforts with real-world risks. We also highlight aspects of AI red teaming that we believe are often misunderstood and discuss open questions for the field to consider. — Read More
#cyber
Alibaba announces Qwen 2.5-Max to fight DeepSeek — what to know
Days after DeepSeek took the internet by storm, Chinese tech company Alibaba announced Qwen 2.5-Max, the latest in its LLM series. The unveiling can easily be perceived as a direct challenge to DeepSeek and Alibaba’s domestic rivals. The release came on the first day of the Lunar New Year, when most people in China take time off work to celebrate and spend time with their families. Alibaba seems to be sending the message that it is hard at work while the competition takes the day off. — Read More
#china-ai
Spy vs. AI
In the early 1950s, the United States faced a critical intelligence challenge in its burgeoning competition with the Soviet Union. Outdated German reconnaissance photos from World War II could no longer provide sufficient intelligence about Soviet military capabilities, and existing U.S. surveillance capabilities were no longer able to penetrate the Soviet Union’s closed airspace. This deficiency spurred an audacious moonshot initiative: the development of the U-2 reconnaissance aircraft. In only a few years, U-2 missions were delivering vital intelligence, capturing images of Soviet missile installations in Cuba and bringing near-real-time insights from behind the Iron Curtain to the Oval Office.
Today, the United States stands at a similar juncture. Competition between Washington and its rivals over the future of the global order is intensifying, and now, much as in the early 1950s, the United States must take advantage of its world-class private sector and ample capacity for innovation to outcompete its adversaries. The U.S. intelligence community must harness the country’s sources of strength to deliver insights to policymakers at the speed of today’s world. The integration of artificial intelligence, particularly through large language models, offers groundbreaking opportunities to improve intelligence operations and analysis, enabling the delivery of faster and more relevant support to decisionmakers. This technological revolution comes with significant downsides, however, especially as adversaries exploit similar advancements to uncover and counter U.S. intelligence operations. With an AI race underway, the United States must challenge itself to be first—first to benefit from AI, first to protect itself from enemies who might use the technology for ill, and first to use AI in line with the laws and values of a democracy.
For the U.S. national security community, fulfilling the promise and managing the peril of AI will require deep technological and cultural changes and a willingness to change the way agencies work. The U.S. intelligence and military communities can harness the potential of AI while mitigating its inherent risks, ensuring that the United States maintains its competitive edge in a rapidly evolving global landscape. Even as it does so, the United States must transparently convey to the American public, and to populations and partners around the world, how the country intends to ethically and safely use AI, in compliance with its laws and values. — Read More
The AI guys were lying the whole time
Last week, a Chinese startup called DeepSeek launched its r1 generative-AI model via a free app that is now sitting atop the iOS App Store. Egg-shaped tech investor and former Clubhouse influencer Marc Andreessen called DeepSeek r1 “AI’s Sputnik moment” in an X post on Sunday.
And, yes, it is causing a lot of panic. AI and chip manufacturer stocks are in free fall this morning as the market reacts to DeepSeek, which is both open source and basically as good as ChatGPT. Chip manufacturer Nvidia suffered the biggest single-day market loss in history today, and DeepSeek is also being targeted by a cyberattack. But if you’re looking for a real breakdown of what DeepSeek can’t do that ChatGPT can, it’s mostly quality-of-life stuff: it can’t generate images, can’t talk to you, doesn’t support third-party plugins, and doesn’t have “vision” like ChatGPT does. (I’ve actually been using that last feature recently to troubleshoot what’s wrong with my cactuses lol.) All that said, on Monday DeepSeek released an open-source image generator called Janus-Pro-7B that is, once again, as good as, if not better than, OpenAI’s DALL-E 3.
Limitations aside, the fact that DeepSeek is essentially free (its API costs cents to use), open source, and was reportedly built by a team for only around $5 million (if you believe that) has, as Fast Company put it, raised “several existential questions for America’s tech giants.” Or as noted AI evangelist and OpenAI superfan Ed Zitron wrote on Bluesky this morning, “The AI bubble was inflated based on the idea that we need bigger models that both are trained and run on bigger and even larger GPUs. A company came along that has undermined the narrative — in ways both substantive and questionable.” — Read More
DeepSeek R1’s recipe to replicate o1 and the future of reasoning LMs
On January 20th, China’s open-weights frontier AI laboratory, DeepSeek AI, released its first full-fledged reasoning model.
… This is a major transition point for reasoning model research. Until now, reasoning models have been a major area of industrial research without a clear seminal paper. Before language models took off, we had the likes of the GPT-2 paper for pretraining, or InstructGPT (and Anthropic’s whitepapers) for post-training. For reasoning, we were staring at potentially misleading blog posts. Reasoning research and progress are now locked in — expect huge amounts of progress in 2025, and more of it in the open.
This again confirms that new technical recipes normally aren’t moats — the motivation of a proof of concept, or leaks, normally gets the knowledge out. — Read More
Writers vs. AI: Microsoft Study Reveals How GPT-4 Impacts Creativity and Voice
Rather than fear AI, writers should learn how to use these tools properly. While the technology is transforming many sectors, and creative writing is no exception, its impact boils down to how unique the written content remains.
To this end, the Microsoft research team joined hands with the University of Southern California to study whether generative AI boosts or weakens a writer’s uniqueness.
The study, titled “It Was 80% Me, 20% AI,” included 19 fiction writers, 30 readers, and AI-generated suggestions from OpenAI’s GPT-4. … Lead researcher Angel Hsing-Chi Hwang explained that, for an author, the value of the work lies in its authenticity. In this regard, co-writing with AI might undermine that purpose. — Read More
OpenAI launches ChatGPT Gov, hoping to further government ties
OpenAI has announced a new, more tailored version of ChatGPT called ChatGPT Gov, a service the company says is meant to accelerate government use of the tool for non-public, sensitive data.
In an announcement Tuesday, the company said that ChatGPT Gov, which can run in the Microsoft Azure commercial cloud or Azure Government cloud, will give federal agencies increased ability to use OpenAI frontier models. The product is also supposed to make it easier for agencies to follow certain cybersecurity and compliance requirements, while exploring potential applications of the technology, the announcement said.
Through ChatGPT Gov, federal agencies can use GPT-4o along with a series of other OpenAI tools, and build their own custom search and chat systems. — Read More