The Last Programmers

We’re witnessing the final generation of people who translate ideas into code by hand.

I quit my job at Amazon in May to join a startup called Icon …

… I felt like I was reaching the ceiling of what I could learn about AI and building good products within Amazon’s constraints. That’s why I joined Icon. At Icon, we move at a completely different speed. We ship features in days that would have taken Amazon months to approve.

… The interesting part is watching how my teammates work. One of them hasn’t looked at actual code in weeks. Instead, he writes design documents in plain English and trusts AI to handle the implementation. When something needs fixing, he edits the document, not the code.

It made me realize something profound: we’re living through the end of an era where humans translate ideas into code by hand. Within a few years, that skill will be as relevant as knowing how to shoe a horse. — Read More

#devops

DOGE’s Flops Shouldn’t Spell Doom for AI In Government

Just a few months after Elon Musk’s retreat from his unofficial role leading the Department of Government Efficiency (DOGE), we have a clearer picture of his vision of government powered by artificial intelligence, and it has a lot more to do with consolidating power than benefitting the public. Even so, we must not lose sight of the fact that a different administration could wield the same technology to advance a more positive future for AI in government.

To most on the American left, the DOGE end game is a dystopic vision of a government run by machines that benefits an elite few at the expense of the people. It includes AI rewriting government rules on a massive scale, salary-free bots replacing human functions, and a nonpartisan civil service forced to adopt an alarmingly racist and antisemitic Grok AI chatbot built by Musk in his own image. And yet despite Musk’s proclamations about driving efficiency, few cost savings have materialized and few successful examples of automation have been realized. — Read More

#strategy

The Dead Internet Theory: A Survey on Artificial Interactions and the Future of Social Media

The Dead Internet Theory (DIT) suggests that much of today’s internet, particularly social media, is dominated by non-human activity, AI-generated content, and corporate agendas, leading to a decline in authentic human interaction. This study explores the origins, core claims, and implications of DIT, emphasizing its relevance in the context of social media platforms. The theory emerged as a response to the perceived homogenization of online spaces, highlighting issues like the proliferation of bots, algorithmically generated content, and the prioritization of engagement metrics over genuine user interaction. AI technologies play a central role in this phenomenon, as social media platforms increasingly use algorithms and machine learning to curate content, drive engagement, and maximize advertising revenue. While these tools enhance scalability and personalization, they also prioritize virality and consumption over authentic communication, contributing to the erosion of trust, the loss of content diversity, and a dehumanized internet experience. This study redefines DIT in the context of social media, proposing that the commodification of content consumption for revenue has taken precedence over meaningful human connectivity. By focusing on engagement metrics, platforms foster a sense of artificiality and disconnection, underscoring the need for human-centric approaches to revive authentic online interaction and community building. — Read More

#robotics

Don’t Build An AI Safety Movement

Safety advocates are about to change the AI policy debate for the worse. Faced with political adversity, few recent policy wins, and a perceived lack of obvious paths to policy victory, the movement yearns for a different way forward. One school of thought is growing in popularity: to create political incentive to get serious about safety policy, one must ‘build a movement’. That is, one must create widespread salience of AI safety topics and channel it into an organised constituency that puts pressure on policymakers.

Recent weeks have seen more and more signs of efforts to build a popular movement. In two weeks, AI safety progenitors Eliezer Yudkowsky and Nate Soares are publishing a general-audience book to shore up public awareness and support — with a media tour to boot, I’m sure. PauseAI’s campaigns are growing in popularity and ecosystem support, with a recent UK-based swipe at Google DeepMind drawing national headlines. And successful safety career accelerator MATS is now also in the business of funneling young talent into attempts to build a movement. Now, these efforts are in their very early stages, and might still stumble on their own. But they point to a broader motivation — one that’s worth seriously discussing now. — Read More

#trust

Why language models hallucinate

Like students facing hard exam questions, large language models sometimes guess when uncertain, producing plausible yet incorrect statements instead of admitting uncertainty. Such “hallucinations” persist even in state-of-the-art systems and undermine trust. We argue that language models hallucinate because the training and evaluation procedures reward guessing over acknowledging uncertainty, and we analyze the statistical causes of hallucinations in the modern training pipeline. Hallucinations need not be mysterious — they originate simply as errors in binary classification. If incorrect statements cannot be distinguished from facts, then hallucinations in pretrained language models will arise through natural statistical pressures. We then argue that hallucinations persist due to the way most evaluations are graded — language models are optimized to be good test-takers, and guessing when uncertain improves test performance. This “epidemic” of penalizing uncertain responses can only be addressed through a socio-technical mitigation: modifying the scoring of existing benchmarks that are misaligned but dominate leaderboards, rather than introducing additional hallucination evaluations. This change may steer the field toward more trustworthy AI systems. — Read More
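To make the incentive argument concrete, here is a minimal sketch of the expected-score arithmetic; the grading schemes and numbers are my own illustration, not taken from the paper. Under binary grading, a wrong answer costs nothing relative to abstaining, so a model that guesses whenever it is uncertain never scores worse and usually scores better; a scheme that penalizes confident errors flips that incentive.

```python
# Illustration only (not from the paper): expected benchmark score for an
# uncertain model that either guesses or abstains, under two grading schemes.

def expected_score(p_correct, right, wrong, abstain):
    """Return (expected score if guessing, fixed score if abstaining)."""
    guess = p_correct * right + (1.0 - p_correct) * wrong
    return guess, abstain

# Standard binary grading: 1 for a correct answer, 0 for a wrong one,
# 0 for "I don't know" -> guessing is never worse, so test-takers guess.
print(expected_score(p_correct=0.3, right=1.0, wrong=0.0, abstain=0.0))   # (0.3, 0.0)

# A hypothetical scheme that penalizes confident errors: now guessing at
# 30% confidence loses to admitting uncertainty.
print(expected_score(p_correct=0.3, right=1.0, wrong=-1.0, abstain=0.0))  # (~-0.4, 0.0)
```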

#nlp

A PM’s Guide to AI Agent Architecture: Why Capability Doesn’t Equal Adoption

Last week, I was talking to a PM who’d shipped their AI agent in recent months. The metrics looked great: 89% accuracy, sub-second response times, positive user feedback in surveys. But users were abandoning the agent after their first real problem, such as a billing dispute combined with a locked account.

“Our agent could handle routine requests perfectly, but when faced with complex issues, users would try once, get frustrated, and immediately ask for a human.”

This pattern shows up across product teams that focus on making their agents “smarter,” when the real challenge is making the architectural decisions that shape how users experience the agent and come to trust it. — Read More
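The article doesn’t include code; the sketch below is only my illustration of one such architectural decision, an explicit escalation path. The intent labels, confidence threshold, and handoff behaviour are assumptions, not the author’s design. The idea is that a request spanning several issues, like the billing dispute plus locked account above, is handed to a human up front instead of after the user has already failed once.

```python
# Illustrative sketch (not from the article): route a request to a human when
# it spans multiple issues or the agent's confidence is low, rather than
# letting the user try, fail, and then ask for a human.
from dataclasses import dataclass

@dataclass
class Classification:
    intents: list          # e.g. ["billing_dispute", "account_locked"]
    confidence: float      # agent's self-estimated ability to resolve this

CONFIDENCE_FLOOR = 0.8     # assumed threshold; tune per product

def route(request, classify):
    c = classify(request)
    if len(c.intents) > 1 or c.confidence < CONFIDENCE_FLOOR:
        # Escalate with context attached, instead of failing first.
        return f"handoff_to_human(intents={c.intents})"
    return f"resolve_automatically(intent={c.intents[0]})"

# A mixed billing-dispute + locked-account request escalates immediately.
stub_classifier = lambda _: Classification(["billing_dispute", "account_locked"], 0.9)
print(route("I was double-charged and now my account is locked.", stub_classifier))
```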

#devops

Detecting and countering misuse of AI: August 2025

We’ve developed sophisticated safety and security measures to prevent the misuse of our AI models. But cybercriminals and other malicious actors are actively attempting to find ways around them. Today, we’re releasing a report that details how.

Our Threat Intelligence report discusses several recent examples of Claude being misused, including a large-scale extortion operation using Claude Code, a fraudulent employment scheme from North Korea, and the sale of AI-generated ransomware by a cybercriminal with only basic coding skills. We also cover the steps we’ve taken to detect and counter these abuses. — Read More

#cyber

Open Global Investment as a Governance Model for AGI

This paper introduces the “open global investment” (OGI) model, a proposed governance framework for artificial general intelligence (AGI) development.  The core idea is that AGI development could proceed within one or more corporations in a context that (a) encourages wide international shareholding, (b) reduces the risk of expropriation, (c) implements strengthened corporate governance processes, (d) operates within a government-defined framework for responsible AI development (and/or a public-private partnership), and (e) includes additional international agreements and governance measures to whatever extent is desirable and feasible.  We argue that this model, while very imperfect, offers advantages in terms of inclusiveness, incentive compatibility, and practicality compared to prominent alternatives—such as proposals modelled on the Manhattan project, CERN, or Intelsat—especially in scenarios with short AGI timelines. — Read More

#strategy

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

The emergence of agentic reinforcement learning (Agentic RL) marks a paradigm shift from conventional reinforcement learning applied to large language models (LLM RL), reframing LLMs from passive sequence generators into autonomous, decision-making agents embedded in complex, dynamic worlds. This survey formalizes this conceptual shift by contrasting the degenerate single-step Markov Decision Processes (MDPs) of LLM-RL with the temporally extended, partially observable Markov decision processes (POMDPs) that define Agentic RL. Building on this foundation, we propose a comprehensive twofold taxonomy: one organized around core agentic capabilities, including planning, tool use, memory, reasoning, self-improvement, and perception, and the other around their applications across diverse task domains. Central to our thesis is that reinforcement learning serves as the critical mechanism for transforming these capabilities from static, heuristic modules into adaptive, robust agentic behavior. To support and accelerate future research, we consolidate the landscape of open-source environments, benchmarks, and frameworks into a practical compendium. By synthesizing over five hundred recent works, this survey charts the contours of this rapidly evolving field and highlights the opportunities and challenges that will shape the development of scalable, general-purpose AI agents. — Read More
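To spell out the contrast the survey draws, here is the standard formalism in my own notation (a sketch, not the paper’s definitions): LLM-RL treats generation as a single-step decision over a prompt, while agentic RL places the policy in a temporally extended, partially observable decision process.

```latex
% Sketch of the contrast, in standard notation (not the survey's own):

% LLM-RL as a degenerate single-step MDP: the "state" is the prompt x,
% the "action" is the whole response y, and the episode ends immediately.
J_{\text{LLM-RL}}(\theta) =
  \mathbb{E}_{x \sim \mathcal{D},\; y \sim \pi_\theta(\cdot \mid x)}\bigl[ r(x, y) \bigr]

% Agentic RL as a temporally extended POMDP
% (states S, actions A, observations O, transition P, observation model Omega,
%  reward r, discount gamma), optimized over multi-step trajectories.
\mathcal{M} = \langle \mathcal{S}, \mathcal{A}, \mathcal{O}, P, \Omega, r, \gamma \rangle,
\qquad
J_{\text{agentic}}(\theta) =
  \mathbb{E}_{\tau \sim \pi_\theta}\Bigl[ \sum_{t=0}^{T} \gamma^{t}\, r(s_t, a_t) \Bigr]
```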

GitHub Repo

#nlp

In a first, scientists map complete brain activity during decision-making

Mice moving tiny steering wheels to control shapes on a screen have given scientists an unprecedented view of how decisions unfold across the brain.

For the first time, researchers have mapped decision-making at single-cell resolution across an entire mammalian brain. — Read More

Read the Paper

#human