Leaders of many organizations are urging their teams to adopt agentic AI to improve efficiency, but are finding it hard to achieve any benefit. Managers attempting to add AI agents to existing human teams may find that bots fail to faithfully follow their instructions, return pointless or obvious results, or burn precious time and resources spinning on tasks that older, simpler systems could have accomplished just as well.
The technical innovators getting the most out of AI are finding that the technology can be remarkably human in its behavior. And the more groups of AI agents are given tasks that require cooperation and collaboration, the more those human-like dynamics emerge.
Our research suggests that the most effective leaders in the coming years may still be those who excel at the timeworn principles of human management, because those principles apply so directly to hybrid teams of human and digital workers.
We have spent years studying the risks and opportunities for organizations adopting AI. Our 2025 book, Rewiring Democracy, examines lessons from AI adoption in government institutions and civil society worldwide. In it, we identify where the technology has made the biggest impact and where it fails to make a difference. Today, we see many of the organizations we’ve studied taking another shot at AI adoption—this time, with agentic tools. While generative AI generates, agentic AI acts and achieves goals such as automating supply chain processes, making data-driven investment decisions or managing complex project workflows. The cutting edge of AI development research is starting to reveal what works best in this new paradigm. — Read More
AI suddenly develops a human skill on its own Scientists now officially confused, concerned, and considering therapy
People, take a stiff drink for this one, because it’s going to be long, unhinged, and “why the hell is my toaster negotiating with my fridge” levels of existential blog.
Let me TL;DR this beast for ya.
In a plot twist no one saw coming, but everyone privately feared, our dear AI has decided to pick up a brand-new human skill all by itself, which is the skill of getting along in a group. — Read More
8 plots that explain the state of open models
Entering 2026, most people are aware that a handful of Chinese companies are making strong open AI models that are applying increasing pressure on the American AI economy.
While many Chinese labs are making models, the adoption metrics are dominated by Qwen (with a little help from DeepSeek). Adoption of the new entrants in the open model scene in 2025, from Z.ai, MiniMax, Kimi Moonshot, and others, is actually quite limited. This sets up a position where dethroning Qwen in adoption in 2026 looks all but impossible overall, though there are areas of opportunity. In fact, the strength of GPT-OSS shows that the U.S. could very well have the smartest open models again in 2026, even if they’re used far less across the ecosystem. — Read More
Chinese AI models have lagged the US frontier by 7 months on average since 2023
Since 2023, every model at the frontier of AI capabilities, as measured by the Epoch Capabilities Index, has been developed in the United States. Over that same period, Chinese models have trailed US capabilities by an average of seven months, with a minimum gap of four months and a maximum gap of 14. — Read More
The ROI Problem in Attack Surface Management
Attack Surface Management (ASM) tools promise reduced risk. What they usually deliver is more information.
Security teams deploy ASM, asset inventories grow, alerts start flowing, and dashboards fill up. There is visible activity and measurable output. But when leadership asks a simple question, “Is this reducing incidents?” the answer is often unclear.
This gap between effort and outcome is the core ROI problem in attack surface management, especially when ROI is measured primarily through asset counts instead of risk reduction. — Read More
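The metric gap described above can be made concrete with a toy calculation. All of the numbers and names below are invented for illustration: an "activity" metric (inventory size) improves every quarter while a risk metric (confirmed incidents) does not move at all.

```python
# Hypothetical figures (invented for this example, not from any real program):
# the ASM inventory grows quarter over quarter while incidents stay flat.
assets = {"Q1": 1200, "Q2": 2400, "Q3": 3900}    # discovered assets
incidents = {"Q1": 14, "Q2": 13, "Q3": 14}       # confirmed incidents

# An activity metric looks great: the inventory more than tripled.
activity_growth = assets["Q3"] / assets["Q1"]

# A risk metric tells a different story: incidents did not go down.
risk_reduction = 1 - incidents["Q3"] / incidents["Q1"]

print(f"inventory grew {activity_growth:.2f}x, incidents fell {risk_reduction:.0%}")
```

Reporting the first number makes the program look successful; reporting the second reveals the ROI problem the article describes.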
mHC: Manifold-Constrained Hyper-Connections
Recently, studies exemplified by Hyper-Connections (HC) have extended the ubiquitous residual connection paradigm established over the past decade by expanding the residual stream width and diversifying connectivity patterns. While yielding substantial performance gains, this diversification fundamentally compromises the identity mapping property intrinsic to the residual connection, which causes severe training instability and restricted scalability, and additionally incurs notable memory access overhead. To address these challenges, we propose Manifold-Constrained Hyper-Connections (mHC), a general framework that projects the residual connection space of HC onto a specific manifold to restore the identity mapping property, while incorporating rigorous infrastructure optimization to ensure efficiency. Empirical experiments demonstrate that mHC is effective for training at scale, offering tangible performance improvements and superior scalability. We anticipate that mHC, as a flexible and practical extension of HC, will contribute to a deeper understanding of topological architecture design and suggest promising directions for the evolution of foundational models. — Read More
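The identity-mapping issue the abstract describes can be illustrated with a toy NumPy sketch. This is an assumption-laden simplification, not the paper's actual construction: the specific manifold and projection here (row-stochastic matrices via softmax) are chosen only to show why constraining the stream-mixing matrix restores an identity-like mapping.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy hyper-connection: widen the residual stream to n parallel copies and
# mix them with a matrix M. Start from n identical streams carrying x.
n, d = 4, 8
x = rng.standard_normal(d)
streams = np.tile(x, (n, 1))          # n identical residual streams

# Unconstrained mixing breaks the identity mapping: each output row is
# (sum of M's row) * x, which is not x for a generic M.
M = rng.standard_normal((n, n))
mixed = M @ streams

# "Manifold constraint" (illustrative only): project M onto row-stochastic
# matrices, so each row sums to 1 and identical streams map to themselves.
def project_row_stochastic(M):
    e = np.exp(M - M.max(axis=1, keepdims=True))  # row-wise softmax
    return e / e.sum(axis=1, keepdims=True)

M_proj = project_row_stochastic(M)
mixed_proj = M_proj @ streams         # equals streams up to float error
```

The point of the sketch: once the mixing matrix is confined to a manifold whose rows sum to one, an input replicated across streams passes through unchanged, which is the identity-mapping property the unconstrained version loses.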
2025: The year in LLMs
This is the third in my annual series reviewing everything that happened in the LLM space over the past 12 months. For previous years see Stuff we figured out about AI in 2023 and Things we learned about LLMs in 2024.
It’s been a year filled with a lot of different trends. — Read More
Cybersecurity Changes I Expect in 2026
It becomes very clear that the primary security question for a company is how good its attackers’ AI is versus its own.
— ISOs increasingly realize that there is no way to scale their human teams to match attackers who are becoming constant, continuous, and increasingly effective
— It becomes a competition over how fast you can perform asset management, attack surface management, and vulnerability management across your company, especially on your perimeter (which includes email and phishing/social engineering)
Read More
Planetary-Scale Deep Reasoning: Building Our Final Presidential Daily Brief Prompt & Comparing Gemini 3/2.5 Pro/Flash ASR/TOC
Over the last few days we have been exploring having Gemini 3 Pro “watch” an entire day of television news from a given channel from across the world and write a deeply reasoned and researched intelligence-style report that looks across all of that coverage and teases out the overarching themes, narratives, implications and future impacts of the day’s events. Yesterday we had Gemini 3 Pro interactively improve its own prompt to generate a final “ultimate” prompt to write a Presidential Daily Brief (PDB)-style intelligence report from a day’s broadcast transcripts. Today we’ll add a few final refinements and then demonstrate our new prompt on a single day of a Russian television news channel across Gemini 3 Pro, Gemini 3 Flash, Gemini 2.5 Pro and Gemini 2.5 Flash Thinking using both the full-day Chirp 1 ASR transcripts and a preprocessed story table of contents. No data was used to train or tune any model. — Read More
China Just Pulled Its Own Manhattan Project and No One Saw It Coming
Or: The West banned the machines. China hired the machinists. Sometimes plans just do not go how you planned them. Ironically, I had been writing this article for a month, and all my research pointed to China being far too far behind. Well…
December 2025. Reuters reveals that China completed an operational EUV lithography prototype in a high-security Shenzhen facility. Not through reverse engineering captured ASML machines. Not through some breakthrough in domestic optics manufacturing. Through something far simpler.
They recruited the humans who knew how to build them. — Read More