We keep hearing that China is catching up with the West in AI compute. A great example of this comes from NVIDIA’s CEO Jensen Huang, who recently claimed that China has made “enormous progress” in the last few years, and that “China is right behind us. We’re very, very close.”
And China has indeed been making a ton of progress. As we’ll see, Chinese hardware has been closing the gap across a range of metrics relating to computational power and data transfer, both of which are crucial aspects of AI workloads.
But despite progress on these metrics, we don’t think China is about to leap ahead of the West on AI compute. China’s top developers—including Alibaba, ByteDance, Baidu, and DeepSeek—still rely primarily on NVIDIA chips, and major bottlenecks remain before that changes. — Read More
Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation
Scaling language models unlocks impressive capabilities, but the accompanying computational and memory demands make both training and deployment expensive. Existing efficiency efforts typically target either parameter sharing or adaptive computation, leaving open the question of how to attain both simultaneously. We introduce Mixture-of-Recursions (MoR), a unified framework that combines the two axes of efficiency inside a single Recursive Transformer. MoR reuses a shared stack of layers across recursion steps to achieve parameter efficiency, while lightweight routers enable adaptive token-level thinking by dynamically assigning different recursion depths to individual tokens. This allows MoR to focus quadratic attention computation only among tokens still active at a given recursion depth, further improving memory access efficiency by selectively caching only their key-value pairs. Beyond these core mechanisms, we also propose a KV sharing variant that reuses KV pairs from the first recursion, specifically designed to decrease prefill latency and memory footprint. Across model scales ranging from 135M to 1.7B parameters, MoR forms a new Pareto frontier: at equal training FLOPs and smaller model sizes, it significantly lowers validation perplexity and improves few-shot accuracy, while delivering higher throughput compared with vanilla and existing recursive baselines. These gains demonstrate that MoR is an effective path towards large-model quality without incurring large-model cost. — Read More
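The mechanism is easy to picture in code. Below is a minimal, illustrative PyTorch sketch of the core idea, not the authors’ implementation: a single shared block is applied repeatedly, and a lightweight router decides per token how many recursion steps it receives. All class and parameter names are ours.

```python
import torch
import torch.nn as nn

class MoRSketch(nn.Module):
    """Toy Mixture-of-Recursions: one shared block, per-token recursion depth."""

    def __init__(self, d_model=512, n_heads=8, max_recursions=4):
        super().__init__()
        # Parameter efficiency: a single block is reused at every recursion step.
        self.shared_block = nn.TransformerEncoderLayer(
            d_model, n_heads, batch_first=True)
        # Lightweight router: scores each token to decide if it keeps recursing.
        self.router = nn.Linear(d_model, 1)
        self.max_recursions = max_recursions

    def forward(self, x):  # x: (batch, seq_len, d_model)
        active = torch.ones(x.shape[:2], dtype=torch.bool, device=x.device)
        for _ in range(self.max_recursions):
            if not active.any():
                break
            updated = self.shared_block(x)
            # Adaptive depth: only still-active tokens take the update.
            # (The real MoR also restricts attention and KV caching to active
            # tokens, which is where the compute and memory savings come from.)
            x = torch.where(active.unsqueeze(-1), updated, x)
            # The router decides which tokens continue to the next depth.
            active = active & (torch.sigmoid(self.router(x)).squeeze(-1) > 0.5)
        return x

# Usage: 2 sequences of 16 tokens; each token is routed to its own depth.
out = MoRSketch()(torch.randn(2, 16, 512))
```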
Robotic neck incision replaces heart valve with no chest opening in world first
In a surgical first, doctors have replaced a heart valve through a small neck incision using robotic assistance, avoiding the need to open the chest.
The pioneering procedure, performed at the Cleveland Clinic by cardiothoracic surgeon Dr. Marijan Koprivanac, marks the first known clinical use of transcervical robotic access for aortic valve replacement (AVR).
Four patients underwent the technique earlier this year and were discharged within days. — Read More
Inverse Scaling in Test-Time Compute
We construct evaluation tasks where extending the reasoning length of Large Reasoning Models (LRMs) deteriorates performance, exhibiting an inverse scaling relationship between test-time compute and accuracy. Our evaluation tasks span four categories: simple counting tasks with distractors, regression tasks with spurious features, deduction tasks with constraint tracking, and advanced AI risks. We identify five distinct failure modes when models reason for longer: 1) Claude models become increasingly distracted by irrelevant information; 2) OpenAI o-series models resist distractors but overfit to problem framings; 3) models shift from reasonable priors to spurious correlations; 4) all models show difficulties in maintaining focus on complex deductive tasks; and 5) extended reasoning may amplify concerning behaviors, with Claude Sonnet 4 showing increased expressions of self-preservation. These findings suggest that while test-time compute scaling remains promising for improving model capabilities, it may inadvertently reinforce problematic reasoning patterns. Our results demonstrate the importance of evaluating models across diverse reasoning lengths to identify and address these failure modes in LRMs. — Read More
The Rise of the AI Database: Powering Real-Time AI Applications
As AI rapidly evolves, organizations are racing to build and deploy high-performance gen AI apps that deliver real-time insights and seamless user experiences. Central to this transformation is the emergence of the generative AI database, a new category of data platform optimized for vector search, semantic indexing and full-text retrieval. These systems are designed to address challenges like data silos, data quality and integration for AI and analytics. As the name suggests, a gen AI database is purpose-built to power generative AI models and applications, enabling developers to store, query and analyze both structured and unstructured data at scale. The data stored in these platforms plays a crucial role in supporting advanced analytics and machine learning. — Read More
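To make “vector search” concrete, here is a minimal, self-contained sketch of the retrieval core such a platform exposes. Real gen AI databases layer approximate-nearest-neighbor indexes (e.g., HNSW), metadata filtering, and hybrid full-text scoring on top; the function name and data below are illustrative.

```python
import numpy as np

def cosine_top_k(query_vec, doc_matrix, k=3):
    """Return indices of the k document embeddings most similar to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
    scores = d @ q                       # cosine similarity vs. every document
    return np.argsort(-scores)[:k]

# Usage: in practice the embeddings come from an embedding model, not random data.
docs = np.random.rand(1000, 384)         # 1,000 documents, 384-dim embeddings
query = np.random.rand(384)
print(cosine_top_k(query, docs))         # ids of the 3 closest documents
```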
Context Engineering: 2025’s #1 Skill in AI
Let’s get one thing straight: if you’re still only talking about “prompt engineering,” you’re behind the curve. In the early days of Large Language Models (LLMs), crafting the perfect prompt was the name of the game.
For simple chatbots in 2022, it was enough. Then came Retrieval-Augmented Generation (RAG) in 2023, where we started feeding models domain-specific knowledge. Now, we have tool-using, memory-enabled agents that need to build relationships and maintain state over time. The single-interaction focus of prompt engineering just doesn’t cut it anymore. — Read More
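As a rough illustration of what context engineering adds beyond a single prompt, the sketch below assembles a model’s context window from several managed sources. This is a hypothetical composition function, not any particular framework’s API.

```python
# Illustrative only: context is assembled from managed sources, not hand-written.
def build_context(user_msg, retrieved_docs, memory, tool_results,
                  system_prompt="You are a helpful assistant."):
    """Compose the full context window an agent sends to the model."""
    sections = [f"[SYSTEM]\n{system_prompt}"]
    if memory:                            # long-term state carried across sessions
        sections.append("[MEMORY]\n" + "\n".join(memory))
    if retrieved_docs:                    # RAG: retrieved domain knowledge
        sections.append("[DOCUMENTS]\n" + "\n".join(retrieved_docs))
    if tool_results:                      # outputs of earlier tool calls
        sections.append("[TOOLS]\n" + "\n".join(tool_results))
    sections.append(f"[USER]\n{user_msg}")
    return "\n\n".join(sections)

# Usage:
ctx = build_context(
    "What did we decide last week?",
    retrieved_docs=["Design doc: use Postgres."],
    memory=["User prefers concise answers."],
    tool_results=["calendar: meeting notes from last Tuesday"])
```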
Experts react: What Trump’s new AI Action Plan means for tech, energy, the economy, and more
“An industrial revolution, an information revolution, and a renaissance—all at once.” That’s how the Trump administration describes artificial intelligence (AI) in its new “AI Action Plan.” Released on Wednesday, the plan calls for cutting regulations to spur AI innovation and adoption, speeding up the buildout of AI data centers, exporting AI “full technology stacks” to US allies and partners, and ridding AI systems of what the White House calls “ideological bias.” How does the plan’s approach to AI policy differ from past US policy? What impacts will it have on the US AI industry and global AI governance? What are the implications for energy and the global economy? Our experts share their human-generated responses to these burning AI questions below. — Read More
America’s AI Action Plan
America is in a race to achieve global dominance in artificial intelligence (AI). Winning this race will usher in a new era of human flourishing, economic competitiveness, and national security for the American people. Recognizing this, President Trump directed the creation of an AI Action Plan in the early days of his second term in office. Based on the three pillars of accelerating innovation, building AI infrastructure, and leading in international diplomacy and security, this Action Plan is America’s roadmap to win the race. — Read More
Surprising no one, new research says AI Overviews cause massive drop in search clicks
Google’s search results have undergone a seismic shift over the past year as AI fever has continued to escalate among the tech giants. Nowhere is this change more apparent than right at the top of Google’s storied results page, which is now home to AI Overviews. Google contends these Gemini-based answers don’t take traffic away from websites, but a new analysis from the Pew Research Center says otherwise. Its analysis shows that searches with AI summaries reduce clicks, and their prevalence is increasing.
Google began testing AI Overviews as the “search generative experience” in May 2023, and just a year later, they were an official part of the search engine results page (SERP). Many sites (including this one) have noticed changes to their traffic in the wake of this move, but Google has brushed off concerns about how this could affect the sites from which it collects all that data.
SEO experts have disagreed with Google’s stance on how AI affects web traffic, and the newly released Pew study backs them up. — Read More
‘Another DeepSeek moment’: Chinese AI model Kimi K2 stirs excitement
Excitement is growing among researchers about another powerful artificial intelligence (AI) model to emerge from China, after DeepSeek shocked the world with its launch of R1 in January.
The performance of Kimi K2, launched on 11 July by Beijing-based company Moonshot AI, matches or surpasses that of Western rivals, as well as some DeepSeek models, across various benchmarks, according to the firm. In particular, it seems to excel at coding, scoring highly in tests such as LiveCodeBench. — Read More