With the rise of traffic from AI agents, what’s considered a bot is no longer clear-cut. Some bots are clearly malicious, like those that DoS your site or perform credential stuffing, while others are ones most site owners do want interacting with their site, like the bot that indexes your site for a search engine, or ones that fetch RSS feeds.
Historically, Cloudflare has relied on two main signals to distinguish legitimate web crawlers from other types of automated traffic: user agent headers and IP addresses. The User-Agent header allows bot developers to identify themselves, e.g. MyBotCrawler/1.1. However, user agent headers alone are easily spoofed and are therefore insufficient for reliable identification. To address this, user agent checks are often supplemented with IP address validation: inspecting published IP address ranges to confirm a crawler’s authenticity. However, the logic of mapping IP address ranges to a product or group of users is brittle: connections from the crawling service might be shared by multiple users, as with privacy proxies and VPNs, and the ranges, often maintained by cloud providers, change over time.
Today, we’re introducing two proposals – HTTP message signatures and request mTLS – for friendly bots to authenticate themselves, and for customer origins to identify them. In this blog post, we’ll share how these authentication mechanisms work, how we implemented them, and how you can participate in our closed beta. — Read More
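The HTTP message signatures proposal above builds on RFC 9421, where a bot signs a serialized "signature base" of covered request components. The sketch below is illustrative, not Cloudflare's implementation: real Web Bot Auth uses an asymmetric scheme such as Ed25519, but HMAC-SHA256 stands in here so the example runs with only the standard library, and the component names and key id are made up.

```python
# Illustrative sketch of building an RFC 9421-style signature base for a
# bot request and signing it. HMAC-SHA256 stands in for the asymmetric
# signature a real bot would use.
import base64
import hashlib
import hmac

def signature_base(components: dict, params: str) -> str:
    """Serialize covered components plus @signature-params, RFC 9421-style."""
    lines = [f'"{name}": {value}' for name, value in components.items()]
    lines.append(f'"@signature-params": {params}')
    return "\n".join(lines)

def sign(secret: bytes, base: str) -> str:
    digest = hmac.new(secret, base.encode(), hashlib.sha256).digest()
    return base64.b64encode(digest).decode()

components = {
    "@authority": "example.com",
    "signature-agent": '"https://mybotcrawler.example"',
}
params = '("@authority" "signature-agent");created=1700000000;keyid="test-key"'
base = signature_base(components, params)
sig = sign(b"shared-secret", base)
print(base)
print(f"Signature: sig1=:{sig}:")
```

An origin verifying the request would rebuild the same signature base from the received headers and check the signature against the bot's published key material.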
The AI Engineering Stack
“AI Engineering” is a term I hadn’t heard of two years ago, but today, AI engineers are in high demand. Companies like Meta, Google, and Amazon offer higher base salaries for these roles than “regular” software engineers get, while AI startups and scaleups are scrambling to hire them.
However, closer inspection reveals that AI engineers are often regular software engineers who have mastered the basics of large language models (LLMs), such as working with them and integrating them.
So far, the best book I’ve found on this hot topic is AI Engineering by Chip Huyen, published in January by O’Reilly. Chip has worked as a researcher at Netflix, was a core developer at NVIDIA (building NeMo, NVIDIA’s GenAI framework), and cofounded Claypot AI. She has also taught machine learning (ML) at Stanford University. — Read More
AI in Search: Going beyond information to intelligence
We launched AI Overviews last year at I/O, and since then there’s been a profound shift in how people are using Google Search. People are coming to Google to ask more of their questions, including more complex, longer and multimodal questions.
AI in Search is making it easier to ask Google anything and get a helpful response, with links to the web. That’s why AI Overviews is one of the most successful launches in Search in the past decade. As people use AI Overviews, we see they’re happier with their results, and they search more often. In our biggest markets like the U.S. and India, AI Overviews is driving an over 10% increase in usage of Google for the types of queries that show AI Overviews. This means that once people use AI Overviews, they come back to do more of these types of queries, and what’s particularly exciting is how this growth increases over time. And we’re delivering this at the speed people expect of Google Search — AI Overviews delivers the fastest AI responses in the industry.
We’re continuing to advance Search with AI, and today at I/O, we showed the latest in how we’re building the future of Search, as we go beyond information to intelligence. Here’s a look at everything we announced. — Read More
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures
The rapid scaling of large language models (LLMs) has unveiled critical limitations in current hardware architectures, including constraints in memory capacity, computational efficiency, and interconnection bandwidth. DeepSeek-V3, trained on 2,048 NVIDIA H800 GPUs, demonstrates how hardware-aware model co-design can effectively address these challenges, enabling cost-efficient training and inference at scale. This paper presents an in-depth analysis of the DeepSeek-V3/R1 model architecture and its AI infrastructure, highlighting key innovations such as Multi-head Latent Attention (MLA) for enhanced memory efficiency, Mixture of Experts (MoE) architectures for optimized computation-communication trade-offs, FP8 mixed-precision training to unlock the full potential of hardware capabilities, and a Multi-Plane Network Topology to minimize cluster-level network overhead. Building on the hardware bottlenecks encountered during DeepSeek-V3’s development, we engage in a broader discussion with academic and industry peers on potential future hardware directions, including precise low-precision computation units, scale-up and scale-out convergence, and innovations in low-latency communication fabrics. These insights underscore the critical role of hardware and model co-design in meeting the escalating demands of AI workloads, offering a practical blueprint for innovation in next-generation AI systems. — Read More
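Among the innovations the abstract lists, Mixture of Experts (MoE) is the one driving the computation-communication trade-off: each token activates only a few experts, chosen by a router. The toy sketch below shows generic top-k routing in pure Python; the expert count, k, and the softmax gate are illustrative stand-ins, not DeepSeek-V3's actual configuration.

```python
# A minimal sketch of Mixture-of-Experts top-k routing: a router scores
# all experts per token, but only the top-k experts actually run.
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route(gate_logits, k=2):
    """Pick the top-k experts for one token and renormalize their weights."""
    probs = softmax(gate_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in top)
    return [(i, probs[i] / total) for i in top]

# One token's router logits over 8 experts: only 2 experts are activated.
experts = route([0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3], k=2)
print(experts)
```

Because each token touches only k experts, compute per token stays roughly constant as total parameters grow, at the cost of routing tokens across devices, which is the communication side of the trade-off the paper optimizes.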
The top 5 domestic large models contend for supremacy in a decisive battle for AGI
China’s foundation model market has completely changed. Today, the players at the table are the “Top 5 Foundation Models”: Bytedance, Alibaba, Stepfun [阶跃星辰], Zhipu, and DeepSeek. Where will the decisive advantage lie in the next battle for the summit?
DeepSeek’s emergence out of nowhere has completely changed the global AI landscape.
Since then, not only has the competitive pattern between Chinese and American large models changed, but the industrial landscape of domestic large models has also been upended in one stroke.
Looking at China’s large-scale foundation model market, today’s landscape has changed dramatically and evolved into a new top-five lineup –
Bytedance, Alibaba, Stepfun, Zhipu, and DeepSeek. — Read More
OpenAlpha_Evolve
OpenAlpha_Evolve is an open-source Python framework inspired by the groundbreaking research on autonomous coding agents like DeepMind’s AlphaEvolve. It’s a reimplementation of the core idea: an intelligent system that iteratively writes, tests, and improves code using large language models (LLMs) like Google’s Gemini, guided by the principles of evolution. — Read More
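The iterative write-test-improve loop described above can be sketched as a toy evolutionary search. This is not OpenAlpha_Evolve's actual code: a real run would ask an LLM (e.g. Gemini) to generate and mutate candidate programs, so a hardcoded candidate list stands in for the model here to keep the loop runnable.

```python
# Toy generate -> test -> select loop in the spirit of AlphaEvolve-style
# coding agents. Candidates are scored by test cases; survivors move on.

def fitness(program, cases):
    """Score a candidate by how many test cases it passes."""
    score = 0
    for args, expected in cases:
        try:
            if program(*args) == expected:
                score += 1
        except Exception:
            pass  # a crashing candidate simply scores no points
    return score

def evolve(candidates, cases, generations=3, survivors=2):
    population = list(candidates)
    for _ in range(generations):
        population.sort(key=lambda p: fitness(p, cases), reverse=True)
        population = population[:survivors]  # selection
        # "Mutation" stand-in: a real system would prompt the LLM to vary
        # the survivors; here they are simply carried forward.
        population += population
    return population[0]

# Target behavior: addition. Only one candidate satisfies every case.
cases = [((2, 3), 5), ((0, 0), 0), ((-1, 1), 0)]
candidates = [lambda a, b: a - b, lambda a, b: a + b, lambda a, b: a * b]
best = evolve(candidates, cases)
print(fitness(best, cases))
```

The interesting engineering in the real framework lies in the parts elided here: prompting the LLM with failing tests, sandboxing execution, and maintaining a diverse population.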
Large Language Models Are More Persuasive Than Incentivized Human Persuaders
We directly compare the persuasion capabilities of a frontier large language model (LLM; Claude Sonnet 3.5) against incentivized human persuaders in an interactive, real-time conversational quiz setting. In this preregistered, large-scale incentivized experiment, participants (quiz takers) completed an online quiz where persuaders (either humans or LLMs) attempted to persuade quiz takers toward correct or incorrect answers. We find that LLM persuaders achieved significantly higher compliance with their directional persuasion attempts than incentivized human persuaders, demonstrating superior persuasive capabilities in both truthful (toward correct answers) and deceptive (toward incorrect answers) contexts. We also find that LLM persuaders significantly increased quiz takers’ accuracy, leading to higher earnings, when steering quiz takers toward correct answers, and significantly decreased their accuracy, leading to lower earnings, when steering them toward incorrect answers. Overall, our findings suggest that AI’s persuasion capabilities already exceed those of humans who have real-money bonuses tied to performance. Our findings of increasingly capable AI persuaders thus underscore the urgency of emerging alignment and governance frameworks. — Read More
The Simulation Says the Orioles Should Be Good
The Baltimore Orioles should be good, but they are not good. At 15-24, they are one of the worst teams in all of Major League Baseball this season, an outcome thus far that fans, experts, and the team itself will tell you is either statistically improbable or nearly statistically impossible based on thousands upon thousands of simulations run before the season started.
Trying to figure out why this is happening is tearing the fanbase apart and has turned a large portion of them against management, which has put a huge amount of its faith, on-field strategy, and player-acquisition decision making into predictive AI systems, advanced statistics, probabilistic simulations, expected-value-positive moves, and new-age baseball thinking, in which statistical models and AI systems try to reduce human baseball players to robotic, predictable chess pieces. Teams have more or less tried to “solve” baseball like researchers try to solve games with AI. Technology has changed not just how teams play the game, but how fans like me experience it, too. — Read More
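The "thousands upon thousands of simulations" the article refers to are typically Monte Carlo season runs. The sketch below is a minimal version under made-up assumptions: a fixed per-game win probability of 0.55 (not the Orioles' actual projection) and independent games, which real projection systems refine with rosters, matchups, and injuries.

```python
# Minimal Monte Carlo season simulation: replay a 162-game season many
# times from an assumed per-game win probability, then ask how often a
# season at least as bad as a 15-24 pace (~62 wins over 162) shows up.
import random

def simulate_seasons(win_prob, games=162, runs=10000, seed=0):
    rng = random.Random(seed)
    return [
        sum(1 for _ in range(games) if rng.random() < win_prob)
        for _ in range(runs)
    ]

wins = simulate_seasons(0.55)
avg_wins = sum(wins) / len(wins)
# Fraction of simulated seasons finishing at or below a 15-24 full-season pace.
frac_terrible = sum(1 for w in wins if w <= 62) / len(wins)
print(round(avg_wins, 1), frac_terrible)
```

Under these toy assumptions, almost no simulated season lands that far below the mean, which is exactly why a real 15-24 start reads as "statistically impossible" to fans staring at the preseason distributions.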
Company Regrets Replacing All Those Pesky Human Workers With AI, Just Wants Its Humans Back
Two years after partnering with OpenAI to automate marketing and customer service jobs, financial tech startup Klarna says it’s longing for human connection again.
Once gunning to be OpenAI CEO Sam Altman’s “favorite guinea pig,” Klarna is now plotting a big recruitment drive after its AI customer service agents couldn’t quite hack it.
The buy-now-pay-later company had previously shredded its marketing contracts in 2023, followed by its customer service team in 2024, proudly replacing both with AI agents. Now, the company says it imagines an “Uber-type of setup” to fill its ranks, with gig workers logging in remotely to argue with customers from the comfort of their own homes. — Read More
INTELLECT-2: A Reasoning Model Trained Through Globally Decentralized Reinforcement Learning
We introduce INTELLECT-2, the first globally distributed reinforcement learning (RL) training run of a 32 billion parameter language model. Unlike traditional centralized training efforts, INTELLECT-2 trains a reasoning model using fully asynchronous RL across a dynamic, heterogeneous swarm of permissionless compute contributors.
To enable a training run with this unique infrastructure, we built various components from scratch: we introduce PRIME-RL, our training framework purpose-built for distributed asynchronous reinforcement learning, built on top of novel components such as TOPLOC, which verifies rollouts from untrusted inference workers, and SHARDCAST, which efficiently broadcasts policy weights from training nodes to inference workers.
Beyond infrastructure components, we propose modifications to the standard GRPO training recipe and data filtering techniques that were crucial to achieving training stability and ensuring that our model successfully learned its training objective, thus improving upon QwQ-32B, the state-of-the-art reasoning model in the 32B parameter range.
We open-source INTELLECT-2 along with all of our code and data, hoping to encourage and enable more open research in the field of decentralized training. — Read More