Throughout history, technological and scientific advances have had both good and ill effects, but their overall impact has been overwhelmingly positive. Thanks to scientific progress, most people on earth live longer, healthier, and better than they did centuries or even decades ago.
I believe that AI (including AGI and ASI) can do the same and be a positive force for humanity. I also believe that it is possible to solve the “technical alignment” problem and build AIs that follow the words and intent of our instructions and report faithfully on their actions and observations.
… In the next decade, AI progress will be extremely rapid, and such periods of sharp transition can be risky. What we — in industry, academia, and government — do in the coming years will matter a lot to ensure that AI’s benefits far outweigh its costs. — Read More
Daily Archives: July 1, 2025
Using AI to identify cybercrime masterminds
Online criminal forums, both on the public internet and on the “dark web” of Tor .onion sites, are a rich resource for threat intelligence researchers. The Sophos Counter Threat Unit (CTU) has a team of dark-web researchers collecting intelligence and interacting with dark-web forums, but combing through these posts is a time-consuming and resource-intensive task, and it’s always possible that things are missed.
As we strive to make better use of AI and data analysis, Sophos AI researcher Francois Labreche, working with Estelle Ruellan of Flare and the Université de Montréal and Masarah Paquet-Clouston of the Université de Montréal, set out to see if they could approach the problem of identifying key actors on the dark web in a more automated way. Their work, originally presented at the 2024 APWG Symposium on Electronic Crime Research, has recently been published as a paper. — Read More
The New Skill in AI is Not Prompting, It’s Context Engineering
Context Engineering is a new term gaining traction in the AI world. The conversation is shifting from “prompt engineering” to a broader, more powerful concept: Context Engineering. Tobi Lutke describes it as “the art of providing all the context for the task to be plausibly solvable by the LLM,” and he is right.
With the rise of Agents, it becomes more important what information we load into the “limited working memory.” We are seeing that the main thing that determines whether an Agent succeeds or fails is the quality of the context you give it. Most agent failures are not model failures anymore; they are context failures. — Read More
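As a loose illustration of the idea (not any particular framework’s API — the helper name, priority order, and character budget below are all our own assumptions), context engineering amounts to deliberately packing the pieces an agent needs into the model’s limited window:

```python
# Hypothetical sketch: assemble an LLM call's context from several sources
# under a budget. All names here are illustrative, not a real framework.

def build_context(system_prompt, history, retrieved_docs, tool_results,
                  user_task, budget_chars=4000):
    """Pack the most relevant pieces into a message list until a rough
    character budget is exhausted; recent history is considered first."""
    messages = [{"role": "system", "content": system_prompt}]
    remaining = budget_chars - len(system_prompt) - len(user_task)

    # Prioritized sources: fresh tool output first, then retrieved
    # documents, then older conversation turns.
    for label, chunks in [("tool", tool_results), ("doc", retrieved_docs),
                          ("history", reversed(history))]:
        for chunk in chunks:
            if len(chunk) > remaining:
                continue  # drop what doesn't fit rather than truncate blindly
            messages.append({"role": "system",
                             "content": f"[{label}] {chunk}"})
            remaining -= len(chunk)

    messages.append({"role": "user", "content": user_task})
    return messages

msgs = build_context(
    system_prompt="You are a release-notes assistant.",
    history=["turn 1 ...", "turn 2 ..."],
    retrieved_docs=["CHANGELOG excerpt ..."],
    tool_results=["git log output ..."],
    user_task="Summarize what changed in v2.1.",
)
print(len(msgs))  # system prompt + packed context + user task
```

The point of the sketch is the prioritization and budgeting step: what gets left out of the window is as much an engineering decision as what goes in.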
Mark Zuckerberg announces creation of Meta Superintelligence Labs. Read the memo
Mark Zuckerberg said Monday that he’s creating Meta Superintelligence Labs, which will be led by some of his company’s most recent hires, including Scale AI ex-CEO Alexandr Wang and former GitHub CEO Nat Friedman.
Zuckerberg said the new AI superintelligence unit, MSL, will house the company’s various teams working on foundation models such as the open-source Llama software, products and Fundamental Artificial Intelligence Research projects, according to an internal memo obtained by CNBC. — Read More
China’s biggest public AI drop since DeepSeek, Baidu’s open source Ernie, is about to hit the market
On Monday, Chinese technology giant Baidu is making its Ernie generative AI large language model open source, a move by China’s tech sector that could be its biggest in the AI race since the emergence of DeepSeek. The open sourcing of Ernie will be a gradual roll-out, according to the company.
Will it be a shock to the market on the order of DeepSeek? That’s a question which divides AI experts. [Some] say Ernie’s release could cement China’s position as the undisputed AI leader. — Read More
Life of an inference request (vLLM V1): How LLMs are served efficiently at scale
vLLM is an open-source inference engine that serves large language models. We deploy multiple vLLM instances across GPUs and load open weight models like Llama 4 into them. We then load balance traffic across vLLM instances, run health checks, and do upgrades. Our customers consume our managed service by sending their prompts to our API endpoints. This endpoint also determines the vLLM instance that serves their prompt.
vLLM sits at the intersection of AI and systems programming, so we thought that diving into its details might interest some of our readers. In this blog post, we describe how an inference request travels through vLLM’s OpenAI-compatible API server and core engine. We also provide key code pointers.
We assume readers are already familiar with the transformer architecture and large language models. If you’re not, we highly recommend this video by OpenAI co-founder Andrej Karpathy. We will focus on the new V1 architecture of vLLM and how it achieves state-of-the-art text generation performance. If you’re looking for the V0 behavior or multi-modal inference, please refer to other vLLM documentation. — Read More
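From the client’s side, the request path described above starts as a plain HTTP call to a vLLM instance’s OpenAI-compatible completions endpoint. A minimal sketch of such a request payload follows; the host, port, and model name are placeholders, not the managed service’s actual endpoint:

```python
import json

# Sketch of a client request to a vLLM OpenAI-compatible server.
# Host, port, and model id below are placeholders; in the managed service
# described above, a load balancer picks the vLLM instance behind the URL.
url = "http://localhost:8000/v1/completions"
payload = {
    "model": "meta-llama/Llama-4-Scout-17B-16E-Instruct",  # placeholder id
    "prompt": "Explain continuous batching in one sentence.",
    "max_tokens": 64,
    "temperature": 0.0,
}
body = json.dumps(payload)

# To actually send it (requires a running server), POST `body` to `url`
# with Content-Type: application/json, e.g. via urllib.request or curl.
print(sorted(payload))  # → ['max_tokens', 'model', 'prompt', 'temperature']
```

Everything downstream of this call — tokenization, scheduling, batching, and detokenization — happens inside the API server and engine core that the post walks through.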
Efficient Federated Learning with Encrypted Data Sharing for Data-Heterogeneous Edge Devices
As privacy protection gains increasing importance, more models are being trained on edge devices and subsequently merged into the central server through Federated Learning (FL). However, current research overlooks the impact of network topology, physical distance, and data heterogeneity on edge devices, leading to issues such as increased latency and degraded model performance. To address these issues, we propose a new federated learning scheme for edge devices called Federated Learning with Encrypted Data Sharing (FedEDS). FedEDS uses the client model and the model’s stochastic layer to train the data encryptor. The data encryptor generates encrypted data and shares it with other clients. Each client then uses the corresponding client’s stochastic layer and the encrypted data to train and adjust its local model. FedEDS trains the model on the client’s local private data together with encrypted data shared by other clients. This approach accelerates the convergence of federated learning and mitigates the negative impact of data heterogeneity, making it suitable for application services deployed on edge devices that require rapid convergence. Experimental results show the efficacy of FedEDS in promoting model performance. — Read More
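A toy sketch of the flow the abstract describes, under our own reading of it (this is not the authors’ code: the “encryptor” here is just a random scalar projection standing in for the stochastic layer, and every name is a placeholder):

```python
import random

# Toy FedEDS-style flow: two clients with non-IID data share *transformed*
# (here: randomly scaled) samples instead of raw data, then each trains on
# local data plus the peer's shared data before a FedAvg-style merge.
random.seed(0)

def stochastic_layer(x, weight):
    """Stand-in for a client's stochastic layer: a random scalar projection."""
    return x * weight

def make_encryptor(weight):
    """Stand-in data encryptor: only the transformed value is shared."""
    return lambda x: stochastic_layer(x, weight)

# Two clients with heterogeneous scalar data drawn from y = 2x.
clients = {
    "a": {"data": [(x, 2 * x) for x in (1.0, 2.0)], "w": 0.0},
    "b": {"data": [(x, 2 * x) for x in (8.0, 9.0)], "w": 0.0},
}

# Each client builds an encryptor from its own stochastic layer and shares
# encrypted samples with the other client.
stoch = {cid: random.uniform(0.5, 1.5) for cid in clients}
shared = {
    cid: [(make_encryptor(stoch[cid])(x), y) for x, y in c["data"]]
    for cid, c in clients.items()
}

def local_train(w, samples, lr=0.005, epochs=200):
    """Plain SGD on squared error for the toy linear model y ≈ w * x."""
    for _ in range(epochs):
        for x, y in samples:
            w -= lr * 2 * (w * x - y) * x
    return w

# Each client trains on its private data plus the peer's shared data,
# mapped back through the peer's stochastic layer (our reading of "uses
# the corresponding client's stochastic layer and encrypted data").
for cid, c in clients.items():
    peer = "b" if cid == "a" else "a"
    peer_plain = [(ex / stoch[peer], y) for ex, y in shared[peer]]
    c["w"] = local_train(c["w"], c["data"] + peer_plain)

# Server-side FedAvg-style step: average the client models.
global_w = sum(c["w"] for c in clients.values()) / len(clients)
print(round(global_w, 2))
```

The sketch only illustrates the data-flow shape — share transformed data, train on local plus shared, then merge — not the paper’s actual encryptor training or convergence guarantees.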