Google Revealed “Attention Is All You Need” Part II

For years, deep learning has followed one central idea: if we want smarter models, we stack more layers, run larger training jobs, and scale everything upward. This simple formula has given us large language models that reason well and generate high-quality text. Yet they still share one huge weakness: they cannot learn on the fly, and they cannot update themselves during use.

Any change needs heavy retraining, and this often destroys old knowledge.

Google Research recently published a paper called Nested Learning. It offers a very different way of thinking about how learning should work inside neural networks. The researchers claim that a model is not just a big stack of layers. It is a hierarchy of learners that operate at different timescales. If this view is correct, it could reshape how we build AI systems in the coming years. — Read More

#big7

Apple’s AI Game Is Misunderstood

Apple’s AI strategy has become a Rorschach test for the technology industry. Critics see a company falling dangerously behind. Needham analyst Laura Martin claims it is one to two years behind its competitors. But almost all of this commentary, whether bullish or bearish, focuses on the wrong question.

The standard narrative compares Apple’s AI capex to Microsoft’s, Apple’s Siri to Google’s Gemini, Apple’s foundation models to OpenAI’s GPT-4. By these metrics, Apple looks behind. But these comparisons assume Apple is trying to win the same race. The evidence suggests it isn’t. — Read More

#big7

Meta’s ‘Avocado’ AI Model Delayed as Internal Tensions Rise

Meta is scrambling to deliver its next frontier AI model, codenamed Avocado, as internal friction mounts over the company’s shift from open-source Llama models to proprietary development. The social media giant’s $14.3 billion bet on new AI leadership is creating cultural clashes while competitors like OpenAI and Google pull ahead in the AI race.

Meta is facing its biggest AI reckoning yet: Avocado won’t arrive until the first quarter of 2026. … The delay represents more than just technical challenges. According to sources familiar with the project, Avocado is wrestling with training-related performance testing as Meta tries to ensure the system will be competitive when it debuts. — Read More

#big7

How Google Pulled Off Its Stunning, Rapid-Fire AI Turnaround

Google came into 2025 with its AI stumbles looming large. The company’s slow start to the generative AI race turned borderline catastrophic in 2024 when its products generated images of diverse Nazis, told users to eat rocks, and couldn’t match OpenAI’s shine. AI chat was seen as a major threat to search, and outsiders didn’t see a coherent strategy. In January, Google stock was on the sale rack and murmurs about CEO Sundar Pichai’s job security floated around the internet.

We’re not quite in December, and Google has masterfully reversed course. Its AI models are world class. Its products are buzzy again. Its cloud business is booming. And search is stronger than ever. Its stock is up 56% this year and, at $3.59 trillion, it just surpassed Microsoft’s market cap. Now, no serious person would question Pichai’s job status. — Read More

#big7

OpenAI can’t beat Google in consumer AI

OpenAI can’t beat Google at consumer AI, as long as we are in the “chatbot” paradigm. The clock is ticking for OpenAI to pull a rabbit out of the hat ASAP (in December). It’s worrisome that OpenAI’s best effort at front-running the Gemini 3 release was GPT-5.1, which was barely an improvement. Most importantly, Google has much cheaper inference COGS than OpenAI due to its vertical AI integration (with TPUs) and scale. That allows Google to commoditize whatever OpenAI puts out, making monetization impossible.

Google’s data advantage, especially in multi-modal, is really shining. Because Google’s so strong in multi-modal, Gemini 3 just destroyed Sonnet 4.5 in frontend UI coding (which is a visual task). Little things like this make Google hard to beat, because OpenAI can’t synthetically generate every type of data for training, e.g., YouTube or Google Maps. — Read More

#big7

Introducing Nested Learning: A new ML paradigm for continual learning

We introduce Nested Learning, a new approach to machine learning that views models as a set of smaller, nested optimization problems, each with its own internal workflow, in order to mitigate or even completely avoid the issue of “catastrophic forgetting”, where learning new tasks sacrifices proficiency on old tasks.

The last decade has seen incredible progress in machine learning (ML), primarily driven by powerful neural network architectures and the algorithms used to train them. However, despite the success of large language models (LLMs), a few fundamental challenges persist, especially around continual learning, the ability for a model to actively acquire new knowledge and skills over time without forgetting old ones.

When it comes to continual learning and self-improvement, the human brain is the gold standard. It adapts through neuroplasticity — the remarkable capacity to change its structure in response to new experiences, memories, and learning. Without this ability, a person is limited to immediate context (like anterograde amnesia). We see a similar limitation in current LLMs: their knowledge is confined to either the immediate context of their input window or the static information that they learn during pre-training.

The simple approach, continually updating a model’s parameters with new data, often leads to “catastrophic forgetting” (CF), where learning new tasks sacrifices proficiency on old tasks. Researchers traditionally combat CF through architectural tweaks or better optimization rules. However, for too long, we have treated the model’s architecture (the network structure) and the optimization algorithm (the training rule) as two separate things, which prevents us from achieving a truly unified, efficient learning system.
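
To make that failure mode concrete, here is a toy sketch (an illustration of the problem, not code from the paper): a small PyTorch network is trained on a synthetic task A, then naively fine-tuned on a synthetic task B, and its task-A loss is measured again.

```python
# Toy illustration of catastrophic forgetting under naive continual updates.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

# Two synthetic regression "tasks" with different input and target statistics.
task_a = (torch.randn(256, 10), torch.randn(256, 1))
task_b = (torch.randn(256, 10) + 3.0, torch.randn(256, 1) - 3.0)

def train(data, steps=300):
    x, y = data
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

def eval_loss(data):
    x, y = data
    with torch.no_grad():
        return loss_fn(model(x), y).item()

train(task_a)
print("task A loss after training on A:", eval_loss(task_a))
train(task_b)  # naive continual update: same parameters, new data
print("task A loss after training on B:", eval_loss(task_a))  # typically much higher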

In our paper, “Nested Learning: The Illusion of Deep Learning Architectures”, published at NeurIPS 2025, we introduce Nested Learning, which bridges this gap. Nested Learning treats a single ML model not as one continuous process, but as a system of interconnected, multi-level learning problems that are optimized simultaneously. We argue that the model’s architecture and the rules used to train it (i.e., the optimization algorithm) are fundamentally the same concepts; they are just different “levels” of optimization, each with its own internal flow of information (“context flow”) and update rate. By recognizing this inherent structure, Nested Learning provides a new, previously invisible dimension for designing more capable AI, allowing us to build learning components with deeper computational depth, which ultimately helps solve issues like catastrophic forgetting. — Read More
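
The snippet below is a minimal sketch of the multi-timescale idea, not the algorithm from the paper: two parameter groups stand in for two “levels” of optimization, one adapting on every batch and one consolidating only every few steps, with the update period and learning rates chosen arbitrarily for illustration.

```python
# Sketch of multi-timescale updates: parameter groups as "levels" with their own update rate.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 1))
fast_opt = torch.optim.SGD(model[2].parameters(), lr=0.05)   # inner "fast" level
slow_opt = torch.optim.SGD(model[0].parameters(), lr=0.005)  # outer "slow" level
loss_fn = nn.MSELoss()
SLOW_EVERY = 10  # slow level's update period (arbitrary choice for this sketch)

def batch_stream(n):
    for _ in range(n):
        yield torch.randn(32, 10), torch.randn(32, 1)

for step, (x, y) in enumerate(batch_stream(100), start=1):
    fast_opt.zero_grad()                 # clears only the fast level's gradients
    loss_fn(model(x), y).backward()      # gradients flow to both levels; slow ones accumulate
    fast_opt.step()                      # fast level adapts on every batch
    if step % SLOW_EVERY == 0:
        slow_opt.step()                  # slow level updates only every SLOW_EVERY steps
        slow_opt.zero_grad()             # then clears its accumulated gradient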

#big7

Introducing SIMA 2, the next milestone in our research on creating general and helpful AI agents

Read More

#big7, #videos

Google DeepMind is using Gemini to train agents inside Goat Simulator 3

Google DeepMind has built a new video-game-playing agent called SIMA 2 that can navigate and solve problems in a wide range of 3D virtual worlds. The company claims it’s a big step toward more general-purpose agents and better real-world robots.

Google DeepMind first demoed SIMA (which stands for “scalable instructable multiworld agent”) last year. But SIMA 2 has been built on top of Gemini, the firm’s flagship large language model, which gives the agent a huge boost in capability.

The researchers claim that SIMA 2 can carry out a range of more complex tasks inside virtual worlds, figure out how to solve certain challenges by itself, and chat with its users. It can also improve itself by tackling harder tasks multiple times and learning through trial and error. — Read More
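
As a loose illustration of that trial-and-error loop, here is a toy sketch in which an agent retries a task several times, nudges its skill after every attempt, and graduates to a harder task once it succeeds. The Agent and Environment classes are made-up stand-ins, not DeepMind’s SIMA 2 interfaces.

```python
# Toy self-improvement loop: repeated attempts, learning from outcomes, rising difficulty.
import random

random.seed(0)

class Environment:
    """Toy task whose success probability depends on agent skill vs. task difficulty."""
    def __init__(self, difficulty):
        self.difficulty = difficulty

    def attempt(self, skill):
        p_success = min(0.95, max(0.05, 0.5 + skill - self.difficulty))
        return random.random() < p_success

class Agent:
    """Toy agent whose 'skill' improves a little after every attempt, more on success."""
    def __init__(self):
        self.skill = 0.0

    def learn_from(self, succeeded):
        self.skill += 0.05 if succeeded else 0.02

agent = Agent()
difficulty = 0.2
for round_idx in range(10):
    env = Environment(difficulty)
    outcomes = []
    for _ in range(5):            # several attempts at the same task
        ok = env.attempt(agent.skill)
        agent.learn_from(ok)      # learn from both successes and failures
        outcomes.append(ok)
    if any(outcomes):
        difficulty += 0.1         # graduate to a harder task once this one is solved
    print(f"round {round_idx}: skill={agent.skill:.2f}, next difficulty={difficulty:.2f}")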

#big7

‘There isn’t really another choice’: Signal chief explains why the encrypted messenger relies on AWS

After last week’s major Amazon Web Services (AWS) outage took Signal along with it, Elon Musk was quick to criticize the encrypted messaging app’s reliance on big tech. But Signal president Meredith Whittaker argues that the company didn’t have any other choice but to use AWS or another major cloud provider.

“The problem here is not that Signal ‘chose’ to run on AWS,” Whittaker writes in a series of posts on Bluesky. “The problem is the concentration of power in the infrastructure space that means there isn’t really another choice: the entire stack, practically speaking, is owned by 3-4 players.” — Read More

#big7

Microsoft AI announces first image generator created in-house

Microsoft AI just announced its first text-to-image generator, MAI-Image-1, designed and developed in-house. The tech giant, which recently announced its first in-house Microsoft AI models, called the new image generator “the next step on our journey.”

Microsoft says it sought feedback from creative professionals in order to avoid “repetitive or generically-stylized outputs.” MAI-Image-1 “excels” at photorealistic imagery like lighting, landscapes, and more, the company claims. And it can process requests and produce images faster than “larger, slower models.” The model has already secured a spot in the top 10 of LMArena, the AI benchmark site where humans compare outputs from different systems and vote on the best one. — Read More

#big7