The Top 100 Gen AI Consumer Apps

In just six months, the consumer AI landscape has been redrawn. Some products surged, others stalled, and a few unexpected players rewrote the leaderboard overnight. DeepSeek rocketed from obscurity to become a leading ChatGPT challenger. AI video models advanced from experimental to fairly dependable (at least for short clips!). And so-called “vibecoding” is changing who can create with AI, not just who can use it. The competition is tighter, the stakes are higher, and the winners aren’t just launching; they’re sticking.

We turned to the data to answer: Which AI apps are people actively using? What’s actually making money, beyond being popular? And which tools are moving beyond curiosity-driven dabbling to become daily staples?

This is the fourth installment of the Top 100 Gen AI Consumer Apps, our bi-annual ranking of the top 50 AI-first web products (by unique monthly visits, per Similarweb) and top 50 AI-first mobile apps (by monthly active users, per Sensor Tower). Since our last report in August 2024, 17 new companies have entered the rankings of top AI-first web products. — Read More

#investing

The Model is the Product

There has been a lot of speculation over the past few years about what the next cycle of AI development would bring. Agents? Reasoners? Actual multimodality?

I think it’s time to call it: the model is the product.

All current factors in research and market development push in this direction.

— Generalist scaling is stalling.
— Opinionated training is working much better than expected.
— Inference costs are in free fall.

This is also an uncomfortable direction. All investors have been betting on the application layer. In the next stage of AI evolution, the application layer is likely to be the first to be automated and disrupted. — Read More

#investing

I quit my FAANG job because it’ll be automated by the end of 2025

Until this February, I had gainful employment at [redacted FAANG co] doing machine learning engineering for fine-tuning LLMs on language translation tasks. It was a great gig, and I enjoyed the work and my coworkers. However, taking a medium-term look at the market dynamics surrounding my employment prompted me to quit a few weeks ago. I’m now convinced that my former job there will be obsolete by the end of the year. — Read More

#strategy

DOJ: Google must sell Chrome, Android could be next

Google has gotten its first taste of remedies that Donald Trump’s Department of Justice plans to pursue to break up the tech giant’s monopoly in search. In the first filing since Trump allies took over the department, government lawyers backed off a key proposal submitted by the Biden DOJ. The government won’t ask the court to force Google to sell off its AI investments, and the way it intends to handle Android is changing. However, the most serious penalty is intact—Google’s popular Chrome browser is still on the chopping block. — Read More

#big7, #strategy

PipeOffload: Improving Scalability of Pipeline Parallelism with Memory Optimization

Pipeline parallelism (PP) is widely used for training large language models (LLMs), yet its scalability is often constrained by high activation memory consumption as the number of in-flight microbatches grows with the degree of PP. In this paper, we focus on addressing this challenge by leveraging the under-explored memory offload strategy in PP. Through empirical study, we discover that in the majority of standard configurations, at least half, and potentially all, of the activations can be offloaded with negligible overhead. In cases where full offload is not possible, we introduce a novel selective offload strategy that decreases peak activation memory in a better-than-linear manner. Furthermore, we integrate memory offload with other techniques to jointly consider overall throughput and memory limitation. Our experiments show that per-device activation memory effectively decreases with the total number of stages, making PP a stronger alternative to TP, offering up to a 19% acceleration with even lower memory consumption. The implementation is open-sourced at this https URL. — Read More
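
The core mechanism is easy to prototype. Below is a minimal sketch (not the paper’s open-sourced implementation) of activation offload using PyTorch’s saved-tensor hooks: activations saved for the backward pass are copied to pinned CPU memory and fetched back to the GPU when gradients are computed. The size threshold is an illustrative stand-in for the paper’s selective offload criterion.

```python
# Sketch of activation offload via PyTorch saved-tensor hooks.
# Assumptions: a CUDA device is available; the 64K-element cutoff is a
# made-up stand-in for a real selective-offload policy.
import torch

def pack_to_cpu(t: torch.Tensor):
    # Keep small activations on the GPU: transfer overhead would
    # outweigh the memory saved.
    if t.numel() < 1 << 16:
        return t
    buf = torch.empty(t.shape, dtype=t.dtype, device="cpu", pin_memory=True)
    buf.copy_(t, non_blocking=True)  # async device-to-host copy
    return buf

def unpack_to_gpu(t: torch.Tensor):
    # Bring offloaded activations back just before backward needs them.
    return t.to("cuda", non_blocking=True) if t.device.type == "cpu" else t

stage = torch.nn.Sequential(
    torch.nn.Linear(4096, 4096), torch.nn.GELU(), torch.nn.Linear(4096, 4096)
).cuda()
x = torch.randn(8, 4096, device="cuda", requires_grad=True)

# Every tensor this stage saves for backward passes through the hooks.
with torch.autograd.graph.saved_tensors_hooks(pack_to_cpu, unpack_to_gpu):
    y = stage(x)
y.sum().backward()
```

PyTorch also ships torch.autograd.graph.save_on_cpu(pin_memory=True), which offloads everything unconditionally; the paper’s contribution lies in choosing which activations to offload and overlapping the transfers with pipeline compute.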

#performance

QwQ-32B: Embracing the Power of Reinforcement Learning

Scaling Reinforcement Learning (RL) has the potential to enhance model performance beyond conventional pretraining and post-training methods. Recent studies have demonstrated that RL can significantly improve the reasoning capabilities of models. For instance, DeepSeek R1 has achieved state-of-the-art performance by integrating cold-start data and multi-stage training, enabling deep thinking and complex reasoning.

Our research explores the scalability of Reinforcement Learning (RL) and its impact on enhancing the intelligence of large language models. We are excited to introduce QwQ-32B, a model with 32 billion parameters that achieves performance comparable to DeepSeek-R1, which boasts 671 billion parameters (with 37 billion activated). — Read More
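
For readers who want to try the release, here is a minimal inference sketch using Hugging Face transformers. The repo id "Qwen/QwQ-32B" and the chat-template flow are assumptions based on how Qwen models are typically published, not instructions from the announcement.

```python
# Minimal chat inference sketch; "Qwen/QwQ-32B" is the assumed Hub repo id.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/QwQ-32B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"  # device_map needs accelerate
)

messages = [{"role": "user", "content": "How many r's are in 'strawberry'?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Reasoning models emit long chains of thought, so leave generous headroom.
output = model.generate(input_ids, max_new_tokens=4096)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```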

#nlp

You knew it was coming: Google begins testing AI-only search results

Google has become so integral to online navigation that its name became a verb, meaning “to find things on the Internet.” Soon, Google might just tell you what’s on the Internet instead of showing you. The company has announced an expansion of its AI search features, powered by Gemini 2.0. Everyone will soon see more AI Overviews at the top of the results page, but Google is also testing a more substantial change in the form of AI Mode. This version of Google won’t show you the 10 blue links at all—Gemini completely takes over the results in AI Mode. — Read More

#big7

Eerily realistic AI voice demo sparks amazement and discomfort online

In late 2013, the Spike Jonze film Her imagined a future where people would form emotional connections with AI voice assistants. Nearly 12 years later, that fictional premise has veered closer to reality with the release of a new conversational voice model from AI startup Sesame that has left many users both fascinated and unnerved.

“I tried the demo, and it was genuinely startling how human it felt,” wrote one Hacker News user who tested the system. “I’m almost a bit worried I will start feeling emotionally attached to a voice assistant with this level of human-like sound.”

In late February, Sesame released a demo for the company’s new Conversational Speech Model (CSM) that appears to cross over what many consider the “uncanny valley” of AI-generated speech, with some testers reporting emotional connections to the male or female voice assistant (“Miles” and “Maya”). — Read More

#audio

Introducing GPT-4.5

We’re releasing a research preview of GPT‑4.5—our largest and best model for chat yet. GPT‑4.5 is a step forward in scaling up pre-training and post-training. By scaling unsupervised learning, GPT‑4.5 improves its ability to recognize patterns, draw connections, and generate creative insights without reasoning.

Early testing shows that interacting with GPT‑4.5 feels more natural. Its broader knowledge base, improved ability to follow user intent, and greater “EQ” make it useful for tasks like improving writing, programming, and solving practical problems. We also expect it to hallucinate less.

We’re sharing GPT‑4.5 as a research preview to better understand its strengths and limitations. We’re still exploring what it’s capable of and are eager to see how people use it in ways we might not have expected. — Read More
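
If the preview is exposed through the API, a minimal call via the OpenAI Python SDK might look like the sketch below; the model id "gpt-4.5-preview" is an assumption for illustration.

```python
# Hedged sketch of a chat call; "gpt-4.5-preview" is an assumed model id.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4.5-preview",
    messages=[{
        "role": "user",
        "content": "Rewrite more warmly: 'Your request was denied.'",
    }],
)
print(response.choices[0].message.content)
```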

#llm

Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution

Despite their groundbreaking performance for many generative modeling tasks, diffusion models have fallen short on discrete data domains such as natural language. Crucially, standard diffusion models rely on the well-established theory of score matching, but efforts to generalize this to discrete structures have not yielded the same empirical gains. In this work, we bridge this gap by proposing score entropy, a novel loss that naturally extends score matching to discrete spaces, integrates seamlessly to build discrete diffusion models, and significantly boosts performance. Experimentally, we test our Score Entropy Discrete Diffusion models (SEDD) on standard language modeling tasks. For comparable model sizes, SEDD beats existing language diffusion paradigms (reducing perplexity by 25–75%) and is competitive with autoregressive models, in particular outperforming GPT-2. Furthermore, compared to autoregressive models, SEDD generates faithful text without requiring distribution annealing techniques like temperature scaling (around 6–8× better generative perplexity than un-annealed GPT-2), can trade compute and quality (similar quality with 32× fewer network evaluations), and enables controllable infilling (matching nucleus sampling quality while enabling other strategies besides left-to-right prompting). — Read More
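
The loss at the heart of the paper is compact enough to illustrate directly. Below is a toy sketch (not the SEDD codebase) of the pointwise score entropy objective: the model emits positive ratio estimates s for neighboring sequences, and the loss is a Bregman-style divergence that is non-negative and vanishes exactly when s matches the true ratio r = p(y)/p(x). The fixed r tensor here is a made-up stand-in; in the denoising variant, the intractable true ratios are replaced by computable forward-transition ratios.

```python
# Toy illustration of the score entropy loss; the ratio targets r are
# made-up numbers, not values from any real diffusion process.
import torch

def score_entropy(s: torch.Tensor, r: torch.Tensor) -> torch.Tensor:
    """Pointwise loss s - r*log(s) + (r*log(r) - r): non-negative and
    zero exactly where s == r, so minimizing drives s toward r."""
    return s - r * torch.log(s) + torch.xlogy(r, r) - r

# Batch of 2 positions, 5 candidate neighbor tokens each.
r = torch.tensor([[0.1, 0.4, 2.0, 1.0, 0.5],
                  [3.0, 0.2, 0.2, 1.5, 0.1]])
logits = torch.zeros(2, 5, requires_grad=True)  # model head output
s = torch.exp(logits)                           # ratios must stay positive

loss = score_entropy(s, r).sum(dim=-1).mean()
loss.backward()
print(f"loss={loss.item():.4f}")  # gradient 1 - r/s flows into the logits
```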

#nlp