My LLM codegen workflow atm

I have been building so many small products using LLMs. It has been fun, and useful. However, there are pitfalls that can waste so much time. A while back a friend asked me how I was using LLMs to write software. I thought “oh boy. how much time do you have!” and thus this post.

I talk to many dev friends about this, and we all have a similar approach with various tweaks in either direction. — Read More

#devops

The hottest AI models, what they do, and how to use them

AI models are being cranked out at a dizzying pace, by everyone from Big Tech companies like Google to startups like OpenAI and Anthropic. Keeping track of the latest ones can be overwhelming. 

Adding to the confusion is that AI models are often promoted based on industry benchmarks. But these technical metrics often reveal little about how real people and companies actually use them. 

To cut through the noise, TechCrunch has compiled an overview of the most advanced AI models released since 2024, with details on how to use them and what they’re best for. — Read More

#strategy

LLM Pretraining with Continuous Concepts

Next token prediction has been the standard training objective used in large language model pretraining. Representations are learned as a result of optimizing for token-level perplexity. We propose Continuous Concept Mixing (CoCoMix), a novel pretraining framework that combines discrete next token prediction with continuous concepts. Specifically, CoCoMix predicts continuous concepts learned from a pretrained sparse autoencoder and mixes them into the model’s hidden state by interleaving with token hidden representations. Through experiments on multiple benchmarks, including language modeling and downstream reasoning tasks, we show that CoCoMix is more sample efficient and consistently outperforms standard next token prediction, knowledge distillation and inserting pause tokens. We find that combining both concept learning and interleaving in an end-to-end framework is critical to performance gains. Furthermore, CoCoMix enhances interpretability and steerability by allowing direct inspection and modification of the predicted concept, offering a transparent way to guide the model’s internal reasoning process. — Read More
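To make the interleaving idea above concrete, here is a minimal NumPy sketch of the mixing step: per-token concept activations are predicted from the hidden states, projected back into the hidden space, and interleaved with the token representations. All weights and names here are hypothetical stand-ins for illustration; the actual CoCoMix architecture, sparse autoencoder, and training objective are defined in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

T, d, n_concepts = 4, 8, 16  # tokens, hidden dim, number of SAE concepts

# Token hidden states from some transformer layer (random stand-in here).
H = rng.normal(size=(T, d))

# Hypothetical learned weights: a concept-prediction head, and a projection
# mapping predicted concept activations back into the hidden space.
W_concept = rng.normal(size=(d, n_concepts))  # hidden -> concept activations
W_mix = rng.normal(size=(n_concepts, d))      # concepts -> continuous-concept vector

# Predict continuous concept activations per token (sparse in the real model;
# ReLU is a stand-in nonlinearity).
concept_acts = np.maximum(H @ W_concept, 0.0)

# Form one continuous-concept vector per token and interleave it with the
# token hidden states, doubling the sequence length: [h_1, c_1, h_2, c_2, ...]
C = concept_acts @ W_mix
mixed = np.empty((2 * T, d))
mixed[0::2] = H
mixed[1::2] = C

print(mixed.shape)  # (8, 8): interleaved sequence passed to subsequent layers
```

The point of the sketch is only the data flow: concept prediction is a separate head whose output is re-injected between token positions, which is what lets the predicted concepts be inspected or edited directly for steering.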

#training

Will AI Take Your Job, and When?

With the release of Deep Research tools from the likes of OpenAI, Google, and, most recently, Perplexity (by far the cheapest option, and reasonably well tested against the others), concerns about job safety and displacement due to AI are growing.

But we know history repeats, so does history support these fears? And if so, what skills will be necessary to survive in an AI world?

We’ll discuss the impact timelines previous industrial revolutions had on the economy, examine the most recent research on adoption and productivity from one top university and one top AI lab, understand what it means to be an ‘AI Human,’ and finally give you the best mental model for analyzing whether AI will take your job. — Read More

#strategy

ChinAI #300: Artificial Challenged Intelligence [人工智障] in China’s most humble profession

About 150 ChinAI issues ago, in June 2021, I started seeing the phrase 人工智障 [which I translate as “artificial challenged intelligence”] pop up in Chinese media. Bloggers used the term to make fun of billboard displays intended to name-and-shame jaywalkers that ended up featuring faces from bus ads (ChinAI #144). Comic artists captured the frustrations of using a smart sweeping robot (ChinAI #165). This week’s feature translation (link to original NetEase DataBlog article) examines artificial challenged intelligence in the context of China’s customer service industry.

Key Takeaways: China has made a stark transition to AI customer service — a 17-fold growth in the market in the past seven years — but this has produced more customer dissatisfaction. — Read More

#china-ai

The Only AI Moat is Hardware

And Compute is the Upper Bound for Achievable Intelligence

I have lost count of how many times I have been asked about DeepSeek over the past week — specifically, whether it signals the obsolescence of high-performance AI compute or, by extension, the beginning of the end for NVIDIA.

The answer is “No.” — but if you still need more than one word, here is why. — Read More

#strategy

A ‘True Crime’ Documentary Series Has Millions of Views. The Murders Are All AI-Generated

Elizabeth Hernandez found out about the decade-old murder from a flurry of tips sent to her newsroom in August last year.

The tips were all reacting to a YouTube video with a shocking title: “Husband’s Secret Gay Love Affair with Step Son Ends in Grisly Murder.” It described a gruesome crime that apparently took place in Littleton, Colorado. Almost two million people had watched it.

“Some people in fact were saying, ‘Why didn’t The Denver Post cover this?’” Hernandez, a reporter at the paper, told me. “Because in the video, it makes it sound like it was a big news event and yet, when you Google it, there is no coverage.”

The reason for the lack of coverage was pretty clear to her. … The murder was fake, and the video was made using generative AI. — Read More

#fake

Brain implant that could boost mood by using ultrasound to go under NHS trial

A groundbreaking NHS trial will attempt to boost patients’ mood using a brain-computer-interface that directly alters brain activity using ultrasound.

The device, which is designed to be implanted beneath the skull but outside the brain, maps activity and delivers targeted pulses of ultrasound to “switch on” clusters of neurons. Its safety and tolerability will be tested on about 30 patients in the £6.5m trial, funded by the UK’s Advanced Research and Invention Agency (Aria).

In future, doctors hope the technology could revolutionise the treatment of conditions such as depression, addiction, OCD and epilepsy by rebalancing disrupted patterns of brain activity. — Read More

#human

Data Formulator: Create Rich Visualizations with AI

Data Formulator is an application from Microsoft Research that uses large language models to transform data, expediting the practice of data visualization.

Data Formulator is an AI-powered tool for analysts to iteratively create rich visualizations. Unlike most chat-based AI tools where users need to describe everything in natural language, Data Formulator combines user interface interactions (UI) and natural language (NL) inputs for easier interaction. This blended approach makes it easier for users to describe their chart designs while delegating data transformation to AI. — Read More

#devops

Competitive Programming with Large Reasoning Models

We show that reinforcement learning applied to large language models (LLMs) significantly boosts performance on complex coding and reasoning tasks. Additionally, we compare two general-purpose reasoning models – OpenAI o1 and an early checkpoint of o3 – with a domain-specific system, o1-ioi, which uses hand-engineered inference strategies designed for competing in the 2024 International Olympiad in Informatics (IOI). We competed live at IOI 2024 with o1-ioi and, using hand-crafted test-time strategies, placed in the 49th percentile. Under relaxed competition constraints, o1-ioi achieved a gold medal. However, when evaluating later models such as o3, we find that o3 achieves gold without hand-crafted domain-specific strategies or relaxed constraints. Our findings show that although specialized pipelines such as o1-ioi yield solid improvements, the scaled-up, general-purpose o3 model surpasses those results without relying on hand-crafted inference heuristics. Notably, o3 achieves a gold medal at the 2024 IOI and obtains a Codeforces rating on par with elite human competitors. Overall, these results indicate that scaling general-purpose reinforcement learning, rather than relying on domain-specific techniques, offers a robust path toward state-of-the-art AI in reasoning domains, such as competitive programming. — Read More

#devops