Voice Agent Engineering — Read More

#audio, #videos

Stumbling and Overheating, Most Humanoid Robots Fail to Finish Half Marathon in Beijing

About 12,000 human athletes ran a half marathon in Beijing on Saturday, but most of the attention was on a group of other, more unconventional participants: 21 humanoid robots. The event’s organizers, which included several branches of Beijing’s municipal government, said it was the first time humans and bipedal robots had run in the same race, though they jogged on separate tracks. Six of the robots successfully finished the course, but none could keep up with the pace of the human runners.

The fastest robot, Tiangong Ultra, developed by Chinese robotics company UBTech in collaboration with the Beijing Humanoid Robot Innovation Center, finished the race in two hours and 40 minutes after assistants changed its batteries three times and it fell down once. — Read More

#robotics

Inside OpenAI’s Controversial Plan to Abandon its Nonprofit Roots

Earlier this month, OpenAI announced that it aspires to build “the best-equipped nonprofit the world has ever seen” and was convening a commission to help determine how to use its “potentially historic financial resources.”

But critics view this new commission as a transparent attempt to placate opposition to its controversial plan to restructure fully as a for-profit — one that fails to address the fundamental legal issues at stake. — Read More

#strategy

The Second Half

tldr: We’re at AI’s halftime.

For decades, AI has largely been about developing new training methods and models. And it worked: from beating world champions at chess and Go, to surpassing most humans on the SAT and bar exams, to earning IMO and IOI gold medals. Behind these milestones in the history books — Deep Blue, AlphaGo, GPT-4, and the o-series — are fundamental innovations in AI methods: search, deep RL, scaling, and reasoning. Things just get better over time.

So what’s suddenly different now?

In three words: RL finally works. More precisely: RL finally generalizes. After several major detours and a culmination of milestones, we’ve landed on a working recipe to solve a wide range of RL tasks using language and reasoning. Even a year ago, if you told most AI researchers that a single recipe could tackle software engineering, creative writing, IMO-level math, mouse-and-keyboard manipulation, and long-form question answering — they’d laugh at your hallucinations. Each of these tasks is incredibly difficult and many researchers spend their entire PhDs focused on just one narrow slice. — Read More

#strategy

What “Shifting Left” Means and Why it Matters for Data Stacks

Moving Data Quality and Business Logic Upstream for More Efficient Data Systems

Shifting left is an interesting concept that’s gaining momentum in modern data engineering. SDF has been among those sharing this approach, even making “shifting left” one of their main slogans. As Elias DeFaria, SDF’s co-founder, describes it, shifting left means “improving data quality by moving closer toward the data source”.

However, the benefits extend beyond just data quality improvements. With dbt Labs’ recent acquisition of SDF, many are wondering: what does this mean for the shifting left movement, and more importantly, what exactly is shifting left in the data context?

In this article, we’ll explore the core principles behind shifting left, examine how code-first approaches have made moving logic upstream more efficient, and answer the questions: Why should data teams shift left? What elements need to be shifted? And how can your organization implement this approach to build more maintainable, efficient data systems? — Read More
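To make the idea concrete, here is a minimal sketch of shifting a data-quality check upstream: instead of letting malformed rows land in warehouse tables and surface later in a dashboard, the ingestion boundary validates and quarantines them at the source. The `Order` schema and the specific rules are hypothetical, chosen only to illustrate the pattern.

```python
from dataclasses import dataclass

# A hypothetical ingestion-time schema. Shifting left means enforcing it
# here, at the source, rather than in downstream models or dashboards.
@dataclass
class Order:
    order_id: str
    amount_cents: int

def ingest(raw_rows):
    """Validate each raw row at the boundary; bad rows never flow downstream."""
    clean, quarantined = [], []
    for row in raw_rows:
        try:
            amount = int(row["amount_cents"])
            if not row.get("order_id") or amount < 0:
                raise ValueError("missing id or negative amount")
            clean.append(Order(row["order_id"], amount))
        except (KeyError, ValueError, TypeError) as err:
            # Quarantine with a reason instead of silently passing it on.
            quarantined.append((row, str(err)))
    return clean, quarantined

clean, bad = ingest([
    {"order_id": "A1", "amount_cents": "1250"},
    {"order_id": "", "amount_cents": "300"},  # caught upstream, not in a report
])
print(len(clean), len(bad))  # → 1 1
```

The same contract could equally live in a dbt model test or an SDF check; the point is where it runs, not the tool.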

#devops

Model Context Protocol (MCP)

MCP is an open protocol that standardizes how applications provide context to LLMs. Think of MCP like a USB-C port for AI applications. Just as USB-C provides a standardized way to connect your devices to various peripherals and accessories, MCP provides a standardized way to connect AI models to different data sources and tools. — Read More
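Under the hood, that “USB-C port” is JSON-RPC 2.0: clients and servers exchange typed messages such as `tools/list`, which a client sends to discover what a server exposes. The sketch below shows the shape of that exchange; the `get_weather` tool and its schema are invented for illustration.

```python
import json

# A client asking an MCP server which tools it exposes sends a JSON-RPC
# "tools/list" request...
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list",
}

# ...and the server replies with tool metadata: a name, a description,
# and a JSON Schema describing the tool's inputs. (Hypothetical tool.)
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [
            {
                "name": "get_weather",
                "description": "Look up current weather for a city",
                "inputSchema": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            }
        ]
    },
}

# On the wire these travel as JSON over a transport such as stdio or HTTP.
print(json.dumps(request))
```

Because the message shapes are standardized, any MCP-aware application can connect to any MCP server without bespoke glue code — which is the whole point of the protocol.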

#devops

How to Build an Agent

It’s not that hard to build a fully functioning, code-editing agent.

It seems like it would be. When you look at an agent editing files, running commands, wriggling itself out of errors, retrying different strategies – it seems like there has to be a secret behind it.

There isn’t. It’s an LLM, a loop, and enough tokens. It’s what we’ve been saying on the podcast from the start. The rest, the stuff that makes Amp so addictive and impressive? Elbow grease.

But building a small yet highly impressive agent doesn’t even require that. You can do it in less than 400 lines of code, most of which is boilerplate.

I’m going to show you how, right now. We’re going to write some code together and go from zero lines of code to “oh wow, this is… a game changer.” — Read More
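The “LLM, a loop, and enough tokens” claim can be sketched in a few lines. This is not the article’s actual code: the message format, the `tool_call` convention, and the stubbed model below are assumptions standing in for a real chat API, kept minimal to show the loop itself.

```python
import json

def run_agent(llm, tools, user_message, max_turns=10):
    """Minimal agent loop: call the model, execute any tool it requests,
    feed the result back, and repeat until it answers in plain text."""
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_turns):
        reply = llm(messages)            # the model sees the full history
        messages.append(reply)
        if "tool_call" not in reply:     # plain answer: we're done
            return reply["content"]
        call = reply["tool_call"]
        result = tools[call["name"]](**call["args"])  # run the requested tool
        messages.append({"role": "tool", "content": json.dumps(result)})
    return "(gave up after max_turns)"

# A stub "LLM" that requests one file read, then answers. A real agent
# would swap this for an actual model call with tool definitions.
def fake_llm(messages):
    if not any(m["role"] == "tool" for m in messages):
        return {"role": "assistant", "content": "",
                "tool_call": {"name": "read_file", "args": {"path": "notes.txt"}}}
    return {"role": "assistant", "content": "done"}

tools = {"read_file": lambda path: {"path": path, "text": "hello"}}
print(run_agent(fake_llm, tools, "summarize notes.txt"))  # → done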

#devops

NVIDIA to Manufacture American-Made AI Supercomputers in US for First Time

NVIDIA is working with its manufacturing partners to design and build factories that, for the first time, will produce NVIDIA AI supercomputers entirely in the U.S.

Together with leading manufacturing partners, the company has commissioned more than a million square feet of manufacturing space to build and test NVIDIA Blackwell chips in Arizona and AI supercomputers in Texas. — Read More

#nvidia

DolphinGemma: How Google AI is helping decode dolphin communication

For decades, understanding the clicks, whistles and burst pulses of dolphins has been a scientific frontier. What if we could not only listen to dolphins, but also understand the patterns of their complex communication well enough to generate realistic responses?

Today, on National Dolphin Day, Google, in collaboration with researchers at Georgia Tech and the field research of the Wild Dolphin Project (WDP), is announcing progress on DolphinGemma: a foundational AI model trained to learn the structure of dolphin vocalizations and generate novel dolphin-like sound sequences. This approach in the quest for interspecies communication pushes the boundaries of AI and our potential connection with the marine world. — Read More

#big7

OpenAI debuts its GPT-4.1 flagship AI model

OpenAI has introduced GPT-4.1, a successor to the GPT-4o multimodal AI model launched by the company last year. During a livestream on Monday, OpenAI said GPT-4.1 has an even larger context window and is better than GPT-4o in “just about every dimension,” with big improvements to coding and instruction following.

GPT-4.1 is now available to developers, along with two smaller model versions. That includes GPT-4.1 Mini, which, like its predecessor, is more affordable for developers to tinker with, and GPT-4.1 Nano, an even more lightweight model that OpenAI says is its “smallest, fastest, and cheapest” one yet. — Read More

#nlp