The latest generation of Gemini models, 2.5 Pro and Flash, is unlocking new frontiers in robotics. Their advanced coding, reasoning, and multimodal capabilities, now combined with spatial understanding, provide the foundation for the next generation of interactive and intelligent robots.
This post explores how developers can leverage Gemini 2.5 to build sophisticated robotics applications. — Read More
Tag Archives: Robotics
Real-Time Action Chunking with Large Models
Unlike chatbots or image generators, robots must operate in real time. While a robot is “thinking”, the world around it evolves according to physical laws, so delays between inputs and outputs have a tangible impact on performance. For a language model, the difference between fast and slow generation is a satisfied or annoyed user; for a vision-language-action model (VLA), it could be the difference between a robot handing you a hot coffee or spilling it in your lap. While VLAs have achieved promising results in open-world generalization, they can be slow to run. Like their cousins in language and vision, these models have billions of parameters and require heavy-duty GPUs. On edge devices like mobile robots, that adds even more latency for network communication between a centralized inference server and the robot. — Read More
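To make the latency argument concrete, here is a minimal sketch of the arithmetic, not the method from the linked post. It assumes a controller that executes one action per control tick and a model that returns a fixed-length chunk of future actions. The function names, the 50 Hz control rate, the 300 ms delay, and the chunk length of 50 are all hypothetical values chosen for illustration.

```python
def ticks_elapsed(latency_ms: int, control_hz: int) -> int:
    """Control ticks that pass while one inference call is in flight.

    Uses integer ceiling division so a partial tick counts as a full one.
    """
    return -(-latency_ms * control_hz // 1000)


def usable_actions(chunk_len: int, latency_ms: int, control_hz: int) -> int:
    """Actions in a predicted chunk that still lie in the future
    by the time the chunk arrives (the rest are already stale)."""
    return max(0, chunk_len - ticks_elapsed(latency_ms, control_hz))


if __name__ == "__main__":
    # Hypothetical numbers: 50 Hz control loop, 300 ms total delay
    # (model inference plus network round trip), 50-action chunks.
    stale = ticks_elapsed(300, 50)
    fresh = usable_actions(50, 300, 50)
    print(f"stale actions: {stale}, usable actions: {fresh}")
```

Under these assumed numbers, 15 of the 50 predicted actions are already out of date when the chunk arrives. If the delay grows or the chunk shrinks, the usable portion can drop to zero, which is why the robot cannot simply wait for each inference call to finish before acting.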
Boston Dynamics Makes AGT HISTORY With Robots Dancing To “Don’t Stop Me Now” by Queen
Meta’s V-JEPA 2 model teaches AI to understand its surroundings
Meta on Wednesday unveiled its new V-JEPA 2 AI model, a “world model” that is designed to help AI agents understand the world around them.
V-JEPA 2 is an extension of the V-JEPA model that Meta released last year, which was trained on over 1 million hours of video. This training data is supposed to help robots or other AI agents operate in the physical world, understanding and predicting how concepts like gravity will impact what happens next in a sequence.
These are the kinds of common sense connections that small children and animals make as their brains develop. — Read More
The Shape of Things to Come
Amazon ‘testing humanoid robots to deliver packages’: Amazon is reportedly developing software for humanoid robots that could perform the role of delivery workers and “spring out” of its vans.
… The Information reported that the robots could eventually take the jobs of delivery workers. Amazon is developing the artificial intelligence software that would power the robots but will use hardware developed by other companies. — Read More
Walmart and Wing expand drone delivery to five more US cities: Wing, the on-demand drone delivery company owned by Alphabet, is spreading its commercial wings with help from Walmart.
The two companies announced Thursday plans to roll out drone delivery to more than 100 Walmart stores in five new cities: Atlanta, Charlotte, Houston, Orlando, and Tampa. Walmart is also adding Wing drone deliveries to its existing market in the Dallas-Fort Worth area. — Read More
Stumbling and Overheating, Most Humanoid Robots Fail to Finish Half Marathon in Beijing
About 12,000 human athletes ran in a half marathon race in Beijing on Saturday, but most of the attention was on a group of other, more unconventional participants: 21 humanoid robots. The event’s organizers, which included several branches of Beijing’s municipal government, claim it’s the first time humans and bipedal robots have run in the same race, though they jogged on separate tracks. Six of the robots successfully finished the course, but they were unable to keep up with the speed of the humans.
The fastest robot, Tiangong Ultra, developed by Chinese robotics company UBTech in collaboration with the Beijing Humanoid Robot Innovation Center, finished the race in two hours and 40 minutes after assistants changed its batteries three times and it fell down once. — Read More
Samsung’s cute Ballie robot arrives this summer with Google Gemini in tow
Samsung’s Ballie will go on sale in the US and South Korea this summer, the company announced today. What’s more, through a partnership with Google Cloud, the diminutive robot will ship with a Gemini AI model.
Samsung didn’t state the specific system that powers Ballie, but in combination with the company’s own proprietary language models, it says the robot has multimodal capabilities, meaning Ballie can process voice, audio and visual data from its sensors. According to Samsung, Ballie can also manage your smart home devices and even offer health and styling recommendations, if you’re inclined to seek that type of advice from a robot. — Read More
Accelerate Generalist Humanoid Robot Development with NVIDIA Isaac GR00T N1
Humanoid robots are designed to adapt to human workspaces, tackling repetitive or demanding tasks. However, creating general-purpose humanoid robots for real-world tasks and unpredictable environments is challenging. Each of these tasks often requires a dedicated AI model. Training these models from scratch for every new task and environment is a laborious process due to the need for vast task-specific data, high computational cost, and limited generalization.
NVIDIA Isaac GR00T helps tackle these challenges and accelerates general-purpose humanoid robot development by providing you with open-source SimReady data, simulation frameworks such as NVIDIA Isaac Sim and Isaac Lab, synthetic data blueprints, and pretrained foundation models. — Read More
China to host world’s first human-robot marathon as robotics drives national goals
For the first time, dozens of humanoid robots are expected to join a half-marathon to be held in the capital’s Daxing district in April, according to local authorities.
This comes as China ramps up efforts to develop artificial intelligence and robotics, to gain an edge in the tech rivalry with the US as well as combat the challenges of an ageing society and a falling birth rate.
Some 12,000 humans will take part in the coming race – and running alongside them on the 21km (13-mile) route will be robots from more than 20 companies, according to the administrative body of Beijing Economic-Technological Development Area, or E-Town.
Prizes will be offered for the top three runners. — Read More
It’s Surprisingly Easy to Jailbreak LLM-Driven Robots
AI chatbots such as ChatGPT and other applications powered by large language models (LLMs) have exploded in popularity, leading a number of companies to explore LLM-driven robots. However, a new study now reveals an automated way to hack into such machines with 100 percent success. By circumventing safety guardrails, researchers could manipulate self-driving systems into colliding with pedestrians and robot dogs into hunting for harmful places to detonate bombs.
Essentially, LLMs are supercharged versions of the autocomplete feature that smartphones use to predict the rest of a word that a person is typing. LLMs trained to analyze text, images, and audio can make personalized travel recommendations, devise recipes from a picture of a refrigerator’s contents, and help generate websites.
The extraordinary ability of LLMs to process text has spurred a number of companies to use the AI systems to help control robots through voice commands, translating prompts from users into code the robots can run. For instance, Boston Dynamics’ robot dog Spot, now integrated with OpenAI’s ChatGPT, can act as a tour guide. Figure’s humanoid robots and Unitree’s Go2 robot dog are similarly equipped with ChatGPT.
However, a group of scientists has recently identified a host of security vulnerabilities for LLMs. So-called jailbreaking attacks discover ways to develop prompts that can bypass LLM safeguards and fool the AI systems into generating unwanted content, such as instructions for building bombs, recipes for synthesizing illegal drugs, and guides for defrauding charities. — Read More