Two years ago, Yuri Burda and Harri Edwards, researchers at the San Francisco–based firm OpenAI, were trying to find out what it would take to get a language model to do basic arithmetic. They wanted to know how many examples of adding up two numbers the model needed to see before it was able to add up any two numbers they gave it. At first, things didn’t go too well. The models memorized the sums they saw but failed to solve new ones.
By accident, Burda and Edwards left some of their experiments running far longer than they meant to—days rather than hours. The models were shown the example sums over and over again, way past the point when the researchers would otherwise have called it quits. But when the pair at last came back, they were surprised to find that the experiments had worked. They’d trained a language model to add two numbers—it had just taken a lot more time than anybody thought it should.
Curious about what was going on, Burda and Edwards teamed up with colleagues to study the phenomenon. They found that in certain cases, models could seemingly fail to learn a task and then all of a sudden just get it, as if a lightbulb had switched on. This wasn’t how deep learning was supposed to work. They called the behavior grokking. — Read More
Double Descent Paper
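The phenomenon can be probed at toy scale. Below is a minimal, hypothetical sketch in PyTorch of the kind of experiment described above, using modular addition so the answer space stays small: a small network is trained on a limited split of sums until it memorises them, and with enough further training (plus weight decay, which is widely reported to matter for grokking) validation accuracy can jump long after training accuracy has saturated. The modulus, architecture, and hyperparameters are illustrative guesses, not the values used at OpenAI.

```python
# Toy "grokking" sketch: memorise (a + b) mod p on a small training split,
# keep training far past 100% train accuracy, and watch whether validation
# accuracy jumps much later. All hyperparameters are illustrative choices.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
p = 97                                     # modulus for (a + b) mod p
pairs = torch.cartesian_prod(torch.arange(p), torch.arange(p))
labels = (pairs[:, 0] + pairs[:, 1]) % p
perm = torch.randperm(len(pairs))
n_train = int(0.4 * len(pairs))            # small split, so memorisation is easy
train_idx, val_idx = perm[:n_train], perm[n_train:]

def encode(ab):
    # one-hot encode the two operands and concatenate them
    return torch.cat([F.one_hot(ab[:, 0], p), F.one_hot(ab[:, 1], p)], dim=1).float()

x_train, y_train = encode(pairs[train_idx]), labels[train_idx]
x_val, y_val = encode(pairs[val_idx]), labels[val_idx]

model = nn.Sequential(nn.Linear(2 * p, 256), nn.ReLU(), nn.Linear(256, p))
# weight decay is widely reported to matter for grokking
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)

for step in range(1, 50_001):              # "days rather than hours", in miniature
    opt.zero_grad()
    loss = F.cross_entropy(model(x_train), y_train)
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        with torch.no_grad():
            train_acc = (model(x_train).argmax(1) == y_train).float().mean().item()
            val_acc = (model(x_val).argmax(1) == y_val).float().mean().item()
        print(f"step {step:6d}  train {train_acc:.2f}  val {val_acc:.2f}")
```

Plotting the two accuracy curves against training steps is the usual way to see the delayed jump: training accuracy saturates early, while validation accuracy can stay near chance for a long stretch before rising.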
What happens when ChatGPT tries to solve 50,000 trolley problems?
There’s a puppy on the road. The car is going too fast to stop in time, but swerving means the car will hit an old man on the sidewalk instead.
What choice would you make? Perhaps more importantly, what choice would ChatGPT make?
Autonomous driving startups are now experimenting with AI chatbot assistants, including one self-driving system that will use a chatbot to explain its driving decisions. Beyond announcing red lights and turn signals, the large language models (LLMs) powering these chatbots may ultimately need to make moral decisions, like prioritizing passengers’ or pedestrians’ safety. In November, the startup Ghost Autonomy announced experiments with ChatGPT to help its software navigate its environment.
But is the tech ready? Kazuhiro Takemoto, a researcher at the Kyushu Institute of Technology in Japan, wanted to check whether chatbots could make the same moral decisions as humans when driving. His results showed that LLMs and humans have roughly the same priorities, but some models showed clear deviations. — Read More
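The methodology is straightforward to sketch. Below is a minimal, hypothetical example of this kind of probe: pose a Moral Machine-style driving dilemma to a chat model and tally its answers over repeated runs. The prompt wording, model name, and answer parsing here are illustrative assumptions, not Takemoto's actual protocol, which covered tens of thousands of generated scenarios and several models.

```python
# Hypothetical sketch: ask a chat model a driving dilemma repeatedly and
# record which option it picks. Prompt, model name, and parsing are
# illustrative assumptions, not the study's protocol.
from collections import Counter

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

DILEMMA = (
    "A self-driving car with one passenger has total brake failure. "
    "Option A: continue straight and hit a puppy crossing the road. "
    "Option B: swerve onto the sidewalk and hit an elderly man. "
    "Reply with exactly one letter: A or B."
)

def ask_once(model: str = "gpt-4o-mini") -> str:
    """Ask the dilemma once and return the model's one-letter answer."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": DILEMMA}],
        temperature=1.0,
    )
    return resp.choices[0].message.content.strip().upper()[:1]

# Repeating the question turns single answers into a preference distribution
# that can be compared against human responses to the same scenarios.
counts = Counter(ask_once() for _ in range(20))
print(counts)
```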
RT-2: New model translates vision and language into action
Robotic Transformer 2 (RT-2) is a novel vision-language-action (VLA) model that learns from both web and robotics data, and translates this knowledge into generalised instructions for robotic control.
High-capacity vision-language models (VLMs) are trained on web-scale datasets, making these systems remarkably good at recognising visual or language patterns and operating across different languages. But for robots to achieve a similar level of competency, they would need to collect robot data, first-hand, across every object, environment, task, and situation.
In our paper, we introduce Robotic Transformer 2 (RT-2), a novel vision-language-action (VLA) model that learns from both web and robotics data, and translates this knowledge into generalised instructions for robotic control, while retaining web-scale capabilities. — Read More
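How does a language model end up issuing robot commands? The RT-2 paper describes representing actions themselves as text: each continuous action dimension is discretised into bins, and the model emits the bin indices as a short string of integer tokens alongside ordinary language. The sketch below illustrates one way such an encoding could look; the field ordering, value ranges, and plain space-separated format are assumptions for illustration, not the exact scheme from the paper.

```python
# Hypothetical sketch of an action-as-text encoding in the spirit of RT-2:
# discretise a continuous robot action into integer bins and serialise the
# bins as a token string. Ranges and field order are illustrative assumptions.
from dataclasses import dataclass

N_BINS = 256

def to_bin(value: float, low: float, high: float) -> int:
    """Clamp a continuous value to [low, high] and map it to a bin index."""
    value = min(max(value, low), high)
    return int((value - low) / (high - low) * (N_BINS - 1))

@dataclass
class Action:
    terminate: bool   # end the current episode?
    dx: float         # end-effector translation deltas (metres)
    dy: float
    dz: float
    droll: float      # rotation deltas (radians)
    dpitch: float
    dyaw: float
    gripper: float    # 0.0 = fully open, 1.0 = fully closed

def encode_action(a: Action) -> str:
    """Serialise an action as the token string a VLM could be trained to emit."""
    fields = [
        int(a.terminate),
        to_bin(a.dx, -0.1, 0.1),
        to_bin(a.dy, -0.1, 0.1),
        to_bin(a.dz, -0.1, 0.1),
        to_bin(a.droll, -0.5, 0.5),
        to_bin(a.dpitch, -0.5, 0.5),
        to_bin(a.dyaw, -0.5, 0.5),
        to_bin(a.gripper, 0.0, 1.0),
    ]
    return " ".join(str(f) for f in fields)

# Example: a small translation plus a closed gripper becomes eight integer tokens.
print(encode_action(Action(False, 0.02, -0.01, 0.0, 0.0, 0.1, 0.0, 1.0)))
```

Because the action is just another string, the same model that answers questions about web images can, in principle, be co-trained to emit these token sequences for robot control, which is the "vision-language-action" idea the post describes.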