Transformer models, notably large language models (LLMs), have the remarkable ability to perform in-context learning (ICL) — to perform new tasks when prompted with unseen input-output examples without any explicit model training. In this work, we study how effectively transformers can bridge between their pretraining data mixture, comprised of multiple distinct task families, to identify and learn new tasks in-context which are both inside and outside the pretraining distribution. Building on previous work, we investigate this question in a controlled setting, where we study transformer models trained on sequences of (x,f(x)) pairs rather than natural language. Our empirical results show transformers demonstrate near-optimal unsupervised model selection capabilities, in their ability to first in-context identify different task families and in-context learn within them when the task families are well-represented in their pretraining data. However when presented with tasks or functions which are out-of-domain of their pretraining data, we demonstrate various failure modes of transformers and degradation of their generalization for even simple extrapolation tasks. Together our results highlight that the impressive ICL abilities of high-capacity sequence models may be more closely tied to the coverage of their pretraining data mixtures than inductive biases that create fundamental generalization capabilities. — Read More
Tag Archives: Human
Stephen Woldram: How to Think Computationally about AI, the Universe and Everything
Human language. Mathematics. Logic. These are all ways to formalize the world. And in our century there’s a new and yet more powerful one: computation.
And for nearly 50 years I’ve had the great privilege of building an ever taller tower of science and technology based on that idea of computation. And today I want to tell you some of what that’s led to.
There’s a lot to talk about—so I’m going to go quickly… sometimes with just a sentence summarizing what I’ve written a whole book about. — Read More
Minds of machines: The great AI consciousness conundrum
David Chalmers was not expecting the invitation he received in September of last year. As a leading authority on consciousness, Chalmers regularly circles the world delivering talks at universities and academic meetings to rapt audiences of philosophers—the sort of people who might spend hours debating whether the world outside their own heads is real and then go blithely about the rest of their day. This latest request, though, came from a surprising source: the organizers of the Conference on Neural Information Processing Systems (NeurIPS), a yearly gathering of the brightest minds in artificial intelligence.
Less than six months before the conference, an engineer named Blake Lemoine, then at Google, had gone public with his contention that LaMDA, one of the company’s AI systems, had achieved consciousness. Lemoine’s claims were quickly dismissed in the press, and he was summarily fired, but the genie would not return to the bottle quite so easily—especially after the release of ChatGPT in November 2022. Suddenly it was possible for anyone to carry on a sophisticated conversation with a polite, creative artificial agent.
Chalmers was an eminently sensible choice to speak about AI consciousness. He’d earned his PhD in philosophy at an Indiana University AI lab, where he and his computer scientist colleagues spent their breaks debating whether machines might one day have minds. In his 1996 book, The Conscious Mind, he spent an entire chapter arguing that artificial consciousness was possible. — Read More
‘Mind-blowing’ IBM chip speeds up AI
IBM’s NorthPole processor sidesteps need to access external memory, boosting computing power and saving energy.
A brain-inspired computer chip that could supercharge artificial intelligence (AI) by working faster with much less power has been developed by researchers at IBM in San Jose, California. Their massive NorthPole processor chip eliminates the need to frequently access external memory, and so performs tasks such as image recognition faster than existing architectures do — while consuming vastly less power.
“Its energy efficiency is just mind-blowing,” says Damien Querlioz, a nanoelectronics researcher at the University of Paris-Saclay in Palaiseau. The work, published in Science1, shows that computing and memory can be integrated on a large scale, he says. “I feel the paper will shake the common thinking in computer architecture.” — Read More
A new chip architecture points to faster, more energy-efficient AI
We’re in the midst of a Cambrian explosion in AI. Over the last decade, AI has gone from theory and small tests to enterprise-scale use cases. But the hardware used to run AI systems, although increasingly powerful, was not designed with today’s AI in mind. As AI systems scale, the costs skyrocket. And Moore’s Law, the theory that the density of circuits in processors would double each year, has slowed.
But new research out of IBM Research’s lab in Almaden, California, nearly two decades in the making, has the potential to drastically shift how we can efficiently scale up powerful AI hardware systems. — Read More
Read the Paper
Towards a Real-Time Decoding of Images from Brain Activity
At every moment of every day, our brains meticulously sculpt a wealth of sensory signals into meaningful representations of the world around us. Yet how this continuous process actually works remains poorly understood.
Today, Meta is announcing an important milestone in the pursuit of that fundamental question. Using magnetoencephalography (MEG), a non-invasive neuroimaging technique in which thousands of brain activity measurements are taken per second, we showcase an AI system capable of decoding the unfolding of visual representations in the brain with an unprecedented temporal resolution.
This AI system can be deployed in real time to reconstruct, from brain activity, the images perceived and processed by the brain at each instant. — Read More
Read the Paper
Minds of machines: The great AI consciousness conundrum
David Chalmers was not expecting the invitation he received in September of last year. As a leading authority on consciousness, Chalmers regularly circles the world delivering talks at universities and academic meetings to rapt audiences of philosophers—the sort of people who might spend hours debating whether the world outside their own heads is real and then go blithely about the rest of their day. This latest request, though, came from a surprising source: the organizers of the Conference on Neural Information Processing Systems (NeurIPS), a yearly gathering of the brightest minds in artificial intelligence.
… Chalmers was an eminently sensible choice to speak about AI consciousness. He’d earned his PhD in philosophy at an Indiana University AI lab, where he and his computer scientist colleagues spent their breaks debating whether machines might one day have minds. In his 1996 book, The Conscious Mind, he spent an entire chapter arguing that artificial consciousness was possible.
If he had been able to interact with systems like LaMDA and ChatGPT back in the ’90s, before anyone knew how such a thing might work, he would have thought there was a good chance they were conscious, Chalmers says. But when he stood before a crowd of NeurIPS attendees in a cavernous New Orleans convention hall, clad in his trademark leather jacket, he offered a different assessment. Yes, large language models—systems that have been trained on enormous corpora of text in order to mimic human writing as accurately as possible—are impressive. But, he said, they lack too many of the potential requisites for consciousness for us to believe that they actually experience the world. — Read More
This is the largest map of the human brain ever made
Researchers have created the largest atlas of human brain cells so far, revealing more than 3,000 cell types — many of which are new to science. The work, published in a package of 21 papers today in Science, Science Advances and Science Translational Medicine, will aid the study of diseases, cognition and what makes us human, among other things, say the authors.
The enormous cell atlas offers a detailed snapshot of the most complex known organ. “It’s highly significant,” says Anthony Hannan, a neuroscientist at the Florey Institute of Neuroscience and Mental Health in Melbourne, Australia. Researchers have previously mapped the human brain using techniques such as magnetic resonance imaging, but this is the first atlas of the whole human brain at the single-cell level, showing its intricate molecular interactions, adds Hannan. “These types of atlases really are laying the groundwork for a much better understanding of the human brain.” — Read More
Auto-Regressive Next-Token Predictors are Universal Learners
Large language models display remarkable capabilities in logical and mathematical reasoning, allowing them to solve complex tasks. Interestingly, these abilities emerge in networks trained on the simple task of next-token prediction. In this work, we present a theoretical framework for studying auto-regressive next-token predictors. We demonstrate that even simple models such as linear next-token predictors, trained on Chain-of-Thought (CoT) data, can approximate any function efficiently computed by a Turing machine. We introduce a new complexity measure — length complexity — which measures the number of intermediate tokens in a CoT sequence required to approximate some target function, and analyze the interplay between length complexity and other notions of complexity. Finally, we show experimentally that simple next-token predictors, such as linear networks and shallow Multi-Layer Perceptrons (MLPs), display non-trivial performance on text generation and arithmetic tasks. Our results demonstrate that the power of language models can be attributed, to a great extent, to the auto-regressive next-token training scheme, and not necessarily to a particular choice of architecture. — Read More
A Lab Just 3D-Printed a Neural Network of Living Brain Cells
YOU CAN 3D-PRINT nearly anything: rockets, mouse ovaries, and for some reason, lamps made of orange peels. Now, scientists at Monash University in Melbourne, Australia, have printed living neural networks composed of rat brain cells that seem to mature and communicate like real brains do.
Researchers want to create mini-brains partly because they could someday offer a viable alternative to animal testing in drug trials and studies of basic brain function. …3D-printing is just one entry in the race to build a better mini-brain. …With 3D-printing, researchers can culture cells in specific patterns on top of recording electrodes, granting them a degree of experimental control normally reserved for flat cell cultures. But because the structure is soft enough to allow cells to migrate and reorganize themselves in 3D space, it gains some of the advantages of the organoid approach, more closely mimicking the structure of normal tissue. — Read More