A high-performance speech neuroprosthesis

Speech brain–computer interfaces (BCIs) have the potential to restore rapid communication to people with paralysis by decoding neural activity evoked by attempted speech into text[1,2] or sound[3,4]. Early demonstrations, although promising, have not yet achieved accuracies sufficiently high for communication of unconstrained sentences from a large vocabulary[1,2,3,4,5,6,7]. Here we demonstrate a speech-to-text BCI that records spiking activity from intracortical microelectrode arrays. Enabled by these high-resolution recordings, our study participant—who can no longer speak intelligibly owing to amyotrophic lateral sclerosis—achieved a 9.1% word error rate on a 50-word vocabulary (2.7 times fewer errors than the previous state-of-the-art speech BCI[2]) and a 23.8% word error rate on a 125,000-word vocabulary (the first successful demonstration, to our knowledge, of large-vocabulary decoding). Our participant’s attempted speech was decoded at 62 words per minute, which is 3.4 times as fast as the previous record[8] and begins to approach the speed of natural conversation (160 words per minute[9]). Finally, we highlight two aspects of the neural code for speech that are encouraging for speech BCIs: spatially intermixed tuning to speech articulators that makes accurate decoding possible from only a small region of cortex, and a detailed articulatory representation of phonemes that persists years after paralysis. These results show a feasible path forward for restoring rapid communication to people with paralysis who can no longer speak. — Read More
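
The error rates quoted above are standard word error rates (WER). As a minimal illustration of the metric only (the abstract does not specify the scoring script, so this is an assumption about the conventional definition), WER is the word-level edit distance between the decoded and reference sentences, divided by the reference length:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length.

    Illustrative sketch only; not the evaluation code used in the study.
    """
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

# One substitution in a ten-word reference -> WER = 0.10
print(word_error_rate("the quick brown fox jumps over the lazy dog today",
                      "the quick brown fox jumps over a lazy dog today"))
```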

#human

Largest genetic study of brain structure identifies how the brain is organised

The largest ever study of the genetics of the brain – encompassing some 36,000 brain scans – has identified more than 4,000 genetic variants linked to brain structure. The results of the study, led by researchers at the University of Cambridge, are published in Nature Genetics today.

Our brains are very complex organs, with huge variety between individuals in terms of the overall volume of the brain, how it is folded and how thick these folds are. Little is known about how our genetic make-up shapes the development of the brain.

… [F]indings have allowed researchers to confirm and, in some cases, identify how different properties of the brain are genetically linked to each other. — Read More

#human

Does AI Understand the World?

Do large language models understand the world? As a scientist and engineer, I’ve avoided asking whether an AI system “understands” anything. There’s no widely agreed-upon scientific test for whether a system really understands — as opposed to appearing to understand — just as no such tests exist for consciousness or sentience, as I discussed in an earlier letter. This makes the question of understanding a matter of philosophy rather than science. But with this caveat, I believe that LLMs build sufficiently complex models of the world that I feel comfortable saying that, to some extent, they do understand the world. — Read More

Conversation with Geoff Hinton

#human

Reconstructing the Mind’s Eye: fMRI-to-image with Contrastive Learning and Diffusion Priors

#human

Is Consciousness Real? 

Read More

#human, #videos

I Wore the Future With a Brain-Connected AR-VR Headset

The next frontier might be neurotech: OpenBCI’s Galea headset, along with advances in assistive controls, points to a wild, wearable road ahead.

A few weeks ago, I saw the best-quality mixed reality headset with an interface controlled using my fingers and eyes: Apple’s Vision Pro. But a few months before its announcement, I saw something perhaps even wilder. Clips on my ears, a crown of rubbery-tipped sensors nestled into my hair and a face mask lowered in front of my eyes. Suddenly I was looking at my own brain waves in VR and moving things around with only tiny movements of my facial muscles. I was test-driving OpenBCI’s Galea.

The future of VR and AR is advancing steadily, but inputs remain a challenge. For now, it’s a territory moving from physical controllers to hand- and eye-tracking. But there are deeper possibilities beyond that, and they’re neural. — Read More

#human

Tree of Thoughts: Deliberate Problem Solving with Large Language Models

Language models are increasingly being deployed for general problem solving across a wide range of tasks, but are still confined to token-level, left-to-right decision-making processes during inference. This means they can fall short in tasks that require exploration, strategic lookahead, or where initial decisions play a pivotal role. To surmount these challenges, we introduce a new framework for language model inference, Tree of Thoughts (ToT), which generalizes over the popular Chain of Thought approach to prompting language models, and enables exploration over coherent units of text (thoughts) that serve as intermediate steps toward problem solving. ToT allows LMs to perform deliberate decision-making by considering multiple different reasoning paths and self-evaluating choices to decide the next course of action, as well as looking ahead or backtracking when necessary to make global choices. Our experiments show that ToT significantly enhances language models’ problem-solving abilities on three novel tasks requiring non-trivial planning or search: Game of 24, Creative Writing, and Mini Crosswords. For instance, in Game of 24, while GPT-4 with chain-of-thought prompting only solved 4% of tasks, our method achieved a success rate of 74%. Code repo with all prompts: this https URL. — Read More
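
As a rough illustration of the search pattern the abstract describes (not the paper’s prompts or code), a breadth-first ToT loop generates candidate thoughts, scores each partial solution, keeps the best few, and expands again. The `generate_thoughts` and `evaluate_thought` callables below are hypothetical stand-ins for LLM calls:

```python
from typing import Callable, List, Tuple

def tree_of_thoughts_bfs(
    problem: str,
    generate_thoughts: Callable[[str, str], List[str]],  # (problem, partial solution) -> candidate next steps
    evaluate_thought: Callable[[str, str], float],        # (problem, partial solution) -> score in [0, 1]
    breadth: int = 5,   # partial solutions kept per level
    depth: int = 3,     # reasoning steps to take
) -> str:
    """Breadth-first Tree-of-Thoughts sketch: expand, score, prune, repeat."""
    frontier: List[Tuple[float, str]] = [(0.0, "")]  # (score, partial solution)
    for _ in range(depth):
        candidates: List[Tuple[float, str]] = []
        for _, partial in frontier:
            for thought in generate_thoughts(problem, partial):
                extended = (partial + "\n" + thought).strip()
                candidates.append((evaluate_thought(problem, extended), extended))
        # keep only the `breadth` highest-scoring partial solutions
        frontier = sorted(candidates, key=lambda c: c[0], reverse=True)[:breadth]
    return frontier[0][1] if frontier else ""
```

The key difference from chain-of-thought prompting is visible in the structure: instead of committing to one left-to-right continuation, several partial solutions are kept alive, evaluated, and pruned at each step.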

#human, #nlp

High-resolution image reconstruction with latent diffusion models from human brain activity

Reconstructing visual experiences from human brain activity offers a unique way to understand how the brain represents the world, and to interpret the connection between computer vision models and our visual system. While deep generative models have recently been employed for this task, reconstructing realistic images with high semantic fidelity is still a challenging problem. Here, we propose a new method based on a diffusion model (DM) to reconstruct images from human brain activity obtained via functional magnetic resonance imaging (fMRI). More specifically, we rely on a latent diffusion model (LDM) termed Stable Diffusion. This model reduces the computational cost of DMs, while preserving their high generative performance. We also characterize the inner mechanisms of the LDM by studying how its different components (such as the latent vector of image Z, conditioning inputs C, and different elements of the denoising U-Net) relate to distinct brain functions. We show that our proposed method can reconstruct high-resolution images with high fidelity in a straightforward fashion, without the need for any additional training and fine-tuning of complex deep-learning models. We also provide a quantitative interpretation of different LDM components from a neuroscientific perspective. Overall, our study proposes a promising method for reconstructing images from human brain activity, and provides a new framework for understanding DMs. Please check out our webpage at https://sites.google.com/view/stablediffusion-with-brain/. — Read More
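
A minimal sketch of the decoding idea as summarized above (shapes, the regressor choice and all variable names are illustrative assumptions, not the authors’ code): fit simple linear maps from fMRI voxel patterns to the LDM’s image latent Z and conditioning embedding C, then hand the predicted latents to the pretrained Stable Diffusion decoder (not shown):

```python
import numpy as np
from sklearn.linear_model import Ridge

# Hypothetical training data: fMRI voxel patterns paired with the image
# latents (Z) and a pooled conditioning embedding (C) of the seen images.
# Shapes are toy sizes for illustration, not taken from the paper.
X_train = np.random.randn(200, 1000)          # (trials, voxels)
Z_train = np.random.randn(200, 4 * 32 * 32)   # flattened image latents
C_train = np.random.randn(200, 768)           # pooled conditioning embeddings

# Two independent linear (ridge) maps from brain activity to LDM inputs.
z_decoder = Ridge(alpha=100.0).fit(X_train, Z_train)
c_decoder = Ridge(alpha=100.0).fit(X_train, C_train)

# At test time, predict Z and C from a new fMRI pattern, then run the
# pretrained diffusion decoder conditioned on them to produce an image.
x_test = np.random.randn(1, 1000)
z_pred = z_decoder.predict(x_test).reshape(4, 32, 32)
c_pred = c_decoder.predict(x_test).reshape(-1)
```

The point of the sketch is that the heavy generative machinery stays frozen; only lightweight regressions from brain activity into the LDM’s own input spaces are learned.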

#human

AI unlikely to gain human-like cognition, unless connected to real world through robots

Connecting artificial intelligence systems to the real world through robots and designing them using principles from evolution is the most likely way AI will gain human-like cognition, according to research from the University of Sheffield.

In a paper published in Science Robotics, Professor Tony Prescott and Dr Stuart Wilson from the University’s Department of Computer Science say that, if they remain disembodied, AI systems are unlikely to resemble real brain processing, no matter how large their neural networks or the datasets used to train them become. — Read More

#human

I-JEPA: The first AI model based on Yann LeCun’s vision for more human-like AI

Last year, Meta’s Chief AI Scientist Yann LeCun proposed a new architecture intended to overcome key limitations of even the most advanced AI systems today. His vision is to create machines that can learn internal models of how the world works so that they can learn much more quickly, plan how to accomplish complex tasks, and readily adapt to unfamiliar situations.

We’re excited to introduce the first AI model based on a key component of LeCun’s vision. This model, the Image Joint Embedding Predictive Architecture (I-JEPA), learns by creating an internal model of the outside world, which compares abstract representations of images (rather than comparing the pixels themselves). I-JEPA delivers strong performance on multiple computer vision tasks, and it’s much more computationally efficient than other widely used computer vision models. The representations learned by I-JEPA can also be used for many different applications without needing extensive fine-tuning. For example, we train a 632M-parameter visual transformer model using 16 A100 GPUs in under 72 hours, and it achieves state-of-the-art performance for low-shot classification on ImageNet, with only 12 labeled examples per class. Other methods typically take 2 to 10 times more GPU-hours and achieve worse error rates when trained with the same amount of data.
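
A toy sketch of the joint-embedding predictive idea described above (the tiny linear modules are placeholders, not Meta’s released architecture): a context encoder sees a partially masked image, a predictor estimates the target encoder’s representations of the masked patches, and the loss is taken in representation space rather than pixel space:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-ins for the real ViT encoders: map each flattened patch to an embedding.
embed_dim, num_patches, patch_dim = 64, 16, 48
context_encoder = nn.Linear(patch_dim, embed_dim)  # sees the masked image
target_encoder = nn.Linear(patch_dim, embed_dim)   # in practice an EMA copy of the context encoder
predictor = nn.Linear(embed_dim, embed_dim)        # predicts target embeddings from context

def ijepa_style_loss(patches: torch.Tensor, masked_idx: torch.Tensor) -> torch.Tensor:
    """Predict the *representations* of masked patches, not their pixels.

    patches:    (num_patches, patch_dim) flattened image patches
    masked_idx: indices of patches hidden from the context encoder
    """
    with torch.no_grad():
        targets = target_encoder(patches)        # representations of all patches
    masked_patches = patches.clone()
    masked_patches[masked_idx] = 0.0             # hide masked patches from the context encoder
    context = context_encoder(masked_patches)
    preds = predictor(context)                   # predict the hidden patches' representations
    return F.mse_loss(preds[masked_idx], targets[masked_idx])

patches = torch.randn(num_patches, patch_dim)
loss = ijepa_style_loss(patches, torch.tensor([3, 7, 11]))
loss.backward()  # the target encoder is then updated by EMA, not by gradients
```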

Our paper on I-JEPA will be presented at CVPR 2023 next week, and we’re also open-sourcing the training code and model checkpoints today. — Read More

#human