Reasoning, the process of devising and executing complex goal-oriented action sequences, remains a critical challenge in AI. Current large language models (LLMs) primarily employ Chain-of-Thought (CoT) techniques, which suffer from brittle task decomposition, extensive data requirements, and high latency. Inspired by the hierarchical and multi-timescale processing in the human brain, we propose the Hierarchical Reasoning Model (HRM), a novel recurrent architecture that attains significant computational depth while maintaining both training stability and efficiency. HRM executes sequential reasoning tasks in a single forward pass without explicit supervision of the intermediate process, through two interdependent recurrent modules: a high-level module responsible for slow, abstract planning, and a low-level module handling rapid, detailed computations. With only 27 million parameters, HRM achieves exceptional performance on complex reasoning tasks using only 1000 training samples. The model operates without pre-training or CoT data, yet achieves nearly perfect performance on challenging tasks including complex Sudoku puzzles and optimal path finding in large mazes. Furthermore, HRM outperforms much larger models with significantly longer context windows on the Abstraction and Reasoning Corpus (ARC), a key benchmark for measuring artificial general intelligence capabilities. These results underscore HRM’s potential as a transformative advancement toward universal computation and general-purpose reasoning systems. — Read More
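The abstract describes two interdependent recurrent modules running at different timescales: a fast low-level module takes several steps conditioned on a slow high-level state, which then updates once per cycle. A minimal sketch of that two-timescale loop, with all module sizes, update rules, and step counts being illustrative assumptions rather than the paper's actual equations:

```python
import numpy as np

rng = np.random.default_rng(0)
D_H, D_L, D_IN = 8, 16, 4  # high-level, low-level, input dims (assumed)
W_L = rng.normal(0, 0.1, (D_L, D_L + D_H + D_IN))  # low-level recurrence
W_H = rng.normal(0, 0.1, (D_H, D_H + D_L))         # high-level recurrence

def hrm_forward(x, n_cycles=3, k_low=4):
    """One forward pass: n_cycles slow high-level updates, each after
    k_low fast low-level steps conditioned on the high-level state."""
    z_h = np.zeros(D_H)  # slow, abstract planning state
    z_l = np.zeros(D_L)  # fast, detailed computation state
    for _ in range(n_cycles):
        for _ in range(k_low):  # rapid, detailed computation
            z_l = np.tanh(W_L @ np.concatenate([z_l, z_h, x]))
        # slow update: high-level state integrates the low-level result
        z_h = np.tanh(W_H @ np.concatenate([z_h, z_l]))
    return z_h, z_l

z_h, z_l = hrm_forward(rng.normal(size=D_IN))
print(z_h.shape, z_l.shape)  # (8,) (16,)
```

The point of the nested loop is the depth claim in the abstract: with `n_cycles * k_low` effective recurrent steps, the model gains computational depth in a single forward pass without supervising the intermediate states.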
Scientists just developed a new AI modeled on the human brain — it’s outperforming LLMs like ChatGPT at reasoning tasks
The hierarchical reasoning model (HRM) system is modeled on the way the human brain processes complex information, and it outperformed leading LLMs in a notoriously hard-to-beat benchmark.
Scientists have developed a new type of artificial intelligence (AI) model that can reason differently from most large language models (LLMs) like ChatGPT, resulting in much better performance in key benchmarks.
The new reasoning AI, called a hierarchical reasoning model (HRM), is inspired by the hierarchical and multi-timescale processing in the human brain — the way different brain regions integrate information over varying durations (from milliseconds to minutes). — Read More
Neuralink: We Have a Backlog of 10K Patients Who Want Our Brain Implant
Neuralink has a backlog of 10,000 individuals interested in having its N1 device drilled into their skulls, according to President and Co-Founder Dongjin (DJ) Seo. The company has implanted the N1 into 12 clinical trial patients so far; Seo expects the number to grow to 25 by year’s end.
People can sign up to participate in the company’s clinical trials online, but to qualify, they must have either limited or no ability to use their hands due to a cervical spinal cord injury or ALS. — Read More
A built-in ‘off switch’ to stop persistent pain
Nearly 50 million people in the U.S. live with chronic pain, an invisible and often stubborn condition that can last for decades.
Now, collaborative research led by neuroscientist J. Nicholas Betley finds that a critical hub in the brainstem has a built-in "off switch" that stops persistent pain signals from reaching the rest of the brain.
Their findings could help clinicians better understand chronic pain. “If we can measure and eventually target these neurons, that opens up a whole new path for treatment,” says Betley. — Read More
Do Humans Really Have World Models?
What if our world models are just as emergent and flimsy as AI’s?
I keep hearing that world models are the way forward for AI.
I tend to agree, and have been saying as much for many years as a technical person in AI who works on actual models, though not a top-tier AI researcher.
Anyway, I’m up at 3:45AM today with an insane thought.
Why do we think humans have world models? — Read More
New Ultrasound Helmet Reaches Deep Inside The Brain Without Surgery
Deep-brain structures like the basal ganglia or the thalamus wield major influence on our behavior. If something goes awry, dysregulation in the deep brain may trigger neurological conditions like Parkinson’s disease or depression.
Despite the clear importance of these structures, our knowledge about them remains limited by their location, making them difficult to study and treat.
In a new study, researchers unveil a device that might offer an alternative to invasive procedures. Featuring a novel ultrasound helmet, it not only modulates deep-brain circuits without surgery, but reportedly can do so with unrivaled precision. — Read More
Read the Study
SpikingBrain Technical Report: Spiking Brain-inspired Large Models
Mainstream Transformer-based large language models (LLMs) face significant efficiency bottlenecks: training computation scales quadratically with sequence length, and inference memory grows linearly. These constraints limit their ability to process long sequences effectively. In addition, building large models on non-NVIDIA computing platforms poses major challenges in achieving stable and efficient training and deployment. To address these issues, we introduce SpikingBrain, a new family of brain-inspired models designed for efficient long-context training and inference. SpikingBrain leverages the MetaX GPU cluster and focuses on three core aspects: i) Model Architecture: linear and hybrid-linear attention architectures with adaptive spiking neurons; ii) Algorithmic Optimizations: an efficient, conversion-based training pipeline compatible with existing LLMs, along with a dedicated spike coding framework; iii) System Engineering: customized training frameworks, operator libraries, and parallelism strategies tailored to the MetaX hardware.
Using these techniques, we develop two models: SpikingBrain-7B, a linear LLM, and SpikingBrain-76B, a hybrid-linear MoE LLM. These models demonstrate the feasibility of large-scale LLM development on non-NVIDIA platforms. SpikingBrain achieves performance comparable to open-source Transformer baselines while using exceptionally low data resources (continual pre-training of ∼150B tokens). Our models also significantly improve long-sequence training efficiency and deliver inference with (partially) constant memory and event-driven spiking behavior. For example, SpikingBrain-7B achieves more than 100× speedup in Time to First Token (TTFT) for 4M-token sequences. Our training framework supports weeks of stable large-scale training on hundreds of MetaX C550 GPUs, with the 7B model reaching a Model FLOPs Utilization (MFU) of 23.4%. In addition, the proposed spiking scheme achieves 69.15% sparsity, enabling low-power operation. Overall, this work demonstrates the potential of brain-inspired mechanisms to drive the next generation of efficient and scalable large model design. — Read More
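The abstract credits much of the efficiency to adaptive spiking neurons that produce sparse, event-driven activations. A minimal sketch of one common adaptive-threshold spiking scheme, purely as an illustration of the idea; the dynamics, constants, and function name here are assumptions, not SpikingBrain's actual spike coding framework:

```python
import numpy as np

def spike_encode(activations, theta0=1.0, beta=0.5, decay=0.9):
    """Convert a sequence of real-valued activations into sparse binary
    spikes: a unit fires when its membrane potential crosses an adaptive
    threshold, which rises after each spike and decays back toward theta0."""
    v = np.zeros_like(activations[0])  # membrane potential
    theta = np.full_like(v, theta0)    # per-unit adaptive threshold
    spikes = []
    for a in activations:
        v = v + a                          # integrate the input
        s = (v >= theta).astype(float)     # fire where threshold is crossed
        v = v - s * theta                  # soft reset for firing units
        theta = decay * theta + (1 - decay) * theta0 + beta * s
        spikes.append(s)
    return np.array(spikes)

acts = np.abs(np.random.default_rng(1).normal(0.4, 0.3, (20, 64)))
spikes = spike_encode(acts)
print(f"spike sparsity: {1.0 - spikes.mean():.2f}")
```

Because downstream computation only needs to touch the units that fired, high sparsity (the report cites 69.15%) translates directly into fewer operations and lower power, which is the event-driven benefit the abstract refers to.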
In a first, scientists map complete brain activity during decision-making
Mice moving tiny steering wheels to control shapes on a screen have given scientists an unprecedented view of how decisions unfold across the brain.
For the first time, researchers have mapped decision-making at single-cell resolution across an entire mammalian brain. — Read More
Read the Paper
Parallel AI Agents Are a Game Changer
I’ve been in this industry long enough to watch technologies come and go. I’ve seen the excitement around new frameworks, the promises of revolutionary tools, and the breathless predictions about what would “change everything.” Most of the time, these technologies turned out to be incremental improvements wrapped in marketing hyperbole.
But parallel agents? This is different. This is the first time I can say, without any exaggeration, that I’m witnessing technology that will fundamentally transform how we develop software. — Read More
AGI is an Engineering Problem
We’ve reached an inflection point in AI development. The scaling laws that once promised ever-more-capable models are showing diminishing returns. GPT-5, Claude, and Gemini represent remarkable achievements, but they’re hitting asymptotes that brute-force scaling can’t solve. The path to artificial general intelligence isn’t through training ever-larger language models—it’s through building engineered systems that combine models, memory, context, and deterministic workflows into something greater than their parts.
Let me be blunt: AGI is an engineering problem, not a model training problem. — Read More