Nvidia’s NEW Humanoid Robots STUN The ENTIRE INDUSTRY! (Nvidia Project GROOT)

Read More

#nvidia, #robotics, #videos

Nvidia reveals Blackwell B200 GPU, the ‘world’s most powerful chip’ for AI

Nvidia’s must-have H100 AI chip made it a multitrillion-dollar company, one that may be worth more than Alphabet and Amazon, and competitors have been fighting to catch up. But perhaps Nvidia is about to extend its lead — with the new Blackwell B200 GPU and GB200 “superchip.”

Nvidia says the new B200 GPU offers up to 20 petaflops of FP4 horsepower from its 208 billion transistors. Also, it says, a GB200 that combines two of those GPUs with a single Grace CPU can offer 30 times the performance for LLM inference workloads while also potentially being substantially more efficient. It “reduces cost and energy consumption by up to 25x” over an H100, says Nvidia. — Read More
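
Taken at face value, the figures above support some quick arithmetic. The sketch below is a back-of-the-envelope check using only the quoted numbers; the exaflop comparison is mine, not Nvidia's.

```python
# Back-of-the-envelope arithmetic using only the figures quoted above.
b200_fp4_petaflops = 20     # quoted FP4 throughput per B200 GPU
gpus_per_gb200 = 2          # a GB200 pairs two B200 GPUs with one Grace CPU

gb200_fp4_petaflops = gpus_per_gb200 * b200_fp4_petaflops
print(f"GB200 aggregate FP4 compute: {gb200_fp4_petaflops} petaflops")

# How many GB200 superchips would it take to reach one FP4 exaflop?
exaflop_in_petaflops = 1000
superchips_per_exaflop = exaflop_in_petaflops / gb200_fp4_petaflops
print(f"Superchips per FP4 exaflop: {superchips_per_exaflop:.0f}")
```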

#nvidia

Groq

Groq is on a mission to set the standard for GenAI inference speed, helping real-time AI applications come to life today. Using a new type of end-to-end processing unit system called an LPU Inference Engine, with LPU standing for Language Processing Unit™, Groq provides the fastest inference for computationally intensive applications with a sequential component to them, such as AI language applications (LLMs). Groq supports standard machine learning (ML) frameworks such as PyTorch, TensorFlow, and ONNX for inference. Groq does not currently support ML training with the LPU Inference Engine. — Read More
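
As a concrete illustration of what inference-only framework support means in practice, here is a minimal, generic sketch of exporting a trained PyTorch model to ONNX, the kind of artifact an inference engine can consume. It uses no Groq-specific API (none is described above), and the model and file names are purely illustrative.

```python
import torch
import torch.nn as nn

# A tiny stand-in model; in practice this would be a trained LLM or other network.
class TinyClassifier(nn.Module):
    def __init__(self, in_features=128, num_classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, 64),
            nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, x):
        return self.net(x)

model = TinyClassifier().eval()       # inference mode only; no training involved
example_input = torch.randn(1, 128)   # dummy input that fixes the graph's shapes

# Export to ONNX, the framework-neutral format an inference runtime can consume.
torch.onnx.export(
    model,
    example_input,
    "tiny_classifier.onnx",
    input_names=["input"],
    output_names=["logits"],
    dynamic_axes={"input": {0: "batch"}, "logits": {0: "batch"}},
)
```

From there, the exported file would be handed to whatever inference runtime or compiler the target hardware provides.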

#nlp, #nvidia

US researchers develop ‘unhackable’ computer chip that works on light

Researchers at the University of Pennsylvania have developed a new computer chip that uses light instead of electricity. This could improve the training of artificial intelligence (AI) models by speeding up data transfer and reducing the amount of electricity consumed.

… A team led by Nader Engheta, a professor at the School of Engineering and Applied Science at the University of Pennsylvania, has designed a silicon-photonic (SiPh) chip that can perform mathematical computations using light. The team turned to light because it is the fastest known means of transferring data, while the use of widely abundant silicon ensures the technology can be scaled quickly. — Read More

#nvidia

New Texas Center Will Create Generative AI Computing Cluster Among Largest of Its Kind

The University of Texas at Austin is creating one of the most powerful artificial intelligence hubs in the academic world to lead in research and offer world-class AI infrastructure to a wide range of partners.

UT is launching the Center for Generative AI, powered by a new GPU computing cluster that is among the largest in academia. The cluster will comprise 600 NVIDIA H100 GPUs (graphics processing units), specialized devices that enable rapid mathematical computations, making them ideal for training AI models. The Texas Advanced Computing Center (TACC) will host and support the cluster, called Vista. — Read More

#nvidia

Tied-LoRA: Enhancing parameter efficiency of LoRA with weight tying

We propose Tied-LoRA, a simple paradigm that utilizes weight tying and selective training to further increase the parameter efficiency of the low-rank adaptation (LoRA) method. Our investigations include all feasible combinations of parameter training/freezing in conjunction with weight tying to identify the optimal balance between performance and the number of trainable parameters. Through experiments covering a variety of tasks and two base language models, we provide analysis revealing trade-offs between efficiency and performance. Our experiments uncovered a particular Tied-LoRA configuration that stands out by demonstrating comparable performance across several tasks while employing only 13% of the parameters used by the standard LoRA method. — Read More
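
The abstract describes tying the LoRA weights across layers while selectively training a small remainder. The sketch below illustrates one plausible realization of that idea in PyTorch, assuming the low-rank A and B factors are shared across layers and only tiny per-layer scaling vectors stay trainable; the class, shapes, and scaling scheme are my own simplification, not the paper's exact configurations.

```python
import torch
import torch.nn as nn

class TiedLoRALinear(nn.Module):
    """A frozen linear layer plus a low-rank update whose A and B factors are
    shared (tied) across layers; only small per-layer scaling vectors are
    layer-specific. Illustrative only, not the paper's exact recipe."""

    def __init__(self, base: nn.Linear, shared_A: nn.Parameter,
                 shared_B: nn.Parameter, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False        # freeze the pretrained weight
        self.A = shared_A                  # tied across all adapted layers, shape (r, d_in)
        self.B = shared_B                  # tied across all adapted layers, shape (d_out, r)
        r = shared_A.shape[0]
        self.scale = alpha / r
        # Per-layer trainable scaling vectors (the "selective training" knob).
        self.u = nn.Parameter(torch.ones(shared_A.shape[1]))   # scales input dims
        self.v = nn.Parameter(torch.ones(shared_B.shape[0]))   # scales output dims

    def forward(self, x):
        delta = (x * self.u) @ self.A.t() @ self.B.t() * self.v
        return self.base(x) + self.scale * delta

# One shared pair of low-rank factors reused by every adapted layer.
d_model, r = 512, 8
shared_A = nn.Parameter(torch.randn(r, d_model) * 0.01)
shared_B = nn.Parameter(torch.zeros(d_model, r))   # zero init: starts identical to the base model

layers = [TiedLoRALinear(nn.Linear(d_model, d_model), shared_A, shared_B)
          for _ in range(4)]

# Count trainable parameters once, deduplicating the tied factors by identity.
trainable = {id(p): p.numel() for l in layers for p in l.parameters() if p.requires_grad}
print(f"Trainable parameters across {len(layers)} tied layers: {sum(trainable.values()):,}")
```

Because A and B are counted once no matter how many layers reuse them, the trainable-parameter count grows only by the small per-layer vectors, which is the source of the efficiency gain the abstract reports.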

#nvidia

Nvidia announces new HGX H200 computing platform, with advanced memory to handle AI workloads

Nvidia Corp. today announced the introduction of the HGX H200 computing platform, a powerful new system that features the upcoming H200 Tensor Core graphics processing unit based on its Hopper architecture, with advanced memory to handle the massive amounts of data needed for artificial intelligence and supercomputing workloads.

The company announced the new platform (pictured) during today’s Supercomputing 2023 conference in Denver, Colorado. It revealed that the H200 will be the first GPU built with HBM3e memory, a high-speed memory designed to accelerate large language models and high-performance computing workloads for scientific and industrial endeavors.

The H200 is the next generation after the H100 GPU, Nvidia’s first GPU to be built on the Hopper architecture. It includes a new feature called the Transformer Engine designed to speed up natural language processing models. With the addition of the new HBM3e memory, the H200 offers 141 gigabytes of memory at 4.8 terabytes per second, nearly double the capacity and 2.4 times the bandwidth of the Nvidia A100 GPU. — Read More
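
A quick sanity check of that comparison, assuming the 80 GB A100 with roughly 2 TB/s of memory bandwidth as the baseline (the A100 figures are not stated in the article):

```python
# H200 figures quoted above.
h200_memory_gb = 141
h200_bandwidth_tbps = 4.8        # terabytes per second

# Assumed A100 baseline (80 GB model, ~2 TB/s), not stated in the article.
a100_memory_gb = 80
a100_bandwidth_tbps = 2.0

print(f"Capacity ratio:  {h200_memory_gb / a100_memory_gb:.2f}x (article: nearly double)")
print(f"Bandwidth ratio: {h200_bandwidth_tbps / a100_bandwidth_tbps:.1f}x (article: 2.4 times)")
```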

#nvidia

Nvidia to release new AI chips for Chinese market after export ban

Nvidia is expected to introduce new high-end AI chips for Chinese customers after its current ones were blocked from being sold in the country. China, together with Taiwan and the U.S., ranks among Nvidia’s top markets. — Read More

#china-ai, #nvidia

‘Mind-blowing’ IBM chip speeds up AI

IBM’s NorthPole processor sidesteps the need to access external memory, boosting computing power and saving energy.

A brain-inspired computer chip that could supercharge artificial intelligence (AI) by working faster with much less power has been developed by researchers at IBM in San Jose, California. Their massive NorthPole processor chip eliminates the need to frequently access external memory, and so performs tasks such as image recognition faster than existing architectures do — while consuming vastly less power.

“Its energy efficiency is just mind-blowing,” says Damien Querlioz, a nanoelectronics researcher at the University of Paris-Saclay in Palaiseau. The work, published in Science, shows that computing and memory can be integrated on a large scale, he says. “I feel the paper will shake the common thinking in computer architecture.” — Read More

#human, #nvidia

China Chips and Moore’s Law

On Tuesday the Biden administration tightened export controls for advanced AI chips being sold to China; the primary target was Nvidia’s H800 and A800 chips, which were specifically designed to skirt controls put in place last year. The primary difference between the H800/A800 and H100/A100 is the bandwidth of their interconnects: the A100 had 600 GB/s interconnects (the H100 has 900 GB/s), which just so happened to be the limit prescribed by last year’s export controls; the A800 and H800 were limited to 400 GB/s interconnects.

The reason why interconnect speed matters is tied up with Nvidia CEO Jensen Huang’s thesis that Moore’s Law is dead. Moore’s Law, as originally stated in 1965, held that the number of transistors in an integrated circuit would double every year. Moore revised his prediction 10 years later to a doubling every two years, which held until the last decade or so, when it slowed to a doubling about every three years. — Read More
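
Those three cadences compound very differently. Here is a small sketch under the idealized model that transistor count scales as 2^(years / doubling period):

```python
# Transistor growth under an idealized doubling model: count = start * 2 ** (years / period).
def growth_factor(years: float, doubling_period_years: float) -> float:
    return 2 ** (years / doubling_period_years)

horizon = 10  # years
for period in (1, 2, 3):  # the cadences mentioned above
    print(f"Doubling every {period} year(s): ~{growth_factor(horizon, period):,.0f}x over {horizon} years")
```

Over a decade that works out to roughly 1,000x, 32x, and 10x respectively, which is why scaling now leans on connecting many chips together and why interconnect bandwidth has become the chokepoint the export controls target.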

#china-ai, #nvidia