AMD MI300X Accelerators are Competitive with NVIDIA H100, Crunch MLPerf Inference v4.1

The MLCommons consortium on Wednesday posted MLPerf Inference v4.1 benchmark results for popular AI inferencing accelerators available in the market, across brands that include NVIDIA, AMD, and Intel. AMD’s Instinct MI300X accelerators emerged as competitive with NVIDIA’s “Hopper” H100 series AI GPUs. AMD also used the opportunity to showcase the kind of AI inferencing performance uplifts customers can expect from its next-generation EPYC “Turin” server processors powering these MI300X machines. “Turin” features “Zen 5” CPU cores sporting a 512-bit FPU datapath and improved performance in AI-relevant 512-bit SIMD instruction sets such as AVX-512 and VNNI. The MI300X, on the other hand, banks on the strengths of its memory subsystem, FP8 data format support, and efficient KV cache management. — Read More
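For readers unfamiliar with the KV-cache technique the summary alludes to: during autoregressive decoding, each token’s attention keys and values are stored so later steps reuse them instead of recomputing the whole prefix. Here is a minimal single-head sketch in PyTorch, with toy shapes and names of our own choosing, not AMD’s implementation:

```python
import torch

def attend_with_kv_cache(q_new, k_new, v_new, cache):
    """One decode step: append this token's key/value to the cache, then
    attend the new query against every cached position."""
    if cache is not None:
        k_new = torch.cat([cache[0], k_new], dim=1)  # (batch, seq, dim)
        v_new = torch.cat([cache[1], v_new], dim=1)
    scores = q_new @ k_new.transpose(-2, -1) / k_new.shape[-1] ** 0.5
    out = torch.softmax(scores, dim=-1) @ v_new
    return out, (k_new, v_new)  # cache grows by one position per step

# Decode three tokens; each step computes K/V for only the newest token.
cache = None
for _ in range(3):
    q = k = v = torch.randn(1, 1, 64)  # toy single-head projections
    _, cache = attend_with_kv_cache(q, k, v, cache)
print(cache[0].shape)  # torch.Size([1, 3, 64])
```

The cache trades memory for compute and grows with sequence length, which is why accelerators with large memory capacity, like the 192 GB MI300X, emphasize how efficiently they manage it.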

#nvidia

New in Gemini: Custom Gems and improved image generation with Imagen 3

We have new features rolling out, starting today, that we previewed at Google I/O. Gems, a new feature that lets you customize Gemini to create your own personal AI experts on any topic you want, are now available for Gemini Advanced, Business and Enterprise users. And our new image generation model, Imagen 3, will be rolling out across Gemini, Gemini Advanced, Business and Enterprise in the coming days. — Read More

#big7

Diffusion Models Are Real-Time Game Engines

We present GameNGen, the first game engine powered entirely by a neural model that enables real-time interaction with a complex environment over long trajectories at high quality. GameNGen can interactively simulate the classic game DOOM at over 20 frames per second on a single TPU. Next frame prediction achieves a PSNR of 29.4, comparable to lossy JPEG compression. Human raters are only slightly better than random chance at distinguishing short clips of the game from clips of the simulation. GameNGen is trained in two phases: (1) an RL agent learns to play the game and the training sessions are recorded, and (2) a diffusion model is trained to produce the next frame, conditioned on the sequence of past frames and actions. Conditioning augmentations enable stable auto-regressive generation over long trajectories. — Read More
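As a rough illustration of the two-phase recipe described above, here is a toy next-frame trainer. Phase 1 data (frames and actions logged from an agent’s play) is stood in by random tensors, and the conditional diffusion model is reduced to a plain convolutional regressor so only the conditioning and data flow are shown; all names and shapes are hypothetical, not the paper’s architecture:

```python
import torch
import torch.nn as nn

# Phase 1 (data collection) is assumed already done: an agent played the game
# while (past_frames, past_actions, next_frame) tuples were logged. Toy data:
CTX, H, W, N_ACTIONS = 4, 32, 32, 8
frames  = torch.rand(16, CTX, 3, H, W)            # context frames
actions = torch.randint(0, N_ACTIONS, (16, CTX))  # actions taken at each frame
target  = torch.rand(16, 3, H, W)                 # frame to predict

# Phase 2: learn next-frame prediction conditioned on past frames and actions.
# The real model is a diffusion model; this conv regressor is a stand-in.
class NextFrameModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.action_emb = nn.Embedding(N_ACTIONS, 3 * H * W)
        self.net = nn.Conv2d(2 * 3 * CTX, 3, kernel_size=3, padding=1)

    def forward(self, frames, actions):
        b = frames.shape[0]
        f = frames.reshape(b, -1, H, W)                    # stack context frames
        a = self.action_emb(actions).reshape(b, -1, H, W)  # action feature maps
        return self.net(torch.cat([f, a], dim=1))

model = NextFrameModel()
loss = nn.functional.mse_loss(model(frames, actions), target)
loss.backward()  # at inference, predicted frames are fed back autoregressively
```

The autoregressive loop at inference is what makes the stability question hard: each predicted frame becomes part of the next step’s conditioning, so errors can compound, which is what the paper’s conditioning augmentations address.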

#big7

Capt. Grace Hopper on Future Possibilities: Data, Hardware, Software, and People (Part One, 1982)

Read More

Part 2

#videos

OpenAI’s Strawberry and Orion: The Next Leap in AI Evolution

In the ever-evolving landscape of artificial intelligence, OpenAI continues to push the boundaries of what’s possible. Their latest endeavors, the Strawberry and Orion AI models, are poised to redefine our expectations of machine intelligence. Let’s dive into what makes these models tick and why they matter.

… What sets Strawberry apart is its ability to think — and I mean really think. It’s not just pattern matching or regurgitating training data. This AI is solving problems it’s never seen before, like a mathematician encountering a novel proof. — Read More

#performance

Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking

When writing and talking, people sometimes pause to think. Although reasoning-focused works have often framed reasoning as a method of answering questions or completing agentic tasks, reasoning is implicit in almost all written text. For example, this applies to the steps not stated between the lines of a proof or to the theory of mind underlying a conversation. In the Self-Taught Reasoner (STaR, Zelikman et al. 2022), useful thinking is learned by inferring rationales from few-shot examples in question-answering and learning from those that lead to a correct answer. This is a highly constrained setting — ideally, a language model could instead learn to infer unstated rationales in arbitrary text. We present Quiet-STaR, a generalization of STaR in which LMs learn to generate rationales at each token to explain future text, improving their predictions. We address key challenges, including 1) the computational cost of generating continuations, 2) the fact that the LM does not initially know how to generate or use internal thoughts, and 3) the need to predict beyond individual next tokens. To resolve these, we propose a tokenwise parallel sampling algorithm, using learnable tokens indicating a thought’s start and end, and an extended teacher-forcing technique. Encouragingly, generated rationales disproportionately help model difficult-to-predict tokens and improve the LM’s ability to directly answer difficult questions. In particular, after continued pretraining of an LM on a corpus of internet text with Quiet-STaR, we find zero-shot improvements on GSM8K (5.9%→10.9%) and CommonsenseQA (36.3%→47.2%) and observe a perplexity improvement on difficult tokens in natural text. Crucially, these improvements require no fine-tuning on these tasks. Quiet-STaR marks a step towards LMs that can learn to reason in a more general and scalable way. — Read More
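A highly simplified sketch of the core idea follows, with hypothetical stand-ins for the language model and the thought sampler. In the paper, the mixing weight comes from a learned “mixing head” and thoughts are trained with a REINFORCE-style objective on how much they improve future-token prediction; the fixed gate here is a toy simplification:

```python
import torch

V = 100  # toy vocabulary size

def lm_logits(seq):
    """Hypothetical stand-in for a language model's next-token logits."""
    torch.manual_seed(len(seq))
    return torch.randn(V)

def sample_thought(seq, k=3):
    """Hypothetical stand-in for sampling k thought tokens wrapped in the
    learned <|startofthought|>/<|endofthought|> markers (ids 0 and 1 here)."""
    return [0] + torch.randint(1, V, (k,)).tolist() + [1]

def mixed_logits(seq, gate=0.5):
    """Interpolate next-token logits predicted with and without a rationale.
    In the paper the gate is produced by a learned mixing head, so the model
    can ignore unhelpful thoughts; a fixed constant is used here."""
    base = lm_logits(seq)                              # no-thought prediction
    with_thought = lm_logits(seq + sample_thought(seq))  # post-thought prediction
    return gate * with_thought + (1 - gate) * base

print(mixed_logits([5, 9, 2]).shape)  # torch.Size([100])
```

Running this per token is what the paper’s tokenwise parallel sampling algorithm makes affordable: thoughts for every position in a sequence are generated in one batched pass rather than one at a time.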

#performance

STaR: Bootstrapping Reasoning With Reasoning

Generating step-by-step “chain-of-thought” rationales improves language model performance on complex reasoning tasks like mathematics or commonsense question-answering. However, inducing language model rationale generation currently requires either constructing massive rationale datasets or sacrificing accuracy by using only few-shot inference. We propose a technique to iteratively leverage a small number of rationale examples and a large dataset without rationales to bootstrap the ability to perform successively more complex reasoning. This technique, the “Self-Taught Reasoner” (STaR), relies on a simple loop: generate rationales to answer many questions, prompted with a few rationale examples; if the generated answers are wrong, try again to generate a rationale given the correct answer; fine-tune on all the rationales that ultimately yielded correct answers; repeat. We show that STaR significantly improves performance on multiple datasets compared to a model fine-tuned to directly predict final answers, and performs comparably to fine-tuning a 30× larger state-of-the-art language model on CommonsenseQA. Thus, STaR lets a model improve itself by learning from its own generated reasoning. — Read More
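The loop is compact enough to sketch directly. In the outline below, generate and finetune are hypothetical stand-ins for an actual LM sampling and training stack; the structure mirrors the four steps quoted above:

```python
def star(base_model, few_shot_prompt, dataset, generate, finetune, n_iters=5):
    """STaR outer loop. `generate(model, prompt, question, hint=None)` is
    assumed to return a (rationale, answer) pair, and `finetune(model,
    examples)` an updated model; both are hypothetical helpers."""
    model = base_model
    for _ in range(n_iters):
        keep = []
        for question, gold in dataset:
            # Step 1: attempt a rationale + answer with a few-shot prompt.
            rationale, answer = generate(model, few_shot_prompt, question)
            if answer != gold:
                # Step 2, "rationalization": retry with the correct answer as
                # a hint, so the model explains an answer it missed on its own.
                rationale, answer = generate(
                    model, few_shot_prompt, question, hint=gold)
            if answer == gold:
                keep.append((question, rationale, gold))
        # Step 3: fine-tune on rationales that yielded correct answers; the
        # paper restarts each fine-tune from the base model to limit drift.
        model = finetune(base_model, keep)
        # Step 4: repeat with the improved model generating new rationales.
    return model
```

The rationalization step is the crucial design choice: without the hint, questions the model cannot yet solve would never contribute training data, and the loop would plateau early.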

#performance

Reimagining cloud strategy for AI-first enterprises

The rise of generative artificial intelligence (AI), natural language processing, and computer vision has sparked lofty predictions: AI will revolutionize business operations, transform the nature of knowledge work, and boost companies’ bottom lines and the larger global economy by trillions of dollars.

Executives and technology leaders are eager to see these promises realized, and many are enjoying impressive results from early AI investments. Balakrishna D.R. (Bali), executive vice president, global services head, AI and industry verticals at Infosys, says that generative AI is already proving game-changing for tasks such as knowledge management, search and summarization, software development, and customer service across sectors such as financial services, retail, health care, and automotive. — Read More

#strategy

Terence Tao at IMO 2024: AI and Mathematics

Read More

#videos

Andrew Ng’s new model lets you play around with solar geoengineering to see what would happen

AI pioneer Andrew Ng has released a simple online tool that allows anyone to tinker with the dials of a solar geoengineering model, exploring what might happen if nations attempt to counteract climate change by spraying reflective particles into the atmosphere.

The concept of solar geoengineering was born from the realization that the planet has cooled in the months following massive volcanic eruptions, including one that occurred in 1991, when Mt. Pinatubo blasted some 20 million tons of sulfur dioxide into the stratosphere. But critics fear that deliberately releasing such materials could harm certain regions of the world, discourage efforts to cut greenhouse-gas emissions, or spark conflicts between nations, among other counterproductive consequences.

The goal of Ng’s emulator, called Planet Parasol, is to invite more people to think about solar geoengineering, explore the potential trade-offs involved in such interventions, and use the results to discuss and debate our options for climate action. The tool, developed in partnership with researchers at Cornell, the University of California, San Diego, and other institutions, also highlights how AI could help advance our understanding of solar geoengineering. — Read More

Try the Model

#strategy