With Effort you can adjust smoothly – and in real time – how much computation to perform during inference of an LLM.
At 50% of the calculations it is as fast as regular matrix multiplication on Apple Silicon chips; at 25% effort it is twice as fast and still retains most of the quality.
You can also freely choose to skip loading the least important weights.
It is currently implemented for Mistral, and it should work just as well for all other models. No retraining is needed, just conversion to a different format and some precomputation. — Read More
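The Effort implementation itself is not shown here. As a rough illustration of the general idea only (a hypothetical sketch, not the project's actual algorithm), a matrix-vector product can be approximated by computing just the contributions of the largest-magnitude input entries, with an `effort` knob controlling what fraction of the work is done:

```python
import numpy as np

def approx_matvec(W, x, effort=0.5):
    """Approximate W @ x using only a fraction of the input entries.

    Hypothetical sketch of "adjustable effort": keep the `effort`
    fraction of x's entries with the largest magnitude and skip the
    rest, so the multiply does proportionally less arithmetic.
    """
    k = max(1, int(effort * x.size))
    keep = np.argsort(np.abs(x))[-k:]   # indices of the largest |x_i|
    return W[:, keep] @ x[keep]         # skip columns for dropped entries

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 1024))
x = rng.standard_normal(1024)

full = W @ x                            # 100% of the multiplications
half = approx_matvec(W, x, effort=0.5)  # ~50% of the multiplications
```

At `effort=1.0` the result matches the full product exactly; lower settings trade accuracy for fewer multiplications, which is the trade-off the blurb describes.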
Recent Updates
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention
This work introduces an efficient method to scale Transformer-based Large Language Models (LLMs) to infinitely long inputs with bounded memory and computation. A key component in our proposed approach is a new attention technique dubbed Infini-attention. The Infini-attention incorporates a compressive memory into the vanilla attention mechanism and builds in both masked local attention and long-term linear attention mechanisms in a single Transformer block. We demonstrate the effectiveness of our approach on long-context language modeling benchmarks, 1M sequence length passkey context block retrieval and 500K length book summarization tasks with 1B and 8B LLMs. Our approach introduces minimal bounded memory parameters and enables fast streaming inference for LLMs. — Read More
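The key property of the compressive memory described above is that it stays a fixed size no matter how many segments are folded into it. The toy sketch below illustrates that property with a linear-attention-style associative memory; the class name, kernel choice, and details are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def elu1(x):
    # ELU + 1 keeps activations positive, a common linear-attention kernel
    return np.where(x > 0, x + 1.0, np.exp(x))

class CompressiveMemory:
    """Toy fixed-size associative memory, loosely in the spirit of the
    compressive memory in Infini-attention. Illustrative only."""

    def __init__(self, d_key, d_value):
        self.M = np.zeros((d_key, d_value))  # associative matrix (fixed size)
        self.z = np.zeros(d_key)             # running normalization term

    def update(self, K, V):
        sK = elu1(K)
        self.M += sK.T @ V                   # fold this segment's KV into M
        self.z += sK.sum(axis=0)

    def retrieve(self, Q):
        sQ = elu1(Q)
        return (sQ @ self.M) / (sQ @ self.z + 1e-6)[:, None]
```

Each `update` compresses a whole segment of keys and values into `M` without growing it, which is what bounds memory as the context length grows.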
#nlp
How AI adds to human potential
Generative AI is advancing at a breakneck pace, prompting questions on risk and opportunity, from content creation to personal data management. In a special live recording, we delve into the ways AI can augment human work and spur innovation, instead of simply using AI to cut costs or replace jobs. Host Jeff Berman joined a seasoned AI researcher, Intel’s Lama Nachman, and a young start-up founder, Scale AI’s Alexandr Wang, on stage at the Intel Vision event in April 2024. They explore topics like AI’s disruption of creative industries, mitigating its biggest risks (like deep fakes), and why human critical thinking will be even more vital as AI technology spreads. — Read More
UMD-LinkUp AI Maps Transforms AI Job Tracking
UMD-LinkUp, a collaboration between the Robert H. Smith School of Business at the University of Maryland, LinkUp Job Market Data, and Outrigger Group, introduced the world’s first tool for mapping the creation of jobs requiring artificial intelligence skills: UMD-LinkUp AI Maps.
AI Maps leverages LinkUp’s industry-leading job data to visualize the spread of jobs requiring skills in AI across the country – by sector, state and more granular geographic levels. The resulting interactive map allows users to track the creation of U.S.-based AI jobs each month; rank states by their share of those jobs; do a deeper dive across economic sectors, metropolitan areas, and counties; and determine a region’s AI Intensity: the ratio of its AI jobs to all other postings. — Read More
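The AI Intensity metric defined above is a simple ratio. As a minimal sketch (the function name and interface are assumptions, not part of the AI Maps tool):

```python
def ai_intensity(ai_jobs: int, other_postings: int) -> float:
    """AI Intensity as described: a region's AI job postings
    divided by all of its other postings."""
    if other_postings == 0:
        raise ValueError("no non-AI postings to compare against")
    return ai_jobs / other_postings

# e.g. a hypothetical region with 120 AI postings and 4,000 other postings
print(round(ai_intensity(120, 4000), 3))  # 0.03
```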
Next Stop Paris — AI Production Output from TCLtv+
SoA survey reveals a third of translators and a quarter of illustrators losing work to AI
Survey on generative AI highlights the growing impact of new technologies on creative careers, and an urgent need for ethical development that works within copyright laws
Throughout January 2024, we ran a survey of our 12,500 members and other authors, receiving nearly 800 responses on respondents’ experiences of generative artificial intelligence (AI) systems, and their views and concerns about the future impact on creative careers.
The findings demonstrate not only the deep uncertainty about the future role of generative AI in the profession, but also the impact it is already having on careers and livelihoods. — Read More
Is AI a platform shift or a paradigm shift? With Benedict Evans
Is robotics about to have its own ChatGPT moment?
Researchers are using generative AI and other techniques to teach robots new skills—including tasks they could perform in homes.
… What separates this new crop of robots is their software. Instead of the traditional painstaking planning and training, roboticists have started using deep learning and neural networks to create systems that learn from their environment on the go and adjust their behavior accordingly. At the same time, new, cheaper hardware, such as off-the-shelf components and robots like Stretch, is making this sort of experimentation more accessible.
Broadly speaking, researchers are using AI to train robots in two popular ways: reinforcement learning, an AI technique in which systems improve through trial and error, used to get robots to adapt their movements to new environments; and imitation learning, in which models learn to perform tasks by, for example, imitating the actions of a human teleoperating a robot or using a VR headset to collect data on a robot. — Read More
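The trial-and-error idea behind reinforcement learning can be shown in miniature. The epsilon-greedy bandit below is a toy stand-in for that idea, not the robotics systems from the article; all names and parameters here are illustrative:

```python
import random

def train_bandit(reward_probs, steps=5000, eps=0.1, seed=0):
    """Minimal trial-and-error learner (epsilon-greedy bandit).

    The agent repeatedly picks an action, observes a reward, and
    updates a running estimate of each action's value, improving
    purely through trial and error.
    """
    rng = random.Random(seed)
    n = len(reward_probs)
    counts = [0] * n
    values = [0.0] * n                       # running reward estimates
    for _ in range(steps):
        if rng.random() < eps:
            a = rng.randrange(n)             # explore: try a random action
        else:
            a = max(range(n), key=lambda i: values[i])  # exploit best so far
        r = 1.0 if rng.random() < reward_probs[a] else 0.0
        counts[a] += 1
        values[a] += (r - values[a]) / counts[a]        # incremental mean
    return values

est = train_bandit([0.2, 0.8, 0.5])
# with enough trials, the estimate for action 1 ends up highest
```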
Klarna CEO says AI can do the job of 700 workers. But job replacement isn’t the biggest issue.
Fintech company Klarna, which powers e-commerce transactions for some of the world’s most recognizable brands, including Expedia, Macy’s and Nike, is at the forefront of AI adoption. It has integrated artificial intelligence across the company, most notably with an AI chatbot that it recently said does the equivalent work of 700 customer service agents. Klarna, which employs roughly 4,000 people, recently released statistics that show how efficient and effective the tool has been, wading into the thick of sensitive and high-stakes debates about the role of generative AI in business, how humans interact with it and its implications for the future of work. CEO Sebastian Siemiatkowski explains why he is so transparent about AI’s capabilities, and what concerns him most about the new technology. This interview has been edited for length and clarity. — Read More
#augmented-intelligence
Amazon CEO: “We’re deeply investing” in generative AI
Amazon CEO Andy Jassy revealed details about the company’s investments in generative AI in his annual shareholder letter published Thursday morning.
“…[T]here are three distinct layers in the GenAI stack, each of which is gigantic, and each of which we’re deeply investing,” Jassy writes.
The “bottom layer” of Amazon’s AI strategy is to help developers and companies train models and produce predictions. Amazon says having its own custom AI training and inference chips will bring down costs for customers.
A “middle layer” serves companies that want to use their own data to customize existing foundational models and gain security and other features to build and scale generative AI applications.
The “top layer” is where Amazon builds generative AI applications for its own consumer businesses. For example, there’s “Rufus,” Amazon’s AI-powered shopping assistant, and Amazon Web Services’ “Amazon Q.” — Read More