This paper convinced me LLMs are not just “applied statistics”, but learn world models and structure

https://thegradient.pub/othello/

You can look at an LLM trained on Othello moves and extract from its internal state the current state of the board after each move you give it. In other words, an LLM trained only on moves like “E3, D3, …” contains within it a model of an 8×8 board grid and the current state of each square. — Read More
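The paper's core technique is a linear probe: a small classifier trained to read a board square's state out of the model's hidden activations. A minimal sketch of that idea, using fabricated stand-in data (real Othello-GPT activations would replace `H`; the dimensions and labels here are illustrative, not the paper's):

```python
import numpy as np

# Probing sketch: given hidden activations H (one vector per move position)
# and labels y for one board square (0 = empty, 1 = black, 2 = white),
# fit a linear softmax probe and check it recovers the square's state.
# The data below is synthetic and linearly decodable by construction,
# standing in for the paper's finding that real activations are too.
rng = np.random.default_rng(0)
n, d, n_classes = 600, 64, 3
W_true = rng.normal(size=(d, n_classes))
H = rng.normal(size=(n, d))             # stand-in hidden states
y = (H @ W_true).argmax(axis=1)         # stand-in square labels

# Softmax-regression probe trained by plain gradient descent.
W = np.zeros((d, n_classes))
for _ in range(300):
    logits = H @ W
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    p[np.arange(n), y] -= 1.0           # gradient of cross-entropy w.r.t. logits
    W -= 0.1 * (H.T @ p) / n

acc = ((H @ W).argmax(axis=1) == y).mean()
```

In the paper, one such probe is trained per board square, and high probe accuracy is the evidence that the board state is encoded in the activations.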

#nlp

Exploring Artificial Intelligence’s Potential & Threats | Andrew Ng | Eye on AI #131

Read More

#videos

Language to rewards for robotic skill synthesis

Empowering end-users to interactively teach robots to perform novel tasks is a crucial capability for their successful integration into real-world applications. For example, a user may want to teach a robot dog to perform a new trick, or teach a manipulator robot how to organize a lunch box based on user preferences. The recent advancements in large language models (LLMs) pre-trained on extensive internet data have shown a promising path towards achieving this goal. Indeed, researchers have explored diverse ways of leveraging LLMs for robotics, from step-by-step planning and goal-oriented dialogue to robot-code-writing agents.

While these methods impart new modes of compositional generalization, they focus on using language to link together new behaviors from an existing library of control primitives that are either manually engineered or learned a priori. Despite having internal knowledge about robot motions, LLMs struggle to directly output low-level robot commands due to the limited availability of relevant training data. As a result, the expression of these methods is bottlenecked by the breadth of the available primitives, the design of which often requires extensive expert knowledge or massive data collection.

In “Language to Rewards for Robotic Skill Synthesis”, we propose an approach to enable users to teach robots novel actions through natural language input. — Read More
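The key move in the approach above is that the LLM does not emit low-level commands; it writes a reward function over robot state, which a separate low-level optimizer then maximizes. A hedged sketch of what such an LLM-written reward might look like for “make the robot dog sit” — the state fields and target values here are hypothetical illustrations, not the paper's actual interface:

```python
# Hypothetical LLM-generated reward for "make the robot dog sit".
# Field names ("rear_hip_height", etc.) and target values are made up
# for illustration; a real system would use its simulator's state schema.
def sit_reward(state: dict) -> float:
    """Higher is better: rear hips low, front hips upright, body pitched back."""
    rear_term = -abs(state["rear_hip_height"] - 0.1)    # rear near the ground
    front_term = -abs(state["front_hip_height"] - 0.3)  # front legs extended
    pitch_term = -abs(state["body_pitch"] - 0.6)        # body tilted back (rad)
    return rear_term + front_term + pitch_term

# A sitting pose should score higher than a standing one.
sitting = {"rear_hip_height": 0.1, "front_hip_height": 0.3, "body_pitch": 0.6}
standing = {"rear_hip_height": 0.3, "front_hip_height": 0.3, "body_pitch": 0.0}
```

The division of labor is the point: language models are good at expressing *what* a behavior should look like as a differentiable-ish score, while a model-predictive controller handles *how* to achieve it.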

#robotics

Introducing IDEFICS: An Open Reproduction of State-of-the-Art Visual Language Model

We are excited to release IDEFICS (Image-aware Decoder Enhanced à la Flamingo with Interleaved Cross-attentionS), an open-access visual language model. IDEFICS is based on Flamingo, a state-of-the-art visual language model initially developed by DeepMind, which has not been released publicly. Similarly to GPT-4, the model accepts arbitrary sequences of image and text inputs and produces text outputs. IDEFICS is built solely on publicly available data and models (LLaMA v1 and OpenCLIP) and comes in two variants—the base version and the instructed version. Each variant is available at the 9 billion and 80 billion parameter sizes. — Read More

#image-recognition, #nlp

Artist-created images and animations about artificial intelligence (AI) made freely available online

What does artificial intelligence (AI) look like? Searching online, the answer is likely streams of code, glowing blue brains or white robots with men in suits.

… Since launching, Visualising AI has commissioned 13 artists to create more than 100 artworks, gaining over 100 million views and 800,000 downloads; the imagery has been used by media outlets, research organisations and civil society organisations. — Read More

View images on Unsplash

View videos on Pexels

#big7

Google and YouTube are trying to have it both ways with AI and copyright

Google has made clear it is going to use the open web to inform and create anything it wants, and nothing can get in its way. Except maybe Frank Sinatra.

There’s only one name that springs to mind when you think of the cutting edge in copyright law online: Frank Sinatra.

There’s nothing more important than making sure his estate — and his label, Universal Music Group — gets paid when people do AI versions of Ol’ Blue Eyes singing “Get Low” on YouTube, right? Even if that means creating an entirely new class of extralegal contractual royalties for big music labels just to protect the online dominance of your video platform while simultaneously insisting that training AI search results on books and news websites without paying anyone is permissible fair use? Right? Right? — Read More

#legal

The human costs of the AI boom

If you use apps from world-leading technology companies such as OpenAI, Amazon, Microsoft or Google, there is a good chance you have already consumed services produced by online remote work — also known as cloudwork. Big and small organizations across the economy increasingly rely on outsourced labor available to them via platforms like Scale AI, Freelancer.com, Amazon Mechanical Turk, Fiverr and Upwork.

Recently, these platforms have become crucial for artificial intelligence (AI) companies to train their AI systems and ensure they operate correctly. OpenAI is a client of Scale AI and Remotasks, labeling data for their apps ChatGPT and DALL-E. Social networks hire platforms for content moderation. Beyond the tech world, universities, businesses and NGOs (nongovernmental organizations) regularly use these platforms to hire translators, graphic designers or IT experts.

Cloudwork platforms have become an essential earning opportunity for a rising number of people. A breakout study by the University of Oxford scholars Otto Kässi, Vili Lehdonvirta and Fabian Stephany estimated that more than 163 million people have registered on those websites. — Read More

#ethics

This AI Watches Millions Of Cars Daily And Tells Cops If You’re Driving Like A Criminal

Artificial intelligence is helping American cops look for “suspicious” patterns of movement, digging through license plate databases with billions of records. A drug trafficking case in New York has uncloaked — and challenged — one of the biggest rollouts of the controversial technology to date.

In March 2022, David Zayas was driving down the Hutchinson River Parkway in Scarsdale. His car, a gray Chevrolet, was entirely unremarkable, as was its speed. But to the Westchester County Police Department, the car was cause for concern and Zayas a possible criminal; its powerful new AI tool had identified the vehicle’s behavior as suspicious.

Searching through a database of 1.6 billion license plate records collected over the last two years from locations across New York State, the AI determined that Zayas’ car was on a journey typical of a drug trafficker. — Read More

#surveillance

Introducing SeamlessM4T, a Multimodal AI Model for Speech and Text Translations

The world we live in has never been more interconnected, giving people access to more multilingual content than ever before. This also makes the ability to communicate and understand information in any language increasingly important.

Today, we’re introducing SeamlessM4T, the first all-in-one multimodal and multilingual AI translation model that allows people to communicate effortlessly through speech and text across different languages. — Read More

#big7, #translation