… Reid [Hoffman] recently sat down with Mustafa [Suleyman] to discuss the ever-changing landscape of artificial intelligence, as well as the ideals that were essential in creating the AI assistant, Pi. And we’re so excited to share this interview with you today, because it’s the perfect prologue to our upcoming miniseries, AI and You, where Reid will talk with an array of AI leaders, including Mustafa, to explore how you can harness AI to scale your productivity, your business, and yourself, while staying safe in the process. — Read More
Recent Updates
ChatGPT can now search the web in real time
OpenAI posted today that ChatGPT can once more trawl the web for current information, offering answers taken directly from “current and authoritative” sources, which it cites in its responses. The feature, called Browse with Bing, is only open to those with Plus and Enterprise subscriptions for now, but the company says it will roll it out “to all users soon.” — Read More
Multi-Modal AI is a UX Problem
Transformers and other AI breakthroughs have shown state-of-the-art performance across different modalities.
The next frontier in AI is combining these modalities in interesting ways. Explain what’s happening in a photo. Debug a program with your voice. Generate music from an image. There’s still technical work to be done with combining these modalities, but the greatest challenge is not a technical one but a user experience one.
What is the right UX for these use cases? — Read More
GPT-4V(ision) System Card — Safety Properties of GPT-4V
GPT-4 with vision (GPT-4V) enables users to instruct GPT-4 to analyze image inputs provided by the user, and is the latest capability we are making broadly available. Incorporating additional modalities (such as image inputs) into large language models (LLMs) is viewed by some as a key frontier in artificial intelligence research and development [1, 2, 3]. Multimodal LLMs offer the possibility of expanding the impact of language-only systems with novel interfaces and capabilities, enabling them to solve new tasks and provide novel experiences for their users.
In this system card, [4, 5] we analyze the safety properties of GPT-4V. Our work on safety for GPT-4V builds on the work done for GPT-4 [7] and here we dive deeper into the evaluations, preparation, and mitigation work done specifically for image inputs. — Read More
ChatGPT can now see, hear, and speak
We are beginning to roll out new voice and image capabilities in ChatGPT. They offer a new, more intuitive type of interface by allowing you to have a voice conversation or show ChatGPT what you’re talking about.
Voice and image give you more ways to use ChatGPT in your life. Snap a picture of a landmark while traveling and have a live conversation about what’s interesting about it. When you’re home, snap pictures of your fridge and pantry to figure out what’s for dinner (and ask follow-up questions for a step-by-step recipe). After dinner, help your child with a math problem by taking a photo, circling the problem set, and having it share hints with both of you.
We’re rolling out voice and images in ChatGPT to Plus and Enterprise users over the next two weeks. Voice is coming on iOS and Android (opt-in in your settings) and images will be available on all platforms. — Read More
Spotify is going to clone podcasters’ voices — and translate them to other languages
A partnership with OpenAI will let podcasters replicate their voices to automatically create foreign-language versions of their shows.
What if podcasters could flip a switch and instantly speak another language? That’s the premise behind Spotify’s new AI-powered voice translation feature, which reproduces podcasts in other languages using the podcaster’s own voice.
The company has partnered with a handful of podcasters to translate their English-language episodes into Spanish with its new tool, and it has plans to roll out French and German translations in the coming weeks. — Read More
AI Revolution: Top Lessons from OpenAI, Anthropic, CharacterAI, & More
The AI Revolution is here. In this episode, you’ll learn about the most important themes that some of the world’s most prominent AI builders – from OpenAI, Anthropic, CharacterAI, Roblox, and more – are paying attention to. You’ll hear about the economics of AI, broad vs. specialized models, the importance of UX, and whether we can expect scaling laws to continue. — Read More
The Reversal Curse: LLMs trained on “A is B” fail to learn “B is A”
We expose a surprising failure of generalization in auto-regressive large language models (LLMs). If a model is trained on a sentence of the form “A is B”, it will not automatically generalize to the reverse direction “B is A”. This is the Reversal Curse. For instance, if a model is trained on “Olaf Scholz was the ninth Chancellor of Germany”, it will not automatically be able to answer the question, “Who was the ninth Chancellor of Germany?”. Moreover, the likelihood of the correct answer (“Olaf Scholz”) will not be higher than for a random name. Thus, models exhibit a basic failure of logical deduction and do not generalize a prevalent pattern in their training set (i.e. if “A is B” occurs, “B is A” is more likely to occur). We provide evidence for the Reversal Curse by finetuning GPT-3 and Llama-1 on fictitious statements such as “Uriah Hawthorne is the composer of ‘Abyssal Melodies'” and showing that they fail to correctly answer “Who composed ‘Abyssal Melodies?'”. The Reversal Curse is robust across model sizes and model families and is not alleviated by data augmentation. We also evaluate ChatGPT (GPT-3.5 and GPT-4) on questions about real-world celebrities, such as “Who is Tom Cruise’s mother? [A: Mary Lee Pfeiffer]” and the reverse “Who is Mary Lee Pfeiffer’s son?”. GPT-4 correctly answers questions like the former 79% of the time, compared to 33% for the latter. This shows a failure of logical deduction that we hypothesize is caused by the Reversal Curse. — Read More
Paper
Code is available at https://github.com/lukasberglund/reversal_curse.
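The paper’s evaluation idea – ask the same fact in both directions and compare accuracy – can be illustrated with a minimal sketch. This is not code from the paper’s repository; the fact list, question templates, and the `toy_model` stand-in for a real LLM call are all illustrative assumptions:

```python
# Sketch of the forward/reverse probe behind the Reversal Curse:
# for facts of the form "A is B", ask both "what is A's B?" and the
# reverse, then compare accuracy. `toy_model` is a hypothetical
# stand-in for querying a real LLM.

facts = [
    ("Tom Cruise", "Mary Lee Pfeiffer"),  # (child, mother) pair from the paper's example
]

def forward_question(child, mother):
    return f"Who is {child}'s mother?"

def reverse_question(child, mother):
    return f"Who is {mother}'s son?"

def accuracy(qa_pairs, query_model):
    """Fraction of questions the model answers exactly correctly."""
    correct = sum(1 for question, answer in qa_pairs if query_model(question) == answer)
    return correct / len(qa_pairs)

# Build both probe sets from the same underlying facts.
forward = [(forward_question(c, m), m) for c, m in facts]
reverse = [(reverse_question(c, m), c) for c, m in facts]

# A toy model that only "knows" the forward direction reproduces the gap:
known = {q: a for q, a in forward}
toy_model = lambda q: known.get(q, "unknown")

forward_acc = accuracy(forward, toy_model)  # 1.0
reverse_acc = accuracy(reverse, toy_model)  # 0.0
```

The paper reports the same asymmetry in the wild for GPT-4 (79% forward vs. 33% reverse on celebrity-parent questions); the sketch just makes the direction of the probe concrete.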
Everyone is above average
Congratulations – you are now above average!
It may sound like an old, bad statistics joke, but I mean it quite literally. We now have very strong evidence that AI elevates the skills of the lowest performers across a wide range of fields to, or even far above, what was previously average performance. — Read More
An NYPD security robot will be patrolling the Times Square subway station
The New York Police Department (NYPD) is implementing a new security measure at the Times Square subway station. It’s deploying a security robot to patrol the premises, which authorities say is meant to “keep you safe.” We’re not talking about a RoboCop-like machine or any human-like biped robot — the K5, which was made by California-based company Knightscope, looks like a massive version of R2-D2. Albert Fox Cahn, the executive director of privacy rights group Surveillance Technology Oversight Project, has a less flattering description for it, though, and told The New York Times that it’s like a “trash can on wheels.” — Read More