A temporary pause on training extra large language models

Breaking news: The letter that I mentioned earlier today is now public. It calls for a 6-month moratorium on training systems that are “more powerful than GPT-4”. A lot of notable people signed. I joined in.

I had no hand in drafting it, and there are things to fuss over (e.g., what exactly counts as more powerful than GPT-4? and how would we know, given that no details of GPT-4’s architecture or training set have been published?)—but the spirit of the letter is one that I support: until we get a better handle on the risks and benefits, we should proceed with caution.

It will be very interesting to see what happens next. Read More

#trust

Microsoft’s latest use for GPT-4: Stopping hackers

The tech giant unveiled new cybersecurity software, escalating the arms race between defenders and hackers

Microsoft’s rapid campaign to integrate new artificial intelligence technology into its broad range of products continued Tuesday as the tech giant announced a new cybersecurity “co-pilot” meant to help companies track and defend against hacking attempts, upping the ante in the never-ending arms race between hackers and the cybersecurity professionals trying to keep them at bay.

It’s the latest salvo in Microsoft’s battle with Google and other tech companies to dominate the fast-growing field of “generative” AI, though it’s still unclear whether the flurry of product launches, demos and proclamations from executives will change the tech industry as dramatically as leaders are predicting. Read More

#cyber

Defensibility in the Age of AI

Tl;dr: Companies with technology that allows them to uniquely generate the data needed to train and fine-tune models are well positioned to create enduring value in the age of AI. The best AI companies may be those building in atoms and not just bits.

The pace of development in AI has given many the feeling that the ground is shifting under their feet. While incredibly exciting, this has led to a fair amount of anxiety among entrepreneurs who are wondering if there’s any true defensibility in what they’re building. A battle-tested strategy in startups is to build a product that’s at least 10x better, 10x cheaper, or 10x easier than what exists while you march toward a long-term moat. But given how quickly AI development is advancing, last month’s 10x product may be obsolete this month. The fear is real. Read More

#strategy

#ai-first

Google’s New AI: DALL-E 2, But For Music!

Read More

#audio, #videos

MM-REACT: Prompting ChatGPT for Multimodal Reasoning and Action

We propose MM-REACT, a system paradigm that integrates ChatGPT with a pool of vision experts to achieve multimodal reasoning and action. In this paper, we define and explore a comprehensive list of advanced vision tasks that are intriguing to solve, but may exceed the capabilities of existing vision and vision-language models. To achieve such advanced visual intelligence, MM-REACT introduces a textual prompt design that can represent text descriptions, textualized spatial coordinates, and aligned file names for dense visual signals such as images and videos. MM-REACT’s prompt design allows language models to accept, associate, and process multimodal information, thereby facilitating the synergetic combination of ChatGPT and various vision experts. Zero-shot experiments demonstrate MM-REACT’s effectiveness in addressing the specified capabilities of interest and its wide application in different scenarios that require advanced visual understanding. Furthermore, we discuss and compare MM-REACT’s system paradigm with an alternative approach that extends language models for multimodal scenarios through joint finetuning. Code, demo, video, and visualization are available at https://multimodal-react.github.io/.
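
The core mechanism described here is easy to picture: each vision expert’s output is rendered as plain text (a caption, box coordinates, the image’s file name) and spliced into the prompt so a text-only chat model can reason over it. Below is a minimal, hypothetical Python sketch of that textualization step; the helper names and output format are my own illustration, not the authors’ code (the real implementation is at the project page above).

```python
# Hypothetical sketch: textualize vision-expert outputs (captions, boxes,
# file names) into a single prompt a text-only chat model can reason over.
# Names and formats are illustrative, not MM-REACT's actual implementation.

from dataclasses import dataclass


@dataclass
class DetectedObject:
    label: str
    box: tuple  # (x1, y1, x2, y2) in pixels


def textualize_image(file_name: str, caption: str, objects: list) -> str:
    """Turn one image's vision-expert outputs into plain text."""
    lines = [f"Image file: {file_name}", f"Caption: {caption}"]
    for obj in objects:
        x1, y1, x2, y2 = obj.box
        lines.append(f"Object: {obj.label} at ({x1}, {y1}, {x2}, {y2})")
    return "\n".join(lines)


def build_prompt(user_question: str, image_context: str) -> str:
    """Compose the final prompt sent to the chat model."""
    return (
        "You are given textualized outputs from vision tools.\n\n"
        f"{image_context}\n\n"
        f"Question: {user_question}\n"
        "Answer using only the information above."
    )


if __name__ == "__main__":
    context = textualize_image(
        "kitchen.jpg",
        "a person pouring coffee next to a laptop",
        [DetectedObject("person", (40, 30, 210, 400)),
         DetectedObject("laptop", (250, 180, 470, 330))],
    )
    print(build_prompt("Is anyone working at a computer?", context))
```
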

Read More

#chatbots

Superhuman: What can AI do in 30 minutes?

The thing that we have to come to grips with in a world of ubiquitous, powerful AI tools is how much it can do for us. The multiplier on human effort is unprecedented, and potentially disruptive. But this fact can often feel abstract.

So I decided to run an experiment. I gave myself 30 minutes, and tried to accomplish as much as I could during that time on a single business project. At the end of 30 minutes I would stop. The project: to market the launch of a new educational game. AI would do all the work; I would just offer directions.

And what it accomplished was superhuman. I will go through the details in a moment, but, in 30 minutes it: did market research, created a positioning document, wrote an email campaign, created a website, created a logo and “hero shot” graphic, made a social media campaign for multiple platforms, and scripted and created a video. In 30 minutes. Read More

#chatbots, #augmented-intelligence

Principled Reinforcement Learning with Human Feedback from Pairwise or K-wise Comparisons

We provide a theoretical framework for Reinforcement Learning with Human Feedback (RLHF). Our analysis shows that when the true reward function is linear, the widely used maximum likelihood estimator (MLE) converges under both the Bradley-Terry-Luce (BTL) model and the Plackett-Luce (PL) model. However, we show that when training a policy based on the learned reward model, MLE fails while a pessimistic MLE provides policies with improved performance under certain coverage assumptions. Additionally, we demonstrate that under the PL model, the true MLE and an alternative MLE that splits the K-wise comparison into pairwise comparisons both converge. Moreover, the true MLE is asymptotically more efficient. Our results validate the empirical success of existing RLHF algorithms in InstructGPT and provide new insights for algorithm design. We also unify the problem of RLHF and max-entropy Inverse Reinforcement Learning (IRL), and provide the first sample complexity bound for max-entropy IRL. Read More
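
For reference, the Bradley-Terry-Luce model named in the abstract is the standard pairwise-preference likelihood that the MLE maximizes; the sketch below uses conventional notation (a reward model r_θ, which in the paper’s linear setting would be θ applied to a feature map of the prompt and response), not the paper’s exact statement.

```latex
% Standard BTL pairwise-preference model (conventional notation; a sketch,
% not the paper's exact statement). Probability that response y_1 is
% preferred to y_2 given prompt x, under reward model r_\theta:
\[
  P\left(y_1 \succ y_2 \mid x\right)
    = \frac{\exp\left(r_\theta(x, y_1)\right)}
           {\exp\left(r_\theta(x, y_1)\right) + \exp\left(r_\theta(x, y_2)\right)}
\]

% The MLE fits \theta by maximizing the log-likelihood of observed
% comparisons (y_w preferred to y_l for prompt x^{(i)}); \sigma is the
% logistic function:
\[
  \hat{\theta}_{\mathrm{MLE}}
    = \arg\max_{\theta} \sum_{i} \log \sigma\!\left(
        r_\theta\bigl(x^{(i)}, y_w^{(i)}\bigr) - r_\theta\bigl(x^{(i)}, y_l^{(i)}\bigr)
      \right),
  \qquad \sigma(z) = \frac{1}{1 + e^{-z}}
\]
```
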

#reinforcement-learning

Sam Altman: OpenAI CEO on GPT-4, ChatGPT, and the Future of AI

Read More

#singularity, #videos

It’s Game Over on Vocal Deepfakes

You may recall back in October I linked to an AI-generated simulated interview between Joe Rogan and Steve Jobs. I wrote:

I also don’t buy their claim that these voices are completely generated. Most of Jobs’s lines have auditorium echo — they sound like clips copy-and-pasted. If they can really generate these voices, why doesn’t their virtual Rogan actually say Steve Jobs’s name? Send me a clip of virtual Steve Jobs saying “John Gruber is a bozo, and I tell people not to waste their time reading Daring Fireball.” Then I’ll believe it.

I neglected to follow up until now, but Ignaz Kowalczuk from ElevenLabs (the company behind Prime Voice AI) took me up on the challenge and sent me this clip:

That clip sounds noticeably stilted, but it does sound like Steve Jobs.

Now comes this: a Twitter thread from John Meyer, who trained a clone of Jobs’s voice and then hooked it up to ChatGPT to generate the words. The clips he posted to Twitter are freakishly uncanny. Read More

#audio, #fake

The Age of AI and Our Human Future

Read More

#videos