In this tutorial we will show you how anyone can build their own open-source ChatGPT without ever writing a single line of code! We’ll use the LLaMA 2 base model, fine-tune it for chat with an open-source instruction dataset, and then deploy the model to a chat app you can share with your friends. All by just clicking our way to greatness.
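The article itself is no-code, but the tools it clicks through still have to do one concrete thing under the hood: wrap each instruction/response pair from the dataset in LLaMA 2’s chat template before fine-tuning. A minimal sketch of that step is below; the record fields (`instruction`, `response`) are a hypothetical dataset schema, while the `[INST]`/`<<SYS>>` markup is LLaMA 2’s published chat format.

```python
# Sketch of the prompt formatting a no-code fine-tuning tool performs
# behind the scenes for LLaMA 2 chat. Field names are illustrative.

SYSTEM = "You are a helpful assistant."

def to_llama2_chat(record: dict) -> str:
    """Wrap one instruction/response pair in the LLaMA 2 chat template."""
    return (
        f"<s>[INST] <<SYS>>\n{SYSTEM}\n<</SYS>>\n\n"
        f"{record['instruction']} [/INST] {record['response']} </s>"
    )

example = {
    "instruction": "Name the capital of France.",
    "response": "The capital of France is Paris.",
}
prompt = to_llama2_chat(example)
print(prompt)
```

Every record in the instruction dataset gets the same treatment, and the resulting strings are what the trainer actually sees.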
Why is this important? Well, machine learning, especially LLMs (Large Language Models), has witnessed an unprecedented surge in popularity, becoming a critical tool in our personal and business lives. Yet, for most outside the specialized niche of ML engineering, the intricacies of training and deploying these models appear beyond reach. If the anticipated future of machine learning is to be one filled with ubiquitous personalized models, then there’s an impending challenge ahead: How do we empower those with non-technical backgrounds to harness this technology independently? — Read More
Monthly Archives: September 2023
AI, Hardware, and Virtual Reality
In a recent interview I did with Craig Moffett we discussed why there is a “TMT” sector when it comes to industry classifications. TMT stands for technology, media, and telecoms, and what unifies them is that all three deal in a world of massive up-front investment — i.e. huge fixed costs — and then near-perfect scalability once deployed — zero marginal costs.
Each of these three categories, though, is distinct in the experience. …Another way to think about these categories is that if reality is the time and place in which one currently exists, each provides a form of virtual reality. — Read More
Empathy in AI
… Reid [Hoffman] recently sat down with Mustafa [Suleyman] to discuss the ever-changing landscape of artificial intelligence, as well as the ideals that were essential in creating the AI assistant, Pi. And we’re so excited to share this interview with you today, because it’s the perfect prologue to our upcoming miniseries, AI and You, where Reid will talk with an array of AI leaders, including Mustafa, to explore how you can harness AI to scale your productivity, your business, and yourself, while staying safe in the process. — Read More
ChatGPT can now search the web in real time
OpenAI posted today that ChatGPT can once more trawl the web for current information, offering answers taken directly from “current and authoritative” sources, which it cites in its responses. The feature, called Browse with Bing, is only open to those with Plus and Enterprise subscriptions for now, but the company says it will roll it out “to all users soon.” — Read More
Multi-Modal AI is a UX Problem
Transformers and other AI breakthroughs have shown state-of-the-art performance across different modalities.
The next frontier in AI is combining these modalities in interesting ways. Explain what’s happening in a photo. Debug a program with your voice. Generate music from an image. There’s still technical work to be done with combining these modalities, but the greatest challenge is not a technical one but a user experience one.
What is the right UX for these use cases? — Read More
GPT-4V(ision) System Card — Safety Properties of GPT-4V
GPT-4 with vision (GPT-4V) enables users to instruct GPT-4 to analyze image inputs provided by the user, and is the latest capability we are making broadly available. Incorporating additional modalities (such as image inputs) into large language models (LLMs) is viewed by some as a key frontier in artificial intelligence research and development [1, 2, 3]. Multimodal LLMs offer the possibility of expanding the impact of language-only systems with novel interfaces and capabilities, enabling them to solve new tasks and provide novel experiences for their users.
In this system card [4, 5], we analyze the safety properties of GPT-4V. Our work on safety for GPT-4V builds on the work done for GPT-4 [7] and here we dive deeper into the evaluations, preparation, and mitigation work done specifically for image inputs. — Read More
ChatGPT can now see, hear, and speak
We are beginning to roll out new voice and image capabilities in ChatGPT. They offer a new, more intuitive type of interface by allowing you to have a voice conversation or show ChatGPT what you’re talking about.
Voice and image give you more ways to use ChatGPT in your life. Snap a picture of a landmark while traveling and have a live conversation about what’s interesting about it. When you’re home, snap pictures of your fridge and pantry to figure out what’s for dinner (and ask follow-up questions for a step-by-step recipe). After dinner, help your child with a math problem by taking a photo, circling the problem set, and having it share hints with both of you.
We’re rolling out voice and images in ChatGPT to Plus and Enterprise users over the next two weeks. Voice is coming on iOS and Android (opt-in in your settings) and images will be available on all platforms. — Read More
Spotify is going to clone podcasters’ voices — and translate them to other languages
A partnership with OpenAI will let podcasters replicate their voices to automatically create foreign-language versions of their shows.
What if podcasters could flip a switch and instantly speak another language? That’s the premise behind Spotify’s new AI-powered voice translation feature, which reproduces podcasts in other languages using the podcaster’s own voice.
The company has partnered with a handful of podcasters to translate their English-language episodes into Spanish with its new tool, and it has plans to roll out French and German translations in the coming weeks. — Read More
AI Revolution: Top Lessons from OpenAI, Anthropic, CharacterAI, & More
The AI Revolution is here. In this episode, you’ll learn the most important themes that some of the world’s most prominent AI builders — from OpenAI, Anthropic, CharacterAI, Roblox, and more — are paying attention to. You’ll hear about the economics of AI, broad vs. specialized models, the importance of UX, and whether we can expect scaling laws to continue. — Read More
The Reversal Curse: LLMs trained on “A is B” fail to learn “B is A”
We expose a surprising failure of generalization in auto-regressive large language models (LLMs). If a model is trained on a sentence of the form “A is B”, it will not automatically generalize to the reverse direction “B is A”. This is the Reversal Curse. For instance, if a model is trained on “Olaf Scholz was the ninth Chancellor of Germany”, it will not automatically be able to answer the question, “Who was the ninth Chancellor of Germany?”. Moreover, the likelihood of the correct answer (“Olaf Scholz”) will not be higher than for a random name. Thus, models exhibit a basic failure of logical deduction and do not generalize a prevalent pattern in their training set (i.e. if “A is B” occurs, “B is A” is more likely to occur). We provide evidence for the Reversal Curse by finetuning GPT-3 and Llama-1 on fictitious statements such as “Uriah Hawthorne is the composer of ‘Abyssal Melodies’” and showing that they fail to correctly answer “Who composed ‘Abyssal Melodies’?”. The Reversal Curse is robust across model sizes and model families and is not alleviated by data augmentation. We also evaluate ChatGPT (GPT-3.5 and GPT-4) on questions about real-world celebrities, such as “Who is Tom Cruise’s mother? [A: Mary Lee Pfeiffer]” and the reverse “Who is Mary Lee Pfeiffer’s son?”. GPT-4 correctly answers questions like the former 79% of the time, compared to 33% for the latter. This shows a failure of logical deduction that we hypothesize is caused by the Reversal Curse. — Read More
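The directional asymmetry the abstract describes can be illustrated with a deliberately crude toy (this is not the paper’s method, which fine-tunes GPT-3 and Llama-1): a “model” that, like a forward-only auto-regressive objective, only learns which word follows each left-to-right prefix it saw in training. The fictitious sentence is the paper’s own example.

```python
# Toy illustration of the Reversal Curse: a memorizer of left-to-right
# continuations (standing in for an auto-regressive LM) completes the
# trained direction but has nothing to say about the reversed query.

training = ["Uriah Hawthorne is the composer of Abyssal Melodies"]

# "Train": for every prefix of each sentence, record the next word.
continuations = {}
for sentence in training:
    words = sentence.split()
    for i in range(1, len(words)):
        continuations[" ".join(words[:i])] = words[i]

def complete(prefix: str) -> str:
    """Greedily extend a prefix one word at a time, like forward generation."""
    words = prefix.split()
    while " ".join(words) in continuations:
        words.append(continuations[" ".join(words)])
    return " ".join(words)

# Forward ("A is B") succeeds: this prefix occurred in training.
print(complete("Uriah Hawthorne is"))
# Reverse ("B is A") fails: no training prefix begins with the object,
# so the query comes back unextended.
print(complete("The composer of Abyssal Melodies is"))
```

The point of the toy is only that a forward-factored training signal never creates the reverse association; the paper shows the same asymmetry persists in real LLMs and is not fixed by scale or data augmentation.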
Paper
Code is available at https://github.com/lukasberglund/reversal_curse.