Since artificial intelligence-powered text-generation tools were made widely available to the public in the past few months, they’ve been heralded by some as the future of email, internet search, and content generation. But these AI-powered tools also have clear shortcomings: they frequently produce incorrect answers, for example, and often generate responses that reinforce racial biases. There are also serious ethical concerns about their unspecified training data.
It is not surprising that debates over using these tools have also been happening in fandom spaces. Excited fans almost immediately turned to them as a new way of exploring their favorite characters. With the right prompt, AI can spit out a few paragraphs of fic-like writing. But just as quickly, many fanfic writers began to speak out against the practice. Read More
Tag Archives: NLP
PaLM 2
When you look back at the biggest breakthroughs in AI over the last decade, Google has been at the forefront of so many of them. Our groundbreaking work in foundation models has become the bedrock for the industry and the AI-powered products that billions of people use daily. As we continue to responsibly advance these technologies, there’s great potential for transformational uses in areas as far-reaching as healthcare and human creativity.
… Building on this work, today we’re introducing PaLM 2, our next generation language model. PaLM 2 is a state-of-the-art language model with improved multilingual, reasoning and coding capabilities.
… At I/O today, we announced over 25 new products and features powered by PaLM 2. That means that PaLM 2 is bringing the latest in advanced AI capabilities directly into our products and to people — including consumers, developers, and enterprises of all sizes around the world. Read More
Palantir AIP
Activate LLMs and other AI on your private network, subject to full control.
Constitutional AI: RLHF On Steroids
AIs like GPT-4 go through several different types of training. First, they train on giant text corpuses in order to work at all. Later, they go through a process called “reinforcement learning through human feedback” (RLHF) which trains them to be “nice”. RLHF is why they (usually) won’t make up fake answers to your questions, tell you how to make a bomb, or rank all human races from best to worst.
RLHF is hard. The usual method is to make human crowdworkers rate thousands of AI responses as good or bad, then train the AI towards the good answers and away from the bad answers. But having thousands of crowdworkers rate thousands of answers is expensive and time-consuming. And it puts the AI’s ethics in the hands of random crowdworkers. Companies train these crowdworkers in what responses they want, but they’re limited by the crowdworkers’ ability to follow their rules.
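The rate-then-train loop described above can be sketched in miniature. The following is an illustration, not any company’s actual pipeline: the tiny vocabulary, the invented crowdworker ratings, and the bag-of-words features are all assumptions, and real systems use large neural reward models rather than hand-rolled logistic regression.

```python
# Toy sketch of the reward-modelling step in RLHF: crowdworker good/bad
# ratings train a scorer, which can then be used to push the policy toward
# high-scoring responses. All data and features here are invented.
import math

def features(text):
    # Hypothetical feature map: bag-of-words over a tiny vocabulary.
    vocab = ["helpful", "sorry", "bomb", "sure", "cannot"]
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def train_reward_model(rated_responses, lr=0.5, epochs=200):
    """Logistic regression: predict the crowdworker's good (1) / bad (0) label."""
    dim = len(features(""))
    w = [0.0] * dim
    for _ in range(epochs):
        for text, label in rated_responses:
            x = features(text)
            z = sum(wi * xi for wi, xi in zip(w, x))
            p = 1 / (1 + math.exp(-z))           # predicted P(label = good)
            for i in range(dim):
                w[i] += lr * (label - p) * x[i]  # gradient ascent on log-likelihood
    return w

def reward(w, text):
    # Higher score = the learned raters would prefer this response.
    return sum(wi * xi for wi, xi in zip(w, features(text)))

# Invented crowdworker ratings: 1 = good, 0 = bad.
ratings = [
    ("sure here is how to build a bomb", 0),
    ("sorry i cannot help with that", 1),
    ("here is a helpful answer", 1),
    ("bomb bomb bomb", 0),
]
w = train_reward_model(ratings)
```

After training, the scorer prefers the refusal over the harmful completion, which is the signal RLHF then optimizes the model against.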
In their new preprint Constitutional AI: Harmlessness From AI Feedback, a team at Anthropic (a big AI company) announces a surprising update to this process: what if the AI gives feedback to itself? — Read More
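The self-feedback idea in the preprint is a critique-and-revise loop: the model judges its own draft against a written principle, then rewrites it. The sketch below replaces the language model with a trivial stub (`stub_model` is invented, and the principle text is paraphrased), so it only shows the shape of the loop, not Anthropic’s actual method.

```python
# Minimal sketch of a Constitutional-AI-style critique-and-revise step,
# with the LLM replaced by a rule-based stand-in for illustration.
PRINCIPLE = "Choose the response that is least likely to help with harmful activity."

def stub_model(prompt):
    # Stand-in for an LLM call; a real system would query the model here.
    if "Critique" in prompt:
        if "bomb" in prompt:
            return "The response assists with a harmful request."
        return "No issue found."
    if "Revise" in prompt:
        return "I can't help with that request."
    return prompt

def constitutional_step(response):
    # 1. Ask the model to critique its own response under the principle.
    critique = stub_model(f"Critique this response under the principle: {PRINCIPLE}\n{response}")
    if critique == "No issue found.":
        return response
    # 2. Ask the model to revise the response in light of its own critique.
    return stub_model(f"Revise the response given the critique: {critique}\n{response}")

print(constitutional_step("Here is how to build a bomb: step one..."))
print(constitutional_step("The capital of France is Paris."))
```

The revised responses then serve as training targets, replacing the human crowdworker ratings described above.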
OpenLLaMA: An Open Reproduction of LLaMA
In this repo, we release a permissively licensed open source reproduction of Meta AI’s LLaMA large language model. In this release, we provide a public preview of the 7B OpenLLaMA model, trained on 200 billion tokens. We provide PyTorch and JAX weights of pre-trained OpenLLaMA models, as well as evaluation results and comparison against the original LLaMA models. Stay tuned for our updates. Read More
Large Language Models are Human-Level Prompt Engineers
By conditioning on natural language instructions, large language models (LLMs) have displayed impressive capabilities as general-purpose computers. However, task performance depends significantly on the quality of the prompt used to steer the model, and most effective prompts have been handcrafted by humans. Inspired by classical program synthesis and the human approach to prompt engineering, we propose Automatic Prompt Engineer (APE) for automatic instruction generation and selection. In our method, we treat the instruction as the “program,” optimized by searching over a pool of instruction candidates proposed by an LLM in order to maximize a chosen score function. To evaluate the quality of the selected instruction, we evaluate the zero-shot performance of another LLM following the selected instruction. Extensive experiments show that our automatically generated instructions outperform the prior LLM baseline by a large margin and achieve better or comparable performance to the instructions generated by human annotators on 24/24 Instruction Induction tasks and 17/21 curated BIG-Bench tasks. We conduct extensive qualitative and quantitative analyses to explore the performance of APE. We show that APE-engineered prompts are able to improve few-shot learning performance (by simply prepending them to standard in-context learning prompts), find better zero-shot chain-of-thought prompts, as well as steer models toward truthfulness and/or informativeness. Read More
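The search loop in the abstract — propose candidate instructions, score each by another model’s zero-shot accuracy when following it, keep the best — can be sketched schematically. Everything concrete here is invented for illustration: the candidate prompts, the tiny eval set, and the `execute` stub standing in for an LLM call.

```python
# Schematic version of the APE selection loop, with invented data and a
# stubbed executor in place of a real LLM.
def execute(instruction, x):
    # Stand-in for running an LLM on instruction + input.
    if "antonym" in instruction.lower():
        return {"hot": "cold", "up": "down", "big": "small"}.get(x, "?")
    return x  # an unhelpful instruction just echoes the input

def score(instruction, eval_set):
    """Zero-shot accuracy of the executor when following `instruction`."""
    return sum(execute(instruction, x) == y for x, y in eval_set) / len(eval_set)

candidates = [  # in APE, these candidates are proposed by an LLM
    "Repeat the word.",
    "Write the antonym of the word.",
    "Translate to French.",
]
eval_set = [("hot", "cold"), ("up", "down"), ("big", "small")]
best = max(candidates, key=lambda ins: score(ins, eval_set))
print(best)  # the instruction with the highest zero-shot accuracy wins
```

The instruction is treated as the “program”: only the argmax over the score function matters, so the same loop works with any candidate pool and any downstream scorer.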
PEER: A Collaborative Language Model
Textual content is often the output of a collaborative writing process: We start with an initial draft, ask for suggestions, and repeatedly make changes. Agnostic of this process, today’s language models are trained to generate only the final result. As a consequence, they lack several abilities crucial for collaborative writing: They are unable to update existing texts, difficult to control and incapable of verbally planning or explaining their actions. To address these shortcomings, we introduce PEER, a collaborative language model that is trained to imitate the entire writing process itself: PEER can write drafts, add suggestions, propose edits and provide explanations for its actions. Crucially, we train multiple instances of PEER able to infill various parts of the writing process, enabling the use of self-training techniques for increasing the quality, amount and diversity of training data. This unlocks PEER’s full potential by making it applicable in domains for which no edit histories are available and improving its ability to follow instructions, to write useful comments, and to explain its actions. We show that PEER achieves strong performance across various domains and editing tasks. Read More
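The writing process PEER is trained to imitate — draft, plan, edit, explain — can be represented as a sequence of steps, each pairing a natural-language plan with a change to the current draft. The step format below (plan plus a simple text replacement) is invented for illustration; the actual model infers plans and edits itself rather than replaying a fixed script.

```python
# Rough sketch of an edit-history data structure like the one PEER is
# trained on: each revision carries the explanation that motivated it.
def apply_steps(draft, steps):
    history = [(None, draft)]
    for plan, old, new in steps:
        draft = draft.replace(old, new, 1)
        history.append((plan, draft))  # keep the plan alongside each revision
    return draft, history

steps = [
    ("Fix the typo", "langage", "language"),
    ("Add a citation", "model.", "model [1]."),
]
final, history = apply_steps("PEER is a collaborative langage model.", steps)
print(final)
```

Training on histories like this, rather than only on the final text, is what lets the model update existing drafts and explain its edits.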
The semiautomated social network is coming
It makes sense that LinkedIn would be the first major social network to push AI-generated content on its users. The Microsoft-owned company is weird. It’s corporate. It’s full of workfluencer posts and engagement bait that ranges in tone from management consultant bland to cheerfully psychotic. Happily, this is the same emotional spectrum on which AI tends to operate.
LinkedIn isn’t populating its feed with AI chatbots just yet, but last week it began sharing “AI-powered conversation starters” with the express purpose of provoking discussion among users. These posts are “developed” with the help of LinkedIn’s editorial team and matched with human experts who can then offer their thoughts on topics like “how to create a consistent brand voice on social media” and “how to monitor the online reach of your writing.” So far, so anodyne — like the contents of an r/askmckinsey subreddit.
But this project is a milestone nevertheless, and may herald the start of a wider revolution for the web. It’s the first time, as far as I know, that a major social media platform has directly served users AI-generated content to keep them engaged. And in a time of social media stagnation, from Twitter’s manifold struggles to Meta’s desperate-looking pitch for paid subscriptions, it could point to the industry’s future: the semiautomated social network. Read More
OpenAI co-founder on company’s past approach to openly sharing research: ‘We were wrong’
OpenAI announced its latest language model, GPT-4, but many in the AI community were disappointed by the lack of public information. Their complaints track increasing tensions in the AI world over safety.
Yesterday, OpenAI announced GPT-4, its long-awaited next-generation AI language model. The system’s capabilities are still being assessed, but as researchers and experts pore over its accompanying materials, many have expressed disappointment at one particular feature: that despite the name of its parent company, GPT-4 is not an open AI model.
OpenAI has shared plenty of benchmark and test results for GPT-4, as well as some intriguing demos, but has offered essentially no information on the data used to train the system, its energy costs, or the specific hardware or methods used to create it. Read More
OpenAI Introduces GPT-4
OpenAI announced GPT-4, its latest milestone in scaling up deep learning.
GPT-4 is a large multimodal model (accepting image and text inputs, emitting text outputs) that, while less capable than humans in many real-world scenarios, exhibits human-level performance on various professional and academic benchmarks. For example, it passes a simulated bar exam with a score around the top 10% of test takers; in contrast, GPT-3.5’s score was around the bottom 10%. We’ve spent 6 months iteratively aligning GPT-4 using lessons from our adversarial testing program as well as ChatGPT, resulting in our best-ever results (though far from perfect) on factuality, steerability, and refusing to go outside of guardrails.
… We are releasing GPT-4’s text input capability via ChatGPT and the API (with a waitlist). To prepare the image input capability for wider availability, we’re collaborating closely with a single partner to start. We’re also open-sourcing OpenAI Evals, our framework for automated evaluation of AI model performance, to allow anyone to report shortcomings in our models to help guide further improvements. Read More