SDXL consists of an ensemble-of-experts pipeline for latent diffusion: in the first step, the base model generates (noisy) latents, which are then further processed by a refinement model (available here: https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0/) specialized for the final denoising steps. Note that the base model can also be used as a standalone module.
Alternatively, we can use a two-stage pipeline as follows: First, the base model is used to generate latents of the desired output size. In the second step, we use a specialized high-resolution model and apply a technique called SDEdit (https://arxiv.org/abs/2108.01073, also known as “img2img”) to the latents generated in the first step, using the same prompt. This technique is slightly slower than the first one, as it requires more function evaluations. — Read More
Source code is available at https://github.com/Stability-AI/generative-models .
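The remark that the two-stage (SDEdit) variant needs more function evaluations can be made concrete with a little arithmetic. The sketch below is only a toy accounting, not the SDXL code: the function names, the 40-step budget, the 0.8 handoff fraction, and the 0.3 SDEdit strength are all illustrative assumptions, and the rounding rule is a simplification.

```python
from typing import Tuple


def ensemble_evals(num_steps: int, handoff: float) -> Tuple[int, int]:
    """Ensemble of experts: one denoising schedule is split between the models.

    `handoff` (assumed value, not from the source) is the fraction of steps
    run by the base model; the refiner finishes the remaining steps.
    """
    base_steps = round(num_steps * handoff)
    refiner_steps = num_steps - base_steps
    return base_steps, refiner_steps


def sdedit_evals(num_steps: int, strength: float) -> Tuple[int, int]:
    """Two-stage SDEdit ("img2img"): the base model runs its full schedule,
    then the second model re-noises the result and denoises it again for
    roughly `strength * num_steps` extra steps (assumed simplification).
    """
    base_steps = num_steps
    refiner_steps = round(num_steps * strength)
    return base_steps, refiner_steps


if __name__ == "__main__":
    for name, (base, refiner) in {
        "ensemble": ensemble_evals(40, 0.8),
        "sdedit": sdedit_evals(40, 0.3),
    }.items():
        print(f"{name}: base={base}, refiner={refiner}, total={base + refiner}")
```

Under these assumed numbers the ensemble variant stays within the original step budget, while the SDEdit variant adds the second model's steps on top of a full base run, which is where the extra cost comes from.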
Daily Archives: August 22, 2023
Reinforced Self-Training (ReST) for Language Modeling
Reinforcement learning from human feedback (RLHF) can improve the quality of large language model (LLM) outputs by aligning them with human preferences. We propose a simple algorithm for aligning LLMs with human preferences inspired by growing batch reinforcement learning (RL), which we call Reinforced Self-Training (ReST). Given an initial LLM policy, ReST produces a dataset by generating samples from the policy, which are then used to improve the LLM policy using offline RL algorithms. ReST is more efficient than typical online RLHF methods because the training dataset is produced offline, which allows data reuse. While ReST is a general approach applicable to all generative learning settings, we focus on its application to machine translation. Our results show that ReST can substantially improve translation quality, as measured by automated metrics and human evaluation on machine translation benchmarks, in a compute- and sample-efficient manner. — Read More
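The loop described in the abstract (sample a dataset from the current policy, then improve the policy offline on that dataset, reusing it several times) can be sketched on a toy problem. Everything below is a hypothetical illustration, not the paper's implementation: the "policy" is a categorical distribution over four candidate outputs, `REWARD` is a made-up scoring table, and a reward-filtered refit stands in for whatever offline RL algorithm is actually used.

```python
import random
from collections import Counter

# Hypothetical reward table standing in for a learned reward model.
REWARD = {"a": 0.1, "b": 0.4, "c": 0.7, "d": 0.9}


def grow(policy, n, rng):
    """Generate an offline dataset by sampling n outputs from the policy."""
    outputs = list(policy)
    weights = [policy[o] for o in outputs]
    return rng.choices(outputs, weights=weights, k=n)


def improve(policy, dataset, threshold, step=0.5):
    """Move the policy toward the samples whose reward meets the threshold.

    This filtered refit is an assumed stand-in for an offline RL update.
    """
    kept = [y for y in dataset if REWARD[y] >= threshold]
    if not kept:
        return policy  # nothing passed the filter; leave the policy unchanged
    counts = Counter(kept)
    empirical = {y: counts[y] / len(kept) for y in policy}
    return {y: (1 - step) * policy[y] + step * empirical[y] for y in policy}


def expected_reward(policy):
    return sum(p * REWARD[y] for y, p in policy.items())


def rest(policy, grow_steps=2, thresholds=(0.3, 0.6), n=1000, seed=0):
    rng = random.Random(seed)
    for _ in range(grow_steps):
        dataset = grow(policy, n, rng)        # sampled once per outer step
        for t in thresholds:                  # same dataset reused each time
            policy = improve(policy, dataset, t)
    return policy


if __name__ == "__main__":
    init = {y: 0.25 for y in REWARD}
    final = rest(init)
    print("initial expected reward:", expected_reward(init))
    print("final expected reward:  ", expected_reward(final))
```

The point of the sketch is the data-reuse structure: sampling happens once per outer step, while several improvement passes run against the same stored samples, which is what the abstract credits for the efficiency gain over online methods.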
Is the AI boom already over?
Generative AI tools are generating less interest than just a few months ago.
When generative AI products started rolling out to the general public last year, it kicked off a frenzy of excitement and fear.
People were amazed at the images and words these tools could create from just a single text prompt. Silicon Valley salivated over the prospect of a transformative new technology, one that it could make a lot of money off of after years of stagnation and the flops of crypto and the metaverse. And then there were the concerns about what the world would be after generative AI transformed it. Millions of jobs could be lost. It might become impossible to tell what was real or what was made by a computer. And if you want to get really dramatic about it, the end of humanity may be near. We glorified and dreaded the incredible potential this technology had. — Read More
Developers are now using AI for text-to-music apps
With the rise in popularity of Large Language Models (LLMs) and generative AI tools like ChatGPT, developers have found ways to mold text for use cases ranging from writing emails to summarizing articles. Now, they are looking to help you generate bits of music by just typing some words.
Brett Bauman, the developer of PlayListAI (previously LinupSupply), launched a new app called Songburst on the App Store this week. The app doesn’t have a steep learning curve. You just have to type in a prompt like “Calming piano music to listen to while studying” or “Funky beats for a podcast intro” to let the app generate a music clip. — Read More