Who’s afraid of the big bad bots? A lot of people, it seems. The number of high-profile names that have now made public pronouncements or signed open letters warning of the catastrophic dangers of artificial intelligence is striking.
… Concerns about runaway, self-improving machines have been around since Alan Turing. Futurists like Vernor Vinge and Ray Kurzweil popularized these ideas with talk of the so-called Singularity, a hypothetical date at which artificial intelligence outstrips human intelligence and machines take over.
But at the heart of such concerns is the question of control: How do humans stay on top if (or when) machines get smarter? — Read More
A.I. human-voice clones are coming for the Amazon, Apple, Google audiobook market
Annual audiobook sales could reach over $30 billion within a decade, and the time and cost of production suggest AI will play a bigger role in the future.
Google Play and Apple Books utilize AI-generated voices to some extent already, though there are high hurdles to recreating human voice pacing, intonation and emotion.
Voice actors say opportunities to clone their voices for speedier, cheaper production of some forms of audiobooks can’t be ignored. — Read More
The Curse of Recursion: Training on Generated Data Makes Models Forget
Stable Diffusion revolutionised image creation from descriptive text. GPT-2, GPT-3(.5) and GPT-4 demonstrated astonishing performance across a variety of language tasks. ChatGPT introduced such language models to the general public. It is now clear that large language models (LLMs) are here to stay, and will bring about drastic change in the whole ecosystem of online text and images. In this paper we consider what the future might hold. What will happen to GPT-{n} once LLMs contribute much of the language found online? We find that use of model-generated content in training causes irreversible defects in the resulting models, where tails of the original content distribution disappear. We refer to this effect as Model Collapse and show that it can occur in Variational Autoencoders, Gaussian Mixture Models and LLMs. We build theoretical intuition behind the phenomenon and portray its ubiquity amongst all learned generative models. We demonstrate that it has to be taken seriously if we are to sustain the benefits of training from large-scale data scraped from the web. Indeed, data collected from genuine human interactions with systems will become increasingly valuable in the presence of LLM-generated content in data crawled from the Internet. — Read More
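To make the tail-loss dynamic concrete, here is a minimal sketch, not code from the paper, of the simplest case the abstract mentions: a single Gaussian repeatedly refit on its own samples. Each generation estimates the mean and standard deviation from finite data, then the next generation trains only on samples drawn from that fitted model. The sample size, generation count, and seed below are arbitrary choices for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100            # samples per generation (kept small so estimation noise matters)
generations = 100

# Generation 0: "real" data from a standard normal distribution.
data = rng.normal(loc=0.0, scale=1.0, size=n)

for g in range(1, generations + 1):
    # "Train" a model: maximum-likelihood fit of a single Gaussian.
    mu, sigma = data.mean(), data.std()
    # The next generation sees only data sampled from the fitted model.
    data = rng.normal(loc=mu, scale=sigma, size=n)
    if g % 20 == 0:
        print(f"generation {g:3d}: mu={mu:+.3f}  sigma={sigma:.3f}")

# The estimated sigma tends to drift downward across generations: each
# finite-sample fit slightly underestimates the spread, the error compounds,
# and the tails of the original distribution are the first thing to vanish.
```

Under these assumptions, the variance estimate behaves like a multiplicative random walk with a negative drift, so the fitted distribution narrows over generations. That shrinking spread is a toy version of the "tails of the original content distribution disappear" effect the abstract describes for VAEs, Gaussian mixture models and LLMs.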
#training, #transfer-learning