ChatGPT rival Pi, from Inflection, now performs “neck and neck with” OpenAI’s GPT-4 thanks to a new model, according to data first shared with Axios.
Why it matters: Inflection faces a crowded field in the market for AI-based assistants, competing against better-heeled rivals including Google, Microsoft and OpenAI, among others.
Driving the news: Inflection is announcing Thursday that Pi has been using a new model, version 2.5, in recent weeks and that the updated engine is now “powering Pi for the majority of users.” — Read More
Monthly Archives: March 2024
Mikey Shulman: Suno and the Sound of AI Music
How Public AI Can Strengthen Democracy
With the world’s focus turning to misinformation, manipulation, and outright propaganda ahead of the 2024 U.S. presidential election, we know that democracy has an AI problem. But we’re learning that AI has a democracy problem, too. Both challenges must be addressed for the sake of democratic governance and public protection.
Just three Big Tech firms (Microsoft, Google, and Amazon) control about two-thirds of the global market for the cloud computing resources used to train and deploy AI models. They have a lot of the AI talent, the capacity for large-scale innovation, and face few public regulations for their products and activities.
The increasingly centralized control of AI is an ominous sign for the co-evolution of democracy and technology. When tech billionaires and corporations steer AI, we get AI that tends to reflect the interests of tech billionaires and corporations, instead of the general public or ordinary consumers.
To benefit society as a whole we also need strong public AI as a counterbalance to corporate AI, as well as stronger democratic institutions to govern all of AI. — Read More
Introducing CopyrightCatcher, the first Copyright Detection API for LLMs
LLM training data often contains copyrighted works, and it is pretty easy to get an LLM to generate exact reproductions from these texts1. It is critical to catch these reproductions, since they pose significant legal and reputational risks for companies that build and use LLMs in production systems2. OpenAI, Anthropic, and Microsoft have all faced copyright lawsuits on LLM generations from authors3, music publishers4, and more recently, the New York Times5.
To check whether LLMs respond to your prompts with copyrighted text, you can use CopyrightCatcher. It detects when LLMs generate exact reproductions of content from text sources like books, and highlights any copyrighted text in LLM outputs. Check out our public CopyrightCatcher demo here! — Read More
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
We explore how generating a chain of thought — a series of intermediate reasoning steps — significantly improves the ability of large language models to perform complex reasoning. In particular, we show how such reasoning abilities emerge naturally in sufficiently large language models via a simple method called chain of thought prompting, where a few chain of thought demonstrations are provided as exemplars in prompting.
Experiments on three large language models show that chain of thought prompting improves performance on a range of arithmetic, commonsense, and symbolic reasoning tasks. The empirical gains can be striking. For instance, prompting a 540B-parameter language model with just eight chain of thought exemplars achieves state of the art accuracy on the GSM8K benchmark of math word problems, surpassing even finetuned GPT-3 with a verifier. — Read More
I used generative AI to turn my story into a comic—and you can too
Thirteen years ago, as an assignment for a journalism class, I wrote a stupid short story about a man who eats luxury cat food. This morning, I sat and watched as a generative AI platform called Lore Machine brought my weird words to life.
I fed my story into a text box and got this message: “We are identifying scenes, locations, and characters as well as vibes. This process can take up to 2 minutes.” Lore Machine analyzed the text, extracted descriptions of the characters and locations mentioned, and then handed those bits of information off to an image-generation model. An illustrated storyboard popped up on the screen. As I clicked through vivid comic-book renderings of my half-forgotten characters, my heart was pounding. — Read More
Aggregator’s AI Risk
A recurring theme on Stratechery is that the only technology analogous to the Internet’s impact on humanity is the printing press: Johannes Gutenberg’s invention in 1440 drastically reduced the marginal cost of printing books, dramatically increasing the amount of information that could be disseminated.
Of course you still had to actually write the book, and set the movable type in the printing press; this, though, meant we had the first version of the classic tech business model: the cost to create a book was fixed, but the potential revenue from printing a book — and overall profitability — was a function of how many copies you could sell. Every additional copy increased the leverage on the up-front costs of producing the book in the first place, improving the overall profitability; this, by extension, meant there were strong incentives to produce popular books.
… In this view the Internet is the final frontier, and not just because the American West was finally settled: on the Internet there are, or at least were, no rules, and not just in the legalistic sense; there were also no more economic rules as understood in the world of the printing press. Publishing and distribution were now zero marginal cost activities, just like consumption: you didn’t need a printing press. — Read More
ComPromptMized: Unleashing Zero-click Worms that Target GenAI-Powered Applications
In the past year, numerous companies have incorporated Generative AI (GenAI) capabilities into new and existing applications, forming interconnected Generative AI (GenAI) ecosystems consisting of semi/fully autonomous agents powered by GenAI services. While ongoing research highlighted risks associated with the GenAI layer of agents (e.g., dialog poisoning, privacy leakage, jailbreaking), a critical question emerges: Can attackers develop malware to exploit the GenAI component of an agent and launch cyber-attacks on the entire GenAI ecosystem?
This paper introduces Morris II, the first worm designed to target GenAI ecosystems through the use of adversarial self-replicating prompts. The study demonstrates that attackers can insert such prompts into inputs that, when processed by GenAI models, prompt the model to replicate the input as output (replication) and engage in malicious activities (payload). Additionally, these inputs compel the agent to deliver them (propagate) to new agents by exploiting the connectivity within the GenAI ecosystem. We demonstrate the application of Morris II against GenAI-powered email assistants in two use cases (spamming and exfiltrating personal data), under two settings (black-box and white-box accesses), using two types of input data (text and images). The worm is tested against three different GenAI models (Gemini Pro, ChatGPT 4.0, and LLaVA), and various factors (e.g., propagation rate, replication, malicious activity) influencing the performance of the worm are evaluated. — Read More
Introducing the next generation of Claude
Today, we’re announcing the Claude 3 model family, which sets new industry benchmarks across a wide range of cognitive tasks. The family includes three state-of-the-art models in ascending order of capability: Claude 3 Haiku, Claude 3 Sonnet, and Claude 3 Opus. Each successive model offers increasingly powerful performance, allowing users to select the optimal balance of intelligence, speed, and cost for their specific application.
Opus and Sonnet are now available to use in claude.ai and the Claude API which is now generally available in 159 countries. Haiku will be available soon. — Read More
Could We Achieve AGI Within 5 Years? NVIDIA’s CEO Jensen Huang Believes It’s Possible
In the dynamic field of artificial intelligence, the quest for Artificial General Intelligence (AGI) represents a pinnacle of innovation, promising to redefine the interplay between technology and human intellect. Jensen Huang, CEO of NVIDIA, a trailblazer in AI technology, recently brought this topic to the forefront of technological discourse. During a forum at Stanford University, Huang posited that AGI might be realized within the next five years, a projection that hinges critically on the definition of AGI itself.
According to Huang, if AGI is characterized by its ability to successfully pass a diverse range of human tests, then this milestone in AI development is not merely aspirational but could be nearing actualization. This statement from a leading figure in the AI industry not only sparks interest but also prompts a reassessment of our current understanding of artificial intelligence and its potential trajectory in the near future. — Read More