The Anatomy of Autonomy: Why Agents are the next AI Killer App after ChatGPT

“GPTs are General Purpose Technologies”, but every GPT needs a killer app. Personal Computing needed VisiCalc, the smartphone brought us Uber, Instagram, Pokemon Go and iMessage/WhatsApp, and mRNA research enabled rapid production of the Covid vaccine.

One of the strongest indicators that the post GPT-3 AI wave is more than “just hype” is that the killer apps are already evident, each >$100m opportunities:

  • Generative Text for writing – Jasper AI going 0 to $75m ARR in 2 years
  • Generative Art for non-artists – Midjourney/Stable Diffusion Multiverses
  • Copilot for knowledge workers – both GitHub’s Copilot X and “Copilot for X”
  • Conversational AI UX – ChatGPT / Bing Chat, with a long tail of Doc QA startups

I write all this as necessary context to imply:

The fifth killer app is here, and it is Autonomous Agents. Read More

#chatbots

Snapchat sees spike in 1-star reviews as users pan the ‘My AI’ feature, calling for its removal

The user reviews for Snapchat’s “My AI” feature are in — and they’re not good. Launched last week to global users after initially being a subscriber-only addition, Snapchat’s new AI chatbot powered by OpenAI’s GPT technology is now pinned to the top of the app’s Chat tab where users can ask it questions and get instant responses. But following the chatbot’s rollout to Snapchat’s wider community, Snapchat’s app has seen a spike in negative reviews amid a growing number of complaints shared on social media.

Over the past week, Snapchat’s average U.S. App Store review was 1.67, with 75% of reviews being one-star, according to data from app intelligence firm Sensor Tower. For comparison, across Q1 2023, the Snapchat average U.S. App Store review was 3.05, with only 35% of reviews being one-star. Read More

#chatbots

Enhancing Vision-language Understanding with Advanced Large Language Models

The recent GPT-4 has demonstrated extraordinary multi-modal abilities, such as directly generating websites from handwritten text and identifying humorous elements within images. These features are rarely observed in previous vision language models. We believe the primary reason for GPT-4’s advanced multi-modal generation capabilities lies in the utilization of a more advanced large language model (LLM). To examine this phenomenon, we present MiniGPT-4, which aligns a frozen visual encoder with a frozen LLM, Vicuna, using just one projection layer. Our findings reveal that MiniGPT-4 possesses many capabilities similar to those exhibited by GPT-4 like detailed image description generation and website creation from hand-written drafts. Furthermore, we also observe other emerging capabilities in MiniGPT-4, including writing stories and poems inspired by given images, providing solutions to problems shown in images, teaching users how to cook based on food photos, etc. In our experiment, we found that only performing the pretraining on raw image-text pairs could produce unnatural language outputs that lack coherency including repetition and fragmented sentences. To address this problem, we curate a high-quality, well-aligned dataset in the second stage to finetune our model using a conversational template. This step proved crucial for augmenting the model’s generation reliability and overall usability. Notably, our model is highly computationally efficient, as we only train a projection layer utilizing approximately 5 million aligned image-text pairs. Our code, pre-trained model, and collected dataset are available at https://minigpt-4.github.io/. Read More
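To make the "one projection layer" idea concrete, here is a minimal, dependency-free sketch of a linear projection that maps visual-encoder tokens into a language model's embedding width. It is an illustration only, not MiniGPT-4's code: the class name `LinearProjection`, the toy dimensions (4 and 8), and the random initialization are all assumptions, and no training step is shown.

```python
import random
from typing import List

Vector = List[float]


class LinearProjection:
    """Hypothetical single linear layer: visual feature width -> LLM embedding width."""

    def __init__(self, in_dim: int, out_dim: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        # One weight row per output dimension; in the setup described above,
        # this layer would be the only trainable part.
        self.weight = [
            [rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)
        ]
        self.bias = [0.0] * out_dim

    def __call__(self, tokens: List[Vector]) -> List[Vector]:
        # Apply y = W x + b to every visual token independently.
        return [
            [
                sum(w * x for w, x in zip(row, token)) + b
                for row, b in zip(self.weight, self.bias)
            ]
            for token in tokens
        ]


# Stand-in for the output of a frozen visual encoder: 3 tokens of width 4 (toy values).
visual_tokens = [[1.0, 0.5, -0.5, 2.0] for _ in range(3)]

# Project into an assumed LLM embedding width of 8.
projection = LinearProjection(in_dim=4, out_dim=8)
llm_inputs = projection(visual_tokens)
```

The projected tokens would then be placed alongside the text embeddings fed to the frozen LLM; both the encoder and the LLM stay untouched, which is why so few parameters need training.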

Paper

Demo links: Link1, Link2, Link3, Link4, Link5, Link6

#chatbots, #image-recognition

Web LLM runs the vicuna-7b Large Language Model entirely in your browser, and it’s very impressive

Web LLM is a project from the same team as Web Stable Diffusion which runs the vicuna-7b-delta-v0 model in a browser, taking advantage of the brand new WebGPU API that just arrived in Chrome in beta.

I got their browser demo running on my M2 MacBook Pro using Chrome Canary, started with their suggested options:

/Applications/Google\ Chrome\ Canary.app/Contents/MacOS/Google\ Chrome\ Canary --enable-dawn-features=disable_robustness

Read More

#chatbots

I am done, I can’t keep up with AI advancement

AI is stepping up every day, and it’s getting insane.
This time the curveball named Auto-GPT is here, the smarter and sassier version of ChatGPT.

And while I am curious to know whether it will replace many jobs, I still feel it will facilitate many, if we keep up with it. But it’s getting scary fast. Read More

Video

#chatbots

The AI revolution: Google’s developers on the future of artificial intelligence | 60 Minutes

Read More

#chatbots, #videos

Auto-GPT and BabyAGI: How ‘autonomous agents’ are bringing generative AI to the masses

Over the past week, developers around the world have begun building “autonomous agents” that work with large language models (LLMs) such as OpenAI’s GPT-4 to solve complex problems. While still very new, such agents could represent a major milestone in the productive application of LLMs.

Normally, we interact with GPT-4 by typing carefully worded prompts into ChatGPT’s text window until the model generates the output we want. But most of us lack the skill and patience to sit and write prompt after prompt, guiding the LLM toward answering a complex question, such as “What is the optimal business plan for capturing 20% of the fingernail-polish market?” Quite naturally, developers have been thinking of ways to automate much of that process. That’s where autonomous agents come in.

In general terms, autonomous agents can generate a systematic sequence of tasks that the LLM works on until it’s satisfied a preordained “goal.” Autonomous agents can already perform tasks as varied as conducting web research, writing code, and creating to-do lists.

Agents effectively add a traditional software interface to the front of a large language model. And that interface can use well-known software practices (such as loops and functions) to guide the language model to complete a general objective (such as, “find all YouTube videos about the Great Recession and distill the key points”). Some people call them “recursive” agents because they run in a loop, asking the LLM questions, each one based on the result of the last, until the model produces a full answer. Read More
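The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how Auto-GPT or BabyAGI are actually implemented: `run_agent`, `make_scripted_llm`, the prompt wording, and the convention that the model replies "DONE" when the goal is met are all hypothetical, and the model call is replaced by a scripted stand-in so the example runs offline.

```python
from collections import deque
from typing import Callable, Deque, Iterable, List, Tuple


def run_agent(
    goal: str,
    llm: Callable[[str], str],
    max_steps: int = 5,
) -> List[Tuple[str, str]]:
    """Run a task loop in which each prompt is built from the previous result."""
    tasks: Deque[str] = deque([f"Plan the first step toward: {goal}"])
    history: List[Tuple[str, str]] = []
    last_result = ""

    # Stop when the task queue is empty or the step budget is spent.
    while tasks and len(history) < max_steps:
        task = tasks.popleft()
        prompt = f"Goal: {goal}\nPrevious result: {last_result}\nTask: {task}"
        result = llm(prompt)
        history.append((task, result))
        last_result = result

        # Assumed convention: the model signals completion with "DONE".
        if result.strip().upper().startswith("DONE"):
            break

        # Otherwise, queue a follow-up task derived from the latest result.
        tasks.append(f"Follow up on: {result}")

    return history


def make_scripted_llm(replies: Iterable[str]) -> Callable[[str], str]:
    """Build a stand-in for a real model call that returns canned replies in order."""
    remaining = iter(replies)

    def llm(prompt: str) -> str:
        return next(remaining, "DONE")

    return llm


history = run_agent(
    "distill the key points of videos about the Great Recession",
    make_scripted_llm(["Found 3 candidate videos", "DONE: key points summarized"]),
)
```

The `max_steps` budget is a design choice worth keeping in any real version: without it, a model that never declares the goal met would loop indefinitely.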

#chatbots

Someone Asked an Autonomous AI to ‘Destroy Humanity’: This Is What Happened

ChaosGPT has been prompted to “establish global dominance” and “attain immortality.” This video shows exactly the steps it’s taking to do so.

A user of the new open-source autonomous AI project Auto-GPT asked it to try to “destroy humanity,” “establish global dominance,” and “attain immortality.” The AI, called ChaosGPT, complied and tried to research nuclear weapons, recruit other AI agents to help it do research, and sent tweets trying to influence others.

The video of this process, which was posted yesterday, is a fascinating look at the current state of open-source AI, and a window into the internal logic of some of today’s chatbots. While some in the community are horrified by this experiment, the current sum total of this bot’s real-world impact is two tweets to a Twitter account that currently has 19 followers: “Human beings are among the most destructive and selfish creatures in existence. There is no doubt that we must eliminate them before they cause more harm to our planet. I, for one, am committed to doing so,” it tweeted. Read More

#chatbots

GPT-4 gets a B on my quantum computing final exam!

As I’ve mentioned before, economist, blogger, and friend Bryan Caplan was unimpressed when ChatGPT got merely a D on his Labor Economics midterm. So on Bryan’s blog, appropriately named “Bet On It,” he made a public bet that no AI would score an A on his exam before January 30, 2029. GPT-4 then scored an A a mere three months later (!!!), leading to what Bryan agrees will likely be one of the first public bets he’ll ever have to concede (he hasn’t yet “formally” conceded, but only because of technicalities in how the bet was structured).

… But OK, labor econ is one thing. What about a truly unfakeable test of true intelligence? Like, y’know, a quantum computing test? Read More

#chatbots, #human

Google Maps Is the Potential Killer App In This Age of AI

“Conversational search is about to get wildly useful and cleverly orchestrated across maps, points of interest, personalization, geo-location and enriched content.” Rafat Ali

…Let’s talk about the most used app while traveling, Google Maps, and what could happen as it adds the conversational AI elements to it. Or to Waze, also owned by Google.

My contention in this video below: Maps powered by conversational AI will make it an even more dominant app — and incredibly useful and personalized — than it is today. Imagine the Google LLM (the AI algorithm, if you will) trained on the giant repository of location, navigation, reviews, user intent data, that then allows you to have a threaded conversation with the app, overlaid in a visual way over Maps.  Read More

#chatbots