Why Are We Letting the AI Crisis Just Happen?

Bad actors could seize on large language models to engineer falsehoods at unprecedented scale.

New AI systems such as ChatGPT, the overhauled Microsoft Bing search engine, and the reportedly soon-to-arrive GPT-4 have utterly captured the public imagination. ChatGPT is the fastest-growing online application, ever, and it’s no wonder why. Type in some text, and instead of getting back web links, you get well-formed, conversational responses on whatever topic you selected—an undeniably seductive vision.

But the public, and the tech giants, aren’t the only ones who have become enthralled with the Big Data–driven technology known as the large language model. Bad actors have taken note of the technology as well. At the extreme end, there’s Andrew Torba, the CEO of the far-right social network Gab, who said recently that his company is actively developing AI tools to “uphold a Christian worldview” and fight “the censorship tools of the Regime.” But even users who aren’t motivated by ideology will have their impact. Clarkesworld, a publisher of sci-fi short stories, temporarily stopped taking submissions last month, because it was being spammed by AI-generated stories—the result of influencers promoting ways to use the technology to “get rich quick,” the magazine’s editor told The Guardian.

This is a moment of immense peril … Read More

#nlp, #fake

Meet DuckAssist, DuckDuckGo’s New AI Feature

DuckAssist isn’t a chatbot, but it uses artificial intelligence to help answer your questions.

Privacy-focused search engine DuckDuckGo unveiled on Wednesday an optional artificial intelligence feature called DuckAssist. Users of DuckDuckGo’s browser apps or extensions can access a beta version of the feature now, for free.

Unlike ChatGPT or Microsoft’s Bing AI, DuckAssist isn’t a chatbot, DuckDuckGo says. Instead, it’s an addition to the search engine’s existing Instant Answers feature. Instant Answers taps various online sources to give you a quick answer to your query without you having to click one of the links in the search results. Now DuckAssist can lend a hand, but it pulls from a smaller set of sources. Read More

#nlp

You.com challenges Google, Microsoft with launch of ‘multimodal conversational AI’ in search

You.com, a pioneering search engine startup based in San Francisco, CA, announced today the launch of YouChat 2.0, a groundbreaking new “multimodal conversational AI” system that promises to take the internet search experience to a whole new level. This update marks a significant step forward in the evolution of web search and offers a glimpse into the future of how we interact with information and the internet.

YouChat 2.0 is the first web search that combines advanced conversational AI with community-built apps, offering a unique and interactive experience with each query. With its blended large language model known as C-A-L (Chat, Apps and Links), YouChat 2.0 is able to serve up charts, images, videos, tables, graphs, text or code embedded in its responses to user queries. That means fewer tabs open and less drifting away from your search engine. Read More

Try It Here

#chatbots, #nlp

It’s Time to Embrace Intelligent Document Processing

Technological advancements in artificial intelligence (AI) have made it possible for businesses to unearth meaningful insights from unstructured documents more efficiently than ever.

A growing number of modern enterprises are embracing intelligent document processing (IDP), a technique that leverages AI technologies such as natural language processing (NLP) and machine learning (ML) to transform unstructured and semi-structured data into usable information.

It’s certainly a step in the right direction. Companies must take advantage of AI-powered data extraction tools to process documents efficiently. The approach is faster, more cost-effective, and more scalable. Read More

#nlp

7 problems facing Bing, Bard, and the future of AI search

Microsoft and Google say a new era of AI-assisted search is coming. But as with any new era in tech, it comes with plenty of problems, from bullshit generation to culture wars and the end of ad revenue.

This week, Microsoft and Google promised that web search is going to change. Yes, Microsoft did it in a louder voice while jumping up and down and saying “look at me, look at me,” but both companies now seem committed to using AI to scrape the web, distill what it finds, and generate answers to users’ questions directly — just like ChatGPT.

Microsoft calls its efforts “the new Bing” and is building related capabilities into its Edge browser. Google’s is called project Bard, and while it’s not yet ready to sing, a launch is planned for the “coming weeks.” And of course, there’s the troublemaker that started it all: OpenAI’s ChatGPT, which exploded onto the web last year and showed millions the potential of AI Q&A. Read More

#nlp, #llm

Battle of the Behemoths

The tech giants are girding their loins for battle in the AI search space.

Microsoft announced: “Today, we’re launching an all new, AI-powered Bing search engine and Edge browser, available in preview now at Bing.com, to deliver better search, more complete answers, a new chat experience and the ability to generate content. We think of these tools as an AI copilot for the web.”

“AI will fundamentally change every software category, starting with the largest category of all – search,” said Satya Nadella, Chairman and CEO, Microsoft. “Today, we’re launching Bing and Edge powered by AI copilot and chat, to help people get more from search and the web.” Read More

Meanwhile, Google’s CEO, Sundar Pichai, announced Bard, a ChatGPT competitor, in a blog post today, describing the tool as an “experimental conversational AI service” that will answer users’ queries and take part in conversations. The software will be available to a group of “trusted testers” today, says Pichai, before becoming “more widely available to the public in the coming weeks.”

It’s not clear exactly what capabilities Bard will have, but it seems the chatbot will be just as free-ranging as OpenAI’s ChatGPT. A screenshot encourages users to ask Bard practical queries, like how to plan a baby shower or what kind of meals could be made from a list of ingredients for lunch. Read More

Not to be outdone, Baidu, China’s largest search engine company, plans to debut a ChatGPT-style application in March, initially embedding it into its main search services, according to a person who asked to remain unidentified discussing private information. The tool, whose name hasn’t been decided, will allow users to get conversation-style search results much like OpenAI’s popular platform. Read More

#big7, #chatbots, #nlp

Exclusive Interview: OpenAI’s Sam Altman Talks ChatGPT And How Artificial General Intelligence Can ‘Break Capitalism’

As CEO of OpenAI, Sam Altman captains the buzziest — and most scrutinized — startup in the fast-growing generative AI category, the subject of a recent feature story in the February issue of Forbes.

After visiting OpenAI’s San Francisco offices in mid-January, Forbes spoke to the recently press-shy investor and entrepreneur about ChatGPT, artificial general intelligence and whether his AI tools pose a threat to Google Search. Read More

#chatbots, #nlp

Whispers of A.I.’s Modular Future

ChatGPT is in the spotlight, but it’s Whisper—OpenAI’s open-source speech-transcription program—that shows us where machine learning is going.

One day in late December, I downloaded a program called Whisper.cpp onto my laptop, hoping to use it to transcribe an interview I’d done. I fed it an audio file and, every few seconds, it produced one or two lines of eerily accurate transcript, writing down exactly what had been said with a precision I’d never seen before. As the lines piled up, I could feel my computer getting hotter. This was one of the few times in recent memory that my laptop had actually computed something complicated—mostly I just use it to browse the Web, watch TV, and write. Now it was running cutting-edge A.I.

Despite being one of the more sophisticated programs ever to run on my laptop, Whisper.cpp is also one of the simplest. If you showed its source code to A.I. researchers from the early days of speech recognition, they might laugh in disbelief, or cry; it would be like revealing to a nuclear physicist that the process for achieving cold fusion can be written on a napkin. Whisper.cpp is intelligence distilled. It’s rare for modern software in that it has virtually no dependencies; in other words, it works without the help of other programs. Instead, it is ten thousand lines of stand-alone code, most of which does little more than fairly complicated arithmetic. It was written in five days by Georgi Gerganov, a Bulgarian programmer who, by his own admission, knows next to nothing about speech recognition. Gerganov adapted it from a program called Whisper, released in September by OpenAI, the same organization behind ChatGPT and DALL-E. Whisper transcribes speech in more than ninety languages. In some of them, the software is capable of superhuman performance: that is, it can actually parse what somebody’s saying better than a human can.

What’s so unusual about Whisper is that OpenAI open-sourced it, releasing not just the code but a detailed description of its architecture. They also included the all-important “model weights”: a giant file of numbers specifying the synaptic strength of every connection in the software’s neural network. In so doing, OpenAI made it possible for anyone, including an amateur like Gerganov, to modify the program. Gerganov converted Whisper to C++, a widely supported programming language, to make it easier to download and run on practically any device. This sounds like a logistical detail, but it’s actually the mark of a wider sea change. Until recently, world-beating A.I.s like Whisper were the exclusive province of the big tech firms that developed them. Read More
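To make the idea of "model weights" concrete, here is a toy sketch in Python. It is not Whisper's architecture (Whisper is a far larger transformer network); it only illustrates that a layer of a neural network is a table of numbers plus some arithmetic. The names WEIGHTS, BIASES, and dense_layer, and all the values, are invented for illustration.

```python
# Toy illustration only: Whisper's real network has vastly more weights
# and a different (transformer) architecture.

# Hypothetical "model weights": one row of connection strengths per output unit.
WEIGHTS = [
    [0.5, -1.0],
    [2.0, 0.25],
]
BIASES = [0.0, 1.0]


def dense_layer(inputs, weights, biases):
    """Multiply inputs by weights, add biases, and clip negatives to zero (ReLU)."""
    outputs = []
    for row, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(max(0.0, total))
    return outputs


result = dense_layer([2.0, 4.0], WEIGHTS, BIASES)
print(result)  # [0.0, 6.0]
```

Changing any number in WEIGHTS changes what the layer computes, which is why publishing the weights file lets outsiders run and modify the model.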

#audio, #nlp

Extracting Training Data from Diffusion Models

Image diffusion models such as DALL-E 2, Imagen, and Stable Diffusion have attracted significant attention due to their ability to generate high-quality synthetic images. In this work, we show that diffusion models memorize individual images from their training data and emit them at generation time. With a generate-and-filter pipeline, we extract over a thousand training examples from state-of-the-art models, ranging from photographs of individual people to trademarked company logos. We also train hundreds of diffusion models in various settings to analyze how different modeling and data decisions affect privacy. Overall, our results show that diffusion models are much less private than prior generative models such as GANs, and that mitigating these vulnerabilities may require new advances in privacy-preserving training. Read More
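A rough sketch of what a generate-and-filter pipeline can look like, in Python. The excerpt above does not specify the paper's distance measure, thresholds, or sample counts, so everything here is an assumption: toy_generator is a hypothetical stand-in for a diffusion model that returns four numbers instead of an image, and the filter simply flags a prompt when many of its samples are near-duplicates of one another.

```python
import random


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def toy_generator(prompt, rng):
    """Hypothetical stand-in for a diffusion model: returns a 4-number 'image'."""
    if prompt == "memorized photo":
        # Nearly the same output every time, as if copied from training data.
        return [0.5 + rng.uniform(-0.001, 0.001) for _ in range(4)]
    # Otherwise, outputs vary widely from sample to sample.
    return [rng.uniform(0.0, 1.0) for _ in range(4)]


def flag_memorized_prompts(prompts, generate, samples_per_prompt=20,
                           threshold=0.05, min_neighbors=10, seed=0):
    """Return the prompts whose samples cluster tightly (assumed thresholds)."""
    rng = random.Random(seed)
    flagged = []
    for prompt in prompts:
        # Generate step: draw many samples for the same prompt.
        samples = [generate(prompt, rng) for _ in range(samples_per_prompt)]
        # Filter step: look for a sample with many near-identical neighbors.
        for i, sample in enumerate(samples):
            neighbors = sum(
                1 for j, other in enumerate(samples)
                if j != i and l2_distance(sample, other) < threshold
            )
            if neighbors >= min_neighbors:
                flagged.append(prompt)
                break
    return flagged


flagged = flag_memorized_prompts(["memorized photo", "novel scene"], toy_generator)
print(flagged)  # ['memorized photo']
```

The intuition behind the filter: a model that invents a new image each time should rarely produce the same output twice, so a tight cluster of repeats suggests the output may be a memorized training example worth inspecting.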

#chatbots, #nlp, #Diffusion

FOLIO: Natural Language Reasoning with First-Order Logic

We present FOLIO, a human-annotated, open-domain, and logically complex and diverse dataset for reasoning in natural language (NL), equipped with first-order logic (FOL) annotations. FOLIO consists of 1,435 examples (unique conclusions), each paired with one of 487 sets of premises which serve as rules to be used to deductively reason for the validity of each conclusion. The logical correctness of premises and conclusions is ensured by their parallel FOL annotations, which are automatically verified by our FOL inference engine. In addition to the main NL reasoning task, NL-FOL pairs in FOLIO automatically constitute a new NL-FOL translation dataset using FOL as the logical form. Our experiments on FOLIO systematically evaluate the FOL reasoning ability of supervised fine-tuning on medium-sized language models (BERT, RoBERTa) and few-shot prompting on large language models (GPT-NeoX, OPT, GPT-3, Codex). For NL-FOL translation, we experiment with GPT-3 and Codex. Our results show that one of the most capable large language models (LLMs) publicly available, GPT-3 davinci, achieves only slightly better than random results with few-shot prompting on a subset of FOLIO, and the model is especially bad at predicting the correct truth values for False and Unknown conclusions. Read More
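The three-way labeling task (True, False, or Unknown) can be illustrated with a small Python sketch. This is a simplification that uses propositional logic and brute-force truth tables, not full first-order logic and not the authors' inference engine; the function label_conclusion and the rain example are invented for illustration.

```python
from itertools import product


def label_conclusion(variables, premises, conclusion):
    """Label a conclusion 'True', 'False', or 'Unknown' given the premises.

    Premises and the conclusion are functions from an assignment dict to bool.
    """
    # Keep only the truth assignments in which every premise holds.
    models = []
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(premise(assignment) for premise in premises):
            models.append(assignment)
    if not models:
        raise ValueError("premises are contradictory")

    outcomes = {conclusion(model) for model in models}
    if outcomes == {True}:
        return "True"      # holds in every world consistent with the premises
    if outcomes == {False}:
        return "False"     # fails in every such world
    return "Unknown"       # the premises do not settle it


variables = ["rains", "wet", "cold"]
premises = [
    lambda a: (not a["rains"]) or a["wet"],  # If it rains, the ground is wet.
    lambda a: a["rains"],                    # It rains.
]

print(label_conclusion(variables, premises, lambda a: a["wet"]))      # True
print(label_conclusion(variables, premises, lambda a: not a["wet"]))  # False
print(label_conclusion(variables, premises, lambda a: a["cold"]))     # Unknown
```

The Unknown case is the one that cannot be reached by pattern-matching on the premises alone: nothing stated says whether it is cold, so neither the claim nor its negation follows.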

#nlp