Amazon Is Building an LLM Twice the Size of OpenAI’s GPT-4

Few markets have grown as fast, in as short a time, as artificial intelligence (AI).

And as the technology is increasingly deployed across industries ranging from marketing to payments to insurance, execution speed is only becoming more important.

To that end, as per a Tuesday (Nov. 7) report, Amazon is working on an ambitious new large language model (LLM), which it could announce as soon as December.

Code-named “Olympus,” the rumored LLM is set to be one of the largest foundation models ever trained, at an alleged 2 trillion parameters — double that of the closest competitor, OpenAI’s state-of-the-art GPT-4 model, which reportedly has 1 trillion parameters. — Read More

#big7

AGI is Being Achieved Incrementally (OpenAI DevDay w/ Simon Willison, Alex Volkov, Jim Fan, Raza Habib, Shreya Rajpal, Rahul Ligma, et al)

We summon all friends of the pod, and past and future guests including leaders from Nvidia, Zapier, HumanLoop, Weights and Biases, MultiOn, Guardrails, Bloop.ai, Julius AI to process what happened. — Read More

#podcasts

Pretraining Data Mixtures Enable Narrow Model Selection Capabilities in Transformer Models

Transformer models, notably large language models (LLMs), have the remarkable ability to perform in-context learning (ICL) — to perform new tasks when prompted with unseen input-output examples, without any explicit model training. In this work, we study how effectively transformers can draw on their pretraining data mixture, comprising multiple distinct task families, to identify and learn new tasks in-context that lie both inside and outside the pretraining distribution. Building on previous work, we investigate this question in a controlled setting, studying transformer models trained on sequences of (x, f(x)) pairs rather than natural language. Our empirical results show that transformers demonstrate near-optimal unsupervised model selection capabilities: they can first identify different task families in-context and then learn within them, provided the task families are well represented in their pretraining data. However, when presented with tasks or functions that are out-of-domain of their pretraining data, we demonstrate various failure modes of transformers and degraded generalization, even on simple extrapolation tasks. Together, our results highlight that the impressive ICL abilities of high-capacity sequence models may be more closely tied to the coverage of their pretraining data mixtures than to inductive biases that create fundamental generalization capabilities. — Read More
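The controlled setting described above can be pictured with a minimal sketch: sample a task from a "pretraining family" (here, noiseless 1-D linear functions — an illustrative choice, not the authors' exact setup), build a prompt of (x, f(x)) pairs plus a held-out query, and compare against the idealized in-context learner for that family (ordinary least squares). All names and ranges below are assumptions for illustration.

```python
import random

def make_icl_sequence(n_context=8, seed=0):
    # Sample one task from a hypothetical pretraining family: f(x) = w*x + b.
    rng = random.Random(seed)
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    xs = [rng.uniform(-2, 2) for _ in range(n_context + 1)]
    ys = [w * x + b for x in xs]
    # Context pairs plus a held-out query point, mimicking the paper's (x, f(x)) prompts.
    context = list(zip(xs[:-1], ys[:-1]))
    return context, xs[-1], ys[-1]

def least_squares_predict(context, query_x):
    # Idealized in-context "learner": ordinary least squares fit on the context pairs.
    # A transformer with near-optimal ICL should approach this on in-distribution tasks.
    n = len(context)
    sx = sum(x for x, _ in context); sy = sum(y for _, y in context)
    sxx = sum(x * x for x, _ in context); sxy = sum(x * y for x, y in context)
    w = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - w * sx) / n
    return w * query_x + b

context, qx, qy = make_icl_sequence()
pred = least_squares_predict(context, qx)
print(abs(pred - qy) < 1e-6)  # prints True: noiseless linear task, exact recovery
```

An out-of-distribution probe in this framing would simply swap the task sampler for a family the model never saw (say, quadratics), which is where the paper reports the failure modes.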

#human

Announcing Elon Musk’s Grok

Grok is an AI modeled after the Hitchhiker’s Guide to the Galaxy, so it is intended to answer almost anything and, far harder, even suggest what questions to ask!

Grok is designed to answer questions with a bit of wit and has a rebellious streak, so please don’t use it if you hate humor!

A unique and fundamental advantage of Grok is that it has real-time knowledge of the world via the 𝕏 platform. It will also answer spicy questions that are rejected by most other AI systems. — Read More

#chatbots

AI companies have all kinds of arguments against paying for copyrighted content

The US Copyright Office is taking public comment on potential new rules around generative AI’s use of copyrighted materials, and the biggest AI companies in the world had plenty to say. We’ve collected the arguments from Meta, Google, Microsoft, Adobe, Hugging Face, StabilityAI, and Anthropic below, as well as a response from Apple that focused on copyrighting AI-written code.

There are some differences in their approaches, but the overall message for most is the same: They don’t think they should have to pay to train AI models on copyrighted work. — Read More

#legal

OpenAI turbocharges GPT-4 and makes it cheaper

OpenAI announced more improvements to its large language models, GPT-4 and GPT-3.5, including updated knowledge bases and a much longer context window. The company says it will also follow Google and Microsoft’s lead and begin protecting customers against copyright lawsuits.

GPT-4 Turbo, currently available via an API preview, has been trained on information dating up to April 2023, the company announced Monday at its first-ever developer conference. The earlier version of GPT-4, released in March, only learned from data dated up to September 2021. OpenAI plans to release a production-ready Turbo model in the next few weeks but did not give an exact date. — Read More

#chatbots, #nlp

Silicon Valley is getting into the spy business

New Yorkers may have noticed an unwelcome guest hovering around their parties in early September. In the lead-up to Labor Day weekend, the New York Police Department (NYPD) said it would use drones to look into complaints about celebrations, including backyard gatherings. Police drone spying is common in America: nearly a quarter of police departments now use drones, according to a recent survey by researchers at Northwestern Pritzker School of Law.

Even more surprising is where the technology is coming from. Among the NYPD’s suppliers is Skydio, a Silicon Valley firm that uses artificial intelligence (AI) to make drones easier to fly, allowing officers to control them with little training. Skydio is backed by venture-capital (VC) giant Andreessen Horowitz as well as Accel, another major VC firm. The NYPD is also buying from another startup, BRINC, which makes flying machines equipped with night-vision cameras that can break window glass. BRINC investors include Sam Altman, the boss of OpenAI, the startup behind ChatGPT; and Index Ventures, another VC giant. — Read More

#surveillance

Will we be replaced? The future of work in the age of Generative AI w/Jonny Gilmore, CEO of Ai8

“How can we affect education for the better?” In this thought-provoking AI Talk, Jonny Gilmore, CEO of Ai8, explains the transformative potential of human-machine teams in the education-to-career value chain. Ai8 aims to redefine the entire system of education, training, employment, and upskilling, making it more bespoke, affordable, and accessible. — Read More

#augmented-intelligence

SALMONN, the First Model that Hears like Humans do

People often underestimate how important hearing is for functioning in our world and, more importantly, as an essential tool for learning.

As the famed Helen Keller — herself both blind and deaf — once said, “Blindness cuts us off from things, but deafness cuts us off from people.”

Therefore, it’s only natural to see hearing as an indispensable requirement for AI to become the sought-after superior ‘being’ that some people predict it will become.

Sadly, current AI systems suck at hearing.

… Now, a new model created by the company behind TikTok, ByteDance, challenges this vision.

SALMONN is the first-ever multimodal audio-language AI system for generic hearing: a model that can process arbitrary audio signals spanning the three main sound types — speech, audio events, and music. — Read More

Read the Paper

#audio

OpenChat: Advancing Open-source Language Models with Mixed-Quality Data

OpenChat is an innovative library of open-source language models, fine-tuned with C-RLFT – a strategy inspired by offline reinforcement learning. Our models learn from mixed-quality data without preference labels, delivering exceptional performance on par with ChatGPT, even with a 7B model. Despite our simple approach, we are committed to developing a high-performance, commercially viable, open-source large language model, and we continue to make significant strides toward this vision. — Read More
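The core C-RLFT idea sketched above — learning from mixed-quality data without preference labels — can be illustrated with a hypothetical data-preparation snippet: treat the coarse data source as a reward signal, condition each example on a source tag, and weight its fine-tuning loss accordingly. The tag names, weights, and helper functions below are illustrative assumptions, not OpenChat's actual code.

```python
# Coarse quality classes standing in for preference labels; e.g. expert
# conversations (GPT-4) vs. sub-optimal ones (GPT-3.5). Weights are illustrative.
EXPERT, SUBOPT = "expert", "suboptimal"
CLASS_WEIGHT = {EXPERT: 1.0, SUBOPT: 0.1}

def conditioned_example(prompt, response, source):
    # Prefix a source tag so the model learns a class-conditioned policy.
    # At inference time one would always condition on the expert tag.
    return {
        "text": f"<{source}> {prompt} {response}",
        "weight": CLASS_WEIGHT[source],
    }

def weighted_loss(per_token_losses, weight):
    # Scale each example's mean token loss by its class weight, so sub-optimal
    # data still contributes signal but pulls the policy less strongly.
    return weight * sum(per_token_losses) / len(per_token_losses)

batch = [
    conditioned_example("What is 2+2?", "4.", EXPERT),
    conditioned_example("What is 2+2?", "Probably 4?", SUBOPT),
]
```

The design choice mirrors offline RL: the class weight acts as a fixed, coarse reward, avoiding the pairwise preference annotations that RLHF-style pipelines require.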

#devops