Google’s update to Gemini 1.5 Pro gives the model ears. The model can now listen to uploaded audio files and extract information from recordings such as earnings calls or the audio tracks of videos, without needing a written transcript.
During its Google Next event, Google also announced that it will make Gemini 1.5 Pro available to the public for the first time through Vertex AI, its platform for building AI applications. Gemini 1.5 Pro was first announced in February. — Read More
Recent Updates
New Google and Intel Chips
Google is stepping up its competition with Nvidia in the artificial intelligence (AI) chip market by developing custom hardware solutions.
Google has unveiled a new lineup of custom chips designed to bolster its position in the rapidly evolving artificial intelligence (AI) market. The tech giant introduced new Tensor Processing Units (TPUs) and an Arm-based central processing unit (CPU) named Axion, showcasing its commitment to innovation in AI hardware.
While the TPUs offer a competitive alternative to Nvidia’s AI chips, they are exclusively accessible through Google Cloud and unavailable for direct purchase. — Read More
Meta confirms that its Llama 3 open source LLM is coming in the next month
At an event in London on Tuesday, Meta confirmed that it plans an initial release of Llama 3 — the next generation of its large language model used to power generative AI assistants — within the next month.
This confirms a report published on Monday by The Information that Meta was getting close to launch.
“Within the next month, actually less, hopefully in a very short period of time, we hope to start rolling out our new suite of next-generation foundation models, Llama 3,” said Nick Clegg, Meta’s president of global affairs. — Read More
Amazon Gives Anthropic $2.75 Billion So It Can Spend It on AWS XPUs
If Microsoft has the half of OpenAI that didn’t leave, then Amazon and its Amazon Web Services cloud division need the half of OpenAI that did leave – meaning Anthropic. And that means Amazon needs to pony up a lot more money than Google, which has also invested in Anthropic but which also has its own Gemini LLM, if it hopes to have more leverage – and get the GPU system rentals in return.
We live in strange times. … Microsoft investing $13 billion in OpenAI – with a $10 billion promise last year – and now Amazon making good on its promise to invest $4 billion in Anthropic by kicking in the second tranche of $2.75 billion is a brilliant way to buy a stake in any AI startup. You get access to the startup’s models, you get a sense of their roadmap, and you get to be the first one to commercialize their products at scale.
As we have pointed out before, … [t]here is a danger of this looking like roundtripping, where the money just moves from the IT giant to the AI startup as an investment and then back again to the IT giant. (This kind of thing used to happen in the IT channel from time to time.) It would be enlightening to see how these deals are really structured. But there is a likelihood that they are really minority stakes in the AI startups for enormous sums and an actual exchange of goods and services on the part of both parties. — Read More
How Hollywood’s Most-Feared AI Video Tool Works — and What Filmmakers May Worry About
As generative artificial intelligence marches on the entertainment industry, Hollywood is taking stock of the tech and its potential to be incorporated into the filmmaking process. No tool has piqued the town’s interest more than OpenAI’s Sora, which was unveiled in February as capable of creating hyperrealistic clips in response to a text prompt of just a couple of sentences. In recent days, the Sam Altman-led firm released a series of videos from beta testers who are providing feedback to improve the tech. The Hollywood Reporter spoke with some of those Sora testers about what it can, and can’t, really do.
… [Walter] Woodman [of Shy Kids, a Toronto-based production company,] says he considers Sora another tool in his arsenal, similar to Adobe After Effects or Premiere. “It’s something where you bring your energy and your talents and you work with it to make something,” he explains. “There’s a lot of hot air about just how powerful this is and how this is going to replace everything and how we don’t need to do anything. That’s really undervaluing what a story is and what the components of a story are and what the role of storytellers is.” — Read More
A ‘Law Firm’ of AI Generated Lawyers Is Sending Fake Threats as an SEO Scam
Last week, Ernie Smith, the publisher of the website Tedium, got a “copyright infringement notice” from a law firm called Commonwealth Legal: “We’re reaching out on behalf of the Intellectual Property division of a notable entity, in relation to an image connected to our client,” it read.
… In this case, though, the email didn’t demand that the photo be taken down or specifically threaten a lawsuit. Instead, it demanded that Smith place a “visible and clickable link” beneath the photo in question to a website called “tech4gods” or the law firm would “take action.” Smith began looking into the law firm. And he found that Commonwealth Legal is not real, and that the images of its “lawyers” are AI generated. — Read More
Weapons of Mass Production
Post Malone is having a good month.
The artist was featured on Beyoncé’s new album Cowboy Carter in the song “LEVII’S JEANS.” And in a few weeks, Post Malone will feature again on spring’s other big release—Taylor Swift’s The Tortured Poets Department.
Post Malone’s feature on Tortured Poets comes in a song called “Fortnight,” and the song already leaked online. Well, not actually—but a lot of people were fooled into thinking so. An AI-generated version of “Fortnight” took TikTok by storm last month (it’s actually a banger) and duped everyone into believing the track leaked. — Read More
Meta’s AI image generator can’t imagine an Asian man with a white woman
Have you ever seen an Asian person with a white person, whether that’s a mixed-race couple or two friends of different races? Seems pretty common to me — I have lots of white friends!
To Meta’s AI-powered image generator, apparently this is impossible to imagine. I tried dozens of times to create an image using prompts like “Asian man and Caucasian friend,” “Asian man and white wife,” and “Asian woman and Caucasian husband.” Only once was Meta’s image generator able to return an accurate image featuring the races I specified. — Read More
Jamba: A Hybrid Transformer-Mamba Language Model
We present Jamba, a new base large language model based on a novel hybrid Transformer-Mamba mixture-of-experts (MoE) architecture. Specifically, Jamba interleaves blocks of Transformer and Mamba layers, enjoying the benefits of both model families. MoE is added in some of these layers to increase model capacity while keeping active parameter usage manageable. This flexible architecture allows resource- and objective-specific configurations. In the particular configuration we have implemented, we end up with a powerful model that fits in a single 80GB GPU. Built at large scale, Jamba provides high throughput and small memory footprint compared to vanilla Transformers, and at the same time state-of-the-art performance on standard language model benchmarks and long-context evaluations. Remarkably, the model presents strong results for up to 256K tokens context length. We study various architectural decisions, such as how to combine Transformer and Mamba layers, and how to mix experts, and show that some of them are crucial in large scale modeling. We also describe several interesting properties of these architectures which the training and evaluation of Jamba have revealed, and plan to release checkpoints from various ablation runs, to encourage further exploration of this novel architecture. We make the weights of our implementation of Jamba publicly available under a permissive license. — Read More
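The interleaving the abstract describes can be sketched as a simple layer schedule. The sketch below is a toy illustration in plain Python; the one-attention-layer-per-block-of-eight ratio and the every-other-layer MoE placement are assumptions chosen for illustration, not the paper’s exact configuration.

```python
# Toy sketch of a Jamba-style hybrid layer schedule. The specific ratios
# below (one Transformer layer per block of eight, MoE in every other
# layer) are illustrative assumptions, not the paper's exact settings.

def jamba_layer_schedule(n_layers: int, attn_every: int = 8, moe_every: int = 2):
    """Interleave Transformer (attention) and Mamba layers, marking which
    layers use a mixture-of-experts (MoE) MLP instead of a dense one."""
    schedule = []
    for i in range(n_layers):
        mixer = "transformer" if i % attn_every == 0 else "mamba"
        mlp = "moe" if i % moe_every == 1 else "dense"
        schedule.append((mixer, mlp))
    return schedule

plan = jamba_layer_schedule(16)  # e.g. 2 attention layers, 14 Mamba layers
```

The point of such a schedule is that only a small fraction of layers pays the quadratic attention cost, which is where the throughput and memory savings relative to a vanilla Transformer come from, while the MoE layers raise total capacity without raising the number of active parameters per token.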
Fine-tuning Language Models for Factuality
The fluency and creativity of large pre-trained language models (LLMs) have led to their widespread use, sometimes even as a replacement for traditional search engines. Yet language models are prone to making convincing but factually inaccurate claims, often referred to as ‘hallucinations.’ These errors can inadvertently spread misinformation or harmfully perpetuate misconceptions. Further, manual fact-checking of model responses is a time-consuming process, making human factuality labels expensive to acquire. In this work, we fine-tune language models to be more factual, without human labeling and targeting more open-ended generation settings than past work. We leverage two key recent innovations in NLP to do so. First, several recent works have proposed methods for judging the factuality of open-ended text by measuring consistency with an external knowledge base or simply a large model’s confidence scores. Second, the direct preference optimization algorithm enables straightforward fine-tuning of language models on objectives other than supervised imitation, using a preference ranking over possible model responses. We show that learning from automatically generated factuality preference rankings, generated either through existing retrieval systems or our novel retrieval-free approach, significantly improves the factuality (percent of generated claims that are correct) of Llama-2 on held-out topics compared with RLHF or decoding strategies targeted at factuality. At 7B scale, compared to Llama-2-chat, we observe 58% and 40% reduction in factual error rate when generating biographies and answering medical questions, respectively. — Read More
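The direct preference optimization step the abstract relies on can be sketched at the level of a single preference pair. The function below is a minimal pure-Python rendering of the standard DPO loss; the scalar log-probabilities and the `beta` value are placeholders, and in practice each argument would be the summed token log-probability of a full response under the policy or a frozen reference model, with the “chosen” response being the one an automatic factuality judge ranked higher.

```python
import math

def dpo_loss(policy_chosen: float, policy_rejected: float,
             ref_chosen: float, ref_rejected: float,
             beta: float = 0.1) -> float:
    """DPO loss for one (chosen, rejected) response pair, where each
    argument is a sequence log-probability under the policy or the
    frozen reference model. Lower loss means the policy favors the
    chosen (here: more factual) response more strongly than the
    reference model does."""
    margin = beta * ((policy_chosen - ref_chosen)
                     - (policy_rejected - ref_rejected))
    # -log sigmoid(margin), split by sign for numerical stability
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

# When the policy matches the reference, the loss sits at log 2; shifting
# probability mass toward the factual response drives it down.
```

Minimizing this over automatically ranked pairs is what lets the method improve factuality without any human labels: the ranking signal comes from retrieval consistency or model confidence rather than annotators.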