EARLIER THIS MONTH, Sundar Pichai was struggling to write a letter to Alphabet’s 180,000 employees. The 51-year-old CEO wanted to laud Google on its 25th birthday, which should have been easy enough. Alphabet’s stock market value was around $1.7 trillion. Its vast cloud-computing operation had turned its first profit. Its self-driving cars were ferrying people around San Francisco. And then there was the usual stuff—Google Search still dominated the field, as it had for every minute of this century. The company sucks up almost 40 percent of all global digital advertising revenue.
But not all was well on Alphabet’s vast Mountain View campus. The US government was about to put Google on trial for abusing its monopoly in search. And the comity that once pervaded Google’s workforce was frayed. Some high-profile employees had left, complaining that the company moved too slowly. Perhaps most troubling, Google—a long-standing world leader in artificial intelligence—had been rudely upstaged by an upstart outsider, OpenAI. Google’s longtime rival Microsoft had beaten it to the punch with a large language model built into its also-ran search engine Bing, causing panic in Mountain View. Microsoft CEO Satya Nadella boasted, “I want people to know we made Google dance.” — Read More
LLMs and Tool Use
Last March, just two weeks after GPT-4 was released, researchers at Microsoft quietly announced a plan to compile millions of APIs—tools that can do everything from ordering a pizza to solving physics equations to controlling the TV in your living room—into a compendium that would be made accessible to large language models (LLMs). This was just one milestone in the race across industry and academia to find the best ways to teach LLMs how to manipulate tools, which would supercharge the potential of AI more than any of the impressive advancements we’ve seen to date.
The Microsoft project aims to teach AI how to use any and all digital tools in one fell swoop, a clever and efficient approach. Today, LLMs can do a pretty good job of recommending pizza toppings to you if you describe your dietary preferences and can draft dialog that you could use when you call the restaurant. But most AI tools can’t place the order, not even online. In contrast, Google’s seven-year-old Assistant tool can synthesize a voice on the telephone and fill out an online order form, but it can’t pick a restaurant or guess your order. By combining these capabilities, though, a tool-using AI could do it all. An LLM with access to your past conversations and tools like calorie calculators, a restaurant menu database, and your digital payment wallet could feasibly judge that you are trying to lose weight and want a low-calorie option, find the nearest restaurant with toppings you like, and place the delivery order. If it has access to your payment history, it could even guess at how generously you usually tip. If it has access to the sensors on your smartwatch or fitness tracker, it might be able to sense when your blood sugar is low and order the pie before you even realize you’re hungry.
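The pizza scenario above is, at its core, an orchestration loop: a model repeatedly picks a tool, the runtime executes it, and the result is fed back until the model can answer. The sketch below illustrates that loop in plain Python. Everything in it is hypothetical — the tool names, the scripted stand-in for the model's decision step, and the dispatch format are illustrative, not any vendor's actual API.

```python
# Minimal sketch of an LLM tool-use loop: a "model" chooses which
# registered tool to call, the runtime executes it, and the result is
# appended to history until the model emits a final answer.
# All tools and the decision logic are illustrative stand-ins.
from typing import Callable

TOOLS: dict[str, Callable[..., object]] = {}

def tool(fn: Callable) -> Callable:
    """Register a function so the 'model' may call it by name."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def find_restaurant(preference: str) -> str:
    # Stand-in for a restaurant/menu database lookup.
    return "Luigi's" if preference == "low-calorie" else "Deep Dish Dan's"

@tool
def calorie_estimate(item: str) -> int:
    # Stand-in for a calorie-calculator API.
    return {"margherita": 850, "meat lovers": 1400}.get(item, 1000)

@tool
def place_order(restaurant: str, item: str) -> str:
    # Stand-in for an ordering/payment API.
    return f"Ordered {item} from {restaurant}"

def fake_model(goal: str, history: list[dict]) -> dict:
    """A scripted stand-in for an LLM's tool-choice step."""
    if not history:
        return {"call": "find_restaurant", "args": {"preference": "low-calorie"}}
    if len(history) == 1:
        return {"call": "calorie_estimate", "args": {"item": "margherita"}}
    if len(history) == 2:
        return {"call": "place_order",
                "args": {"restaurant": history[0]["result"], "item": "margherita"}}
    return {"final": history[-1]["result"]}

def run(goal: str) -> str:
    history: list[dict] = []
    while True:
        step = fake_model(goal, history)
        if "final" in step:
            return step["final"]
        result = TOOLS[step["call"]](**step["args"])
        history.append({"call": step["call"], "result": result})

print(run("order me a low-calorie pizza"))  # → Ordered margherita from Luigi's
```

In a real system, `fake_model` would be an LLM that receives the tool registry as part of its prompt and returns a structured tool call; the loop itself stays this simple.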
Perhaps the most compelling potential applications of tool use are those that give AIs the ability to improve themselves. — Read More
Artist-created images and animations about artificial intelligence (AI) made freely available online
What does artificial intelligence (AI) look like? Search online and the answer is likely streams of code, glowing blue brains, or white robots with men in suits.
… Since launching, Visualising AI has commissioned 13 artists to create more than 100 artworks, which have gained over 100 million views and 800,000 downloads and have been used by media outlets, research and civil society organisations. — Read More
View images on Unsplash
View videos on Pexels
Introducing SeamlessM4T, a Multimodal AI Model for Speech and Text Translations
The world we live in has never been more interconnected, giving people access to more multilingual content than ever before. This also makes the ability to communicate and understand information in any language increasingly important.
Today, we’re introducing SeamlessM4T, the first all-in-one multimodal and multilingual AI translation model that allows people to communicate effortlessly through speech and text across different languages. — Read More
Google: Supercharging Search with generative AI
For the past 25 years, we’ve been devoted to the science and the craft of building a search engine. We’ve developed completely new ways to search, powered by our latest advancements in AI — whether that’s searching visually with Lens, or across modalities, using both images and text with multisearch. In fact, people now use Lens for 12 billion visual searches a month — a four-fold increase in just two years, and a growing number of those searches are multimodal.
With new breakthroughs in generative AI, we’re again reimagining what a search engine can do. With this powerful new technology, we can unlock entirely new types of questions you never thought Search could answer, and transform the way information is organized, to help you sort through and make sense of what’s out there.
Today we’re sharing a look at our first steps in this new era of Search, and you’ll be able to first try these generative AI capabilities in Search Labs, a new way to access early experiments in Search. — Read More
Every Amazon division is working on generative AI projects
Just like pretty much every other major tech company, Amazon is placing a heavy focus on generative artificial intelligence. CEO Andy Jassy noted on Amazon’s latest earnings call that every division has multiple generative AI projects in the works.
“Inside Amazon, every one of our teams is working on building generative AI applications that reinvent and enhance their customers’ experience,” Jassy said. “But while we will build a number of these applications ourselves, most will be built by other companies, and we’re optimistic that the largest number of these will be built on [Amazon Web Services]. Remember, the core of AI is data. People want to bring generative AI models to the data, not the other way around.” — Read More
Google’s AI search is getting more video and better links
Google’s AI-powered Search Generative Experience is getting a big new feature: images and video. If you’ve enabled the AI-based SGE feature in Search Labs, you’ll now start to see more multimedia in the colorful summary box at the top of your search results. Google’s also working on making that summary box appear faster and adding more context to the links it puts in the box.
SGE may still be in the “experiment” phase, but it’s very clearly the future of Google Search. “It really gives us a chance to, now, not always be constrained in the way search was working before,” CEO Sundar Pichai said on Alphabet’s most recent earnings call. “It allows us to think outside the box.” He then said that “over time, this will just be how search works.” — Read More
A Silent New AI Bombshell Launch Nobody Saw Coming
Would you use a (great) free AI product that makes you the product?
Meta’s pulling out the big guns. LLaMA 2, their shiny new AI, is now open-source. Free for anyone. And I mean anyone. Your grandma, your dog, even your weird neighbor who still uses a flip phone.
But why?
Is it a noble quest for democratizing AI? Or a desperate attempt to catch up with the cool kids, Microsoft and Google? — Read More
No More Paperwork? Amazon AI Tool Transcribes Patient Visits for Doctors
Amazon’s AWS division today unveiled a new AI and speech-recognition tool intended to help doctors enter patient visit notes into their systems.
For now, AWS HealthScribe is only available as a preview in Northern Virginia (home of Amazon HQ2). But it promises to generate transcripts with “word-level timestamps” of patient visits, and automatically “identifies speaker roles, like patient and clinician, for each dialogue in the transcript,” Amazon says. — Read More
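Combining word-level timestamps with per-word speaker labels is what lets a transcript be reassembled into the "dialogues" Amazon describes. The sketch below shows that reassembly step on a simplified, hypothetical data shape — it is not HealthScribe's actual output schema, which the article does not specify.

```python
# Sketch: merge a diarized, word-level transcript into speaker turns.
# The list-of-dicts shape below is a simplified hypothetical, not
# AWS HealthScribe's real output format.
words = [
    {"word": "Good",    "speaker": "CLINICIAN", "start": 0.00, "end": 0.30},
    {"word": "morning", "speaker": "CLINICIAN", "start": 0.31, "end": 0.80},
    {"word": "My",      "speaker": "PATIENT",   "start": 1.10, "end": 1.30},
    {"word": "knee",    "speaker": "PATIENT",   "start": 1.31, "end": 1.60},
    {"word": "hurts",   "speaker": "PATIENT",   "start": 1.61, "end": 2.00},
]

def to_turns(words: list[dict]) -> list[dict]:
    """Group consecutive same-speaker words into dialogue turns,
    keeping the first word's start time and the last word's end time."""
    turns: list[dict] = []
    for w in words:
        if turns and turns[-1]["speaker"] == w["speaker"]:
            turns[-1]["text"] += " " + w["word"]
            turns[-1]["end"] = w["end"]
        else:
            turns.append({"speaker": w["speaker"], "text": w["word"],
                          "start": w["start"], "end": w["end"]})
    return turns

for t in to_turns(words):
    print(f'[{t["start"]:.2f}-{t["end"]:.2f}] {t["speaker"]}: {t["text"]}')
```

Run as-is, this prints one line per turn, e.g. `[0.00-0.80] CLINICIAN: Good morning` followed by the patient's turn — the same turn structure a clinician-facing note generator would consume.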