People use AI for a wide range of speech recognition and understanding tasks, from enabling smart speakers to developing tools for people who are hard of hearing or who have speech impairments. But these speech understanding systems frequently don’t work well in the everyday situations where we need them most: when multiple people are speaking simultaneously or when there’s lots of background noise. Even sophisticated noise-suppression techniques are often no match for, say, the sound of the ocean during a family beach trip or the background chatter of a bustling street market.
One reason why people can understand speech better than AI in these instances is that we use not just our ears but also our eyes. We might see someone’s mouth moving and intuitively know the voice we’re hearing must be coming from her, for example. That’s why Meta AI is working on new conversational AI systems that can recognize the nuanced correlations between what they see and what they hear in conversation, like we do.
To help us build these more versatile and robust speech recognition tools, we are announcing Audio-Visual Hidden Unit BERT (AV-HuBERT), a state-of-the-art self-supervised framework for understanding speech that learns by both seeing and hearing people speak. It is the first system to jointly model speech and lip movements from unlabeled data, meaning raw video that has not already been transcribed.
AWS makes AI and machine learning tangible with first major art debut at Smithsonian
Amazon Web Services Inc. has commissioned its first-ever major art piece, a site-specific sculpture powered by artificial intelligence and designed by artist and architect Suchi Reddy that will be the centerpiece of the Smithsonian’s “Futures” exhibit.
The artwork, called “me + you,” was unveiled today in the 90-foot-tall central rotunda of the Smithsonian’s historic Arts and Industries Building in Washington, D.C. It’s an important locale as America’s first national museum and because the interactive sculpture itself is nearly two stories tall.
Microsoft Metaverse vs Facebook Metaverse
Google Wants to Work With the Pentagon Again, Despite Employee Concerns
Three years ago, the company walked away from a Defense Department project after employees objected to it. Now the company is working on a new proposal for the Pentagon.
Three years after an employee revolt forced Google to abandon work on a Pentagon program that used artificial intelligence, the company is aggressively pursuing a major contract to provide its technology to the military.
The company’s plan to land the potentially lucrative contract, known as the Joint Warfighting Cloud Capability, could raise a furor among its outspoken work force and test the resolve of management to resist employee demands.
In 2018, thousands of Google employees signed a letter protesting the company’s involvement in Project Maven, a military program that uses artificial intelligence to interpret video images and could be used to refine the targeting of drone strikes. Google management caved and agreed to not renew the contract once it expired.
Microsoft Editor receives new Context IQ to improve your workflow
Microsoft announced this week that it has built Context IQ into the popular Microsoft Editor to help improve your workflow, “whether you’re a seasoned author or drafting your second blog post ever.” Context IQ provides intelligent suggestions informed by your own writing, taking Microsoft Editor to a new level by giving users automatic assistance:
– When you need to attach, insert, or share a file with colleagues, Editor suggests a relevant file or document based on similar subjects or because you have created or worked on them before.
– When tagging colleagues in a file using the @ symbol in a comment or email, Editor recommends potential people to tag based on colleagues you currently work with or stakeholders you have previously tagged for document reviews.
– When you may be collaborating on a sales opportunity, and need to pull in Dynamics data, Editor will suggest related Dynamics 365 information as a Loop component allowing you to update and take an action on it in the flow of your work.
– When entering data or objects as you write, Editor suggests information such as a frequent flyer number when booking a flight online or a sales message when collaborating in Teams.
New Azure OpenAI Service combines access to powerful GPT-3 language models with Azure’s enterprise capabilities
Since OpenAI, an AI research and deployment company, introduced its groundbreaking GPT-3 natural language model platform last year, users have discovered countless things that these AI models can do with their powerful and comprehensive understanding of language.
For instance, a sports franchise that’s developing a new app to engage with fans during games could use the models’ ability to quickly and abstractly summarize information to convert transcripts of live television commentary into game highlights that someone could choose to include within the app.
The marketing team could use GPT-3’s capability to generate original content and its understanding of what’s happening in the game to help the team brainstorm ideas for social media or blog posts and engage with fans more quickly.
At its Ignite conference today, Microsoft announced it will help its customers uncover these kinds of experiences with the new Azure OpenAI Service, which allows access to OpenAI’s API through the Azure platform and will initially be available by invite only. The new Azure Cognitive Service will give customers access to OpenAI’s powerful GPT-3 models, along with security, reliability, compliance, data privacy and other enterprise-grade capabilities that are built into Microsoft Azure.
Facebook battles the challenges of tactile sensing
Facebook this morning announced ReSkin, an open source touch-sensing synthetic “skin” created by researchers at the company in collaboration with Carnegie Mellon University. Leveraging machine learning and magnetic sensing, ReSkin is designed to offer an inexpensive, versatile, durable, and replaceable solution for long-term use, employing an unsupervised learning algorithm to help auto-calibrate the sensor.
Missing the Point
When AI manipulates free speech, censorship is not the solution. Better code is.
Every issue is easy — if you just ignore the facts. And Glenn Greenwald has now given us a beautiful example of this eternal, and increasingly vital, truth.
In his Substack, Glenn attacks the Facebook whistleblower (he doesn’t call her that; he calls her a quote-whistleblower-unquote), Frances Haugen, for being an unwitting dupe of the Vast Leftwing Conspiracy that is now focused so intently on censoring free speech. To criticize what Facebook has done, in Glenn’s simple world, is to endorse the repeal of the First Amendment. To regulate Facebook is to start us down the road, if not to serfdom, then certainly to a Substack-less world.
But all this looks so simple to Glenn because he’s so good at ignoring how technology matters, to everything, and especially to modern media. Glenn doesn’t do technology.
DeepMind and Alphabet: who needs markets?
DeepMind, the artificial intelligence company founded in 2010 by Demis Hassabis, Shane Legg and Mustafa Suleyman, and acquired by Alphabet in 2014 for $650 million, has published its financial results, revealing what might be politely called a “creative accounting” issue.
In principle, it all sounds very promising: after a few years, DeepMind is now apparently profitable, with revenues of $1.13 billion in 2020, roughly three times 2019’s $361 million, in the face of relatively restrained expenses that rose from $976 million in 2019 to $1.06 billion in 2020. Seen in this light, the picture is one of a cutting-edge company that, after years of heavy investment and significant losses, achieves profitability thanks to strong revenue growth and relative containment of its expenses. At last, Alphabet can put DeepMind among the companies that, under its umbrella, generate revenue. From red to black in just a few years. When all is said and done, it is fairly common for pioneering companies like this one to spend long periods investing and incurring heavy losses.
Big Tech & Their Favourite Deep Learning Techniques
Every week, the top AI labs globally — Google, Facebook, Microsoft, Apple, etc. — release tons of new research work, tools, datasets, models, libraries and frameworks in artificial intelligence (AI) and machine learning (ML).
Interestingly, they all seem to have picked a particular school of thought in deep learning. With time, this pattern is becoming more and more clear.
