Researchers develop AI that reads lips from video footage

AI and machine learning systems capable of reading lips from video footage aren't anything new. Back in 2016, researchers from Google and the University of Oxford detailed a system that could annotate video footage with 46.8% accuracy, outperforming a professional human lip-reader’s 12.4% accuracy. But even state-of-the-art systems struggle to overcome ambiguities in lip movements, which keeps their performance from surpassing that of audio-based speech recognition.

In pursuit of a more performant system, researchers at Alibaba, Zhejiang University, and the Stevens Institute of Technology devised a method dubbed Lip by Speech (LIBS), which uses features extracted from speech recognizers as complementary cues for the lip reader. They say it achieves industry-leading accuracy on two benchmarks, besting the baseline by margins of 7.66% and 2.75% in character error rate. Read More
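To make the idea concrete, here is a minimal sketch of what "using speech-recognizer features as complementary cues" can look like in practice: a lip-reading model is trained on transcript labels while also being pulled toward hidden features produced by a pretrained speech recognizer. The module names, dimensions, and the single feature-matching loss below are illustrative assumptions, not the authors' actual LIBS architecture.

```python
# Hypothetical distillation-style sketch (PyTorch): a lip-reading "student"
# learns from character labels plus features of a speech-recognizer "teacher".
import torch
import torch.nn as nn
import torch.nn.functional as F


class LipReader(nn.Module):
    """Toy video encoder that outputs per-frame features and character logits."""

    def __init__(self, feat_dim=256, vocab_size=40):
        super().__init__()
        self.encoder = nn.GRU(input_size=512, hidden_size=feat_dim, batch_first=True)
        self.classifier = nn.Linear(feat_dim, vocab_size)

    def forward(self, video_feats):
        hidden, _ = self.encoder(video_feats)   # (batch, time, feat_dim)
        logits = self.classifier(hidden)        # (batch, time, vocab_size)
        return hidden, logits


def training_step(lip_reader, video_feats, teacher_feats, targets, alpha=0.5):
    """Supervised character loss plus an L2 term that pulls the lip reader's
    features toward the speech recognizer's features. Assumes the two feature
    sequences are already aligned in time and dimensionality."""
    hidden, logits = lip_reader(video_feats)
    ce_loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
    distill_loss = F.mse_loss(hidden, teacher_feats)
    return ce_loss + alpha * distill_loss


if __name__ == "__main__":
    model = LipReader()
    video_feats = torch.randn(2, 75, 512)       # fake per-frame visual features
    teacher_feats = torch.randn(2, 75, 256)     # fake speech-recognizer features
    targets = torch.randint(0, 40, (2, 75))     # fake per-frame character labels
    loss = training_step(model, video_feats, teacher_feats, targets)
    print(loss.item())
```

The real method reportedly distills speech information at multiple granularities; this sketch only shows the simplest single-level variant of that idea.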

#nlp, #voice

Chinese Public AI R&D Spending: Provisional Findings

China aims to become “the world’s primary AI innovation center” by 2030. Toward that end, the Chinese government is spending heavily on AI research and development (R&D). This memo provides a provisional, open-source estimate of China’s spending.

We assess with low to moderate confidence that China’s public investment in AI R&D was on the order of a few billion dollars in 2018. With higher confidence, we assess that China’s government is not investing tens of billions of dollars annually in AI R&D, as some have suggested. Read More

#china-ai