Apple vs. Meta: Who Will Win the Battle for Your Face?

Meta CEO Mark Zuckerberg isn’t subtle about picking a fight with Apple over the future of the metaverse.

Facebook co-founder Mark Zuckerberg sees the metaverse as a wondrous new stage of tech’s advancement, filled with opportunities to work, play and communicate in ways entirely different from how we do today. You could be watching an IMAX movie on the moon, or you could be holding a work conference in a Pirates of the Caribbean-inspired tavern. Or maybe you could be rocking out on stage with your favorite band.

But while you look forward to seeing how the tech industry’s vision of the metaverse plays out, Zuckerberg is preparing for what appears to be the fight of his life. And it’ll be against Apple. Read More

#metaverse

PHENAKI: Variable Length Video Generation from Open Domain Textual Descriptions

We present Phenaki, a model capable of realistic video synthesis given a sequence of textual prompts. Generating videos from text is particularly challenging due to the computational cost, the limited quantity of high-quality text-video data and the variable length of videos. To address these issues, we introduce a new model for learning video representations which compresses the video to a small representation of discrete tokens. This tokenizer uses causal attention in time, which allows it to work with variable-length videos. To generate video tokens from text, we use a bidirectional masked transformer conditioned on pre-computed text tokens. The generated video tokens are subsequently de-tokenized to create the actual video. To address data issues, we demonstrate how joint training on a large corpus of image-text pairs as well as a smaller number of video-text examples can result in generalization beyond what is available in the video datasets. Compared to previous video generation methods, Phenaki can generate arbitrarily long videos conditioned on a sequence of prompts (i.e. time-variable text, or a story) in the open domain. To the best of our knowledge, this is the first time a paper has studied generating videos from time-variable prompts. In addition, compared to per-frame baselines, the proposed video encoder-decoder computes fewer tokens per video but results in better spatio-temporal consistency. Read More
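For readers who want a concrete picture of the two-stage design the abstract sketches (a discrete video tokenizer feeding a bidirectional masked transformer conditioned on text tokens), here is a minimal PyTorch sketch. All class names, layer sizes, the simple nearest-code quantizer and the confidence-based decoding loop are illustrative assumptions made for this example, not the paper’s actual implementation.

```python
# Illustrative sketch of the two-stage pipeline described in the abstract:
# (1) a discrete video tokenizer, (2) a bidirectional masked transformer that
# predicts video tokens conditioned on text tokens. Names, sizes and the
# quantization/decoding details are assumptions, not the paper's real code.
import torch
import torch.nn as nn


class CausalVideoTokenizer(nn.Module):
    """Toy stand-in for the video tokenizer: compresses each frame into a
    short sequence of discrete codebook ids. Encoding frames independently
    (no lookahead) keeps it usable on videos of any length."""

    def __init__(self, codebook_size=1024, embed_dim=128):
        super().__init__()
        self.patchify = nn.Conv2d(3, embed_dim, kernel_size=8, stride=8)
        self.codebook = nn.Embedding(codebook_size, embed_dim)

    def encode(self, video):
        # video: (B, T, 3, H, W) -> token ids: (B, T, tokens_per_frame)
        b, t, c, h, w = video.shape
        feats = self.patchify(video.reshape(b * t, c, h, w))     # (B*T, D, h', w')
        feats = feats.flatten(2).transpose(1, 2)                 # (B*T, N, D)
        flat = feats.reshape(-1, feats.size(-1))
        ids = torch.cdist(flat, self.codebook.weight).argmin(-1)  # nearest code
        return ids.reshape(b, t, -1)


class MaskedVideoTransformer(nn.Module):
    """Bidirectional transformer that fills in masked video tokens given text
    tokens (MaskGIT-style). Positional embeddings are omitted for brevity."""

    def __init__(self, codebook_size=1024, text_vocab=32000, dim=256, depth=4):
        super().__init__()
        self.mask_id = codebook_size                       # reserved [MASK] id
        self.video_embed = nn.Embedding(codebook_size + 1, dim)
        self.text_embed = nn.Embedding(text_vocab, dim)
        layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=depth)
        self.to_logits = nn.Linear(dim, codebook_size)

    def forward(self, video_ids, text_ids):
        # Joint sequence of text + (partially masked) video tokens, full
        # bidirectional attention; return logits only for video positions.
        x = torch.cat([self.text_embed(text_ids),
                       self.video_embed(video_ids)], dim=1)
        x = self.transformer(x)
        return self.to_logits(x[:, text_ids.size(1):])


tokenizer = CausalVideoTokenizer()
model = MaskedVideoTransformer().eval()

# Training side: tokenize a dummy 8-frame, 64x64 clip -> (1, 8, 64) ids.
train_ids = tokenizer.encode(torch.randn(1, 8, 3, 64, 64))

# Generation side: iterative parallel decoding for one prompt. Start fully
# masked and commit the most confident predictions over a few passes.
text_ids = torch.randint(0, 32000, (1, 16))              # stand-in text tokens
num_frames, tokens_per_frame = 8, 64
video_ids = torch.full((1, num_frames * tokens_per_frame),
                       model.mask_id, dtype=torch.long)

with torch.no_grad():
    for _ in range(4):
        logits = model(video_ids, text_ids)
        pred = logits.argmax(-1)
        masked = video_ids == model.mask_id
        conf = logits.softmax(-1).max(-1).values.masked_fill(~masked, -1.0)
        k = max(1, int(masked.sum().item() * 0.25))       # unmask top 25% left
        top = conf.topk(k, dim=-1).indices
        video_ids[0, top[0]] = pred[0, top[0]]
video_ids[video_ids == model.mask_id] = 0                 # fill any leftovers
# A decoder (not shown) would de-tokenize `video_ids` back into frames.
```

The point the sketch tries to convey is that, unlike an autoregressive decoder, a bidirectional masked transformer can predict many video tokens in parallel and commit only the most confident ones on each pass, which is part of what makes conditioning long, variable-length videos on a sequence of prompts tractable.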

#image-recognition, #nlp

AI-generated imagery is the new clip art as Microsoft adds DALL-E to its Office suite

Microsoft is adding AI-generated art to its suite of Office software with a new app named Microsoft Designer.

The app functions the same way as AI text-to-image models like DALL-E and Stable Diffusion, letting users type prompts to “instantly generate a variety of designs with minimal effort.” Microsoft says Designer can be used to create everything from greeting cards and social media posts to illustrations for PowerPoint presentations and logos for businesses.

Essentially, AI-generated imagery looks set to become the new clip art. Read More

#big7, #image-recognition, #nlp

Growth in AI and robotics research accelerates

It may not be unusual for burgeoning areas of science, especially those related to rapid technological changes in society, to take off quickly, but even by these standards the rise of artificial intelligence (AI) has been impressive. Together with robotics, AI represents an increasingly significant share of research output at various levels, as these charts show.

The number of AI and robotics papers published in the 82 high-quality science journals in the Nature Index (Count) has been rising year-on-year — so rapidly that it resembles an exponential growth curve. A similar increase is also happening more generally in journals and proceedings not included in the Nature Index, as is shown by data from the Dimensions database of research publications. Read More

#artificial-intelligence, #robotics