Tom Cruise deepfake creator says public shouldn’t be worried about ‘one-click fakes’

Weeks of work and a top impersonator were needed to make the viral clips

When a series of spookily convincing Tom Cruise deepfakes went viral on TikTok, some suggested it was a chilling sign of things to come, a harbinger of an era where AI will let anyone make fake videos of anyone else. The videos’ creator, though, Belgian VFX specialist Chris Ume, says this is far from the case. Speaking to The Verge about his viral clips, Ume stresses the amount of time and effort that went into making each deepfake, as well as the importance of working with a top-flight Tom Cruise impersonator, Miles Fisher.

“You can’t do it by just pressing a button,” says Ume. “That’s important, that’s a message I want to tell people.” Each clip took weeks of work, he says, using the open-source DeepFaceLab algorithm as well as established video editing tools. “By combining traditional CGI and VFX with deepfakes, it makes it better. I make sure you don’t see any of the glitches.” Read More

#fake, #image-recognition

Self-supervised Pretraining of Visual Features in the Wild

Recently, self-supervised learning methods like MoCo [22], SimCLR [8], BYOL [20] and SwAV [7] have reduced the gap with supervised methods. These results have been achieved in a controlled environment, namely the highly curated ImageNet dataset. However, the premise of self-supervised learning is that it can learn from any random image and from any unbounded dataset. In this work, we explore whether self-supervision lives up to its expectation by training large models on random, uncurated images with no supervision. Our final SElf-supERvised (SEER) model, a RegNetY with 1.3B parameters trained on 1B random images with 512 GPUs, achieves 84.2% top-1 accuracy, surpassing the best self-supervised pretrained model by 1% and confirming that self-supervised learning works in a real-world setting. Interestingly, we also observe that self-supervised models are good few-shot learners, achieving 77.9% top-1 with access to only 10% of ImageNet. Read More

#big7, #image-recognition

Intel and EXOS Pilot 3D Athlete Tracking with Pro Football Hopefuls

Read More

#image-recognition, #videos

New AI ‘Deep Nostalgia’ brings old photos, including very old ones, to life

It seems like a nice idea in theory but it’s a tiny bit creepy as well

An AI-powered service called Deep Nostalgia that animates still photos has become the main character on Twitter this fine Sunday, as people try to create the creepiest fake “video” possible, apparently.

The Deep Nostalgia service, offered by online genealogy company MyHeritage, uses AI licensed from D-ID to create the effect that a still photo is moving. It’s kinda like the iOS Live Photos feature, which adds a few seconds of video to help smartphone photographers find the best shot. Read More

#image-recognition

VinVL: Making Visual Representations Matter in Vision-Language Models

This paper presents a detailed study of improving visual representations for vision language (VL) tasks and develops an improved object detection model to provide object-centric representations of images. Compared to the most widely used bottom-up and top-down model [2], the new model is bigger, better-designed for VL tasks, and pre-trained on much larger training corpora that combine multiple public annotated object detection datasets. Therefore, it can generate representations of a richer collection of visual objects and concepts. While previous VL research focuses mainly on improving the vision-language fusion model and leaves the object detection model improvement untouched, we show that visual features matter significantly in VL models. In our experiments we feed the visual features generated by the new object detection model into a Transformer-based VL fusion model OSCAR [21], and utilize an improved approach OSCAR+ to pre-train the VL model and fine-tune it on a wide range of downstream VL tasks. Our results show that the new visual features significantly improve the performance across all VL tasks, creating new state-of-the-art results on seven public benchmarks. We will release the new object detection model to the public. Read More

#image-recognition, #nlp

This is how we lost control of our faces

The largest ever study of facial-recognition data shows how much the rise of deep learning has fueled a loss of privacy.

In 1964, mathematician and computer scientist Woodrow Bledsoe first attempted the task of matching suspects’ faces to mugshots. He measured out the distances between different facial features in printed photographs and fed them into a computer program. His rudimentary successes would set off decades of research into teaching machines to recognize human faces.
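The idea of turning measured distances between facial features into something a program can compare can be illustrated with a minimal sketch. The landmark names, the coordinates, and the choice to scale every distance by the largest one are assumptions made for illustration, not details of Bledsoe's 1964 program.

```python
import math
from itertools import combinations

def distance_features(landmarks):
    """Return pairwise distances between landmarks, scaled by the largest one.

    `landmarks` maps a hypothetical feature name to an (x, y) point.
    The scaling is an assumption so that the size of the photo does not matter.
    """
    names = sorted(landmarks)
    dists = [math.dist(landmarks[a], landmarks[b]) for a, b in combinations(names, 2)]
    longest = max(dists)
    return [d / longest for d in dists]

def feature_gap(f1, f2):
    """Euclidean gap between two feature vectors; smaller means more alike."""
    return math.dist(f1, f2)

# Made-up coordinates: the same face at two scales, plus a different face.
face_a = {"left_eye": (30, 40), "right_eye": (70, 40), "nose": (50, 60), "mouth": (50, 80)}
face_a_large = {k: (2 * x, 2 * y) for k, (x, y) in face_a.items()}
face_b = {"left_eye": (25, 40), "right_eye": (75, 40), "nose": (50, 70), "mouth": (50, 85)}

same = feature_gap(distance_features(face_a), distance_features(face_a_large))
other = feature_gap(distance_features(face_a), distance_features(face_b))
```

Here `same` comes out at essentially zero because the enlarged photo has the same proportions, while `other` is clearly larger. A matching program in this spirit would pick the mugshot with the smallest gap.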

Now a new study shows just how much this enterprise has eroded our privacy. It hasn’t just fueled an increasingly powerful tool of surveillance. The latest generation of deep-learning-based facial recognition has completely disrupted our norms of consent. Read More

#image-recognition, #surveillance

China Develops Monkey Facial Recognition Using AI Technology

A research team from China’s Northwest University is using artificial intelligence (AI) and other new technologies to develop facial recognition technology for monkeys, with the aim of identifying thousands of Sichuan golden snub-nosed monkeys in the Qinling Mountains in Shaanxi Province.

Similar to current facial recognition technology for humans, the monkey version extracts each animal’s facial feature information to build an identity database of individual monkeys in the Qinling Mountains, the Xinhua News Agency reported.
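As a rough illustration of what an identity database built from extracted facial features could look like, here is a minimal sketch. The class name, the three-number feature vectors, the cosine-similarity matching, and the 0.9 threshold are all hypothetical choices, not details reported about the Northwest University system. A real system would get its feature vectors from a trained model.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

class IdentityDatabase:
    """Hypothetical store mapping an individual's name to a feature vector."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold  # assumed cut-off for accepting a match
        self.entries = {}

    def enroll(self, name, features):
        self.entries[name] = features

    def identify(self, features):
        """Return the best-matching enrolled name, or None if nothing is close enough."""
        best_name, best_score = None, -1.0
        for name, stored in self.entries.items():
            score = cosine_similarity(features, stored)
            if score > best_score:
                best_name, best_score = name, score
        return best_name if best_score >= self.threshold else None

# Made-up feature vectors for two enrolled individuals.
db = IdentityDatabase()
db.enroll("monkey_001", [0.9, 0.1, 0.3])
db.enroll("monkey_002", [0.1, 0.8, 0.5])

match = db.identify([0.88, 0.12, 0.31])   # close to monkey_001
unknown = db.identify([-0.5, 0.2, -0.7])  # unlike any enrolled individual
```

Returning `None` below the threshold lets a sketch like this flag an animal that has not been enrolled yet instead of forcing a wrong name onto it.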

“When monkey facial recognition technology is fully developed, we can integrate the technology into infrared camera sets in the mountains. The system will automatically recognize the monkeys, name them and analyze their behavior,” said Zhang He, a member of the Northwest University research team. Read More

#china-ai, #image-recognition, #surveillance

@TomerUllman: I had an AI (GPT3) generate 10 “thought experiments” (based on classic ones as input), and asked @WhiteBoardG to sketch them.

Read More
#image-recognition, #nlp

ArtEmis: Affective Language for Visual Art

We present a novel large-scale dataset and accompanying machine learning models aimed at providing a detailed understanding of the interplay between visual content, its emotional effect, and explanations for the latter in language. In contrast to most existing annotation datasets in computer vision, we focus on the affective experience triggered by visual artworks and ask the annotators to indicate the dominant emotion they feel for a given image and, crucially, to also provide a grounded verbal explanation for their emotion choice. As we demonstrate below, this leads to a rich set of signals for both the objective content and the affective impact of an image, creating associations with abstract concepts (e.g., “freedom” or “love”), or references that go beyond what is directly visible, including visual similes and metaphors, or subjective references to personal experiences. We focus on visual art (e.g., paintings, artistic photographs) as it is a prime example of imagery created to elicit emotional responses from its viewers. Our dataset, termed ArtEmis, contains 439K emotion attributions and explanations from humans, on 81K artworks from WikiArt. Building on this data, we train and demonstrate a series of captioning systems capable of expressing and explaining emotions from visual stimuli. Remarkably, the captions produced by these systems often succeed in reflecting the semantic and abstract content of the image, going well beyond systems trained on existing datasets. Read More

Demonstration Website

#image-recognition

Ai-Da, the first robot artist to exhibit herself

Ai-Da, a humanoid artificial intelligence robot, will exhibit a series of self-portraits that she created by “looking” into a mirror integrated with her camera eyes. Read More

#image-recognition, #robotics, #videos