PeopleLens: Using AI to support social interaction between children who are blind and their peers

Read More

#image-recognition, #videos

CSLIM Brings AI-Generated Art to ‘The World,’ or Vice Versa

Welcome to the world of tomorrow, where artificial intelligence can create East Asian landscape paintings that rival those of 11th-century masters, with ownership that can be verified on the blockchain. Such is the vision of South Korean generative artist CSLIM, whose upcoming NFT collection will feature 5,000 artworks that the engineer and developer attests were created by an AI trained via machine learning on 80,000 classical East Asian landscape paintings from the 6th through 13th centuries. Exquisitely combining cultural tradition and cutting-edge technology, “The World” will drop on Feb. 21, only at Crypto.com/NFT. Read More

#blockchain, #image-recognition

I asked an AI to paint 10 famous sci-fi book titles in one minute. Here are the results.

Fifty years ago, top scientists believed AI would never be able to beat humans at chess.

We all know how that turned out.

The newest goalpost is art. As an AI artist, I’m routinely told that my art isn’t real, or that it lacks humanity because it’s machine generated.

True, these claims have some merit. But I would venture that most people are simply uncomfortable with the notion that AI is starting to produce art faster and better than humans.

After my last post on AI art blew up, it naturally attracted a fair amount of similar criticism. Much of it concerned a supposed lack of diversity in the samples (apparently all of the landscapes looked the same).

To respond to this criticism, I decided to run an A100 GPU for one minute. Read More

#image-recognition

People Trust Deepfake Faces Generated by AI More Than Real Ones, Study Finds

The proliferation of deepfake technology is raising concerns that AI could start to warp our sense of shared reality. New research suggests AI-synthesized faces don’t simply dupe us into thinking they’re real people; we actually trust them more than our fellow humans.

In 2018, Nvidia wowed the world with an AI that could churn out ultra-realistic photos of people who don’t exist. Its researchers relied on a type of algorithm known as a generative adversarial network (GAN), which pits two neural networks against each other, one trying to spot fakes and the other trying to generate more convincing ones. Given enough time, GANs can generate remarkably good counterfeits.
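The adversarial setup is easy to sketch. Below is a minimal, illustrative PyTorch training step, not Nvidia’s StyleGAN code: the discriminator D learns to separate real images from the generator G’s output, while G learns to fool D. All network shapes and names here are placeholders.

```python
# Minimal GAN training step (illustrative, not StyleGAN): D learns to
# spot fakes, G learns to produce more convincing ones.
import torch
import torch.nn as nn

latent_dim = 128
G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                  nn.Linear(256, 784), nn.Tanh())          # toy generator
D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1))                       # toy discriminator

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real):                        # real: (batch, 784) images
    batch = real.size(0)
    fake = G(torch.randn(batch, latent_dim))

    # Discriminator step: push real images toward label 1, fakes toward 0.
    d_loss = (bce(D(real), torch.ones(batch, 1)) +
              bce(D(fake.detach()), torch.zeros(batch, 1)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: make D classify its samples as real.
    g_loss = bce(D(fake), torch.ones(batch, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```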

Since then, capabilities have improved considerably, with some worrying implications: enabling scammers to trick people, making it possible to splice people into porn movies without their consent, and undermining trust in online media. While it’s possible to use AI itself to spot deepfakes, tech companies’ failures to effectively moderate much less complicated material suggest this won’t be a silver bullet. Read More

#fake, #image-recognition

FILM: Frame Interpolation for Large Scene Motion

TensorFlow 2 implementation of our high-quality frame interpolation neural network. We present a unified single-network approach that doesn’t use additional pre-trained networks, such as optical flow or depth estimators, and yet achieves state-of-the-art results. We use a multi-scale feature extractor that shares the same convolution weights across the scales. Our model is trainable from frame triplets alone. Read More
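For a sense of how the released model is used in practice, here is a minimal sketch that loads FILM from TensorFlow Hub and synthesizes the midpoint frame between two images. The endpoint URL and the input signature (keys 'x0', 'x1', 'time') reflect the public release as I recall it; verify both against the repository before relying on them.

```python
# Sketch: midpoint frame interpolation with the released FILM model.
# Hub endpoint and input signature are assumptions; check the repo.
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub

model = hub.load("https://tfhub.dev/google/film/1")

def mid_frame(frame0: np.ndarray, frame1: np.ndarray) -> np.ndarray:
    """frame0/frame1: float32 RGB arrays in [0, 1] with shape (H, W, 3)."""
    inputs = {
        'x0': tf.expand_dims(frame0, 0),            # add batch dimension
        'x1': tf.expand_dims(frame1, 0),
        'time': tf.constant([[0.5]], tf.float32),   # 0.5 = temporal midpoint
    }
    return model(inputs)['image'][0].numpy()        # interpolated frame
```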

#big7, #devops, #image-recognition

Fake It Till You Make It

We demonstrate that it is possible to perform face-related computer vision in the wild using synthetic data alone.

The community has long enjoyed the benefits of synthesizing training data with graphics, but the domain gap between real and synthetic data has remained a problem, especially for human faces. Researchers have tried to bridge this gap with data mixing, domain adaptation, and domain-adversarial training, but we show that it is possible to synthesize data with minimal domain gap, so that models trained on synthetic data generalize to real in-the-wild datasets.

We describe how to combine a procedurally-generated parametric 3D face model with a comprehensive library of hand-crafted assets to render training images with unprecedented realism and diversity. We train machine learning systems for face-related tasks such as landmark localization and face parsing, showing that synthetic data can both match real data in accuracy and open up new approaches where manual labelling would be impossible. Read More
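To make the training recipe concrete, here is a purely illustrative sketch of supervised landmark regression on synthetic renders, where the renderer supplies perfect labels for free. This is not the paper’s code; the network, landmark count, and data are placeholders.

```python
# Illustrative only (not the paper's code): landmark regression trained
# purely on synthetic renders, whose labels come free from the renderer.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

NUM_LANDMARKS = 68                           # placeholder landmark count

net = nn.Sequential(                         # tiny CNN stand-in for a backbone
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(64, NUM_LANDMARKS * 2),        # (x, y) per landmark
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

images = torch.rand(256, 3, 128, 128)             # placeholder renders
landmarks = torch.rand(256, NUM_LANDMARKS * 2)    # placeholder ground truth
loader = DataLoader(TensorDataset(images, landmarks), batch_size=32)

for img, gt in loader:                       # one epoch of supervised training
    loss = nn.functional.mse_loss(net(img), gt)
    opt.zero_grad(); loss.backward(); opt.step()
```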

Dataset

#big7, #fake, #image-recognition

Corsight’s Upcoming DNA to FACE: ‘Terrifying’ Warns Privacy Expert

Corsight plans to release a new product that combines DNA and face recognition technology and could have significant law enforcement and privacy implications.

In this report, we examine Corsight’s product roadmap for “DNA to FACE,” presented at the 2021 Imperial Capital Investors Conference, possible use cases for the technology, and warnings from a privacy expert.

IPVM collaborated with MIT Technology Review on this report; see the MIT Technology Review article: This company says it’s developing a system that can recognize your face from just your DNA. Read More

#image-recognition, #privacy

TransformerFusion

Monocular RGB Scene Reconstruction using Transformers

We introduce TransformerFusion, a transformer-based 3D scene reconstruction approach. From an input monocular RGB video, the video frames are processed by a transformer network that fuses the observations into a volumetric feature grid representing the scene; this feature grid is then decoded into an implicit 3D scene representation. Key to our approach is the transformer architecture that enables the network to learn to attend to the most relevant image frames for each 3D location in the scene, supervised only by the scene reconstruction task. Features are fused in a coarse-to-fine fashion, storing fine-level features only where needed, requiring lower memory storage and enabling fusion at interactive rates. The feature grid is then decoded to a higher-resolution scene reconstruction, using an MLP-based surface occupancy prediction from interpolated coarse-to-fine 3D features. Our approach results in an accurate surface reconstruction, outperforming state-of-the-art multi-view stereo depth estimation methods, fully-convolutional 3D reconstruction approaches, and approaches using LSTM- or GRU-based recurrent networks for video sequence fusion. Read More
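The core mechanism, per-location attention over frame features, is easy to sketch. Below is a schematic PyTorch illustration, not the authors’ implementation: each voxel query attends over per-frame features, and an MLP decodes the fused feature into surface occupancy. All dimensions and names are invented for the example.

```python
# Schematic illustration (not the authors' code): voxel queries attend
# over per-frame image features; an MLP decodes surface occupancy.
import torch
import torch.nn as nn

dim = 64
attn = nn.MultiheadAttention(embed_dim=dim, num_heads=4, batch_first=True)
occupancy_mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, 1))

num_voxels, num_frames = 4096, 8
voxel_queries = torch.randn(1, num_voxels, dim)  # one query per grid cell
frame_feats = torch.randn(1, num_frames, dim)    # features from RGB frames

# Each 3D location learns which frames are most relevant to it.
fused, attn_weights = attn(voxel_queries, frame_feats, frame_feats)
occupancy = torch.sigmoid(occupancy_mlp(fused))  # (1, num_voxels, 1)
```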

#image-recognition

Mobile-Former: Bridging MobileNet and Transformer

We present Mobile-Former, a parallel design of MobileNet and transformer with a two-way bridge in between. This structure leverages the advantages of MobileNet at local processing and of the transformer at global interaction, and the bridge enables bidirectional fusion of local and global features. Unlike recent work on vision transformers, the transformer in Mobile-Former contains very few tokens (e.g. six or fewer) that are randomly initialized to learn global priors, resulting in low computational cost. Combined with the proposed lightweight cross-attention that models the bridge, Mobile-Former is not only computationally efficient but also has more representational power. It outperforms MobileNetV3 in the low-FLOP regime from 25M to 500M FLOPs on ImageNet classification. For instance, Mobile-Former achieves 77.9% top-1 accuracy at 294M FLOPs, gaining 1.3% over MobileNetV3 while saving 17% of the computation. When transferred to object detection, Mobile-Former outperforms MobileNetV3 by 8.6 AP in the RetinaNet framework. Furthermore, we build an efficient end-to-end detector by replacing the backbone, encoder, and decoder in DETR with Mobile-Former; it outperforms DETR by 1.1 AP while saving 52% of the computational cost and 36% of the parameters. Read More
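The two-way bridge reduces to a pair of cross-attention calls, sketched below in schematic PyTorch. This is not the released code; the dimensions, token count, and depthwise convolution are placeholders standing in for the real MobileNet blocks.

```python
# Schematic sketch (not the released code): a few learned global tokens
# exchange information with a MobileNet-style feature map via two
# cross-attention "bridges".
import torch
import torch.nn as nn

dim, num_tokens = 64, 6                      # paper uses six or fewer tokens
tokens = nn.Parameter(torch.randn(1, num_tokens, dim))  # learned global tokens

local = nn.Conv2d(dim, dim, 3, padding=1, groups=dim)   # depthwise, MobileNet-style
mobile_to_former = nn.MultiheadAttention(dim, 4, batch_first=True)
former_to_mobile = nn.MultiheadAttention(dim, 4, batch_first=True)

x = torch.randn(1, dim, 14, 14)              # feature map from the CNN branch
seq = local(x).flatten(2).transpose(1, 2)    # (B, H*W, dim) for attention

g, _ = mobile_to_former(tokens, seq, seq)    # tokens gather global context
seq, _ = former_to_mobile(seq, g, g)         # local features read it back
x = seq.transpose(1, 2).reshape(1, dim, 14, 14)
```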

#image-recognition

Nvidia’s upgraded AI art tool turned my obscure squiggles into a masterpiece

It’s incredible, the things we can do with AI nowadays. For artists looking to integrate artificial intelligence into their workflow, ever more advanced tools are popping up all over the net. One such tool is Nvidia Canvas, which has just been updated with the more powerful GauGAN2 AI model, replacing the original GauGAN, along with loads of new features.

The Nvidia Canvas software is available for free to anyone with an Nvidia RTX graphics card. This is because the software uses the tensor cores in your GPU to let the AI do its job. Read More

#gans, #image-recognition, #nvidia