Sign language recognition using deep learning

TL;DR This work presents a dual-camera, first-person vision translation system built on convolutional neural networks. A prototype was developed to recognize 24 gestures. The vision system consists of a head-mounted camera and a chest-mounted camera, and the machine learning model consists of two convolutional neural networks, one for each camera. Read More
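The summary says the model uses one network per camera but does not say how the two outputs are combined. The sketch below is one plausible reading, not the authors' method: it assumes each network emits 24 class scores (logits) and fuses them by a weighted average of their softmax probabilities. The function names, the `head_weight` parameter, and the stand-in logits are all hypothetical.

```python
import math

NUM_GESTURES = 24  # taken from the summary; everything else is illustrative


def softmax(logits):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_predictions(head_logits, chest_logits, head_weight=0.5):
    """Assumed late fusion: weighted average of the two cameras' probabilities."""
    if len(head_logits) != len(chest_logits):
        raise ValueError("both networks must score the same set of gestures")
    head_probs = softmax(head_logits)
    chest_probs = softmax(chest_logits)
    fused = [
        head_weight * h + (1.0 - head_weight) * c
        for h, c in zip(head_probs, chest_probs)
    ]
    best = max(range(len(fused)), key=fused.__getitem__)
    return best, fused


# Stand-in logits in place of real CNN outputs (hypothetical values).
head_logits = [0.0] * NUM_GESTURES
chest_logits = [0.0] * NUM_GESTURES
head_logits[3] = 2.0   # head camera favours gesture 3
chest_logits[3] = 1.0  # chest camera mildly agrees...
chest_logits[7] = 1.5  # ...but slightly prefers gesture 7

gesture, probs = fuse_predictions(head_logits, chest_logits)
print("predicted gesture index:", gesture)
```

Averaging probabilities lets a confident camera outvote an uncertain one; a real system might instead concatenate features from both networks and train a joint classifier.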

#image-recognition, #nlp, #vision

Adventures in PyTorch — Image classification with CalTech Birds 200 — Introduction

This series explores PyTorch, the neural network and machine learning framework from Facebook AI Research (FAIR). The articles apply it to an image classification problem: identifying 200 species of North American birds from the CalTech 200 birds dataset, using various CNN architectures including GoogLeNet, ResNet152 and ResNeXt101, among others. Read More

#image-recognition, #python

An AI Learned To See Through Obstructions!

Read More

#image-recognition, #videos

Neuroevolution of Self-Interpretable Agents

Inattentional blindness is the psychological phenomenon that causes one to miss things in plain sight. It is a consequence of the selective attention in perception that lets us remain focused on important parts of our world without distraction from irrelevant details. Motivated by selective attention, we study the properties of artificial agents that perceive the world through the lens of a self-attention bottleneck. By constraining access to only a small fraction of the visual input, we show that their policies are directly interpretable in pixel space. We find neuroevolution ideal for training self-attention architectures for vision-based reinforcement learning (RL) tasks, allowing us to incorporate modules that can include discrete, non-differentiable operations which are useful for our agent. We argue that self-attention has properties similar to indirect encoding, in the sense that large implicit weight matrices are generated from a small number of key-query parameters, thus enabling our agent to solve challenging vision-based tasks with at least 1000x fewer parameters than existing methods. Since our agent attends only to task-critical visual hints, it is able to generalize to environments where task-irrelevant elements are modified while conventional methods fail. Read More
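To make the self-attention bottleneck concrete, here is a minimal sketch of how a small pair of key and query projections can yield a large implicit patch-to-patch attention matrix, which is then used to keep only a few patches. The abstract does not spell out the selection rule, so the column-sum voting and top-k step, the patch and projection sizes, and every name below are assumptions for illustration, with random numbers standing in for image patches and learned weights.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_patches(patches, w_key, w_query, k):
    """Rank patches by the attention they receive and keep the k highest."""
    keys = matmul(patches, w_key)       # N x d
    queries = matmul(patches, w_query)  # N x d
    d = len(w_key[0])
    queries_t = [list(col) for col in zip(*queries)]  # d x N
    scores = matmul(keys, queries_t)                  # N x N implicit matrix
    attention = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    # Assumed voting rule: a patch's importance is the sum of its column.
    votes = [sum(col) for col in zip(*attention)]
    ranked = sorted(range(len(votes)), key=votes.__getitem__, reverse=True)
    return ranked[:k], votes


rng = random.Random(0)
N, D_IN, D, K = 16, 12, 4, 3  # hypothetical sizes
patches = [[rng.uniform(-1, 1) for _ in range(D_IN)] for _ in range(N)]
w_key = [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(D_IN)]
w_query = [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(D_IN)]

selected, votes = top_k_patches(patches, w_key, w_query, K)
print("kept patches:", selected)
print("learned parameters:", 2 * D_IN * D, "| implicit attention entries:", N * N)
```

The sort and top-k step is discrete and has no useful gradient, which is the kind of operation the abstract says neuroevolution can accommodate.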

#image-recognition, #reinforcement-learning, #vision

NIST Launches Investigation of Face Masks’ Effect on Face Recognition Software

Algorithms created before the pandemic generally perform less accurately with digitally masked faces.

Now that so many of us are covering our faces to help reduce the spread of COVID-19, how well do face recognition algorithms identify people wearing masks? The answer, according to a preliminary study by the National Institute of Standards and Technology (NIST), is with great difficulty. Even the best of the 89 commercial facial recognition algorithms tested had error rates between 5% and 50% in matching digitally applied face masks with photos of the same person without a mask.

The results were published today as a NIST Interagency Report (NISTIR 8311), the first in a planned series from NIST’s Face Recognition Vendor Test (FRVT) program on the performance of face recognition algorithms on faces partially covered by protective masks. Read More

#image-recognition

Learning to Cartoonize Using White-box Cartoon Representations

This paper presents an approach for image cartoonization. By observing the cartoon painting behavior and consulting artists, we propose to separately identify three white-box representations from images: the surface representation that contains a smooth surface of cartoon images, the structure representation that refers to the sparse color-blocks and flattened global content in the celluloid style workflow, and the texture representation that reflects high-frequency texture, contours, and details in cartoon images. A Generative Adversarial Network (GAN) framework is used to learn the extracted representations and to cartoonize images.

The learning objectives of our method are separately based on each extracted representation, making our framework controllable and adjustable. This enables our approach to meet artists’ requirements in different styles and diverse use cases. Qualitative comparisons and quantitative analyses, as well as user studies, have been conducted to validate the effectiveness of this approach, and our method outperforms previous methods in all comparisons. Finally, the ablation study demonstrates the influence of each component in our framework. Read More

#gans, #image-recognition

How an AI graphic designer convinced clients it was human

Nikolay Ironov had been working as a graphic designer for more than a year before he revealed his secret.

As an employee of Art. Lebedev Studio — Russia’s largest design company — Ironov had already worked on more than 20 commercial projects, creating everything from beer bottle labels to startup logos.

But Ironov was not the person he claimed to be. In fact, the designer was not a person at all. Read More

#image-recognition, #nlp, #vfx

Deepfake used to attack activist couple shows new disinformation frontier

Oliver Taylor, a student at England’s University of Birmingham, is a twenty-something with brown eyes, light stubble, and a slightly stiff smile.

Online profiles describe him as a coffee lover and politics junkie who was raised in a traditional Jewish home. His half dozen freelance editorials and blog posts reveal an active interest in anti-Semitism and Jewish affairs, with bylines in the Jerusalem Post and the Times of Israel.

The catch? Oliver Taylor seems to be an elaborate fiction. Read More

#fake, #image-recognition

Google’s quiet experiments may lead to smart tattoos, holographic glasses

A simple pair of sunglasses that projects holographic icons. A smartwatch that has a digital screen but analog hands. A temporary tattoo that, when applied to your skin, transforms your body into a living touchpad. A virtual reality controller that lets you pick up objects in digital worlds and feel their weight as you swing them around. Those are some of the projects Google has quietly been developing or funding, according to white papers and demo videos, in an effort to create the next generation of wearable technology devices. Read More

#image-recognition, #iot

Image Search with Text Feedback by Visiolinguistic Attention Learning

Image search with text feedback has promising impacts in various real-world applications, such as e-commerce and internet search. Given a reference image and text feedback from the user, the goal is to retrieve images that not only resemble the input image, but also change certain aspects in accordance with the given text. This is a challenging task as it requires the synergistic understanding of both image and text. In this work, we tackle this task with a novel Visiolinguistic Attention Learning (VAL) framework. Specifically, we propose a composite transformer that can be seamlessly plugged into a CNN to selectively preserve and transform the visual features conditioned on language semantics. By inserting multiple composite transformers at varying depths, VAL is incentivized to encapsulate the multi-granular visiolinguistic information, thus yielding an expressive representation for effective image search. We conduct comprehensive evaluation on three datasets: Fashion200k, Shoes and FashionIQ. Extensive experiments show our model exceeds existing approaches on all datasets, demonstrating consistent superiority in coping with various kinds of text feedback, including attribute-like and natural language descriptions. Read More
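The retrieval task itself can be illustrated with a toy example. The sketch below is not VAL: it replaces the composite transformer with a naive stand-in (adding a text embedding to the reference image embedding) and then ranks a gallery by cosine similarity. The three-dimensional embeddings, the gallery items, and the function names are all hypothetical, and the vectors are assumed to be non-zero.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_gallery(reference, text_feedback, gallery):
    """Compose a query (naive addition, a stand-in) and sort the gallery by similarity."""
    query = [r + t for r, t in zip(reference, text_feedback)]
    scored = [(name, cosine(query, emb)) for name, emb in gallery.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy embedding axes: [is a shoe, is red, is blue]
reference = [1.0, 1.0, 0.0]       # reference image: a red shoe
text_feedback = [0.0, -1.0, 1.0]  # feedback: "make it blue"
gallery = {
    "red_shoe": [1.0, 1.0, 0.0],
    "blue_shoe": [1.0, 0.0, 1.0],
    "blue_hat": [0.0, 0.0, 1.0],
}

ranking = rank_gallery(reference, text_feedback, gallery)
for name, score in ranking:
    print(name, round(score, 3))
```

The top result keeps what the reference image got right (it is still a shoe) while applying the requested change (it is now blue), which is the behaviour the abstract describes as the goal.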

#big7, #image-recognition, #nlp