Tag Archives: Image Recognition
Researchers use AI to deblur human faces in photos
We’ve all been there: You’re snapping pics with your phone — perhaps of a high-speed bike ride or of a hockey match — and don’t think to check whether the autofocus is in lockstep with the action. It isn’t, as you later discover, and you’re stuck with a gallery of unusably blurry photos.
In search of a solution, scientists at the Inception Institute of Artificial Intelligence in the United Arab Emirates, the Beijing Institute of Technology, and Stony Brook University developed an AI system that removes blur from images in post-production. Read More
The Secretive Company That Might End Privacy as We Know It
The New York Times has a long story about a little-known start-up, Clearview AI, that helps law enforcement match photos of unknown people to their online images — and “might lead to a dystopian future or something,” a backer says. Read More
#image-recognition, #privacy
Self-training with Noisy Student improves ImageNet classification
We present a simple self-training method that achieves 87.4% top-1 accuracy on ImageNet, which is 1.0% better than the state-of-the-art model that requires 3.5B weakly labeled Instagram images. On robustness test sets, it improves ImageNet-A top-1 accuracy from 16.6% to 74.2%, reduces ImageNet-C mean corruption error from 45.7 to 31.2, and reduces ImageNet-P mean flip rate from 27.8 to 16.1.
To achieve this result, we first train an EfficientNet model on labeled ImageNet images and use it as a teacher to generate pseudo labels for 300M unlabeled images. We then train a larger EfficientNet as a student model on the combination of labeled and pseudo-labeled images. We iterate this process by using the student as the new teacher. During the generation of the pseudo labels, the teacher is not noised, so that the pseudo labels are as accurate as possible. But during the learning of the student, we inject noise such as data augmentation, dropout, and stochastic depth into the student, so that the noised student is forced to learn harder from the pseudo labels. Read More
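The teacher/student loop described above can be sketched in miniature. This is a toy illustration, not the paper's EfficientNet pipeline: the "model" here is a nearest-centroid classifier, the "noise" is plain Gaussian input noise standing in for augmentation/dropout/stochastic depth, and all function names are hypothetical.

```python
import numpy as np

def fit_centroids(X, y, n_classes):
    """Toy 'model': one centroid per class (stand-in for EfficientNet)."""
    return np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])

def predict(centroids, X):
    """Assign each sample to its nearest class centroid."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def noisy_student_round(X_lab, y_lab, X_unlab, n_classes, noise_std, rng):
    # 1. Teacher is trained on labeled data and kept *clean* (no noise),
    #    so its pseudo labels are as accurate as possible.
    teacher = fit_centroids(X_lab, y_lab, n_classes)
    pseudo = predict(teacher, X_unlab)
    # 2. Student trains on labeled + pseudo-labeled data with noise
    #    injected into its inputs.
    X_all = np.concatenate([X_lab, X_unlab])
    y_all = np.concatenate([y_lab, pseudo])
    X_noised = X_all + rng.normal(0.0, noise_std, X_all.shape)
    # 3. The student becomes the next round's teacher.
    return fit_centroids(X_noised, y_all, n_classes)

rng = np.random.default_rng(0)
# Two well-separated Gaussian classes; few labels, many unlabeled points.
X_lab = np.concatenate([rng.normal(-2, 0.5, (5, 2)), rng.normal(2, 0.5, (5, 2))])
y_lab = np.array([0] * 5 + [1] * 5)
X_unlab = np.concatenate([rng.normal(-2, 0.5, (200, 2)),
                          rng.normal(2, 0.5, (200, 2))])

model = noisy_student_round(X_lab, y_lab, X_unlab, n_classes=2,
                            noise_std=0.3, rng=rng)
print(predict(model, np.array([[-2.0, -2.0], [2.0, 2.0]])))  # -> [0 1]
```

In the real method the student is also larger than the teacher, which this sketch omits; the point is only the clean-teacher / noised-student asymmetry.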
Going Beyond GAN? New DeepMind VAE Model Generates High Fidelity Human Faces
Generative adversarial networks (GANs) have become AI researchers’ “go-to” technique for generating photo-realistic synthetic images. Now, DeepMind researchers say that there may be a better option.
In a new paper, the Google-owned research company introduces its VQ-VAE 2 model for large scale image generation. The model is said to yield results competitive with state-of-the-art generative model BigGAN in synthesizing high-resolution images while delivering broader diversity and overcoming some native shortcomings of GANs. Read More
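The "VQ" in VQ-VAE refers to vector quantization: the encoder's continuous latents are snapped to the nearest entry of a learned codebook, giving a discrete representation. A minimal numpy sketch of that quantization step (function names and the tiny codebook are my own illustration, not DeepMind's code):

```python
import numpy as np

def vector_quantize(z, codebook):
    """Snap each latent vector in z (N, D) to its nearest codebook row (K, D)."""
    # Squared distance between every latent and every codebook vector.
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    idx = d.argmin(axis=1)          # one discrete code per latent
    return codebook[idx], idx       # quantized latents + their indices

codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, -1.0]])
z = np.array([[0.1, -0.1], [0.9, 1.2]])
zq, codes = vector_quantize(z, codebook)
print(codes)  # -> [0 1]
```

In training, the argmin is non-differentiable, so VQ-VAE-style models pass gradients through it with a straight-through estimator and learn a prior over the discrete codes; this sketch shows only the forward lookup.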
Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer
Many machine learning models operate on images, but ignore the fact that images are 2D projections formed by 3D geometry interacting with light, in a process called rendering. Enabling ML models to understand image formation might be key for generalization. However, due to an essential rasterization step involving discrete assignment operations, rendering pipelines are non-differentiable and thus largely inaccessible to gradient-based ML techniques. In this paper, we present DIB-R, a differentiable rendering framework which allows gradients to be analytically computed for all pixels in an image. Key to our approach is to view foreground rasterization as a weighted interpolation of local properties and background rasterization as a distance-based aggregation of global geometry. Our approach allows for accurate optimization over vertex positions, colors, normals, light directions and texture coordinates through a variety of lighting models. We showcase our approach in two ML applications: single-image 3D object prediction, and 3D textured object generation, both trained exclusively using 2D supervision. Our project website is: https://nv-tlabs.github.io/DIB-R/ Read More
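The "weighted interpolation of local properties" idea for foreground pixels can be illustrated with standard barycentric interpolation: a pixel's color is a smooth weighted sum of the triangle's vertex colors, so it admits gradients with respect to both vertex positions and attributes. This is a simplified sketch under that interpretation, not DIB-R's actual implementation (which also handles background pixels and occlusion):

```python
import numpy as np

def barycentric_weights(p, v0, v1, v2):
    """Barycentric coordinates of 2D point p w.r.t. triangle (v0, v1, v2)."""
    def cross(a, b):  # scalar 2D cross product (signed area * 2)
        return a[0] * b[1] - a[1] * b[0]
    area = cross(v1 - v0, v2 - v0)
    w0 = cross(v1 - p, v2 - p) / area
    w1 = cross(v2 - p, v0 - p) / area
    w2 = cross(v0 - p, v1 - p) / area
    return np.array([w0, w1, w2])  # weights sum to 1

def shade_pixel(p, verts, vert_colors):
    """Pixel value as a weighted interpolation of vertex colors --
    smooth in vertex positions and colors, hence differentiable."""
    w = barycentric_weights(p, *verts)
    return w @ vert_colors

verts = [np.array([0.0, 0.0]), np.array([1.0, 0.0]), np.array([0.0, 1.0])]
colors = np.array([[1.0, 0.0, 0.0],   # red
                   [0.0, 1.0, 0.0],   # green
                   [0.0, 0.0, 1.0]])  # blue
# Pixel at the triangle's centroid blends all three vertex colors equally.
print(shade_pixel(np.array([1 / 3, 1 / 3]), verts, colors))
```

Because each weight is a ratio of polynomial expressions in the vertex coordinates, gradients flow from pixel values back to the mesh, which is what the hard assignment in classical rasterization prevents.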
This is how Facebook’s AI looks for bad stuff
The vast majority of Facebook’s moderation is now done automatically by the company’s machine-learning systems, reducing the amount of harrowing content its moderators have to review. In its latest community standards enforcement report, published earlier this month, the company claimed that 98% of terrorist videos and photos are removed before anyone has the chance to see them, let alone report them. Read More
First-ever humanoid robot powered by cloud artificial intelligence
Who needs to struggle with a delicate sewing needle when there's now a robot that can thread it for you? The CloudMinds XR-1, a 5G humanoid robot with vision-controlled grasping tech capable of intricate manual tasks, interacted with guests at the Sprint exhibit at Mobile World Congress 2019 Los Angeles (MWC19).
The XR-1 is powered by cloud artificial intelligence (AI), one of the first robots of its kind, along with Sprint True Mobile 5G and proprietary vision-controlled grasping tech. That means it can not only thread a needle but also serve drinks, and it can be programmed for other tasks, including manufacturing. Read More
We See in 3D – So Should Our CNN Models
Summary: Autonomous vehicles (AVs) and many other systems that need to accurately perceive the world around them will be much better off when image classification moves from 2D to 3D. Here we examine the two leading approaches to 3D classification, point clouds and voxel grids.
One of the well-known problems in CNN image classification is that, because the classifier sees only a 2D image of an object, it won't recognize that same object when it's rotated. The workaround thus far has been to train on many different orthogonal views of the same object, which vastly expands the training data and training time required. Read More
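The two representations mentioned in the summary differ in shape: a point cloud is an unordered (N, 3) array of coordinates, while a voxel grid quantizes space into a fixed 3D occupancy lattice that a 3D CNN can convolve over. A minimal sketch of converting one into the other (the function and its parameters are illustrative, not from any particular library):

```python
import numpy as np

def voxelize(points, resolution, bounds=(-1.0, 1.0)):
    """Convert an (N, 3) point cloud into a binary occupancy voxel grid."""
    lo, hi = bounds
    grid = np.zeros((resolution,) * 3, dtype=bool)
    # Map each coordinate into cell indices [0, resolution) and mark occupied.
    idx = ((points - lo) / (hi - lo) * resolution).astype(int)
    idx = np.clip(idx, 0, resolution - 1)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return grid

# Point cloud: an unordered set of 3D points (a rotation acts on it directly).
cloud = np.array([[-0.9, -0.9, -0.9], [0.0, 0.0, 0.0], [0.9, 0.9, 0.9]])
grid = voxelize(cloud, resolution=4)
print(grid.shape, int(grid.sum()))  # (4, 4, 4) grid with 3 occupied cells
```

The trade-off the article is pointing at: voxel grids give CNNs a regular 3D structure but their memory grows cubically with resolution, while point clouds stay compact but need architectures that tolerate unordered input.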
These Machine Learning Techniques Make Google Lens A Success
Google Lens was introduced a couple of years ago as part of Google's move to spearhead the 'AI first' products movement. Now, with advances in machine learning techniques, especially in image processing and NLP, Google Lens has scaled to new heights. Here we take a look at a few of the algorithmic solutions that power Google Lens:
Lens uses computer vision, machine learning and Google’s Knowledge Graph to let people turn the things they see in the real world into a visual search box, enabling them to identify objects like plants and animals, or to copy and paste text from the real world into their phone. Read More