Herein, a human identification system for smart spaces called Vein‐ID (referred to as VID) is presented, which leverages the uniqueness of the vein patterns embedded in the dorsum of an individual’s hand. VID extracts vein patterns using depth information and infrared (IR) images, both obtained from a commodity depth camera. Two deep learning models (a CNN and stacked autoencoders) are presented for precisely identifying a target individual from a set of N enrolled users. VID also incorporates a strategy for identifying an intruder, that is, a person whose vein patterns are not included in the set of enrolled individuals. The performance of VID is evaluated on a comprehensive data set of approximately 17,500 images collected from 35 subjects. The tests reveal that VID can identify an individual with an average accuracy of over 99% from a group of up to 35 individuals. It is demonstrated that VID can detect intruders with an average accuracy of about 96%. The execution time for training and testing the two deep learning models on different hardware platforms is also investigated and the differences are reported. Read More
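The abstract does not detail the network architecture or the intruder-detection rule, but the overall setup can be illustrated with a minimal sketch: an N-way CNN classifier over cropped IR hand images, with a softmax-confidence threshold used to flag out-of-set users. The layer sizes, the `IR_SIZE` input resolution, and the `0.9` threshold below are illustrative assumptions, not values taken from the paper.

```python
# Hypothetical sketch: N-way vein-pattern classifier with a confidence
# threshold for intruder rejection. Architecture and threshold are
# illustrative assumptions, not the exact VID design from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

N_USERS = 35   # enrolled identities (as in the reported evaluation)
IR_SIZE = 64   # assumed resolution of the cropped IR hand image

class VeinCNN(nn.Module):
    def __init__(self, n_users: int = N_USERS):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * (IR_SIZE // 4) ** 2, n_users)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

def identify(model: VeinCNN, ir_image: torch.Tensor, threshold: float = 0.9):
    """Return the enrolled user index, or -1 ("intruder") when the
    top softmax probability falls below the threshold."""
    with torch.no_grad():
        probs = F.softmax(model(ir_image.unsqueeze(0)), dim=1)
        conf, user = probs.max(dim=1)
    return user.item() if conf.item() >= threshold else -1
```

Thresholding the top softmax probability is only one simple way to reject unenrolled individuals; the paper's actual intruder-detection strategy may differ.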
Switch Transformers: Scaling to Trillion Parameter Models With Simple and Efficient Sparsity
In deep learning, models typically reuse the same parameters for all inputs. Mixture of Experts (MoE) models defy this and instead select different parameters for each incoming example. The result is a sparsely-activated model – with an outrageous number of parameters – but a constant computational cost. However, despite several notable successes of MoE, widespread adoption has been hindered by complexity, communication costs, and training instability. We address these with the Switch Transformer. We simplify the MoE routing algorithm and design intuitive improved models with reduced communication and computational costs. Our proposed training techniques mitigate the instabilities, and we show large sparse models may be trained, for the first time, with lower precision (bfloat16) formats. We design models based off T5-Base and T5-Large (Raffel et al., 2019) to obtain up to 7x increases in pre-training speed with the same computational resources. These improvements extend into multilingual settings where we measure gains over the mT5-Base version across all 101 languages. Finally, we advance the current scale of language models by pre-training up to trillion parameter models on the “Colossal Clean Crawled Corpus”, and achieve a 4x speedup over the T5-XXL model. Read More
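The routing step can be made concrete with a small sketch: each token is scored by a learned router, dispatched to the single highest-scoring expert, and the expert's output is scaled by the router probability so the router receives gradients. The per-token-loop version below is illustrative only; it omits the capacity factor, load-balancing loss, expert parallelism, and bfloat16 details discussed in the paper.

```python
# Minimal sketch of top-1 ("switch") routing for a Mixture-of-Experts
# feed-forward layer. Not the reference implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwitchFFN(nn.Module):
    def __init__(self, d_model: int, d_ff: int, n_experts: int):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):                    # x: (tokens, d_model)
        gates = F.softmax(self.router(x), dim=-1)
        top_p, top_e = gates.max(dim=-1)     # exactly one expert per token
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = top_e == e
            if mask.any():
                # scale by the router probability so gradients reach the router
                out[mask] = top_p[mask].unsqueeze(-1) * expert(x[mask])
        return out

# Each token activates only one expert's parameters, so per-token compute
# stays roughly constant while total parameter count grows with n_experts.
layer = SwitchFFN(d_model=512, d_ff=2048, n_experts=8)
y = layer(torch.randn(10, 512))
```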
#performance
NSCAI Draft Final Report
Artificial Intelligence (AI) technologies promise to be the most powerful tools in generations for expanding knowledge, increasing prosperity, and enriching the human experience. The technologies will be the foundation of the innovation economy and a source of enormous power for countries that harness them. AI will fuel competition between governments and companies racing to field it. And it will be employed by nation states to pursue their strategic ambitions.
Americans have not yet seriously grappled with how profoundly the AI revolution will impact society, the economy, and national security. Recent AI breakthroughs, such as a computer defeating a human in the popular strategy game of Go, shocked other nations into action, but did not inspire the same response in the United States. Americans have not recognized the assertive role the government will have to play in ensuring the United States wins this innovation competition. And they have not contemplated the scale of public resources required to achieve it. Despite our private sector and university leadership in AI, the United States remains unprepared for the coming era.
The magnitude of the technological opportunity coincides with a moment of strategic vulnerability. China is a competitor possessing the might, talent, and ambition to challenge America’s technological leadership, military superiority, and its broader position in the world. AI is deepening the threat posed by cyber attacks and disinformation campaigns that Russia, China, and other state and non-state actors use to infiltrate our society, steal our data, and interfere in our democracy. Global crises exemplified in the global pandemic and climate change are expanding the definition of national security and crying out for innovative solutions. AI can help us navigate many of these challenges.
We are fortunate. The AI revolution is not a strategic surprise. We are experiencing its impact in our daily lives and can anticipate how research progress will translate into real world applications before we have to confront the full national security ramifications. This commission can warn of national security challenges and articulate the benefits, rather than explain why previous warnings were ignored and opportunities were missed. We still have a window to make the changes to build a safer and better future. Read More
Improving Zero-Shot Translation by Disentangling Positional Information
Multilingual neural machine translation has shown the capability of directly translating between language pairs unseen in training, i.e. zero-shot translation. Despite being conceptually attractive, it often suffers from low output quality. The difficulty of generalizing to new translation directions suggests the model representations are highly specific to those language pairs seen in training. We demonstrate that a main factor causing the language-specific representations is the positional correspondence to input tokens. We show that this can be easily alleviated by removing residual connections in an encoder layer. With this modification, we gain up to 18.5 BLEU points on zero-shot translation while retaining quality on supervised directions. The improvements are particularly prominent between related languages, where our proposed model outperforms pivot-based translation. Moreover, our approach allows easy integration of new languages, which substantially expands translation coverage. By thorough inspections of the hidden layer outputs, we show that our approach indeed leads to more language independent representations. Read More
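The proposed change is architecturally small; a minimal sketch of an encoder layer in which the residual connection around self-attention can be switched off for one chosen layer is given below. The dimensions, the post-norm arrangement, and the choice of which layer to modify are illustrative assumptions, not the authors' exact configuration.

```python
# Sketch: Transformer encoder layer with an optional residual connection
# around self-attention. Hyperparameters here are assumptions for illustration.
import torch
import torch.nn as nn

class EncoderLayer(nn.Module):
    def __init__(self, d_model: int, n_heads: int, d_ff: int,
                 keep_residual: bool = True):
        super().__init__()
        self.keep_residual = keep_residual
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.ffn = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                                 nn.Linear(d_ff, d_model))

    def forward(self, x):
        attn_out, _ = self.attn(x, x, x)
        # Dropping the residual here removes the direct positional
        # correspondence between this layer's outputs and the input tokens.
        x = self.norm1(x + attn_out if self.keep_residual else attn_out)
        return self.norm2(x + self.ffn(x))

# e.g. disable the residual only in one middle layer of a 6-layer encoder
layers = nn.ModuleList(
    [EncoderLayer(512, 8, 2048, keep_residual=(i != 3)) for i in range(6)]
)
```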
#nlp
ArtEmis: Affective Language for Visual Art
We present a novel large-scale dataset and accompanying machine learning models aimed at providing a detailed understanding of the interplay between visual content, its emotional effect, and explanations for the latter in language. In contrast to most existing annotation datasets in computer vision, we focus on the affective experience triggered by visual artworks and ask the annotators to indicate the dominant emotion they feel for a given image and, crucially, to also provide a grounded verbal explanation for their emotion choice. As we demonstrate below, this leads to a rich set of signals for both the objective content and the affective impact of an image, creating associations with abstract concepts (e.g., “freedom” or “love”), or references that go beyond what is directly visible, including visual similes and metaphors, or subjective references to personal experiences. We focus on visual art (e.g., paintings, artistic photographs) as it is a prime example of imagery created to elicit emotional responses from its viewers. Our dataset, termed ArtEmis, contains 439K emotion attributions and explanations from humans, on 81K artworks from WikiArt. Building on this data, we train and demonstrate a series of captioning systems capable of expressing and explaining emotions from visual stimuli. Remarkably, the captions produced by these systems often succeed in reflecting the semantic and abstract content of the image, going well beyond systems trained on existing datasets. Read More
Demonstration Website