The public can now explore more than 1.5 million historical newspaper images online, free of charge. Newspaper Navigator, the latest machine learning experience from Library of Congress Labs, lets users search visual content in American newspapers dating from 1789 to 1963.
… Through the creative ingenuity of Innovator in Residence Benjamin Lee and advances in machine learning, Newspaper Navigator now makes the images in these newspapers searchable by visual similarity. Read More
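For readers curious how search by visual similarity typically works under the hood, here is a minimal, generic sketch: embed each image with a pretrained CNN, then retrieve nearest neighbors in embedding space. This illustrates the general technique, not Newspaper Navigator's actual pipeline; the file paths and model choice are placeholder assumptions.

```python
# Generic embedding-based visual similarity search (illustrative sketch only).
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image
from sklearn.neighbors import NearestNeighbors

# Pretrained CNN with its classification head removed, used as a feature extractor.
backbone = models.resnet50(pretrained=True)
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(224), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def embed(paths):
    batch = torch.stack([preprocess(Image.open(p).convert("RGB")) for p in paths])
    return backbone(batch).numpy()

# Index a collection of images (hypothetical file paths), then retrieve the
# most visually similar images for a query image.
corpus_paths = ["img_0001.jpg", "img_0002.jpg", "img_0003.jpg"]
index = NearestNeighbors(n_neighbors=2, metric="cosine").fit(embed(corpus_paths))
dist, idx = index.kneighbors(embed(["query.jpg"]))
print([corpus_paths[i] for i in idx[0]])
```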
This unheard Steve Jobs tape is part of an amazing trove of tech history
When Steve Jobs demoed his NeXT computer at a 1988 user meeting, Charles Mann was there to record it—along with dozens of other talks by computing pioneers. Read More
How to Select the Right Machine Learning Algorithm
Seven key factors to consider when implementing an algorithm
For any given machine learning problem, numerous algorithms can be applied and multiple models can be generated. … Having a wealth of options is good, but deciding on which model to implement in production is crucial. … Here is the list of factors to consider when choosing an algorithm (a small benchmarking sketch follows the list):
- Interpretability
- The number of data points and features
- Data format
- Linearity of data
- Training time
- Prediction time
- Memory requirements
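To make the trade-offs concrete, here is a small scikit-learn sketch that measures three of these factors, training time, prediction time, and accuracy, for an interpretable linear model against a heavier nonlinear one. The dataset and candidate models are illustrative assumptions, not a recommendation.

```python
# Empirically compare candidate algorithms on training time, prediction
# time, and accuracy (illustrative sketch; swap in your own data/models).
import time
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=5000),   # interpretable, linear
    "random_forest": RandomForestClassifier(n_estimators=200),  # nonlinear, heavier
}

for name, model in candidates.items():
    t0 = time.perf_counter()
    model.fit(X_train, y_train)
    train_s = time.perf_counter() - t0

    t0 = time.perf_counter()
    preds = model.predict(X_test)
    predict_s = time.perf_counter() - t0

    print(f"{name}: acc={accuracy_score(y_test, preds):.3f} "
          f"train={train_s:.3f}s predict={predict_s:.4f}s")
```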
Uncovering the structure of clinical EEG signals with self-supervised learning
Supervised learning paradigms are often limited by the amount of labeled data that is available. This phenomenon is particularly problematic in clinically-relevant data, such as electroencephalography (EEG), where labeling can be costly in terms of specialized expertise and human processing time. Consequently, deep learning architectures designed to learn on EEG data have yielded relatively shallow models and performances at best similar to those of traditional feature-based approaches. However, in most situations, unlabeled data is available in abundance. By extracting information from this unlabeled data, it might be possible to reach competitive performance with deep neural networks despite limited access to labels. Approach: We investigated self-supervised learning (SSL), a promising technique for discovering structure in unlabeled data, to learn representations of EEG signals. Specifically, we explored two tasks based on temporal context prediction as well as contrastive predictive coding on two clinically-relevant problems: EEG-based sleep staging and pathology detection. We conducted experiments on two large public datasets with thousands of recordings and performed baseline comparisons with purely supervised and hand-engineered approaches. Main results: Linear classifiers trained on SSL-learned features consistently outperformed purely supervised deep neural networks in low-labeled data regimes while reaching competitive performance when all labels were available. Additionally, the embeddings learned with each method revealed clear latent structures related to physiological and clinical phenomena, such as age effects. Significance: We demonstrate the benefit of self-supervised learning approaches on EEG data. Our results suggest that SSL may pave the way to a wider use of deep learning models on EEG data. Read More
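As a rough illustration of the temporal-context-prediction idea behind one of the SSL tasks the paper explores, the sketch below trains an encoder to judge whether two EEG windows are temporally close (a simplified "relative positioning" setup). The window sizes, thresholds, and tiny encoder are assumptions for illustration, not the authors' architecture.

```python
# Simplified relative-positioning pretext task: windows close in time get
# label 1, distant windows label 0, and an encoder learns to tell them apart.
import torch
import torch.nn as nn

def sample_pairs(eeg, win=200, tau_pos=400, tau_neg=2000, n=256):
    """eeg: (channels, time). Returns window pairs and same-context labels."""
    C, T = eeg.shape
    x1, x2, y = [], [], []
    for _ in range(n):
        t1 = torch.randint(0, T - win, (1,)).item()
        if torch.rand(1).item() < 0.5:  # positive: second window nearby
            lo, hi = max(0, t1 - tau_pos), min(T - win, t1 + tau_pos)
            t2, label = torch.randint(lo, hi + 1, (1,)).item(), 1.0
        else:                           # negative: second window far away
            t2 = torch.randint(0, T - win, (1,)).item()
            while abs(t2 - t1) < tau_neg:
                t2 = torch.randint(0, T - win, (1,)).item()
            label = 0.0
        x1.append(eeg[:, t1:t1 + win]); x2.append(eeg[:, t2:t2 + win]); y.append(label)
    return torch.stack(x1), torch.stack(x2), torch.tensor(y)

encoder = nn.Sequential(nn.Conv1d(2, 16, 7), nn.ReLU(),
                        nn.AdaptiveAvgPool1d(1), nn.Flatten())
head = nn.Linear(16, 1)  # scores |h1 - h2| -> same-context logit
opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()), lr=1e-3)

eeg = torch.randn(2, 100_000)  # stand-in for a 2-channel recording
for step in range(10):
    x1, x2, y = sample_pairs(eeg)
    logits = head(torch.abs(encoder(x1) - encoder(x2))).squeeze(1)
    loss = nn.functional.binary_cross_entropy_with_logits(logits, y)
    opt.zero_grad(); loss.backward(); opt.step()
```

After pretraining, the encoder's features would be frozen and fed to a linear classifier, as in the paper's low-labeled-data evaluations.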
Does BERT Solve Commonsense Task via Commonsense Knowledge?
The success of pretrained contextualized language models such as BERT motivates a line of work that investigates linguistic knowledge inside such models in order to explain the huge improvement in downstream tasks. While previous work shows syntactic, semantic and word sense knowledge in BERT, little work has been done on investigating how BERT solves Commonsense QA tasks. In particular, it is an interesting research question whether BERT relies on shallow syntactic patterns or deeper commonsense knowledge for disambiguation. We propose two attention-based methods to analyze commonsense knowledge inside BERT, and the contribution of such knowledge for the model prediction. We find that attention heads successfully capture the structured commonsense knowledge encoded in ConceptNet, which helps BERT solve commonsense tasks directly. Fine-tuning further makes BERT learn to use the commonsense knowledge on higher layers. Read More
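For readers who want to poke at attention themselves, here is a minimal sketch of the general idea using the Hugging Face transformers library: extract per-head attention for a sentence requiring pronoun disambiguation and see which tokens "it" attends to. This illustrates attention inspection generally, not the paper's two specific methods.

```python
# Inspect which tokens BERT's attention heads focus on for an ambiguous "it".
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
model.eval()

sentence = "The trophy would not fit in the suitcase because it was too big."
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions: one tensor per layer, each (batch, heads, seq, seq).
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
it_pos = tokens.index("it")
last_layer = outputs.attentions[-1][0]          # (heads, seq, seq)
avg_from_it = last_layer[:, it_pos, :].mean(0)  # attention out of "it", head-averaged
top = avg_from_it.topk(3).indices.tolist()
print([tokens[i] for i in top])
```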
DeepSpeed: Extreme-scale model training for everyone
In February, we announced DeepSpeed, an open-source deep learning training optimization library, and ZeRO (Zero Redundancy Optimizer), a novel memory optimization technology in the library, which vastly advances large model training by improving scale, speed, cost, and usability.
… Today, we are happy to share our new advancements that not only push deep learning training to the extreme, but also democratize it for more people—from data scientists training on massive supercomputers to those training on low-end clusters or even on a single GPU. More specifically, DeepSpeed adds four new system technologies that further the AI at Scale initiative to innovate across Microsoft’s AI products and platforms. These offer extreme compute, memory, and communication efficiency, and they power model training with billions to trillions of parameters. Read More
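As a flavor of how DeepSpeed is wired into a training script, here is a minimal sketch enabling ZeRO stage 2 and fp16 through a config dictionary. The model and hyperparameters are placeholders, not a tuned setup; real runs are typically launched with the deepspeed CLI across GPUs, and older releases pass the config as a file path rather than a dict.

```python
# Minimal DeepSpeed + ZeRO wiring (illustrative placeholders throughout).
import torch
import deepspeed

ds_config = {
    "train_batch_size": 32,
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},  # partition optimizer state + gradients
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

model = torch.nn.Sequential(torch.nn.Linear(1024, 4096), torch.nn.ReLU(),
                            torch.nn.Linear(4096, 1024))

# deepspeed.initialize wraps the model and optimizer; training then uses the
# engine's backward()/step() instead of calling them on the optimizer directly.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model, model_parameters=model.parameters(), config=ds_config)

x = torch.randn(32, 1024).to(engine.device).half()
loss = engine(x).float().pow(2).mean()  # dummy loss for illustration
engine.backward(loss)
engine.step()
```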
Oracle open-sources Java machine learning library
Tribuo offers tools for building and deploying classification, clustering, and regression models in Java, along with interfaces to TensorFlow, XGBoost, and ONNX
Looking to meet enterprise needs in the machine learning space, Oracle is making its Tribuo Java machine learning library available free under an open source license. Read More
A deep learning model achieves super-human performance at Gran Turismo Sport
Over the past few decades, research teams worldwide have developed machine learning and deep learning techniques that can achieve human-comparable performance on a variety of tasks. Some of these models were also trained to play renowned board or video games, such as the ancient Chinese game Go or Atari arcade games, in order to further assess their capabilities and performance.
Researchers at the University of Zurich and Sony AI Zurich have recently tested the performance of a deep reinforcement learning-based approach trained to play Gran Turismo Sport, the renowned car-racing video game developed by Polyphony Digital and published by Sony Interactive Entertainment. Their findings, presented in a paper pre-published on arXiv, further highlight the potential of deep learning techniques for controlling cars in simulated environments. Read More
Pay Attention to Evolution: Time Series Forecasting with Deep Graph-Evolution Learning
Time-series forecasting is one of the most active research topics in predictive analysis. A still open gap in that literature is that statistical and ensemble learning approaches systematically present lower predictive performance than deep learning methods, as they generally disregard the data sequence aspect entangled with multivariate data represented in more than one time series. Conversely, this work presents a novel neural network architecture for time-series forecasting that combines the power of graph evolution with deep recurrent learning on distinct data distributions; we named our method Recurrent Graph Evolution Neural Network (ReGENN). The idea is to infer multiple multivariate relationships between co-occurring time series by assuming that the temporal data depends not only on inner variables and intra-temporal relationships (i.e., observations from itself) but also on outer variables and inter-temporal relationships (i.e., observations from other-selves). An extensive set of experiments was conducted comparing ReGENN with dozens of ensemble methods and classical statistical ones, showing sound improvement of up to 64.87%. We also present an analysis of the intermediate weights arising from ReGENN, showing that by looking at inter- and intra-temporal relationships simultaneously, time-series forecasting is substantially improved when paying attention to how multiple multivariate data synchronously evolve. Read More
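The core idea, mixing information across co-occurring series through a learned graph before recurrent modeling, can be illustrated with a toy model. The sketch below is not ReGENN itself; the softmax-normalized adjacency and single GRU are simplifying assumptions made for illustration.

```python
# Toy "graph + recurrence" forecaster: a learned inter-series adjacency lets
# each series borrow signal from its "other-selves" before a GRU models time.
import torch
import torch.nn as nn

class GraphRecurrentForecaster(nn.Module):
    def __init__(self, n_series, hidden=32):
        super().__init__()
        self.adj = nn.Parameter(torch.randn(n_series, n_series) * 0.01)
        self.gru = nn.GRU(input_size=n_series, hidden_size=hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_series)

    def forward(self, x):                        # x: (batch, time, n_series)
        mix = torch.softmax(self.adj, dim=-1)    # soft inter-series graph
        x = x + x @ mix.T                        # inject signals from other series
        h, _ = self.gru(x)
        return self.out(h[:, -1])                # one-step-ahead forecast

model = GraphRecurrentForecaster(n_series=8)
x = torch.randn(16, 24, 8)        # 16 windows of 24 steps over 8 series
print(model(x).shape)             # torch.Size([16, 8])
```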
DeepFaceDrawing: Deep Generation of Face Images from Sketches
Recent deep image-to-image translation techniques allow fast generation of face images from freehand sketches. However, existing solutions tend to overfit to sketches, thus requiring professional sketches or even edge maps as input. To address this issue, our key idea is to implicitly model the shape space of plausible face images and synthesize a face image in this space to approximate an input sketch. We take a local-to-global approach. We first learn feature embeddings of key face components, and push corresponding parts of input sketches towards underlying component manifolds defined by the feature vectors of face component samples. We also propose another deep neural network to learn the mapping from the embedded component features to realistic images with multi-channel feature maps as intermediate results to improve the information flow. Our method essentially uses input sketches as soft constraints and is thus able to produce high-quality face images even from rough and/or incomplete sketches. Our tool is easy to use even for non-artists, while still supporting fine-grained control of shape details. Both qualitative and quantitative evaluations show the superior generation ability of our system over existing and alternative solutions. The usability and expressiveness of our system are confirmed by a user study. Read More
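The "push towards the component manifold" step the summary describes can be approximated in a few lines: project a rough sketch's feature vector onto a weighted combination of its nearest component samples. The paper solves for projection weights more carefully; the inverse-distance weighting below is a simplifying assumption, not the authors' code.

```python
# Approximate manifold projection: pull a query embedding toward the span of
# its K nearest component samples (illustrative sketch only).
import numpy as np

def project_to_manifold(query, samples, k=5):
    """query: (d,), samples: (n, d) feature vectors of face-component samples."""
    d2 = np.sum((samples - query) ** 2, axis=1)
    nn_idx = np.argsort(d2)[:k]
    w = 1.0 / (d2[nn_idx] + 1e-8)    # closer samples get more weight
    w = w / w.sum()
    return w @ samples[nn_idx]       # convex combination on the manifold

rng = np.random.default_rng(0)
component_feats = rng.normal(size=(1000, 64))  # e.g., learned "eye" embeddings
rough_sketch_feat = rng.normal(size=64)
refined = project_to_manifold(rough_sketch_feat, component_feats)
print(refined.shape)  # (64,)
```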