Artificial intelligence promises to transform — and indeed, has already transformed — entire industries, from civic planning and health care to cybersecurity. But privacy remains an unsolved challenge in the industry, particularly where compliance and regulation are concerned.
Recent controversies put the problem into sharp relief. The Royal Free London NHS Foundation Trust, a division of the U.K.’s National Health Service based in London, provided Alphabet’s DeepMind with data on 1.6 million patients without their consent. Google — whose health data-sharing partnership with Ascension became the subject of scrutiny in November — abandoned plans to publish scans of chest X-rays over concerns that they contained personally identifiable information. This past summer, Microsoft quietly removed a data set (MS Celeb) with more than 10 million images of people after it was revealed that some weren’t aware they had been included. Read More
CryptoNN: Training Neural Networks over Encrypted Data
Emerging neural network based machine learning techniques such as deep learning and its variants have shown tremendous potential in many application domains. However, they raise serious privacy concerns due to the risk of leakage of highly privacy-sensitive data when data collected from users is used to train neural network models to support predictive tasks. To tackle such serious privacy concerns, several privacy-preserving approaches have been proposed in the literature that use either secure multi-party computation (SMC) or homomorphic encryption (HE) as the underlying mechanism. However, neither of these cryptographic approaches provides an efficient solution for constructing a privacy-preserving machine learning model that supports both the training and inference phases. To tackle this issue, we propose CryptoNN, a framework that supports training a neural network model over encrypted data by using the emerging functional encryption scheme instead of SMC or HE. We also construct a functional encryption scheme for basic arithmetic computation to support the requirements of the proposed CryptoNN framework. We present a performance evaluation and security analysis of the underlying crypto scheme and show through our experiments that CryptoNN achieves accuracy similar to that of the baseline neural network models on the MNIST dataset. Read More
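The building block here is functional encryption for inner products: a key derived for a vector y lets its holder learn only the dot product of y with an encrypted x, which is exactly the computation a network's first layer needs. Below is a toy Python sketch in the spirit of the DDH-based inner-product schemes (Abdalla et al.); the parameters are tiny and deliberately insecure, and the construction is an illustrative assumption rather than the exact scheme from the CryptoNN paper.

```python
# Toy inner-product functional encryption (IPFE). INSECURE parameters,
# for illustration of the CryptoNN idea only.
import random

P = 2_147_483_647          # toy prime modulus (2^31 - 1)
G = 7                      # toy generator
ORDER = P - 1              # exponents are reduced mod P - 1 (Fermat)

def setup(n):
    msk = [random.randrange(ORDER) for _ in range(n)]   # master secret key
    mpk = [pow(G, s, P) for s in msk]                   # public key
    return mpk, msk

def encrypt(mpk, x):
    r = random.randrange(ORDER)
    ct0 = pow(G, r, P)
    cts = [(pow(h, r, P) * pow(G, xi, P)) % P for h, xi in zip(mpk, x)]
    return ct0, cts

def keygen(msk, y):
    # A functional key for y reveals only <x, y>, never x itself.
    return sum(s * yi for s, yi in zip(msk, y)) % ORDER

def decrypt(ct0, cts, sk_y, y, max_val=10_000):
    num = 1
    for c, yi in zip(cts, y):
        num = (num * pow(c, yi, P)) % P
    g_ip = (num * pow(pow(ct0, sk_y, P), -1, P)) % P    # g^{<x, y>}
    for v in range(max_val):                            # brute-force small dlog
        if pow(G, v, P) == g_ip:
            return v
    raise ValueError("inner product out of range")

mpk, msk = setup(3)
x = [2, 3, 5]              # e.g. an encrypted input feature vector
w = [4, 1, 2]              # e.g. one row of first-layer weights
ct0, cts = encrypt(mpk, x)
print(decrypt(ct0, cts, keygen(msk, w), w))             # -> 21
```

With keys issued only for the model's weight vectors, a server can evaluate first-layer activations without ever seeing the plaintext features; that is the gap in SMC/HE pipelines that CryptoNN targets.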
A Hitchhiker’s Guide On Distributed Training of Deep Neural Networks
Deep learning has led to tremendous advancements in the field of Artificial Intelligence. One caveat, however, is the substantial amount of compute needed to train these deep learning models. Training on a benchmark dataset like ImageNet on a single machine with a modern GPU can take up to a week; distributing training across multiple machines has been observed to bring this time down drastically. Recent work has brought ImageNet training time down to as little as 4 minutes by using a cluster of 2048 GPUs. This paper surveys the various algorithms and techniques used to distribute training and presents the current state of the art for a modern distributed training framework. More specifically, we explore the synchronous and asynchronous variants of distributed Stochastic Gradient Descent, various All-Reduce gradient aggregation strategies, and best practices for obtaining higher throughput and lower latency over a cluster, such as mixed precision training, large batch training and gradient compression. Read More
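The workhorse among these techniques is synchronous data-parallel SGD: every worker computes a gradient on its own data shard, the gradients are averaged with an all-reduce, and all replicas apply the identical update. A minimal single-process simulation, with a plain mean standing in for the ring all-reduce a real cluster would use (the data and model are synthetic assumptions):

```python
# Simulation of synchronous data-parallel SGD with all-reduce averaging.
# Single-process for clarity; a real system would run each worker on its
# own GPU/node (e.g. via torch.distributed or Horovod).
import numpy as np

rng = np.random.default_rng(0)
N_WORKERS, LR, STEPS = 4, 0.1, 200

# Synthetic linear-regression data, sharded across workers.
X = rng.normal(size=(512, 8))
true_w = rng.normal(size=8)
y = X @ true_w + 0.01 * rng.normal(size=512)
shards = np.array_split(np.arange(512), N_WORKERS)

w = np.zeros(8)                        # replicated model parameters
for step in range(STEPS):
    # Each worker computes the MSE gradient on its local shard.
    local_grads = []
    for idx in shards:
        err = X[idx] @ w - y[idx]
        local_grads.append(2 * X[idx].T @ err / len(idx))
    # All-reduce: average gradients across workers.
    g = np.mean(local_grads, axis=0)
    w -= LR * g                        # identical update on every replica

print("parameter error:", np.linalg.norm(w - true_w))
```

The asynchronous variants the survey covers drop the barrier implied by the mean, trading gradient staleness for throughput; gradient compression and mixed precision shrink what the all-reduce has to move.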
Privacy and machine learning: two unexpected allies?
In many applications of machine learning, such as machine learning for medical diagnosis, we would like to have machine learning algorithms that do not memorize sensitive information about the training set, such as the specific medical histories of individual patients. Differential privacy is a framework for measuring the privacy guarantees provided by an algorithm. Through the lens of differential privacy, we can design machine learning algorithms that responsibly train models on private data. Our work (with Martín Abadi, Úlfar Erlingsson, Ilya Mironov, Ananth Raghunathan, Shuang Song and Kunal Talwar) on differential privacy for machine learning has made it very easy for machine learning researchers to contribute to privacy research, even without being an expert on the mathematics of differential privacy. In this blog post, we'll show you how to do it. Read More
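The core recipe behind this line of work is DP-SGD: clip each example's gradient to a fixed norm, then add Gaussian noise calibrated to that norm before the update, so no single record can dominate what the model learns. A minimal numpy sketch of the mechanics, with illustrative hyperparameters and synthetic data (the post itself builds on library tooling; this stand-alone version only shows the three steps):

```python
# Minimal DP-SGD sketch: per-example clipping + Gaussian noise.
import numpy as np

rng = np.random.default_rng(1)
CLIP, NOISE_MULT, LR, EPOCHS, BATCH = 1.0, 1.1, 0.1, 20, 32

X = rng.normal(size=(1024, 10))
y = (X @ rng.normal(size=10) > 0).astype(float)    # synthetic labels
w = np.zeros(10)

def per_example_grads(w, Xb, yb):
    p = 1 / (1 + np.exp(-(Xb @ w)))                # sigmoid predictions
    return (p - yb)[:, None] * Xb                  # one gradient row per example

for _ in range(EPOCHS):
    for start in range(0, len(X), BATCH):
        Xb, yb = X[start:start + BATCH], y[start:start + BATCH]
        g = per_example_grads(w, Xb, yb)
        # 1. Clip each example's gradient to L2 norm <= CLIP.
        norms = np.linalg.norm(g, axis=1, keepdims=True)
        g = g / np.maximum(1.0, norms / CLIP)
        # 2. Sum and add Gaussian noise scaled to the clip norm.
        noisy = g.sum(axis=0) + rng.normal(scale=NOISE_MULT * CLIP, size=w.shape)
        # 3. Average over the batch and take an SGD step.
        w -= LR * noisy / len(Xb)

print("train accuracy:", ((X @ w > 0) == y).mean())
```

The noise multiplier and clip norm together determine the (epsilon, delta) guarantee via the moments-accountant analysis; tightening either trades accuracy for privacy.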
Semi-supervised knowledge transfer for deep learning from private training data
Some machine learning applications involve training data that is sensitive, such as the medical histories of patients in a clinical trial. A model may inadvertently and implicitly store some of its training data; careful analysis of the model may therefore reveal sensitive information. To address this problem, we demonstrate a generally applicable approach to providing strong privacy guarantees for training data: Private Aggregation of Teacher Ensembles (PATE). The approach combines, in a black-box fashion, multiple models trained with disjoint datasets, such as records from different subsets of users. Because they rely directly on sensitive data, these models are not published, but instead used as “teachers” for a “student” model. The student learns to predict an output chosen by noisy voting among all of the teachers, and cannot directly access an individual teacher or the underlying data or parameters. The student’s privacy properties can be understood both intuitively (since no single teacher and thus no single dataset dictates the student’s training) and formally, in terms of differential privacy. These properties hold even if an adversary can not only query the student but also inspect its internal workings. Compared with previous work, the approach imposes only weak assumptions on how teachers are trained: it applies to any model, including non-convex models like DNNs. We achieve state-of-the-art privacy/utility trade-offs on MNIST and SVHN thanks to an improved privacy analysis and semi-supervised learning. Read More
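The privacy-critical step in PATE is the noisy vote: the student never sees a teacher, only the argmax of a Laplace-perturbed histogram of teacher labels. A hedged sketch of that aggregation, with the teachers replaced by simulated votes and the noise scale chosen for illustration:

```python
# Sketch of PATE's noisy aggregation: Laplace noise on the vote
# histogram, student trains on the noisy argmax. Teachers here are
# stand-in vote arrays, not trained models.
import numpy as np

rng = np.random.default_rng(2)
N_TEACHERS, N_CLASSES, GAMMA = 50, 10, 0.5     # Laplace scale = 1/GAMMA

def noisy_aggregate(teacher_preds):
    """teacher_preds: one class vote per teacher."""
    counts = np.bincount(teacher_preds, minlength=N_CLASSES).astype(float)
    counts += rng.laplace(scale=1.0 / GAMMA, size=N_CLASSES)  # DP noise
    return int(np.argmax(counts))

# Simulated votes on one student query: most teachers agree on class 3.
votes = np.where(rng.random(N_TEACHERS) < 0.8, 3,
                 rng.integers(0, N_CLASSES, N_TEACHERS))
print("student label:", noisy_aggregate(votes))   # almost always 3
```

Because changing one training record can change at most one teacher's vote, each query's histogram has bounded sensitivity, which is what makes the Laplace mechanism (and the paper's tighter accounting) apply.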
Efficient Decentralized Deep Learning by Dynamic Model Averaging
We propose an efficient protocol for decentralized training of deep neural networks from distributed data sources. The proposed protocol handles different phases of model training equally well and quickly adapts to concept drifts. This leads to a reduction in communication by an order of magnitude compared to periodically communicating state-of-the-art approaches. Moreover, we derive a communication bound that scales well with the hardness of the serialized learning problem. The reduction in communication comes at almost no cost, as the predictive performance remains virtually unchanged. Indeed, the proposed protocol retains the loss bounds of periodically averaging schemes. An extensive empirical evaluation validates a major improvement in the trade-off between model performance and communication, which could be beneficial for numerous decentralized learning applications, such as autonomous driving, or voice recognition and image classification on mobile phones. Read More
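The idea, roughly, is to replace a fixed averaging period with a divergence test: workers train locally and only synchronize when their models have drifted far enough from the last jointly agreed model. The sketch below simulates that behavior; the drift measure, threshold, and the centralized check are simplifying assumptions (the actual protocol evaluates a local condition on each worker so the test itself costs no communication):

```python
# Illustrative dynamic model averaging: sync only when average drift
# from the last synchronized model exceeds a threshold.
import numpy as np

rng = np.random.default_rng(3)
N_WORKERS, DIM, THRESHOLD, STEPS = 5, 20, 1.0, 100

reference = np.zeros(DIM)                       # last synchronized model
models = [reference.copy() for _ in range(N_WORKERS)]
communications = 0

for t in range(STEPS):
    # Each worker takes a local step (random here, SGD in practice).
    for w in models:
        w -= 0.05 * rng.normal(size=DIM)
    # Drift check: mean squared distance from the reference model.
    divergence = np.mean([np.sum((w - reference) ** 2) for w in models])
    if divergence > THRESHOLD:                  # communicate only when needed
        reference = np.mean(models, axis=0)     # dynamic model averaging
        models = [reference.copy() for _ in range(N_WORKERS)]
        communications += 1

print(f"synchronized {communications}/{STEPS} rounds")
```

When local models stay close (stable phases, no drift), rounds pass without any communication; under concept drift the divergence grows and synchronization happens automatically, which is where the order-of-magnitude savings come from.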
Ian Goodfellow: Machine Learning Privacy and Security
Incremental Convolutional Neural Network Training
Experimenting with novel ideas on deep convolutional neural networks (DCNNs) using big datasets is hampered by the fact that network training requires huge computational resources in terms of CPU and GPU power and hours. One option is to downscale the problem, e.g., fewer classes and fewer samples, but this is undesirable with DCNNs, whose performance is largely data-dependent. In this work, we take an alternative route and downscale the networks and input images. For example, the ImageNet problem of 1,000 classes and 1.2M training images can be solved in hours on a commodity laptop without a GPU by downscaling the images and the network to a resolution of 8×8. We provide a solution that transfers the knowledge (parameters) of a DCNN trained at lower resolution to make training a DCNN at higher resolution more efficient, continuing training incrementally until the full resolution is reached. In our experiments, this approach achieves a clear speed-up without loss of performance. Read More
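What makes the transfer cheap is that convolutional filters are resolution-agnostic: the same 3×3 kernels apply at 8×8 and at 32×32, and a global-pooling head keeps the classifier's shape fixed across resolutions. A hedged PyTorch sketch of that mechanism, with a toy architecture, placeholder objective, and schedule that are illustrative assumptions rather than the paper's exact setup:

```python
# Incremental-resolution training sketch: conv weights learned at low
# resolution are reused unchanged at higher resolutions; adaptive
# average pooling keeps the classifier input size constant.
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)     # valid at any input size
        self.head = nn.Linear(64, n_classes)

    def forward(self, x):
        return self.head(self.pool(self.features(x)).flatten(1))

model = TinyCNN()
for res in (8, 16, 32):                         # incremental resolution schedule
    x = torch.randn(4, 3, res, res)             # stand-in training batch
    loss = model(x).sum()                       # placeholder objective
    loss.backward()                             # the same weights carry over
    model.zero_grad()
    print(f"trained at {res}x{res} with the same parameters")
```

Most of the optimization happens at the cheap low resolutions, and the expensive full-resolution phase only fine-tunes, which is where the reported speed-up comes from.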
Federated Machine Learning: Concept and Applications
Today’s AI still faces two major challenges. One is that in most industries, data exists in the form of isolated islands. The other is the strengthening of data privacy and security. We propose a possible solution to these challenges: secure federated learning. Beyond the federated learning framework first proposed by Google in 2016, we introduce a comprehensive secure federated learning framework, which includes horizontal federated learning, vertical federated learning and federated transfer learning. We provide definitions, architectures and applications for the federated learning framework, and provide a comprehensive survey of existing works on this subject. In addition, we propose building data networks among organizations based on federated mechanisms as an effective solution to allow knowledge to be shared without compromising user privacy. Read More
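The horizontal case (same features, different users on each island) is the one Google's original framework targets, and FedAvg is its canonical algorithm: clients train locally and a coordinator averages parameters, so raw data never leaves its island. A minimal sketch with synthetic linear models standing in for real client training:

```python
# FedAvg-style horizontal federated learning: only parameters travel;
# each client's data stays local.
import numpy as np

rng = np.random.default_rng(4)
N_CLIENTS, DIM, ROUNDS, LOCAL_STEPS, LR = 5, 8, 30, 5, 0.1

# Each client holds a private shard (horizontal partition).
true_w = rng.normal(size=DIM)
clients = []
for _ in range(N_CLIENTS):
    X = rng.normal(size=(100, DIM))
    clients.append((X, X @ true_w + 0.01 * rng.normal(size=100)))

global_w = np.zeros(DIM)
for _ in range(ROUNDS):
    local_ws = []
    for X, y in clients:                    # local training, data stays put
        w = global_w.copy()
        for _ in range(LOCAL_STEPS):
            w -= LR * 2 * X.T @ (X @ w - y) / len(y)
        local_ws.append(w)
    global_w = np.mean(local_ws, axis=0)    # server averages parameters only

print("error vs. true model:", np.linalg.norm(global_w - true_w))
```

Vertical federated learning (same users, different features) and federated transfer learning need additional cryptographic machinery, such as encrypted entity alignment, on top of this loop; the survey covers those architectures.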
Federated Learning with Non-IID Data
Federated learning enables resource-constrained edge compute devices, such as mobile phones and IoT devices, to learn a shared model for prediction, while keeping the training data local. This decentralized approach to training models provides privacy, security, regulatory and economic benefits. In this work, we focus on the statistical challenge of federated learning when local data is non-IID. We first show that the accuracy of federated learning reduces significantly, by up to ~55% for neural networks trained on highly skewed non-IID data, where each client device trains only on a single class of data. We further show that this accuracy reduction can be explained by the weight divergence, which can be quantified by the earth mover’s distance (EMD) between the distribution over classes on each device and the population distribution. As a solution, we propose a strategy to improve training on non-IID data by creating a small subset of data which is globally shared between all the edge devices. Experiments show that accuracy can be increased by ~30% for the CIFAR-10 dataset with only 5% globally shared data. Read More
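The diagnosis and the fix can both be seen on the label distributions alone: measure how far each client's class mix sits from the population mix, then blend in a small globally shared subset to pull it back. A sketch of that accounting, using the L1 divergence over class proportions as a simple stand-in for the paper's earth mover's distance on label space (the data is synthetic):

```python
# Label-skew diagnosis for non-IID federated learning, plus the
# shared-subset mitigation the paper proposes.
import numpy as np

rng = np.random.default_rng(5)
N_CLASSES = 10

population = np.full(N_CLASSES, 1 / N_CLASSES)      # uniform label mix

def skew(labels):
    dist = np.bincount(labels, minlength=N_CLASSES) / len(labels)
    return np.abs(dist - population).sum()          # L1 stand-in for EMD

# Pathological client: only class 7 (the paper's 1-class non-IID setting).
client = np.full(1000, 7)
print("skew before:", skew(client))                 # maximal divergence

# Mitigation: add a small globally shared, well-mixed subset (~5%).
shared = rng.integers(0, N_CLASSES, size=50)
print("skew after:", skew(np.concatenate([client, shared])))
```

Since the paper shows weight divergence, and hence the accuracy drop, grows with this distance, even a few percent of shared data moves every client's effective distribution toward the population and recovers much of the lost accuracy.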