Voice-enabled interactions provide more human-like experiences in many popular IoT systems. Cloud-based speech analysis services extract useful information from voice input using speech recognition techniques. The voice signal is a rich resource that discloses several possible states of a speaker, such as emotional state, confidence and stress levels, physical condition, age, gender, and personal traits. Service providers can therefore build a very accurate profile of a user’s demographic category and personal preferences, which may compromise privacy. To address this problem, a privacy-preserving intermediate layer between users and cloud services is proposed to sanitize the voice input. It aims to maintain utility while preserving user privacy. It achieves this by collecting real-time speech data and analyzing the signal to ensure privacy protection before the data is shared with service providers. Specifically, the sensitive representations are extracted from the raw signal using transformation functions and then wrapped via voice conversion technology. An experimental evaluation based on emotion recognition, used to assess the efficacy of the proposed method, shows that identification of the speaker’s sensitive emotional state is reduced by ∼96%. Read More
Tag Archives: Privacy
China's hackers are ransacking databases for your health data
In May 2017, the WannaCry ransomware spread around the globe. As the worm locked Windows PCs, the UK’s National Health Service quickly ground to a halt. 19,000 appointments were cancelled, doctors couldn’t access patient files and email accounts were taken offline.
But the North Korean hackers behind WannaCry didn’t touch one thing: patient data. No personal information was stolen, the NHS has concluded. The cyberattack was meant purely to cause disruption and to earn the hermit state some much-needed cash.
The same can’t be said for China. New analysis has indicated that state-sponsored hackers from the country are targeting medical data from the healthcare industry. Research from security firm FireEye has identified multiple groups with links to China attacking medical systems and databases around the world. These attacks include incidents in 2019, but also date back as far as 2013. Read More
Google proposes new privacy and anti-fingerprinting controls for the web
Google today announced a new long-term initiative that, if fully realized, will make it harder for online marketers and advertisers to track you across the web. This new proposal follows the company’s plans to change how cookies in Chrome work and to make it easier for users to block tracking cookies.
Today’s proposal for a new open standard extends this by looking at how Chrome can close the loopholes that the digital advertising ecosystem can use to circumvent those cookie controls. And soon, that may mean that your browser will feature new options that give you more control over how much you share without losing your anonymity. Read More
Google Turns to Retro Cryptography to Keep Datasets Private
Certain studies require sensitive datasets: the relationship between nutritious school lunch and student health, the effectiveness of salary equity initiatives, and so on. Valuable insights require navigating a minefield of private, personal information. Now, after years of work, cryptographers and data scientists at Google have come up with a technique to enable this “multi-party computation” without exposing information to anyone who didn’t already have it.
Today Google will release an open source cryptographic tool known as Private Join and Compute. It facilitates the process of joining numeric columns from different datasets to calculate a sum, count, or average on data that is encrypted and unreadable during its entire mathematical journey. Only the results of the computation can be decrypted and viewed by all parties, meaning that you only get the results, not the data you didn’t already own. Read More
The New Wilderness
The need to regulate online privacy is a truth so universally acknowledged that even Facebook and Google have joined the chorus of voices crying for change.
Writing in the New York Times last month, Google CEO Sundar Pichai argued that it is “vital for companies to give people clear, individual choices around how their data is used.” Like all Times opinion pieces, his editorial included multiple Google tracking scripts served without the reader’s knowledge or consent. Had he wanted to, Mr. Pichai could have learned down to the second when a particular reader had read his assurance that Google “stayed focused on the products and features that make privacy a reality.”
Writing in a similar vein in the Washington Post this March, Facebook CEO Mark Zuckerberg called for Congress to pass privacy laws modeled on the European General Data Protection Regulation (GDPR). That editorial was served to readers with a similar bouquet of non-consensual tracking scripts that violated both the letter and spirit of the law Mr. Zuckerberg wants Congress to enact.
This odd situation recalls the cigarette ads of the 1930s in which tobacco companies brought out rival doctors to argue over which brand was most soothing to the throat. Read More
A Hitchhiker’s Guide On Distributed Training of Deep Neural Networks
Deep learning has led to tremendous advancements in the field of Artificial Intelligence. One caveat, however, is the substantial amount of compute needed to train these deep learning models. Training on a benchmark dataset like ImageNet on a single machine with a modern GPU can take up to a week; distributing training across multiple machines has been observed to bring this time down drastically. Recent work has brought ImageNet training time down to as little as 4 minutes by using a cluster of 2048 GPUs. This paper surveys the various algorithms and techniques used to distribute training and presents the current state of the art for a modern distributed training framework. More specifically, we explore the synchronous and asynchronous variants of distributed Stochastic Gradient Descent, various All Reduce gradient aggregation strategies, and best practices for obtaining higher throughput and lower latency over a cluster, such as mixed precision training, large batch training and gradient compression. Read More
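To make the synchronous variant of distributed SGD concrete, here is a minimal single-process sketch. It is not taken from the surveyed paper: the function names (`all_reduce_mean`, `local_gradient`, `sync_sgd`), the toy linear model and the two in-memory "workers" are all assumptions chosen for illustration. Each worker computes a gradient on its own data shard, the gradients are averaged (the role an All Reduce step plays on a real cluster), and every worker applies the same update.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # (x, y) pair


def all_reduce_mean(worker_grads: Sequence[Sequence[float]]) -> List[float]:
    """Simulated All Reduce: element-wise mean of per-worker gradients."""
    n_workers = len(worker_grads)
    return [sum(vals) / n_workers for vals in zip(*worker_grads)]


def local_gradient(w: Sequence[float], shard: Sequence[Sample]) -> List[float]:
    """Gradient of mean squared error for y = w[0] + w[1] * x on one shard."""
    g0 = g1 = 0.0
    for x, y in shard:
        err = (w[0] + w[1] * x) - y
        g0 += 2.0 * err
        g1 += 2.0 * err * x
    n = len(shard)
    return [g0 / n, g1 / n]


def sync_sgd(shards: Sequence[Sequence[Sample]], lr: float = 0.05,
             steps: int = 2000) -> List[float]:
    """Synchronous data-parallel SGD: all workers share one averaged update."""
    w = [0.0, 0.0]
    for _ in range(steps):
        grads = [local_gradient(w, shard) for shard in shards]  # one per worker
        g = all_reduce_mean(grads)                              # aggregation
        w = [wi - lr * gi for wi, gi in zip(w, g)]              # same step everywhere
    return w


# Toy data generated from y = 1 + 2x, split evenly across two hypothetical workers.
data = [(i / 4.0, 1.0 + 2.0 * (i / 4.0)) for i in range(8)]
shards = [data[:4], data[4:]]
w = sync_sgd(shards)
```

With equally sized shards, the averaged gradient equals the full-batch gradient, so the result matches single-machine training; an asynchronous variant would instead let each worker apply its update without waiting for the others.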
Incremental learning algorithms and applications
Incremental learning refers to learning from streaming data, which arrive over time, with limited memory resources and, ideally, without sacrificing model accuracy. This setting fits different application scenarios such as learning in changing environments, model personalisation, or lifelong learning, and it offers an elegant scheme for big data processing by means of its sequential treatment. In this contribution, we formalise the concept of incremental learning, we discuss particular challenges which arise in this setting, and we give an overview of popular approaches, their theoretical foundations, and applications which have emerged in recent years. Read More
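As a small illustration of the streaming setting described above, the sketch below uses an online perceptron, a classic incremental learner. It is an assumption chosen for illustration, not a method named in the excerpt, and the class name `OnlinePerceptron` and its `partial_fit` method are hypothetical. The model sees each sample once, updates in constant memory, and never stores the stream.

```python
from typing import List, Sequence


class OnlinePerceptron:
    """Binary classifier (labels +1 / -1) updated one sample at a time."""

    def __init__(self, n_features: int) -> None:
        self.w: List[float] = [0.0] * n_features
        self.b: float = 0.0

    def predict(self, x: Sequence[float]) -> int:
        score = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1 if score > 0 else -1

    def partial_fit(self, x: Sequence[float], y: int) -> None:
        """Update only on a mistake; the sample is then discarded."""
        if self.predict(x) != y:
            self.w = [wi + y * xi for wi, xi in zip(self.w, x)]
            self.b += y


# A short stream: label is +1 when the first feature exceeds the second.
stream = [([2, 0], 1), ([0, 2], -1), ([3, 1], 1), ([1, 3], -1)]

model = OnlinePerceptron(n_features=2)
for features, label in stream:
    model.partial_fit(features, label)
```

After this stream the model has made two mistake-driven updates and separates the two groups; memory use depends only on the number of features, not on how many samples have arrived.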
Incremental Learning in Deep Convolutional Neural Networks Using Partial Network Sharing
Deep convolutional neural network (DCNN) based supervised learning is a widely practiced approach for large-scale image classification. However, retraining these large networks to accommodate new, previously unseen data demands considerable computation time and energy. Also, previously seen training samples may not be available at the time of retraining. We propose an efficient training methodology and an incrementally growing DCNN that allow new classes to be learned while sharing part of the base network. Our proposed methodology is inspired by transfer learning techniques, although it does not forget previously learned classes. An updated network for learning a new set of classes is formed using previously learned convolutional layers (shared from the initial part of the base network) with the addition of a few new convolutional kernels in the later layers of the network. We evaluated the proposed scheme on several recognition applications. The classification accuracy achieved by our approach is comparable to the regular incremental learning approach (where networks are updated with new training samples only, without any network sharing), while achieving energy efficiency and reductions in storage requirements, memory access and training time. Read More
Transfer Incremental Learning Using Data Augmentation
Due to catastrophic forgetting, deep learning remains highly inappropriate when facing incremental learning of new classes and examples over time. In this contribution, we introduce Transfer Incremental Learning using Data Augmentation (TILDA). TILDA combines transfer learning from a pre-trained Deep Neural Network (DNN) as a feature extractor, a Nearest Class Mean (NCM) inspired classifier, and a majority vote using data augmentation on both training and test vectors. The resulting methodology allows learning new examples or classes on the fly with very limited computational and memory footprints. We perform experiments on challenging vision datasets and obtain performance significantly better than existing incremental counterparts. Read More
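The sketch below illustrates the general idea of a Nearest Class Mean classifier with a majority vote, as a rough aid to reading the abstract above. It is not TILDA's implementation: it assumes feature vectors have already been produced by some pre-trained extractor, it uses plain Euclidean distance, and the class `NearestClassMean` and its methods are hypothetical names. Each class is stored as a running sum and a count, so a new example or a new class can be added without revisiting old data.

```python
from collections import Counter
from typing import Dict, Hashable, List, Sequence


class NearestClassMean:
    """Keeps one running mean per class; new classes can be added at any time."""

    def __init__(self) -> None:
        self.sums: Dict[Hashable, List[float]] = {}
        self.counts: Dict[Hashable, int] = {}

    def partial_fit(self, feature: Sequence[float], label: Hashable) -> None:
        if label not in self.sums:  # first example of a new class
            self.sums[label] = [0.0] * len(feature)
            self.counts[label] = 0
        self.sums[label] = [s + f for s, f in zip(self.sums[label], feature)]
        self.counts[label] += 1

    def mean(self, label: Hashable) -> List[float]:
        n = self.counts[label]
        return [s / n for s in self.sums[label]]

    def predict(self, feature: Sequence[float]) -> Hashable:
        def sq_dist(label: Hashable) -> float:
            return sum((f - m) ** 2 for f, m in zip(feature, self.mean(label)))
        return min(self.sums, key=sq_dist)

    def predict_vote(self, features: Sequence[Sequence[float]]) -> Hashable:
        """Majority vote over several (e.g. augmented) views of one input."""
        votes = Counter(self.predict(f) for f in features)
        return votes.most_common(1)[0][0]


ncm = NearestClassMean()
ncm.partial_fit([1.0, 0.0], "cat")
ncm.partial_fit([3.0, 0.0], "cat")
ncm.partial_fit([0.0, 4.0], "dog")
before = ncm.predict([2.0, 1.0])

ncm.partial_fit([10.0, 10.0], "bird")  # a new class arrives later
after = ncm.predict([2.0, 1.0])
```

Adding the "bird" class only creates one more stored mean, so earlier classes are untouched; this is the property that lets NCM-style classifiers sidestep catastrophic forgetting when the feature extractor is kept fixed.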
Using Transfer Learning to Introduce Generalization in Models
Researchers often try to capture as much information as they can, either by using existing architectures, creating new ones, going deeper, or employing different training methods. This paper compares different ideas and methods that are used heavily in Machine Learning to determine what works best. These methods are prevalent in various domains of Machine Learning, such as Computer Vision and Natural Language Processing (NLP). Read More