We present VILLA, the first known effort on large-scale adversarial training for vision-and-language (V+L) representation learning. VILLA consists of two training stages: (i) task-agnostic adversarial pre-training, followed by (ii) task-specific adversarial fine-tuning. Instead of adding adversarial perturbations to image pixels and textual tokens, we propose to perform adversarial training in the embedding space of each modality. To enable large-scale training, we adopt the “free” adversarial training strategy and combine it with KL-divergence-based regularization to promote higher invariance in the embedding space. We apply VILLA to current best-performing V+L models and achieve new state of the art on a wide range of tasks, including Visual Question Answering, Visual Commonsense Reasoning, Image-Text Retrieval, Referring Expression Comprehension, Visual Entailment, and NLVR2. Read More
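For intuition, the embedding-space recipe described above lends itself to a compact sketch. Below is a minimal PyTorch sketch, assuming a multimodal `model` that accepts pre-computed text and image embeddings: a single gradient-based perturbation is added to each modality's embeddings, and a KL term keeps clean and perturbed predictions consistent. It simplifies VILLA's actual "free" multi-step procedure to one step and is not the paper's implementation.

```python
# Minimal sketch: adversarial training in the embedding space with a KL-divergence
# consistency term. The `model(txt_emb, img_emb)` interface and the single-step
# perturbation are simplifying assumptions, not VILLA's exact procedure.
import torch
import torch.nn.functional as F

def adversarial_step(model, txt_emb, img_emb, labels, epsilon=1e-3, alpha=1.0):
    """One adversarial update on text/image embeddings plus KL regularization."""
    # Clean forward pass.
    clean_logits = model(txt_emb, img_emb)
    clean_loss = F.cross_entropy(clean_logits, labels)

    # Perturb each modality's embeddings along the gradient of the clean loss.
    txt_emb = txt_emb.detach().requires_grad_(True)
    img_emb = img_emb.detach().requires_grad_(True)
    loss_for_grad = F.cross_entropy(model(txt_emb, img_emb), labels)
    g_txt, g_img = torch.autograd.grad(loss_for_grad, (txt_emb, img_emb))
    txt_adv = txt_emb + epsilon * g_txt / (g_txt.norm(dim=-1, keepdim=True) + 1e-12)
    img_adv = img_emb + epsilon * g_img / (g_img.norm(dim=-1, keepdim=True) + 1e-12)

    # Adversarial loss plus a KL term pulling clean and perturbed predictions together.
    adv_logits = model(txt_adv, img_adv)
    adv_loss = F.cross_entropy(adv_logits, labels)
    kl = F.kl_div(F.log_softmax(adv_logits, dim=-1),
                  F.softmax(clean_logits.detach(), dim=-1),
                  reduction="batchmean")
    return clean_loss + adv_loss + alpha * kl
```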
Extracting Training Data from Large Language Models
It has become common to publish large (billion parameter) language models that have been trained on private datasets. This paper demonstrates that in such settings, an adversary can perform a training data extraction attack to recover individual training examples by querying the language model.
We demonstrate our attack on GPT-2, a language model trained on scrapes of the public Internet, and are able to extract hundreds of verbatim text sequences from the model’s training data. These extracted examples include (public) personally identifiable information (names, phone numbers, and email addresses), IRC conversations, code, and 128-bit UUIDs. Our attack is possible even though each of the above sequences is included in just one document in the training data.
We comprehensively evaluate our extraction attack to understand the factors that contribute to its success. For example, we find that larger models are more vulnerable than smaller models. We conclude by drawing lessons and discussing possible safeguards for training large language models. Read More
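For intuition, the extraction recipe boils down to "sample a lot, then rank." The sketch below, using the Hugging Face `transformers` GPT-2 checkpoint, samples unconditionally and flags generations whose perplexity is low relative to their zlib-compressed length; the sample count, length, and scoring rule are illustrative, not the paper's exact configuration.

```python
# Rough sketch of sampling-and-ranking extraction: generate many samples from GPT-2,
# then surface the ones the model assigns unusually low perplexity relative to their
# zlib compression size (a sign of memorization). Parameters are illustrative.
import zlib
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def sample_texts(n=10, length=128):
    """Draw unconditional samples from the model."""
    input_ids = torch.full((n, 1), tokenizer.bos_token_id, dtype=torch.long)
    with torch.no_grad():
        out = model.generate(input_ids, do_sample=True, top_k=40,
                             max_length=length, pad_token_id=tokenizer.eos_token_id)
    return [tokenizer.decode(ids, skip_special_tokens=True) for ids in out]

def perplexity(text):
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss
    return torch.exp(loss).item()

# Low perplexity but relatively high zlib entropy suggests memorized content.
candidates = sample_texts()
scored = sorted(candidates,
                key=lambda t: perplexity(t) / max(len(zlib.compress(t.encode())), 1))
for text in scored[:3]:
    print(text[:80])
```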
Adversarial Examples in Deep Learning — A Primer
Introducing adversarial examples in deep learning vision models
We have seen the advent of state-of-the-art (SOTA) deep learning models for computer vision ever since we started getting bigger and better compute (GPUs and TPUs), more data (ImageNet, etc.) and easy-to-use open-source software and tools (TensorFlow and PyTorch). Every year (and now every few months!) we see the next SOTA deep learning model dethrone the previous one in terms of Top-k accuracy on benchmark datasets. The following figure depicts some of the latest SOTA deep learning vision models (and omits others, like Google’s BigTransfer!). Read More
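For readers who want a concrete starting point, here is a minimal Fast Gradient Sign Method (FGSM) example in PyTorch: one signed-gradient step on the input pixels is often enough to flip a pretrained classifier's prediction. The model choice, dummy input, and epsilon are illustrative only.

```python
# Minimal FGSM sketch: a small, signed-gradient perturbation of the input pixels
# that pushes a pretrained vision model toward a wrong prediction.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

def fgsm(image, label, epsilon=0.01):
    """Fast Gradient Sign Method: one signed-gradient step on the input pixels."""
    image = image.clone().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    adv = image + epsilon * image.grad.sign()   # move pixels to increase the loss
    return adv.clamp(0, 1).detach()

# Usage with a dummy batch (replace with a properly preprocessed real image).
x = torch.rand(1, 3, 224, 224)
y = model(x).argmax(dim=1)          # treat the clean prediction as the label
x_adv = fgsm(x, y)
print(model(x).argmax(dim=1).item(), model(x_adv).argmax(dim=1).item())
```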
For under $40, you can learn all about Python, machine learning and artificial intelligence
This week in thinking machines news, a Harvard professor and his students have now raised $14 million to create artificial intelligence so smart that even hackers can’t crack it. Meanwhile, reports from the White House suggest the federal government is close to issuing its directives on how agencies should regulate AI going forward.
And if story no. 1 makes you at all dubious about the impact of story no. 2…well, welcome to the amazing world of Python, machine learning and the tech wonders and ethical quandaries of creating human-based artificial life. Read More
Microsoft, MITRE Release Adversarial Machine Learning Threat Matrix
Microsoft and MITRE, in collaboration with a dozen other organizations, have developed a framework designed to help identify, respond to, and remediate attacks targeting machine learning (ML) systems.
Many companies today do not have the necessary tools to secure machine learning systems. …The Adversarial ML Threat Matrix, which Microsoft has released in collaboration with MITRE, among others, is an industry-focused open framework that aims to address this issue. Read More
Phantom of the ADAS: Securing Advanced Driver-Assistance Systems from Split-Second Phantom Attacks
In this paper, we investigate “split-second phantom attacks,” a scientific gap that causes two commercial advanced driver-assistance systems (ADASs), Tesla Model X (HW 2.5 and HW 3) and Mobileye 630, to treat a depthless object that appears for a few milliseconds as a real obstacle/object. We discuss the challenge that split-second phantom attacks create for ADASs. We demonstrate how attackers can apply split-second phantom attacks remotely by embedding phantom road signs into an advertisement presented on a digital billboard, which causes Tesla’s autopilot to suddenly stop the car in the middle of a road and Mobileye 630 to issue false notifications. We also demonstrate how attackers can use a projector to cause Tesla’s autopilot to apply the brakes in response to a phantom of a pedestrian projected on the road, and Mobileye 630 to issue false notifications in response to a projected road sign. To counter this threat, we propose a countermeasure that can determine whether a detected object is a phantom or real using only the camera sensor. The countermeasure (GhostBusters) uses a “committee of experts” approach, combining the results of four lightweight deep convolutional neural networks that assess the authenticity of an object based on its light, context, surface, and depth. We demonstrate our countermeasure’s effectiveness (it obtains a TPR of 0.994 with an FPR of zero) and test its robustness to adversarial machine learning attacks. Read More
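The "committee of experts" idea can be sketched compactly: four small per-cue networks each emit a real-vs-phantom vote, and a learned combiner fuses them. The architectures and fusion rule below are illustrative stand-ins, not the GhostBusters implementation.

```python
# Sketch of a committee-of-experts phantom detector: one tiny CNN per cue
# (light, context, surface, depth) plus a learned combiner over their votes.
import torch
import torch.nn as nn

def small_cnn():
    """A tiny per-cue expert that outputs a single real-vs-phantom logit."""
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 1))

class Committee(nn.Module):
    def __init__(self):
        super().__init__()
        self.experts = nn.ModuleDict({cue: small_cnn()
                                      for cue in ("light", "context", "surface", "depth")})
        self.combiner = nn.Linear(4, 1)  # learned weighting of the four expert votes

    def forward(self, crops):
        # `crops` maps each cue name to a (B, 3, H, W) crop prepared for that expert.
        votes = torch.cat([self.experts[cue](crops[cue]) for cue in self.experts], dim=1)
        return torch.sigmoid(self.combiner(votes))  # probability the object is real

committee = Committee()
dummy = {cue: torch.rand(2, 3, 64, 64) for cue in ("light", "context", "surface", "depth")}
print(committee(dummy).shape)  # torch.Size([2, 1])
```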
Why Adversarial Machine Learning Is the Next Big Threat to National Security
The Joint Artificial Intelligence Center (JAIC), a division of the United States Department of Defense (DoD) tasked with accelerating the adoption of artificial intelligence (AI) across the branches of the military, has stated that AI will eventually impact every mission carried out by the DoD.
… In particular, adversarial machine learning (AML), an emerging AI practice that involves independent and state-sponsored actors manipulating machine learning algorithms to cause model malfunctions, could have catastrophic consequences. Read More
Can 3D Adversarial Logos Cloak Humans?
With the rise of adversarial attacks, researchers have attempted to fool trained object detectors in 2D scenes. Among these, an intriguing new form of attack with potential real-world usage is to append adversarial patches (e.g., logos) to images. Nevertheless, much less is known about adversarial attacks from 3D rendering views, which is essential for an attack to remain persistently strong in the physical world. This paper presents a new 3D adversarial logo attack: we construct an arbitrarily shaped logo from a 2D texture image and map it onto a 3D adversarial logo via a texture mapping called logo transformation. The resulting 3D adversarial logo is then treated as an adversarial texture, enabling easy manipulation of its shape and position. This greatly extends the versatility of adversarial training for computer-graphics-synthesized imagery. In contrast to the traditional adversarial patch, this new form of attack is mapped into the 3D object world and back-propagates to the 2D image domain through differentiable rendering. In addition, and unlike existing adversarial patches, our new 3D adversarial logo is shown to fool state-of-the-art deep object detectors robustly under model rotations, taking one step further toward realistic attacks in the physical world. Read More
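Conceptually, the attack is a texture-optimization loop driven through a differentiable renderer. The sketch below is only a schematic of that loop: `detector`, `render_fn`, `sample_view`, and `mesh` are hypothetical interfaces standing in for the paper's target detector, logo transformation plus renderer, camera sampler, and 3D model.

```python
# Schematic of optimizing a 2D logo texture through a differentiable renderer so
# that rendered views suppress detector scores. All callables passed in are
# hypothetical placeholders, not a specific library's API.
import torch

def optimize_logo_texture(detector, render_fn, sample_view, mesh, steps=200, lr=0.01):
    texture = torch.rand(1, 3, 64, 64, requires_grad=True)  # the 2D logo being learned
    opt = torch.optim.Adam([texture], lr=lr)
    for _ in range(steps):
        view = sample_view()                      # random rotation / camera pose
        image = render_fn(mesh, texture, view)    # gradients flow back to the texture
        loss = detector(image).max()              # push down the strongest detection score
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            texture.clamp_(0, 1)                  # keep the texture a valid image
    return texture.detach()
```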
#adversarialFoolChecker: A platform to check how robust an image is against adversarial attacks
Deep neural networks (DNNs) have so far proved to be highly promising for a wide range of applications, including image and audio classification. Nonetheless, their performance heavily relies on the amount of data used to train them, and large datasets are not always readily available.
When DNNs are not adequately trained, they are more prone to misclassifying data. This makes them vulnerable to a particular class of cyber-attacks known as adversarial attacks. In an adversarial attack, an attacker creates subtly altered replicas of real data designed to fool a DNN (i.e., adversarial examples), tricking it into misclassifying inputs and thus impairing its function.
In recent years, computer scientists and developers have proposed a variety of tools that could protect deep neural architectures from these attacks, by detecting the differences between original and adversarial data. However, so far, none of these solutions has proved universally effective. Read More
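One common way to operationalize such a robustness check is to attack the model directly and see whether its prediction survives. The projected gradient descent (PGD) sketch below is a generic illustration of that idea, not the platform's actual method; the step sizes and perturbation budget are placeholders.

```python
# Generic robustness check: run an L-inf PGD attack and report whether the model's
# clean prediction on the image survives. Budgets and step sizes are illustrative.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, epsilon=8/255, alpha=2/255, steps=10):
    """Iterate small signed-gradient steps, projecting back into the epsilon ball."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + (x_adv - x).clamp(-epsilon, epsilon)  # project onto the budget
        x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()

def is_robust(model, x, epsilon=8/255):
    """True if the model's clean prediction on x survives the PGD attack."""
    y = model(x).argmax(dim=1)
    x_adv = pgd_attack(model, x, y, epsilon=epsilon)
    return bool((model(x_adv).argmax(dim=1) == y).all())
```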
Legal Risks of Adversarial Machine Learning Research
Adversarial machine learning is the systematic study of how motivated adversaries can compromise the confidentiality, integrity, and availability of machine learning (ML) systems through targeted or blanket attacks. The problem of attacking ML systems is so prevalent that CERT, the federally funded research and development center tasked with studying attacks, issued a broad vulnerability note on how most ML classifiers are vulnerable to adversarial manipulation. Corporations and governments are paying attention. Google, IBM, Facebook, and Microsoft have committed to investing in securing machine learning systems. The US is making the security and safety of AI systems a top priority when defining AI regulation, with the EU releasing a complete set of non-binding checklists as part of its Trustworthy AI initiative.
Research in this field is booming. Read More