Automatic live commenting aims to provide real-time comments on videos for viewers. It encourages user engagement on online video sites and is also a good benchmark for video-to-text generation. Recent work on this task adopts encoder-decoder models to generate comments. However, these methods do not explicitly model the interaction between videos and comments, so they tend to generate popular comments that are often irrelevant to the videos. In this work, we aim to improve the relevance between live comments and videos by modeling the cross-modal interactions among different modalities. To this end, we propose a multimodal matching transformer to capture the relationships among comments, vision, and audio. The proposed model is based on the transformer framework and can iteratively learn attention-aware representations for each modality. We evaluate the model on a publicly available live commenting dataset. Experiments show that the multimodal matching transformer outperforms state-of-the-art methods.
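As a rough illustration of the cross-modal idea described above, here is a minimal sketch (not the authors' code) in which the comment representation repeatedly attends over vision and audio features with standard multi-head attention; the number of refinement iterations, dimensions, and pooling choice are assumptions for illustration only.

```python
# Hypothetical sketch of iterative cross-modal attention for comment/video/audio matching.
import torch
import torch.nn as nn


class CrossModalBlock(nn.Module):
    """One refinement step: the query modality attends over a context modality."""

    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.ffn = nn.Sequential(nn.Linear(dim, dim * 4), nn.ReLU(), nn.Linear(dim * 4, dim))
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, query: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        attended, _ = self.attn(query, context, context)
        x = self.norm1(query + attended)
        return self.norm2(x + self.ffn(x))


class MatchingTransformerSketch(nn.Module):
    """Iteratively refines the comment representation against vision and audio, then scores the match."""

    def __init__(self, dim: int = 256, iterations: int = 2):
        super().__init__()
        self.iterations = iterations
        self.text_from_vision = CrossModalBlock(dim)
        self.text_from_audio = CrossModalBlock(dim)
        self.score = nn.Linear(dim, 1)

    def forward(self, text, vision, audio):
        # text/vision/audio: (batch, seq_len, dim) feature sequences
        for _ in range(self.iterations):
            text = self.text_from_vision(text, vision)
            text = self.text_from_audio(text, audio)
        # pool the refined comment representation and produce a matching score
        return self.score(text.mean(dim=1)).squeeze(-1)


if __name__ == "__main__":
    model = MatchingTransformerSketch()
    t, v, a = torch.randn(2, 20, 256), torch.randn(2, 30, 256), torch.randn(2, 30, 256)
    print(model(t, v, a).shape)  # torch.Size([2]) -- one matching score per video/comment pair
```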
Phantom of the ADAS
The absence of deployed vehicular communication systems, which prevents the advanced driving assistance systems (ADASs) and autopilots of semi/fully autonomous cars from validating their virtual perception of the physical environment surrounding the car with a third party, has been exploited in various attacks suggested by researchers. Since the application of these attacks comes with a cost (exposure of the attacker's identity), the delicate exposure vs. application balance has held, and attacks of this kind have not yet been encountered in the wild. In this paper, we investigate a new perceptual challenge that causes the ADASs and autopilots of semi/fully autonomous cars to consider depthless objects (phantoms) as real. We show how attackers can exploit this perceptual challenge to apply phantom attacks and change the abovementioned balance, without the need to physically approach the attack scene, by projecting a phantom via a drone equipped with a portable projector or by presenting a phantom on a hacked digital billboard that faces the Internet and is located near roads. We show that the car industry has not considered this type of attack by demonstrating the attack on today's most advanced ADAS and autopilot technologies: the Mobileye 630 PRO and the Tesla Model X, HW 2.5; our experiments show that when presented with various phantoms, a car's ADAS or autopilot considers the phantoms as real objects, causing these systems to trigger the brakes, steer into the lane of oncoming traffic, and issue notifications about fake road signs. To mitigate this attack, we present a model that analyzes a detected object's context, surface, and reflected light, which is capable of detecting phantoms with 0.99 AUC. Finally, we explain why the deployment of vehicular communication systems might reduce attackers' opportunities to apply phantom attacks but won't eliminate them.
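The countermeasure described above scores a detected object on several independent cues before trusting it. Below is a minimal sketch, not the paper's implementation, of that idea: three lightweight classifiers judge the object's context, surface, and reflected light, and their scores are combined into a single real-vs-phantom decision. The architectures, crop sizes, and the learned linear combination are assumptions for illustration.

```python
# Hypothetical sketch of a multi-cue phantom detector (context, surface, reflected light).
import torch
import torch.nn as nn


def small_cnn(out_dim: int = 1) -> nn.Sequential:
    """A tiny convolutional scorer for one cue."""
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(32, out_dim),
    )


class PhantomDetectorSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.context_net = small_cnn()   # wide crop around the detected object
        self.surface_net = small_cnn()   # tight crop of the object surface
        self.light_net = small_cnn()     # crop emphasizing brightness/reflections
        self.combine = nn.Linear(3, 1)   # learned weighting of the three cue scores

    def forward(self, context_crop, surface_crop, light_crop):
        scores = torch.cat(
            [self.context_net(context_crop),
             self.surface_net(surface_crop),
             self.light_net(light_crop)], dim=1)
        # probability that the detected object is real (vs. a projected phantom)
        return torch.sigmoid(self.combine(scores)).squeeze(-1)


if __name__ == "__main__":
    det = PhantomDetectorSketch()
    crops = [torch.randn(4, 3, 64, 64) for _ in range(3)]
    print(det(*crops).shape)  # torch.Size([4]) -- one realness score per detection
```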