ISBN: (print) 9781665445092
Deep neural networks for video classification, just like image classification networks, may be subjected to adversarial manipulation. The main difference between image classifiers and video classifiers is that the latter usually use temporal information contained within the video. In this work we present a manipulation scheme for fooling video classifiers by introducing a flickering temporal perturbation that in some cases may be unnoticeable to human observers and is implementable in the real world. After demonstrating the manipulation of action classification on single videos, we generalize the procedure to produce a universal adversarial perturbation that achieves a high fooling ratio. In addition, we generalize the universal perturbation to produce a time-invariant perturbation, which can be applied to the video without synchronizing the perturbation to the input. The attack was implemented on several target models, and the transferability of the attack was demonstrated. These properties allow us to bridge the gap between the simulated environment and real-world application, as demonstrated in this paper for the first time with an over-the-air flickering attack.
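As a rough illustration of what a flickering temporal perturbation could look like, the sketch below adds one RGB offset per frame, applied identically to every pixel of that frame. The spatially uniform offset, the `[0, 1]` pixel range, and the names `apply_flicker`, `offsets`, and `shift` are assumptions made for this example; the abstract does not specify them. The cyclic `shift` stands in for an unsynchronized start of the perturbation, and the code does not show how the offsets would be optimized.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]   # RGB values, assumed to lie in [0, 1]
Frame = List[List[Pixel]]            # rows of pixels
Video = List[Frame]                  # sequence of frames


def clip01(x: float) -> float:
    """Clamp a channel value to the valid [0, 1] range."""
    return min(1.0, max(0.0, x))


def apply_flicker(video: Video, offsets: List[Pixel], shift: int = 0) -> Video:
    """Add a per-frame RGB offset, identical for all pixels in a frame.

    `offsets` is a hypothetical perturbation with one RGB triple per time
    step; `shift` cyclically shifts it to mimic a perturbation that is not
    synchronized with the start of the video.
    """
    n = len(offsets)
    perturbed = []
    for t, frame in enumerate(video):
        dr, dg, db = offsets[(t + shift) % n]
        perturbed.append([
            [(clip01(r + dr), clip01(g + dg), clip01(b + db)) for (r, g, b) in row]
            for row in frame
        ])
    return perturbed
```

Because the offset depends only on the frame index, the same sequence could in principle be produced by a light source changing color over time, which is one way to read the "over-the-air" setting described above.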
As a great invention that affects human development and progress, writing plays an essential role in human life and learning and the inheritance and development of culture. For thousands of years, the inheritance of C...
Facial recognition systems are incredibly important in today's digital world, spanning various industries. They have several benefits and play a crucial part in identifying individuals, authentication, and security....
ISBN: (print) 9781665493130
Mineral wool production is a non-linear process, which makes it hard to control the final quality. Therefore, having a non-destructive method to analyze the product quality and recognize defective products is critical. For this purpose, we developed a visual quality control system for mineral wool. X-ray images of wool specimens were collected to create a training set of defective and non-defective samples. Afterward, we developed several recognition models based on the ResNet architecture to find the most efficient one. To obtain a lightweight model with fast inference for real-life applicability, two structural pruning methods were applied to the classifiers. Given the small size of the dataset, cross-validation and augmentation methods were used during training. As a result, we obtained a model with more than 98% accuracy, which recognizes 20% more defective products than the current procedure used at the company.
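The cross-validation step mentioned above can be sketched as a plain k-fold split over sample indices. This is a generic illustration only: the abstract does not state the number of folds, whether the data were shuffled or stratified, or which tooling was used, so `k_fold_splits` and its parameters are hypothetical.

```python
from typing import List, Tuple


def k_fold_splits(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, validation_indices) pairs.

    Samples are split into k contiguous folds whose sizes differ by at
    most one; each fold serves as the validation set exactly once.
    """
    if not 1 < k <= n_samples:
        raise ValueError("k must satisfy 1 < k <= n_samples")
    indices = list(range(n_samples))
    remainder = n_samples % k
    # The first `remainder` folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < remainder else 0) for i in range(k)]

    splits = []
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        splits.append((train, validation))
        start += size
    return splits
```

With a small dataset, averaging the validation score over all folds gives a less noisy estimate than a single held-out split, which is presumably why the technique is used here.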
Artificial intelligence technology drives the reform of traditional teaching concepts, models, content, and methods, providing assistance for the informatization and intelligence of education. Classroom teaching activ...
Text extraction and character recognition are computer vision tasks that became important with the arrival of smartphones with good cameras. Character recognition from scene text images is still a challenging area, because the c...
ISBN: (print) 9781665445092
Many computer vision systems require users to upload image features to the cloud for processing and storage. These features can be exploited to recover sensitive information about the scene or subjects, e.g., by reconstructing the appearance of the original image. To address this privacy concern, we propose a new privacy-preserving feature representation. The core idea of our work is to drop constraints from each feature descriptor by embedding it within an affine subspace containing the original feature as well as adversarial feature samples. Feature matching on the privacy-preserving representation is enabled based on the notion of subspace-to-subspace distance. We experimentally demonstrate the effectiveness of our method and its high practical relevance for the applications of visual localization and mapping as well as face authentication. Compared to the original features, our approach makes it significantly more difficult for an adversary to recover private information.
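To make the subspace idea concrete, the sketch below uses the simplest possible case: each descriptor is hidden inside a line (a one-dimensional affine subspace) that passes through the original descriptor and a single adversarial sample, and two such lines are compared by their minimum Euclidean distance. The subspace dimension, the way adversarial samples are chosen, and the function name `line_to_line_distance` are assumptions for illustration; the abstract does not specify them.

```python
import math
from typing import Sequence


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def _unit_direction(p: Sequence[float], q: Sequence[float]) -> list:
    """Unit vector pointing from p to q (p and q must differ)."""
    d = [y - x for x, y in zip(p, q)]
    norm = math.sqrt(_dot(d, d))
    if norm == 0.0:
        raise ValueError("the two points defining a line must be distinct")
    return [x / norm for x in d]


def line_to_line_distance(p1, q1, p2, q2, eps: float = 1e-12) -> float:
    """Minimum distance between the line through p1, q1 and the line through p2, q2."""
    u1, u2 = _unit_direction(p1, q1), _unit_direction(p2, q2)
    w = [a - b for a, b in zip(p1, p2)]
    b, d, e = _dot(u1, u2), _dot(u1, w), _dot(u2, w)
    denom = 1.0 - b * b
    if denom < eps:
        # Parallel lines: remove the component of w along the shared direction.
        diff = [wi - d * ui for wi, ui in zip(w, u1)]
    else:
        # Closest points are p1 + s*u1 and p2 + t*u2.
        s = (b * e - d) / denom
        t = (e - b * d) / denom
        diff = [wi + s * a - t * c for wi, a, c in zip(w, u1, u2)]
    return math.sqrt(_dot(diff, diff))
```

In this toy setting, two lines built around the same original descriptor intersect and so have distance zero, while the server only ever sees the line and cannot tell which point on it is the true descriptor.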
We apply computer vision pose estimation techniques developed expressly for the data-scarce infant domain to the study of torticollis, a common condition in infants for which early identification and treatment is crit...
ISBN: (print) 9781665445092
Previous work [40] shows that a better density map representation can improve the performance of crowd counting. In this paper, we investigate learning the density map representation through an unbalanced optimal transport problem, and propose a generalized loss function to learn density maps for crowd counting and localization. We prove that pixel-wise L2 loss and Bayesian loss [29] are special cases and suboptimal solutions to our proposed loss function. A perspective-guided transport cost function is further proposed to better handle the perspective transformation in crowd images. Since the predicted density will be pushed toward annotation positions, the density map prediction will be sparse and can naturally be used for localization. Finally, the proposed loss outperforms other losses on four large-scale datasets for counting, and achieves the best localization performance on NWPU-Crowd and UCF-QNRF.
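For orientation, the sketch below shows the pixel-wise L2 baseline that the abstract names as a special case, not the proposed transport-based loss. It assumes, for simplicity, that each annotated head contributes unit mass at a single pixel with no Gaussian smoothing; the function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Grid = List[List[float]]


def point_density_map(points: Sequence[Tuple[int, int]], height: int, width: int) -> Grid:
    """Place unit mass at each annotated (row, col) position."""
    grid = [[0.0] * width for _ in range(height)]
    for row, col in points:
        grid[row][col] += 1.0
    return grid


def predicted_count(density: Grid) -> float:
    """The crowd count is the total mass of the density map."""
    return sum(sum(row) for row in density)


def pixelwise_l2_loss(pred: Grid, target: Grid) -> float:
    """Sum of squared per-pixel differences between two density maps."""
    return sum(
        (p - t) ** 2
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    )
```

A per-pixel loss of this kind penalizes a prediction that is one pixel off as heavily as one that is far away, which is the sort of limitation a transport cost between predicted density and annotation positions is meant to address.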
In this article, a human posture recognition model combining computer vision and a deep learning (DL) algorithm is proposed. Under the condition of keeping the data structure and sample labels unchanged, the model rando...