Recently, segmentation neural networks have improved significantly, demonstrating very promising accuracy on public benchmarks. However, these models are very heavy and generally suffer from low inference spe...
Magnetic resonance (MR) image acquisition is an inherently prolonged process, whose acceleration has long been the subject of research. This is commonly achieved by obtaining multiple undersampled images, simultaneous...
We investigate the systematic convolutional low density generator matrix (SC-LDGM) codes over Rayleigh fading channels with symmetric alpha-stable (SαS) impulsive noise. The performance is analy...
This study proposes different standalone models, namely the Elman neural network (ENN), the Boosted Tree algorithm (BTA), and the relevance vector machine (RVM), for modeling arsenic (As (mg/kg)) and zinc (Zn (mg/kg)) in marine sed...
ISBN: 9781728113319 (digital)
ISBN: 9781728113326 (print)
Designing a forensic convolutional neural network (CNN) is usually based on ad-hoc intuition and domain knowledge. Many methods to automate neural network design have been proposed for computer vision tasks, but they may not be directly applicable to image forensic problems, which tend to detect the weak trace signals left by image operations rather than strong image content signals. In this paper, we propose an approach to learn an optimal forensic CNN structure with reinforcement learning for detecting multiple image tampering operations. A learning agent is introduced to select CNN layers sequentially in a limited state-action space using Q-learning with an $\epsilon$-greedy strategy and experience replay. The experiments demonstrate that the auto-generated network performs better than other classic image forensic methods and is more robust against JPEG compression. To our knowledge, this is the first attempt to design forensic deep neural networks automatically with reinforcement learning.
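The layer-selection loop described in this abstract can be illustrated with a toy example. The following is a minimal sketch of tabular Q-learning with an epsilon-greedy policy and experience replay, not the authors' implementation: the layer names, the depth cap, the annealing schedule, and the `toy_reward` function (a stand-in for the validation accuracy of a trained network) are all assumptions made for illustration.

```python
import random
from collections import deque

LAYER_CHOICES = ["conv3x3", "conv5x5", "pool"]  # hypothetical layer types
ACTIONS = LAYER_CHOICES + ["terminate"]
MAX_DEPTH = 4  # assumed cap that keeps the state-action space small


def toy_reward(layers):
    """Stand-in for validation accuracy; a real run would train and evaluate the network."""
    return 0.25 * sum(1 for name in layers if name == "conv3x3")


def epsilon_greedy(q_table, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the highest-valued one."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda action: q_table.get((state, action), 0.0))


def sample_architecture(q_table, epsilon, rng):
    """Select layers one at a time; the state is the tuple of layers chosen so far."""
    state, transitions = (), []
    while len(state) < MAX_DEPTH:
        action = epsilon_greedy(q_table, state, epsilon, rng)
        if action == "terminate":
            transitions.append((state, action, None))
            break
        next_state = state + (action,)
        transitions.append((state, action, next_state))
        state = next_state
    return list(state), transitions


def q_update(q_table, transitions, final_reward, alpha=0.1, gamma=1.0):
    """Apply the Q-learning update to one stored episode, last transition first."""
    for state, action, next_state in reversed(transitions):
        terminal = next_state is None or len(next_state) >= MAX_DEPTH
        if terminal:
            # The only reward arrives once the architecture is complete.
            target = final_reward
        else:
            target = gamma * max(q_table.get((next_state, a), 0.0) for a in ACTIONS)
        old = q_table.get((state, action), 0.0)
        q_table[(state, action)] = old + alpha * (target - old)


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q_table = {}
    replay = deque(maxlen=200)  # experience replay buffer of finished episodes
    for episode in range(episodes):
        # Anneal exploration from 1.0 down to a floor of 0.1.
        epsilon = max(0.1, 1.0 - episode / (0.8 * episodes))
        layers, transitions = sample_architecture(q_table, epsilon, rng)
        replay.append((transitions, toy_reward(layers)))
        batch = rng.sample(list(replay), min(8, len(replay)))
        for stored_transitions, stored_reward in batch:
            q_update(q_table, stored_transitions, stored_reward)
    return q_table


if __name__ == "__main__":
    learned = train()
    best_layers, _ = sample_architecture(learned, 0.0, random.Random(0))
    print(best_layers)
```

Replaying stored episodes lets each expensive evaluation (here a cheap toy function) contribute to several Q-value updates, which is the usual motivation for experience replay when every reward requires training a network.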
ISBN: 9781728171685 (digital)
ISBN: 9781728171692 (print)
Recently, contour information has largely improved the performance of saliency detection. However, discussion of the correlation between saliency and contour remains scarce. In this paper, we first analyze this correlation and then propose an interactive two-stream decoder to explore multiple cues, including saliency, contour and their correlation. Specifically, our decoder consists of two branches, a saliency branch and a contour branch. Each branch is assigned to learn distinctive features for predicting the corresponding map. Meanwhile, the intermediate connections are forced to learn the correlation by interactively transmitting the features from each branch to the other. In addition, we develop an adaptive contour loss to automatically discriminate hard examples during the learning process. Extensive experiments on six benchmarks demonstrate that our network achieves competitive performance at around 50 FPS. Moreover, our VGG-based model contains only 17.08 million parameters, which is significantly fewer than other VGG-based approaches. Code has been made available at: https://***/moothes/ITSD-pytorch.
Person re-identification (re-id), the process of matching pedestrian images across different camera views, is an important task in visual surveillance. Substantial development of re-id has recently been observed, and the majority of existing models are largely dependent on color appearance and assume that pedestrians do not change their clothes across camera views. This limitation, however, can be an issue for re-id when tracking a person at different places and at different times if that person (e.g., a criminal suspect) changes his/her clothes, causing most existing methods to fail, since they rely heavily on color appearance and are thus inclined to match a person to another person wearing similar clothes. In this work, we call person re-id under clothing change "cross-clothes person re-id". In particular, as a first attempt at solving this problem based on visible light images, we consider the case when a person changes his/her clothes only moderately; that is, we assume that a person wears clothes of a similar thickness, and thus the shape of a person would not change significantly when the weather does not change substantially within a short period of time. We perform cross-clothes person re-id based on a contour sketch of the person image to take advantage of the shape of the human body instead of color information for extracting features that are robust to moderate clothing change. To select/sample more reliable and discriminative curve patterns on a body contour sketch, we introduce a learning-based spatial polar transformation (SPT) layer in the deep neural network to transform contour sketch images for extracting reliable and discriminant convolutional neural network (CNN) features in a polar coordinate space. An angle-specific extractor (ASE) is applied in the following layers to extract more fine-grained discriminant angle-specific features. By varying the sampling range of the SPT, we develop a multistream network for aggregating multi-granular
General-purpose forensics on small image patches appears to be feasible and important, but in fact poses a challenge due to insufficient statistics. Furthermore, there is a need to develop a forensic approach that can...
We uncover a phenomenon largely overlooked by the scientific community utilizing AI: neural networks exhibit high susceptibility to minute perturbations, resulting in significant deviations in their outputs. Through a...
ISBN: 9781728171685 (digital)
ISBN: 9781728171692 (print)
An integral part of video analysis and surveillance is temporal activity detection, which means simultaneously recognizing and localizing activities in long untrimmed videos. Currently, the most effective methods of temporal activity detection are based on deep learning, and they typically perform very well when large-scale annotated videos are available for training. However, these methods are limited in real applications because videos of certain activity classes are unavailable and data annotation is time-consuming. To solve this challenging problem, we propose a novel task setting called zero-shot temporal activity detection (ZSTAD), where activities that have never been seen in training can still be detected. We design an end-to-end deep network based on R-C3D as the architecture for this solution. The proposed network is optimized with an innovative loss function that considers the embeddings of activity labels and their super-classes while learning the common semantics of seen and unseen activities. Experiments on both the THUMOS'14 and the Charades datasets show promising performance in terms of detecting unseen activities.
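As a rough illustration of how label embeddings allow unseen classes to be recognized, the sketch below assigns a detected segment to the label whose embedding is most similar to the segment's embedding. This is the generic nearest-embedding idea behind many zero-shot methods, not the ZSTAD network or its loss function; the three-dimensional vectors, the label names, and the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def zero_shot_label(segment_embedding, label_embeddings):
    """Return the label whose embedding is closest to the segment embedding."""
    return max(
        label_embeddings,
        key=lambda name: cosine(segment_embedding, label_embeddings[name]),
    )


if __name__ == "__main__":
    # Hypothetical label embeddings; in practice these would come from a word
    # embedding model, so that unseen labels share a space with seen ones.
    labels = {
        "running": [1.0, 0.1, 0.0],
        "swimming": [0.0, 1.0, 0.2],
        "cooking": [0.0, 0.1, 1.0],
    }
    # Hypothetical embedding predicted for one video segment.
    segment = [0.1, 0.9, 0.3]
    print(zero_shot_label(segment, labels))
```

Because the classifier only compares vectors, a class with no training videos can still be predicted as long as its label has an embedding in the same space.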