ISBN: 9781538662496 (print)
We present a single-shot detector based on 3D convolutional neural networks (CNNs) for spatio-temporal action detection. Our model includes (i) two short-term appearance and motion streams, which take a single RGB image and a single optical flow image as input, respectively, to capture the spatial and temporal information of the current frame; and (ii) two long-term 3D ConvNet-based streams, which operate on sequences of consecutive RGB and optical flow images to capture context from past frames. Our model achieves strong performance for action detection in video and can be easily integrated into any current two-stream action detection method. We report a frame-mAP of 71.30% on the challenging UCF101-24 [1] action dataset, the state-of-the-art result among one-stage methods. To the best of our knowledge, ours is the first system to combine 3D CNNs and SSD for action detection.
The capability of globally modeling and reasoning about relations between image regions is crucial for complex scene understanding tasks such as semantic segmentation. Most current semantic segmentation methods fall b...
BigNeuron is an open community bench-testing platform with the goal of setting open standards for accurate and fast automatic neuron tracing. We gathered a diverse set of image volumes across several species that is representative of the data obtained in many neuroscience laboratories interested in neuron tracing. Here, we report generated gold standard manual annotations for a subset of the available imaging datasets and quantified tracing quality for 35 automatic tracing algorithms. The goal of generating such a hand-curated diverse dataset is to advance the development of tracing algorithms and enable generalizable benchmarking. Together with image quality features, we pooled the data in an interactive web application that enables users and developers to perform principal component analysis, t-distributed stochastic neighbor embedding, correlation and clustering, visualization of imaging and tracing data, and benchmarking of automatic tracing algorithms in user-defined data subsets. The image quality metrics explain most of the variance in the data, followed by neuromorphological features related to neuron size. We observed that diverse algorithms can provide complementary information to obtain accurate results and developed a method to iteratively combine methods and generate consensus reconstructions. The consensus trees obtained provide estimates of the neuron structure ground truth that typically outperform single algorithms in noisy datasets. However, specific algorithms may outperform the consensus tree strategy in specific imaging conditions. Finally, to aid users in predicting the most accurate automatic tracing results without manual annotations for comparison, we used support vector machine regression to predict reconstruction quality given an image volume and a set of automatic tracings.
ISBN: 9781509066315 (digital); 9781509066322 (print)
We propose an unsupervised superpixel segmentation method that optimizes a randomly initialized convolutional neural network (CNN) at inference time. Our method generates superpixels with a CNN from a single image, without any labels, by minimizing a proposed objective function for superpixel segmentation at inference time. Compared with many existing methods, our method has three advantages: it (i) leverages the image prior of a CNN for superpixel segmentation, (ii) adaptively changes the number of superpixels according to the given image, and (iii) controls the properties of the superpixels by adding an auxiliary cost to the objective function. We verify these advantages quantitatively and qualitatively on the BSDS500 dataset.
ISBN: 9783030367114; 9783030367107 (print)
Recent studies have shown that effectively combining the rich representations of convolutional neural networks can significantly boost the performance of single-image super-resolution. Although dense skip connections can aggressively reduce depth and parameter count through feature reuse, they are a memory-intensive fusion operation. In this paper, we propose a tree-structured deep aggregation block that spans the spectrum of layers to achieve higher accuracy with fewer parameters and less memory in super-resolution. Most methods fuse all block features in a simple one-step aggregation, which is not robust enough for training data with discrepancies. We therefore propose a recursive aggregation structure that captures rich semantic information and propagates features and gradients more effectively. We evaluate our method on three benchmark datasets and obtain results comparable to state-of-the-art methods in PSNR (peak signal-to-noise ratio) and SSIM (structural similarity).
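For readers unfamiliar with the PSNR metric mentioned above, the following is a minimal sketch of its standard definition, 10 * log10(MAX^2 / MSE). It is not taken from the paper: the function name `psnr`, the nested-list image representation, and the default peak value of 255 (8-bit images) are assumptions made for illustration.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are nested lists of pixel intensities (an assumption for this
    sketch); max_value is the largest possible pixel value.
    """
    ref = [p for row in reference for p in row]
    est = [p for row in estimate for p in row]
    if not ref or len(ref) != len(est):
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values indicate a reconstruction closer to the reference; identical images give an infinite PSNR.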
In recent years, deep learning methods have achieved state-of-the-art performance in a vast range of machine learning tasks, including image classification and multilingual automatic text translation. These architectures are trained to solve machine learning tasks in an end-to-end fashion. To reach top-tier performance, they often require a very large number of trainable parameters. This has multiple undesirable consequences, and to tackle these issues it is desirable to be able to open the black boxes of deep learning architectures. Doing so is difficult because of (i) the high dimensionality of the representations and (ii) the stochasticity of the training process. In this thesis, we study these architectures by introducing a graph-based formalism that builds on recent advances in graph signal processing (GSP). Namely, we use graphs to represent the latent spaces of deep neural networks. We show that this graph formalism allows us to address various questions, including (i) measuring generalization ability; (ii) reducing the number of arbitrary choices in the design of the learning process; (iii) improving robustness to small perturbations added to the inputs; and (iv) reducing computational complexity.
The problem of distinguishing deterministic chaos from non-chaotic dynamics has been an area of active research in time series analysis. Because noise contamination is unavoidable, deterministic chaotic dynamics corrupted by noise closely resemble stochastic dynamics. As a result, distinguishing noise-corrupted chaotic dynamics from randomness based on observations, without access to measurements of the state variables, is difficult. We propose a new angle on this problem by formulating it as a multi-class classification task. The classification task involves allocating the observations/measurements to the unknown state variables in order to determine the nature of these unobserved internal state variables. We employ signal- and image-processing-based methods to characterize the different system dynamics. A deep learning technique using a state-of-the-art image classifier, the convolutional neural network (CNN), is designed to learn the dynamics. The time series are transformed into textured images, namely spectrograms and unthresholded recurrence plots (UTRPs), for learning stochastic and deterministic chaotic dynamical systems in noise. We have designed a CNN that learns the dynamics of systems from the joint representation of the textured patterns in these images, thereby solving the problem as a pattern recognition task. The robustness and scalability of our approach are evaluated at different noise levels. Our approach demonstrates the advantage of exploiting the dynamical properties of chaotic systems, in the form of a joint representation of UTRP images and spectrograms, to improve the learning of dynamical systems in colored noise.
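To make the UTRP representation concrete, here is a minimal sketch of how an unthresholded recurrence plot is commonly computed: the series is delay-embedded into state vectors, and the plot is the matrix of pairwise Euclidean distances between those vectors (no threshold is applied). The abstract does not give the authors' exact construction, so the function names, the embedding dimension `dim`, and the delay `delay` are assumptions chosen for illustration.

```python
import math


def delay_embed(series, dim=2, delay=1):
    """Build delay-embedded state vectors [x(i), x(i+delay), ...]."""
    n = len(series) - (dim - 1) * delay
    if n <= 0:
        raise ValueError("series too short for this embedding")
    return [[series[i + k * delay] for k in range(dim)] for i in range(n)]


def unthresholded_recurrence_plot(series, dim=2, delay=1):
    """Matrix of pairwise Euclidean distances between embedded states.

    A thresholded recurrence plot would binarize this matrix; the
    unthresholded version keeps the raw distances as image intensities.
    """
    states = delay_embed(series, dim, delay)
    return [[math.dist(a, b) for b in states] for a in states]
```

The resulting matrix is symmetric with a zero diagonal and can be rendered as a grayscale image for input to an image classifier.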
This paper addresses the problem of flood classification and flood aftermath detection based on both social media and satellite imagery. Automatic detection of disasters such as floods is still a very challenging task. The focus lies on identifying passable routes or roads during floods. Two novel solutions are presented, which were developed for two corresponding tasks at the MediaEval 2018 benchmarking challenge. The tasks are (i) identification of images providing evidence for road passability and (ii) differentiation and detection of passable and non-passable roads in images from two complementary sources of information. For the first task, we mainly rely on object- and scene-level features extracted through multiple deep models pre-trained on the ImageNet and Places datasets. The object- and scene-level features are then combined using early, late, and double fusion techniques. To identify whether a vehicle can pass along a road in satellite images, we rely on convolutional neural networks and a transfer-learning-based classification approach. The proposed methods are evaluated on the large-scale datasets provided for the benchmark competition. The results demonstrate a significant improvement in performance over recent state-of-the-art approaches.
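The abstract does not describe how the fusion is implemented. As a rough illustration of the general idea only, the sketch below contrasts early fusion (concatenating feature vectors before a single classifier) with late fusion (combining the scores of separate classifiers by a weighted average). The function names and the weighting scheme are hypothetical, and double fusion is not shown.

```python
def early_fusion(object_features, scene_features):
    """Early fusion: concatenate feature vectors into one joint vector,
    which would then be fed to a single classifier."""
    return list(object_features) + list(scene_features)


def late_fusion(scores, weights=None):
    """Late fusion: weighted average of per-model classification scores.

    With weights=None, every model contributes equally.
    """
    if not scores:
        raise ValueError("at least one score is required")
    if weights is None:
        weights = [1.0 / len(scores)] * len(scores)
    if len(weights) != len(scores):
        raise ValueError("one weight per score is required")
    return sum(w * s for w, s in zip(weights, scores))
```

For example, `late_fusion([0.2, 0.8])` averages an object-level and a scene-level passability score into a single decision score.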
ISBN: 9781728108247 (print)
While data poisoning attacks on classifiers were originally proposed to degrade a classifier's usability, there has been strong recent interest in backdoor data poisoning attacks, where the classifier learns to classify to a target class whenever a backdoor pattern (e.g., a watermark or innocuous pattern) is added to an example from some class other than the target class. In this paper, we conduct a benchmark experimental study to assess the effectiveness of backdoor attacks against deep neural network (DNN) classifiers for images (CIFAR-10 domain), as well as of anomaly detection defenses against these attacks, assuming the defender has access to the (poisoned) training set. We also propose a novel defense scheme, cluster impurity (CI), based on two ideas: (i) backdoor patterns may cluster in the latent space of a deep (e.g., penultimate) layer of the DNN; and (ii) image filtering (or additive noise) may remove the backdoor patterns and thus alter the class decision produced by the DNN. We demonstrate that largely imperceptible single-pixel backdoor attacks are highly successful, with no effect on classifier usability. However, the CI approach is highly effective at detecting these attacks, and more successful than previous backdoor detection methods.
ISBN: 9781728108247 (print)
Soft dropout, a generalization of standard "hard" dropout, is introduced to regularize the parameters of neural networks and prevent overfitting. We replace the "hard" dropout mask, which follows a Bernoulli distribution, with a "soft" mask that follows a beta distribution, so that hidden nodes are dropped to different degrees. Soft dropout thus introduces continuous mask coefficients in the interval [0, 1], rather than only zero and one. To make the dropout rate adaptive via adaptive distribution parameters, we use half-Gaussian and half-Laplace distributed variables, respectively, to approximate the beta-distributed masks, and we apply a variant of variational Bayes optimization, the stochastic gradient variational Bayes (SGVB) algorithm, to optimize the distribution parameters. In the experiments, compared with standard soft dropout with a fixed dropout rate, the adaptive soft dropout method generally improves performance. In addition, the proposed soft dropout and its adaptive versions improve performance over the reference methods on both image classification and regression tasks.
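The difference between a hard and a soft mask can be sketched as follows. This is only an illustration of the sampling step described above, not the authors' implementation: the function names and the beta parameters `alpha` and `beta` are assumptions, and the adaptive (half-Gaussian / half-Laplace, SGVB-optimized) variant and any activation rescaling are not shown.

```python
import random


def hard_dropout_mask(n, keep_prob, rng):
    """Standard dropout: each coefficient is 0 or 1 (Bernoulli)."""
    return [1.0 if rng.random() < keep_prob else 0.0 for _ in range(n)]


def soft_dropout_mask(n, alpha, beta, rng):
    """Soft dropout: each coefficient is drawn from Beta(alpha, beta),
    so it can take any value in the interval [0, 1]."""
    return [rng.betavariate(alpha, beta) for _ in range(n)]


def apply_mask(activations, mask):
    """Scale each hidden activation by its mask coefficient."""
    return [a * m for a, m in zip(activations, mask)]
```

Passing a seeded `random.Random` instance as `rng` keeps the sampled masks reproducible across runs.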