This paper proposes an improved method for identifying images in IoT-based surveillance systems using deep learning techniques, especially CNNs. The growth of concern and ...
As the demand for volunteer services grows, enhancing the efficiency and management of these services has become a key issue. This paper explores methods of integrating intelligent scheduling management functions with...
ISBN: (Print) 9789811697357; 9789811697340
At present, most low-light image enhancement algorithms rely on hand-crafted designs with a priori information and constraints, but they cannot accurately capture the deep structural features of images. In this paper, we propose an improved algorithm that uses EnlightenGAN as the base framework and accounts for the fact that the distance measured by Mean Square Error (MSE) can differ considerably from human intuition. By introducing Structural Similarity (SSIM) into the loss function, we aim to improve the structural similarity of the enhanced images so that they look more natural and agree better with human intuition. To address the instability of Generative Adversarial Network (GAN) training, the relatively loose and easy-to-compute Energy-Based GAN (EBGAN) is used instead of Wasserstein GAN (WGAN). The improved algorithm is tested on the Low-light Image Enhancement (LIME) dataset as well as the test dataset of EnlightenGAN. The Peak Signal to Noise Ratio (PSNR) and SSIM values of the images processed by the modified algorithm are calculated and compared with those of other algorithms, and the results show that the proposed method is effective.
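The point about MSE and human intuition can be illustrated with a minimal sketch. It assumes 8-bit grayscale images flattened into plain lists and computes SSIM over the whole image as a single window, a simplification of the usual sliding-window SSIM. The function names and the `1 - SSIM` loss term are illustrative assumptions, not the loss used in the paper.

```python
import math

def mse(a, b):
    """Mean squared error between two equally sized flat pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def psnr(a, b, max_val=255.0):
    """Peak Signal to Noise Ratio in dB; infinite for identical images."""
    err = mse(a, b)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_val ** 2 / err)

def global_ssim(a, b, max_val=255.0, k1=0.01, k2=0.03):
    """SSIM with the whole image treated as one window (simplification)."""
    n = len(a)
    mu_a = sum(a) / n
    mu_b = sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / n
    var_b = sum((y - mu_b) ** 2 for y in b) / n
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / n
    c1 = (k1 * max_val) ** 2  # stabilizes the luminance term
    c2 = (k2 * max_val) ** 2  # stabilizes the contrast/structure term
    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator

def ssim_loss(a, b):
    """Hypothetical loss term: lower when the structures agree."""
    return 1.0 - global_ssim(a, b)

# Two distortions of the same reference with identical MSE.
ref = [10, 50, 90, 130, 170, 210]
shifted = [v + 20 for v in ref]                                  # uniform brightness shift
noisy = [v + (20 if i % 2 == 0 else -20) for i, v in enumerate(ref)]  # alternating noise

print("MSE :", mse(ref, shifted), mse(ref, noisy))
print("PSNR:", psnr(ref, shifted), psnr(ref, noisy))
print("SSIM:", global_ssim(ref, shifted), global_ssim(ref, noisy))
```

Both distorted images are equally far from the reference under MSE and PSNR, yet the brightness-shifted one keeps the structure of the reference and scores higher under SSIM, which is the kind of difference an SSIM term in the loss is meant to capture.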
ISBN: (Digital) 9798350360240
ISBN: (Print) 9798350384161
This research examines the prospects of using deep learning and data mining to design an English teaching and quality evaluation system model. The paper first reviews and analyzes the relevant literature on evaluation systems designed with data mining and deep learning algorithms. The findings show that deep learning and data mining technologies, including neural networks, play a critical role in the development of English language assessment and evaluation systems. The findings also establish that a system designed with deep learning and data mining techniques significantly reduces the delays incurred in the traditional English teaching and evaluation system.
ISBN: (Print) 9781665427357
Visual emotion recognition is a very large field. It plays a very important role in different domains such as security, robotics, and medical tasks. The visual input can be either images or video. Unlike image processing, video processing remains challenging because the information changes over time. Applying deep learning algorithms to video processing has brought significant performance improvements. This paper presents a deep neural network based on the ResNet50 model. The model is evaluated on the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) because its data are of two different kinds, speech and song. The ResNet model was chosen for its ability to cope with problems such as vanishing gradients, the stable performance it offers, the feature-extraction ability of the CNN that serves as the base architecture for ResNet, and its ability to improve accuracy and minimize loss. The achieved results are 57.73% for song and 55.52% for speech. The results show that the ResNet50 model is suitable for both speech and song while maintaining performance stability.
ISBN: (Print) 9783031341267
The proceedings contain 49 papers. The special focus in this conference is on Actual Problems of Applied Mathematics and Computer Systems. The topics include: Initial-Boundary Value Problems for the Loaded Hallaire Equation with Gerasimov–Caputo Fractional Derivatives of Different Orders; Grid Method for Solving Local and Nonlocal Boundary Value Problems for a Loaded Moisture Transfer Equation with Two Fractional Differentiation Operators; Determining Frequencies of Free Longitudinal Vibrations of Rods by Analytical and Numerical Methods; Forced Longitudinal Oscillations of a Rod with a Mass at the End; On the Unassociated Matrices Number of the n Order and a Given Determinant; Fast Calculation of Parameters of Parallelepipedal Nets for Integration and Interpolation; Modeling of the Adjustable DC Voltage Source for Industrial Greenhouse Lighting Systems; Modeling of Temperature Contribution to the Interphase Energy of the Faces of Cadmium Crystals at the Boundary with Organic Liquids; Mathematical Concept of a Model for Processing Metadata of Employee’s Psycho-States for Identifying Him as an Internal Violator (Insider); Difference Method for Solving the Dirichlet Problem for a Multidimensional Integro-Differential Equation of Convection-Diffusion; Analysis of Influence of Byzantine Robots with Random Behaviour Strategy on Collective Decision-Making in Swarms; Beamforming for Dense Networks-Trends and Techniques; Modified CLNet: A Neural Network Based CSI Feedback Compression Model for Massive MIMO System; Facemask Wearing Correctness Detection Using Deep Learning Approaches; An Approach to the Implementation of Nonparametric Algorithms for Controlling Multidimensional Processes in a Production Problem; Analysis of Neural Networks for Image Classification; Factors of a Mathematical Model for Detection an Internal Attacker of the Company; Comparative Analysis of Methods and Algorithms for Building a Digital Twin of a Smart City; Model of Error Correction Device in RNS-FRNN.
ISBN: (Print) 9781665491907
In end-to-end driving, human driving demonstrations are used to train perception-based driving models by imitation learning. This process is supervised on vehicle signals (e.g., steering angle, acceleration) but does not require extra costly supervision (human labeling of sensor data). As a representative of such vision-based end-to-end driving models, CILRS is commonly used as a baseline to compare with new driving models. So far, some recent models achieve better performance than CILRS by using expensive sensor suites and/or large amounts of human-labeled data for training. Given the difference in performance, one may think that it is not worth pursuing vision-based pure end-to-end driving. However, we argue that this approach still has great value and potential considering cost and maintenance. In this paper, we present CIL++, which improves on CILRS by both processing higher-resolution images using a human-inspired HFOV as an inductive bias and incorporating a proper attention mechanism. CIL++ achieves competitive performance compared to models which are more costly to develop. We propose to replace CILRS with CIL++ as a strong vision-based pure end-to-end driving baseline supervised by only vehicle signals and trained by conditional imitation learning.
A novel hyperspectral image classification algorithm is proposed and demonstrated on benchmark hyperspectral images. We also introduce a hyperspectral sky imaging dataset that we are collecting for detecting the amoun...
ISBN: (Print) 9781713871088
Spatial-wise dynamic convolution has become a promising approach to improving the inference efficiency of deep networks. By allocating more computation to the most informative pixels, such an adaptive inference paradigm reduces the spatial redundancy in image features and saves a considerable amount of unnecessary computation. However, the theoretical efficiency achieved by previous methods can hardly translate into a realistic speedup, especially on multi-core processors (e.g., GPUs). The key challenge is that the existing literature has only focused on designing algorithms with minimal computation, ignoring the fact that the practical latency can also be influenced by scheduling strategies and hardware properties. To bridge the gap between theoretical computation and practical efficiency, we propose a latency-aware spatial-wise dynamic network (LASNet), which performs coarse-grained spatially adaptive inference under the guidance of a novel latency prediction model. The latency prediction model can efficiently estimate the inference latency of dynamic networks by simultaneously considering algorithms, scheduling strategies, and hardware properties. We use the latency predictor to guide both the algorithm design and the scheduling optimization on various hardware platforms. Experiments on image classification, object detection and instance segmentation demonstrate that the proposed framework significantly improves the practical inference efficiency of deep networks. For example, the average latency of a ResNet-101 on the ImageNet validation set could be reduced by 36% and 46% on a server GPU (Nvidia Tesla-V100) and an edge device (Nvidia Jetson TX2 GPU) respectively without sacrificing accuracy. Code is available at https://***/LeapLabTHU/LASNet.
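The idea of coarse-grained spatially adaptive inference can be sketched in a few lines. This is a toy illustration only, not the LASNet implementation: it assumes a 2D feature map stored as nested lists, a per-patch score grid supplied by hand (in a real network the mask would be learned), and hypothetical helper names `patch_mask` and `adaptive_apply`.

```python
def patch_mask(scores, threshold):
    """Mark a patch as active (True) when its score reaches the threshold."""
    return [[s >= threshold for s in row] for row in scores]

def adaptive_apply(feature, mask, patch, fn):
    """Apply fn only to pixels inside active patches; copy the rest unchanged.

    Returns the output map and the fraction of pixels actually computed.
    """
    out = [row[:] for row in feature]
    computed = 0
    for i, row in enumerate(feature):
        for j, value in enumerate(row):
            # Integer division maps a pixel to the patch that contains it.
            if mask[i // patch][j // patch]:
                out[i][j] = fn(value)
                computed += 1
    total = len(feature) * len(feature[0])
    return out, computed / total

# A 4x4 feature map split into 2x2 patches; two of the four patches are active.
feature = [[1, 2, 3, 4],
           [5, 6, 7, 8],
           [9, 10, 11, 12],
           [13, 14, 15, 16]]
scores = [[0.9, 0.1],
          [0.2, 0.8]]  # hypothetical "informativeness" per patch

mask = patch_mask(scores, threshold=0.5)
out, fraction = adaptive_apply(feature, mask, patch=2, fn=lambda v: v * 2)
print(out)
print("fraction of pixels computed:", fraction)
```

The computed fraction is only the theoretical saving. As the abstract stresses, whether it becomes a real speedup depends on scheduling and hardware, which is why a coarser patch granularity, chosen with the help of a latency predictor, may be preferable to per-pixel masks.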
Industrial machine-vision (MV) applications require high-speed stitching of low-textural images from multiple high-resolution cameras for Field-of-View expansion. The most vital step in the stitching process is the ef...