ISBN: (print) 9781728193625
Much current research on image captioning focuses on modifying the CNN (Convolutional Neural Network) or the RNN (Recurrent Neural Network), and on adding an attention mechanism to strengthen the long-term memory of the RNN. However, the relationship between the input data and the CNN model may be another important factor. This paper defines image complexity in order to improve the model's accuracy. After analyzing the data set, criteria for image complexity are defined according to the image grayscale entropy and the two-dimensional entropy. A new model is then built and compared with another model. The new model obtains better results than the other model under the revised bilingual evaluation understudy (R-BLEU) index, a new formula for evaluating image captioning performance.
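The two entropy measures named in this abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas or thresholds, so this uses the common textbook definitions as an assumption: Shannon entropy of the 8-bit grayscale histogram, and entropy of the joint distribution of each pixel value with the mean of its 8 neighbours. The function names are hypothetical.

```python
import numpy as np

def gray_entropy(img: np.ndarray) -> float:
    """Shannon entropy (in bits) of the 8-bit grayscale histogram."""
    hist = np.bincount(img.astype(np.int64).ravel(), minlength=256)
    p = hist / hist.sum()
    p = p[p > 0]  # empty bins contribute nothing to the sum
    return float(-(p * np.log2(p)).sum())

def two_dim_entropy(img: np.ndarray) -> float:
    """Entropy (in bits) of the joint (pixel value, 8-neighbour mean) distribution.

    Assumed definition: borders are handled by edge replication and the
    neighbour mean is truncated to an integer in [0, 255].
    """
    img = img.astype(np.int64)
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")
    total = np.zeros_like(img)
    for dy in range(3):
        for dx in range(3):
            if dy == 1 and dx == 1:
                continue  # skip the centre pixel itself
            total += padded[dy:dy + h, dx:dx + w]
    neigh_mean = total // 8
    joint = np.zeros((256, 256), dtype=np.float64)
    np.add.at(joint, (img.ravel(), neigh_mean.ravel()), 1)
    p = joint / joint.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Toy inputs: a flat image and an image split into a black and a white half.
flat = np.full((8, 8), 7, dtype=np.uint8)
half = np.zeros((8, 8), dtype=np.uint8)
half[:, 4:] = 255
```

The flat image has zero entropy under both measures. The split image has a grayscale entropy of 1 bit, while its two-dimensional entropy is higher because the pixels along the boundary see a different neighbourhood mean, which is the spatial information the one-dimensional histogram ignores.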
In this work, we develop a new approach for learning a deep neural network for image classification with noisy labels using ensemble diversified learning. We first partition the training set into multiple subsets with diversified image characteristics. For each subset, we train a separate deep neural network image classifier. These networks are then used to encode the input image into different feature vectors, providing diversified observations of the input image. The encoded features are then fused and further analyzed by a decision network to produce the final classification output. We study image classification with noisy labels both with and without access to clean samples. Our extensive experimental results on the CIFAR-10 and MNIST datasets demonstrate that our proposed method outperforms existing methods by a large margin.
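The partition, encode, and fuse steps described in this abstract can be sketched structurally as follows. This is only an illustration under loud assumptions: the abstract does not say which image characteristics drive the partition or how features are fused, so mean intensity and plain concatenation are used as stand-ins, and fixed random linear maps replace the trained per-subset networks. All names are hypothetical.

```python
import numpy as np

def partition_by_characteristic(x: np.ndarray, n_subsets: int):
    """Split sample indices into subsets by a scalar image characteristic.

    Mean intensity is an assumed stand-in for the paper's unspecified
    "diversified image characteristics".
    """
    score = x.reshape(len(x), -1).mean(axis=1)
    order = np.argsort(score)
    return np.array_split(order, n_subsets)

def make_encoder(seed: int, in_dim: int, out_dim: int = 5):
    """Stand-in for a network trained on one subset: a fixed random linear map."""
    w = np.random.default_rng(seed).normal(size=(in_dim, out_dim))
    return lambda x: x.reshape(len(x), -1) @ w

def fuse_features(encoders, x: np.ndarray) -> np.ndarray:
    """Fuse per-encoder feature vectors by concatenation (an assumption)."""
    return np.concatenate([enc(x) for enc in encoders], axis=1)

rng = np.random.default_rng(0)
images = rng.random((12, 4, 4))  # 12 toy "images" of 4x4 pixels
subsets = partition_by_characteristic(images, 3)
encoders = [make_encoder(seed, in_dim=16) for seed in range(len(subsets))]
fused = fuse_features(encoders, images)  # would be fed to a decision network
```

In a real pipeline each encoder would be trained on the samples indexed by its own subset, and `fused` would be the input to the decision network that produces the final class.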
ISBN: (print) 9781728171227
Precise event characterization in connected vehicles has become increasingly important given the responsive connectivity available to modern vehicles. Event characterization via vehicular sensors is used in safety and autonomous driving applications. While characterization systems are capable of predicting risky driving patterns, the precision of such systems remains an open issue. The major issues facing driving event characterization systems in connected vehicle settings are the heavy imbalance and event infrequency of the driving data and the existence of time-series detection systems that are optimized for vehicular settings. To overcome these problems, we apply a prior-knowledge input method to the characterization systems. Furthermore, we propose a recurrent-based denoising auto-encoder network to populate the existing data for a more robust training process. The experimental results show that knowledge-based modeling enables the existing systems to reach significantly higher accuracy and F1-score levels. Ultimately, the combination of the two methods enables the proposed model to attain a 14.7% accuracy boost over the baseline, achieving an accuracy of 0.96.
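The abstract reports both accuracy and F1-score on heavily imbalanced event data. A small worked example shows why the two can diverge in that setting. The labels below are invented for illustration and are not the paper's data; the function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive (event) class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0  # no event detected correctly
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Invented toy data: 95 normal samples (0) and 5 risky events (1).
y_true = [0] * 95 + [1] * 5
always_normal = [0] * 100             # never flags an event
detector = [0] * 95 + [1, 1, 1, 0, 0]  # finds 3 of the 5 events, no false alarms

print(accuracy(y_true, always_normal), f1_score(y_true, always_normal))
print(accuracy(y_true, detector), f1_score(y_true, detector))
```

A model that never predicts an event still scores 0.95 accuracy here while its F1-score is 0, whereas the detector that finds three of the five events scores 0.98 accuracy and an F1-score of 0.75. This is why accuracy alone says little about rare events.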
ISBN: (print) 9798350330991; 9798350331004
Recently, neural image compression has made significant progress in rate-distortion performance and has received widespread attention. However, existing methods focus more on perfecting entropy models and overlook the ability of their encoder networks to extract non-linear features of images, which can improve compression performance. In this paper, we design a learning-based asymmetric image compression network that enhances feature representation capability for improved compression quality. First, we propose a high-preserving information block (HPIB), consisting of a high-frequency filtering module (HFM) and a feature modulation module (FMM), to fully utilize the different frequency information in images. Second, we progressively use the HPIB layer to design a high-performance encoder network for high-fidelity feature extraction. Results from extensive experiments demonstrate that our network outperforms the prior art in terms of both PSNR and MS-SSIM metrics and achieves 3.91% and 8.88% BD-rate over VVC on the Kodak and CLIC datasets, respectively.
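Of the metrics named in this abstract, PSNR is simple enough to state directly. The sketch below uses the standard definition for 8-bit images, 10·log10(MAX² / MSE). It is not the authors' evaluation code, and the function name is an assumption.

```python
import numpy as np

def psnr(reference: np.ndarray, reconstruction: np.ndarray,
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    ref = reference.astype(np.float64)  # cast first to avoid uint8 wrap-around
    rec = reconstruction.astype(np.float64)
    mse = np.mean((ref - rec) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))

# Toy inputs: a black image compared with small and maximal errors.
black = np.zeros((4, 4), dtype=np.uint8)
off_by_one = np.ones((4, 4), dtype=np.uint8)
white = np.full((4, 4), 255, dtype=np.uint8)
```

Higher PSNR means a smaller mean squared error: the off-by-one reconstruction scores about 48.13 dB against the black image, while the all-white one scores 0 dB.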