ISBN: (print) 9798350349405; 9798350349399
Atmospheric turbulence, a common phenomenon in daily life, is primarily caused by the uneven heating of the Earth's surface. It distorts and blurs acquired images and videos and can significantly degrade downstream vision tasks, particularly those that rely on capturing clear, stable images or videos from outdoor environments, such as object detection and recognition. Researchers have therefore proposed ways to simulate atmospheric turbulence and designed effective deep-learning-based methods to remove its effects. However, synthesized turbulent images cannot cover the full range of real-world turbulence effects. Although these models perform well in synthetic scenarios, their performance consistently drops when they are applied to real-world cases. Moreover, mitigating real-world turbulence is more challenging because no clean ground-truth counterparts are available to the models during training. In this paper, we propose a real-world atmospheric turbulence mitigation model under a domain adaptation framework, which links supervised correction of simulated atmospheric turbulence with unsupervised correction of real-world atmospheric turbulence. We show that the proposed method enhances performance in real-world atmospheric turbulence scenarios, improving both image quality and downstream vision tasks.
Facial expression recognition (FER) is utilized in various fields that analyze facial expressions. FER is attracting increasing attention for its role in making daily life more convenient and is widely applied in human-computer interaction tasks. Recently, however, FER has encountered certain data and training issues. To address them, few-shot learning (FSL) has been studied as a new approach because it can cope with training on limited data and with generalizing to in-the-wild conditions. In this paper, we analyze FSL-based FER techniques and consider the computational complexity and processing time of these models. Based on our analysis, we describe existing challenges in the use of FSL in FER systems and suggest research directions to resolve them. FER using FSL can be time-efficient and can reduce complexity in many other real-time processing tasks, which makes it an important area for further research.
Malware analysis is essential for detecting and mitigating the effects of malicious software. This study introduces a novel hybrid approach that combines long short-term memory (LSTM) and convolutional neural networks (CNN) to enhance malware analysis. The proposed malware classification method combines image processing and machine learning. Malware binaries are converted into grayscale images and analyzed with CNN-LSTM networks. Dynamic features are extracted, ranked, and reduced via Principal Component Analysis (PCA). Various classifiers are used, with the final classification made by a voting scheme, providing a robust solution for accurate malware family classification. Our approach processes binary code inputs, with the LSTM capturing temporal dependencies and the CNN performing parallel feature extraction. PCA is employed to select prominent features, reducing computation time. The proposed approach was evaluated on a public malware dataset and on malware captured through network traffic, demonstrating state-of-the-art performance in identifying various malware families. It significantly reduces the resources required for manual analysis and improves system security. Our approach achieved high precision, recall, accuracy, and F1 score, outperforming existing methods. Future research directions include improving feature extraction techniques and developing real-time detection models. Overall, the approach offers a powerful malware detection and analysis tool with promising results and potential for further advancements.
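The two simplest steps in this pipeline, turning a binary into a grayscale image and combining classifier outputs by voting, can be illustrated with a minimal sketch. The abstract does not specify the image width, padding rule, or tie-breaking policy, so the function names `bytes_to_grayscale` and `majority_vote`, the default width of 16, the zero padding, and the first-seen tie rule are all assumptions made for illustration, not the authors' implementation.

```python
import math


def bytes_to_grayscale(data: bytes, width: int = 16) -> list[list[int]]:
    """Reshape raw bytes into rows of `width` pixels with values 0-255.

    The last row is zero-padded (an assumed convention) so every row
    has the same length.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    height = math.ceil(len(data) / width)
    padded = data + bytes(height * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


def majority_vote(labels: list[str]) -> str:
    """Return the most frequent label; ties go to the label seen first."""
    counts: dict[str, int] = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    # max() keeps the first maximal key, and dicts preserve insertion order.
    return max(counts, key=counts.get)


if __name__ == "__main__":
    image = bytes_to_grayscale(bytes([0, 255, 16, 32, 7]), width=2)
    print(image)
    print(majority_vote(["family_a", "family_b", "family_b"]))
```

Each byte maps directly to one pixel intensity, so no information is lost in the conversion; only the choice of width changes how byte patterns line up spatially for the CNN.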
With the rapid development of AI technologies, deep learning is widely applied in biomedical data analytics and digital healthcare. However, gaps remain between AI-aided diagnosis and real-world healthcare demands. For example, hemodynamic parameters of the middle cerebral artery (MCA) have significant clinical value for diagnosing adverse perinatal outcomes. Nevertheless, the current measurement procedure is tedious for sonographers. To reduce their workload, we propose MCAS-GP, a deep-learning-empowered framework that tackles Middle Cerebral Artery Segmentation and Gate Proposition. MCAS-GP can automatically segment the region of the MCA and detect the corresponding position of the gate during fetal MCA Doppler assessment. In MCAS-GP, a novel learnable atrous spatial pyramid pooling (LASPP) module is designed to adaptively learn multi-scale features. We also propose a novel evaluation metric, the Affiliation Index, for measuring the effectiveness of the position of the output gate. To evaluate MCAS-GP, we build a large-scale MCA dataset in collaboration with the International Peace Maternity and Child Health Hospital of China Welfare Institute (IPMCH). Extensive experiments on the MCA dataset and two other public surgical datasets demonstrate that MCAS-GP achieves considerable improvements in both accuracy and inference time.
Smart mobility intelligent traffic services have become critical in intelligent transportation systems (ITS). This involves using advanced sensors and controllers and the ability to respond to real-time traffic situat...
In this paper, a unified deep learning framework is developed for high-precision direction-of-arrival (DOA) estimation. Unlike previous methods that split the real and imaginary parts of a complex-valued sparse problem into two separate input channels, a real-valued transformation is adopted to encode the correlation between them. Then, a novel adaptive attention aggregation residual network (A³R-Net) is designed to overcome the challenges posed by low signal-to-noise ratios or small inter-signal angle separations. First, to alleviate the vanishing and exploding gradients caused by network deepening, a residual learning strategy is introduced to construct a deep estimation network that learns the inverse mapping from the array measurement vector to the original spatial spectrum. Second, since feature fusion via simple summation in the shortcut connection ignores inconsistencies in the scale and semantics of features, an adaptive attention aggregation module (A³M) with adaptive channel context aggregators is proposed to capture multi-scale channel contexts and generate element-wise fusion weights. Finally, a dilated convolution with a broader receptive field is embedded into the channel context aggregator to learn wider local cross-channel associations. Extensive simulation results demonstrate the superiority and robustness of the proposed method compared with other state-of-the-art methods.
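The remark that a dilated convolution has a broader receptive field can be made concrete with a small sketch. This is a generic one-dimensional dilated convolution in plain Python, not the channel context aggregator itself; the function names, the "valid" boundary handling, and the example kernel are illustrative assumptions. With kernel size k and dilation d, each output sees (k - 1) * d + 1 input samples while still using only k weights.

```python
def receptive_field(kernel_size: int, dilation: int) -> int:
    """Number of input samples covered by one output of a dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal: list[float], kernel: list[float], dilation: int = 1) -> list[float]:
    """'Valid' 1-D cross-correlation with `dilation - 1` skipped samples between taps."""
    span = receptive_field(len(kernel), dilation)
    return [
        sum(w * signal[start + j * dilation] for j, w in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5, 6, 7]
    k = [1, 0, -1]  # simple difference kernel, chosen only for illustration
    print(dilated_conv1d(x, k, dilation=1))  # compares samples 2 apart
    print(dilated_conv1d(x, k, dilation=2))  # compares samples 4 apart
    print(receptive_field(3, 2))
```

The same three weights cover five samples at dilation 2, which is how a dilated layer widens its context without adding parameters.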
This study investigates the economical scheduling of charging for an electric vehicle (EV) in a typical household with an intelligent charging management system. The problem formulation considers rooftop solar power generation, time-varying domestic energy consumption, real-time pricing of electricity, and user preferences. This task traditionally takes the form of a mixed-integer linear programming (MILP) problem, but we demonstrate its equivalence to linear programming (LP) to reduce computational complexity. The LP problem can be solved to global optimality only if all future information is known, which is unrealistic in practice, so future information is replaced with forecasts. Learning-based methods such as deep reinforcement learning (DRL) eliminate the need for a forecaster and make online decisions rapidly using a learned policy. We propose an approach based on imitation learning that leverages the knowledge of an LP expert by learning from its optimal demonstrations instead of learning from scratch as in DRL. Our approach trains a deep neural network (DNN) based policy efficiently in a supervised manner and incorporates a safety post-processing mechanism that enforces strict constraint satisfaction. Numerical studies on real-world data show that the proposed approach achieves a 23 to 220 times speedup over DRL in DNN training, and that its total electricity cost is far lower than that of DRL and strikingly close to the theoretical lower bound.
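One way to picture a safety post-processing step is as a projection of the policy's proposed action onto the feasible set. The sketch below is an assumption about what such a step could look like, not the mechanism from the paper: the function name `safe_charging_power`, its parameters, and the two constraints it enforces (a charger rate limit and remaining battery headroom) are hypothetical, and it ignores charging efficiency, discharging, and departure deadlines.

```python
def safe_charging_power(
    proposed_kw: float,
    soc_kwh: float,
    capacity_kwh: float,
    max_rate_kw: float,
    step_hours: float = 1.0,
) -> float:
    """Clip a proposed charging power so rate and battery limits always hold.

    All parameters are hypothetical: `proposed_kw` stands for the raw output
    of a learned policy, `soc_kwh` for the current state of charge.
    """
    # Largest power that will not overfill the battery within one time step.
    headroom_kw = max(capacity_kwh - soc_kwh, 0.0) / step_hours
    # Charging only: negative proposals are raised to zero.
    return min(max(proposed_kw, 0.0), max_rate_kw, headroom_kw)


if __name__ == "__main__":
    print(safe_charging_power(9.0, soc_kwh=10.0, capacity_kwh=40.0, max_rate_kw=7.0))
    print(safe_charging_power(5.0, soc_kwh=38.0, capacity_kwh=40.0, max_rate_kw=7.0))
```

Because the clipping is applied after the network's output, constraint satisfaction does not depend on how well the policy was trained.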
Traffic signal light detection poses significant challenges in the intelligent driving sector, where high precision and efficiency are crucial for system safety. Advances in deep learning have led to significant improvements in image object detection. However, existing methods continue to struggle with balancing detection speed and accuracy. We propose a lightweight model for traffic light detection that uses a streamlined backbone network and a Low-GD neck architecture. The model's backbone employs structured reparameterization and lightweight Vision Transformers, using multi-branch and Feed-Forward Network structures to boost informational richness and positional awareness, respectively. The neck network utilizes the Low-GD structure to enhance the aggregation and integration of multi-scale features, reducing information loss during cross-layer exchanges. We introduce a data augmentation strategy using Stable Diffusion to expand our traffic light dataset with complex weather conditions such as fog, rain, and snow, improving model generalization. Our method excels on the YCTL2024 traffic light dataset, achieving a detection speed of 135 FPS and 98.23% accuracy with only 1.3M model parameters. Testing on the Bosch Small Traffic Lights Dataset confirms the method's strong generalization capabilities. These results suggest that the proposed method can provide accurate, real-time traffic light detection.
ISBN: (print) 9798350377040; 9798350377033
This paper proposes an innovative deep-learning-based algorithm for optimizing intelligent image data systems. The algorithm combines image feature extraction, data preprocessing, and efficient optimization strategies to improve the performance and accuracy of image data processing systems. First, a deep CNN architecture is designed to extract important image features so that image recognition and classification tasks can be completed efficiently. Subsequently, a new multi-level data processing method is proposed, which optimizes image data at different levels, thereby improving processing speed and reducing noise interference. In a series of simulation experiments, the image classification accuracy of the algorithm improves by about 12 percentage points, from 85.6% for the traditional method to 97.3%. In addition, processing efficiency improves by about 20%, with data processing time reduced from 2.5 seconds for the traditional method to 2 seconds. Introducing the optimization strategies also improves system stability by about 18%. The optimized algorithm shows significant advantages in both accuracy and efficiency, meeting the needs of efficient intelligent image processing systems.
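The reported figures can be checked with a few lines of arithmetic. The sketch below uses only the four numbers stated in the abstract; the variable names are illustrative. It shows that the accuracy gain is about 12 when measured in percentage points (the relative gain over the baseline is larger), and that the 20% efficiency figure matches the reduction in processing time rather than the increase in throughput.

```python
baseline_acc, proposed_acc = 85.6, 97.3    # classification accuracy, percent
baseline_time, proposed_time = 2.5, 2.0    # processing time, seconds

# Absolute gain in percentage points versus gain relative to the baseline.
point_gain = proposed_acc - baseline_acc
relative_gain = point_gain / baseline_acc * 100

# Reduction in time versus the corresponding increase in throughput.
time_reduction = (baseline_time - proposed_time) / baseline_time * 100
throughput_gain = (baseline_time / proposed_time - 1) * 100

if __name__ == "__main__":
    print(f"accuracy gain: {point_gain:.1f} points ({relative_gain:.1f}% relative)")
    print(f"time reduction: {time_reduction:.0f}%, throughput gain: {throughput_gain:.0f}%")
```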
Online detection of action start is a significant and challenging task that requires prompt identification of action start positions and their corresponding categories within streaming videos. The task is challenging because of data imbalance, similarity in boundary content, and real-time detection requirements. Here, a novel Time-Attentive Fusion Network (TAF-Net) is introduced to improve both action detection accuracy and operational efficiency. The proposed time-attentive fusion module consists of long-term memory attention and a fusion feature learning mechanism and is designed to improve spatial-temporal feature learning. The temporal memory attention mechanism captures more effective temporal dependencies by employing weighted linear attention. The fusion feature learning mechanism combines current-moment action information with historical data, thus enhancing the representation. The proposed method exhibits linear complexity and parallelism, enabling rapid training and inference. The method is evaluated on two challenging datasets, THUMOS'14 and ActivityNet v1.3. The experimental results demonstrate that it significantly outperforms existing state-of-the-art methods in terms of both detection accuracy and inference speed. The model thus learns valuable sequence information for precise detection, while its linear computational complexity and parallelism contribute to faster inference.
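The link between linear attention and linear complexity on streaming input can be sketched generically. The abstract does not define its weighted linear attention, so what follows is a common kernelized formulation with the feature map elu(x) + 1, given purely as an assumed stand-in; every name in the block is hypothetical. The streaming version keeps two running sums, so each new frame costs a fixed amount of work regardless of how much history has been seen. A quadratic reference is included only to show that both compute the same result.

```python
import math


def feature_map(x: list[float]) -> list[float]:
    """elu(x) + 1: strictly positive, so attention weights stay positive."""
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]


def dot(a: list[float], b: list[float]) -> float:
    return sum(p * q for p, q in zip(a, b))


def causal_linear_attention(queries, keys, values):
    """Streaming attention: constant-size state, O(T) total cost over T steps."""
    d_k, d_v = len(keys[0]), len(values[0])
    kv_sum = [[0.0] * d_v for _ in range(d_k)]  # running sum of phi(k) v^T
    k_sum = [0.0] * d_k                          # running sum of phi(k)
    outputs = []
    for q, k, v in zip(queries, keys, values):
        phi_k = feature_map(k)
        for i in range(d_k):
            k_sum[i] += phi_k[i]
            for j in range(d_v):
                kv_sum[i][j] += phi_k[i] * v[j]
        phi_q = feature_map(q)
        norm = dot(phi_q, k_sum)
        outputs.append(
            [sum(phi_q[i] * kv_sum[i][j] for i in range(d_k)) / norm for j in range(d_v)]
        )
    return outputs


def causal_quadratic_reference(queries, keys, values):
    """Same output via explicit pairwise weights over the past, O(T^2) total."""
    outputs = []
    for t, q in enumerate(queries):
        phi_q = feature_map(q)
        weights = [dot(phi_q, feature_map(k)) for k in keys[: t + 1]]
        total = sum(weights)
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values[: t + 1])) / total
             for j in range(len(values[0]))]
        )
    return outputs


if __name__ == "__main__":
    q = [[0.5, -1.0], [1.5, 0.2], [-0.3, 0.8]]
    k = [[0.1, 0.4], [-0.7, 1.2], [0.9, -0.5]]
    v = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
    print(causal_linear_attention(q, k, v))
```

The per-step state has size d_k * d_v + d_k and does not grow with the length of the stream, which is the property that makes this family of attention attractive for online detection.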