Due to differences in the quantity and size of observed targets, hyperspectral images are characterized by class imbalance. The standard training scheme for deep learning classification models optimizes the overall classification error, which may lead to imbalanced per-class performance in hyperspectral image classification frameworks. Therefore, a novel factor annealing decoupling compositional training method is proposed in this paper. Without requiring resampling or reweighting, it implicitly modulates the training process, so that standard models can sufficiently learn the representation of the minority classes and then be trained as robust classifiers. Specifically, the label-distribution-aware margin loss is combined with the error-rate-based cross-entropy loss via a combination factor, which accounts for both representation learning on imbalanced data and the overall performance of the classifier. Then, a factor annealing optimization training scheme is designed to adjust the combination factor, which solves the stage division problem of two-stage decoupling learning. Experimental results on two hyperspectral image datasets demonstrate that, compared with other competing approaches, the proposed method can continuously and stably optimize the model parameters, achieving improvements in class-average metrics and on difficult classes without affecting overall classification performance.
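The combination described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the margin rule follows the commonly used form of the label-distribution-aware margin (LDAM) loss, where the margin of class j is proportional to n_j^(-1/4), and the linear schedule in `annealed_factor`, as well as all function and parameter names, are assumptions made for illustration only. The abstract does not state the actual annealing schedule.

```python
import math

def ldam_margins(class_counts, max_margin=0.5):
    # LDAM margins are proportional to n_j^(-1/4); rescale so the rarest
    # class gets max_margin (max_margin is an assumed hyperparameter).
    raw = [n ** -0.25 for n in class_counts]
    scale = max_margin / max(raw)
    return [r * scale for r in raw]

def cross_entropy(logits, label):
    # Numerically stable -log(softmax(logits)[label]).
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def ldam_loss(logits, label, margins, scale=1.0):
    # Subtract the class margin from the true-class logit, then apply
    # cross-entropy to the (optionally scaled) logits.
    shifted = list(logits)
    shifted[label] -= margins[label]
    return cross_entropy([scale * z for z in shifted], label)

def annealed_factor(epoch, total_epochs):
    # Hypothetical linear schedule: 1.0 in the first epoch (pure LDAM),
    # 0.0 in the last epoch (pure cross-entropy).
    return 1.0 - epoch / max(total_epochs - 1, 1)

def combined_loss(logits, label, margins, epoch, total_epochs):
    # Convex combination of the two losses via the combination factor.
    alpha = annealed_factor(epoch, total_epochs)
    return (alpha * ldam_loss(logits, label, margins)
            + (1.0 - alpha) * cross_entropy(logits, label))
```

Under this assumed schedule, early epochs emphasize the margin loss (representation learning for minority classes) and later epochs shift toward plain cross-entropy (overall classifier performance), so no hard boundary between two training stages has to be chosen.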
We propose a scheme for supervised image classification that uses privileged information, in the form of keypoint annotations for the training data, to learn strong models from small and/or biased training sets. Our main motivation is the recognition of animal species for ecological applications such as biodiversity modelling, which is challenging because of long-tailed species distributions due to rare species, and strong dataset biases such as repetitive scene background in camera traps. To counteract these challenges, we propose a visual attention mechanism that is supervised via keypoint annotations that highlight important object parts. This privileged information, implemented as a novel privileged pooling operation, is only required during training and helps the model to focus on regions that are discriminative. In experiments with three different animal species datasets, we show that deep networks with privileged pooling can use small training sets more efficiently and generalize better.
To address the difficulties of object detection and recognition in remote sensing images caused by high background complexity, large scale variations of targets, and the presence of numerous small objects, an improved method for remote sensing image object detection based on YOLOv7-tiny is proposed. This method combines a loss function based on the normalized Gaussian Wasserstein distance (NWD) with the CIoU loss function to address the sensitivity of IoU-based losses to positional deviation of small objects. The addition of a global attention mechanism (GAM) in the backbone network reduces information diffusion and enhances interaction at the global dimension, mitigating the interference of complex backgrounds in remote sensing images and enabling the model to focus on feature extraction for the desired targets. Finally, the coupled detection head (Coupled Head) of the model is replaced with a decoupled detection head (Decoupled Head), so that the classification and regression tasks are output from different branches, which avoids the decrease in detection accuracy caused by conflicts between classification and regression. On the public DIOR dataset, the method achieved 88.73% accuracy, an improvement of 1.78 percentage points over the 86.95% accuracy of the unimproved method. Furthermore, compared with other researchers' methods tested on DIOR, the proposed method also shows improvement, thus validating its effectiveness.
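Why an NWD term helps with small objects can be seen in a short Python sketch. It is an illustration under stated assumptions, not the paper's code: boxes are given as (cx, cy, w, h), each box is modeled as a 2-D Gaussian with mean (cx, cy) and covariance diag(w²/4, h²/4), the constant `c` is a dataset-dependent normalizer whose value here is assumed, plain IoU stands in for CIoU to keep the example short, and the `weight` parameter and all function names are hypothetical.

```python
import math

def iou(box_a, box_b):
    # Boxes are (cx, cy, w, h); convert to corner coordinates.
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union

def nwd(box_a, box_b, c=12.8):
    # Squared 2-Wasserstein distance between the two box Gaussians has a
    # closed form: squared distance between (cx, cy, w/2, h/2) vectors.
    w2_sq = ((box_a[0] - box_b[0]) ** 2
             + (box_a[1] - box_b[1]) ** 2
             + ((box_a[2] - box_b[2]) / 2) ** 2
             + ((box_a[3] - box_b[3]) / 2) ** 2)
    # Exponential normalization maps the distance into (0, 1].
    return math.exp(-math.sqrt(w2_sq) / c)

def combined_box_loss(box_pred, box_gt, weight=0.5):
    # Hypothetical weighting of the NWD loss and an IoU-style loss
    # (plain IoU replaces CIoU here for brevity).
    return (weight * (1.0 - nwd(box_pred, box_gt))
            + (1.0 - weight) * (1.0 - iou(box_pred, box_gt)))
```

For a 4×4 box shifted by 4 pixels, the IoU drops to zero and gives no graded signal, while the NWD still varies smoothly with the offset, which is the property the combined loss relies on.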
Leveraging visual sensing technologies for the detection and tracking of vehicles represents a critical application domain for unmanned aerial vehicles (UAVs), notably in challenging operational *** study focuses on e...
Deep-learning-based models usually require a large amount of training data to guarantee the effectiveness of the trained model. Generative models are no exception, and sufficient training data are necessary for the diversity of generated images. However, for synthetic aperture radar (SAR) images, data acquisition is expensive. Therefore, SAR image generation from only a few training samples remains a challenging problem. In this article, we propose an attribute-guided generative adversarial network (AGGAN) with an improved episode training strategy for few-shot SAR image generation. First, we design the AGGAN structure, and spectral normalization is used to stabilize training in the few-shot situation. The attribute labels of AGGAN are designed to be the category and aspect angle labels, which are essential information for SAR images. Second, an improved episode training strategy is proposed according to the characteristics of the few-shot generative task, and it can improve the quality of generated images in the few-shot situation. In addition, we explore the effectiveness of the proposed method when using different auxiliary data for training, and we use the Moving and Stationary Target Acquisition and Recognition (MSTAR) benchmark dataset and a simulated SAR dataset for verification. The experimental results show that AGGAN and the proposed improved episode training strategy can generate images of better quality than some existing methods, as verified through visual observation, image similarity measures, and recognition experiments. When the generated images are applied to the 5-shot SAR image recognition problem, the average recognition accuracy can be improved by at least 4%.
To address the low planning accuracy and long planning time of traditional spatial planning methods for urban landscape architecture distribution patterns, a spatial planning method based on an evolutionary algorithm is proposed. First, we acquire urban landscape remote sensing images from ETM+ and Landsat TM/OLI imagery, and use ENVI software to perform geometric correction, image enhancement, and other image processing. Then, we acquire spatial data on the landscape distribution pattern from the urban landscape green space types, patch area, patch number, and other aspects. We then use a differential evolution algorithm to calculate the fitness value of the initialised population and extract landscape features. The optimal solution, which is the optimal spatial planning strategy, is obtained through the three steps of mutation, crossover, and selection. The simulation results show that the proposed method achieves higher precision and shorter planning time in spatial planning of urban landscape architecture distribution patterns.
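The mutation, crossover, and selection steps named in this abstract can be sketched in Python as a generic differential evolution loop. This is a textbook DE/rand/1/bin variant that minimizes an arbitrary fitness function, not the paper's planning model: the landscape-specific fitness is not given in the abstract, so a toy sphere function is assumed in the usage note, and the parameter names and default values (`f`, `cr`, `pop_size`, `generations`, `seed`) are illustrative choices.

```python
import random

def differential_evolution(fitness, bounds, pop_size=20, f=0.8, cr=0.9,
                           generations=200, seed=0):
    rng = random.Random(seed)
    dim = len(bounds)
    # Initialise the population uniformly within the bounds and score it.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [fitness(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Mutation (DE/rand/1): combine three distinct other individuals.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            mutant = [pop[a][k] + f * (pop[b][k] - pop[c][k])
                      for k in range(dim)]
            # Keep the mutant inside the search bounds.
            mutant = [min(max(v, lo), hi)
                      for v, (lo, hi) in zip(mutant, bounds)]
            # Crossover (binomial): take each gene from the mutant with
            # probability cr; index j_rand always comes from the mutant.
            j_rand = rng.randrange(dim)
            trial = [mutant[k] if (rng.random() < cr or k == j_rand)
                     else pop[i][k] for k in range(dim)]
            # Selection (greedy): keep the trial only if it is no worse.
            trial_score = fitness(trial)
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]
```

As a usage example, `differential_evolution(lambda x: sum(v * v for v in x), [(-5.0, 5.0)] * 2)` searches for the minimum of a two-dimensional sphere function; in a planning setting the fitness would instead score a candidate spatial layout.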
Visual impairment is one of the most significant challenges facing humanity, especially in an era where information is frequently conveyed through text rather than voice. To address this, the proposed system is designed to assist individuals with visual impairments. This paper presents the development of a real-time Text-to-Speech (TTS) embedded system based on the Raspberry Pi 4. Our system incorporates a novel approach to enhance the accuracy of text recognition using Optical Character Recognition (OCR) from images. Specifically, a series of preprocessing steps are employed, selected dynamically by a decision-making process based on the content of the image. The image processing is handled using OpenCV2, while the conversion of text to speech is achieved through the pyttsx3 Python library. The entire system is implemented and tested on a Raspberry Pi 4, connected to a USB Full HD camera for high-resolution image acquisition, and controlled via the Traffic HAT-LED module. Experimental results demonstrate that our system achieves a minimum accuracy of 88.33% in text recognition from images.
Due to the advantages of high throughput, low latency, and low power consumption, optical neural networks hold great promise in addressing the challenges of energy consumption and computational efficiency faced by cur...
Coastal terrain identification is of great significance for coastal development activities and coastal terrain surveys in overseas areas. However, due to the complex characteristics of coastal features, the use of r...
The information conveyed through facial expressions accounts for a large proportion of the total information and can effectively express people's intentions and emotions. Facial expression recognition has laid the...