Recently, time series remote sensing images (TSRSIs) have proven to be a significant resource for land use/land cover (LULC) mapping. Deep learning methods handle temporal dependencies well and have shown remarkable advances in this domain. Although deep learning methods have exhibited outstanding performance in classifying TSRSIs, they rely on a sufficient number of labeled time series samples for effective training. Labeling data that cover a wide geographical range and a long time span is highly time-consuming and labor-intensive. Active learning (AL) is a promising method that selects the most informative data for labeling to save human labeling effort. It has been widely applied in the remote sensing community, but not to the classification of TSRSIs. The main challenge of AL in TSRSI classification is dealing with the internal temporal dependencies within TSRSIs and evaluating the informativeness of unlabeled time series data. In this paper, we propose a data-driven active deep learning framework for TSRSI classification to address the problem of limited labeled time series samples. First, a temporal classifier for TSRSI classification tasks is designed. Next, we propose an effective active learning method to select informative time series samples for labeling, which considers both representativeness and uncertainty. For representativeness, we use the k-Shape method to cluster time series data. For uncertainty, we construct an auxiliary deep network to evaluate the uncertainty of unlabeled data. The features with rich temporal information in the classifier's intermediate hidden layers are fed into the auxiliary deep network. Then, we define a new loss function with the aim of improving the deep model's performance. Finally, the proposed method was verified on two TSRSI datasets. The results demonstrate a significant advantage of our method over other approaches to TSRSI classification. On the MUDS dataset, when the initial number of samples was 100
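The selection step described in this abstract, which combines representativeness with uncertainty, can be illustrated with a minimal sketch. It is not the authors' method: prediction entropy stands in for their auxiliary uncertainty network, the cluster labels are assumed to be precomputed (for example by a k-Shape implementation), and the names `entropy` and `select_for_labeling` are hypothetical.

```python
import math
from collections import defaultdict

def entropy(probs):
    """Shannon entropy (natural log) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_for_labeling(probs, cluster_ids, budget):
    """Pick up to `budget` sample indices for labeling.

    probs       -- per-sample class probabilities from the classifier
    cluster_ids -- per-sample cluster label (assumed precomputed, e.g. by k-Shape)

    Clusters are visited round-robin (representativeness) and, within each
    cluster, the highest-entropy sample is taken first (uncertainty).
    Entropy is a stand-in here, not the auxiliary network from the abstract.
    """
    by_cluster = defaultdict(list)
    for idx, (p, c) in enumerate(zip(probs, cluster_ids)):
        by_cluster[c].append((entropy(p), idx))
    # Ascending sort, so pop() returns the most uncertain remaining sample.
    queues = [sorted(v) for _, v in sorted(by_cluster.items())]
    chosen = []
    while len(chosen) < budget and any(queues):
        for q in queues:
            if q and len(chosen) < budget:
                chosen.append(q.pop()[1])
    return chosen
```

Visiting clusters in turn keeps the labeled batch spread across temporal patterns, while the per-cluster ordering spends the budget on the samples the classifier is least sure about.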
Cognitive workload is a key factor in understanding human cognitive performance, especially in scenarios that require intensive information processing. This study introduces an innovative method to estimate cognitive workload using eye-tracking data and proposes a novel deep learning model called BiTCADNet (Bidirectional Temporal Convolutional self-Attention Dense Network). Experiments using the newly created dataset "Cognitive-Eye-Movement" and the publicly available dataset "CL-Drive" show that BiTCADNet significantly outperforms traditional deep learning models and traditional machine learning methods in terms of accuracy, precision, recall, and F1 score. The proposed method provides a more effective way to monitor and evaluate cognitive workload in real time, opening the way for applications in various human-computer interaction environments.
This paper presents a deep learning-based spectral demosaicing technique trained in an unsupervised manner. Many existing deep learning-based techniques rely on supervised learning with synthetic images and often underperform on real-world images, especially as the number of spectral bands increases. This paper presents a comprehensive unsupervised spectral demosaicing (USD) framework based on the characteristics of spectral mosaic images. This framework encompasses a training method, a model structure, a transformation strategy, and a well-fitted model selection strategy. To enable the network to dynamically model spectral correlation while maintaining a compact parameter space, we reduce the complexity and parameter count of the spectral attention module. This is achieved by dividing the spectral attention tensor into spectral attention matrices in the spatial dimension and a spectral attention vector in the channel dimension. This paper also presents Mosaic25, a real 25-band hyperspectral mosaic image dataset featuring various objects, illuminations, and materials for benchmarking purposes. Extensive experiments on both synthetic and real-world datasets demonstrate that the proposed method outperforms conventional unsupervised methods in terms of spatial distortion suppression, spectral fidelity, robustness, and computational cost.
Fabric defect detection has always been a key issue, and its efficiency is positively correlated with productivity. From manual visual methods to machine vision and deep learning-based techniques, a variety of methods have been studied to improve production efficiency and product quality. Although deep learning-based methods have proven to be powerful tools for segmentation, there are still many pressing issues that need to be addressed in practical applications. First, the scarcity of defective samples compared to normal samples can cause data imbalance and thus affect accuracy. Second, high real-time performance is also required in the actual detection process. To overcome these problems, we propose a highly real-time convolutional neural network, named Mobile-deeplab, to implement end-to-end defect segmentation. In addition, we propose a loss function that accounts for the sample imbalance problem in fabric images. We evaluated the performance of the model on two public structured datasets and three self-constructed structured datasets. The experimental results show that the method achieves better segmentation accuracy than other segmentation models, which verifies its effectiveness. In addition, a speed of 87.11 frames per second on 256 x 256 images meets industrial real-time requirements.
The control of the froth flotation process in the mineral industry is a challenging task due to its multiple impacting parameters. Accurate and convenient examination of the concentrate grade is a crucial step in realizing effective and real-time control of the flotation process. The goal of this study is to employ image processing techniques and CNN-based feature extraction combined with machine learning and deep learning to predict the elemental composition of minerals in the flotation froth. A real-world dataset has been collected and preprocessed from a differential flotation circuit at the industrial flotation site based in Guemassa, Morocco. The features extracted from the flotation froth using image processing algorithms include the texture, the bubble size, the velocity, and the color distribution. To predict the mineral concentrate grades, our study includes several supervised machine learning (ML) algorithms, artificial neural networks (ANN), and convolutional neural networks (CNN). The industrial experimental evaluations revealed relevant performance, with an accuracy of up to 0.94. Furthermore, our proposed hybrid method was evaluated in a real flotation process for the Zn, Pb, Fe, and Cu concentrate grades, with a precision error below 4.53. These results demonstrate the significant potential of our proposed online analyzer as an artificial intelligence application in the field of complex polymetallic flotation circuits (Pb, Fe, Cu, Zn).
ISBN: (print) 9798400708473
While Siamese object tracking has witnessed significant advancements, its hard real-time behaviour on embedded devices remains inadequately addressed. In many applications, an embedded implementation should not only have minimal execution latency; this latency should ideally also have zero variance, i.e. be predictable. To bridge this gap, we first analyse the real-time predictability of the components of a state-of-the-art deep-learning-based video object tracking system. Our detailed experiments indicate the superiority of FPGA implementations in terms of hard real-time behaviour, but unveil important time-predictability bottlenecks. Then, we craft a dedicated hardware accelerator specifically for the bottleneck. Our method seamlessly integrates advanced tracker features and greatly improves the tracker's speed and time-predictability on embedded systems. Implemented on a KV260 board, our quantized tracker demonstrates superior performance. These findings spotlight the immense promise of hardware acceleration in real-time object tracking and set a benchmark for forthcoming hardware-software co-design pursuits focused on achieving time-predictable object tracking.
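The notion of time-predictability in this abstract (low latency with near-zero variance) can be made concrete with a small sketch that summarises a list of measured per-frame latencies. The function name `latency_profile` and the choice of statistics are assumptions for illustration only; the abstract does not state how the authors quantify predictability.

```python
import statistics

def latency_profile(latencies_ms):
    """Summarise per-frame latencies (in milliseconds).

    A hard real-time system cares about the worst case and the spread,
    not only the mean: stdev and jitter should both be close to zero.
    """
    return {
        "mean_ms": statistics.fmean(latencies_ms),
        "stdev_ms": statistics.pstdev(latencies_ms),  # population std. dev.
        "worst_ms": max(latencies_ms),
        "jitter_ms": max(latencies_ms) - min(latencies_ms),
    }
```

Two implementations with the same mean latency can differ sharply in `worst_ms` and `jitter_ms`, which is the distinction the abstract draws between fast and predictable execution.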
With the advancement of the wood processing industry, the demand for the detection of surface defects in wood has become increasingly urgent. The application of automated production technology has enhanced the efficiency and precision of wood processing, which can significantly impact product quality and competitiveness. However, current methods for detecting surface defects in wood suffer from issues such as low detection accuracy, high computational complexity, and poor real-time performance. In response to these challenges, this paper proposes a high-precision, lightweight, real-time wood surface defect detection method based on a YOLO model (GBCD-YOLO). Firstly, the Ghost Bottleneck is introduced to improve the computational efficiency and inference speed of deep neural networks. Furthermore, the BiFormer is incorporated in the neck to enhance the performance of natural language processing tasks. Simultaneously, CARAFE is utilized as an upsampling replacement to enhance perceptual and capture abilities for details. In addition, the Dynamic Head is introduced to enhance the method's flexibility and generalization ability, and the loss function is replaced with complete intersection over union (CIoU). The proposed method was evaluated using an optimized dataset, and the YOLOv5s model was chosen as the baseline. The experimental results show that, compared with the original YOLOv5s, the mAP (0.5) improved by 13.45%, reaching 88.72%. The mAP (0.5:0.95) increased by 11.95%, and FPS increased by 6.25%. In addition, the parameter count of the improved model was reduced by 15.49%. These results indicate that the proposed GBCD-YOLO improves the real-time detection performance of wood surface defects.
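For readers unfamiliar with the CIoU loss mentioned in this abstract, the following is a minimal sketch of the standard formulation for a single pair of axis-aligned boxes: IoU, minus a normalised centre-distance penalty, minus an aspect-ratio consistency term. The `(x1, y1, x2, y2)` box layout, the function name `ciou_loss`, and the `eps` guard are assumptions; the abstract does not describe the authors' implementation.

```python
import math

def ciou_loss(box, gt, eps=1e-9):
    """CIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1
    # Plain intersection over union.
    iw = max(0.0, min(x2, gx2) - max(x1, gx1))
    ih = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = iw * ih
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)
    # Squared centre distance, normalised by the squared diagonal
    # of the smallest box enclosing both inputs.
    rho2 = ((x1 + x2 - gx1 - gx2) ** 2 + (y1 + y2 - gy1 - gy2) ** 2) / 4.0
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps
    # Aspect-ratio consistency term and its trade-off weight.
    v = (4.0 / math.pi ** 2) * (
        math.atan(gw / (gh + eps)) - math.atan(w / (h + eps))
    ) ** 2
    alpha = v / (1.0 - iou + v + eps)
    return 1.0 - (iou - rho2 / c2 - alpha * v)
```

Unlike a plain IoU loss, this value still varies for non-overlapping boxes because of the centre-distance term, so it provides a training signal even when the prediction misses the target entirely.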
Consumer demand for automobiles is changing because of the vehicle's dependability and utility, and the superb design and high comfort make the vehicle a wealthy object class. The creation of object classes necessitates the creation of more sophisticated computer vision models. However, the critical issue is image quality, which is determined by lighting conditions, viewing angle, and physical vehicle construction. This work focuses on creating and implementing a deep learning-based traffic analysis system. Using a variety of video feeds and vehicle information, the developed model recognizes, categorizes, and counts vehicles in real-time traffic flow. The dynamic skipping method offered in the developed model speeds up the processing of a lengthy video stream while ensuring that the video picture is delivered accurately to the viewer. In real-time traffic, standard vehicle retrieval may assist in determining the make, model, and year of the vehicle. Previous MobileNet and VGG19 models achieved F-values of 0.81 and 0.91, respectively. However, the proposed solution raises MobileNet's frame rate from 71.2 to 89.17 and VGG19's frame rate from 48.2 to 59.14. The method may be applied to a wide range of applications that require a dedicated zone to monitor real-time data analysis and normal multimedia operations.
Shadows significantly hinder computer vision tasks in outdoor environments, particularly in field robotics, where varying lighting conditions complicate object detection and localization. We present FieldNet, a novel deep learning framework for real-time shadow removal, optimized for resource-constrained hardware. FieldNet introduces a probabilistic enhancement module and a novel loss function to address challenges of inconsistent shadow boundary supervision and artefact generation, achieving enhanced accuracy and simplicity without requiring shadow masks during inference. Trained on a dataset of 10,000 natural images augmented with synthetic shadows, FieldNet outperforms state-of-the-art methods on benchmark datasets (ISTD, ISTD+, SRD), with up to 9x speed improvements (66 FPS on Nvidia 2080Ti) and superior shadow removal quality (PSNR: 38.67, SSIM: 0.991). Real-world case studies in precision agriculture robotics demonstrate the practical impact of FieldNet in enhancing weed detection accuracy. These advancements establish FieldNet as a robust, efficient solution for real-time vision tasks in field robotics and beyond.
Deep-learning-driven medical image segmentation marks a significant milestone in the evolution of intelligent healthcare systems. Despite remarkable accuracy achievements, real-world clinical applications still grapple with complex challenges, particularly in handling multi-scale medical targets. This paper introduces a novel and efficient medical image segmentation network that leverages Transformer technology. The proposed network utilizes the Transformer's global feature extraction capabilities, enriched with spatial context, to substantially elevate segmentation accuracy. Additionally, the fusion encoder, which we build by combining Transformer modules and convolutional structures through feature fusion strategies, improves feature extraction capabilities. Acknowledging the computational demands of Transformer models in practical scenarios, we have meticulously optimized our Transformer architecture. This optimization focuses on reducing parameter complexity and inference latency, tailoring the model to address the typical sample scarcity in medical applications. We evaluated our model on three different medical datasets: the 2018 Lesion Boundary Segmentation Challenge, the 2018 Data Science Bowl Challenge, and the Kvasir-Instrument dataset. Our model demonstrates state-of-the-art performance in both Dice and MIoU metrics, while maintaining robust real-time processing capabilities. Our code will be released at https://***/migouKang/Multi-TranResUnet.
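The two metrics named in this abstract can be illustrated for a single binary mask pair with the sketch below. The flat 0/1 sequence layout, the function name `dice_and_iou`, and the convention for two empty masks are assumptions; MIoU would then be the mean of the IoU values over classes or images, depending on the evaluation protocol, which the abstract does not specify.

```python
def dice_and_iou(pred, target):
    """Dice coefficient and IoU for two binary masks given as flat 0/1 sequences."""
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    p_sum = sum(1 for p in pred if p)
    t_sum = sum(1 for t in target if t)
    union = p_sum + t_sum - inter
    # Convention assumed here: two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0
    dice = 2.0 * inter / (p_sum + t_sum)
    iou = inter / union
    return dice, iou
```

For any single mask pair the two scores are related by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU.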