Real-world image super-resolution, a practical image restoration problem that aims to obtain high-quality images from in-the-wild input, has recently received considerable attention because of its broad application potential. Although deep-learning-based methods have achieved promising restoration quality on real-world image super-resolution datasets, they ignore the relationship between L1 and perceptual minimization and crudely adopt auxiliary large-scale datasets for pre-training. In this paper, we discuss the image types within a corrupted image and the properties of perceptual and Euclidean evaluation protocols. We then propose a method, real-World image Super-Resolution by Exclusionary Dual-learning (RWSR-EDL), to address the feature diversity in cooperative learning based on perceptual and L1 losses. Moreover, a noise-guidance data collection strategy is developed to reduce the training time consumed when optimizing over multiple datasets. When an auxiliary dataset is incorporated, RWSR-EDL achieves promising results and avoids any increase in training time by adopting the noise-guidance data collection strategy. Extensive experiments show that RWSR-EDL achieves competitive performance against state-of-the-art methods on four in-the-wild image super-resolution datasets.
Cheese production, a globally cherished culinary tradition, faces challenges in ensuring consistent product quality and production efficiency. The critical phase of determining cutting time during curd formation significantly influences cheese quality and yield. Traditional methods often struggle to address variability in coagulation conditions, particularly in small-scale factories. In this paper, we present several key practical contributions to the field, including the introduction of CM-IDB, the first publicly available image dataset related to the cheese-making process. We also propose an artificial-intelligence-based approach that automates the detection of curd-firming time during cheese production using a combination of computer vision and machine learning techniques. The proposed method offers real-time insights into curd firmness, aiding in predicting optimal cutting times. Experimental results show the effectiveness of integrating sequence information with single-image features, leading to improved classification performance. In particular, deep-learning-based features demonstrate excellent classification capability when integrated with sequence information. The study suggests that the proposed approach is suitable for integration into real-time systems, especially within dairy production, to enhance product quality and production efficiency.
For target detection in complicated backgrounds, a deep-learning-based radar target detection method is proposed to address the high false alarm rate of conventional methods and their difficulty in achieving high-performance detection. Considering the large parameter count and memory footprint of deep-learning-based target detection models, a lightweight target detection method based on an improved YOLOv4-tiny is proposed. The technique applies depthwise separable convolution (DSC) and a bottleneck architecture (BA) to the YOLOv4-tiny network. Moreover, it introduces the convolutional block attention module (CBAM) into the improved feature fusion network. This keeps the network lightweight while preserving detection accuracy. We choose a certain number of pulses from the pulse-compressed radar data for clutter suppression and Doppler processing to obtain range-Doppler (R-D) images. Experiments are run on the R-D two-dimensional echo images, and the results demonstrate that the proposed method can quickly and accurately detect dim radar targets against complicated backgrounds. Compared with other algorithms, our approach is more balanced regarding detection accuracy, model size, and detection speed.
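To make the lightweighting idea concrete, the following sketch compares the weight count of a standard convolution with that of a depthwise separable convolution. The layer sizes (3×3 kernel, 128 input channels, 256 output channels) are illustrative assumptions and are not taken from the paper; bias terms are ignored.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (no bias)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
k, c_in, c_out = 3, 128, 256
std = standard_conv_params(k, c_in, c_out)
dsc = depthwise_separable_params(k, c_in, c_out)
ratio = dsc / std  # equals 1/c_out + 1/k**2

print(f"standard: {std}, depthwise separable: {dsc}, ratio: {ratio:.3f}")
```

For these assumed sizes the separable layer needs roughly 11.5% of the weights of the standard layer, which is the kind of saving that motivates using DSC in a compact detector.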
This paper proposes a deep convolutional neural network (DCNN) to design an accurate active sonar image classifier. In order to have a real-time classifier with low complexity, LeNet-5 is used as the most straightforward deep network with the fewest parameters. To make the training and test phases real-time, the three fully connected layers are replaced by an extreme learning machine (ELM). However, tuning the ELM's input layer parameters is challenging; therefore, this paper tunes them using the grey wolf optimizer (GWO). Unlike other research works, and considering the characteristics of the sonar problem, we model the problem as a multimodal function. Therefore, comprehensive learning concepts and a novel constraint-handling technique are applied to the GWO to address the multimodality and the constraints of the sonar image classification task and to obtain a robust optimizer. Given the vital role of a reliable dataset in deep learning approaches, an operational underwater sonar test scenario is then designed, and an experimental dataset is generated. The designed model is then evaluated on two benchmark active sonar datasets. The results are compared with those of a classic DCNN, the Block-wise Classifier (BWC), and the Matched Subspace classifier with Adaptive Dictionaries (MSAD). The comparison confirms that the designed model, with an average accuracy of 98.69% and a computation time of 883.44 s, achieves the best accuracy and time complexity among the benchmark models.
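As a rough illustration of why an ELM trains quickly, the sketch below fits a basic ELM on a toy XOR problem: the input-layer weights are fixed and only the output weights are solved in closed form. It assumes random input weights and a small ridge term; the abstract instead tunes the input-layer parameters with GWO, which is not reproduced here. All function names and hyperparameters are hypothetical.

```python
import math
import random


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def hidden(x, weights, biases):
    """Hidden-layer activations tanh(W x + b) for one sample."""
    return [math.tanh(sum(w * xi for w, xi in zip(ws, x)) + b)
            for ws, b in zip(weights, biases)]


def train_elm(xs, ys, n_hidden=20, ridge=1e-6, seed=0):
    """Fix random input weights, then solve the output weights in closed form."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-2, 2) for _ in xs[0]] for _ in range(n_hidden)]
    biases = [rng.uniform(-2, 2) for _ in range(n_hidden)]
    h = [hidden(x, weights, biases) for x in xs]
    n = len(xs)
    # Ridge solution in dual form: beta = H^T (H H^T + ridge * I)^-1 y
    gram = [[sum(p * q for p, q in zip(h[i], h[j])) + (ridge if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    alpha = solve_linear(gram, ys)
    beta = [sum(alpha[i] * h[i][k] for i in range(n)) for k in range(n_hidden)]
    return weights, biases, beta


def predict_elm(model, x):
    weights, biases, beta = model
    return sum(b * hk for b, hk in zip(beta, hidden(x, weights, biases)))


# Toy data (XOR), used only to show that the closed-form fit works.
xs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
ys = [0.0, 1.0, 1.0, 0.0]
model = train_elm(xs, ys)
preds = [predict_elm(model, x) for x in xs]
print([round(p, 3) for p in preds])
```

Because no gradient descent is involved, training reduces to one linear solve; an optimizer such as GWO would replace the random draw of `weights` and `biases` with a search over those values.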
Combining dual-energy computed tomography (DECT) with positron emission tomography (PET) offers many potential clinical applications but typically requires expensive hardware upgrades or increases radiation doses on PET/CT scanners due to an extra X-ray CT scan. The recent PET-enabled DECT method allows DECT imaging on PET/CT without requiring a second X-ray CT scan. It combines the already existing X-ray CT image with a 511 keV $\gamma$-ray CT (gCT) image reconstructed from time-of-flight PET emission data. A kernelized framework has been developed for reconstructing gCT images, but this method has not fully exploited the potential of prior knowledge. Deep neural networks may bring the power of deep learning to this application. However, common approaches require a large database for training, which is impractical for a new imaging method like PET-enabled DECT. Here, we propose a single-subject method that uses a neural-network representation as a deep coefficient prior to improve gCT image reconstruction without population-based pre-training. The resulting optimization problem becomes the tomographic estimation of nonlinear neural-network parameters from gCT projection data. This complicated problem can be efficiently solved by utilizing the optimization transfer strategy with quadratic surrogates. Each iteration of the proposed neural optimization transfer algorithm includes a PET activity image update, a gCT image update, and least-squares neural-network learning in the gCT image domain. This algorithm is guaranteed to monotonically increase the data likelihood. Results from computer simulation, real phantom data, and real patient data demonstrate that the proposed method can significantly improve gCT image quality and the consequent multi-material decomposition compared with other methods.
Automatic extraction of the coal flow region on a coal mine belt conveyor plays an important role in coal flow monitoring: accurate real-time monitoring of coal flow enables real-time control of the belt speed, which saves energy and reduces the consumption of the belt conveyor. In this paper, a real-time semantic segmentation network with detail enhancement for pixel-level coal flow monitoring, called DENet, is proposed. First, to ensure the strong real-time performance of the network, a two-branch coding structure is used to extract the semantic information and spatial detail information. Second, to improve the feature representation of spatial detail information, we design the Parameter-free Attention-Guided Enhancement Module (PF-AGEM) and the detail enhancement module (DEM), which fully integrate the semantic information features in the semantic branch into the detail branch and further enhance the detail features. Third, we design the multi-scale channel attention (MSCA) module in the semantic branch to extract the semantic information features of small targets earlier, in the high-resolution feature maps, which solves the problem that these features are easily lost in the low-resolution feature maps. Finally, we propose a selective feature fusion module (FFM) to better fuse semantic information and spatial detail information. Experimental results show that the proposed DENet achieves a mean intersection over union (mIoU) of 96.23% at 87.1 frames per second (FPS) on the Coal Flow Segmentation (CFS) dataset and 74.9% mIoU at 207 FPS on the CamVid dataset, which is competitive with state-of-the-art real-time semantic segmentation models.
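For readers unfamiliar with the mIoU metric reported above, here is a minimal sketch of how it can be computed from flattened label maps. The function name and the tiny example labels are hypothetical; the paper's evaluation code is not available in this listing, and conventions for classes absent from both maps vary between benchmarks.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union across classes present in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes that appear in neither map
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class in range(num_classes) appears in the labels")
    return sum(ious) / len(ious)


# Hypothetical flattened label maps: 0 = background, 1 = coal flow.
target = [0, 0, 1, 1, 1, 0]
pred = [0, 1, 1, 1, 0, 0]
score = mean_iou(pred, target, num_classes=2)
print(score)
```

In this example each class has two correctly labeled pixels out of a union of four, so both per-class IoUs are 0.5 and the mean is 0.5.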
To quickly measure the water absorption (WA) of Recycled Coarse Aggregates (RCA), we use a detection platform designed for RCA to collect two-dimensional images. Using the RCA-net network, we segment the areas of mortar and aggregate on the RCA surface. The segmentation allows us to extract a critical parameter for characterizing the quality of RCA: the proportion of mortar area (PMA). Subsequently, we construct three regression functions between PMA and WA. The experimental results demonstrate that our proposed segmentation method effectively separates both adhered particles of RCA and distinct areas of mortar and aggregate on RCA surfaces. In addition, sprinkling water on RCA surfaces can enhance the accuracy of the segmentation. Notably, we observed a significant linear relationship between PMA and WA in each of the particle size ranges 5-10 mm, 10-20 mm, and 20-31.5 mm. We used those linear relationships and the equivalent mass of RCA detected by the image method in each particle size range to construct the prediction model of water absorption. In validation on 24 groups of RCA, the maximum relative error of the predicted water absorption was 10.6%. The method is fast: detection of 2 kg of RCA takes 3.8 min, with an average computation time per image of only 0.659 s. This efficiency fulfills the requirements for real-time industrial inspection.
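The linear PMA-to-WA relationship described above can be sketched with an ordinary least-squares fit for a single particle size range. The data points below are made up for illustration and lie exactly on a line; they are not measurements from the paper, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def max_relative_error(predicted, measured):
    """Largest |predicted - measured| / measured over all samples."""
    return max(abs(p - m) / m for p, m in zip(predicted, measured))


# Hypothetical PMA (%) and WA (%) pairs for one particle size range.
pma = [10.0, 20.0, 30.0, 40.0]
wa = [2.0, 3.0, 4.0, 5.0]

slope, intercept = fit_line(pma, wa)
predicted = [slope * x + intercept for x in pma]
err = max_relative_error(predicted, wa)
print(f"WA = {slope:.3f} * PMA + {intercept:.3f}, max relative error = {err:.3%}")
```

A full prediction model in the spirit of the abstract would fit one such line per size range and combine the per-range predictions using the equivalent mass of each range; that weighting step is not shown here.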
The increasing number of passengers and services using railways, and the corresponding increase in rail use, has accelerated rail wear and surface defects, which makes rail defect identification an important issue for rail maintenance and monitoring to ensure safe and efficient operation. Traditional visual inspection methods for identifying rail defects are time-consuming, less accurate, and prone to human error. Deep learning has been used to improve railway maintenance and monitoring tasks. This study aims to develop a structured model for detecting railway artifacts and defects by comparing different deep learning models using ultrasonic image data. This research examines whether it is practical to identify rail indications from ultrasonic data using image classification and object detection techniques, and which of these methods performs better. The methodology includes data processing, labeling, and using different convolutional neural networks (CNNs) to develop models for both image classification and object detection. The CNNs for image classification and YOLOv5 for object detection reach 98% and 99% accuracy, respectively. These models can identify rail artifacts efficiently and accurately in real-life scenarios, which can improve automated railway infrastructure monitoring and maintenance.
The large size tolerance and positional differences of burrs in cast iron blanks make it easy for polishing paths programmed by traditional teaching to cause overcutting or undercutting. Rapid and accurate identification of burrs and real-time correction of polishing trajectories are key technical issues for achieving high-precision polishing. Here, a deep-learning-based method for defect detection in cast iron parts and surfaces is proposed. First, a self-made dataset of cast iron parts and surface defects is created and annotated, and a variety of data augmentation methods are used to expand the number of samples in the original dataset, alleviating the problem of small sample size. Then, the coordinate attention mechanism is introduced into the backbone network to allocate more attention to the defect target. Finally, the bidirectional weighted feature pyramid network (BiFPN) is used in the feature fusion network to replace the original path aggregation network, improving the model's ability to fuse features of different sizes. Experimental results show that, compared with the original model, the mean average precision (mAP) is increased by 3.1% and the average precision (AP) in defect classification is increased by 7.6%, with an FPS of 112, achieving accurate and efficient real-time detection of cast iron parts and surface defects. First, this article used multiple data augmentation methods to alleviate the problem of small sample size in casting datasets. Second, an attention mechanism was introduced. Finally, a novel feature fusion layer structure was adopted to improve the original network model. The experiment shows that, compared with the original network model, the improved model proposed here increases the accuracy of casting surface defect classification by 7.6%.
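The "weighted" part of BiFPN can be illustrated with the fast normalized fusion rule from the original BiFPN formulation, in which each input feature map gets a learnable non-negative weight. The sketch below applies that rule to flat lists standing in for feature maps that have already been resized to a common shape. Whether the paper uses exactly this fusion variant is an assumption, and the example weights are fixed by hand rather than learned.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-shaped feature vectors: sum(w_i * x_i) / (eps + sum(w_i)).

    Weights are clipped at zero (ReLU) so each input contributes
    non-negatively to the fused output.
    """
    w = [max(0.0, wi) for wi in weights]
    total = sum(w) + eps
    length = len(features[0])
    return [sum(wi * f[i] for wi, f in zip(w, features)) / total
            for i in range(length)]


# Hypothetical inputs: two feature maps flattened to length-2 vectors.
shallow = [1.0, 2.0]
deep = [3.0, 4.0]

# eps=0.0 is used here only so the arithmetic is easy to check by hand.
fused = fast_normalized_fusion([shallow, deep], weights=[1.0, 3.0], eps=0.0)
print(fused)  # (1*1 + 3*3) / 4 and (1*2 + 3*4) / 4
```

With weights 1 and 3 the deeper map dominates the result; during training these weights would be learned so the network can decide how much each scale contributes.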
Diabetic Retinopathy (DR) is a disease that occurs in the eyes of long-term diabetic patients. It affects the retina and can cause blindness. Therefore, DR has to be detected at an early stage to decrease the risk of blindness. Several researchers have suggested approaches that use deep learning models to detect blood abnormalities (hemorrhages, hard and soft exudates, and micro-aneurysms) in retina images. The limitations of these approaches are performance degradation and the long training time required. To solve this, we suggest a model for automated detection of DR severity using a convolutional neural network (CNN) and residual blocks (DRCNNRB). Deep learning models work effectively when they have been trained on vast datasets. Data augmentation helps to increase the number of training samples and, as a result, avoids the data imbalance problem. In our model, basic data augmentation techniques such as zooming, shearing, rotation, flipping, and rescaling are applied in DRCNNRB to solve the data imbalance problem. Pre-processing techniques are used to enhance the quality of the image. Extensive experimental results on the Diabetic Retinopathy 2015 Data Colored Resized database show that DRCNNRB provides better performance than other state-of-the-art works. Thus, DRCNNRB achieves better efficiency for real-time diagnosis.
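A few of the basic augmentations listed above (flipping, rotation, rescaling) can be sketched on a tiny image stored as nested lists. This is an illustration only: the helper names are hypothetical, the rotation is restricted to 90 degrees, and zooming and shearing are omitted because they need interpolation that a real pipeline would delegate to an image library.

```python
def hflip(img):
    """Mirror each row left to right."""
    return [row[::-1] for row in img]


def vflip(img):
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]


def rot90_clockwise(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def rescale(img, factor=1 / 255):
    """Scale pixel intensities, e.g. from [0, 255] to [0, 1]."""
    return [[px * factor for px in row] for row in img]


def augment(img):
    """Return the original image plus three transformed copies."""
    return [img, hflip(img), vflip(img), rot90_clockwise(img)]


# Hypothetical 2 x 2 grayscale image.
image = [[1, 2],
         [3, 4]]
variants = augment(image)
for v in variants:
    print(v)
```

Applying such transforms mainly to under-represented severity classes is one common way to reduce class imbalance; how the paper distributes augmentation across classes is not stated in the abstract.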