Computer vision, a prominent research area in artificial intelligence, is widely applied in fields closely related to people's livelihood, such as industrial automation, new retail, smart transportation, and security monitoring. Face recognition is a branch of computer vision that integrates neural networks, biology, image signal processing, machine learning, and other fields, promoting research and cross-development among disciplines. This paper therefore focuses on a face recognition method based on a convolutional neural network (CNN). The CNN's "weight sharing" property, which has been widely adopted in image recognition, greatly simplifies the training of large-scale networks. Experiments demonstrate that the proposed face recognition method is effective, with accuracy as high as 98%.
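The "weight sharing" property mentioned above can be illustrated with a small parameter count. The sketch below is not taken from the paper: the 64x64 grayscale input, the 3x3 kernel, and the 32 output units/channels are assumed values chosen only to show why a convolutional layer needs far fewer trainable parameters than a fully connected one.

```python
def dense_params(in_h, in_w, in_c, out_units):
    """Weights plus biases of a fully connected layer on a flattened image."""
    return in_h * in_w * in_c * out_units + out_units


def conv_params(kernel_h, kernel_w, in_c, out_c):
    """Weights plus biases of a convolutional layer.

    The count does not depend on the image size, because each kernel
    is reused (shared) at every spatial position.
    """
    return kernel_h * kernel_w * in_c * out_c + out_c


# Assumed example sizes (not from the paper).
dense = dense_params(64, 64, 1, 32)  # 131104 parameters
conv = conv_params(3, 3, 1, 32)      # 320 parameters
print(dense, conv)
```

Under these assumed sizes the convolutional layer has roughly 400 times fewer parameters, which is the sense in which weight sharing simplifies training.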
ISBN: 9798350374889 (digital)
ISBN: 9798350374896 (print)
The goal of functional error correction is to preserve neural network performance when stored network weights are corrupted by noise. To achieve this goal, a selective protection (SP) scheme was proposed to optimally protect the functionally important bits in binary weight representations in a layer-dependent manner. Although it proved effective in image classification tasks on relatively simple networks such as ResNet-18 and VGG-16, it is inadequate for emerging complex machine learning tasks from the natural language processing and vision-language association domains. To solve this problem, we extend the SP scheme in three directions: task complexity, model complexity, and storage complexity. Extensions to complex natural language and vision-language tasks include text categorization and "zero-shot" textual classification of images. Extensions to more complex models with deeper block structures and attention mechanisms consist of Very Deep Convolutional Neural Network (VDCNN) and Contrastive Language-Image Pre-Training (CLIP) networks. Extensions to more complex storage configurations focus on distributed storage architectures to support model parallelism. Experimental results show that the optimized SP scheme preserves network performance in all of these settings. The results also provide insights into redundancy-performance tradeoffs, generalizability of SP across datasets and tasks, and robustness of partitioned network architectures.
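To see why some bits of a stored weight are "functionally important", the sketch below flips single bits of one weight and compares the resulting errors. It assumes IEEE 754 float32 storage, which the abstract does not specify, and it does not implement the SP scheme itself (the layer-dependent choice of which bits to protect is not described here). The helper `flip_bit` is a hypothetical name.

```python
import struct


def flip_bit(value, bit):
    """Flip one bit (0 = least significant) of a float32 encoding of value."""
    (as_int,) = struct.unpack("<I", struct.pack("<f", value))
    (flipped,) = struct.unpack("<f", struct.pack("<I", as_int ^ (1 << bit)))
    return flipped


weight = 0.5  # exactly representable in float32

# Flipping the lowest mantissa bit barely changes the weight.
low_error = abs(flip_bit(weight, 0) - weight)

# Flipping the highest exponent bit changes it by many orders of magnitude.
high_error = abs(flip_bit(weight, 30) - weight)

print(low_error, high_error)
```

Under this assumption, redundancy spent on the high-order bits buys far more protection than the same redundancy spent on the low-order bits, which is the intuition behind protecting bits selectively.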
Due to the increased use of inconsistent low-grade biogenic solid fuels and simultaneously stricter regulatory limits for biomass combustion, there is a rising demand for suitable methods for online characterization of fuels to allow for optimal plant operation. The aim of this work is to evaluate two camera-based machine-vision approaches, a classification based on conventional feature extraction and a deep learning approach with a convolutional neural network (CNN), for distinguishing six types of biogenic solid fuels and 20 fuel mixtures thereof. A comparison of four machine-learning algorithms applied to classify the samples based on all extracted color and texture features reaches a prediction accuracy of 93.5%. An evaluation of the performance of single features shows that the selected color features (color thresholds and histogram-based features) are more relevant for distinguishing the fuels than the textural features (Haralick features and features extracted from the frequency domain). Despite the small size of the image dataset, the CNN also achieves a good prediction accuracy of 73.7% for the given classification task. Increasing the number of images by fragmentation slightly increases the prediction accuracy of the deep learning approach, while the accuracy of the conventional feature-based approach decreases. Both approaches are suitable for distinguishing six typical biogenic solid fuels and achieve high accuracies for the classification of 26 fuels and mixtures. While the feature-based approach is more accurate for this classification task, the CNN does not require prior feature extraction and would benefit from a larger dataset. These are promising results for implementation as online fuel monitoring in various applications, although further development of robustness and extension to a broader range of fuel mixtures are required.
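As a rough illustration of the histogram-based color features mentioned above, the sketch below builds a normalized per-channel histogram from a list of RGB pixels. The bin count, the pixel values, and the function names are assumptions for illustration; the actual features and thresholds used in the study are not given in the abstract.

```python
def channel_histogram(pixels, channel, bins=4):
    """Normalized histogram of one 8-bit color channel (0 = R, 1 = G, 2 = B)."""
    counts = [0] * bins
    for px in pixels:
        counts[px[channel] * bins // 256] += 1
    total = len(pixels)
    return [c / total for c in counts]


def color_features(pixels, bins=4):
    """Concatenate the R, G, and B histograms into one feature vector."""
    features = []
    for channel in range(3):
        features.extend(channel_histogram(pixels, channel, bins))
    return features


# Four made-up pixels standing in for an image of a fuel sample.
pixels = [(200, 150, 90), (190, 140, 80), (60, 50, 40), (210, 160, 100)]
features = color_features(pixels)
print(features)
```

A feature vector of this kind could then be passed to any conventional classifier; which classifiers the study compared is not stated in the abstract.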
ISBN: 9798350355413 (digital)
ISBN: 9798350355420 (print)
Medical image segmentation plays a key role in the early detection and diagnosis of breast cancer. This paper improves and optimizes the Swin-Unet model for breast cancer image segmentation by combining it with the Multi-Scale Dilated Fusion Attention (MDFA) module and the Simple, Parameter-Free Attention Module (SimAM). The BUSI dataset, which offers abundant breast cancer image samples, is used to evaluate the model in practical applications. The MDFA module enhances the model's ability to capture features at various scales through multi-scale dilated convolutions, overcoming the deficiencies of traditional convolutions when dealing with complex lesion areas. The SimAM module improves the recognition of key features by simulating the activation process of neurons and calculating the similarity between feature points and their expected states, further enhancing the accuracy of the segmentation results. Experimental results indicate that, compared with mainstream models such as FCN, UNet, SegNet, and ENC-Net, the improved Swin-Unet performs significantly better on the BUSI dataset. The MDFA and SimAM modules give the model higher precision and robustness when handling breast cancer images. This research not only validates the effectiveness of the improved model but also provides a new direction for the application of deep learning in future breast cancer image segmentation tasks.
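The parameter-free attention idea behind SimAM can be sketched for a single feature-map channel. The code below follows the commonly published SimAM formulation (an inverse energy term passed through a sigmoid gate); how the paper integrates it into Swin-Unet is not described in the abstract, and the regularizer `lam` and the toy input are assumed values.

```python
import math


def simam(channel, lam=1e-4):
    """Apply SimAM-style attention to one channel given as a 2D list of floats."""
    values = [v for row in channel for v in row]
    n = len(values) - 1  # number of "other" positions in the channel
    mean = sum(values) / len(values)
    sq_dev = [[(v - mean) ** 2 for v in row] for row in channel]
    variance = sum(d for row in sq_dev for d in row) / n

    out = []
    for row, dev_row in zip(channel, sq_dev):
        out_row = []
        for v, d in zip(row, dev_row):
            # Positions that differ strongly from the channel mean get
            # a larger inverse energy and therefore a larger gate.
            inv_energy = d / (4 * (variance + lam)) + 0.5
            gate = 1 / (1 + math.exp(-inv_energy))
            out_row.append(v * gate)
        out.append(out_row)
    return out


# Toy channel with one distinctive activation in the centre.
channel = [[1.0, 1.0, 1.0], [1.0, 9.0, 1.0], [1.0, 1.0, 1.0]]
weighted = simam(channel)
```

No trainable weights are introduced: the gate is computed entirely from the channel's own statistics, which is what "parameter-free" refers to.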
Image dehazing is a critical task in image restoration, aiming to retrieve clear images from hazy scenes. This process is vital for various applications, including machine recognition, security monitoring, and aerial photography. Current dehazing algorithms often encounter challenges in multi-scale feature extraction, detail preservation, effective haze removal, and maintaining color fidelity. To address these limitations, this paper introduces a novel Parallel Image-Dehazing Network (PID-Net). PID-Net combines a Convolutional Neural Network (CNN) for precise local feature extraction with a Vision Transformer (ViT) that captures global contextual information, overcoming the shortcomings of methods relying solely on either local or global features. A multi-scale CNN branch extracts diverse local details through varying receptive fields, thereby enhancing the restoration of fine textures and details. To optimize the ViT component, a lightweight attention mechanism with CNN compensation is integrated, maintaining performance while minimizing the parameter count. Furthermore, a Redundant Feature Filtering Module is incorporated to filter out noise and haze-related artifacts, promoting the learning of subtle details. Extensive experiments on public datasets demonstrate PID-Net's significant superiority over state-of-the-art dehazing algorithms in both quantitative metrics and visual quality.
ISBN: 9798331539948 (digital)
ISBN: 9798331539955 (print)
This study proposes a way to detect vitamin deficiency by combining machine learning and image processing. Computer vision enables the system to recognise visual symptoms of specific vitamin deficiencies. The proposed procedure is divided into key steps, starting with image acquisition, followed by image preprocessing to enhance image quality. A pretrained Convolutional Neural Network (CNN) model then captures the subtle patterns that point to various abnormalities. Finally, these patterns are used for categorisation, which in turn helps to identify specific *** extensive experimentation across a diverse dataset, the system demonstrates remarkable accuracy in detecting deficiency. Its non-invasive nature permits early screening. This shows its potential for widespread implementation and suggests directions for future enhancement, such as dataset expansion and exploration of advanced architectures other than CNNs. With its promising capabilities, this approach represents a significant stride towards enhancing healthcare diagnostics and preventive measures related to vitamin deficiencies.
ISBN: 9798400708268 (print)
This article introduces the design and algorithm system of a one-click measuring platform that can quickly and accurately measure the dimensions of parts. By combining a camera with an algorithm, the one-click detection function lowers measurement time and labor costs in industrial production and increases production efficiency. The platform's algorithm system is based on the Halcon platform and can employ different algorithms depending on the shape, size, and background of the captured images to extract models and highlight data. The article concludes by summarizing the operators used in the design and implementation of the platform's algorithms, including threshold segmentation, binary image processing, grayscale processing, and contrast enhancement.
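The threshold-then-measure idea can be sketched in plain Python. This is only a stand-in for illustration: it does not use Halcon operators, and the threshold level, the toy image, and the `mm_per_pixel` calibration factor are all assumed values.

```python
def threshold(gray, level):
    """Binarize a grayscale image: 1 where the pixel is at least level, else 0."""
    return [[1 if v >= level else 0 for v in row] for row in gray]


def bounding_box_size(mask, mm_per_pixel):
    """Width and height (in mm) of the bounding box around all foreground pixels."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, v in enumerate(row) if v]
    if not rows:
        return (0.0, 0.0)
    width = (max(cols) - min(cols) + 1) * mm_per_pixel
    height = (max(rows) - min(rows) + 1) * mm_per_pixel
    return (width, height)


# Toy 4x5 grayscale image: a bright part on a dark background.
gray = [
    [10, 12, 11, 10, 9],
    [10, 200, 210, 205, 9],
    [11, 198, 207, 202, 10],
    [10, 11, 12, 10, 9],
]
mask = threshold(gray, 128)
size_mm = bounding_box_size(mask, mm_per_pixel=0.5)
print(size_mm)
```

In a real system the calibration factor would come from camera calibration, and sub-pixel edge detection would normally replace the simple bounding box used here.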
ISBN: 9798350390254 (digital)
ISBN: 9798350390261 (print)
Synthetic Aperture Radar (SAR) has a wide range of applications in military and civilian fields. Because targets in SAR images are sensitive to viewing angle, and because single-view recognition neglects the correlation between images from different viewpoints, multi-view recognition methods have received widespread attention. We propose a multi-view rotation double-layer fusion (MVRDF) CNN-LSTM network to fuse the rotation features of multi-view images. It uses three parallel CNNs to extract features from randomly selected multi-view images, then rotates the feature sequence twice to obtain three sets of features. For double-layer fusion, we add a rotation-angle-dimension LSTM layer after an angle-dimension LSTM layer. Experimental results on MSTAR and a measured dataset demonstrate the effectiveness and generalization of the proposed method.
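One plausible reading of "rotates the feature sequence twice to obtain three sets of features" is a cyclic shift of the per-view feature sequence. The sketch below shows that reading only; the abstract does not define the rotation, and the feature labels are placeholders rather than real CNN outputs.

```python
def rotations(sequence):
    """Return every cyclic rotation of a sequence, starting with the original."""
    return [sequence[i:] + sequence[:i] for i in range(len(sequence))]


# Placeholder feature vectors for three views (assumed, not real CNN outputs).
view_features = ["f_view1", "f_view2", "f_view3"]
feature_sets = rotations(view_features)
for feature_set in feature_sets:
    print(feature_set)
```

With three views, two rotations of the original order yield three orderings in total, each of which could be fed to a sequence model such as an LSTM.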
ISBN: 9798350355642 (digital)
ISBN: 9798350355659 (print)
For robot application systems incorporating a dexterous hand, a vision-based robot grasping system is proposed to address the dexterous hand's lack of robustness when grasping objects in fixed attitudes. First, a 6DOF robot grasping system based on machine vision is constructed using a dexterous hand, a depth camera, and a 6DOF collaborative robot, which realizes accurate grasping under vision guidance. Second, to address the vision system's poor localization accuracy, caused by the loss of image information and features due to image noise, occlusion, and complex backgrounds during image processing, a pooling layer and an attention mechanism are introduced to enhance feature extraction. Moreover, an optimized dexterous hand grasping strategy is proposed through exhaustive grasping action design and analysis, which effectively improves the robustness of the system. Experimental results from localization measurements of the test objects show that the accuracy of the target detection model reaches 87%, which is 2.1% higher than the original method, and that the grasping success rate of the robotic system equipped with the dexterous hand and depth camera improves by 3.5%. These results validate the feasibility of the robotic grasping system incorporating dexterous hands in practical applications and show a significant improvement in the robustness of the system.
With advances in computer technology, the demand for digital image processing is very high, and it is used extensively in every sector, such as organizations, business, and medicine. Image segmentation enables us to analyze any ...