Total p-norm Variation (TpV) is a well-established technique in image processing, used to denoise and preserve edges. However, the related non-convex minimization is still a challenging task in optimization, both for ...
Aiming at the problems of slow detection speed and low detection accuracy in existing fatigue driving detection algorithms, a fatigue driving detection algorithm based on YOLOv5 is proposed. In order to improve the fe...
In recent years, although the task of fine-grained image classification has achieved remarkable results, these algorithms need to be trained on large datasets in order to obtain good results, otherwise it is easy to c...
ISBN (digital): 9781510662117
ISBN (print): 9781510662100; 9781510662117
The article proposes an approach to improving the quality of data used in various processes based on machine vision systems. It applies a multi-criteria processing method built on a combined criterion to implement edge detection, smoothing, and separation of background and object areas in the image. The method eliminates noise caused by external factors (such as dust and water suspension on the lens or in space). The generated data make it possible to form an adaptive criterion for changing the correction parameters of a non-linear color-balance adjustment in areas of increased detail or in selected masks of changed blocks. The proposed algorithms increase the visibility of small elements and reduce the noise component while preserving object boundaries, and they improve both the accuracy of boundary selection and the visual quality of the data. Effectiveness is evaluated on natural data and expert evaluation results for test images obtained by a machine vision system whose sensor has a resolution of 1024x768 (8-bit, color image, visible range). Images of simple shapes are used as the analyzed objects.
ISBN (print): 9781713899921
A vital and rapidly growing application, remote sensing offers vast yet sparsely labeled, spatially aligned multimodal data; this makes self-supervised learning algorithms invaluable. We present CROMA: a framework that combines contrastive and reconstruction self-supervised objectives to learn rich unimodal and multimodal representations. Our method separately encodes masked-out multispectral optical and synthetic aperture radar samples (aligned in space and time) and performs cross-modal contrastive learning. Another encoder fuses these sensors, producing joint multimodal encodings that are used to predict the masked patches via a lightweight decoder. We show that these objectives are complementary when leveraged on spatially aligned multimodal data. We also introduce X- and 2D-ALiBi, which spatially bias our cross- and self-attention matrices. These strategies improve representations and allow our models to effectively extrapolate to images up to 17.6x larger at test time. CROMA outperforms the current SoTA multispectral model on four classification benchmarks, evaluated with finetuning (↑ 1.8%), linear (↑ 2.4%) and nonlinear (↑ 1.4%) probing, kNN classification (↑ 3.5%), and K-means clustering (↑ 8.4%), and on three segmentation benchmarks (↑ 6.4%). CROMA's rich, optionally multimodal representations can be widely leveraged across remote sensing applications.
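The idea of spatially biasing attention can be illustrated with a minimal sketch. The abstract does not give the exact form of 2D-ALiBi, so the code below assumes a bias equal to a per-head slope times the Euclidean distance between patch positions on a square grid, with slopes borrowed from the geometric schedule of 1D ALiBi. The function names `alibi_slopes` and `alibi_2d_bias` are hypothetical and are not taken from CROMA.

```python
import math


def alibi_slopes(num_heads):
    """Geometric per-head slopes, following the 1D ALiBi schedule (assumed here)."""
    return [2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)]


def alibi_2d_bias(grid_size, num_heads):
    """Return bias[head][query][key] for a grid_size x grid_size patch grid.

    Assumption: the bias is -slope * Euclidean distance between the
    (row, col) positions of the query patch and the key patch.
    """
    coords = [(r, c) for r in range(grid_size) for c in range(grid_size)]
    bias = []
    for slope in alibi_slopes(num_heads):
        head = [
            [-slope * math.hypot(r1 - r2, c1 - c2) for (r2, c2) in coords]
            for (r1, c1) in coords
        ]
        bias.append(head)
    return bias


# The bias would be added to the attention logits before the softmax,
# so distant patches receive lower attention weights.
bias = alibi_2d_bias(grid_size=3, num_heads=2)
```

Because such a bias depends only on relative distances between patches, it can be recomputed for a larger grid without any learned positional parameters, which is one plausible reading of why the models extrapolate to larger images at test time.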
In order to improve the processing quality of traditional Chinese medicine rhubarb and control the processing process of rhubarb in real time, a programmable logic control system with Ethernet and Internet of things r...
ISBN (digital): 9781665460262
ISBN (print): 9781665460262
This paper presents a study of end-to-end methods for predicting autonomous vehicle navigation parameters. Image-based and image-and-LiDAR-point-based end-to-end models have been trained using Nvidia learning architectures as well as DenseNet-169, ResNet-152 and Inception-v4. Various learning parameters for autonomous vehicle navigation, input models, and data pre-processing algorithms (image cropping, noise removal, and semantic segmentation of image data) have been investigated and tested. The best of these, as identified by this investigation, are selected for the main framework of the study. Results reveal that the image-and-LiDAR-point-based method trained with the Nvidia architecture achieves the higher accuracy for both steering angle and speed.
ISBN (print): 9798350366457; 9798350366440
The rapid development of the Internet of Things (IoT) is enabling a wide range of applications in intelligent medical systems. Medical imaging equipment, among other devices, produces privacy-sensitive user information, yet current solutions from academia and industry often neglect the importance of secure communication mechanisms. Open research challenges remain, such as slow real-time processing and poor security, even when cryptography is used. This paper proposes EDBNet, an efficient encryption network based on deep and broad learning to improve patient privacy for medical images. Specifically, a four-layer convolutional neural network extracts the horizontal and vertical factors, and broad learning guides the training model to obtain two feature matrices. The training process includes pre-training and fine-tuning on the open-source COVID-CT-Dataset, enabling dual-stream encryption. To further enhance ciphertext image security in a privacy-protected environment, chaotic cryptography is used to complete the encryption network; it includes scrambling and diffusion combined with the SHA3-256 algorithm. Extensive experiments show that the proposed EDBNet outperforms several state-of-the-art algorithms, achieving an average cipher entropy of 7.9971, an encryption quality of 248, and encryption/decryption times of around 1 second.
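The scrambling-and-diffusion stage can be illustrated with a generic sketch. The abstract does not say which chaotic map is used or how SHA3-256 enters the scheme, so the code below makes its own assumptions: a logistic map (parameter 3.99) seeded from the SHA3-256 digest of a secret key, a permutation for scrambling, and a chained XOR for diffusion. The names `_key_material`, `encrypt`, and `decrypt` are hypothetical; this is not EDBNet's actual scheme and is not meant as production-grade cryptography.

```python
import hashlib


def _key_material(key: bytes, n: int):
    """Derive a permutation, a keystream and an IV from the key (assumed design)."""
    digest = hashlib.sha3_256(key).digest()
    # Map 48 bits of the digest to an initial state strictly inside (0, 1).
    x = (int.from_bytes(digest[:6], "big") + 0.5) / 2 ** 48
    r = 3.99  # logistic-map parameter in the chaotic regime
    for _ in range(100):  # discard the transient
        x = r * x * (1 - x)
    values = []
    for _ in range(2 * n):
        x = r * x * (1 - x)
        values.append(x)
    perm = sorted(range(n), key=lambda i: values[i])       # scrambling order
    keystream = [int(v * 256) % 256 for v in values[n:]]   # diffusion bytes
    return perm, keystream, digest[-1]


def encrypt(pixels, key: bytes):
    """Scramble pixel positions, then diffuse with a chained XOR."""
    perm, keystream, prev = _key_material(key, len(pixels))
    scrambled = [pixels[p] for p in perm]
    cipher = []
    for s, k in zip(scrambled, keystream):
        prev = s ^ k ^ prev  # each cipher byte depends on the previous one
        cipher.append(prev)
    return cipher


def decrypt(cipher, key: bytes):
    """Undo the diffusion, then invert the permutation."""
    perm, keystream, prev = _key_material(key, len(cipher))
    scrambled = []
    for c, k in zip(cipher, keystream):
        scrambled.append(c ^ k ^ prev)
        prev = c
    plain = [0] * len(cipher)
    for i, p in enumerate(perm):
        plain[p] = scrambled[i]
    return plain


pixels = list(range(16))  # a flattened 4x4 8-bit test image
cipher = encrypt(pixels, b"secret key")
restored = decrypt(cipher, b"secret key")
```

Scrambling changes only where pixels sit, while diffusion changes their values and chains each output byte to the previous one, so a small change in the input spreads through the rest of the ciphertext.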
ISBN (digital): 9798350376548
ISBN (print): 9798350376555
To address the incomplete dehazing and detail distortion of existing algorithms, an end-to-end dehazing method, SemanticGridDehazeNet, is proposed; it is driven by semantic features and combines multi-scale feature fusion. The method consists of three modules: preprocessing, backbone, and post-processing. First, preprocessing is performed using a newly designed residual dense fusion block (RDFB) and a convolutional layer. The backbone module is based on an extended GridNet, which contains three processing scales; semantic features extracted with VGG16 Net are added in front of each scale to enhance learning. Each scale contains five RDFB blocks that fuse the feature maps through to the last column. An RDFB post-processing module then improves the output quality. The main innovations are: (1) the RDFB fusion module, which focuses on haze features and dynamically adjusts the attention paid to channel or pixel locations so as to learn and use key information about the haze distribution in the input data more effectively; (2) the integration of semantic features, extracted from VGG16 Net, before each processing scale of the dehazing process. This enhances multi-scale learning and enables a more accurate focus on haze features, improving dehazing effectiveness. Experimental results demonstrate superior performance in target enhancement, robustness, and generalization. The method effectively removes residual haze and addresses detail distortion.
A primary strategy for the energy-efficient operation of commercial office buildings is to offer dynamic building services, including lighting, heating, ventilating, and air conditioning (HVAC). Therefore, it is neces...