ISBN: (digital) 9798350375190
ISBN: (print) 9798350375206
Globally, diabetic retinopathy (DR) continues to be a major cause of vision impairment. Early detection and diagnosis of DR are essential to preventing vision loss and blindness. However, the traditional manual screening of retinal images is time-consuming and prone to human error. Automated denoising with image processing techniques can increase image quality, make DR lesions more visible, and support correct diagnosis. To improve the quality of DR images, this study investigates four image denoising filters: guided image filtering, Gaussian-bilateral filtering, Gabor filtering, and Haar wavelet filtering. Five datasets are used to evaluate the efficacy of these filters: AGAR300 (IEEE DataPort), Diabetic Retinopathy Detection, STARE, Indian Diabetic Retinopathy Image Dataset, and APTOS 2019 Blindness Detection. The findings demonstrate that noise in DR images can be efficiently reduced by all four filters. Among them, guided image filtering shows the largest overall improvement in lesion visibility and image quality.
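To make the Haar wavelet option concrete, the following is a minimal sketch of one-level Haar denoising with soft thresholding, applied to a single image row for brevity. It is an illustration under stated assumptions, not the study's implementation: the function names, the threshold value, and the use of PSNR as the quality measure are all hypothetical choices, and a real pipeline would work on 2D images with several decomposition levels.

```python
import math

def haar_forward(signal):
    """One-level orthonormal Haar transform of an even-length sequence."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed sample pairs."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * s)
        out.append((a - d) * s)
    return out

def soft_threshold(values, t):
    """Shrink each coefficient toward zero by t (assumed threshold rule)."""
    return [math.copysign(max(abs(v) - t, 0.0), v) for v in values]

def haar_denoise(signal, threshold):
    """Keep the approximation band, shrink the detail band, reconstruct."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, threshold))

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB (higher means closer to reference)."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

# Hypothetical image row: a step edge plus small alternating noise.
clean = [100, 100, 100, 100, 200, 200, 200, 200]
noisy = [104, 96, 103, 97, 204, 196, 203, 197]
denoised = haar_denoise(noisy, threshold=6.0)
print(psnr(clean, noisy), psnr(clean, denoised))
```

In this toy row the noise lives entirely in the detail coefficients, so thresholding removes it while the step edge survives in the approximation band. Real retinal images will not separate this cleanly, which is why the choice of threshold matters.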
Given the high prevalence of stress and anxiety in today's society, there is an urgent need to explore effective methods to help people manage stress. This research aims to develop a relaxation support system usin...
ISBN: (digital) 9798350368741
ISBN: (print) 9798350368758
Instance segmentation for low-light imagery remains largely unexplored due to the challenges imposed by such conditions, for example shot noise due to low photon count, color distortions, and reduced contrast. In this paper, we propose an end-to-end solution to address this challenging task. Our proposed method implements weighted non-local blocks (wNLB) in the feature extractor. This integration enables an inherent denoising process at the feature level. As a result, our method eliminates the need for aligned ground truth images during training, thus supporting training on real-world low-light datasets. We introduce additional learnable weights at each layer in order to enhance the network’s adaptability to real-world noise characteristics, which affect different feature scales in different ways. Experimental results on several object detectors show that the proposed method outperforms the pre-trained networks with an Average Precision (AP) improvement of at least +7.6, with the introduction of wNLB further enhancing AP by up to +1.3.
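The idea of a non-local block scaled by a learnable weight can be sketched as follows. This is a simplified illustration, not the paper's wNLB: it assumes identity projections instead of learned embeddings, a dot-product softmax affinity, and a single scalar `weight` standing in for the per-layer learnable weight. The function name is hypothetical.

```python
import math

def non_local_block(features, weight):
    """Residual non-local aggregation over a list of feature vectors.

    features: list of N positions, each a list of floats (flattened feature map).
    weight:   scalar gate on the aggregated term (assumed stand-in for a
              learnable per-layer weight).
    """
    out = []
    for xi in features:
        # Affinity of position i to every position j (dot product).
        scores = [sum(a * b for a, b in zip(xi, xj)) for xj in features]
        # Numerically stable softmax over all positions.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        attn = [e / z for e in exps]
        # Weighted average of all features: similar positions reinforce
        # each other, which is what suppresses uncorrelated noise.
        agg = [
            sum(attn[j] * features[j][k] for j in range(len(features)))
            for k in range(len(xi))
        ]
        # Residual connection, gated by the weight.
        out.append([x + weight * a for x, a in zip(xi, agg)])
    return out
```

With `weight` set to 0 the block reduces to the identity, so the gate controls how much feature-level averaging each layer applies.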
The purpose of a color constancy algorithm is to eliminate the influence of illumination on the color of objects in the scene, so that the computer has the same color constancy ability as the human visual system. In order to furth...
ISBN: (print) 9781665440578
Depth-image-based rendering (DIBR) is a key method for synthesizing virtual views from multiple RGB-D video streams. A challenging issue inherent in this approach is the disocclusion that occurs as the virtual viewpoint moves away from a reference view. In this work, we present a technique for extracting 3D geometric data, called the disocclusion-reducing geometry, from the input video streams. This auxiliary information, represented as a 3D point cloud, can then be combined easily with a conventional DIBR pipeline to reduce the disoccluded region as much as possible during the view warping process, ultimately decreasing the visual artifacts introduced by the subsequent hole-filling process.
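The disocclusion problem described above can be seen in a minimal forward-warping sketch on a single scanline. This is an assumption-laden toy, not the paper's pipeline: it assumes a purely horizontal camera shift, a disparity of `focal * baseline / depth` rounded to whole pixels, and a z-buffer to resolve collisions. The function name and all parameter values are hypothetical.

```python
def warp_scanline(colors, depths, focal, baseline):
    """Forward-warp one image row to a horizontally shifted virtual view.

    Returns the warped row; entries left as None are disoccluded pixels
    (holes) that no reference pixel maps to.
    """
    width = len(colors)
    warped = [None] * width
    zbuf = [float("inf")] * width
    for x, (color, z) in enumerate(zip(colors, depths)):
        # Nearer pixels shift more than distant ones.
        disparity = focal * baseline / z
        xt = x - int(round(disparity))
        # Keep the nearest surface when several pixels land on one target.
        if 0 <= xt < width and z < zbuf[xt]:
            warped[xt] = color
            zbuf[xt] = z
    return warped

# Hypothetical row: foreground "F" (depth 2) in front of background "B" (depth 10).
colors = ["B", "B", "F", "F", "B", "B", "B", "B"]
depths = [10, 10, 2, 2, 10, 10, 10, 10]
print(warp_scanline(colors, depths, focal=10, baseline=0.4))
# ['F', 'F', None, None, 'B', 'B', 'B', 'B']
```

The two `None` entries are background that was hidden behind the foreground object in the reference view. A conventional pipeline fills them by hole-filling; auxiliary geometry that covers such regions would leave fewer pixels to fill.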
Convolutional neural networks with powerful visual image analysis of deep structures are gaining popularity in many research fields. The main difference in convolutional neural networks compared to other artificial ne...
ISBN: (digital) 9798350368741
ISBN: (print) 9798350368758
Online Handwritten Text Recognition (OLHTR) has gained considerable attention for its diverse range of applications. Current approaches usually treat OLHTR as a sequence recognition task, employing either a single trajectory or image encoder, or multi-stream encoders, combined with a CTC or attention-based recognition decoder. However, these approaches face several drawbacks: 1) single encoders typically focus on either local trajectories or visual regions, lacking the ability to dynamically capture relevant global features in challenging cases; 2) multi-stream encoders, while more comprehensive, suffer from complex structures and increased inference costs. To tackle this, we propose a Collaborative learning-based OLHTR framework, called Col-OLHTR, that learns multimodal features during training while maintaining a single-stream inference process. Col-OLHTR consists of a trajectory encoder, a Point-to-Spatial Alignment (P2SA) module, and an attention-based decoder. The P2SA module is designed to learn image-level spatial features through trajectory-encoded features and 2D rotary position embeddings. During training, an additional image-stream encoder-decoder is collaboratively trained to provide supervision for P2SA features. At inference, the extra streams are discarded, and only the P2SA module is used and merged before the decoder, simplifying the process while preserving high performance. Extensive experimental results on several OLHTR benchmarks demonstrate the state-of-the-art (SOTA) performance, proving the effectiveness and robustness of our design.
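For readers unfamiliar with 2D rotary position embeddings, the following is a generic sketch of one common axial formulation. It is not taken from the Col-OLHTR paper, whose exact P2SA formulation the abstract does not give. It assumes the first half of a feature vector is rotated by the x coordinate and the second half by the y coordinate. The function names and the `base` constant are assumptions.

```python
import math

def rotate_pair(a, b, angle):
    """Rotate the 2D point (a, b) by the given angle in radians."""
    c, s = math.cos(angle), math.sin(angle)
    return a * c - b * s, a * s + b * c

def rope_2d(vec, x, y, base=10000.0):
    """Apply axial 2D rotary embedding to vec (length a multiple of 4).

    The first half of vec is rotated by angles proportional to x,
    the second half by angles proportional to y.
    """
    half = len(vec) // 2
    out = []
    for offset, pos in ((0, x), (half, y)):
        for i in range(0, half, 2):
            freq = base ** (-i / half)  # lower frequency for later pairs
            out.extend(rotate_pair(vec[offset + i], vec[offset + i + 1], pos * freq))
    return out

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Attention scores depend only on the relative offset between positions.
q, k = [1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 2.0, 0.0]
print(dot(rope_2d(q, 3, 5), rope_2d(k, 1, 2)))
print(dot(rope_2d(q, 13, 25), rope_2d(k, 11, 22)))
```

Both printed scores agree up to floating-point error because each rotation cancels against the other except for the relative offset (2, 3), which is the property that makes rotary embeddings useful for encoding spatial layout.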
Low earth orbit (LEO) satellite networks can provide global broadband access services and thus are complementary to current terrestrial mobile communication systems. The limited resources of LEO satellite platform and...
ISBN: (print) 9781665449892
To improve the viewers' quality of experience (QoE) in computer graphics applications, the visual quality assessment (VQA) of 3D meshes is becoming a popular task in the multimedia area. Since 3D meshes are quite sensitive to processing operations such as simplification and compression, many studies concerning the VQA of 3D meshes have been carried out to measure the resulting degradations. However, previous studies mostly utilize full-reference metrics and focus mainly on geometry attributes. In some application scenarios, such as 3D reconstruction, digital entertainment, and medical modeling, the reference 3D mesh is not always available and the color information cannot be ignored. Therefore, in this paper, we propose a no-reference visual quality metric for 3D color meshes, which is based on the statistical parameters and entropy estimated from the probability distributions of curvature, dihedral angles, face area, face angle, and diffuse color information. The performance of the proposed method is validated on a database specially built for VQA of 3D color meshes. Experimental results show that our metric operates stably and achieves the highest correlation with subjective judgement.
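As an illustration of one of the features listed above, the sketch below computes the angle between the normals of adjacent faces and the Shannon entropy of a histogram of such angles. It is a hedged example, not the paper's metric: the dihedral angle convention (angle between face normals), the histogram binning, and all function names are assumptions.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def face_normal(p0, p1, p2):
    """Unit normal of a triangle given its three vertices."""
    n = cross(sub(p1, p0), sub(p2, p0))
    length = math.sqrt(sum(c * c for c in n))
    return tuple(c / length for c in n)

def dihedral_angle(n1, n2):
    """Angle in radians between two unit face normals (assumed convention)."""
    d = sum(a * b for a, b in zip(n1, n2))
    return math.acos(max(-1.0, min(1.0, d)))

def shannon_entropy(values, bins, lo, hi):
    """Entropy in bits of a fixed-width histogram of values on [lo, hi]."""
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    total = len(values)
    return sum((c / total) * math.log2(total / c) for c in counts if c > 0)

# Two faces meeting at a right angle along a shared edge.
n_a = face_normal((0, 0, 0), (1, 0, 0), (0, 1, 0))
n_b = face_normal((0, 0, 0), (0, 1, 0), (0, 0, 1))
print(dihedral_angle(n_a, n_b))
```

A smooth mesh concentrates its angles in a few bins and yields low entropy, while distortions that roughen the surface spread the distribution out. This is the intuition behind using such statistics without a reference mesh.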
Identification and forecasting of weather conditions are important for transportation safety, the environment, and meteorology. Against the background of artificial intelligence, the methods of weather conditions recognition based...