Thanks to the emergence and continued development of machine learning, particularly deep learning, research on visual question answering (VQA) has advanced dramatically, with great theoretical rese...
Multi-focus image fusion, the fusion of two or more images focused on different targets into one clear image, is a worthwhile problem in digital image processing. Traditional methods usually operate in the frequency domain or the spatial domain, but they can neither guarantee accurate activity-level measurement for all image details nor settle on ideal fusion rules. Deep learning methods, with their strong feature representation ability, have therefore become the mainstream of multi-focus image fusion. However, most deep learning frameworks to date have not balanced the relationship among the two input features, the shallow features, and the feature fusion. To address these shortcomings, we propose an end-to-end deep network comprising an encoder and a decoder. The encoder is a pseudo-Siamese network: it extracts common and distinct feature sets from the two encoder branches, reuses the shallow features, and finally forms the coding. In the decoder, the coding is analyzed and reduced in dimension to generate a high-quality fused image. Extensive experiments support the proposed network structure. Compared with recent deep-learning-based image fusion methods and traditional multi-focus image fusion methods, our method is slightly better in both objective metrics and subjective visual comparison.
Image compression is a significant challenge in the era of information explosion. Recent studies employing deep learning have demonstrated that learning-based image compression methods outperform traditional codecs. However, an inherent weakness of these methods is their lack of interpretability. Following an analysis of how compression degradation varies across frequency bands, we propose an end-to-end optimized image compression model built on a frequency-oriented transform. The proposed model consists of four components: spatial sampling, frequency-oriented transform, entropy estimation, and frequency-aware fusion. The frequency-oriented transform separates the original image signal into distinct frequency bands, which aligns with a human-interpretable concept. Leveraging the non-overlapping hypothesis, the model enables scalable coding through the selective transmission of arbitrary frequency components. Extensive experiments demonstrate that our model outperforms all traditional codecs, including the next-generation standard H.266/VVC, on the MS-SSIM metric. Moreover, visual analysis tasks (i.e., object detection and semantic segmentation) are conducted to verify that the proposed compression method preserves semantic fidelity in addition to signal-level precision.
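The idea of splitting a signal into non-overlapping frequency bands that can be transmitted selectively can be illustrated with a fixed Fourier-domain decomposition. This is only a sketch under stated assumptions: the abstract describes a learned frequency-oriented transform, whereas the code below uses a plain FFT with radial masks, and the function name `split_frequency_bands` and the parameter `num_bands` are hypothetical.

```python
import numpy as np

def split_frequency_bands(image, num_bands=3):
    """Split a 2-D grayscale image into non-overlapping radial frequency bands.

    Assumption: a fixed FFT stands in for the learned transform in the abstract.
    The returned bands sum back to the input image.
    """
    h, w = image.shape
    spectrum = np.fft.fft2(image)
    # Normalised frequency coordinates for every FFT bin.
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    edges = np.linspace(0.0, radius.max(), num_bands + 1)

    bands = []
    for i in range(num_bands):
        if i == num_bands - 1:
            mask = radius >= edges[i]          # last band keeps the upper edge
        else:
            mask = (radius >= edges[i]) & (radius < edges[i + 1])
        # Masks are symmetric in frequency, so each band is real-valued.
        bands.append(np.real(np.fft.ifft2(spectrum * mask)))
    return bands

def reconstruct(bands, keep):
    """Rebuild an image from a chosen subset of band indices (scalable decoding)."""
    return sum(bands[i] for i in keep)
```

Because the masks partition the frequency plane, `reconstruct(bands, range(len(bands)))` recovers the input, while a smaller `keep` set yields a coarser approximation, which mirrors the scalable-coding idea in spirit only.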
Deep learning advancements have significantly enhanced computer vision applications in precision agriculture. While RGB cameras operating in visible light are affordable, they provide limited information compared to m...
ISBN: (digital) 9798331506520
ISBN: (print) 9798331506537
Object detection based on event vision has been a dynamically growing field in computer vision for the last 16 years. In this work, we create multiple channels from a single event camera and propose an event fusion method (EFM) to enhance object detection in event-based vision systems. Each channel uses a different accumulation buffer to collect events from the event camera. We implement YOLOv7 for object detection, followed by a fusion algorithm. Our multichannel approach outperforms single-channel object detection by 0.7% in mean Average Precision (mAP) for detections overlapping the ground truth with IoU = 0.5.
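As a rough illustration of building several channels from one event stream, the sketch below accumulates the same events over time windows of different lengths and also shows the standard IoU computation used for the 0.5 threshold. The abstract does not specify how the accumulation buffers differ, so the time-window scheme, the `(t, x, y, polarity)` event layout, and the names `accumulate_events`, `build_channels`, and `iou` are all assumptions; the detector and the fusion algorithm are not reproduced here.

```python
import numpy as np

def accumulate_events(events, shape, window):
    """Count events per pixel over the most recent `window` time units.

    events: array with rows (t, x, y, polarity); this layout is an assumption.
    """
    frame = np.zeros(shape, dtype=np.int32)
    t_end = events[:, 0].max()
    recent = events[events[:, 0] > t_end - window]
    ys = recent[:, 2].astype(int)
    xs = recent[:, 1].astype(int)
    np.add.at(frame, (ys, xs), 1)  # handles repeated pixel indices correctly
    return frame

def build_channels(events, shape, windows=(1.0, 5.0, 20.0)):
    """Stack one accumulated frame per window length (hypothetical values)."""
    return np.stack([accumulate_events(events, shape, w) for w in windows])

def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

A detection would count as a match at the threshold quoted above when `iou(predicted_box, ground_truth_box) >= 0.5`.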
Objective: To provide a comprehensive overview of the applications of artificial intelligence (AI) in rhinology, highlight its limitations, and propose strategies for its integration into surgical practice. Data Sources: Medline, Embase, CENTRAL, Ei Compendex, IEEE, and Web of Science. Review Methods: English-language studies from inception until January 2022 that focused on any application of AI in rhinology were included. Study selection was performed independently by 2 authors; discrepancies were resolved by the senior author. Studies were categorized by rhinology theme, and data collection comprised the type of AI utilized, sample size, and outcomes, including accuracy and precision among others. Conclusions: A total of 5435 articles were identified. Following abstract and title screening, 130 articles underwent full-text review, and 59 articles were selected for analysis. Eleven studies were from the gray literature. Articles were stratified into image processing, segmentation, and diagnostics (n = 27); rhinosinusitis classification (n = 14); treatment and disease outcome prediction (n = 8); optimizing surgical navigation and phase assessment (n = 3); robotic surgery (n = 2); olfactory dysfunction (n = 2); and diagnosis of allergic rhinitis (n = 3). Most AI studies were published from 2016 onward (n = 45). Implications for Practice: This state-of-the-art review aimed to highlight the increasing applications of AI in rhinology. Next steps will entail multidisciplinary collaboration to ensure data integrity, ongoing validation of AI algorithms, and integration into clinical practice. Future research should be directed at the interplay of AI with robotics and surgical education.
A cataract is a clouding of the lens in the eye that leads to loss of vision and can progress to blindness if not treated. This paper proposes a new method for automatic cataract detection using color fundus images and d...
Insect image recognition (IIR) is a specialized field in machine learning (ML) and computer vision that aims to automatically recognise and detect insect species using visual data obtained from images. Leve...
The automatic assessment of perceived image quality is crucial in the field of image processing. To this end, we propose an image quality assessment (IQA) method for blurriness. The method extracts both gradient and singular value features instead of the single feature used in traditional IQA algorithms. Because existing public IQA datasets are too small to support deep learning, machine learning was introduced to fuse features from multiple domains, and a new no-reference (NR) IQA method for blurriness, denoted Feature fusion IQA (Ffu-IQA), was proposed. Ffu-IQA uses a probabilistic model to estimate the probability that blur is detected at each edge in the image, and then uses machine learning to aggregate this probability information into an edge quality score. It then uses the singular values obtained by singular value decomposition of the image matrix to calculate a singular value score. Finally, machine learning pooling is used to obtain the final quality score. Ffu-IQA achieves PLCC scores of 0.9570 and 0.9616 on CSIQ and TID2013, respectively, and SROCC scores of 0.9380 and 0.9531, which are better than most traditional image quality assessment methods for blurriness.
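To make the two feature families concrete, the sketch below computes a simple gradient feature and a simple singular value feature and shows that both respond to blur. It is not Ffu-IQA: the probabilistic edge model and the learned pooling from the abstract are not reproduced, and the function names, the choice of mean gradient magnitude, the leading-`k` energy ratio, and the box blur used to simulate defocus are all assumptions made for illustration.

```python
import numpy as np

def gradient_feature(image):
    """Mean gradient magnitude; blur smooths edges, so this tends to drop."""
    gy, gx = np.gradient(image.astype(float))
    return float(np.mean(np.hypot(gx, gy)))

def singular_value_feature(image, k=5):
    """Share of the singular value sum held by the k largest singular values.

    Blur removes fine detail, so the spectrum concentrates in the leading
    values and this ratio tends to rise. The value k=5 is hypothetical.
    """
    s = np.linalg.svd(image.astype(float), compute_uv=False)  # descending order
    total = s.sum()
    return float(s[:k].sum() / total) if total > 0 else 1.0

def box_blur(image, radius=2):
    """Separable box blur, used here only to simulate a blurred input."""
    kernel = np.ones(2 * radius + 1) / (2 * radius + 1)
    smooth = lambda v: np.convolve(v, kernel, mode="same")
    rows = np.apply_along_axis(smooth, 1, image.astype(float))
    return np.apply_along_axis(smooth, 0, rows)
```

In a fuller pipeline, features like these would be fed to a trained regressor, and the predicted scores would be compared with subjective scores using PLCC and SROCC.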