ISBN (Print): 9781665450928
One of the most important problems faced by broadcasters is the unauthorized use of their images by third parties or organizations, which must be detected within large-scale databases containing hundreds of thousands of images. For this reason, efficient and effective image retrieval, whose objective is to find the images most similar to a given test image, is essential. In addition, test images often contain text, and the presence of this text alongside the visual content complicates the search process. In this paper, we present an image retrieval framework based on a bag of visual words, an approach that has been shown to be effective in the literature. A convolutional neural network model is used to parse the text in the images. Experiments demonstrate the efficacy of this model on a large database.
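As a rough illustration of the bag-of-visual-words retrieval pipeline the abstract refers to, the sketch below builds a small visual vocabulary and ranks database images by histogram similarity. The choice of ORB descriptors, the 256-word codebook, and cosine similarity are assumptions made for illustration; the paper's CNN-based text branch is not reproduced.

```python
# Minimal bag-of-visual-words retrieval sketch (not the paper's exact pipeline).
# Assumptions: ORB local descriptors, a 256-word k-means codebook, and cosine
# similarity over L2-normalised histograms; the CNN text branch is omitted.
import cv2
import numpy as np
from sklearn.cluster import KMeans

def orb_descriptors(path, n_features=500):
    """Detect ORB keypoints and return their descriptors (or an empty array)."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    orb = cv2.ORB_create(nfeatures=n_features)
    _, desc = orb.detectAndCompute(img, None)
    return desc if desc is not None else np.zeros((0, 32), dtype=np.uint8)

def build_codebook(image_paths, n_words=256):
    """Cluster all training descriptors into a visual vocabulary."""
    all_desc = np.vstack([orb_descriptors(p) for p in image_paths]).astype(np.float32)
    return KMeans(n_clusters=n_words, n_init=4, random_state=0).fit(all_desc)

def bovw_histogram(path, codebook):
    """Encode one image as a normalised histogram of visual-word occurrences."""
    desc = orb_descriptors(path).astype(np.float32)
    if desc.shape[0] == 0:
        return np.zeros(codebook.n_clusters, dtype=np.float32)
    words = codebook.predict(desc)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(np.float32)
    return hist / (np.linalg.norm(hist) + 1e-8)

def retrieve(query_path, db_paths, codebook, top_k=5):
    """Rank database images by cosine similarity to the query histogram."""
    q = bovw_histogram(query_path, codebook)
    db = np.stack([bovw_histogram(p, codebook) for p in db_paths])
    order = np.argsort(-db @ q)[:top_k]
    return [db_paths[i] for i in order]
```

In a large-scale setting the codebook would be trained once on a representative subset and the database histograms indexed offline, so that each query only costs one descriptor extraction plus a nearest-neighbour search.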
ISBN (Print): 9781665475921
Fundus image quality assessment (IQA) is essential for controlling the quality of retinal imaging and guaranteeing the reliability of diagnoses by ophthalmologists. Existing fundus IQA methods mainly exploit local information from convolutional neural networks (CNNs) to capture local distortions, while ignoring global distortions. In this paper, we propose a novel multi-information aggregation network, termed MA-Net, for fundus IQA that extracts both local and global information. Specifically, MA-Net adopts an asymmetric dual-branch structure. For an input image, it uses ResNet50 and a vision transformer (ViT) to obtain local and global representations from the upper and lower branches, respectively. In addition, MA-Net separately feeds different images into the two branches to rank their quality, supplementing the feature representations. Thanks to this exploration of intra- and inter-class information between images, MA-Net is well suited to the fundus IQA task. Experimental results on the EyeQ dataset show that MA-Net outperforms the baselines (i.e., ResNet50 and ViT) by 3.06% and 7.61% in accuracy, respectively, and is superior to mainstream methods.
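The abstract describes an asymmetric dual-branch design pairing ResNet50 with a ViT. The sketch below shows one plausible way to wire such a model in PyTorch, assuming torchvision and timm backbones, plain feature concatenation, and three EyeQ quality grades; the paper's ranking supervision and aggregation details are not given in the abstract and are omitted here.

```python
# Rough sketch of an asymmetric dual-branch IQA model in the spirit of MA-Net.
# Assumptions: torchvision ResNet50 (local branch), timm ViT (global branch),
# feature concatenation, and 3 quality grades; not the paper's exact design.
import torch
import torch.nn as nn
import timm
from torchvision.models import resnet50

class DualBranchIQA(nn.Module):
    def __init__(self, num_classes=3):
        super().__init__()
        self.local_branch = resnet50(weights=None)      # CNN branch: local distortions
        self.local_branch.fc = nn.Identity()            # expose 2048-d pooled features
        self.global_branch = timm.create_model(         # ViT branch: global distortions
            "vit_base_patch16_224", pretrained=False, num_classes=0)  # 768-d features
        self.head = nn.Linear(2048 + 768, num_classes)  # fuse both views for grading

    def forward(self, x):
        local_feat = self.local_branch(x)               # (B, 2048)
        global_feat = self.global_branch(x)             # (B, 768)
        return self.head(torch.cat([local_feat, global_feat], dim=1))

model = DualBranchIQA()
logits = model(torch.randn(2, 3, 224, 224))             # (2, 3) quality-grade logits
```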
This paper investigates image communications in two-hop wireless relay networks (TH-WRNs), where a source node sends images to a destination node with the assistance of a relay node via two hops, i.e., source-to-rel...
Image captioning models are a type of natural language processing (NLP) model designed to generate textual descriptions of images. These models are trained on large datasets of images and captions...
ISBN (Print): 9781728198354
Visual camouflage is an effective means of protecting valuable assets that are vulnerable to theft, espionage, or other forms of malicious activity. To overcome the limitation of standardized camouflage patterns in certain environments, an innovative approach is needed that adapts the camouflage pattern to the specific surroundings of the asset to be concealed. In this paper, we present a novel camouflage image generation method whose output adapts to the circumstances around the asset to be concealed by using style transfer. In order to remove the influence of surrounding stuff and objects that are not advantageous for camouflage, we propose a novel mechanism that introduces content-aware information into the calculation of the style representation. Experimental results in various situations, including a snowy natural scene, show that the proposed method provides excellent adaptive camouflage while effectively suppressing conspicuous elements in the surroundings.
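The content-aware style mechanism is not detailed in the abstract; as one hedged reading, the sketch below computes Gram-matrix style representations over VGG-19 features while down-weighting regions marked by a user-supplied mask as unhelpful for camouflage. The layer choice, masking scheme, and normalization are assumptions for illustration only, not the paper's mechanism.

```python
# Sketch of a masked (content-aware) Gram-matrix style representation.
# Assumptions: VGG-19 features as in classic neural style transfer and a
# user-supplied float mask (B, 1, H, W) that zeroes out surrounding regions
# unhelpful for camouflage; this is a generic stand-in, not the paper's method.
import torch
import torch.nn.functional as F
from torchvision.models import vgg19

vgg_features = vgg19(weights=None).features.eval()
STYLE_LAYERS = {1, 6, 11, 20, 29}  # relu1_1 .. relu5_1 indices in vgg19.features

def masked_gram(feat, mask):
    """Gram matrix of feature maps, with masked-out pixels contributing zero."""
    b, c, h, w = feat.shape
    m = F.interpolate(mask, size=(h, w), mode="nearest")   # mask at feature resolution
    f = (feat * m).reshape(b, c, h * w)                    # suppress unwanted regions
    denom = m.reshape(b, 1, h * w).sum(dim=-1).clamp(min=1.0)
    return f @ f.transpose(1, 2) / denom.unsqueeze(-1)     # (b, c, c), mask-normalised

def style_representation(image, mask):
    """Collect masked Gram matrices at the selected VGG layers."""
    grams, x = [], image
    for i, layer in enumerate(vgg_features):
        x = layer(x)
        if i in STYLE_LAYERS:
            grams.append(masked_gram(x, mask))
        if i == max(STYLE_LAYERS):
            break
    return grams
```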
In this paper, we propose a method to improve the accuracy of depth map estimation from multi-view images using Neural Radiance Fields (NeRF). A depth map can be estimated from multi-view images using Multi-View Stere...
ISBN (Print): 9783031640636; 9783031640643
In the era of rapidly advancing technology, the integration of computer vision and natural language processing has emerged as a pivotal area of research, with deep learning playing a central role. The task of generating descriptive textual captions for images is known as image captioning. It is essential for enhancing accessibility, aiding visually impaired individuals, and improving human-computer interaction by providing meaningful context to visual content. Generating relevant descriptions for high-level image semantics involves not only recognizing objects and scenes but also analyzing their states, attributes, and relationships. This paper investigates the synergy of Convolutional Neural Networks (CNNs) for effective image feature extraction and Long Short-Term Memory (LSTM) networks for capturing sequential dependencies in order to generate descriptive and coherent textual captions. The resulting model is shown to produce precise and contextually relevant descriptions for a variety of images.
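A minimal CNN-encoder / LSTM-decoder captioning model of the kind discussed above can be sketched as follows; the ResNet50 backbone, embedding sizes, and teacher-forced decoding are illustrative assumptions rather than the paper's exact configuration.

```python
# Minimal CNN-encoder / LSTM-decoder captioning sketch in PyTorch.
# Assumptions: ResNet50 image encoder, teacher-forced LSTM decoding, and
# illustrative vocabulary/embedding sizes; beam search and training are omitted.
import torch
import torch.nn as nn
from torchvision.models import resnet50

class CaptionModel(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=256, hidden_dim=512):
        super().__init__()
        self.encoder = resnet50(weights=None)
        self.encoder.fc = nn.Linear(2048, embed_dim)          # project image to embed_dim
        self.embed = nn.Embedding(vocab_size, embed_dim)      # word embeddings
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)          # next-word logits

    def forward(self, images, captions):
        """Teacher forcing: the image feature is prepended to caption embeddings."""
        img_feat = self.encoder(images).unsqueeze(1)          # (B, 1, E)
        word_emb = self.embed(captions[:, :-1])               # shift right for prediction
        seq = torch.cat([img_feat, word_emb], dim=1)          # (B, T, E)
        hidden, _ = self.lstm(seq)
        return self.out(hidden)                               # (B, T, vocab_size)

model = CaptionModel()
logits = model(torch.randn(2, 3, 224, 224), torch.randint(0, 10000, (2, 12)))
```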
Virtual reality (VR) conferencing, as a typical social VR application, has gained popularity in recent years. It offers users in different locations a fully immersive experience and a sense of togetherness. Howe...
ISBN (Print): 9781665475921
Sea fog recognition is a challenging and significant semantic segmentation task in remote sensing imagery. Fully supervised learning relies on pixel-level labels, which are labor-intensive and time-consuming to produce. Moreover, it is impossible to accurately annotate every pixel of the sea fog region because the human eye has only a limited ability to distinguish between low clouds and sea fog. In this paper, we propose a novel point-based annotation approach for weakly supervised semantic segmentation that uses International Comprehensive Ocean-Atmosphere Data Set (ICOADS) visibility data as auxiliary information. It needs only a few definite points for both foreground and background, which significantly reduces the manual annotation cost. We conduct extensive experiments on Himawari-8 satellite remote sensing images to demonstrate the effectiveness of our annotation method. The mean intersection over union (mIoU) and overall recognition accuracy of our annotation method reach 82.72% and 95.18%, respectively. Compared with the fully supervised method, the accuracy and the recognition rate of the sea fog area are improved by up to 7.69% and 9.69%, respectively.
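One common way to train a segmentation network from such sparse point labels is a partial cross-entropy loss that ignores every unannotated pixel; the sketch below illustrates this under the assumption of a 255 ignore label and a binary background/sea-fog task, and does not include the ICOADS auxiliary supervision described in the abstract.

```python
# Sketch of point-supervised segmentation training: cross-entropy is evaluated
# only at the handful of annotated foreground/background points, all other
# pixels are ignored. The 255 ignore label and binary task are assumptions.
import torch
import torch.nn as nn

IGNORE = 255  # pixels without a point annotation

def point_supervised_loss(logits, point_labels):
    """logits: (B, C, H, W); point_labels: (B, H, W) with IGNORE almost everywhere."""
    criterion = nn.CrossEntropyLoss(ignore_index=IGNORE)
    return criterion(logits, point_labels)

# Toy usage: 2 classes (background / sea fog), a few labelled points per image.
logits = torch.randn(1, 2, 64, 64, requires_grad=True)
labels = torch.full((1, 64, 64), IGNORE, dtype=torch.long)
labels[0, 10, 10] = 1   # a definite sea-fog point
labels[0, 50, 50] = 0   # a definite background point
loss = point_supervised_loss(logits, labels)
loss.backward()
```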
Industrial robot visual servo image processing requires highly autonomous and intelligent robotic manipulators, with the goal of performing manipulation tasks independently without human intervention. However, limit...