Docker image storage systems, like Docker registry, often employ deduplication to reduce storage overhead. Existing deduplication methods for these systems detect redundancy at either layer or file level, each with it...
ISBN (digital): 9798331521349
ISBN (print): 9798331521356
In image processing, "image fusion" is the amalgamation of attributes and requirements from multiple images into a single, more comprehensive representation. Multi-modal medical image fusion is a significant category of image fusion; it entails the integration of medical images acquired from several modalities. This work utilizes computed tomography (CT), positron emission tomography (PET), and magnetic resonance imaging (MRI) as modalities. This study (M3IF-SBTS) aims to construct a multi-modal medical image fusion approach that combines Optimal Sub-band Tree Structuring (SBTS) and Principal Component Analysis (PCA) with MRI and CT images in a manner that maximizes the information content of the fused image. SBTS is an advanced variant of the Discrete Wavelet Transform (DWT) in which the signal is filtered more times. PCA provides dimensionality reduction while retaining the relevant features. The wavelet coefficients are fused using a PCA-based fusion method. The combination of SBTS and PCA provides superior results compared to using SBTS, DWT, or PCA individually, and it improves the overall visual and parametric quality of the fusion results relative to many of the compared methods.
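The PCA-based fusion step can be illustrated with a minimal sketch. It assumes the common PCA fusion rule in which the two coefficient sets are weighted by the normalized components of the principal eigenvector of their 2x2 covariance matrix; the abstract does not state the exact rule, so the function names, the use of absolute values, and the toy coefficient lists below are illustrative assumptions, and the SBTS decomposition itself is not shown.

```python
import math

def pca_fusion_weights(a, b):
    """Weights from the principal eigenvector of the 2x2 covariance of a and b.

    Assumption: weights are the absolute eigenvector components,
    normalized so that they sum to 1.
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    caa = sum((x - mean_a) ** 2 for x in a) / n
    cbb = sum((y - mean_b) ** 2 for y in b) / n
    cab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n

    if caa + cbb == 0.0:
        # Both inputs are constant: fall back to a plain average.
        return 0.5, 0.5
    if abs(cab) < 1e-12:
        # Uncorrelated inputs: the principal axis is the higher-variance one.
        v = (1.0, 0.0) if caa >= cbb else (0.0, 1.0)
    else:
        # Largest eigenvalue of [[caa, cab], [cab, cbb]] and its eigenvector.
        lam = 0.5 * (caa + cbb + math.sqrt((caa - cbb) ** 2 + 4.0 * cab ** 2))
        v = (lam - cbb, cab)

    total = abs(v[0]) + abs(v[1])
    return abs(v[0]) / total, abs(v[1]) / total

def fuse(a, b):
    """Fuse two equally sized coefficient lists with the PCA weights."""
    w_a, w_b = pca_fusion_weights(a, b)
    return [w_a * x + w_b * y for x, y in zip(a, b)]

# Toy stand-ins for one sub-band of CT and MRI wavelet coefficients.
ct_coeffs = [1.0, 2.0, 3.0, 4.0]
mri_coeffs = [2.0, 4.0, 6.0, 8.0]
weights = pca_fusion_weights(ct_coeffs, mri_coeffs)
fused = fuse(ct_coeffs, mri_coeffs)
```

In this sketch the higher-variance input receives the larger weight. In a full pipeline the same rule would be applied per sub-band before the inverse transform.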
ISBN (digital): 9798331521349
ISBN (print): 9798331521356
In this paper, we develop an Attention-based Generative Adversarial Network (AGAN) to augment image data for robust training and efficient processing of hyperspectral images. The AGAN model generates images from the sample images, which helps in training the classifier; in this study, a fundamental classifier, namely a convolutional neural network (CNN), is used. Robust training is conducted to test how accurately instances are detected in the dataset. The simulation shows that, after robust training, the proposed AGAN-CNN attains higher accuracy than the existing methods.
ISBN (digital): 9798331515966
ISBN (print): 9798331515973
Image-text matching is an important problem at the intersection of computer vision and natural language processing. It aims to establish the semantic link between image and text in order to achieve high-quality semantic alignment between the two modalities. However, because of insufficient feature extraction, existing methods cannot fully understand the meaning expressed in an image or a complex narrative in a text. Moreover, due to the essential modal differences between images and texts, how to align their semantic contents effectively and accurately has become the key research question. To solve these problems, this paper proposes a method based on feature enhancement and relationship interaction. When processing images, the proposed method fuses label features, region features, and location features to represent the image. When processing text, a combination of Bi-GRU and a self-attention mechanism is used to represent the text. To further align the semantic content of images and texts accurately, this paper improves two relational interaction mechanisms by identifying connection relationships and learning association relationships, which yields the relation-enhanced embedding. Finally, the similarity of the enhanced embeddings is calculated to judge the degree to which the image and text match. Extensive experiments on the public datasets Flickr30K and MSCOCO demonstrate the effectiveness of our method.
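The final matching step can be sketched as follows. The abstract does not name the similarity measure, so cosine similarity is an assumption here, and the function names and toy embeddings are hypothetical; the enhanced embeddings themselves would come from the feature-enhancement and relationship-interaction stages, which are not reproduced.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equally sized vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_captions(image_embedding, caption_embeddings):
    """Return caption indices ordered from best to worst match for one image."""
    scores = [cosine_similarity(image_embedding, c) for c in caption_embeddings]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

# Toy 2-D embeddings standing in for the relation-enhanced embeddings.
image = [1.0, 0.0]
captions = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
ranking = rank_captions(image, captions)
```

Ranking all candidate captions for an image in this way is what retrieval metrics on Flickr30K and MSCOCO are typically computed from.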
ISBN (digital): 9798350391954
ISBN (print): 9798350391961
To accurately evaluate a patient's condition, medical workers usually need to register multiple pathological images of samples from the lesion site. Computer-assisted registration can effectively improve the efficiency with which doctors analyze pathological images. One of the most advanced methods at present is the Virtual Alignment of Pathology Image Series method, a registration method for multi-stained digital pathology images that combines global and local calculations. However, this method may be biased when processing images with significant angle differences. Based on a detailed analysis of this method, this article proposes an improvement that optimizes the acquisition of mask images for non-rigid registration, enabling the method to obtain mask images more reasonably and to achieve better registration results for images with significant angle differences. This provides a more accurate basis for judgment and helps doctors diagnose and develop treatment plans more accurately.
Dynamic channel pruning is a technique aimed at reducing the theoretical computational complexity and inference latency of convolutional neural networks. Dynamic channel pruning methods introduce complex additional modules that dynamically select channels for each image. Because of these additional modules, dynamic channel pruning methods never achieve the optimal acceleration effect in the real world. To address this problem, we propose Consecutive Dynamic Channel Pruning (CDCP), a novel framework for continuous image processing that unifies almost all dynamic pruning methods. The core idea of CDCP stems from our observation that adjusting the network for every frame in semantically continuous scenes is unnecessary, since adjacent frames often share similar network structures under dynamic channel pruning. CDCP introduces a simple binary classifier to determine whether the network structure needs to be adjusted for a new frame. Our method can also be used for semantically non-continuous image processing tasks, with a slightly lower probability of model reuse. We validate the effectiveness of CDCP on three dynamic channel pruning methods; better acceleration is achieved when they are combined with CDCP on the semantically continuous Waymo and nuScenes datasets and on the semantically discontinuous COCO dataset.
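The reuse decision described above can be sketched as a small gate around any channel selector. This is a sketch under stated assumptions, not the CDCP implementation: `ConsecutivePruner`, `mean_abs_diff_gate`, and `toy_selector` are hypothetical names, the thresholded frame difference is only a stand-in for the learned binary classifier, and comparing each new frame against the frame for which the structure was last computed is also an assumption.

```python
class ConsecutivePruner:
    """Caches a channel mask and recomputes it only when a gate says so."""

    def __init__(self, select_channels, needs_update):
        self.select_channels = select_channels  # expensive per-frame selector
        self.needs_update = needs_update        # cheap binary decision
        self.reference_frame = None
        self.mask = None
        self.recomputations = 0

    def mask_for(self, frame):
        # Recompute on the first frame or when the gate requests an update.
        if self.mask is None or self.needs_update(self.reference_frame, frame):
            self.mask = self.select_channels(frame)
            self.reference_frame = frame
            self.recomputations += 1
        return self.mask

def mean_abs_diff_gate(threshold):
    """Stand-in for the binary classifier: update when frames differ enough."""
    def gate(reference, frame):
        diff = sum(abs(r - f) for r, f in zip(reference, frame)) / len(frame)
        return diff > threshold
    return gate

def toy_selector(frame):
    """Stand-in for a dynamic pruning module: keep channels with high activation."""
    return tuple(value > 0.5 for value in frame)

pruner = ConsecutivePruner(toy_selector, mean_abs_diff_gate(0.1))
frames = [[0.0, 1.0], [0.02, 0.98], [1.0, 0.0]]
masks = [pruner.mask_for(frame) for frame in frames]
```

Here the second frame reuses the first frame's mask, and only the third frame triggers a new selection. The saving depends on the gate being much cheaper than the selector.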
ISBN (print): 9781665435741
Modern synchrotron radiation facilities produce massive numbers of diffraction images, which pose a severe data processing problem because of the high dimensionality of the imaging data. Deep learning methods based on feature recognition and selection have been developed to analyze the data automatically. One crucial step is to use AI to screen out the diffraction images without Bragg spots. This paper proposes a feature distillation based approach for this screening. It helps to reduce the raw data volume by over 40% and greatly alleviates the post-processing workload faced by scientists.
The complex glyph structures and diverse writing styles of ancient Chinese character images lead to suboptimal performance when existing image retrieval methods are applied directly to datasets of these images. To address the impact on retrieval of the complex glyph structure and rich detail characteristic of these images, a multi-layer feature adaptive fusion model for ancient Chinese character image retrieval is designed. First, a local feature extraction module is constructed to obtain low-level feature maps with different spatial receptive fields. Second, to better fuse features of varying scales, an improved ASFF (Adaptively Spatial Feature Fusion) method is employed to build the PSASFF (Pixel Shuffle Adaptively Spatial Feature Fusion) module, which adaptively fuses the multi-layer features extracted by the designed network. Finally, to reduce the influence of writing style on retrieval, the cosine similarity scores of image retrieval before and after fine processing are weighted and used as the final similarity scores. The proposed retrieval method achieves mAP@50 and mAP@30 values of 0.8537 and 0.9576, respectively, on the ancient Chinese character image dataset. Experimental results demonstrate the effectiveness of this method for ancient Chinese character image retrieval.
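The final scoring step, in which similarity scores from before and after fine processing are weighted, can be sketched as a convex combination. The abstract does not give the weighting scheme, so the single `weight` parameter, the function names, and the toy scores are assumptions for illustration only.

```python
def fuse_scores(before, after, weight=0.5):
    """Weighted sum of two score lists; `weight` applies to the `before` scores."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [weight * b + (1.0 - weight) * a for b, a in zip(before, after)]

def top_k(scores, k):
    """Indices of the k highest-scoring gallery images, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]

# Toy cosine similarities of one query against three gallery images.
before_scores = [0.9, 0.6, 0.3]
after_scores = [0.5, 0.8, 0.1]
final_scores = fuse_scores(before_scores, after_scores, weight=0.25)
best = top_k(final_scores, 2)
```

With the weight shifted toward the scores after fine processing, the second gallery image overtakes the first in this toy case, which is the kind of re-ranking the weighting is meant to allow.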
ISBN (digital): 9798350352719
ISBN (print): 9798350352726
Sensitive data leakage has become an urgent problem as more image-based functionalities are developed for vehicles. However, evaluations of on-board video data desensitization are scarce. This research analyzes the on-board video desensitization process, including the image pre-processing stage, the sensitive area localization stage, and the sensitive area desensitization stage. On this basis, the paper presents several evaluation methods and metrics for sensitive target detection performance and for the privacy-utility evaluation of video file metadata and images, so as to provide a reference for related research on the desensitization evaluation of on-board video.
City events are becoming popular and are attracting large numbers of people. This growth creates a need for methods and tools that provide stakeholders with crowd size information for crowd management purposes. Previous works have proposed many methods to count crowds using different data in various contexts, but none of them uses social media images of city events, and no datasets exist to evaluate the effectiveness of such methods. In this study, we investigate how social media images can be used to estimate the crowd size in city events. We construct a social media dataset, compare the effectiveness of face recognition, object recognition, and cascaded methods for crowd size estimation, and investigate the impact of image characteristics on the performance of the selected methods. Results show that object recognition based methods reach the highest accuracy in estimating the crowd size from social media images of city events. We also find that face recognition and object recognition methods are more suitable for social media images taken in parallel view, with selfies showing people's full faces and with the persons in the background at the same distance from the camera. In contrast, cascaded methods are more suitable for images taken from a top view with gatherings distributed in a gradient. The created social media dataset is essential for selecting image characteristics and evaluating the accuracy of people counting methods in an urban event context.