In sentence similarity research methods, sentence similarity is often calculated from semantic aspects; however, the influence of other features is ignored. For example, the influence of sentence syntactic structure a...
ISBN: (print) 9781665435741
Domain adaptation for semantic segmentation is of vital significance because it enables effective knowledge transfer from a labeled source domain (i.e., synthetic data) to an unlabeled target domain (i.e., real images), where no effort is devoted to annotating target samples. Prior domain adaptation methods are mainly based on image-to-image translation models that minimize differences in image conditions between the source and target domains. However, there is no guarantee that feature representations from different classes in the target domain can be well separated, resulting in poorly discriminative representations. In this paper, we propose a unified learning pipeline, called Image Translation and Representation Alignment (ITRA), for domain adaptation of segmentation. Specifically, it first aligns an image in the source domain with a reference image in the target domain using an image style transfer technique (e.g., CycleGAN); then a novel pixel-centroid triplet loss is designed to explicitly minimize the intra-class feature variance and maximize the inter-class feature margin. Once the style transfer is finished by the former step, the latter one is easy to learn and further decreases the domain shift. Extensive experiments demonstrate that the proposed pipeline facilitates both image translation and representation alignment and significantly outperforms previous methods in both the GTA5 -> Cityscapes and SYNTHIA -> Cityscapes scenarios.
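The abstract does not give the exact form of the pixel-centroid triplet loss, so the following is only a minimal sketch under stated assumptions: each pixel feature is pulled toward the centroid of its own class and pushed away from the nearest other-class centroid by a hinge with a margin. The function names, the Euclidean distance, and the `margin` parameter are illustrative choices, not the authors' definitions.

```python
import math

def class_centroids(features, labels):
    """Mean feature vector per class (features: list of vectors, labels: class ids)."""
    sums, counts = {}, {}
    for feat, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(feat))
        for i, value in enumerate(feat):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def pixel_centroid_triplet_loss(features, labels, margin=1.0):
    """Assumed hinge form: max(0, d(pixel, own centroid) - d(pixel, nearest other centroid) + margin)."""
    cents = class_centroids(features, labels)
    if len(cents) < 2:
        return 0.0  # no negative centroid available
    total = 0.0
    for feat, label in zip(features, labels):
        d_pos = euclidean(feat, cents[label])
        d_neg = min(euclidean(feat, c) for k, c in cents.items() if k != label)
        total += max(0.0, d_pos - d_neg + margin)
    return total / len(features)
```

Well-separated classes give a loss of zero, while classes whose centroids are closer than the margin are penalized, which matches the stated goal of small intra-class variance and a large inter-class margin.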
This paper describes a software implementation of a fast distributed scatterer search algorithm for the problem of displacement velocity calculation based on the Apache Spark platform. A complete scheme for calculating displacement velocities by the persistent scatterer method is considered. The proposed algorithm is integrated into the scheme after the stage of subpixel-accuracy alignment of a stack of time-series images. The search for distributed scatterers is carried out independently in sliding windows over the entire area of the image. The presence of distributed scatterers is determined based on the assumption that pairs of samples in the window, which are composed of vectors of complex pixel values in each of the N images, are homogeneous. This assumption is considered valid when the Kolmogorov-Smirnov criterion is fulfilled for each pair. To estimate the phases of homogeneous pixels, a maximization problem is solved. It is shown that the proposed algorithm is not iterative and can be implemented in the framework of the parallel computing paradigm. To enable distributed in-memory processing of radar data arrays (from 60 images) across many physical nodes in a network environment, we use the Apache Spark parallel processing platform. In this case, the time it takes to find distributed scatterers is reduced by a factor of 10 on average compared to a single-processor implementation of the algorithm. The comparative results of testing the computing system on a demo cluster are presented. The algorithm is implemented in Python with a detailed description of the objects and methods of the algorithm.
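The homogeneity check for a pair of pixels can be sketched as a two-sample Kolmogorov-Smirnov test. This is an illustration under assumptions, not the paper's code: it compares the amplitudes of the complex values across the image stack, uses the standard asymptotic critical value, and the function names and the `alpha` parameter are hypothetical.

```python
import math

def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

def are_homogeneous(pixel_a, pixel_b, alpha=0.05):
    """pixel_a, pixel_b: complex values of two pixels across the N images.

    Assumption: amplitudes are compared, and the asymptotic critical value
    sqrt(-ln(alpha / 2) / 2) * sqrt((n + m) / (n * m)) is used.
    """
    amp_a = [abs(z) for z in pixel_a]
    amp_b = [abs(z) for z in pixel_b]
    n, m = len(amp_a), len(amp_b)
    critical = math.sqrt(-math.log(alpha / 2) / 2) * math.sqrt((n + m) / (n * m))
    return ks_statistic(amp_a, amp_b) <= critical
```

Because each pair is tested independently of every other pair, the test maps naturally onto per-window parallel tasks, which is consistent with the non-iterative, parallel design described above.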
B-mode ultrasound tongue imaging is a non-invasive and real-time method for visualizing vocal tract deformation. However, accurately extracting the tongue's surface contour remains a significant challenge due to t...
ISBN: (digital) 9798331509712
ISBN: (print) 9798331509729
3D Gaussian Splatting (3DGS) has recently gained increasing attention in novel-view scene synthesis. However, it requires millions of 3D Gaussian spheres to achieve high-quality rendered images, leading to substantial GPU resource demands for training and rendering. This paper develops an efficient framework to address the challenges faced by 3DGS. 1) We propose an asymmetric mixed-precision quantization strategy to efficiently quantize and dequantize the parameters of 3D Gaussian spheres (excluding position and opacity) and use the Straight-Through Estimator method to address the issue of non-backpropagatable gradients after quantization. This approach significantly reduces GPU memory usage and model storage space. 2) Using statistical methods, we identify optimal gradient thresholds to enhance the densification algorithm of 3D Gaussian spheres, thereby further improving performance. 3) To address the speed bottlenecks in parallel processing of low-precision data using CUDA, we introduce a fast atomic operation method, increasing training speed tenfold. Overall, we validate the effectiveness of our framework across various datasets. It reduces GPU usage by 50% during training, decreases GPU usage threefold during rendering, halves the model storage space, and maintains image quality comparable to 3DGS.
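The quantize/dequantize step with a straight-through estimator can be illustrated as follows. This is a sketch under assumptions: "asymmetric" is read here as min-max affine quantization with an offset, the bit width is a free parameter, and the helper names are hypothetical. The abstract does not specify the paper's actual scheme.

```python
def quantize(values, bits):
    """Affine (min-max) quantization to unsigned integer codes with 2**bits levels."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [min(levels, max(0, int(round((v - lo) / scale)))) for v in values]
    return codes, scale, lo

def dequantize(codes, scale, lo):
    """Map integer codes back to approximate floating-point values."""
    return [c * scale + lo for c in codes]

def fake_quantize_with_ste(values, bits, upstream_grads):
    """Forward pass uses the quantized values; backward pass copies the gradient.

    Rounding has zero gradient almost everywhere, so the straight-through
    estimator treats quantize-then-dequantize as the identity when
    backpropagating.
    """
    codes, scale, lo = quantize(values, bits)
    forward = dequantize(codes, scale, lo)
    backward = list(upstream_grads)
    return forward, backward
```

With this scheme the round-trip error of each value is bounded by half a quantization step, and storage drops from 32 bits per parameter to `bits` per parameter plus one scale and one offset per group.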
ISBN: (digital) 9798331524937
ISBN: (print) 9798331524944
Adversarial attacks are becoming a dangerous means of disrupting image processing systems that use machine learning methods for decision making. Developing effective countermeasures against adversarial attacks is therefore becoming an important area of cybersecurity. The paper proposes a noise-based approach to countering adversarial attacks that is augmented with neural cleanse and JPEG compression technologies. The idea of the proposed approach is that adding noise distorts the effect of an adversarial attack, and neural cleaning and JPEG compression eliminate the consequences of such an effect. The paper examines the three most well-known types of adversarial attacks: Fast Gradient Sign Method, Zeroth Order Optimization, and One Pixel Attack. These attacks manipulate input data, resulting in misclassification or incorrect predictions by exploiting high-frequency components that are undetectable to humans. The research was carried out on two datasets: MNIST-JPG and PC Parts images. Two types of noise were used: Gaussian and Poisson. During the experiments, optimal parameters for these types of noise were found, ensuring maximum accuracy of image recognition after exposure to adversarial attacks.
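The noise-injection step alone can be sketched as below; the neural cleanse and JPEG compression stages are not shown. The assumptions are labeled: pixels are floats in [0, 1], `sigma` and `peak` stand in for the noise parameters the paper tunes (their optimal values are not given in the abstract), and the Poisson sampler uses Knuth's method because the Python standard library has no built-in one.

```python
import math
import random

def add_gaussian_noise(pixels, sigma, rng):
    """Add zero-mean Gaussian noise and clip back to the valid range [0, 1]."""
    return [min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in pixels]

def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for the small rates used here."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def add_poisson_noise(pixels, peak, rng):
    """Treat each pixel as a photon rate of p * peak, sample a count, rescale."""
    return [min(1.0, sample_poisson(p * peak, rng) / peak) for p in pixels]
```

A typical use would be `add_gaussian_noise(image, 0.1, random.Random(0))` applied to the flattened input before it reaches the classifier, with the noise level chosen by validation accuracy under attack.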
Skin cancer is a commonly occurring disease that affects people of all age groups. Automated detection of skin cancer is needed to decrease the death rate by identifying the disease at an early stage. Visual inspection of skin lesions during medical examination is a tedious process because the lesions resemble one another. Recently, imaging-based Computer Aided Diagnosis (CAD) models have been widely used to screen for and detect skin cancer. This paper designs an automated Deep Learning with Class Attention Layer based CAD model for skin lesion detection and classification, known as DLCAL-SLDC. The goal of the DLCAL-SLDC model is to detect and classify the different types of skin cancer using dermoscopic images. During image pre-processing, Dull Razor based hair removal and average median filtering based noise removal take place. A Tsallis entropy based segmentation technique is applied to detect the affected lesion areas in the dermoscopic images. In addition, a DLCAL based feature extractor extracts features from the segmented lesions using a Capsule Network (CapsNet) along with the CAL and the Adagrad optimizer. The CAL incorporated into the CapsNet is intended to capture discriminative class-specific features, cover the class dependencies, and effectively bridge the CapsNet to the subsequent processing. Finally, classification is carried out by the Swallow Swarm Optimization (SSO) algorithm based Convolutional Sparse Autoencoder (CSAE), known as the SSO-CSAE model. The proposed DLCAL-SLDC technique is validated using a benchmark ISIC dataset. The proposed framework achieves promising results, with 98.50% accuracy, 94.5% sensitivity, and 99.1% specificity, outperforming the other methods in terms of different measures.
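Tsallis entropy based segmentation is commonly realized as histogram thresholding; a minimal sketch of that common form follows. It is an assumption that the paper uses this variant: the pseudo-additive objective, the entropic index `q`, and the function names are illustrative and not taken from the abstract.

```python
def tsallis_entropy(probs, q):
    """S_q = (1 - sum(p_i ** q)) / (q - 1), defined for q != 1."""
    return (1.0 - sum(p ** q for p in probs if p > 0)) / (q - 1.0)

def tsallis_threshold(hist, q=0.8):
    """Return the bin index t that maximizes S_A + S_B + (1 - q) * S_A * S_B.

    Bins [0, t) form class A (e.g. background) and bins [t, end) form class B.
    """
    total = sum(hist)
    p = [h / total for h in hist]
    best_t, best_score = None, float("-inf")
    for t in range(1, len(p)):
        pa, pb = sum(p[:t]), sum(p[t:])
        if pa <= 0 or pb <= 0:
            continue  # one class would be empty
        sa = tsallis_entropy([x / pa for x in p[:t]], q)
        sb = tsallis_entropy([x / pb for x in p[t:]], q)
        score = sa + sb + (1.0 - q) * sa * sb
        if score > best_score:
            best_t, best_score = t, score
    return best_t
```

For a grayscale dermoscopic image, `hist` would be the 256-bin intensity histogram, and pixels below the returned threshold would be assigned to one region and the rest to the other.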
ISBN: (digital) 9798331509712
ISBN: (print) 9798331509729
Few-shot image classification is a challenging task that aims to recognize image classes based on only a few training images. However, existing methods face two main challenges: (1) they ignore frequency-domain information during image feature extraction, and (2) they do not take the semantic gap between multiple modalities into consideration, which limits classification performance. To overcome these limitations, we propose a novel method named Spatial-Frequency Integration Network with Dual Prompt Learning for few-shot image classification. First, we introduce a spatial-frequency integration module that combines spatial-domain and low-frequency information to extract discriminative image features from the image modality. Second, we design a dual prompting module, which integrates learnable prompts and hand-crafted prompts to improve generalization to new classes. Third, we propose an image-text interaction module to enhance inter-modal complementarity and consistency. Both theoretical and experimental validations confirm the effectiveness of the proposed method in few-shot image classification.
As the availability of SAR images continues to grow, efficient coregistration of massive SAR images presents a greater challenge. Traditional serial coregistration methods impose an unbearable time overhead. To reduce this overhead and make full use of computing resources, a parallel coregistration strategy based on Hadoop is proposed for SAR images. The Hadoop Distributed File System (HDFS) is used to store SAR image data in chunks, and Hadoop's distributed computing strategy, MapReduce, is used to realize distributed parallel processing of SAR images. Two distributed parallel coregistration methods are presented with the proposed parallel strategy: one based on the maximum correlation method and the other on the DEM-assisted coregistration method. These methods are evaluated through coregistration experiments on the same dataset, and they are verified by comparing the coregistration results and processing time.
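The map/reduce split for the maximum correlation method can be sketched on a single machine as follows. This is a toy illustration under assumptions, not the paper's Hadoop job: signals are 1-D, only integer shifts are searched, correlation is averaged over the overlap, and the reduce step (a majority vote over tiles) and all function names are hypothetical.

```python
from collections import Counter

def best_shift(ref, sec, max_shift):
    """Integer shift s that maximizes the mean of ref[i] * sec[i + s] over the overlap."""
    best_s, best_c = 0, float("-inf")
    for s in range(-max_shift, max_shift + 1):
        acc, n = 0.0, 0
        for i, r in enumerate(ref):
            j = i + s
            if 0 <= j < len(sec):
                acc += r * sec[j]
                n += 1
        if n and acc / n > best_c:
            best_s, best_c = s, acc / n
    return best_s

def map_tiles(tiles, max_shift):
    """Map phase: each (reference, secondary) tile is processed independently."""
    return [best_shift(ref, sec, max_shift) for ref, sec in tiles]

def reduce_shifts(shifts):
    """Reduce phase (assumed): keep the most common per-tile shift."""
    return Counter(shifts).most_common(1)[0][0]
```

Because `best_shift` needs only its own tile, the map phase is the part that a framework such as MapReduce can distribute across the nodes holding the corresponding data chunks.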
ISBN: (digital) 9798331509712
ISBN: (print) 9798331509729
The retrieval of encrypted images in cloud computing is currently a research hotspot. However, existing schemes suffer from low image retrieval accuracy because it is difficult for a convolutional neural network to obtain accurate feature information from an encrypted image. In this paper, a secure image retrieval method using a dual-stream convolution structure for feature fusion is proposed. Specifically, the stream-cipher-encrypted image containing contour features and the Fourier transform image containing frequency-domain features are used as the two inputs of the convolutional network streams, where a weighted average gate function helps to integrate the feature information in these streams to achieve a more comprehensive feature representation and improve retrieval accuracy. Furthermore, to keep the contour features of the encrypted image, the fuzzy image and Arnold mapping algorithms are selected randomly to encrypt the encrypted image twice. Experimental evaluation on three datasets, MNIST, Fashion-MNIST, and CIFAR-100, shows that the proposed encryption method has higher security than AES and stream cipher encryption. In addition, the mean average precision (mAP) of this scheme is superior to that of existing methods.
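The Arnold mapping named above is, in its classical form, a pixel-position scrambling of a square image. The sketch below shows that classical map and its inverse; the abstract gives no parameters, so the use of the standard matrix [[1, 1], [1, 2]], the `rounds` argument, and the function names are assumptions.

```python
def arnold_scramble(img, rounds=1):
    """Classical Arnold map on an n x n image: (x, y) -> ((x + y) % n, (x + 2y) % n)."""
    n = len(img)
    for _ in range(rounds):
        out = [[None] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                out[(x + y) % n][(x + 2 * y) % n] = img[x][y]
        img = out
    return img

def arnold_unscramble(img, rounds=1):
    """Inverse map: (x, y) -> ((2x - y) % n, (y - x) % n)."""
    n = len(img)
    for _ in range(rounds):
        out = [[None] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                out[(2 * x - y) % n][(y - x) % n] = img[x][y]
        img = out
    return img
```

The map only permutes pixel positions, so the set of pixel values is unchanged, and applying `arnold_unscramble` with the same number of rounds recovers the original image.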