ISBN: 9781728180687 (print)
The unmixing of hyperspectral data is an active topic in remote sensing. However, in the presence of various types of noise, especially noisy channels, the performance of unmixing approaches deteriorates seriously, so enhancing the robustness of unmixing methods is worth studying. This paper presents a robust unmixing method based on the recently proposed multilinear mixing model, where the ℓ2,1 norm is adopted in the loss function to suppress the influence of noise. The sparsity of the abundances is also considered to improve parameter estimation. The resulting optimization problem is solved by the alternating direction method of multipliers (ADMM). Experiments on both synthetic and real images demonstrate the performance of the proposed unmixing strategy.
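As an illustration of why an ℓ2,1 loss is robust to noisy channels, the sketch below computes the ℓ2,1 norm of a residual matrix and the row-wise shrinkage step that ADMM solvers commonly use for this norm. It assumes each row of the matrix holds one spectral band; the function names are hypothetical and this is not the authors' implementation.

```python
import math

def l21_norm(matrix):
    """Sum of the Euclidean norms of the rows of `matrix` (a list of lists)."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in matrix)

def prox_l21(matrix, tau):
    """Row-wise shrinkage: proximal operator of tau * ||.||_{2,1}.

    Rows whose norm is below tau are set to zero; the others are
    scaled towards zero by the factor (1 - tau / norm).
    """
    result = []
    for row in matrix:
        norm = math.sqrt(sum(v * v for v in row))
        scale = max(0.0, 1.0 - tau / norm) if norm > 0 else 0.0
        result.append([scale * v for v in row])
    return result
```

Because each row enters the sum through its unsquared norm, a single badly corrupted band contributes linearly to the loss rather than quadratically, which limits its pull on the estimate.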
ISBN: 9781665475921 (print)
Computer vision tasks suffer from the high cost of collecting large amounts of labeled data. Few-shot learning (FSL) is a dominant approach to this problem because it offers a way to learn novel categories from few training samples. In FSL tasks, meta-learning and metric learning have achieved impressive results. However, performance is still limited by the large intra-class variance and small inter-class distance caused by the limited number of samples. To solve this problem, we propose a new method that integrates meta-learning and metric learning techniques. Specifically, we first propose a feature representation (FR) module to construct representative support class prototypes and query features. Then, we design a bias loss to minimize the bias between support and query samples. Furthermore, we design an intra-class loss to minimize the distance between the query class prototype and each query sample. We denote this model as ML-FDA and validate it on standard few-shot classification benchmark datasets (miniImageNet, CIFAR-FS, FC100). The results show that our method improves on other methods of the same paradigm and achieves the best performance on most benchmarks. The ablation study and visualization analysis also demonstrate the effectiveness of our method.
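The prototype and intra-class ideas can be sketched generically as follows. This assumes a prototype is the mean of a class's embeddings and that distance is squared Euclidean, as in common prototype-based few-shot methods; the abstract does not specify the FR module or the exact losses, so all names here are hypothetical.

```python
def prototype(embeddings):
    """Mean embedding of one class (embeddings: list of equal-length lists)."""
    dim = len(embeddings[0])
    return [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def classify(query, prototypes):
    """Return the label whose prototype is closest to `query`."""
    return min(prototypes, key=lambda label: sq_dist(query, prototypes[label]))

def intra_class_loss(samples, proto):
    """Mean squared distance between each sample and its class prototype."""
    return sum(sq_dist(s, proto) for s in samples) / len(samples)
```

Minimizing a loss of this form pulls samples of one class towards their prototype, which is one way to reduce intra-class variance.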
ISBN: 0780386744 (print)
Image registration is a fundamental topic in image processing and has a wide variety of applications in computer vision. It is the process of matching two or more images taken at different times, from different sensors, or from different viewpoints, so that the matched coordinate points in the two images correspond to the same physical region of the scene being imaged. Typically, image registration is required in remote sensing (multispectral classification, image fusion, environmental monitoring, change detection, image mosaicing, weather forecasting), in medicine, in cartography, and in computer vision (target localization, automatic quality control). In this paper, a feature-based image registration technique is introduced for mosaicing a sequence of UAV images. First, features are extracted using corner detectors, and transformation parameters are computed using a discrete randomized approach. Then the images are blended in a tree-based approach. Finally, merging results for several sequences of UAV images are shown.
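The abstract does not say which randomized approach is used. As one possible illustration, the sketch below estimates a pure translation between two images from matched corner points with a RANSAC-style loop; both the translation-only model and the use of RANSAC are assumptions, and the function name is hypothetical.

```python
import random

def estimate_translation(matches, trials=100, tol=1.0, seed=0):
    """Estimate (dx, dy) from matches given as ((x1, y1), (x2, y2)) pairs.

    Each trial picks one match at random, takes its offset as the
    hypothesis, and counts how many matches agree within `tol` pixels.
    Returns the best hypothesis and its inlier count.
    """
    rng = random.Random(seed)
    best, best_inliers = (0.0, 0.0), -1
    for _ in range(trials):
        (x1, y1), (x2, y2) = rng.choice(matches)
        dx, dy = x2 - x1, y2 - y1
        inliers = sum(
            1 for (ax, ay), (bx, by) in matches
            if abs(bx - ax - dx) <= tol and abs(by - ay - dy) <= tol
        )
        if inliers > best_inliers:
            best, best_inliers = (dx, dy), inliers
    return best, best_inliers
```

Wrong corner matches rarely agree with each other, so the hypothesis with the most inliers usually comes from a correct match. A real mosaicing pipeline would typically fit a richer model such as an affine or projective transform.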
ISBN: 9781728185514 (print)
Video-based person re-identification (Re-ID) aims to match person images in video sequences captured by disjoint surveillance cameras. Traditional video-based person Re-ID methods focus on exploring appearance information and are thus vulnerable to illumination changes, scene noise, camera parameters, and especially variations in clothing and carried items. Gait recognition provides an implicit biometric solution that alleviates these problems. Nonetheless, its performance degrades severely as the camera view varies. To address these problems, we propose a framework that uses the sequence masks (SeqMasks) in the video to tightly integrate appearance information and gait modeling. To validate the effectiveness of our method, we build a novel dataset named MaskMARS based on MARS. Comprehensive experiments on MaskMARS, our proposed large in-the-wild video Re-ID dataset, demonstrate the strong performance and generalization capability of our method. Validation on the gait recognition dataset CASIA-B further demonstrates the capability of our hybrid model. Our code and the MaskMARS dataset will be open-sourced as a strong baseline.
ISBN: 9798331529543; 9798331529550 (print)
The structural similarity of point clouds presents challenges in accurately recognizing and segmenting semantic information at the demarcation points of complex scenes or objects. In this study, we propose a multi-scale graph transformer network (MGTN) for 3D point cloud semantic segmentation. First, a multi-scale graph convolution (MSG-Conv) is devised to address the limitations faced by existing methods when extracting local and global features of point cloud data with varying densities simultaneously. Subsequently, we employ a graph-transformer (G-T) module to enhance edge details and spatial position information in the point cloud, thereby improving recognition accuracy for small objects and confusing elements such as columns and beams. Extensive testing on ShapeNet parts and S3DIS datasets was conducted to demonstrate the effectiveness of MGTN. Compared to the baseline network DGCNN, our proposed MGTN achieves substantial performance improvements, as evidenced by notable increases in mIoU of 1.5% and 18.5% on the ShapeNet parts and S3DIS datasets respectively. Additionally, MGTN outperforms the recent CFSA-Net by 2.3% and 3.4% on OA and mIoU respectively.
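For reference, the two reported metrics, overall accuracy (OA) and mean intersection over union (mIoU), can be computed from per-point labels as sketched below. This uses the standard definitions; the exact evaluation protocol of the paper (for example, per-category averaging on ShapeNet parts) is not given in the abstract, and the function names are hypothetical.

```python
def overall_accuracy(pred, truth):
    """Fraction of points whose predicted label equals the ground truth."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)

def mean_iou(pred, truth, num_classes):
    """Average of per-class intersection over union."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union:  # skip classes absent from both prediction and ground truth
            ious.append(inter / union)
    return sum(ious) / len(ious)
```

mIoU weights every class equally, so small objects such as columns and beams affect it more than they affect OA.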
ISBN: 9781479902880 (print)
A growing societal awareness of privacy and security is pushing the development of signal processing techniques in the encrypted domain. Data compression in the encrypted domain has attracted much attention in recent years because it avoids leaking the data source during compression. This paper proposes an improved block-by-block compression scheme for encrypted images with a flexible compression ratio. The original image is encrypted by permuting the blocks of the image and then permuting the pixels within the blocks. During compression, randomly chosen pixels are used as reference information, and the remaining pixels are compressed by a coset code. At the decoder side, side information (SI), which is generated by combining the correlation among blocks with image restoration from partial random samples (IRPRS), is used to assist decompression. Moreover, an adaptive method for selecting the system parameters is also given. The experimental results show that the proposed method achieves a better reconstruction than the earlier method.
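The two building blocks named above, keyed permutation and coset coding, can be sketched as follows. This is a simplified illustration rather than the paper's scheme: the permutation is applied to a flat list (it could be used for block order or for pixels within a block), the coset index is taken as the pixel value modulo 2^bits, and all names and parameters are hypothetical.

```python
import random

def permute(values, key):
    """Shuffle `values` with a permutation derived from the secret `key`."""
    order = list(range(len(values)))
    random.Random(key).shuffle(order)
    return [values[i] for i in order]

def unpermute(values, key):
    """Invert `permute` by regenerating the same permutation from `key`."""
    order = list(range(len(values)))
    random.Random(key).shuffle(order)
    restored = [None] * len(values)
    for dst, src in enumerate(order):
        restored[src] = values[dst]
    return restored

def coset_encode(pixel, bits):
    """Keep only the coset index: the `bits` least significant bits."""
    return pixel % (1 << bits)

def coset_decode(index, side_info, bits, max_value=255):
    """Pick the member of the coset that is closest to the side information."""
    candidates = range(index, max_value + 1, 1 << bits)
    return min(candidates, key=lambda v: abs(v - side_info))
```

Decoding recovers the pixel exactly when the side information is within half a coset spacing of the true value, which is why the quality of the SI matters for the reconstruction.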
ISBN: 9781467355636; 9781467355629 (print)
In this study, a method is proposed for pasting a user-selected and copied region of a source image into a target image. Because the selected areas in the source and target images are not homogeneous, that is, they contain texture information, most previous methods in the literature that depend on the Poisson equation cause adverse effects such as blur or color leakage in the processed region. The proposed method avoids those artifacts in most cases and otherwise reduces them. The visual results also indicate that the method is promising.
ISBN: 9781479961399 (print)
Visual attention plays an important role in image and video processing. High definition (HD) techniques are now widely used, and ultra high definition (UHD) is becoming more and more popular. However, existing research on visual attention mainly focuses on relatively low-resolution videos or images, and there are very few studies on visual attention in UHD videos. In this paper, we build an Ultra High Definition (4K) Video Saliency Database. Using this database, we explore the characteristics of visual attention in UHD videos. The concept of aggregation maps (AGM) for videos is put forward to better analyse these characteristics. Through the experiment, we find fairly strong correlations between video resolution and visual attention behavior. We also find that people tend to focus on the center of videos of relatively low resolution. The database will be made publicly available at *** soon.
ISBN: 9781479902880 (print)
Compressive sensing imaging (CSI) is a new framework for image coding that enables acquiring and compressing a scene simultaneously. The CS encoder efficiently shifts the bulk of the system complexity to the decoder. Ideally, an implementation of CSI provides lossless compression in image coding. In this paper, we consider the lossy compression of the CS measurements in a CSI system. We design a universal quantizer for the CS measurements of any input image. The proposed method first establishes a universal probability model for the CS measurements in advance, without any knowledge of the input image. A fast quantizer is then designed based on this model. Simulation results demonstrate that the proposed method has nearly optimal rate-distortion (R-D) performance while maintaining a very low computational complexity at the CS encoder.
ISBN: 9780819469946 (print)
The end of the performance entitlement historically achieved by classic scaling of CMOS devices is within sight, driven ultimately by fundamental limits. Performance entitlements predicted by classic CMOS scaling have progressively failed to be realized in recent process generations due to excessive leakage, increasing interconnect delays, and scaling of gate dielectrics. Before fundamental limits are reached, trends in technology, architecture, and economics will pressure the industry to adopt new paradigms. A likely response is to repartition system functions away from digital implementations and into new architectures. Future architectures for visual communications will require extending the implementation into the optical and analog processing domains. The fundamental properties of these domains will in turn give rise to new architectural concepts. The limits of CMOS scaling and their impact on architectures will be briefly reviewed. Alternative approaches in the optical, electronic, and analog domains will then be examined for advantages, architectural impact, and drawbacks.