ISBN (digital): 9781728171685
ISBN (print): 9781728171692
Recent years have witnessed great progress in person re-identification (re-id). Several academic benchmarks, such as Market1501, CUHK03 and DukeMTMC, have played important roles in promoting re-id research. To the best of our knowledge, all existing benchmarks assume that the same person wears the same clothes. In real-world scenarios, however, people often change clothes. To address the clothes-changing person re-id problem, we construct a novel large-scale re-id benchmark named Clothes Changing Person Set (COCAS), which provides multiple images of the same identity in different clothes. In total, COCAS contains 62,382 body images of 5,266 persons. Based on COCAS, we introduce a new person re-id setting for the clothes-changing problem, in which the query includes both a clothes template and an image of the person wearing different clothes. Moreover, we propose a two-branch network named Biometric-Clothes Network (BC-Net), which effectively integrates biometric and clothes features for re-id under our setting. Experiments show that clothes-changing re-id with clothes templates is feasible.
ISBN (print): 9781467322164
Semi-supervised learning (SSL) relies on a few labeled samples to explore the intrinsic structure of data through pairwise smooth transduction. The performance of SSL depends mainly on two factors: (1) the accuracy of the labeled queries and (2) the integrity of the manifolds in the data distribution. Both are often poor in real applications, because data often consist of several irrelevant clusters and discrete noise. In this paper we propose a novel framework that simultaneously removes discrete noise and locates the high-density clusters. Experiments demonstrate that our algorithm is effective on several problems, such as non-feedback image re-ranking and image co-segmentation.
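To make "pairwise smooth transduction" concrete, the following is a minimal sketch of classic graph-based label propagation (iterating F ← αSF + (1 − α)Y on a Gaussian affinity graph). It is not the framework proposed in this abstract: the function name `label_propagation` and the parameters `sigma`, `alpha`, and `iters` are illustrative assumptions, and the sketch performs no noise removal or cluster selection.

```python
import math

def label_propagation(points, labels, sigma=1.0, alpha=0.9, iters=200):
    """Propagate a few known labels over a pairwise affinity graph.

    points: list of coordinate tuples.
    labels: class index for labeled points, None for unlabeled ones.
    All names and defaults here are assumptions made for illustration.
    """
    n = len(points)
    classes = sorted({c for c in labels if c is not None})

    # Gaussian pairwise affinities with a zero diagonal.
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                W[i][j] = math.exp(-d2 / (2 * sigma ** 2))

    # Symmetric normalization S = D^(-1/2) W D^(-1/2).
    deg = [sum(row) for row in W]
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]

    # One-hot initial labels; unlabeled rows stay all-zero.
    Y = [[1.0 if labels[i] == c else 0.0 for c in classes] for i in range(n)]
    F = [row[:] for row in Y]

    # Smooth the label scores over the graph while staying close to Y.
    for _ in range(iters):
        F = [[alpha * sum(S[i][j] * F[j][k] for j in range(n))
              + (1 - alpha) * Y[i][k]
              for k in range(len(classes))]
             for i in range(n)]

    # Each point takes the class with the highest propagated score.
    return [classes[max(range(len(classes)), key=lambda k: F[i][k])]
            for i in range(n)]


# Two well-separated clusters, one labeled point in each.
points = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (5.0, 0.0), (5.5, 0.0), (6.0, 0.0)]
labels = [0, None, None, None, None, 1]
predicted = label_propagation(points, labels)
```

In this toy setting every unlabeled point inherits the label of the cluster it sits in. A mislabeled query or a point bridging the two clusters would spread errors in the same way, which is the sensitivity the abstract describes.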
Recent studies often exploit Graph Convolutional Networks (GCNs) to model label dependencies and improve recognition accuracy in multi-label image recognition. However, constructing a graph by counting the label co-occu...
In this paper, we introduce the Equipment Nameplate Dataset, a large dataset for scene text detection and recognition. The natural images in this dataset are taken in the wild and therefore exhibit various intra-class inconsistencies, such as poor illumination and partial occlusion, which make the dataset more challenging than other datasets. So that detection and recognition models can be trained separately, we annotate not only word instances but also text regions, using rectangular bounding boxes. We provide detailed statistics about the dataset so that others can use them to analyse and develop their own models. Moreover, we evaluate several well-known detection and recognition models on the dataset and report the results so that researchers can compare them with their own models. The dataset will be made publicly available on the website.
Use of handwriting words for person identification in contrast to biometric features is gaining importance in the field of forensic applications. As a result, forging handwriting is a part of crime applications and he...
The end-to-end Human Mesh Recovery (HMR) approach (Kanazawa et al. 2018) has been successfully used for 3D body reconstruction. However, most HMR-based frameworks reconstruct the human body by directly learning mesh param...
Transformer-based methods have shown impressive performance in low-level vision tasks, such as image super-resolution. However, we find through attribution analysis that these networks can only utilize a limited spatial range of input information. This implies that the potential of the Transformer is still not fully exploited in existing networks. In order to activate more input pixels for better reconstruction, we propose a novel Hybrid Attention Transformer (HAT). It combines channel attention and window-based self-attention, thus exploiting their complementary advantages: the ability to utilize global statistics and a strong local fitting capability. Moreover, to better aggregate cross-window information, we introduce an overlapping cross-attention module that enhances the interaction between neighboring window features. In the training stage, we additionally adopt a same-task pre-training strategy to exploit the potential of the model for further improvement. Extensive experiments show the effectiveness of the proposed modules, and we further scale up the model to demonstrate that performance on this task can be greatly improved. Our overall method significantly outperforms the state-of-the-art methods by more than 1 dB.
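To give a sense of scale for the reported gain, here is a small sketch that assumes the margin is measured in PSNR, the usual metric for super-resolution (the abstract does not name the metric). Since PSNR = 10·log10(MAX² / MSE), a gain of 1 dB corresponds to reducing the mean squared error by a factor of 10^0.1, roughly 1.26. The helper names `psnr` and `mse_ratio_for_gain` are illustrative, not taken from the paper.

```python
import math

def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)

def mse_ratio_for_gain(gain_db):
    """Factor by which MSE must shrink to raise PSNR by gain_db decibels."""
    return 10.0 ** (gain_db / 10.0)


ratio = mse_ratio_for_gain(1.0)  # MSE reduction factor behind a 1 dB gain

# Check the relation on toy data: shrinking every error by sqrt(ratio)
# shrinks the MSE by ratio and should raise PSNR by 1 dB.
reference = [0.0, 0.0, 0.0, 0.0]
baseline = [10.0, 10.0, 10.0, 10.0]
improved = [10.0 / math.sqrt(ratio)] * 4
gain = psnr(reference, improved) - psnr(reference, baseline)
```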
Due to changes in people's mindset and lifestyle, the number of diversified marriages is increasing all around the world, irrespective of race, color, religion and culture. As a result, it is challenging for resea...
ISBN (print): 9781467391054
As an important branch of computational photography, light field photography effectively combines the hardware design of the optical system with key signal processing algorithms. Unlike traditional photography, which records only the two-dimensional position of light rays, a light field photography system records their four-dimensional position and direction. Much more image information can therefore be obtained from light field photography. With the development of 3D display technology, light field based autofocus and 3D display are becoming increasingly popular. In this paper, we propose a new light field based 3D reconstruction algorithm for buildings and office environments. It applies the Wavelet Transform and an SVM (Support Vector Machine) model to assess image focusing quality, and the Mean Shift algorithm and a random field model to obtain the depth map of the scene. First, a light field image is captured with a light field camera. Second, a frequency-domain digital refocusing algorithm is applied to the light field image to obtain a series of refocused images with different focus settings. Third, wavelet features are extracted from each refocused image, and image focusing quality is assessed with an SVM model using an RBF (Radial Basis Function) kernel. Finally, the Mean Shift algorithm is used to cluster the colors of the original light field image, and an MRF (Markov Random Field) model is built with color nodes. By iterating on the likelihood depth result, which is obtained from real-scenario depth calibrations according to the image focusing quality assessment, the depth map of the scene is reconstructed. Experiments demonstrate the feasibility of the proposed light field based 3D reconstruction algorithm, and results on real datasets show that it performs well.
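The core idea behind steps two and three (score how well focused each refocused image is at each location, then read depth off the best-focused slice) can be sketched as a plain depth-from-focus routine. This sketch deliberately swaps in a simple Laplacian focus measure for the wavelet features and RBF-kernel SVM described above, and it omits the Mean Shift clustering and MRF refinement entirely. The function names and the toy focal stack are assumptions made for illustration.

```python
def laplacian_focus(image, y, x):
    """Absolute 4-neighbour Laplacian at an interior pixel (a crude sharpness score)."""
    return abs(4 * image[y][x]
               - image[y - 1][x] - image[y + 1][x]
               - image[y][x - 1] - image[y][x + 1])

def depth_index_map(stack):
    """For each interior pixel, return the index of the sharpest refocused image.

    stack: list of equally sized 2D grayscale images (lists of lists),
    ordered by focus distance. Border pixels are left at index 0.
    """
    height, width = len(stack[0]), len(stack[0][0])
    depth = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            scores = [laplacian_focus(image, y, x) for image in stack]
            depth[y][x] = max(range(len(stack)), key=scores.__getitem__)
    return depth


# Toy focal stack: the centre pixel is blurred (flat) in slice 0, sharp in slice 1.
flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
sharp = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
depth = depth_index_map([flat, sharp])
```

Mapping the slice index to a metric depth would still require a calibration such as the real-scenario depth calibration mentioned in the abstract, and a per-pixel argmax is noisy in textureless regions, which is presumably why the described method adds color clustering and an MRF on top.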