CENTRIST (CENsus TRansform hISTogram) is a descriptor that was first proposed for scene classification. In this paper, the differences between CENTRIST and LBP are analyzed theoretically. CENTRIST is then applied to content-based image retrieval, first being integrated with spatial information through a multi-scale spatial pyramid. The experimental results first show that, for the CENTRIST descriptor, the similarity of two images computed by histogram intersection gives better results than the similarity computed by Euclidean distance. The experiments then demonstrate that the main difference between CENTRIST and LBP is whether constraints and transitivity among neighboring pixels exist. Although CENTRIST achieves higher precision than LBP and its extensions on the top 40 returned images only for some categories chosen from the Corel and Caltech101 databases, the average P-R curve of CENTRIST is higher than those of the LBP variants.
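The two similarity measures compared in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the bit convention (a bit is 1 when the centre pixel is greater than or equal to the neighbour) is an assumption, and the spatial pyramid is omitted. Images are plain lists of rows and must be at least 3x3.

```python
def census_transform(img):
    """Return the 8-bit census transform code of every interior pixel."""
    h, w = len(img), len(img[0])
    # Neighbours visited left to right, top to bottom.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
    out = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            code = 0
            for dy, dx in offsets:
                # Assumed convention: bit is 1 when centre >= neighbour.
                bit = 1 if img[y][x] >= img[y + dy][x + dx] else 0
                code = (code << 1) | bit
            row.append(code)
        out.append(row)
    return out


def centrist(img):
    """L1-normalised 256-bin histogram of census transform codes."""
    hist = [0.0] * 256
    count = 0
    for row in census_transform(img):
        for code in row:
            hist[code] += 1.0
            count += 1
    return [v / count for v in hist]


def histogram_intersection(p, q):
    """Similarity: larger means more alike (1.0 for identical L1 histograms)."""
    return sum(min(a, b) for a, b in zip(p, q))


def euclidean_distance(p, q):
    """Dissimilarity: smaller means more alike."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
```

Note that histogram intersection is a similarity while Euclidean distance is a dissimilarity, so a retrieval ranking sorts by the first in descending order and by the second in ascending order.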
We propose an unsupervised person search method for video surveillance. This method considers both the spatial features of persons within each frame and the temporal relationship of the same person among different fra...
Efficient VLSI architectures for the multi-dimensional (m-D) discrete wavelet transform (DWT), e.g. m = 2, 3, are presented, in which the lifting scheme of the DWT is used to reduce hardware complexity efficiently. The parallelism of the 2^m subband transforms in the lifting-based m-D DWT is explored, which efficiently increases the throughput rate of the separable m-D DWT. The proposed architecture is composed of m·2^(m-1) 1-D DWT modules working in parallel and pipelined; it is designed to process 2^m input samples per clock cycle and to generate the coefficients of the 2^m subbands synchronously. The total time for computing one level of decomposition of a 2-D image (3-D image sequence) of size N^2 (MN^2) is approximately N^2/4 (MN^2/8) intra- clock cycles (ccs). An efficient line-based architecture framework for both 2D+t and t+2D 3-D DWT is proposed for the first time. Compared with similar works reported in the literature, the proposed architecture performs well in terms of the product of computation time and hardware cost. The proposed architecture is simple, regular, scalable and well suited for VLSI implementation.
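As a rough illustration of the lifting scheme and of the figures quoted above, the sketch below runs one level of a 1-D integer Haar transform as a predict step followed by an update step, and evaluates the module count, samples per cycle and approximate cycle count. The abstract does not say which wavelet filter the architecture uses, so Haar is swapped in here as the simplest case. All function names are hypothetical, and the input length is assumed to be even.

```python
def lifting_forward(signal):
    """One level of an integer Haar DWT written as two lifting steps."""
    even, odd = signal[0::2], signal[1::2]
    # Predict: each odd sample is predicted from its even neighbour.
    detail = [o - e for o, e in zip(odd, even)]
    # Update: even samples are corrected so approx tracks the local mean.
    approx = [e + (d >> 1) for e, d in zip(even, detail)]
    return approx, detail


def lifting_inverse(approx, detail):
    """Undo the lifting steps in reverse order (perfect reconstruction)."""
    even = [s - (d >> 1) for s, d in zip(approx, detail)]
    odd = [d + e for d, e in zip(detail, even)]
    signal = []
    for e, o in zip(even, odd):
        signal.extend([e, o])
    return signal


def architecture_figures(m, total_samples):
    """Figures from the abstract for an m-D transform.

    Returns (number of 1-D DWT modules, samples per clock cycle,
    approximate clock cycles for one decomposition level).
    """
    modules = m * 2 ** (m - 1)
    per_cycle = 2 ** m
    return modules, per_cycle, total_samples // per_cycle
```

For a 512x512 image, `architecture_figures(2, 512 * 512)` gives 4 modules, 4 samples per cycle and 65536 cycles, which matches the N^2/4 estimate.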
The non-local means (NLM) algorithm has been applied effectively to MRI denoising, but it is limited by its computational complexity. To reduce the computational burden of NLM on 3D MRI datasets, in this paper, we ...
In this paper, we address the problem of shadow detection in static images and video sequences. Instead of considering individual regions separately, we use relative illumination conditions between segme...
Visual codebook based quantization of robust appearance descriptors extracted from local image patches is an effective means of capturing image statistics for texture analysis and natural scene classification. In this paper, we optimize the codebook using the newly proposed statistics of word activation forces (WAFs). Currently, codebooks are typically created from a set of training images using a clustering algorithm. However, these codebooks are often functionally limited due to redundancy. We show that WAFs can remove the redundancy efficiently. In the experiments, the proposed method achieved state-of-the-art performance on the Caltech-101, fifteen natural scene categories and VOC2007 databases. The optimization method also offers insights into the success of several recently proposed image classification approaches, including vector quantization (VQ) coding in Spatial Pyramid Matching (SPM), sparse coding SPM (ScSPM), and Locality-constrained Linear Coding (LLC).
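For readers unfamiliar with codebook quantization, the following minimal sketch shows only the baseline VQ coding step that such methods start from: each local descriptor is assigned to its nearest codeword and the assignments are pooled into a normalised histogram. It does not implement the WAF-based optimization, whose details are not given in the abstract. The function names are hypothetical, and the codebook is assumed to be already available (for example from clustering).

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codeword closest to the descriptor (squared L2)."""
    best_index, best_dist = 0, float("inf")
    for index, word in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def vq_histogram(descriptors, codebook):
    """Bag-of-words histogram: fraction of descriptors per codeword."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_codeword(descriptor, codebook)] += 1
    total = len(descriptors)
    return [c / total for c in counts]
```

Redundant codewords show up in this representation as bins that split what is effectively one visual pattern, which is the kind of redundancy the abstract says WAFs help remove.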
A novel super-resolution approach is presented. It is based on the local Lipschitz regularity of wavelet transform along scales to predict the new detailed coefficients and their gradients from the horizontal, vertica...
Collaboration among multiple tasks is advantageous for enhancing learning efficiency in multi-agent reinforcement learning. To guide agents in cooperating with different teammates in multiple tasks, contemporary approaches encourage agents to exploit common cooperative patterns or identify the learning priorities of multiple tasks. Despite the progress made by these methods, they all assume that all cooperative tasks to be learned are related and require similar agent policies. This is rarely the case in multi-agent cooperation, where minor changes in team composition can lead to significant variations in cooperation, so that distinct cooperative strategies compete for limited learning resources. In this paper, to tackle the challenge posed by multi-task learning in potentially competing cooperative tasks, we propose a novel framework called Relation-Aware Learning (RAL). RAL incorporates a relation awareness module in both task representation and task optimization, aiding in reasoning about task relationships and mitigating negative transfer among dissimilar tasks. To assess the performance of RAL, we conduct a comparative analysis with baseline methods in a multi-task StarCraft environment. The results demonstrate the superiority of RAL in multi-task cooperative scenarios, particularly in scenarios involving multiple conflicting tasks. Index Terms: Cooperation games, multi-task learning, reinforcement learning. IEEE
ISBN: (digital) 9781728171685
ISBN: (print) 9781728171692
Deep learning techniques have dramatically boosted the performance of face alignment algorithms. However, due to large variability and a lack of samples, the alignment problem in unconstrained situations, e.g. large head poses, exaggerated expressions, and uneven illumination, is still largely unsolved. In this paper, we explore the intuitions and reasons behind our two proposals, i.e. the Propagation Module and the Focal Wing Loss, to tackle the problem. Concretely, we present a novel structure-infused face alignment algorithm based on heatmap regression, which propagates landmark heatmaps to boundary heatmaps that provide structure information for further attention map generation. Moreover, we propose a Focal Wing Loss for mining and emphasizing the difficult samples under in-the-wild conditions. In addition, we adopt methods from other fields, such as CoordConv and Anti-aliased CNN, that address the shift variance problem of CNNs for face alignment. In extensive experiments on different benchmarks, i.e. WFLW, 300W, and COFW, our method outperforms the state of the art by a significant margin. Our proposed approach achieves 4.05% mean error on WFLW, 2.93% mean error on the 300W full set, and 3.71% mean error on COFW.
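Of the components named in this abstract, CoordConv is the simplest to illustrate: before a convolution, extra channels holding each position's normalised x and y coordinates are appended to the feature map, so the layer can see where it is in the image. The sketch below shows only that channel augmentation on a plain nested-list feature map. It is not the authors' network, the function name is hypothetical, and the normalisation of coordinates to [-1, 1] is an assumption.

```python
def add_coord_channels(feature_map):
    """Append x and y coordinate channels to a C x H x W feature map.

    feature_map is a list of channels, each a list of H rows of W values.
    Coordinates are normalised to [-1, 1] (assumed convention).
    """
    height = len(feature_map[0])
    width = len(feature_map[0][0])

    def normalise(index, size):
        # A single row or column has no extent, so map it to 0.0.
        return 2.0 * index / (size - 1) - 1.0 if size > 1 else 0.0

    x_channel = [[normalise(x, width) for x in range(width)]
                 for _ in range(height)]
    y_channel = [[normalise(y, height) for _ in range(width)]
                 for y in range(height)]
    # The input channels are kept unchanged; two new channels are added.
    return feature_map + [x_channel, y_channel]
```

A convolution applied to the augmented map takes C + 2 input channels instead of C; everything else in the layer stays the same.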