ISBN: (print) 9781728152103
Human pose estimation is an important research topic in computer vision. Bottom-up methods such as OpenPose have become prevalent because of their high efficiency and detection speed. For poor-quality video, however, the keypoints obtained by bottom-up methods suffer from jitter and loss. In this paper, we propose DP-Pose, which constructs candidate pose sequences by selecting points in a region of the heatmap, improves the constraint function by combining distance and confidence, and solves for the optimal keypoint locations by dynamic programming. Experimental results on a class dataset show that DP-Pose can recover missing keypoints using heatmap data and obtain stable poses from candidate pose sequences.
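The dynamic-programming step can be illustrated with a minimal sketch. It is not the DP-Pose implementation: the cost (Euclidean displacement between consecutive frames minus candidate confidence), the weight `lam`, and the function name `smooth_keypoint_track` are assumptions made for illustration, and the sketch handles a single keypoint rather than a full pose.

```python
import math


def smooth_keypoint_track(candidates, lam=1.0):
    """Choose one candidate per frame with a Viterbi-style dynamic program.

    candidates: list over frames; each frame is a non-empty list of
                (x, y, confidence) tuples, e.g. peaks taken from a heatmap region.
    lam:        hypothetical weight of the displacement penalty.
    Returns the selected (x, y) position for every frame.
    """
    # cost[t][j]: minimal accumulated cost of a track ending at candidate j of frame t.
    cost = [[-c for (_, _, c) in candidates[0]]]
    back = [[None] * len(candidates[0])]

    for t in range(1, len(candidates)):
        row_cost, row_back = [], []
        for (x, y, c) in candidates[t]:
            best_i, best = None, math.inf
            for i, (px, py, _) in enumerate(candidates[t - 1]):
                # Assumed constraint: accumulated cost plus weighted jump distance.
                step = cost[t - 1][i] + lam * math.hypot(x - px, y - py)
                if step < best:
                    best_i, best = i, step
            row_cost.append(best - c)  # high confidence lowers the cost
            row_back.append(best_i)
        cost.append(row_cost)
        back.append(row_back)

    # Backtrack from the cheapest final candidate.
    j = min(range(len(cost[-1])), key=cost[-1].__getitem__)
    path = []
    for t in range(len(candidates) - 1, -1, -1):
        x, y, _ = candidates[t][j]
        path.append((x, y))
        j = back[t][j]
    path.reverse()
    return path
```

In this sketch, a confident but distant candidate in one frame is rejected in favor of a nearby, less confident one whenever the displacement penalty outweighs the confidence gain, which is the kind of trade-off that suppresses jitter.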
Many signal processing applications require performing statistical inference on large datasets, where computational and/or memory restrictions become an issue. In this big data setting, computing an exact global centralized estimator is often either infeasible or impractical. Hence, several authors have considered distributed inference approaches, where the data are divided among multiple workers (cores, machines, or a combination of both). The computations are then performed in parallel and the resulting partial estimators are finally combined to approximate the intractable global estimator. In this paper, we focus on the scenario where no communication exists among the workers, deriving efficient linear fusion rules for the combination of the distributed estimators. Both a constrained optimization perspective and a Bayesian approach (based on the Bernstein-von Mises theorem and the asymptotic normality of the estimators) are provided for the derivation of the proposed linear fusion rules. We concentrate on finding the minimum mean squared error (MMSE) global estimator, but the developed framework is very general and can be used to combine any type of unbiased partial estimators (not necessarily MMSE partial estimators). Numerical results show the good performance of the algorithms developed, both in problems where analytical expressions can be obtained for the partial estimators, and in a wireless sensor network localization problem where Monte Carlo methods are used to approximate the partial estimators. © 2018 Elsevier Inc. All rights reserved.
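As a rough illustration of linear fusion, the sketch below shows the classical scalar case: independent, unbiased partial estimates are combined with weights proportional to their inverse variances, which minimizes the variance of the fused estimate among unbiased linear combinations. The paper's actual fusion rules are not reproduced here; the function name `fuse_estimates` and the assumption that each worker reports a known variance are illustrative only.

```python
def fuse_estimates(estimates, variances):
    """Inverse-variance weighted fusion of scalar partial estimates.

    estimates: one unbiased estimate per worker.
    variances: the (assumed known, strictly positive) variance of each estimate.
    Assumes the workers' estimation errors are independent.
    Returns (fused_estimate, fused_variance, weights).
    """
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    # Weights sum to one, so the fused estimate remains unbiased.
    weights = [p / total_precision for p in precisions]
    fused = sum(w * e for w, e in zip(weights, estimates))
    fused_variance = 1.0 / total_precision
    return fused, fused_variance, weights
```

With equal variances the rule reduces to a plain average; a worker with a noisier estimate receives a proportionally smaller weight.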
ISBN: (digital) 9781728143286
ISBN: (print) 9781728143293
Video semantic segmentation is an important and fundamental problem in computer vision, with broad application prospects in mobile robotics, drones, intelligent driving, and monitoring. With the development of neural networks, the commonly adopted models are all based on the fully convolutional network (FCN). However, current methods are limited by small training sets, which makes it difficult to improve segmentation accuracy. In this paper, we propose a robust method that applies different data augmentation methods to enlarge the dataset according to the characteristics of the scene. Based on an analysis of different video features, targeted data augmentation techniques are selected to increase the number of training samples. Experimental results show that data augmentation techniques can significantly improve the accuracy of video semantic segmentation compared with traditional training methods that ignore video features.
Authors: Ma, Linfei; Zhang, Xiang; Lan, Long; Huang, Xuhui; Luo, Zhigang
NUDT, Sci & Technol Parallel & Distributed Lab, Changsha 410073, Hunan, Peoples R China
NUDT, Coll Comp, Changsha 410073, Hunan, Peoples R China
NUDT, State Key Lab High Performance Comp, Changsha 410073, Hunan, Peoples R China
NUDT, Dept Comp Sci & Technol, Changsha 410073, Hunan, Peoples R China
ISBN: (print) 9781479970612
Canonical correlation analysis (CCA) is a classical subspace learning method for capturing the common semantic information underlying multi-view data. It has been used in person re-identification (re-ID) by treating the task of matching identical individuals across non-overlapping cameras as a multi-view learning problem. However, CCA-based re-ID methods still achieve unsatisfactory results because few of them jointly consider discriminative margin information and the selection of relevant features. To address this issue, we propose a novel l(2,1)-norm regularized margin-embedding CCA (l(2,1)-MCCA), which learns a generalized discriminative subspace by employing more discriminative margin information. Moreover, the new method enforces an l(2,1)-norm regularization term over the learned subspace to identify the relevant features. The two schemes are lightweight and effective, benefit from each other, and work to enlarge the inter-class variations while reducing the intra-class variations. Experiments on three popular datasets show the efficacy of l(2,1)-MCCA compared with recent representative re-ID methods.
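For readers unfamiliar with the regularizer, the l(2,1)-norm of a matrix is the sum of the Euclidean norms of its rows. The sketch below only computes that quantity; it is not the l(2,1)-MCCA objective, and the convention that each row of the projection matrix `W` corresponds to one input feature is an assumption made for illustration.

```python
import math


def l21_norm(W):
    """Return the l(2,1)-norm of a matrix given as a list of rows.

    Each row's Euclidean (l2) norm is computed first, and the row norms are
    then summed (l1). Penalizing this quantity pushes entire rows toward
    zero, which is why it is commonly used for feature selection.
    """
    return sum(math.sqrt(sum(v * v for v in row)) for row in W)


# Hypothetical 3-feature, 2-dimensional projection: the second feature's
# row is all zeros, so it contributes nothing to the learned subspace.
W = [[3.0, 4.0],
     [0.0, 0.0],
     [0.0, 5.0]]
norm_value = l21_norm(W)
```

Rows that are driven to zero correspond to features that the projection ignores, which is how such a penalty identifies the relevant ones.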
With the increase in the number of students enrolled in the university system, regular assessment of student performance has become challenging. This is especially true in the case of summative assessments, where one expec...
ISBN: (digital) 9781728143286
ISBN: (print) 9781728143293
Feature matching is a critical step in a variety of vision-based applications. However, it is difficult to design a general method because of complicated image transformations. In this paper, we formulate feature matching as a mathematical problem that is simple and does not rely on any specific model. Specifically, we construct an objective function based on two constraints: a geometric constraint and a consistency constraint. Unlike previous work, we exploit the interaction of these two constraints to reject outliers. In addition, we adopt an iterative strategy to boost the number of inliers. To accelerate matching, we use parallel computing to obtain the K nearest neighbors of each feature point. Experimental results on ten image pairs undergoing different image transformations demonstrate that our method outperforms other state-of-the-art methods in both accuracy and efficiency.
ISBN: (print) 9781728152103
To address the problem of detecting small flying targets in infrared image sequences, a new approach based on generalized low-rank background estimation is proposed. First, generalized low-rank approximation is introduced to model the infrared background over sequential infrared images. Next, the foreground target image is obtained by subtracting the generalized low-rank background estimate. Finally, small flying targets are detected in the separated target image by threshold segmentation. Experimental results on two infrared image sequences of flying planes demonstrate that the proposed method detects targets effectively and outperforms the baseline methods in precision and recall.
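The subtraction and thresholding stages of such a pipeline can be sketched as follows. Note the substitution: the background here is a per-pixel temporal median, used as a simple stand-in for the generalized low-rank background estimate, which is not reproduced. The function name `detect_targets` and the fixed global `threshold` are likewise assumptions.

```python
from statistics import median


def detect_targets(frames, threshold):
    """Background subtraction followed by threshold segmentation.

    frames:    list of equally sized 2-D lists of pixel intensities.
    threshold: hypothetical global threshold on the absolute residual.
    Returns (background, masks), with one boolean mask per input frame.
    """
    rows, cols = len(frames[0]), len(frames[0][0])

    # Stand-in background model: per-pixel median over time
    # (NOT the generalized low-rank estimate described in the abstract).
    background = [
        [median(f[r][c] for f in frames) for c in range(cols)]
        for r in range(rows)
    ]

    masks = []
    for f in frames:
        # Foreground residual = frame minus background; keep large residuals.
        masks.append([
            [abs(f[r][c] - background[r][c]) > threshold for c in range(cols)]
            for r in range(rows)
        ])
    return background, masks
```

A pixel is flagged only in frames where it departs from the estimated background by more than the threshold, so a small bright target passing over a static background is isolated in the frames where it appears.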
ISBN: (print) 9781728152103
Based on the characteristics of human visual speed perception, a video quality assessment (VQA) algorithm is introduced in this paper. The scheme incorporates natural video statistics feature extraction and weighting factors. Considering the impact of both the video content itself and the characteristics of the human visual system (HVS) on subjective perception, we propose weighting factors that scale the effect of those features; they consist of two parts: motion information and perception noise. Natural video statistics features related to the spatial and temporal domains are extracted. The weighting factors are used to combine the temporal and spatial features and generate the quality of each frame. Finally, the video quality score is obtained by a pooling scheme. The LIVE database, the EPFL-PoliMI database, and some other generated test videos were used in our experiments, and the results indicate that our model has outstanding performance.
ISBN: (print) 9781728152103
Loop closure detection plays a vital role in visual Simultaneous Localization and Mapping (SLAM) systems. Traditional Bag-of-Visual-Words (BoVW) methods for loop closure detection have been successfully applied to various SLAM systems. Convolutional Neural Network (ConvNet) based methods can automatically learn feature representations from the original image and are more robust to illumination changes. However, both kinds of methods have limitations: traditional methods extract insufficient features, and ConvNet-based methods cannot adapt to complex environment changes. To overcome these shortcomings, we propose an improved hybrid deep learning architecture (HDLA) that generates high-level semantic image features specifically for loop closure detection. The network is based on a hybrid ConvNet that we modified for robust, real-time feature extraction. We present extensive experiments on three real-world datasets to evaluate each of the specific challenges in loop closure detection. The results demonstrate that the proposed method achieves superior performance and provides a feasible solution for loop closure detection.