In this article, we present a novel approach to segment discriminative patches in human activity videos. First, we adopt the spatio-temporal interest points (STIPs) to represent significant motion patterns in the vide...
详细信息
In this article, we present a novel approach to segment discriminative patches in human activity videos. First, we adopt the spatio-temporal interest points (STIPs) to represent significant motion patterns in the video sequence. Then, nonnegative sparse coding is exploited to generate a sparse representation of each STIP descriptor. We construct the feature vector for each video by applying a two-stage sum-pooling and l(2)-normalization operation. After training a multi-class classifier through the error-correcting code svm, the discriminative portion of each video is determined as the patch that has the highest confidence while also being correctly classified according to the video category. Experimental results show that the video patches extracted by our method are more separable, while preserving the perceptually relevant portion of each activity.
暂无评论