In this paper, we propose a video-based full-body gesture recognition system independent of the view angle of the cameras. We performed multilinear analysis on the silhouette images of the static poses making up the g...
We compare the characteristics and performance of joint (single-step) and sequential (two-step) approaches for creating sparse and structured acoustic signal representations derived using overcomplete methods (OMs). A joint approach, such as molecular matching pursuit (MMP), attempts to find coherent structures in a signal as part of the decomposition process, while a sequential approach, such as agglomerative clustering (AC), attempts to find coherent structures after the signal decomposition. We review each approach, and examine their performance using real audio and music signals.
In this paper, we tackle robust human pose recognition using unlabelled markers obtained from an optical marker-based motion capture system. A coarse-to-fine fast pose matching algorithm is presented with the following three steps. First, given a query pose, the majority of the non-matching poses are rejected according to marker distributions along the radius and height dimensions. Second, relative rotation angles between the query pose and the remaining candidate poses are estimated using a fast histogram matching method based on circular convolution implemented using the fast Fourier transform. Finally, the rotation angle estimates are refined by nonlinear least-squares minimization using the Levenberg-Marquardt algorithm. In the presence of multiple solutions, false poses can be effectively removed by thresholding the minimized matching scores. The proposed framework can handle missing markers caused by occlusion. Experimental results using real motion capture data show the efficacy of the proposed approach.
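The second step above, estimating a relative rotation by circular histogram matching with the FFT, can be sketched as follows. This is only an illustration under stated assumptions and not the authors' implementation: it assumes each pose is summarized by a histogram of marker counts over equally spaced azimuth bins, and the function name `estimate_rotation_shift` and the 36-bin resolution are hypothetical.

```python
import numpy as np

def estimate_rotation_shift(query_hist, candidate_hist):
    """Return the circular shift (in bins) that best aligns candidate to query.

    Computes the circular cross-correlation
        corr[s] = sum_k query[(k + s) % n] * candidate[k]
    for all shifts s at once via the FFT, then picks the best shift.
    """
    n = len(query_hist)
    spectrum = np.fft.rfft(query_hist) * np.conj(np.fft.rfft(candidate_hist))
    corr = np.fft.irfft(spectrum, n=n)
    return int(np.argmax(corr))

# Hypothetical example: 36 azimuth bins of 10 degrees each.
rng = np.random.default_rng(0)
candidate = rng.random(36)           # stand-in for a marker azimuth histogram
query = np.roll(candidate, 7)        # same pose rotated by 7 bins
shift = estimate_rotation_shift(query, candidate)
angle_deg = shift * 360.0 / candidate.size
```

The FFT evaluates every candidate shift in O(n log n) rather than O(n^2), and the resulting coarse angle is limited to the bin resolution, which is consistent with the abstract following this step with a least-squares refinement.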
Authors:
Guo, Feng; Qian, Gang
Arts, Media and Engineering Program, Department of Electrical Engineering, Arizona State University
In this paper, a Bayesian mixture expert (BME) framework for the estimation of 3D human poses from two uncalibrated wide-baseline cameras is presented. The two cameras will reduce the ambiguities of the pose estimatio...
The authors investigate the characteristics and performance of joint (single-step) and sequential (two-step) approaches to creating sparse and structured multiresolution representations of audio and music signals derived using sparse overcomplete methods. A joint approach, such as molecular matching pursuit, attempts to find structures in a signal as part of the decomposition process, while a sequential approach, such as agglomerative clustering, attempts to find structures in the completed decomposition of a signal. Each of these approaches has different benefits and drawbacks. For a joint approach, it is computationally convenient that the decomposition and structuring are done simultaneously, but usually only simple structural relations are possible. For a sequential approach, one is working in a parameter space of much smaller dimension than the original signal, but the computational cost is higher since the decomposition and the structure building are two separate processes. Results from these approaches using real audio and music signals will be compared and contrasted, and will contribute to our goal of creating an enhanced interface between the content of audio and music signals (e.g., onsets, notes, voices) and their multiresolution sparse atomic decompositions.
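For orientation, the greedy decomposition underlying these overcomplete methods can be sketched with plain matching pursuit. This is a minimal sketch of the basic algorithm only, assuming a dictionary with unit-norm columns; it is neither molecular matching pursuit nor agglomerative clustering, and the toy dictionary (identity plus normalized Hadamard atoms) is an assumption chosen for illustration.

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_atoms):
    """Greedy sparse decomposition over an overcomplete dictionary.

    dictionary: array of shape (signal_length, num_atoms), unit-norm columns.
    Returns the coefficient vector and the final residual.
    """
    residual = np.asarray(signal, dtype=float).copy()
    coeffs = np.zeros(dictionary.shape[1])
    for _ in range(n_atoms):
        # Correlate every atom with what is left of the signal.
        correlations = dictionary.T @ residual
        best = int(np.argmax(np.abs(correlations)))
        # Keep the best atom's contribution and subtract it from the residual.
        coeffs[best] += correlations[best]
        residual -= correlations[best] * dictionary[:, best]
    return coeffs, residual

# Toy overcomplete dictionary: 4 spikes plus 4 normalized Hadamard atoms.
hadamard = np.array([[1, 1, 1, 1],
                     [1, -1, 1, -1],
                     [1, 1, -1, -1],
                     [1, -1, -1, 1]]) / 2.0
dictionary = np.hstack([np.eye(4), hadamard])
signal = 3.0 * dictionary[:, 1]
coeffs, residual = matching_pursuit(signal, dictionary, n_atoms=3)
```

In the terms of the abstract, a sequential approach would group the selected atoms after a loop like this finishes, whereas a joint approach would build the grouping into the atom selection itself.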
Authors:
Feng Guo; Gang Qian
Arts, Media and Engineering Program and Department of Electrical Engineering, Arizona State University, USA
In this paper, a Bayesian mixture expert (BME) framework for the estimation of 3D human poses from two uncalibrated wide-baseline cameras is presented. Using two cameras greatly reduces the ambiguities of pose estimation, and the setup is easy to implement. The BME is learnt to conduct multimodal pose regression. A K-means algorithm that considers both the Euclidean distance and the maximum-value distance between joint angle vectors is used for the initial clustering in BME learning. This gives better clustering results, separating ambiguous poses into different experts. In addition, a weighted PCA is implemented in an expectation-maximization (EM) framework to learn the parameters of the BME, which reduces the dimension of the training data more effectively than global PCA. The system is trained with synthesized silhouettes from motion capture data. Experimental results on synthesized and real images illustrate that our approach does not need precise camera calibration and can estimate poses effectively.
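The abstract does not say how the Euclidean and maximum-value distances are combined in the K-means step. The sketch below shows one possible reading, a weighted sum of the two, applied to the cluster assignment step only. The weight `alpha`, the function names, and the assumption that joint angles are given in radians without wraparound handling are all hypothetical.

```python
import numpy as np

def combined_distance(angles, centers, alpha=0.5):
    """Distance from each joint angle vector to each cluster center.

    angles:  array of shape (num_poses, num_joints)
    centers: array of shape (num_clusters, num_joints)
    alpha:   hypothetical weight between the two distance terms
    """
    diff = np.abs(angles[:, None, :] - centers[None, :, :])
    euclidean = np.sqrt((diff ** 2).sum(axis=2))
    max_value = diff.max(axis=2)  # largest single-joint deviation
    return alpha * euclidean + (1.0 - alpha) * max_value

def assign_clusters(angles, centers, alpha=0.5):
    """K-means assignment step: index of the nearest center for each pose."""
    return combined_distance(angles, centers, alpha).argmin(axis=1)

# Toy example with two well-separated groups of 3-joint poses.
angles = np.array([[0.0, 0.0, 0.0],
                   [0.1, 0.0, 0.0],
                   [1.0, 1.0, 1.0],
                   [0.9, 1.1, 1.0]])
centers = np.array([[0.0, 0.0, 0.0],
                    [1.0, 1.0, 1.0]])
labels = assign_clusters(angles, centers)
```

The maximum-value term penalizes a pose that differs strongly from a center in a single joint even when its overall Euclidean distance is small, which is one plausible way to keep ambiguous poses in separate clusters.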
ISBN: (print) 9781450378376
We present the Multimodal Music Stand (MMMS) for the untethered sensing of performance gestures and the interactive control of music. Using e-field sensing, audio analysis, and computer vision, the MMMS captures a performer's continuous expressive gestures and robustly identifies discrete cues in a musical performance. Continuous and discrete gestures are sent to an interactive music system featuring custom designed software that performs real-time spectral transformation of audio.
We define the research fields of experiential signal processing (ESP) and experiential telecommunications (ET), which are concerned with sensing, communicating, and presenting an Environment, Event, or Experience at a distance. We develop our vision of ESP and ET and describe key components and research fields. We highlight the challenges of presenting multichannel, multimedia information and present an example for panoramic video using the Allosphere, a 3-story sphere housed in an anechoic chamber, that has been constructed at UCSB.
ISBN: (print) 1595934472
Movement-based interactive dance has recently attracted great interest in the performing arts. Utilizing motion capture technology, this project aimed to design the real-time motion analysis engine, staging, and communication systems needed to complete a movement-based interactive multimedia dance performance. The movement analysis engine measured the correlation of dance movement between three people wearing similar sets of retro-reflective markers in a motion capture volume. This analysis provided the framework for the creation of an interactive dance piece, Lucidity, which will be described in detail. Staging such a work also presented additional challenges. These challenges and our proposed solutions will be discussed. We conclude with a description of the final work and a summary of our future research objectives. Copyright 2006 ACM.
In this paper, we present a learning and inference framework for 3D human pose recovery using silhouettes represented by Gaussian mixtures. A Bayesian mixture of experts is learnt to conduct multimodal pose regression...