Details
ISBN: (Print) 9789819988495; 9789819988501
Although existing image-based methods for 3D human mesh reconstruction have achieved remarkable accuracy, effectively capturing smooth human motion from monocular video remains a significant challenge. Recent video-based methods for human mesh reconstruction tend to build increasingly complex networks to capture the temporal information of human motion, resulting in a large number of parameters and limiting their practical applications. To address this issue, we propose an Efficient Graph Transformer network to Reconstruct 3D human mesh from monocular video, named EGTR. Specifically, we present a temporal redundancy removal module that uses 1D convolution to eliminate redundant information among video frames, and a spatial-temporal fusion module that combines Modulated GCN with a transformer framework to capture human motion. Our method achieves better accuracy than the state-of-the-art video-based method TCMR on the 3DPW, Human3.6M, and MPI-INF-3DHP datasets while using only 8.7% of its parameters, indicating the effectiveness of our method for practical applications.
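The abstract names the two modules but gives no implementation details. Below is a minimal PyTorch sketch of how a strided 1D-convolution redundancy-removal stage and a Modulated-GCN-plus-transformer fusion block could be arranged; all class names, tensor shapes, and layer choices (the feature width, the chain adjacency, the stride of 2) are illustrative assumptions, not the authors' EGTR code.

import torch
import torch.nn as nn

class TemporalRedundancyRemoval(nn.Module):
    """Illustrative sketch: a strided 1D convolution over the time axis
    merges neighbouring frame features, shrinking T tokens to about T/2."""
    def __init__(self, dim):
        super().__init__()
        self.conv = nn.Conv1d(dim, dim, kernel_size=3, stride=2, padding=1)

    def forward(self, x):            # x: (B, T, C) per-frame features
        x = x.transpose(1, 2)        # (B, C, T) for Conv1d
        x = self.conv(x)             # (B, C, T') with T' ~ T/2
        return x.transpose(1, 2)     # (B, T', C)

class ModulatedGCN(nn.Module):
    """Modulated graph convolution in the spirit of Zou & Tang (2021):
    a per-joint modulation vector rescales a shared projection, and a
    learnable offset relaxes the fixed skeleton adjacency."""
    def __init__(self, dim, num_joints, adj):
        super().__init__()
        self.proj = nn.Linear(dim, dim, bias=False)
        self.weight_mod = nn.Parameter(torch.ones(num_joints, dim))
        self.register_buffer("adj", adj)                  # (J, J) skeleton graph
        self.adj_mod = nn.Parameter(torch.zeros_like(adj))

    def forward(self, x):            # x: (B, J, C) per-joint features
        h = self.proj(x) * self.weight_mod
        a = self.adj + self.adj_mod  # modulated adjacency
        return torch.einsum("jk,bkc->bjc", a, h)

class SpatialTemporalFusion(nn.Module):
    """One possible fusion arrangement (an assumption): a GCN over joints
    (spatial) followed by a transformer encoder layer over frames (temporal)."""
    def __init__(self, dim, num_joints, adj, nhead=8):
        super().__init__()
        self.spatial = ModulatedGCN(dim, num_joints, adj)
        self.temporal = nn.TransformerEncoderLayer(
            d_model=dim, nhead=nhead, batch_first=True)

    def forward(self, x):            # x: (B, T, J, C)
        b, t, j, c = x.shape
        x = self.spatial(x.reshape(b * t, j, c)).reshape(b, t, j, c)
        x = x.permute(0, 2, 1, 3).reshape(b * j, t, c)    # attend over time
        x = self.temporal(x)
        return x.reshape(b, j, t, c).permute(0, 2, 1, 3)

# Toy usage: 16-frame clip, 24 joints, 64-dim features, chain adjacency.
adj = torch.eye(24) + torch.diag(torch.ones(23), 1) + torch.diag(torch.ones(23), -1)
frames = torch.randn(2, 16, 64)                      # (B, T, C) frame features
reduced = TemporalRedundancyRemoval(64)(frames)      # -> (2, 8, 64)
joints = torch.randn(2, 8, 24, 64)                   # (B, T', J, C) joint features
fused = SpatialTemporalFusion(64, 24, adj)(joints)   # -> (2, 8, 24, 64)

Halving the temporal tokens before any attention is what would plausibly keep the parameter and compute budget small relative to heavier recurrent or attention-only video models, consistent with the 8.7% parameter figure claimed above.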