Feature representation is fundamental and attracts much attention in few-shot learning. Convolutional neural networks (CNNs) are among the best feature extractors so far in this field, which are successfully combined ...
详细信息
Real-time point cloud processing is fundamental for lots of computer vision tasks, while still challenged by the computational problem on resource-limited edge devices. To address this issue, we implement XNOR-Net-bas...
详细信息
To represent a text as a bag of properly identified "phrases" and use the representation for processing the text is proved to be useful. The key question here is how to identify the phrases and represent the...
详细信息
Recently, context reasoning using image regions beyond local convolution has shown great potential for scene parsing. In this work, we explore how to incorperate the linguistic knowledge to promote context reasoning o...
详细信息
Background Gesture recognition has attracted significant attention because of its wide range of potential *** multi-modal gesture recognition has made significant progress in recent years,a popular method still is sim...
详细信息
Background Gesture recognition has attracted significant attention because of its wide range of potential *** multi-modal gesture recognition has made significant progress in recent years,a popular method still is simply fusing prediction scores at the end of each branch,which often ignores complementary features among different modalities in the early stage and does not fuse the complementary features into a more discriminative *** This paper proposes an Adaptive Cross-modal Weighting(ACmW)scheme to exploit complementarity features from RGB-D data in this *** scheme learns relations among different modalities by combining the features of different data *** proposed ACmW module contains two key functions:(1)fusing complementary features from multiple streams through an adaptive one-dimensional convolution;and(2)modeling the correlation of multi-stream complementary features in the time *** the effective combination of these two functional modules,the proposed ACmW can automatically analyze the relationship between the complementary features from different streams,and can fuse them in the spatial and temporal *** Extensive experiments validate the effectiveness of the proposed method,and show that our method outperforms state-of-the-art methods on IsoGD and NVGesture.
We build a virtual agent for learning language in a 2D maze-like world. The agent sees images of the surrounding environment, listens to a virtual teacher, and takes actions to receive rewards. It interactively learns...
详细信息
3D object detection from a single image is an important task in Autonomous Driving (AD), where various approaches have been proposed. However, the task is intrinsically ambiguous and challenging as single image depth ...
详细信息
The clustering of autonomous driving scenario data can substantially benefit the autonomous driving validation and simulation systems by improving the simulation tests' completeness and fidelity. This article prop...
详细信息
Face image animation from a single image has achieved remarkable progress. However, it remains challenging when only sparse landmarks are available as the driving signal. Given a source face image and a sequence of sp...
详细信息
Recently, it has attracted more and more attentions to fuse multi-scale features for semantic image segmentation. Various works were proposed to employ progressive local or global fusion, but the feature fusions are n...
详细信息
暂无评论