We study the problem of recognizing sign language automatically using the RGB videos and skeleton coordinates captured by Kinect, which is of great significance in communication between the deaf and the hearing societ...
ISBN: 9781467388511 (print)
In many image-related tasks, learning expressive and discriminative representations of images is essential, and deep learning has been studied for automating the learning of such representations. Some user-centric tasks, such as image recommendation, call for effective representations of not only images but also the preferences and intents of users over images. Such representations are termed hybrid and are addressed via a deep learning approach in this paper. We design a dual-net deep network in which the two subnetworks map input images and user preferences into the same latent semantic space; the distances between images and users in that space are then calculated to make decisions. We further propose a comparative deep learning (CDL) method to train the deep network, using a pair of images compared against one user to learn the pattern of their relative distances. CDL exploits much more training data than naive deep learning and thus achieves better performance than the latter, at no cost in network complexity. Experimental results on real-world data sets for image recommendation show that the proposed dual-net network and CDL greatly outperform other state-of-the-art image recommendation solutions.
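The comparative idea described above can be illustrated with a minimal sketch. The abstract does not state the loss function, so the logistic loss on the distance gap used here is an assumption, as are the function names `euclidean` and `comparative_loss` and the toy 2-D embeddings; in the paper the embeddings would come from the two subnetworks.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points in the shared latent space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def comparative_loss(user, preferred, other):
    """Logistic loss on the distance gap (assumed form, not taken from the paper).

    The loss is small when the preferred image lies closer to the user
    than the other image, and grows as the ordering is violated.
    """
    gap = euclidean(user, preferred) - euclidean(user, other)
    # Numerically stable softplus: log(1 + exp(gap)).
    return max(gap, 0.0) + math.log1p(math.exp(-abs(gap)))


# Hypothetical embeddings for one user and two images.
user = [0.0, 0.0]
near_image = [1.0, 0.0]   # distance 1 from the user
far_image = [3.0, 4.0]    # distance 5 from the user

loss_correct_order = comparative_loss(user, near_image, far_image)
loss_wrong_order = comparative_loss(user, far_image, near_image)
```

Because each training example is a (user, image, image) comparison rather than a single (user, image) pair, the number of usable examples grows with the number of image pairs per user, which is one way to read the abstract's claim that CDL uses much more training data.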
Our research has revealed a hidden relationship among several basic components, which leads to the best target detection result. Further, we have proved that the matched filter (MF) is always superior to the constrain...
The explosive increase and ubiquitous accessibility of visual data on the Web have led to the prosperity of research activity in image search or retrieval. With the ignorance of visual content as a ranking clue, met...
With the advantage in compact representation and efficient comparison, binary hashing has been extensively investigated for approximate nearest neighbor search. In this paper, we propose a novel and general hashing fr...
Users’ download history is a primary data source for analyzing user interests. Recent work has shown that user interests are indeed time varying, and accurate profiling of user interest drifts requires the temporal d...
Inspired by recent advances in image super-resolution using convolutional neural networks (CNNs), we propose a CNN-based block up-sampling scheme for intra frame coding. A block can be down-sampled before being co...
Convolutional neural networks (CNNs) have achieved state-of-the-art results on many visual recognition tasks. However, current CNN models still exhibit a poor ability to be invariant to spatial transformations of image...
ISBN: 9781509053179 (print)
With the explosive growth in the number of mobile terminals, the demand for visual communication with mobility is increasing. However, traditional solutions for mobility over IP networks cannot always provide satisfactory visual communication. Named Data Networking (NDN), a new communication model that aims to replace the IP model, brings a different background to mobile visual communication problems. In this paper, we take advantage of the NDN model to realize seamless mobile visual communication. We introduce into the NDN mechanism a delegate with calculation functions and a globally unique identifier (GUID) that provides native identity indication. The GUID benefits real-time applications such as visual communication and works with the delegate to reduce unnecessary routing updates. We also specify the naming rule and design a FIB+ to support seamless mobile visual communication. To test the performance of our solution, we build a proof-of-concept prototype and run experiments on it. The experiments demonstrate that our solution can provide real-time video communication with a seamless mobility experience.
ISBN: 9781509015535 (print)
Recently, high-level pose features (HLPF) have been shown to be efficient for action recognition in joint-annotated tasks. However, the relative positions between pairs of joints in actual situations and the spatio-temporal information are not considered in constructing HLPF. To address these limitations, we propose a set of novel high-level pose features (NHLPF). Specifically, considering that the distances between adjacent pairs of joints usually remain unchanged, we propose a horizontally relative position feature and a vertically relative position feature. In addition, a joint inner product feature is proposed to encode the spatial information among each triplet of joints. To encode temporal information, we calculate the trajectories of the above-mentioned three types of features as corresponding trajectory features. Furthermore, to combine the spatial and temporal information, we present a joint energy change feature, which is designed using observations of the magnitude and direction of the force between joints. We evaluate our NHLPF on a benchmark dataset. The results show that NHLPF are superior features for action recognition.
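A rough sketch of the pairwise, triplet, and trajectory features described above may help fix ideas. The abstract gives no formulas, so everything below is an assumption: joints are taken as 2-D `(x, y)` tuples, the relative position features are plain coordinate offsets, the triplet feature is the inner product of the two vectors leaving the middle joint, and a trajectory is a frame-to-frame difference. The function names are hypothetical, and the joint energy change feature is not covered.

```python
from itertools import combinations


def relative_positions(joints):
    """Horizontal and vertical offsets for every pair of joints in one frame."""
    horizontal, vertical = [], []
    for (xa, ya), (xb, yb) in combinations(joints, 2):
        horizontal.append(xb - xa)
        vertical.append(yb - ya)
    return horizontal, vertical


def triplet_inner_products(joints):
    """Inner product of the vectors b->a and b->c for every joint triplet (a, b, c)."""
    features = []
    for a, b, c in combinations(joints, 3):
        v1 = (a[0] - b[0], a[1] - b[1])
        v2 = (c[0] - b[0], c[1] - b[1])
        features.append(v1[0] * v2[0] + v1[1] * v2[1])
    return features


def trajectory(previous, current):
    """Frame-to-frame change of a feature vector (assumed trajectory definition)."""
    return [c - p for p, c in zip(previous, current)]


# Two hypothetical frames with three joints each.
frame0 = [(0, 0), (1, 0), (1, 2)]
frame1 = [(0, 0), (2, 0), (2, 2)]

h0, v0 = relative_positions(frame0)
h1, v1 = relative_positions(frame1)
h_traj = trajectory(h0, h1)
ip0 = triplet_inner_products(frame0)
```

In this toy example the second and third joints shift one unit to the right between frames, so only the horizontal offsets involving the first joint change, while the vertical offsets stay the same.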