Search results - Inner Mongolia University Library

17th Pacific-Rim Conference on Multimedia, PCM 2016

Authors: Pu, Junfu; Zhou, Wengang; Li, Houqiang

Affiliation: CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230027, China

ISBN (print): 9783319488950

We study the problem of automatically recognizing sign language from the RGB videos and skeleton coordinates captured by Kinect, which is of great significance for communication between the deaf and hearing communities. In this paper, we propose a sign language recognition (SLR) system that uses two channels of data: the gesture videos of the sign words and the joint trajectories. In our framework, we extract features of two modalities to represent the hand shape videos and hand trajectories for recognition. The variation of gesture is captured by a 3D CNN, and the activations of its fully connected layers are used as the representations of the sign videos. For trajectories, we use the shape context to describe each joint and combine the descriptors into a feature matrix. A convolutional neural network is then applied to generate a robust representation of these trajectories. Finally, we fuse these features and train an SVM classifier for recognition. We conduct experiments on a large-vocabulary sign language dataset with up to 500 words, and the results demonstrate the effectiveness of the proposed method. © Springer International Publishing AG 2016.

Keywords: Trajectories

Comparative deep learning of hybrid representations for image recommendations

2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016

Authors: Lei, Chenyi; Liu, Dong; Li, Weiping; Zha, Zheng-Jun; Li, Houqiang

Affiliation: CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei 230027, China

ISBN (print): 9781467388511

In many image-related tasks, learning expressive and discriminative representations of images is essential, and deep learning has been studied for automating the learning of such representations. Some user-centric tasks, such as image recommendation, call for effective representations of not only images but also the preferences and intents of users over images. Such representations are termed hybrid and are addressed via a deep learning approach in this paper. We design a dual-net deep network in which the two subnetworks map input images and user preferences into the same latent semantic space; the distances between images and users in the latent space are then calculated to make decisions. We further propose a comparative deep learning (CDL) method to train the deep network, using a pair of images compared against one user to learn the pattern of their relative distances. CDL exploits much more training data than naive deep learning and thus achieves superior performance, at no cost in network complexity. Experimental results on real-world data sets for image recommendation show that the proposed dual-net network and CDL greatly outperform other state-of-the-art image recommendation solutions.
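
The comparison of one user against a pair of images can be illustrated with a small sketch. The abstract does not state the loss function, so the hinge form, the `margin` parameter, and the function name `comparative_loss` below are assumptions made for illustration only; the vectors stand in for the outputs of the two subnetworks.

```python
import numpy as np

def comparative_loss(user, img_pos, img_neg, margin=1.0):
    """Hinge loss on relative distances in a shared latent space (assumed form)."""
    d_pos = np.linalg.norm(user - img_pos)  # distance from user to the preferred image
    d_neg = np.linalg.norm(user - img_neg)  # distance from user to the other image
    # The loss is zero once the preferred image is closer by at least `margin`.
    return max(0.0, margin + d_pos - d_neg)

# Hypothetical 2D latent vectors for one user and two images.
user = np.array([0.0, 0.0])
liked = np.array([1.0, 0.0])
disliked = np.array([3.0, 0.0])

print(comparative_loss(user, liked, disliked))   # correct ordering
print(comparative_loss(user, disliked, liked))   # wrong ordering is penalized
```

Because every (user, preferred image, other image) combination yields one training example, a scheme of this kind can draw far more examples from the same interaction log than one that labels single images.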

Keywords: Pattern recognition

The basic equation for target detection in remote sensing

arXiv, 2017

Authors: Geng, Xiurui; Ji, Luyan; Zhao, Yongchao

Affiliations: Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, Institute of Electronics, Chinese Academy of Sciences, Beijing 100190, China; School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, 100049, China; Ministry of Education Key Laboratory for Earth System Modeling, Department of Earth System Science, Tsinghua University

Our research has revealed a hidden relationship among several basic components, which leads to the best target detection result. Further, we have proved that the matched filter (MF) is always superior to the constrained energy minimization (CEM) operator, although the two were originally regarded as equally important in the field of target detection for remotely sensed images. Copyright © 2017, The Authors. All rights reserved.
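
For readers unfamiliar with the two operators, the sketch below computes both in their commonly cited textbook forms; these formulas are not taken from this record, and the synthetic data, the target spectrum `d`, and the function names are hypothetical. CEM uses the sample autocorrelation matrix, while the MF uses the mean-removed covariance matrix.

```python
import numpy as np

def cem_filter(X, d):
    """CEM: w = R^{-1} d / (d^T R^{-1} d), with R the sample autocorrelation matrix."""
    R = X.T @ X / X.shape[0]
    Rinv_d = np.linalg.solve(R, d)
    return Rinv_d / (d @ Rinv_d)

def mf_filter(X, d):
    """MF: w = K^{-1}(d - m) / ((d - m)^T K^{-1} (d - m)), with K the sample covariance."""
    m = X.mean(axis=0)
    K = np.cov(X, rowvar=False)
    Kinv_dm = np.linalg.solve(K, d - m)
    return Kinv_dm / ((d - m) @ Kinv_dm), m

rng = np.random.default_rng(0)
X = rng.normal(loc=1.0, size=(500, 4))   # synthetic pixels: 500 samples, 4 bands
d = np.array([2.0, 1.5, 3.0, 2.5])       # hypothetical target spectrum

w_cem = cem_filter(X, d)
w_mf, m = mf_filter(X, d)

# Both filters are normalized to give a unit response to the target.
print(w_cem @ d)
print(w_mf @ (d - m))
```

The sketch only shows how the two detectors are constructed; it does not reproduce the comparison that the abstract reports.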

Keywords: Remote sensing

Recent Advance in Content-based Image Retrieval: A literature survey

arXiv, 2017

Authors: Zhou, Wengang; Li, Houqiang; Tian, Qi (Fellow, IEEE)

Affiliations: CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230027, China; Department of Computer Science, University of Texas at San Antonio, San Antonio, TX 78249, United States

The explosive increase and ubiquitous accessibility of visual data on the Web have led to a surge of research activity in image search and retrieval. Because they ignore visual content as a ranking clue, methods that apply text search techniques to visual retrieval may suffer from inconsistency between the text words and the visual content. Content-based image retrieval (CBIR), which uses representations of visual content to identify relevant images, has attracted sustained attention over the past two decades. The problem is challenging because of the intention gap and the semantic gap. Numerous techniques have been developed for content-based image retrieval in the last decade. The purpose of this paper is to categorize and evaluate the algorithms proposed from 2003 to 2016. We conclude with several promising directions for future research. Copyright © 2017, The Authors. All rights reserved.

Keywords: Content based retrieval

Linear distance preserving pseudo-supervised and unsupervised hashing

24th ACM Multimedia Conference, MM 2016

Authors: Wang, Min; Zhou, Wengang; Tian, Qi; Zha, Zhengjun; Li, Houqiang

Affiliations: CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, China; Computer Science Department, University of Texas at San Antonio, United States

ISBN (print): 9781450336031

With its advantages in compact representation and efficient comparison, binary hashing has been extensively investigated for approximate nearest neighbor search. In this paper, we propose a novel and general hashing framework that simultaneously considers a new linear pair-wise distance preserving objective and a point-wise constraint. The direct distance preserving objective aims to keep a linear relationship between the Euclidean distance and the Hamming distance of data points. Based on different point-wise constraints, we propose two methods to instantiate this framework. The first is a pseudo-supervised hashing method, which uses existing unsupervised hashing methods to generate binary codes as pseudo-supervised information. The second is an unsupervised hashing method, in which quantization loss is considered. We validate our framework on two large-scale datasets. The experiments demonstrate that our pseudo-supervised method achieves consistent improvement over state-of-the-art unsupervised hashing methods, while our unsupervised method outperforms the state-of-the-art methods. © 2016 ACM.
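
The idea of a linear relationship between Euclidean and Hamming distance can be made concrete with a small measurement sketch. It does not implement the proposed framework: the binary codes come from sign-thresholded random projections, a stand-in hash function chosen here for brevity, and the data, dimensions, and residual measure are assumptions.

```python
import numpy as np

def hamming(a, b):
    """Number of differing bits between two 0/1 code vectors."""
    return int(np.count_nonzero(a != b))

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 16))            # synthetic data points
W = rng.normal(size=(16, 32))            # random projection (stand-in hash function)
codes = (X @ W > 0).astype(np.uint8)     # 32-bit binary codes

# Collect Euclidean and Hamming distances over all pairs of points.
pairs = [(i, j) for i in range(len(X)) for j in range(i + 1, len(X))]
d_euc = np.array([np.linalg.norm(X[i] - X[j]) for i, j in pairs])
d_ham = np.array([hamming(codes[i], codes[j]) for i, j in pairs])

# Least-squares line d_ham ~ a * d_euc + b; the mean squared residual
# indicates how far the codes are from an exactly linear relationship.
a, b = np.polyfit(d_euc, d_ham, 1)
residual = np.mean((d_ham - (a * d_euc + b)) ** 2)
print(a, b, residual)
```

A hashing method with a linear distance preserving objective would aim to learn codes for which a residual of this kind is small, rather than merely measuring it afterwards.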

Keywords: Hamming distance

Visual analyses of music download history: User studies

22nd International Conference on MultiMedia Modeling, MMM 2016

Authors: Liu, Dong; Zhang, Jingxian

Affiliations: CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China; Department of Computer Science, University of Illinois at Urbana-Champaign, Champaign, IL, United States

ISBN (print): 9783319276700

Users’ download history is a primary data source for analyzing user interests. Recent work has shown that user interests are indeed time-varying, and that accurately profiling user interest drifts requires temporal dynamic analysis. We have proposed a visualization approach to analyzing user interest drifts from the download history, taking music as an example, and studied how to depict the underlying relevance among the downloaded music items to identify the drifts. We designed three new kinds of plots to display the music download history of one user, namely the Bean plot, the Transitional Pie plot, and the Instrument plot. In this paper, we report user studies in which ordinary users were asked to visually analyze the download history of other users in a given real-world data set. The user studies follow a learning-practice-test workflow. The results demonstrate the feasibility of our visualization design. © Springer International Publishing Switzerland 2016.

Keywords: Visualization

Convolutional neural network-based block up-sampling for intra frame coding

arXiv, 2017

Authors: Li, Yue; Liu, Dong; Li, Houqiang; Li, Li; Wu, Feng; Zhang, Hong; Yang, Haitao

Affiliations: CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei 230027, China; University of Missouri-Kansas City, 5100 Rockhill Road, Kansas City, MO 64111, United States; Media Technology Laboratory, Central Research Institute of Huawei Technologies Co. Ltd, Shenzhen 518129, China

Inspired by recent advances in image super-resolution using convolutional neural networks (CNNs), we propose a CNN-based block up-sampling scheme for intra frame coding. A block can be down-sampled before being compressed by normal intra coding, and then up-sampled to its original resolution. Unlike previous studies on down/up-sampling-based coding, the up-sampling methods in our scheme are designed by training a CNN rather than being hand-crafted. We explore a new CNN structure for up-sampling, which features deconvolution of feature maps, multi-scale fusion, and residue learning, making the network both compact and efficient. We also design separate networks for the up-sampling of luma and chroma components, where the chroma up-sampling CNN uses the luma information to boost its performance. In addition, we design a two-stage up-sampling process, the first stage being within the block-by-block coding loop and the second stage being performed on the entire frame so as to refine block boundaries. We also empirically study how to set the coding parameters of down-sampled blocks to pursue frame-level rate-distortion optimization. Our proposed scheme is implemented in the High Efficiency Video Coding (HEVC) reference software, and a comprehensive set of experiments has been performed to evaluate our methods. Experimental results show that our scheme achieves significant bit savings compared with the HEVC anchor, especially at low bit rates, leading to an average 5.5% BD-rate reduction on common test sequences and an average 9.0% BD-rate reduction on ultra high definition (UHD) test sequences. Copyright © 2017, The Authors. All rights reserved.

Keywords: Signal sampling

Transform-invariant convolutional neural networks for image classification and search

24th ACM Multimedia Conference, MM 2016

Authors: Shen, Xu; Tian, Xinmei; He, Anfeng; Sun, Shaoyan; Tao, Dacheng

Affiliations: CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China, Hefei, Anhui 230027, China; Centre for Quantum Computation and Intelligent Systems, Faculty of Engineering and Information Technology, University of Technology Sydney, NSW 2007, Australia

ISBN (print): 9781450336031

Convolutional neural networks (CNNs) have achieved state-of-the-art results on many visual recognition tasks. However, current CNN models still exhibit a poor ability to be invariant to spatial transformations of images. Intuitively, with sufficient layers and parameters, hierarchical combinations of convolution (matrix multiplication and nonlinear activation) and pooling operations should be able to learn a robust mapping from transformed input images to transform-invariant representations. In this paper, we propose randomly transforming (rotation, scale, and translation) feature maps of CNNs during the training stage. This prevents complex dependencies on specific rotation, scale, and translation levels of training images in CNN models. Rather, each convolutional kernel learns to detect a feature that is generally helpful for producing the transform-invariant answer given the combinatorially large variety of transform levels of its input feature maps. In this way, we do not require any extra training supervision or modification to the optimization process and training images. We show that random transformation provides significant improvements of CNNs on many benchmark tasks, including small-scale image recognition, large-scale image recognition, and image retrieval. © 2016 ACM.
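
The training-time step described in the abstract can be pictured with a deliberately simplified sketch. It is not the authors' implementation: only right-angle rotations and circular shifts are used, scaling is omitted, and the function name `random_transform`, the `max_shift` parameter, and the `training` flag are hypothetical.

```python
import numpy as np

def random_transform(fmap, rng, max_shift=2, training=True):
    """Randomly rotate (multiples of 90 degrees) and shift a square (C, H, W) feature map."""
    if not training:
        return fmap                                   # no transformation outside training
    k = int(rng.integers(0, 4))                       # number of 90-degree rotations
    out = np.rot90(fmap, k=k, axes=(1, 2))            # rotate in the spatial plane
    dy = int(rng.integers(-max_shift, max_shift + 1))
    dx = int(rng.integers(-max_shift, max_shift + 1))
    return np.roll(out, shift=(dy, dx), axis=(1, 2))  # circular translation

rng = np.random.default_rng(0)
fmap = rng.normal(size=(3, 8, 8))                     # hypothetical feature map
out = random_transform(fmap, rng)
print(out.shape)
```

In a network, a step like this would sit between layers so that later kernels see the same features at varying positions and orientations; a full version would use interpolated affine warps to cover arbitrary angles and scales.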

Keywords: Mathematical transformations

GUID-based mobile visual communication using NDN mechanism

IEEE Visual Communications and Image Processing (VCIP)

Authors: Yuanzun Zhang; Xiaobin Tan; Hao Liu; Weiping Li

Affiliation: CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System, University of Science and Technology of China

ISBN (print): 9781509053179

With the explosive growth in the number of mobile terminals, the demand for visual communication with mobility is increasing. However, traditional solutions for mobility over IP networks cannot always deliver satisfactory visual communication. Named Data Networking (NDN), a new communication model that aims to replace the IP model, brings a different background to mobile visual communication problems. In this paper, we take advantage of the NDN model to realize seamless mobile visual communication. We introduce into the NDN mechanism a delegate with calculation functions and a globally unique identifier (GUID) that provides native identity indication. The use of the GUID benefits real-time applications such as visual communication and works with the delegate to reduce unnecessary routing updates. We also specify the naming rule and design a FIB+ to support seamless mobile visual communication. To test the performance of our solution, we build a proof-of-concept prototype and run experiments on it. The experiments demonstrate that our solution can provide real-time video communication with a seamless mobility experience.

Keywords: Routing; Mobile communication; Visual communication; IP networks; Artificial neural networks; Mobile computing; Real-time systems

Action recognition with novel high-level pose features

IEEE International Conference on Multimedia and Expo Workshops (ICMEW)

Authors: Jiayi Fan; Zhengjun Zha; Xinmei Tian

Affiliation: University of Science and Technology of China, CAS Key Laboratory of Technology in Geo-spatial Information Processing and Application System

ISBN (print): 9781509015535

Recently, high-level pose features (HLPF) have been shown to be efficient for action recognition in joint-annotated tasks. However, the relative positions between pairs of joints in actual situations and the spatio-temporal information are not considered in constructing HLPF. To tackle these problems, we propose a set of novel high-level pose features (NHLPF). Specifically, considering that the distances between adjacent pairs of joints usually remain unchanged, we propose a horizontally relative position feature and a vertically relative position feature. In addition, a joint inner product feature is proposed to encode the spatial information among each triplet of joints. To encode temporal information, we calculate the trajectories of the above-mentioned three types of features as corresponding trajectory features. Furthermore, to combine the spatial and temporal information, we present a joint energy change feature, which is designed using observations of the magnitude and direction of the force between joints. We evaluate our NHLPF on a benchmark dataset. The results show that NHLPF are superior features for action recognition.
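
As a rough illustration of the spatial features named in the abstract, the sketch below computes horizontal and vertical offsets for a pair of joints and an inner product for a triplet of joints. The exact feature definitions are not given in this record, so the formulas, function names, and the three example joints are assumptions.

```python
import numpy as np

def relative_position(joints, i, j):
    """Horizontal and vertical offsets of joint j relative to joint i."""
    dx, dy = joints[j] - joints[i]
    return float(dx), float(dy)

def triplet_inner_product(joints, i, j, k):
    """Inner product of the two vectors that point from joint j to joints i and k."""
    return float(np.dot(joints[i] - joints[j], joints[k] - joints[j]))

# Hypothetical 2D joint coordinates: shoulder, elbow, wrist.
joints = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 2.0]])

print(relative_position(joints, 0, 2))         # shoulder to wrist offsets
print(triplet_inner_product(joints, 0, 1, 2))  # zero when the arm bends at a right angle
```

Evaluating the same functions on every frame of a sequence and differencing consecutive values would give a simple trajectory version of each feature.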

Keywords: Trajectory; Pose estimation; Detectors; Joining processes; Hip; Force; Three-dimensional displays
