Generative adversarial networks (GANs) have shown remarkable success in generating realistic data from a predefined prior distribution (e.g., Gaussian noise). However, such a prior distribution is often independent of the real data and thus may lose semantic information (e.g., the geometric structure or content of images). In practice, the semantic information might be represented by a latent distribution learned from the data. However, such a latent distribution may make data sampling difficult for GAN methods. In this paper, rather than sampling from the predefined prior distribution, we propose a GAN model with local coordinate coding (LCC), termed LCCGAN, to improve the performance of image generation. First, we propose an LCC sampling method in LCCGAN to sample meaningful points from the latent manifold. With the LCC sampling method, we can explicitly exploit the local information on the latent manifold and thus produce new data of promising quality. Second, we propose an improved version, namely LCCGAN++, by introducing a higher-order term into the generator approximation. This term achieves a better approximation and thus further improves the performance. More critically, we derive generalization bounds for both LCCGAN and LCCGAN++ and prove that a low-dimensional input is sufficient to achieve good generalization performance. Extensive experiments on several benchmark datasets demonstrate the superiority of the proposed method over existing GAN methods.
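The LCC sampling idea above can be sketched in a few lines: a latent point is represented as an affine combination of its nearest anchor points on the latent manifold, and new points are sampled as convex combinations of anchors. The code below is an illustrative toy, not the paper's implementation; the function names, the choice of k, and the use of a Dirichlet draw for convex weights are all assumptions.

```python
import numpy as np

def lcc_code(x, bases, k=3):
    """Local coordinate coding: approximate x as an affine combination
    of its k nearest basis points (anchors on the latent manifold)."""
    d = np.linalg.norm(bases - x, axis=1)
    idx = np.argsort(d)[:k]            # k nearest anchors
    B = bases[idx]                     # (k, dim)
    # solve min ||x - w @ B||^2 subject to sum(w) = 1
    A = np.vstack([B.T, np.ones(k)])   # (dim + 1, k)
    b = np.append(x, 1.0)
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    coeffs = np.zeros(len(bases))
    coeffs[idx] = w
    return coeffs

def lcc_sample(bases, rng, k=3):
    """Sample a new latent point as a random convex combination of
    k anchors, so it stays close to the latent manifold."""
    idx = rng.choice(len(bases), size=k, replace=False)
    w = rng.dirichlet(np.ones(k))      # convex weights
    return w @ bases[idx]

rng = np.random.default_rng(0)
bases = rng.normal(size=(10, 2))       # toy anchor set in a 2-D latent space
z = lcc_sample(bases, rng)
c = lcc_code(z, bases)
```

In a full pipeline, `z` (or its code `c` mapped through the anchor set) would be fed to the generator instead of raw Gaussian noise.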
The dictionary in local coordinate coding (LCC) is important for approximating a non-linear function with linear ones. Optimizing a dictionary from predefined coding schemes is a challenging task. This paper focuses on learning dictionaries from two locality coding adaptors (LCAs), i.e., the locality Gaussian adaptor (GA) and the locality Euclidean adaptor (EA), for large-scale and high-dimensional datasets. Online dictionary learning is formulated as two alternating steps: local coding and dictionary updating. Both steps scale gracefully to datasets with millions of samples. Experiments on different applications demonstrate that our method leads to faster dictionary learning than classical or state-of-the-art methods. (C) 2015 Elsevier B.V. All rights reserved.
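The two alternating steps can be sketched as an online loop over samples: code each sample on its nearest atoms, then nudge those atoms toward the residual. This is a minimal sketch under assumed choices (inverse-distance weights, a plain SGD update); the paper's GA/EA adaptors are omitted.

```python
import numpy as np

def online_lcc_dict(X, n_atoms=4, k=2, lr=0.1, epochs=5, seed=0):
    """Toy online dictionary learning with two alternating steps per sample:
    (1) local coding: weight the k nearest atoms by inverse distance;
    (2) dictionary updating: move those atoms toward the residual."""
    rng = np.random.default_rng(seed)
    D = X[rng.choice(len(X), n_atoms, replace=False)].copy()  # init from data
    for _ in range(epochs):
        for x in X:
            dist = np.linalg.norm(D - x, axis=1)
            idx = np.argsort(dist)[:k]
            w = 1.0 / (dist[idx] + 1e-8)      # local coding step
            w /= w.sum()
            recon = w @ D[idx]
            D[idx] += lr * np.outer(w, x - recon)  # dictionary update step
    return D

rng = np.random.default_rng(3)
# two well-separated clusters; atoms should stay near the data
X = np.vstack([rng.normal(0, 0.1, (30, 2)), rng.normal(5, 0.1, (30, 2))])
D = online_lcc_dict(X)
```

Because each update touches only the k atoms used by the current sample, the cost per sample is independent of the dataset size, which is what makes the online formulation scale.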
Authors:
Xiao, Wei; Liu, Hong; Tang, Hao; Liu, Huaping
Peking Univ, Engn Lab Intelligent Percept Internet Things (ELIP), Key Lab Machine Percept, Shenzhen Grad Sch, Beijing 100871, Peoples R China
Tsinghua Univ, State Key Lab Intelligent Technol & Syst, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
ISBN:
(digital) 9783662485583
ISBN:
(print) 9783662485583; 9783662485576
Extracting informative regularized representations of input signals plays a key role in artificial intelligence, including machine learning and robotics. Traditional approaches feature l(2)-norm and sparsity-inducing l(p)-norm (0 <= p <= 1) optimization methods, imposing strict regularization on the representations. However, these approaches overlook the fact that the signals and the atoms of overcomplete dictionaries usually contain a wealth of structural information that could improve the representations. This paper systematically exploits the geometric structure of the data manifold on which signals and atoms reside, and thus presents a principled extension of sparse coding, i.e., two-layer local coordinate coding, which demonstrates that a high-dimensional nonlinear function can be locally approximated by a global linear function with quadratic approximation power. Moreover, to learn each latent layer, corresponding patterned optimization approaches are developed, encoding distance information between signals and atoms into the representations. Experimental results demonstrate the significance of this extension in improving image classification performance, and its potential applications to object recognition in robot systems are also explored.
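The two-layer structure can be sketched as coding a signal on a first-layer dictionary, then coding the active first-layer atoms on a second-layer dictionary and concatenating both codes. This is a toy illustration under assumptions (inverse-distance weights, simple concatenation), not the paper's patterned optimization.

```python
import numpy as np

def knn_code(x, atoms, k):
    """One LCC layer: inverse-distance weights on the k nearest atoms."""
    dist = np.linalg.norm(atoms - x, axis=1)
    idx = np.argsort(dist)[:k]
    w = 1.0 / (dist[idx] + 1e-8)
    w /= w.sum()
    code = np.zeros(len(atoms))
    code[idx] = w
    return code

def two_layer_lcc(x, atoms1, atoms2, k1=3, k2=2):
    """Two-layer LCC: code x on layer-1 atoms, then code each active
    layer-1 atom on layer-2 atoms; concatenate both layers' codes."""
    c1 = knn_code(x, atoms1, k1)
    active = np.flatnonzero(c1)
    # layer 2: c1-weighted sum of the active atoms' own codes
    c2 = sum(c1[i] * knn_code(atoms1[i], atoms2, k2) for i in active)
    return np.concatenate([c1, c2])

rng = np.random.default_rng(1)
atoms1 = rng.normal(size=(6, 3))   # first-layer dictionary
atoms2 = rng.normal(size=(4, 3))   # second-layer dictionary
rep = two_layer_lcc(rng.normal(size=3), atoms1, atoms2)
```

The second layer encodes where the used atoms themselves sit on the manifold, which is the source of the quadratic approximation power claimed above.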
In the 3D facial animation and synthesis community, input faces are usually required to be labeled with a set of landmarks for parameterization. Because of variations in pose, expression, and resolution, automatic 3D face landmark localization remains a challenge. In this paper, a novel landmark localization approach is presented. The approach is based on local coordinate coding (LCC) and consists of two stages. In the first stage, we perform nose detection, relying on the fact that the nose shape is usually invariant under variations in pose, expression, and resolution. Then, we use the iterative closest points algorithm to find a 3D affine transformation that aligns the input face to a reference face. In the second stage, we perform resampling to build correspondences between the input 3D face and the training faces. Then, an LCC-based localization algorithm is proposed to obtain the positions of the landmarks in the input face. Experimental results show that the proposed method is comparable to state-of-the-art methods in terms of its robustness, flexibility, and accuracy.
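The second stage can be sketched as a nearest-neighbor regression: code the aligned input face on its nearest training faces, then transfer the same weights to those faces' landmark sets. This toy assumes alignment and resampling are already done, and the inverse-distance weighting stands in for the paper's LCC solver.

```python
import numpy as np

def lcc_landmarks(face, train_faces, train_landmarks, k=3):
    """Stage-2 sketch: LCC-style weights over the k nearest training faces,
    transferred to their landmark coordinates."""
    dist = np.linalg.norm(train_faces - face, axis=1)
    idx = np.argsort(dist)[:k]
    w = 1.0 / (dist[idx] + 1e-8)
    w /= w.sum()
    # weighted combination of the neighbors' landmark sets
    return np.tensordot(w, train_landmarks[idx], axes=1)   # (n_landmarks, 3)

rng = np.random.default_rng(4)
train_faces = rng.normal(size=(20, 30))        # 20 faces, 30-dim features
train_landmarks = rng.normal(size=(20, 5, 3))  # 5 landmarks each, in 3-D
pred = lcc_landmarks(train_faces[7], train_faces, train_landmarks)
```

When the query coincides with a training face, the weights concentrate on that face and the prediction reduces to its known landmarks, which is a useful sanity check.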
Human activity recognition (HAR) is the task of automatically analyzing and recognizing human body gestures or actions. HAR using time-series multi-modal sensory data is a challenging and important task in machine learning and feature engineering due to its increasing demand in numerous real-world applications such as healthcare, sports, and surveillance. Numerous everyday wearable devices, e.g., smartphones, smartwatches, and smart glasses, can be used to collect and analyze human activities on an unprecedented scale. This paper presents a generic framework to recognize different human activities using the continuous time-series multimodal sensory data of these smart gadgets. The proposed framework follows the Bag-of-Features pipeline, which consists of four steps: (i) data acquisition and pre-processing, (ii) codebook computation, (iii) feature encoding, and (iv) classification. Each step plays a significant role in generating an appropriate feature representation of the raw sensory data for efficient activity recognition. In the first step, we employ a simple overlapped-window sampling approach to segment the continuous time-series sensory data and make it suitable for activity recognition. Secondly, we build a codebook using the k-means clustering algorithm to group similar sub-sequences; the center of each group is known as a codeword, and we assume that it represents a specific movement in the activity sequence. The third step is feature encoding, which transforms the raw sensory data of an activity sequence into a high-level representation for classification. Specifically, we present three reconstruction-based encoding techniques: sparse coding, local coordinate coding, and locality-constrained linear coding. The segmented activity sub-sequences are transformed into high-level representations using these techniques and the previously computed codebook. Finally, the encoded features are classified u…
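The first three steps of the pipeline can be sketched end to end on a toy signal: overlapped-window segmentation, a small k-means codebook, and (as the simplest stand-in for the three reconstruction-based encoders) a hard-assignment histogram over codewords. Window sizes, k, and the hard encoder are illustrative assumptions.

```python
import numpy as np

def sliding_windows(series, width, step):
    """Step (i): segment a continuous series with overlapped windows."""
    return np.array([series[s:s + width]
                     for s in range(0, len(series) - width + 1, step)])

def kmeans(X, k, iters=20, seed=0):
    """Step (ii): tiny k-means to build the codebook of sub-sequences."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), k, replace=False)].copy()
    for _ in range(iters):
        lab = np.argmin(((X[:, None] - C[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(lab == j):
                C[j] = X[lab == j].mean(axis=0)
    return C

def encode(X, C):
    """Step (iii), simplified: normalized hard-assignment histogram."""
    lab = np.argmin(((X[:, None] - C[None]) ** 2).sum(-1), axis=1)
    hist = np.bincount(lab, minlength=len(C)).astype(float)
    return hist / hist.sum()

rng = np.random.default_rng(2)
series = rng.normal(size=200)                    # toy 1-D sensor stream
wins = sliding_windows(series, width=20, step=10)
codebook = kmeans(wins, k=4)
feat = encode(wins, codebook)
```

In the paper's setting the hard histogram would be replaced by sparse coding, LCC, or locality-constrained linear coding, and `feat` would be fed to step (iv), the classifier.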
Non-negative matrix factorization (NMF) is an effective model for converting data into a non-negative coefficient representation, whose discriminative ability is usually enhanced for diverse pattern recognition tasks. In NMF-based clustering, we often need to run K-means on the learned coefficients as a postprocessing step to obtain the final cluster assignments, which breaks the connection between the feature learning and recognition stages. In this paper, we propose to learn the non-negative coefficient matrix and jointly perform fuzzy clustering on it, viewing each column of the dictionary matrix as the concept of a cluster. As a result, we formulate a new fuzzy clustering model, termed Joint Non-negative and Fuzzy coding with Graph regularization (G-JNFC), and design an effective optimization method to solve it under the alternating direction optimization framework. Besides convergence and computational complexity analysis of G-JNFC, we conduct extensive experiments on both synthetic and representative benchmark datasets. The results show that the proposed G-JNFC model is effective in data clustering. (C) 2020 THE AUTHORS. Published by Elsevier BV on behalf of the Faculty of Computers and Artificial Intelligence, Cairo University.
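The joint idea can be illustrated with plain multiplicative-update NMF, reading each column of W as a cluster concept and the column-normalized coefficients as fuzzy memberships; no separate K-means pass is needed. This sketch omits the graph regularizer and the alternating-direction solver of the full G-JNFC model.

```python
import numpy as np

def nmf_fuzzy(X, k, iters=200, seed=0):
    """Sketch: NMF X ≈ W H with standard multiplicative updates; the
    column-normalized H is read directly as fuzzy cluster memberships."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k)) + 0.1
    H = rng.random((k, n)) + 0.1
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + 1e-9)   # coefficient update
        W *= (X @ H.T) / (W @ H @ H.T + 1e-9)   # dictionary update
    U = H / H.sum(axis=0, keepdims=True)        # fuzzy membership per sample
    return W, U

rng = np.random.default_rng(5)
X = rng.random((6, 8))        # toy non-negative data matrix
W, U = nmf_fuzzy(X, k=3)
```

Each column of U sums to one, so a sample's memberships across the k concepts can be read off directly, keeping feature learning and cluster assignment in one model.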
Existing matrix factorization techniques, such as non-negative matrix factorization and concept factorization, have been widely applied for data representation. To make the obtained concepts as close to the original data points as possible, a state-of-the-art method called locality-constrained concept factorization was put forward, which represents the data by a linear combination of only a few nearby basis concepts. However, its locality constraint does not fully reveal the intrinsic data structure, since it only requires the concepts to be close to the original data points. To address these problems, by considering the geometric structure of the data manifold in local concept factorization via graph-based learning, we propose a novel algorithm called graph-regularized local coordinate concept factorization (GRLCF). By constructing a parameter-free graph using the constrained Laplacian rank (CLR) algorithm, we also present an extension of the GRLCF algorithm. Moreover, we develop iterative updating optimization schemes and provide a convergence proof for our optimization scheme. Since GRLCF simultaneously considers the geometric structure of the data manifold and the locality conditions as additional constraints, it can obtain a more compact and better-structured data representation. Experimental results on the ORL, Yale, and MNIST image datasets demonstrate the effectiveness of our proposed algorithm.
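The graph-regularization ingredient can be shown in isolation: build a k-NN affinity graph over the samples and evaluate the Laplacian smoothness term Tr(H L Hᵀ) that penalizes codes differing across connected samples. This is a generic sketch of the regularizer, not the CLR graph construction or the full GRLCF updates.

```python
import numpy as np

def knn_graph(X, k=2):
    """Symmetric k-NN affinity graph over the samples (rows of X)."""
    d = np.linalg.norm(X[:, None] - X[None], axis=-1)
    W = np.zeros_like(d)
    for i in range(len(X)):
        nn = np.argsort(d[i])[1:k + 1]   # k nearest neighbors, skipping self
        W[i, nn] = 1.0
    return np.maximum(W, W.T)

def graph_smoothness(H, W):
    """Tr(H L Hᵀ): the graph-regularization term, zero when connected
    samples share identical codes and positive otherwise."""
    L = np.diag(W.sum(axis=1)) - W       # unnormalized graph Laplacian
    return np.trace(H @ L @ H.T)

rng = np.random.default_rng(6)
X = rng.normal(size=(8, 4))              # 8 samples in 4-D
W = knn_graph(X)
flat = np.ones((2, 8))                   # identical codes everywhere
vary = np.vstack([np.arange(8.0), np.arange(8.0)])
```

Adding this term to the factorization objective is what pulls the learned codes toward the manifold structure that the locality constraint alone misses.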
ISBN:
(print) 9781538627266
Traffic signs are characterized by wide variability in their visual appearance in real-world environments. Supervised algorithms have achieved superior results on the German Traffic Sign Recognition Benchmark (GTSRB) database. However, these models cannot transfer knowledge across domains, e.g., transfer knowledge learned from the Synthetic Signs database to recognize the traffic signs in the GTSRB database. Although the Synthetic Signs database shares exactly the same class labels as GTSRB, the data distributions of the two are divergent. Such a task is called transfer learning, which is a basic ability for humans but a challenging problem for machines. To give these algorithms the ability to transfer knowledge between domains, we propose a variant of the Generalized Auto-Encoder (GAE) in this paper. Traditional transfer learning algorithms, e.g., the Stacked Autoencoder (SA), usually attempt to reconstruct target data from source data or artificially corrupted data. In contrast, we assume the source and target data are two different corrupted versions of domain-invariant data, and that there is a latent subspace that can reconstruct the domain-invariant data as well as preserve its local manifold. Therefore, the domain-invariant data can be obtained not only by denoising from the nearest source and target data but also by reconstruction from the latent subspace. To preserve the statistical and geometric properties simultaneously, we additionally propose a local coordinate coding (LCC)-based relational function to construct the deep nonlinear architecture. Experimental results on several benchmark datasets demonstrate the effectiveness of our proposed approach in comparison with several traditional methods.
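The core assumption (source and target as two corrupted views of shared clean data recoverable from a latent subspace) can be illustrated with a linear stand-in: project the pooled data onto a common low-dimensional subspace via PCA and average the two projected views. Paired samples, PCA in place of the learned GAE subspace, and the toy dimensions are all assumptions.

```python
import numpy as np

def domain_invariant(xs, xt, n_components=2):
    """Sketch: treat xs and xt as paired corrupted views of the same clean
    data; project the pooled data onto a shared principal subspace and
    denoise each pair by averaging the two projected views."""
    X = np.vstack([xs, xt])
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    P = Vt[:n_components]                      # shared latent subspace
    recon = (X - mu) @ P.T @ P + mu            # project-and-reconstruct
    ns = len(xs)
    return (recon[:ns] + recon[ns:]) / 2.0

rng = np.random.default_rng(7)
Z = rng.normal(size=(50, 2)) @ rng.normal(size=(2, 5))  # clean data on a 2-D subspace of 5-D
xs = Z + 0.2 * rng.normal(size=Z.shape)                 # "source" corruption
xt = Z + 0.2 * rng.normal(size=Z.shape)                 # "target" corruption
Zhat = domain_invariant(xs, xt)
err_noisy = np.linalg.norm(xs - Z, axis=1).mean()
err_clean = np.linalg.norm(Zhat - Z, axis=1).mean()
```

The recovered points sit closer to the clean data than either corrupted view, which is the behavior the GAE variant above seeks with a nonlinear, LCC-regularized subspace.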
Feature pooling in the majority of sparse-coding-based tracking algorithms computes the final feature vector only from low-order statistics or extreme responses of the sparse codes; the high-order statistics and the correlations between responses to different dictionary items are neglected. We present a more generalized feature pooling method for visual tracking that uses a probabilistic function to model the statistical distribution of sparse codes. Since direct matching between two distributions usually incurs high computational costs, we introduce the Fisher vector to derive a more compact and discriminative representation of the sparse codes of the visual target. We encode target patches by local coordinate coding, use a Gaussian mixture model to compute Fisher vectors, and finally train semi-supervised linear-kernel classifiers for visual tracking. To handle the drifting problem during tracking, these classifiers are updated online with the current tracking results. Experimental results on two challenging tracking benchmarks demonstrate that the proposed approach outperforms state-of-the-art tracking algorithms.
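The Fisher-vector pooling step can be sketched with a fixed diagonal GMM: compute each code's responsibilities, then pool the mean-gradients into one fixed-length descriptor. A full Fisher vector also includes variance-gradients, and the GMM here is hand-set rather than fitted; the random codes stand in for LCC codes of target patches.

```python
import numpy as np

def fisher_vector(codes, means, sigmas, priors):
    """Simplified Fisher vector (gradients w.r.t. GMM means only), pooling
    a set of local codes into one descriptor of length K * d."""
    diff = codes[:, None] - means[None]                  # (n, K, d)
    logp = (-0.5 * ((diff / sigmas) ** 2).sum(-1)
            - np.log(sigmas).sum(-1) + np.log(priors))   # per-component log-likelihood
    logp -= logp.max(axis=1, keepdims=True)
    gamma = np.exp(logp)
    gamma /= gamma.sum(axis=1, keepdims=True)            # responsibilities
    # average mean-gradient per component, normalized by the prior
    fv = (gamma[..., None] * diff / sigmas).mean(0) / np.sqrt(priors)[:, None]
    return fv.ravel()

rng = np.random.default_rng(8)
means = np.array([[0.0, 0.0], [5.0, 5.0]])   # toy 2-component diagonal GMM
sigmas = np.ones((2, 2))
priors = np.array([0.5, 0.5])
codes = rng.normal(size=(30, 2))             # stand-in for LCC codes of patches
fv = fisher_vector(codes, means, priors=priors, sigmas=sigmas)
```

Because the descriptor length is K * d regardless of how many patches are pooled, it can feed a linear-kernel classifier directly, which is what makes the approach fast enough for online tracking.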