Search Results - Inner Mongolia University Library

arXiv, 2019

Authors: Li, Bo; Xu, Kele; Feng, Dawei; Mi, Haibo; Wang, Huaimin; Zhu, Jian. Affiliations: Beijing University of Posts and Telecommunications, Automation Dept., Beijing 100876, China; National Key Lab of Parallel and Distributed Processing, Changsha, China; National University of Defense Technology, Changsha, China; Department of Linguistics, University of Michigan, Ann Arbor, United States

B-mode ultrasound tongue imaging is widely used in speech production research. However, efficient interpretation of tongue image sequences is still greatly needed. Inspired by the recent success of unsupervised deep learning, we explore unsupervised convolutional network architectures for feature extraction from ultrasound tongue images, which can be helpful for clinical linguistics and phonetics. In a quantitative comparison of different unsupervised feature extraction approaches, the denoising convolutional autoencoder (DCAE)-based method outperforms the other feature extraction methods on the reconstruction task and the 2010 silent speech interface challenge. A Word Error Rate of 6.17% is obtained with DCAE, compared to the state-of-the-art value of 6.45% using the discrete cosine transform as the feature extractor. Our code is available at https://***/DeePBluE666/Source-code1. Copyright © 2019, The Authors. All rights reserved.
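For readers unfamiliar with the metric quoted in this abstract, the following is a minimal sketch of how a word error rate is conventionally computed: the word-level edit distance between a reference transcript and a recognizer hypothesis, divided by the reference length. The function name and the example sentences are illustrative assumptions and are not taken from the authors' code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substitution ("sat" -> "sit") and one deleted "the".
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

In this example the hypothesis differs from the six-word reference by two edits, so the rate is 2/6.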

Keywords: Feature extraction

High Performance Graph Analytics with Productivity on Hybrid CPU-GPU Platforms

2nd International Conference on High Performance Compilation, Computing and Communications, 2018

Authors: Haoduo Yang; Huayou Su; Qiang Lan; Mei Wen; Chunyuan Zhang. Affiliations: Department of Computer, National University of Defense Technology; National Key Laboratory for Parallel and Distributed Processing, National University of Defense Technology

In recent years, the rapidly growing scale of graphs has sparked many parallel graph analysis frameworks that leverage the massive hardware resources of CPUs or GPUs. Existing CPU implementations are time-consuming, while GPU implementations are restricted by memory space and programming complexity. In this paper, we present a high-performance hybrid CPU-GPU parallel graph analytics framework with good productivity, based on GraphMat. We map vertex programs to generalized sparse matrix-vector multiplication on GPUs to deliver high performance, and propose a high-level abstraction that lets developers implement various graph algorithms with relatively little effort. In addition, several optimizations have been adopted to reduce communication cost and to better use hardware resources, especially the memory hierarchy. We evaluate the proposed framework on three graph primitives (PageRank, BFS and SSSP) with large-scale graphs. The experimental results show that our implementation achieves an average speedup of 7.0x over GraphMat running on two 6-core Intel Xeon CPUs. It can also process larger datasets than MapGraph, a state-of-the-art GPU-based framework, while achieving comparable performance.
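The idea of mapping a vertex program to generalized sparse matrix-vector multiplication can be illustrated with a small CPU-only sketch. It is not the framework's API: the function names, the in-edge list layout, and the example graph are all assumptions made for illustration. Here the usual multiply and add are replaced by "edge weight plus distance" and `min`, which turns repeated multiplication into a Bellman-Ford style SSSP.

```python
import math


def generalized_spmv(in_edges, x, multiply, add, identity):
    """Compute y[v] = add over in-edges (u, w) of multiply(w, x[u])."""
    y = []
    for edges in in_edges:
        acc = identity
        for u, w in edges:
            acc = add(acc, multiply(w, x[u]))
        y.append(acc)
    return y


def sssp(in_edges, source):
    """Single-source shortest paths via repeated min-plus SpMV."""
    dist = [math.inf] * len(in_edges)
    dist[source] = 0.0
    for _ in range(len(in_edges) - 1):
        # "Multiply" adds the edge weight to the sender's distance; "add" keeps the minimum.
        relaxed = generalized_spmv(in_edges, dist, lambda w, d: w + d, min, math.inf)
        new_dist = [min(d, r) for d, r in zip(dist, relaxed)]
        if new_dist == dist:  # no distance improved, so the result is final
            break
        dist = new_dist
    return dist


# Hypothetical graph, stored as in-edges per vertex: (source vertex, weight).
# Edges: 0->1 (4), 0->2 (1), 2->1 (2), 1->3 (5).
example_graph = [[], [(0, 4.0), (2, 2.0)], [(0, 1.0)], [(1, 5.0)]]
example_dist = sssp(example_graph, source=0)
```

Swapping in a different pair of operators (for example logical AND and OR for BFS reachability) reuses the same `generalized_spmv` loop, which is the productivity argument the abstract makes.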

Keywords: Parallel computing; Graph analytics; Hybrid CPU-GPU

A GPU-based parallel WFST decoder on nnet3

AIP Conference Proceedings, 2019, Vol. 2073, No. 1

Authors: Yong Wang; Jie Liu; Chen Zhou; Zhengbin Pang; Shengguo Li; Chunye Gong; Xinbiao Gan; Yurong Li. Affiliation: Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha 410073, China

One performance-intensive part of automatic speech recognition is weighted finite-state transducer (WFST) decoding. To address this, we extend parallel graphics processing unit (GPU) computing to the decoding stage. We describe an extension of the Kaldi toolkit for speech recognition research. Our work supports WFST decoding on Kaldi neural networks using the CUDA toolkit. We also extend an efficient parallel Viterbi beam decoding algorithm to reduce the Real Time Factor (RTF) of speech recognition. Together with our optimization algorithm, we achieve a 2.3x speedup on AISHELL corpus decoding. We also implement an nnet3 decoder that improves real-time speed with no increase in word error rate.
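As background for the Viterbi beam decoding mentioned in this abstract, here is a minimal sequential sketch of token passing with beam pruning over a tiny decoding graph. It is neither the paper's GPU algorithm nor Kaldi's API: the function name, the dictionary-based graph layout, the costs, and the omission of epsilon arcs are simplifying assumptions.

```python
import math


def viterbi_beam_decode(arcs, frames, start, finals, beam):
    """Return (best_cost, labels) for the cheapest surviving path ending in a final state.

    arcs maps a state to a list of (next_state, label, graph_cost).
    frames is a list of dicts mapping each label to its acoustic cost in that frame.
    Costs are negative log-probabilities, so lower is better.
    """
    tokens = {start: (0.0, [])}  # state -> (accumulated cost, labels so far)
    for frame in frames:
        new_tokens = {}
        for state, (cost, labels) in tokens.items():
            for next_state, label, graph_cost in arcs.get(state, []):
                total = cost + graph_cost + frame[label]
                # Viterbi step: keep only the cheapest token per state.
                if next_state not in new_tokens or total < new_tokens[next_state][0]:
                    new_tokens[next_state] = (total, labels + [label])
        if not new_tokens:
            return math.inf, []
        # Beam pruning: drop tokens more than `beam` worse than the frame's best.
        best = min(cost for cost, _ in new_tokens.values())
        tokens = {s: t for s, t in new_tokens.items() if t[0] <= best + beam}
    finished = [t for s, t in tokens.items() if s in finals]
    if not finished:
        return math.inf, []
    return min(finished, key=lambda t: t[0])


# Hypothetical two-frame graph with two competing paths: a-c and b-d.
example_arcs = {0: [(1, "a", 0.5), (2, "b", 0.2)], 1: [(3, "c", 0.1)], 2: [(3, "d", 0.3)]}
example_frames = [
    {"a": 1.0, "b": 2.0, "c": 9.0, "d": 9.0},
    {"a": 9.0, "b": 9.0, "c": 2.0, "d": 0.1},
]
wide_cost, wide_path = viterbi_beam_decode(example_arcs, example_frames, 0, {3}, beam=10.0)
narrow_cost, narrow_path = viterbi_beam_decode(example_arcs, example_frames, 0, {3}, beam=0.5)
```

With the wide beam the decoder keeps path b-d alive through the first frame and finds it to be cheaper overall; with the narrow beam that path is pruned early and the decoder settles for a-c. This is the speed versus accuracy trade-off that beam width controls.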

RZKPB: A Privacy-Preserving Blockchain-Based Fair Transaction Method for Sharing Economy

IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)

Authors: Bin Li; Yijie Wang. Affiliation: Science and Technology on Parallel and Distributed Processing Laboratory, College of Computer, National University of Defense Technology, Hunan, Changsha, P.R. China

Emerging blockchain systems have been widely adopted in the sharing economy, for example in e-commerce, to allow mutually distrustful parties to transact fairly without trusted third parties. Most blockchain systems, however, lack transactional privacy protection. All transactions, including the trading relationships between pseudonyms and the content transacted, are exposed on the blockchain. Although many privacy protection methods for the blockchain have been proposed, it is difficult to find a trade-off between transaction speed and transaction privacy. To address this limitation, we propose a novel privacy-preserving method, RZKPB, which does not store financial transactions in the clear on the blockchain, thus keeping transactions private from the public's view. At the same time, these transactions serve as proofs for resolving disputes between trading partners. RZKPB ensures the fairness and privacy of transactions between participants without adding a new trusted party or breaking the verification protocol on the blockchain. We take e-commerce as an example of the sharing economy to introduce RZKPB in our paper. Our experimental results show that, compared with existing blockchain-based privacy-preserving methods, RZKPB is more efficient under different settings.

Keywords: Privacy; Protocols; Encryption; Distortion

FPPB: A Fast and Privacy-Preserving Method Based on the Permissioned Blockchain for Fair Transactions in Sharing Economy

IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom)

Authors: Bin Li; Yijie Wang; Peichang Shi; Huan Chen; Li Cheng. Affiliation: Science and Technology on Parallel and Distributed Processing Laboratory, College of Computer, National University of Defense Technology, Changsha, Hunan, P.R. China

Blockchain is a distributed system with efficient transaction recording and has been widely adopted in the sharing economy. Although many privacy-preserving methods for the blockchain have been proposed, finding a trade-off between transaction speed and transaction privacy remains challenging. To address this limitation, we propose a novel Fast and Privacy-preserving method based on the Permissioned Blockchain (FPPB) for fair transactions in the sharing economy. FPPB protects the privacy and fairness of transactions without breaking the verification protocol or introducing additional off-blockchain interactive communication. Additionally, experiments are conducted in EthereumJ (a Java implementation of the Ethereum protocol) to measure the performance of FPPB. Compared with normal transactions without cryptographic primitives, FPPB slows down transactions only slightly.

Keywords: Privacy; Cloud computing; Encryption; Registers; Protocols

HPGA: A High-Performance Graph Analytics Framework on the GPU

International Conference on Information Systems and Computer Aided Education (ICISCAE)

Authors: Haoduo Yang; Huayou Su; Mei Wen; Chunyuan Zhang. Affiliations: Department of Computer, National University of Defense Technology, Changsha, China; National Key Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha, China

ISBN: (print) 9781538657393; 9781538657386

In recent years, the rapidly growing use of graphs has sparked parallel graph analytics frameworks that leverage massive hardware resources, specifically graphics processing units (GPUs). However, unpredictable control flow, memory divergence, and programming complexity have restricted high-level GPU graph libraries. In this work, we present HPGA, a high-performance parallel graph analytics framework targeting the GPU. HPGA implements an abstraction that maps vertex programs to generalized sparse matrix operations on GPUs to deliver high performance. HPGA combines high-performance GPU computing primitives and optimization strategies with a high-level programming model. We evaluate the performance of HPGA on three graph primitives (BFS, SSSP, PageRank) with large-scale datasets. The experimental results show that HPGA matches or even exceeds the performance of MapGraph and nvGRAPH, two state-of-the-art GPU graph libraries.

Keywords: Graphics processing units; Sparse matrices; Programming; Computational modeling; Arrays; Optimization

Predicting potential gene ontology from cellular response data

5th International Conference on Bioinformatics and Computational Biology, ICBCB 2017

Authors: Hong, Hao; Yin, Xiaoyao; Li, Fei; Guan, Naiyang; Bo, Xiaochen; Luo, Zhigang. Affiliations: Department of Chemistry and Biology, National University of Defense Technology, Changsha, China; Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha, China; Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing, China

ISBN: (print) 9781450348270

Ontologies have proven useful for capturing and organizing knowledge as a hierarchical set of terms and their relationships. However, curating gene ontology data by hand requires specialized knowledge of a particular field and is inefficient. Inferring gene ontology from the exponentially growing body of biological data is therefore attracting increasing interest. Based on the Library of Integrated Network-Based Cellular Signatures (LINCS) data, we hypothesized that genes participating in analogous biological processes affect cells in similar ways. By assessing cellular response after genes were knocked out, we built a similarity matrix with Gene Set Enrichment Analysis (GSEA) and clustered the genes with the affinity propagation algorithm. We then mapped the clustering result to gene ontology biological process data for annotation and enrichment analysis, which confirmed our hypothesis and made it possible, for the first time, to predict biological processes for unannotated genes from cellular response data after gene knockout. We further validated the approach using gene ontology molecular function data. © 2017 ACM.

Keywords: Gene Ontology

Corrigendum to “A distributed Relation Detection Approach in the Internet of Things”

Mobile Information Systems, 2019, Vol. 2019, No. 1

Authors: Weiping Zhu; Hongliang Lu; Xiaohui Cui; Jiannong Cao. Affiliations: International School of Software, Wuhan University, Wuhan ***; Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha ***; Department of Computing, Hong Kong Polytechnic University, Kowloon, Hong Kong (polyu.edu.hk)

Collaborative deep learning across multiple data centers

arXiv, 2018

Authors: Xu, Kele; Mi, Haibo; Feng, Dawei; Wang, Huaimin; Chen, Chuan; Zheng, Zibin; Lan, Xu. Affiliations: National Key Laboratory of Parallel and Distributed Processing, Changsha, China; College of Computer, National University of Defense Technology, Changsha, China; School of Data and Computer Science, Sun Yat-Sen University, Guangzhou, China; Queen Mary University of London, London, United Kingdom

Valuable training data is often owned by independent organizations and located in multiple data centers. Most deep learning approaches require centralizing the multi-datacenter data for performance reasons. In practice, however, it is often infeasible to transfer all data to a centralized data center, owing not only to bandwidth limitations but also to the constraints of privacy regulations. Model averaging is a conventional choice for data-parallel training, but previous studies claim that it is ineffective because deep neural networks are often non-convex. In this paper, we argue that model averaging can be effective in the decentralized environment when two strategies are used, namely a cyclical learning rate and an increased number of epochs for local model training. With these two strategies, we show that model averaging can provide competitive performance in the decentralized mode compared to the data-centralized one. In a practical environment with multiple data centers, we conduct extensive experiments using state-of-the-art deep network architectures on different types of data. Results demonstrate the effectiveness and robustness of the proposed method. Copyright © 2018, The Authors. All rights reserved.
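The two strategies named in this abstract can be sketched on a toy problem. The code below is an illustration under stated assumptions, not the authors' method: it uses a one-dimensional linear model instead of a deep network, a triangular cyclical schedule, and hypothetical function names, learning-rate bounds, and data. Each "data center" trains locally for several epochs on its own shard, and only the model parameters are averaged.

```python
def cyclical_lr(step, base_lr, max_lr, half_cycle):
    """Triangular schedule: rise from base_lr to max_lr over half_cycle steps, then fall back."""
    position = step % (2 * half_cycle)
    frac = position / half_cycle if position <= half_cycle else 2 - position / half_cycle
    return base_lr + (max_lr - base_lr) * frac


def train_local(weights, shard, epochs, base_lr=0.01, max_lr=0.1, half_cycle=4):
    """Run SGD on one shard for a linear model y = w * x + b with squared error."""
    w, b = weights
    step = 0
    for _ in range(epochs):
        for x, y in shard:
            lr = cyclical_lr(step, base_lr, max_lr, half_cycle)
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
            step += 1
    return w, b


def average_models(models):
    """Element-wise mean of the parameter tuples from all data centers."""
    n = len(models)
    return tuple(sum(params) / n for params in zip(*models))


def collaborative_train(shards, rounds, local_epochs):
    """Alternate local training with parameter averaging; raw data never leaves a shard."""
    global_model = (0.0, 0.0)
    for _ in range(rounds):
        local_models = [train_local(global_model, shard, local_epochs) for shard in shards]
        global_model = average_models(local_models)
    return global_model


# Two hypothetical data centers, each holding private samples of y = 2x + 1.
example_shards = [
    [(x, 2 * x + 1) for x in (-1.0, 0.0, 1.0, 2.0)],
    [(x, 2 * x + 1) for x in (-2.0, -0.5, 0.5, 1.5)],
]
w, b = collaborative_train(example_shards, rounds=20, local_epochs=5)
```

The `local_epochs` argument corresponds to the "increased number of epochs for local model training": raising it reduces how often parameters must be exchanged between sites. A convex toy problem like this one says nothing about the non-convex case the paper addresses; it only shows the mechanics.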

Keywords: Network architecture

Fine-grained checkpoint based on non-volatile memory

Frontiers of Information Technology & Electronic Engineering, 2017, Vol. 18, No. 2, pp. 220-234

Authors: Wen-zhe ZHANG; Kai LU; Mikel LUJAN; Xiao-ping WANG; Xu ZHOU. Affiliations: Science and Technology on Parallel and Distributed Processing Laboratory, College of Computer, National University of Defense Technology, Changsha 410072, China; School of Computer, The University of Manchester, Manchester M13 9PL, UK

New non-volatile memory (e.g., phase-change memory) provides fast access, large capacity, byte-addressability, and non-volatility. These features, together termed fast byte-persistency, will bring new opportunities for fault tolerance. We propose a fine-grained checkpoint based on non-volatile memory. We extend the current virtual memory manager to manage non-volatile memory, and design a persistent heap that supports fast allocation and checkpointing of persistent objects. To achieve a fine-grained checkpoint, we scatter objects across virtual pages and rely on hardware page protection to monitor modifications. In our system, two objects in different virtual pages may reside on the same physical page, and modifying one object does not interfere with the other. This allows us to monitor and checkpoint objects smaller than 4096 bytes in a fine-grained way. Compared with previous page-granularity checkpoint mechanisms, our new checkpoint method can greatly reduce the data copied at checkpoint time and better leverage the limited bandwidth of non-volatile memory.

Keywords: Non-volatile memory; Byte-persistency; Persistent heap; Fine-grained checkpoint
