Search results - Inner Mongolia University Library

IEEE International Conference on Cluster Computing

Authors: Jia Li; Dongsheng Li; Yiming Zhang. Affiliation: National Laboratory for Parallel and Distributed Processing, School of Computer Science, National University of Defense Technology, Changsha, China

ISBN: (print) 9781467365994

Data clustering is usually time-consuming because it must iteratively aggregate and process large volumes of data. Sample-based approximate aggregation provides fast results with quality guarantees. In this paper, we propose applying approximation techniques to data clustering to trade off clustering efficiency against result quality, along with online accuracy estimation. The proposed method is based on bootstrap trials. We implemented this method as an Intelligent Bootstrap Library (IBL) on Spark to support efficient data clustering. Extensive evaluations show that IBL provides a 2x speed-up over the state-of-the-art solution with the same error bound.

Keywords: Sparks; Accuracy; Data mining; Estimation error; Distributed databases; Approximation methods
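
The abstract above relies on bootstrap trials to attach an accuracy estimate to a sample-based aggregate. The following is only a minimal sketch of that general idea, not the IBL implementation: it estimates a mean from a sample and uses resampling with replacement to approximate its standard error. The function name `bootstrap_mean_error` and its parameters are hypothetical, chosen for illustration.

```python
import random
import statistics


def bootstrap_mean_error(sample, trials=200, seed=0):
    """Return (sample mean, bootstrap estimate of its standard error)."""
    rng = random.Random(seed)
    n = len(sample)
    estimates = []
    for _ in range(trials):
        # One bootstrap trial: resample with replacement, same size as the sample.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistics.fmean(resample))
    # The spread of the trial means approximates the error of the aggregate.
    return statistics.fmean(sample), statistics.stdev(estimates)


if __name__ == "__main__":
    rng = random.Random(42)
    population = [rng.gauss(10.0, 2.0) for _ in range(100_000)]
    sample = rng.sample(population, 1_000)  # aggregate on 1% of the data
    mean, err = bootstrap_mean_error(sample)
    print(f"approximate mean = {mean:.3f} +/- {err:.3f}")
```

In a clustering setting, the same resampling loop would presumably wrap a per-iteration aggregate such as a centroid update, so that the error estimate can be reported online; that mapping is an assumption and is not described in the abstract.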

Experimental verification of the parasitic bipolar amplification effect in PMOS single event transients

Chinese Physics B, 2014, Vol. 23, No. 7, pp. 775-779

Authors: He Yibai (何益百); Chen Shuming (陈书明). Affiliations: College of Computer, National University of Defense Technology; Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology

The contribution of parasitic bipolar amplification to SETs is experimentally verified using two P-hit target chains in the normal layout and in the special layout. For PMOSs in the normal layout, the single-event charge collection is composed of diffusion, drift, and the parasitic bipolar effect, while for PMOSs in the special layout, the parasitic bipolar junction transistor cannot turn on. Heavy ion experimental results show that PMOSs without parasitic bipolar amplification have a 21.4% decrease in the average SET pulse width and roughly a 40.2% reduction in the SET cross-section.

Keywords: single event effect; single event transient; parasitic bipolar amplification; heavy ion experiments

Two-dimensional Euler PCA for face recognition

21st International Conference on MultiMedia Modeling, MMM 2015

Authors: Tan, Huibin; Zhang, Xiang; Guan, Naiyang; Tao, Dacheng; Huang, Xuhui; Luo, Zhigang. Affiliations: Science and Technology on Parallel Distributed Processing Laboratory, College of Computer, National University of Defense Technology, Changsha, Hunan 410073, China; Department of Computer Science and Technology, College of Computer, National University of Defense Technology, Changsha, Hunan 410073, China; Centre for Quantum Computation and Intelligent Systems and the Faculty of Engineering and Information Technology, University of Technology Sydney, 235 Jones Street, Ultimo NSW 2007, Australia

ISBN: (print) 9783319144412

Principal component analysis (PCA) projects data onto the directions of maximal variance. Because PCA is quite effective for dimension reduction, it has been widely used in computer vision. However, conventional PCA suffers from the following deficiencies: 1) it incurs a high computational cost when handling high-dimensional data, and 2) it cannot reveal nonlinear relationships among different features of the data. To overcome these deficiencies, this paper proposes an efficient two-dimensional Euler PCA (2D-ePCA) algorithm. In particular, 2D-ePCA learns the projection matrix on the 2D pixel matrix of each image without reshaping it into a long 1D vector, and uncovers nonlinear relationships among features by mapping the data onto a complex representation. Because this 2D complex representation induces a much smaller kernel matrix and smaller principal subspaces, 2D-ePCA has much lower computational overhead than Euler PCA on large-scale datasets. Experimental results on popular face datasets show that 2D-ePCA outperforms representative algorithms in terms of accuracy, computational overhead, and robustness. © Springer International Publishing Switzerland 2015.

Keywords: Principal component analysis
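
The abstract above says the method maps data onto a complex representation. The sketch below illustrates only that mapping step, assuming the Euler representation commonly cited for Euler PCA, z = exp(i·α·π·x)/√2 for a pixel intensity x in [0, 1]; the abstract itself does not state the formula, and the default α = 1.9 is an assumption. It also checks numerically that the squared distance between mapped vectors equals a sum of 1 − cos terms, which is where the nonlinearity comes from. The 2D projection-learning step is not shown.

```python
import cmath
import math


def euler_map(pixels, alpha=1.9):
    """Map intensities in [0, 1] to complex numbers of modulus 1/sqrt(2)."""
    return [cmath.exp(1j * alpha * math.pi * p) / math.sqrt(2.0) for p in pixels]


def euler_distance_sq(a, b, alpha=1.9):
    """Squared Euclidean distance between two Euler-mapped pixel vectors."""
    za, zb = euler_map(a, alpha), euler_map(b, alpha)
    return sum(abs(u - v) ** 2 for u, v in zip(za, zb))


def cosine_dissimilarity(a, b, alpha=1.9):
    """Closed form of the same distance: sum of 1 - cos(alpha*pi*(a_j - b_j))."""
    return sum(1.0 - math.cos(alpha * math.pi * (x - y)) for x, y in zip(a, b))


if __name__ == "__main__":
    a = [0.1, 0.5, 0.9]
    b = [0.2, 0.4, 0.0]
    print(euler_distance_sq(a, b), cosine_dissimilarity(a, b))
```

Because each term is bounded by 2, a single badly corrupted pixel contributes a limited amount to the distance, unlike the unbounded squared difference used by conventional PCA.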

Non-negative low-rank and group-sparse matrix factorization

21st International Conference on MultiMedia Modeling, MMM 2015

Authors: Wu, Shuyi; Zhang, Xiang; Guan, Naiyang; Tao, Dacheng; Huang, Xuhui; Luo, Zhigang. Affiliations: Science and Technology on Parallel Distributed Processing Laboratory, College of Computer, National University of Defense Technology, Changsha, Hunan 410073, China; Department of Computer Science and Technology, College of Computer, National University of Defense Technology, Changsha, Hunan 410073, China; Centre for Quantum Computation & Intelligent Systems and the Faculty of Engineering and Information Technology, University of Technology Sydney, 235 Jones Street, Ultimo NSW 2007, Australia

ISBN: (print) 9783319144412

Non-negative matrix factorization (NMF) is a popular data analysis tool that has been widely applied in computer vision. However, conventional NMF methods cannot adaptively learn grouping structure from a *** paper proposes a non-negative low-rank and group-sparse matrix factorization (NLRGS) method to overcome this deficiency. In particular, NLRGS captures the relationships among examples by constraining the rank of the coefficients while identifying the grouping structure via group sparsity regularization. With both constraints, NLRGS boosts NMF in both classification and clustering. However, NLRGS is difficult to optimize because it must deal with the low-rank constraint. To relax this hard constraint, we approximate the low-rank constraint with the nuclear norm and then develop an optimization algorithm for NLRGS in the framework of the augmented Lagrangian method (ALM). Experimental results for both face recognition and clustering on four popular face datasets quantitatively demonstrate the effectiveness of NLRGS. © Springer International Publishing Switzerland 2015.

Keywords: Non-negative matrix factorization
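
The abstract above names two regularizers: a nuclear norm (the sum of singular values, a convex surrogate for rank) and a group sparsity term. The sketch below illustrates only the second, under the assumption that group sparsity is imposed with an l2,1-style penalty; the abstract does not give the exact penalty, and this is not the NLRGS algorithm. It shows the standard proximal step for that penalty, group soft-thresholding, which shrinks each group of coefficients and zeroes out groups whose norm falls below the threshold. The function name and the parameter `lam` are hypothetical.

```python
import math


def group_soft_threshold(groups, lam):
    """Proximal step for lam * sum of group l2 norms.

    Each group is scaled by max(0, 1 - lam / ||g||), so weak groups
    become exactly zero and strong groups are shrunk but kept.
    """
    result = []
    for g in groups:
        norm = math.sqrt(sum(v * v for v in g))
        scale = max(0.0, 1.0 - lam / norm) if norm > 0.0 else 0.0
        result.append([scale * v for v in g])
    return result


if __name__ == "__main__":
    coefficients = [[3.0, 4.0], [0.1, 0.1]]
    print(group_soft_threshold(coefficients, lam=1.0))
```

Zeroing whole groups at once, rather than individual entries, is what lets a penalty of this kind expose grouping structure among coefficients.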

Comparison of heavy-ion induced SEU for D- and TMR-flip-flop designs in 65-nm bulk CMOS technology

Science China (Information Sciences), 2014, Vol. 57, No. 10, pp. 223-229

Authors: HE YiBai; CHEN ShuMing. Affiliations: School of Computer Science, National University of Defense Technology; Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology

Heavy ion experiments were performed on a D flip-flop (DFF) and a TMR flip-flop (TMRFF) fabricated in a 65-nm bulk CMOS process. The experimental results show that the TMRFF has about a 92% decrease in SEU cross-section compared to the standard DFF design in static test mode. In dynamic test mode, the TMRFF shows a much stronger frequency dependency than the DFF design, which reduces its advantage over the DFF at higher operating frequencies. At 160 MHz, the TMRFF is only 3.2× harder than the standard DFF. Such a small improvement in the SEU performance of the TMR design may warrant reconsideration of its use in hardening design.

Keywords: SEU; flip-flop; TMR; heavy-ion; frequency
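
The abstract above compares a plain D flip-flop with a triple modular redundancy (TMR) design. As a reminder of what TMR does, here is a minimal software model of a majority voter over three redundant copies of a stored value. It is only an illustration of the voting principle; the paper concerns a hardware flip-flop, and the function name is hypothetical.

```python
def majority(a, b, c):
    """Bitwise majority vote over three redundant copies of a value."""
    return (a & b) | (a & c) | (b & c)


if __name__ == "__main__":
    stored = 0b1011
    upset = stored ^ 0b0100           # one copy suffers a single bit flip
    # A single upset copy is outvoted by the two intact copies.
    print(bin(majority(stored, stored, upset)))
    # The same bit flipped in two copies defeats the voter.
    print(bin(majority(stored, upset, upset)))
```

The voter masks an upset in any single copy but not the same upset in two copies, which is one reason the measured advantage of a TMR design can be smaller than the ideal.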

OpenMP-Based Monte Carlo Dose Calculation for Radiotherapy Treatment Planning on the Intel MIC Architecture

IEEE International Symposium on Information (IT) in Medicine and Education, ITME

Authors: Qinglin Wang; Jie Liu; Peizhen Xie; Chunye Gong; Yuan Li; Zuocheng Xing. Affiliations: School of Computer Science, National University of Defense Technology, Changsha, China; Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha, China

Monte Carlo (MC) simulation plays an important part in dose calculation for radiotherapy treatment planning. Because the accuracy of MC simulation depends on the number of simulated particle histories, it is very time-consuming. The Intel Many Integrated Core (MIC) architecture, which consists of more than 50 cores and supports many parallel programming models, provides an efficient alternative for accelerating MC dose calculation. This paper implements the OpenMP-based MC Dose Planning Method (DPM) for radiotherapy treatment problems on the Intel MIC architecture. The implementation has been verified on a target MIC coprocessor with 57 cores. The results demonstrate that the OpenMP-based DPM implementation produces very accurate results and achieves a maximum speedup of 10.53 times over the original DPM implementation on a Xeon E5-2670 CPU. Additionally, the speedup and efficiency of the implementation running on different numbers of MIC cores are also reported.

Keywords: Microwave integrated circuits; Instruction sets; Computer architecture; Photonics; Computational modeling; History; Interpolation
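
The abstract above notes that Monte Carlo accuracy depends on the number of simulated histories, which is why the method is expensive. The toy sketch below shows that dependence on a deliberately simple problem (estimating the area fraction of a quarter circle), not on dose calculation: the standard error of the estimate falls roughly as one over the square root of the number of samples, so halving the error takes about four times the work. The function name is hypothetical.

```python
import math
import random


def mc_estimate(n, seed=0):
    """Estimate pi/4 from n random points; return (estimate, standard error)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    p = hits / n
    # Standard error of a proportion estimated from n independent samples.
    return p, math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    for n in (1_000, 4_000, 16_000):
        est, err = mc_estimate(n)
        print(f"n={n:6d}  estimate={est:.4f}  std. error={err:.4f}")
```

Each sample here is independent of the others, which is the property that makes history-based Monte Carlo codes a natural fit for many-core parallelization.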

Implementation of an Accurate and Efficient Compensated DGEMM for 64-bit ARMv8 Multi-Core Processors

International Conference on Parallel and Distributed Systems (ICPADS)

Authors: Hao Jiang; Feng Wang; Kuan Li; Canqun Yang; Kejia Zhao; Chun Huang. Affiliations: College of Computer Science, National University of Defense Technology, Changsha, China; Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha, China

ISBN: (print) 9781467386692

This paper presents an implementation of an accurate and efficient compensated Double-precision General Matrix Multiplication (DGEMM) based on OpenBLAS for 64-bit ARMv8 multi-core processors. Due to cancellation phenomena in floating-point arithmetic, the results of DGEMM may not be as accurate as expected. To increase the accuracy of DGEMM, we compensate for the error introduced by its dot product kernel (GEBP) by applying an error-free transformation and rewriting the kernel in assembly language. We optimize the computations in the inner kernel using loop unrolling, instruction scheduling, and software-implemented register rotation to exploit instruction-level parallelism (ILP). We also conduct an a priori error analysis of the derived CompDGEMM. Our compensated DGEMM is as accurate as the existing quadruple-precision GEMM using MBLAS, but is up to 6.4x faster. Our parallel implementation achieves good performance and scalability under varying thread counts across the range of matrix sizes evaluated.

Keywords: Multicore processing; Kernel; Registers; Libraries; Error analysis; Algorithm design and analysis
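
The abstract above compensates the dot product kernel using error-free transformations. The sketch below shows the classical building blocks in plain Python floats (IEEE 754 doubles): TwoSum, Veltkamp splitting, TwoProduct, and a compensated dot product built from them. It is only an illustration of the technique and assumes the well-known textbook formulations; the paper's kernel is written in ARMv8 assembly and may differ in detail.

```python
def two_sum(a, b):
    """Return (s, e) with s = fl(a + b) and a + b = s + e exactly."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e


def split(a):
    """Veltkamp split of a double into high and low parts (a = hi + lo)."""
    c = 134217729.0 * a  # 2**27 + 1
    hi = c - (c - a)
    return hi, a - hi


def two_product(a, b):
    """Return (p, e) with p = fl(a * b) and a * b = p + e (barring overflow)."""
    p = a * b
    ah, al = split(a)
    bh, bl = split(b)
    e = al * bl - (((p - ah * bh) - al * bh) - ah * bl)
    return p, e


def compensated_dot(x, y):
    """Dot product that accumulates rounding errors and adds them back."""
    s, comp = 0.0, 0.0
    for a, b in zip(x, y):
        p, ep = two_product(a, b)
        s, es = two_sum(s, p)
        comp += ep + es
    return s + comp


if __name__ == "__main__":
    x = [1e16, 1.0, -1e16]
    y = [1.0, 1.0, 1.0]
    print(sum(a * b for a, b in zip(x, y)), compensated_dot(x, y))
```

In the demo, the plain sum loses the middle term to cancellation, while the compensated version recovers it from the accumulated error terms.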

A low-latency fine-grained dynamic shared cache management scheme for chip multi-processor

IEEE International Conference on Performance, Computing and Communications (IPCCC)

Authors: Jinbo Xu; Weixia Xu; Zhengbin Pang. Affiliations: College of Computer, National University of Defense Technology, Changsha, China; Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha, China

To utilize the shared last-level cache (LLC) in chip multi-processors (CMP) more efficiently, the partitioning of LLC resources among all cores should offer low access latency, fine migration granularity, and low hardware complexity. This paper proposes a dynamic LLC management scheme to achieve these goals. The proposed scheme migrates cache resources among different cores at the granularity of cache blocks instead of ways. The quantity of victim cache blocks that each victim core can migrate to other target cores is related to an eviction probability, which is calculated according to the performance goal. The victim cache blocks for a target core are then chosen from the nearest victim core that has a non-zero eviction probability, by introducing an innovative E-Table structure in the CMP. The eviction probabilities are updated periodically. With the help of E-Tables, the proposal achieves low-latency accesses by always keeping the required cache blocks near the target cores. Fine granularity is guaranteed by maintaining an eviction probability for each core. In addition, only minor additional hardware changes to the traditional cache structure are required. Simulation results suggest significant performance improvements of 6.8% to 22.7% over related works.

Keywords: Hardware; Resource management; Probability distribution; Proposals; Complexity theory; Simulation
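
The abstract above says victim blocks for a target core are taken from the nearest victim core with a non-zero eviction probability. The sketch below models just that selection rule in software, under assumptions the abstract does not state: cores sit on a 2D mesh, "nearest" means Manhattan distance, and ties are broken by core id. The names `nearest_victim`, `eviction_prob`, and `position` are hypothetical, and the E-Table hardware structure itself is not modeled.

```python
def nearest_victim(target, eviction_prob, position):
    """Pick the closest core with a non-zero eviction probability.

    eviction_prob maps core id -> probability; position maps core id -> (x, y).
    Returns None when no other core is currently willing to give up blocks.
    """
    tx, ty = position[target]
    candidates = [
        (abs(x - tx) + abs(y - ty), core)
        for core, (x, y) in position.items()
        if core != target and eviction_prob.get(core, 0.0) > 0.0
    ]
    return min(candidates)[1] if candidates else None


if __name__ == "__main__":
    mesh = {0: (0, 0), 1: (1, 0), 2: (0, 1), 3: (1, 1)}
    probs = {1: 0.0, 2: 0.3, 3: 0.5}
    print(nearest_victim(0, probs, mesh))
```

Preferring the closest eligible core keeps migrated blocks physically near the core that will use them, which is the source of the low access latency the abstract claims.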

Classification of Tiangong-1 hyperspectral remote sensing image via contextual sparse coding

International Conference on Machine Learning and Cybernetics (ICMLC)

Authors: Qi Lv; Yong Dou; Xin Niu; Jiaqing Xu; Jinbo Xu. Affiliations: School of Computer, National University of Defense Technology, Changsha, China; Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha, China

ISBN: (print) 9781467372220

Hyperspectral remote sensing is one of the frontier techniques in remote sensing research. Applying the sparse coding model to hyperspectral remote sensing image processing is a hot topic in hyperspectral information processing. To improve the accuracy of hyperspectral image classification, we propose a classification method based on spatial-spectral joint contextual sparse coding. First, a dictionary is trained using samples selected from the ground-truth reference data. Then, the sparse coefficients of each pixel are calculated based on the learned dictionary. Finally, the sparse coefficients are input to the classifier to obtain the final classification result. The visible and near-infrared hyperspectral remote sensing image collected by Tiangong-1 over Chaoyang District, Beijing, is used to evaluate the performance of the proposed approach. Experimental results show that the proposed method yields the best classification performance among the compared methods, with an overall accuracy of 95.74% and a Kappa coefficient of 0.9476.

Keywords: Hyperspectral image; Remote sensing; Sparse coding; Classification; Methods; hyperspectral remote sensing; hyperspectral imagery; Image classification
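
The abstract above reports an overall accuracy and a Kappa coefficient. For readers unfamiliar with the second metric, the sketch below computes both from a confusion matrix using the standard definition of Cohen's kappa: the observed agreement corrected for the agreement expected by chance. The function name is hypothetical and the matrix in the demo is invented for illustration; it is not data from the paper.

```python
def overall_accuracy_and_kappa(confusion):
    """Compute (overall accuracy, Cohen's kappa) from a square confusion matrix.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected if predictions were independent of the true labels.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return observed, (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    made_up = [[45, 5], [10, 40]]  # illustrative counts only
    accuracy, kappa = overall_accuracy_and_kappa(made_up)
    print(f"overall accuracy = {accuracy:.2f}, kappa = {kappa:.2f}")
```

Kappa is lower than raw accuracy whenever some agreement would be expected by chance, which is why classification studies of this kind usually report both.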

Poster: Segmentation Based Online Performance Problem Diagnosis

International Conference on Software Engineering (ICSE)

Authors: Jingwen Zhou; Zhenbang Chen; Ji Wang. Affiliations: College of Computer, National University of Defense Technology, Changsha, China; Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha, China

ISBN: (print) 9781479919352

The performance problems of software systems are receiving increasing attention. Among the various diagnosis methods based on system traces, principal component analysis (PCA) based methods are widely used because their diagnosis results are highly accurate and they require no specific domain knowledge. However, our experiments confirm several shortcomings of PCA-based methods, including the requirement that traces share the same call sequence, inefficiency when the traces are long, and missed performance problems. To cope with these issues, we introduce a segmentation-based online diagnosis method in this poster.

Keywords: Principal component analysis; Software systems; Accuracy; Measurement; Monitoring; Computer architecture; Conferences
