Search results - Inner Mongolia University Library

IEEE International Parallel and Distributed Processing Symposium

Authors: Tyler M. Smith, Robert van de Geijn, Mikhail Smelyanskiy, Jeff R. Hammond, Field G. Van Zee. Affiliations: Institute for Computational Engineering and Sciences and Department of Computer Science, The University of Texas at Austin; Parallel Computing Lab, Intel Corporation; Leadership Computing Facility, Argonne National Lab

ISBN: (print) 9781479938018

BLIS is a new framework for rapid instantiation of the BLAS. We describe how BLIS extends the "GotoBLAS approach" to implementing matrix multiplication (GEMM). While GEMM was previously implemented as three loops around an inner kernel, BLIS exposes two additional loops within that inner kernel, casting the computation in terms of the BLIS micro-kernel so that porting GEMM becomes a matter of customizing this micro-kernel for a given architecture. We discuss how this facilitates a finer level of parallelism that greatly simplifies the multithreading of GEMM as well as additional opportunities for parallelizing multiple loops. Specifically, we show that with the advent of many-core architectures such as the IBM PowerPC A2 processor (used by Blue Gene/Q) and the Intel Xeon Phi processor, parallelizing both within and around the inner kernel, as the BLIS approach supports, is not only convenient, but also necessary for scalability. The resulting implementations deliver what we believe to be the best open source performance for these architectures, achieving both impressive performance and excellent scalability.

Keywords: Linear algebra Libraries High-performance Matrix BLAS Multicore

First-principles study of vibrational modes and Raman spectra in P-doped Si nanocrystals

Physical Review B, 2014, vol. 89, no. 19, p. 195309

Authors: K. H. Khoo, James R. Chelikowsky. Affiliations: Institute of High Performance Computing, Agency for Science, Technology and Research, 1 Fusionopolis Way, #16-16 Connexis, Singapore 138632; Center for Computational Materials, Institute for Computational Engineering and Sciences, Departments of Physics and Chemical Engineering, The University of Texas at Austin, Austin, Texas 78712, USA

We have studied the vibrational modes and Raman spectra of P-doped Si nanocrystals using pseudopotential density functional theory and the Placzek approximation. We find that Si nanocrystal vibrations are largely unaffected by the introduction of P dopants. However, the Raman spectra of doped nanocrystals are enhanced relative to those of pristine nanocrystals, and demonstrate a strong dependence on dopant position. Thus, Raman has the potential of being developed as a tool for probing the location of the dopant within the nanocrystal. Our analysis shows that vibrational modes involving atoms in the vicinity of the dopant give the largest contributions to the Raman spectra.

Keywords: First-principles P-doped nanocrystals Nanocrystals Raman spectrometry p-doping adulterated products Vibrations modes Raman spectra density functional theory

In search of perfect reads

IEEE International Conference on Computational Advances in Bio and Medical Sciences (ICCABS)

Authors: Soumitra Pal, Srinivas Aluru. Affiliations: Department of Computer Science and Engineering, Indian Institute of Technology Bombay, Powai, Mumbai, India; School of Computational Science and Engineering, College of Computing, Georgia Institute of Technology, 266 Ferst Drive, Atlanta, GA, USA

Continued advances in next generation short-read sequencing technologies are increasing throughput and read lengths, while driving down the error rates, for example within 1% for Illumina HiSeq reads. Moreover, the errors are not uniformly distributed in all reads, and a large percentage of reads are indeed error-free. Ability to predict such perfect reads can have significant impact on run-time complexity of applications. In this paper, we present a simple and fast k-spectrum analysis based method to identify error-free reads. Our experiments show that if around 80% of the reads in a dataset are perfect, then our method retains almost 99.9% of them with more than 90% precision rate. Though filtering out reads identified as erroneous by our method reduces the coverage by about 7% on an average, coverage pattern across genome remains similar. The filtration process can be customized at several levels of stringency depending upon the downstream application need.

Keywords: Bioinformatics Genomics Error correction Sequential analysis Next generation networking Accuracy Prediction algorithms

Nonlocal Diffusion Tensor for Visual Saliency Detection

International Conference on Computational Intelligence and Security

Authors: Xiujun Zhang, Chen Xu, Min Li. Affiliations: College of Information and Engineering, Shenzhen University, Shenzhen, China; Institute of Intelligent Computing Science, Shenzhen University, Shenzhen, China; College of Mathematics and Computational Science, Shenzhen University, Shenzhen, China

ISBN: (print) 9781479974351

In this paper, visual attention transfer is formulated as a nonlocal diffusion equation. Unlike other diffusion-based methods, a nonlocal diffusion tensor is introduced to account for both the diffusion strength and direction. Along the principal direction, the diffusion should be suppressed to preserve the dissimilarity between the foreground and background, and in other directions, the diffusion should be boosted to combine the similar regions and highlight the salient object as a whole. Through a two-stage diffusion, the final saliency map is obtained, and quantitative and visual comparisons are carried out on two large benchmark databases. Experimental results demonstrate the superior performance of our method.

Keywords: Visualization Tensile stress Databases Image color analysis Mathematical model Equations Vectors

Non-Darcy behavior of two-phase channel flow

Physical Review E, 2014, vol. 90, no. 2, p. 023010

Authors: Xianmin Xu, Xiaoping Wang. Affiliations: LSEC, Institute of Computational Mathematics and Scientific/Engineering Computing, NCMIS, AMSS, Chinese Academy of Sciences, Beijing 100190, China; Department of Mathematics, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China

We study the macroscopic behavior of two-phase flow in porous media from a phase-field model. A dissipation law is first derived from the phase-field model by homogenization. For simple channel geometry at the pore scale, the scaling relation of the averaged dissipation rate with the velocity of the two-phase flow can be explicitly obtained from the model, which then gives the force-velocity relation. It is shown that, for the homogeneous channel surface, Darcy's law is still valid with a significantly modified permeability including the contribution from the contact line slip. For the chemically patterned surfaces, the dissipation rate has a non-Darcy linear scaling with the velocity, which is related to a depinning force for the patterned surface. Our result offers a theoretical understanding of the prior observation of non-Darcy behavior for the multiphase flow in either simulations or experiments.

Keywords: TWO-phase flow ENERGY dissipation HOMOGENIZATION (Differential equations) CHANNEL flow (Fluid dynamics) RESEARCH DARCY'S law POROUS materials

Rotated block triangular preconditioning based on PMHSS

Science China Mathematics, 2013, vol. 56, no. 12, pp. 2523-2538

Author: BAI Zhong-Zhi. Affiliation: State Key Laboratory of Scientific/Engineering Computing, Institute of Computational Mathematics and Scientific/Engineering Computing, Academy of Mathematics and Systems Science, Chinese Academy of Sciences

Based on the PMHSS preconditioning matrix, we construct a class of rotated block triangular preconditioners for block two-by-two matrices of real square blocks, and analyze the eigen-properties of the corresponding preconditioned matrices. Numerical experiments show that these rotated block triangular preconditioners can be competitive with, and even more efficient than, the PMHSS preconditioner when they are used to accelerate Krylov subspace iteration methods for solving block two-by-two linear systems with coefficient matrices possibly of nonsymmetric sub-blocks.

Keywords: block two-by-two matrix PMHSS preconditioner block triangular preconditioning product-type preconditioning eigen-properties

The effect of ghost forces for a quasicontinuum method in three dimension

Science China Mathematics, 2013, vol. 56, no. 12, pp. 2571-2589

Authors: CUI Long, MING PingBing. Affiliation: LSEC, Institute of Computational Mathematics and Scientific/Engineering Computing, Academy of Mathematics and Systems Science, Chinese Academy of Sciences

We study the effect of "ghost forces" for a quasicontinuum method in three dimensions with a planar interface. "Ghost forces" are the inconsistency of the quasicontinuum method across the interface between the atomistic region and the continuum region. Numerical results suggest that "ghost forces" may lead to a negligible error in the solution, while leading to a finite-size error in the gradient of the solution. The error has a layer-like profile, and the interfacial layer width is of O(ε). The error in a certain component of the displacement gradient decays algebraically from O(1) to O(ε) away from the interface. A surrogate model is proposed and analyzed, which suggests the same scenario for the effect of "ghost forces". Our analysis is based on the explicit solution of the surrogate model.

Keywords: quasicontinuum method atomistic-to-continuum ghost force

Phonon Softening Induced Intrinsic Thermal Resistance in Individual Single-layer Graphene

The 2nd International Conference on Phononics and Thermal Energy Science (PTES2014)

Authors: Wen Xu, Gang Zhang, Baowen Li. Affiliations: NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore, Kent Ridge 119620, Singapore; Department of Physics and Centre for Computational Science and Engineering, National University of Singapore, Singapore 117546, Singapore; Center for Phononics and Thermal Energy Science, School of Physics Science and Engineering, Tongji University, 200092 Shanghai, China; Institute of High Performance Computing, A*STAR, Singapore 138632, Singapore

With molecular dynamics simulations, we systematically uncover a new kind of intrinsic thermal resistance that exists in two-dimensional materials under uneven external perturbation, by using partly encased graphene as a typical *** with lattice dynamics analysis, we demonstrate that this intrinsic thermal resistance originates from the softening of flexural phonons partly in graphene induced by inhomogeneous external potential field or substrates which serve as *** the interface between graphene sections with and without external potential field, in-plane phonon modes can transmit well, whereas, low frequency flexural phonon modes are reflected, leading to this nontrivial intrinsic thermal resistance in the individual single-layer *** intrinsic thermal resistance closely depends on coupling strength between graphene and substrates, and could be significant when the coupling is ***, it is suppressed at high *** is also found that this intrinsic thermal resistance depends on the size of the system to some extent, and a length independent value is ***, we demonstrate that thermal rectification can be realized by including the uneven external *** study provides new insight to better understand thermal transport in two-dimensional materials.

Parallel Bayesian Network Structure Learning for Genome-Scale Gene Networks

Supercomputing Conference

Authors: Sanchit Misra, Md. Vasimuddin, Kiran Pamnany, Sriram P. Chockalingam, Yong Dong, Min Xie, Maneesha R. Aluru, Srinivas Aluru. Affiliations: Parallel Computing Lab, Intel Corporation, Bangalore, India; Dept. of Computer Science and Engineering, Indian Institute of Technology Bombay, Mumbai, India; State Key Laboratory of High Performance Computing, National University of Defense Technology, Changsha, China; School of Biology, Georgia Institute of Technology, Atlanta, USA; School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, USA

Learning Bayesian networks is NP-hard. Even with recent progress in heuristic and parallel algorithms, modeling capabilities still fall short of the scale of the problems encountered. In this paper, we present a massively parallel method for Bayesian network structure learning, and demonstrate its capability by constructing genome-scale gene networks of the model plant Arabidopsis thaliana from over 168.5 million gene expression values. We report strong scaling efficiency of 75% and demonstrate scaling to 1.57 million cores of the Tianhe-2 supercomputer. Our results constitute three and five orders of magnitude increase over previously published results in the scale of data analyzed and computations performed, respectively. We achieve this through algorithmic innovations, using efficient techniques to distribute work across all compute nodes, all available processors and coprocessors on each node, all available threads on each processor and coprocessor, and vectorization techniques to maximize single thread performance.

Keywords: Bayes methods Hypercubes Instruction sets Genomics Bioinformatics Coprocessors Vectors

Sensitivity Analysis for Time Dependent Problems: Optimal Checkpoint-Recompute HPC Workflows

Workshop on Workflows in Support of Large-Scale Science (WORKS)

Authors: Varis Carey, Hasan Abbasi, Ivan Rodero, Hemanth Kolla. Affiliations: Institute of Computational Engineering and Science, University of Texas at Austin; Computer Science and Mathematics Division, Oak Ridge National Laboratory; Rutgers Discovery Informatics Institute and NSF Cloud and Autonomic Computing Center, Rutgers University; Scalable Modeling and Analysis, Sandia National Laboratories

Sensitivity analysis (SA) is a fundamental tool of uncertainty quantification (UQ). Adjoint-based SA is the optimal approach in many large-scale applications, such as the direct numerical simulation (DNS) of combustion. However, one of the challenges of the adjoint workflow for time-dependent applications is the storage and I/O requirements for the application state. During the time-reversal portion of the workflow, forward state is required in last-in-first-out order. The resulting requirements for storage at exascale are enormous. To mitigate this requirement, application state is regenerated from checkpoints over short windows of application time. This approach drastically reduces the total volume of stored data, allows the caching of state in the regeneration window in memory and on local SSDs, may accelerate the application execution by reducing output frequency, and reduces the power overhead from I/O. We explore variations to this workflow, applied to a proxy for the SA of turbulent combustion, by varying checkpoint number, state storage, and other regeneration options to find efficient implementations for minimizing compute time or power consumption.

Keywords: computational modeling Mathematical model Sensitivity analysis Combustion Analytical models Checkpointing
