Search Results - Inner Mongolia University Library

2006 ACM/IEEE Conference on Supercomputing, SC'06

Authors: Narumi, Tetsu; Ohno, Yousuke; Okimoto, Noriaki; Koishi, Takahiro; Suenaga, Atsushi; Futatsugi, Noriyuki; Yanai, Ryoko; Himeno, Ryutaro; Fujikawa, Shigenori; Taiji, Makoto; Ikei, Mitsuru. Affiliations: Intel K.K.; High Performance Molecular Simulation Team; Genomic Sciences Center; Innovative Nanopatterning Laboratory; Topochemical Design Laboratory; Frontier Research System; Advanced Center for Computing and Communication; Software and Solutions Group; Parallel and Distributed Solutions Group

ISBN: (print) 0769527000

We have achieved a sustained performance of 55 TFLOPS for molecular dynamics simulations of the amyloid fibril formation of peptides from the yeast Sup35 in an aqueous solution. For performing the calculations, we used the MDGRAPE-3 system, a special-purpose computer system for molecular dynamics simulations. Its nominal peak performance was 415 TFLOPS for Coulomb force calculations; this is the highest-ever performance reported for classical molecular dynamics simulations. Amyloid fibril formation is known to be related to the occurrence of severe diseases such as Alzheimer's, Parkinson's, and Creutzfeldt-Jakob diseases. The Sup35 protein is a "yeast prion protein," which forms mini-crystals due to aggregation; it forms an effective platform for studying the formation process of amyloid fibrils. In these simulations, we first elucidate that the amyloid-forming peptides GNNQQNY aggregate at a higher frequency than non-amyloid-forming peptides SQNGNQQRG; further, the GNNQQNY peptides tend to form parallel two-stranded β-sheets that would grow into a cross-β amyloid nucleus. The results are consistent with those obtained experimentally. Furthermore, we could observe an early elongation of the amyloid nucleus. This result is expected to contribute toward a deeper understanding of the amyloid growth mechanism. © 2006 IEEE.

Keywords: Peptides

A Distributed Multi-Storage Resource Architecture and I/O Performance Prediction for Scientific Computing

Cluster Computing, 2003, Vol. 6, No. 3, pp. 189-200

Authors: Shen, X.; Choudhary, A.; Matarazzo, C.; Sinha, P. Affiliations: Center for Parallel and Distributed Computing, Department of Electrical and Computer Engineering, Northwestern University, Evanston, USA; Lawrence Livermore National Laboratory, Livermore, USA

I/O intensive applications have posed great challenges to computational scientists. A major problem of these applications is that users have to sacrifice performance requirements in order to satisfy storage capacity requirements in a conventional computing environment. Further performance improvement is impeded by the physical nature of these storage media even when state-of-the-art I/O optimizations are employed. In this paper, we present a distributed multi-storage resource architecture, which can satisfy both performance and capacity requirements by employing multiple storage resources. Compared to a traditional single storage resource architecture, our architecture provides a more flexible and reliable computing environment. This architecture can bring new opportunities for high performance computing as well as inherit state-of-the-art I/O optimization approaches that have already been developed. It provides application users with high-performance storage access even when they do not have the availability of a single large local storage archive at their disposal. We also develop an Application Programming Interface (API) that provides transparent management and access to various storage resources in our computing environment. Since I/O usually dominates the performance in I/O intensive applications, we establish an I/O performance prediction mechanism which consists of a performance database and a prediction algorithm to help users better evaluate and schedule their applications. A tool is also developed to help users automatically generate performance data stored in databases. The experiments show that our multi-storage resource architecture is a promising platform for high performance distributed computing.

DPFS: a distributed parallel file system

International Conference on Parallel Processing (ICPP)

Authors: Xiaohui Shen; A. Choudhary. Affiliations: Center for Parallel and Distributed Computing, Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL, USA

One of the challenges brought by large-scale scientific applications is how to avoid remote storage access by collectively using enough local storage resources to hold the huge amount of data generated by the simulation while providing high-performance I/O. DPFS, a Distributed Parallel File System, is designed and implemented to address this problem. DPFS collects locally distributed unused storage resources as a supplement to the internal storage of parallel computing systems to satisfy the storage capacity requirement of large-scale applications. In addition, like parallel file systems, DPFS provides striping mechanisms that divide a file into small pieces and distribute them across multiple storage devices for parallel data access. The unique feature of DPFS is that it provides three file levels, with each file level corresponding to a file striping method. In addition to the traditional linear striping method, DPFS also provides a novel multidimensional striping method that can solve performance problems of linear striping for many popular access patterns. Other issues such as load balancing and the user interface are also addressed in DPFS.

Keywords: File systems; Parallel processing; Large-scale systems; Spatial databases; Distributed computing; Distributed power generation; Computational modeling; Computer simulation; Multidimensional systems; Ear

Efficient construction of catastrophic patterns for VLSI reconfigurable arrays with bidirectional links

4th International Conference on Computing and Information, ICCI 1992

Authors: Nayak, Amiya; Pagli, Linda; Santoro, Nicola. Affiliations: Center for Parallel and Distributed Computing, School of Computer Science, Carleton University, Ottawa K1S 5B6, Canada; Dipartimento di Scienze dell'Informazione, University of Pisa, Corso Italia 40, Pisa 56100, Italy

ISBN: (print) 081862812X

Patterns of faults that are catastrophic for regular architectures, particularly the systolic arrays, have been studied. For a given link configuration, there are many fault patterns which are catastrophic. Among those, there is a particular fault pattern, called the reference fault pattern, which is crucial for the development of testing techniques; furthermore, the efficiency of any testing algorithm can be further improved in the presence of efficient algorithms for constructing the reference fault pattern. In this paper, we develop a new algorithm for the construction of the reference fault pattern for VLSI reconfigurable arrays in which the links are bidirectional. The complexity of the new algorithm is O(kN), which is a significant improvement over the existing O(N^2) algorithm, where k is the number of bypass links, and N is the length of the largest bypass link. © 1992 IEEE.

Keywords: Systolic arrays

On reducing false sharing while improving locality on shared memory multiprocessors

International Conference on Parallel Architecture and Compilation Techniques (PACT)

Authors: M. Kandemir; A. Choudhary; J. Ramanujam; P. Banerjee. Affiliations: Center for Parallel and Distributed Computing, Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL, USA

The performance of applications on large shared-memory multiprocessors with coherent caches depends on the interaction between the granularity of data sharing, the size of the coherence unit and the spatial locality exhibited by the applications, in addition to the amount of parallelism in the applications. Large coherence units are helpful in exploiting spatial locality, but worsen the effects of false sharing. We present a mathematical framework that allows a clean description of the relationship between spatial locality and false sharing. We first show how to identify a severe form of multiple-writer false sharing and then demonstrate the importance of the interaction between optimization techniques aimed at enhancing locality and the techniques oriented toward reducing false sharing. Given the conflicting requirements, a compiler based approach to this problem holds promise. We investigate the use of data transformations in addressing spatial locality and false sharing, and derive an approach that balances the impact of the two. Experimental results demonstrate that such a balanced approach outperforms those approaches that consider only one of these two issues. On an eight-processor SGI Origin 2000 system, our approach brings an additional 9% improvement over a powerful locality optimization technique that uses both loop and data transformations. Also, our approach obtains an additional 19% improvement over an optimization technique that is oriented specifically toward reducing false sharing.

Keywords: Parallel machines; Random access memory; Cost function; Parallel processing; Concurrent computing

A Multi-Dimensional Assessment Model and Its Application in E-learning Courses of Computer Science

21st Annual Conference of the Special Interest Group in Information Technology Education, SIGITE 2020

Authors: Luo, Jiwen; Lu, Feng; Wang, Tao. Affiliations: Key Lab. of Parallel and Distributed Computing, College of Computer, National University of Defense Technology, Changsha, China; National Engineering Research Center for Big Data Technology and System, Services Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China

ISBN: (print) 9781450370455

Computer science is a practical discipline. It is always a great challenge to evaluate students' computer practice using computer-aided means for a large number of students. We always need to address problems such as suspected plagiarism and deviation of the overall difficulty factor. In this paper, a multi-dimensional assessment model is designed for CS courses based on the detailed practice processing data in an E-learning system. The model comprehensively evaluates the students' learning process and results in three aspects: correctness, originality, and quality detection. Besides, teachers can easily participate in the assessment according to their needs. Correctness is an essential requirement, and originality is based on the clustering results of students' behaviors after clone detection to curb homework plagiarism. SonarQube is used to detect code quality and put forward higher requirements for code. Manual participation has improved the flexibility and applicability of the model to a certain extent. We applied this model on the EduCoder online education platform and carried out a comprehensive analysis of 485 students in the Parallel Programming Principles and Practice class of Huazhong University of Science and Technology. Experiment results confirm the distinction, rationality, and fairness of the model in assessing student performance. It not only gives students a credible, comprehensive score in large-scale online practical programming courses but also gives teachers and students corresponding suggestions based on the evaluation results. Furthermore, the model can be extended to other online education platforms. © 2020 ACM.

Keywords: Students

A resource broker for computing nodes selection in grid computing environments

3rd International Conference on Grid and Cooperative Computing, GCC 2004

Authors: Yang, Chao-Tung; Lai, Chuan-Lin; Shih, Po-Chi; Li, Kuan-Ching. Affiliations: High-Performance Computing Laboratory, Department of Computer Science and Information Engineering, Tunghai University, Taichung 407, Taiwan; Parallel and Distributed Processing Center, Department of Computer Science and Information Management, Providence University, Shalu, Taichung 433, Taiwan

ISBN: (print) 3540235647

As Grid computing is becoming a reality, there is a need for managing and monitoring the available resources worldwide, as well as a need for conveying these resources to the everyday user. This paper describes a resource broker whose main function is to match the available resources to the user's needs. The resource broker provides a uniform interface for accessing any of the available and appropriate resources using the user's credentials. The resource broker runs on top of the Globus Toolkit. Therefore, it provides security and current information about the available resources and serves as a link to the diverse systems available in the Grid. © Springer-Verlag Berlin Heidelberg 2004.

Keywords: Grid computing

Interprocedural array redistribution data-flow analysis

9th International Workshop on Languages and Compilers for Parallel Computing, LCPC 1996

Authors: Palermo, Daniel J.; Hodges IV, Eugene W.; Banerjee, Prithviraj. Affiliations: Hewlett-Packard Company, Convex Division, Richardson, TX 75083, United States; SAS Institute Inc, Cary, NC 27513, United States; Northwestern University, Center for Parallel and Distributed Computing, Evanston, IL 60208, United States

ISBN: (print) 3540630910

In High Performance Fortran (HPF), array redistribution can be described explicitly using directives (REDISTRIBUTE or REALIGN) which specify where new distributions become active or implicitly by calling functions which require different data distributions than the calling function. In order to actually compile an HPF program into an efficient form, however, both the redistribution operations as well as the possible distributions for the individual blocks of code must be known at compile-time. In this paper, we present an interprocedural dataflow framework which takes into account both explicit and implicit redistribution to automatically: (1) determine which distributions hold over specific sections of a program, (2) optimize both the inter- and intraprocedural transitions between dynamic distributions while still maintaining the original semantics of the HPF program, (3) determine when the distribution pattern specified by an HPF program causes a given array to be assigned multiple distributions due to different redistribution operations on multiple paths within a function or as a result of parameter aliasing (resulting in a non-conforming HPF program), as well as (4) convert (well behaved) dynamic HPF programs into equivalent static forms through a process we refer to as static distribution assignment (SDA) which can be used to extend the capabilities of existing subset HPF compilers that support static data distributions. As the approach presented in this paper has already been implemented as part of the PARADIGM (parallelizing compiler for distributed-memory General-purpose Multicomputers) project at the University of Illinois, examples will also be presented to demonstrate several applications of this framework. © Springer-Verlag Berlin Heidelberg 1997.

Keywords: Data flow analysis

An integrated graphical user interface for high performance distributed computing

International Symposium on Database Engineering and Applications (IDEAS)

Authors: X. Shen; W.-K. Liao; A. Choudhary. Affiliations: Center for Parallel and Distributed Computing, Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL, USA

ISBN: (print) 0769511406

Modern large-scale scientific applications commonly employ multiple compute and storage resources in a heterogeneously distributed environment. Working effectively and efficiently in such an environment is one of the major concerns in designing meta-data management systems. The authors present an integrated graphical user interface (GUI) that turns the entire environment into an easy-to-use control platform for managing complex programs and their large datasets. To hide the I/O latency when the user carries out interactive visualization, aggressive prefetching and caching techniques are employed in our GUI. The performance numbers show that the design of our Java GUI has achieved the goals of both high performance and ease of use.

Keywords: Graphical user interfaces; Distributed computing; Analytical models; Delay; Data visualization; Prefetching; Visual databases; Large-scale systems; Environmental management; Java

Parallel algorithms for power estimation

Design Automation Conference

Authors: V. Kim; P. Banerjee. Affiliations: Center for Parallel and Distributed Computing, Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL, USA

Several techniques currently exist for estimating the power dissipation of combinational and sequential circuits using exhaustive simulation, Monte Carlo sampling, and probabilistic estimation. Exhaustive simulation and Monte Carlo sampling techniques can be highly reliable but often require long runtimes. This paper presents a comprehensive study of pattern-partitioning and circuit-partitioning parallelization schemes for those two methodologies in the context of distributed-memory multiprocessing systems. Issues in pipelined event-driven simulation and dynamic load balancing are addressed. Experimental results are presented for an IBM SP-2 system and a network of HP-9000 workstations. For instance, runtimes have been reduced from over 3 hours to under 20 minutes in one case.

Keywords: Parallel algorithms; Circuit simulation; Monte Carlo methods; Runtime; Power dissipation; Sequential circuits; Power system reliability; Multiprocessing systems; Discrete event simulation; Load management
