Search Results - Inner Mongolia University Library

20th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2006

Authors: Guangming, Tan; Lin, Xu; Shengzhong, Feng; Ninghui, Sun (Institute of Computing Technology, Chinese Academy of Sciences, China; Graduate School, Chinese Academy of Sciences, China)

ISBN: (print) 1424400546

As bioinformatics is an emerging application of high performance computing, this paper first evaluates the memory performance of several representative bioinformatics applications so that appropriate optimization methods can be applied. Based on the computational behavior of these applications, we propose two optimized algorithms for high performance computer architectures. 1) For the data (I/O)-intensive program MegaBlast, we overlap computation with I/O to produce an improved high-throughput algorithm with reduced time and memory requirements. 2) For a CPU-intensive RNA secondary structure prediction algorithm, we propose a fine-grain parallel O(N^3) algorithm based on reconfigurable arrays (FPGAs). To optimize the FPGA architecture, we evaluate the performance of different architectures using a cycle-by-cycle simulator. © 2006 IEEE.

Keywords: Natural sciences computing
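
The abstract above mentions overlapping computation with I/O. The following is a minimal sketch of that general idea only, not the paper's MegaBlast algorithm: a reader thread fills a bounded queue while the main thread processes chunks. The names `read_chunks`, `process`, and `overlapped_pipeline` are hypothetical, and the in-memory list stands in for real disk reads.

```python
import queue
import threading


def read_chunks(chunks, out_q):
    """Producer: stands in for the I/O stage (hypothetical; real code would read from disk)."""
    for chunk in chunks:
        out_q.put(chunk)  # blocks when the queue is full, which bounds memory use
    out_q.put(None)       # sentinel: no more data


def process(chunk):
    """Placeholder computation (assumption): count 'G' and 'C' characters in a chunk."""
    return sum(1 for base in chunk if base in "GC")


def overlapped_pipeline(chunks, depth=2):
    """Read and process concurrently, buffering at most `depth` chunks."""
    q = queue.Queue(maxsize=depth)
    reader = threading.Thread(target=read_chunks, args=(chunks, q))
    reader.start()
    results = []
    while True:
        chunk = q.get()
        if chunk is None:  # sentinel reached: the reader is done
            break
        results.append(process(chunk))
    reader.join()
    return results


if __name__ == "__main__":
    print(overlapped_pipeline(["ACGT", "GGCC", "ATAT"]))
```

The bounded queue is what keeps memory use limited: the reader can run at most `depth` chunks ahead of the consumer. In CPython, threads overlap blocking I/O waits with computation rather than running CPU-bound work in parallel.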

Experience of Parallelizing cryo-EM 3D Reconstruction on a CPU-GPU Heterogeneous System

20th International Symposium on High Performance Distributed Computing

Authors: Li, Linchuan; Li, Xingjian; Tan, Guangming; Chen, Mingyu; Zhang, Peiheng (Chinese Acad Sci, Inst Comp Technol, Key Lab Comp Syst & Architecture, Beijing, Peoples R China)

ISBN: (print) 9781450305525

Heterogeneous architecture is becoming an important way to build massively parallel computer systems, e.g., the CPU-GPU heterogeneous systems ranked in the Top500 list. However, it is a challenge to efficiently utilize the massive parallelism of both applications and architectures on such heterogeneous systems. In this paper we present a practice on how to exploit and orchestrate parallelism at the algorithm level to take advantage of the underlying parallelism at the architecture level. A potential Petaflops application, cryo-EM 3D reconstruction, is selected as an example. We exploit all possible parallelism in cryo-EM 3D reconstruction, and leverage a self-adaptive dynamic scheduling algorithm to create a proper parallelism mapping between the application and the architecture. The parallelized programs are evaluated on a subsystem of the Dawning Nebulae supercomputer, whose node is composed of two Intel six-core Xeon CPUs and one Nvidia Fermi GPU. The experiment confirms that hierarchical parallelism is an efficient pattern of parallel programming to utilize the capabilities of both CPU and GPU in a heterogeneous system. The CUDA kernels run more than 3 times faster than the OpenMP-parallelized ones using 12 cores (threads). Based on the GPU-only version, the hybrid CPU-GPU program further improves the whole application's performance by 30% on average.

Keywords: task parallelism; data parallelism; high performance computing; CUDA; cryo-EM

Proceedings of the 20th Annual International Symposium on Computer Architecture

Proceedings of the 20th Annual International Symposium on Computer Architecture

ISBN: (print) 0818638109

These conference proceedings contain 32 papers. The main topics are architectural characteristics of scientific applications, TLBs and memory management, input/output systems, fault-tolerant computer architecture, multiprocessor caches, high-performance computing from the application perspective, multithreading support, shared memory systems, cache designs, and multiprocessor memory systems and interconnections.

Keywords: computer architecture

Early Detection of At-Risk Students Using Machine Learning

20th International Conference on Foundations of Computer Science, FCS 2024, and 20th International Conference on Frontiers in Education, FECS 2024, held as part of the World Congress in Computer Science, Computer Engineering and Applied Computing, CSCE 2024

Authors: Martinez, Azucena L. Jimenez; Sood, Kanika; Mahto, Rakeshkumar (California State University, Department of Computer Science, Fullerton, CA, United States; California State University, Department of Electrical and Computer Engineering, Fullerton, CA 92831, United States)

ISBN: (print) 9783031859298

This research presents preliminary work to address the challenge of identifying at-risk students using supervised machine learning and three unique data categories: engagement, demographics, and performance data collected in Fall 2023 using Canvas and the California State University, Fullerton dashboard. We aim to tackle the persistent challenges of higher education retention and student dropout rates by screening for at-risk students and building a high-risk identification system. By focusing on previously overlooked behavioral factors alongside traditional metrics, this work aims to address educational gaps, enhance student outcomes, and significantly boost student success across disciplines at the University. Pre-processing steps include anonymizing student information, managing missing data, and identifying the most significant features. Given the mixed data types in the datasets and the binary classification nature of this study, this work considers several machine learning models, including Support Vector Machines (SVM), Naive Bayes, K-nearest neighbors (KNN), Decision Trees, Logistic Regression, and Random Forest. We predict at-risk students and plan to identify critical periods of the semester when student performance is most vulnerable. We will use validation techniques such as train-test split and k-fold cross-validation to ensure the reliability of the models. Our analysis indicates that all algorithms generate an acceptable outcome for at-risk student predictions, while Naive Bayes performs best overall. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

Keywords: Support vector regression
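
The abstract above names k-fold cross-validation as a validation technique. As a rough illustration of how the folds are formed, here is a minimal sketch in plain Python. The function name `k_fold_indices` is hypothetical, the folds are contiguous and unshuffled for simplicity, and nothing here reflects the authors' actual pipeline or data.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds over range(n_samples)."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample so sizes differ by at most 1.
        size = base + (1 if fold < extra else 0)
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


if __name__ == "__main__":
    for train_idx, test_idx in k_fold_indices(10, 3):
        print("test:", test_idx, "train:", train_idx)
```

Each sample appears in exactly one test fold, so every record is used for evaluation once and for training k - 1 times. In practice the indices are usually shuffled first, and for imbalanced labels such as at-risk versus not at-risk, a stratified split is a common choice.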

PVCoherence: Designing flat coherence protocols for scalable verification

20th IEEE International Symposium on High Performance Computer Architecture, HPCA 2014

Authors: Zhang, Meng; Bingham, Jesse D.; Erickson, John; Sorin, Daniel J. (Department of ECE, Duke University, United States; Intel Corporation, United States)

ISBN: (print) 9781479930975

The goal of this work is to design cache coherence protocols with many cores that can be verified with state-of-the-art automated verification methodologies. In particular, we focus on flat (non-hierarchical) coherence protocols, and we use a mostly-automated methodology based on parametric verification (PV). We propose several design guidelines that architects should follow if they want to design protocols that can be parametrically verified. We experimentally evaluate performance, storage overhead, and scalability of a protocol verified with PV compared to a highly optimized protocol that cannot be verified with PV. © 2014 IEEE.

Keywords: computer architecture

Reducing the cost of persistence for nonvolatile heaps in end user devices

20th IEEE International Symposium on High Performance Computer Architecture, HPCA 2014

Authors: Kannan, Sudarsun; Gavrilovska, Ada; Schwan, Karsten (Georgia Institute of Technology, College of Computing, Atlanta, United States)

ISBN: (print) 9781479930975

This paper explores the performance implications of using future byte-addressable non-volatile memory (NVM) like PCM in end client devices. We explore how to obtain dual benefits - increased capacity and faster persistence - with low overhead and cost. Specifically, while increased memory capacity can be gained by treating NVM as virtual memory, its use for persistent data storage incurs high consistency (frequent cache flushes) and durability (logging for failure) overheads, referred to as 'persistence cost'. These not only affect the applications causing them, but also other applications relying on the same cache and/or memory hierarchy. This paper analyzes and quantifies in detail the performance overheads of persistence, which include (1) the aforementioned cache interference as well as (2) memory allocator overheads, and finally, (3) durability costs due to logging. Novel solutions to overcome such overheads include (1) a page contiguity algorithm that reduces interference-related cache misses, (2) a cache-efficient, NVM write-aware memory allocator that reduces cache line flushes of allocator state by 8X, and (3) hybrid logging that reduces durability overheads substantially. With these solutions, experimental evaluations with different end user applications and SPEC2006 benchmarks show up to 12% reductions in cache misses, thereby reducing the total number of NVM writes. © 2014 IEEE.

Keywords: Durability

Supporting x86-64 address translation for 100s of GPU lanes

20th IEEE International Symposium on High Performance Computer Architecture, HPCA 2014

Authors: Power, Jason; Hill, Mark D.; Wood, David A. (Department of Computer Sciences, University of Wisconsin-Madison, United States)

ISBN: (print) 9781479930975

Efficient memory sharing between CPU and GPU threads can greatly expand the effective set of GPGPU workloads. For increased programmability, this memory should be uniformly virtualized, necessitating compatible address translation support for GPU memory references. However, even a modest GPU might need 100s of translations per cycle (6 CUs * 64 lanes/CU) with memory access patterns designed for throughput more than locality. © 2014 IEEE.

Keywords: Graphics processing unit
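
The "100s of translations per cycle" figure in the abstract above follows directly from the parenthetical (6 CUs * 64 lanes/CU). A small worked calculation, assuming every lane issues one memory reference per cycle and no references are coalesced; the function name is hypothetical.

```python
COMPUTE_UNITS = 6   # "6 CUs" from the abstract
LANES_PER_CU = 64   # "64 lanes/CU" from the abstract


def peak_translations_per_cycle(cus, lanes_per_cu):
    """Upper bound, assuming each lane needs one address translation per cycle."""
    return cus * lanes_per_cu


if __name__ == "__main__":
    print(peak_translations_per_cycle(COMPUTE_UNITS, LANES_PER_CU))  # 6 * 64 = 384
```

This is a worst-case bound under the stated assumption; the rate a real GPU needs depends on how many references fall on the same page.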

Best of SBAC-PAD 2012

Parallel Computing, 2014, vol. 40, no. 9, pp. 512-513

Authors: Schnorr, Lucas Mello; Alexandre Navaux, Philippe Olivier (Univ Fed Rio Grande do Sul, Inst Informat, BR-91501970 Porto Alegre, RS, Brazil)

This special issue presents new trends in computer architecture and in parallel and distributed systems. It is based on the best papers of the 24th International Symposium on Computer Architecture and High Performance Computing, which was held at Columbia University in New York, NY, USA on October 24-26, 2012. The authors were invited to provide extended versions of the papers presented at the conference, taking into account suggestions from the double-blind peer review process and comments gathered during the conference.

Keywords: computer architecture; parallel and distributed systems; high performance computing

Scalably verifiable dynamic power management

20th IEEE International Symposium on High Performance Computer Architecture, HPCA 2014

Authors: Matthews, Opeoluwa; Zhang, Meng; Sorin, Daniel J. (Department of Electrical and Computer Engineering, Duke University, United States)

ISBN: (print) 9781479930975

Dynamic power management (DPM) is critical to maximizing the performance of systems ranging from multicore processors to datacenters. However, one formidable challenge with DPM schemes is verifying that they are correct as the number of computational resources scales up. In this paper, we develop a DPM scheme that is scalably verifiable with fully automated formal tools. The key to the design is that the DPM scheme has fractal behavior; that is, it behaves the same at every scale. We show that the fractal design enables scalable formal verification, and simulation shows that our scheme does not sacrifice much performance compared to an oracle DPM scheme that optimally allocates power to computational resources. We implement our scheme in a 2-socket 16-core x86 system and experimentally evaluate it. © 2014 IEEE.

Keywords: Fractals

High-level service connectors for component-based high performance computing

19th International Symposium on Computer Architecture and High Performance Computing

Authors: de Carvalho-Junior, Francisco Heron; Correa, Ricardo Cordeiro; Araujo, Gisele Azevedo; Silva, Jefferson Carvalho; Lins, Rafael Duelre (Univ Fed Ceara, Dept Comp, Fortaleza, Ceara, Brazil; Univ Fed Pernambuco, Dept Elect Sistemas, Recife, PE, Brazil)

ISBN: (print) 9780769530147

Component-based programming has been applied to address the requirements of applications in high performance computing (HPC). The usual service connectors of commercial component models do not fit some requirements of HPC, mainly regarding support for parallelism. This paper looks at extensions to the usual notion of service connector to meet such requirements, using the # component model as a substratum and evidencing its expressiveness.

Keywords: computer programming
