Search Results - Inner Mongolia University Library

Performance of Text-Independent Automatic Speaker Recognition on a Multicore System

Tsinghua Science and Technology, 2024, Vol. 29, No. 2, pp. 447-456

Authors: Rand Kouatly; Talha Ali Khan (Faculty of Tech and Software Engineering, University of Europe for Applied Sciences, Potsdam 14469, Germany)

This paper studies a high-speed text-independent Automatic Speaker Recognition (ASR) algorithm based on a multicore system's Gaussian Mixture Model (GMM). The high speed is achieved using parallel implementation of the feature extraction and aggregation methods during training and testing *** memory parallel programming techniques using both OpenMP and PThreads libraries are developed to accelerate the code and improve the performance of the ASR *** experimental results show speed-up improvements of around 3.2 on a personal laptop with Intel i5-6300HQ (2.3 GHz, four cores without hyper-threading, and 8 GB of RAM). In addition, a remarkable 100% speaker recognition accuracy is achieved.

Keywords: Automatic Speaker Recognition (ASR); Gaussian Mixture Model (GMM); shared memory parallel programming; PThreads; OpenMP

Program development environment for OpenMP programs on ccNUMA architectures

3rd International Conference on Large-Scale Scientific Computing (ICLSSC 2001)

Authors: Chapman, B; Hernandez, O; Patil, A; Prabhakar, A (Univ Houston, Dept Comp Sci, Houston, TX, USA)

ISBN: (print) 3540430431

OpenMP is emerging as a viable high-level programming model for shared memory parallel systems. Although it has also been implemented on ccNUMA architectures, it is hard to obtain high performance on such systems. In this paper, we discuss various ways in which OpenMP may be used on ccNUMA and NUMA architectures, and describe a programming style that can provide scalable high performance on such systems. We give an example of its use on the SGI Origin 2000, and on TreadMarks, a Software DSM system from Rice University. These results have encouraged us to work on a programming environment that provides general support for OpenMP application development and incorporates a system to translate standard loop-level parallel OpenMP code, with additional user input in the form of directives, into an equivalent OpenMP program relying on our alternative programming style. The equivalent program does not use constructs external to OpenMP.

Keywords: shared memory parallel programming; OpenMP; ccNUMA architectures; restructuring; data locality; data distribution; software distributed shared memory; programming environments

All-uses testing of shared memory parallel programs

Software Testing, Verification & Reliability, 2003, Vol. 13, No. 1, pp. 3-24

Authors: Yang, CSD; Pollock, LL (Univ Delaware, Newark, DE 19716, USA; W Chester Univ PA, Comp Sci Dept, W Chester, PA 19383, USA)

Parallelism has become a way of life for many scientific programmers. A significant challenge in bringing the power of parallel machines to these programmers is providing them with a suite of software tools similar to the tools that sequential programmers currently utilize. Unfortunately, writing correct parallel programs remains a challenging task. In particular, automatic or semi-automatic testing tools for parallel programs are lacking. This paper takes a first step in developing an approach to providing all-uses coverage for parallel programs. A testing framework and theoretical foundations for structural testing are presented, including test data adequacy criteria and hierarchy, formulation and illustration of all-uses testing problems, classification of all-uses test cases for parallel programs, and both theoretical and empirical results with regard to what can be achieved with all-uses coverage for parallel programs. Copyright (C) 2003 John Wiley & Sons, Ltd.

Keywords: structural software testing; shared memory parallel programming; all-uses testing

Achieving performance under OpenMP on ccNUMA and software distributed shared memory systems

Concurrency and Computation: Practice & Experience, 2002, Vol. 14, No. 8-9, pp. 713-739

Authors: Chapman, B; Bregier, F; Patil, A; Prabhakar, A (Univ Houston, Dept Comp Sci, Houston, TX 77204, USA)

OpenMP is emerging as a viable high-level programming model for shared memory parallel systems. It was conceived to enable easy, portable application development on this range of systems, and it has also been implemented on cache-coherent Non-Uniform Memory Access (ccNUMA) architectures. Unfortunately, it is hard to obtain high performance on the latter architecture, particularly when large numbers of threads are involved. In this paper, we discuss the difficulties faced when writing OpenMP programs for ccNUMA systems, and explain how the vendors have attempted to overcome them. We focus on one such system, the SGI Origin 2000, and perform a variety of experiments designed to illustrate the impact of the vendor's efforts. We compare codes written in a standard, loop-level parallel style under OpenMP with alternative versions written in a Single Program Multiple Data (SPMD) fashion, also realized via OpenMP, and show that the latter consistently provides superior performance. A carefully chosen set of language extensions can help us translate programs from the former style to the latter (or to compile directly, but in a similar manner). Syntax for these extensions can be borrowed from HPF, and some aspects of HPF compiler technology can help the translation process. It is our expectation that an extended language, if well compiled, would improve the attractiveness of OpenMP as a language for high-performance computation on an important class of modern architectures. Copyright (C) 2002 John Wiley & Sons, Ltd.

Keywords: shared memory parallel programming; OpenMP; ccNUMA architectures; restructuring; data locality; data distributions; software distributed shared memory
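
The SPMD style that this abstract contrasts with loop-level parallelism can be pictured with a rough sketch. The paper's codes are OpenMP programs; the Python below is only a structural analogy built on threads, and `block_bounds`, `spmd_run`, and the two-phase workload are hypothetical names and computations chosen for illustration, not taken from the paper. Each worker derives a fixed block of the index space from its rank and reuses that same block in every phase, which is the property that helps data locality on ccNUMA systems.

```python
import threading


def block_bounds(rank, nworkers, n):
    """Return the half-open index range [lo, hi) owned by worker `rank`."""
    base, extra = divmod(n, nworkers)
    # The first `extra` workers each take one additional element.
    lo = rank * base + min(rank, extra)
    hi = lo + base + (1 if rank < extra else 0)
    return lo, hi


def spmd_run(n, nworkers=4):
    """Run a two-phase computation in which every worker executes the
    same program on its own block (illustrative workload only)."""
    a = [0] * n
    b = [0] * n
    barrier = threading.Barrier(nworkers)

    def program(rank):
        lo, hi = block_bounds(rank, nworkers, n)
        # Phase 1: each worker writes only the block it owns.
        for i in range(lo, hi):
            a[i] = i * i
        # All of `a` must be complete before any neighbour value is read.
        barrier.wait()
        # Phase 2: same block as before; only the boundary read crosses owners.
        for i in range(lo, hi):
            b[i] = a[i] + a[(i + 1) % n]

    threads = [threading.Thread(target=program, args=(r,)) for r in range(nworkers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return b
```

In a loop-level version, each of the two loops would be handed to the runtime separately and the iteration-to-thread mapping could differ between them; here the mapping is fixed once per worker. Python threads do not run this workload in parallel, so the sketch shows the program structure only, not the performance effect reported in the paper.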

An Architecture and Programming Model for Accelerating Parallel Commutative Computations via Privatization

ACM SIGPLAN Notices, 2017, Vol. 52, No. 8, pp. 431-432

Authors: Balaji, Vignesh; Tirumala, Dhruva; Lucia, Brandon (Carnegie Mellon University, United States)

Integrating Parallel and Distributed Computing in Early Computing Classes (2023)

Proceedings of the 54th ACM Technical Symposium on Computer Science Education V. 2

Authors: Sheikh Ghafoor; Charles Weems; Alan Sussman; Ramachandran Vaidyanathan; Sushil Prasad (Tennessee Tech University, Cookeville, TN, USA; University of Massachusetts, Amherst, MA, USA; University of Maryland, College Park, MD, USA; Louisiana State University, Baton Rouge, LA, USA; University of Texas at San Antonio, San Antonio, TX, USA)

ISBN: (print) 9781450394338

Parallel and distributed computing (PDC) has become pervasive in all aspects of computing, and thus it is essential that students include parallelism and distribution in the computational thinking that they apply to problem solving, from the very beginning. Computer science education is still teaching to a 20th century model of algorithmic problem solving. Sequence, branch, and loop are taught in our early courses as the only organizing principles needed for algorithms, and we invest considerable time in showing how best to sequentially process large volumes of data. All computing devices that students use currently have multiple cores as well as a GPU in many cases. Most of their favorite applications use multiple cores and numbers of distributed processors. Often concurrency offers simpler solutions than sequential approaches. Industry is desperate for software engineers who think naturally in terms of exploiting these capabilities, rather than seeing them as an exotic upper-level topic that gets layered over a sequential solution. However, we are still teaching students to solve problems using sequential thinking. In this workshop we overview key PDC concepts and provide examples of how they may naturally be incorporated in early computing classes. We will introduce plugged and unplugged curriculum modules that have been successfully integrated in existing computing classes at multiple institutions. We will highlight the upcoming summer training workshop, for which we have funding to support attendance, as well as other CDER (Center for Parallel and Distributed Computing Curriculum Development and Educational Resources) activities.

Keywords: HPC education; PDC education; shared memory parallel programming; computing education; early computing class

POSTER: An Architecture and Programming Model for Accelerating Parallel Commutative Computations via Privatization ('17)

Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Authors: Vignesh Balaji; Dhruva Tirumala; Brandon Lucia (Carnegie Mellon University, Pittsburgh, PA, USA)

ISBN: (print) 9781450344937

Synchronization and data movement are the key impediments to an efficient parallel execution. To ensure that data shared by multiple threads remain consistent, the programmer must use synchronization (e.g., mutex locks) to serialize threads' accesses to data. This limits parallelism because it forces threads to sequentially access shared resources. Additionally, systems use cache coherence to ensure that processors always operate on the most up-to-date version of a value even in the presence of private caches. Coherence protocol implementations cause processors to serialize their accesses to shared data, further limiting parallelism and performance.

Keywords: cache-coherence; shared memory parallel programming; commutativity
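
The serialization problem described in this abstract, and the general idea of privatizing commutative updates, can be sketched in software. The poster itself proposes an architecture and programming model; the Python below is not that design, only a minimal software analogy under stated assumptions. The functions `split`, `count_with_lock`, and `count_privatized` are hypothetical names introduced here. Because addition is commutative, each thread can accumulate into a private copy and the copies can be merged in any order afterwards.

```python
import threading


def split(items, num_threads):
    # Strided partition: thread t gets items t, t + n, t + 2n, ...
    return [items[t::num_threads] for t in range(num_threads)]


def count_with_lock(items, num_threads=4):
    """Shared histogram: every update is serialized by a single mutex."""
    shared = {}
    lock = threading.Lock()

    def worker(chunk):
        for x in chunk:
            with lock:  # threads take turns here
                shared[x] = shared.get(x, 0) + 1

    threads = [threading.Thread(target=worker, args=(c,))
               for c in split(items, num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared


def count_privatized(items, num_threads=4):
    """Private histograms per thread, merged once after all threads finish."""
    private = [{} for _ in range(num_threads)]

    def worker(tid, chunk):
        local = private[tid]  # no other thread touches this dict
        for x in chunk:
            local[x] = local.get(x, 0) + 1

    threads = [threading.Thread(target=worker, args=(t, c))
               for t, c in enumerate(split(items, num_threads))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Merge step: order does not matter because addition is commutative.
    merged = {}
    for local in private:
        for key, n in local.items():
            merged[key] = merged.get(key, 0) + n
    return merged
```

Both functions return the same histogram; the privatized version needs no lock inside the update loop. Python's global interpreter lock prevents real parallel speed-up here, so the sketch illustrates the structure of the technique rather than its performance.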

POSTER: HythTM: Extending the Applicability of Intel TSX Hardware Transactional Support ('17)

Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Authors: Arnamoy Bhattacharyya; Mike Dai Wang; Mihai Burcea; Yi Ding; Allen Deng; Sai Varikooty; Shafaaf Hossain; Cristiana Amza (University of Toronto, Toronto, ON, Canada)

ISBN: (print) 9781450344937

In this work, we introduce and experimentally evaluate a new hybrid software-hardware transactional memory (TM) prototype based on Intel's Haswell TSX architecture. Our prototype extends the applicability of the existing hardware support for TM by interposing a hybrid fall-back layer before the sequential, big-lock fall-back path used by standard TSX-supported solutions to guarantee progress. In our experimental evaluation we use SynQuake, a realistic game benchmark modeled after Quake. Our results show that our hybrid transactional system, which we call HythTM, is able to reduce the number of transactions that go to the sequential software layer, hence avoiding hardware transaction aborts and loss of parallelism. HythTM optimizes application throughput and scalability up to 5.05x, when compared to the hardware TM with sequential fall-back path.

Keywords: commutativity; cache-coherence; shared memory parallel programming
