In a general-purpose computing system, several parallel applications run simultaneously on the same platform. Even if each application is highly tuned for that specific platform, additional performance issues arise in such a dynamic environment, in which multiple applications compete for resources. Different scheduling and resource management techniques have been proposed, at either the operating system or the user level, to improve the performance of concurrent workloads. In this paper, we propose a task-based strategy called "Steal Locally, Share Globally", implemented in the runtime of our parallel programming model GPRM (Glasgow Parallel Reduction Machine). We have chosen a state-of-the-art manycore parallel machine, the Intel Xeon Phi, to compare GPRM with some well-known parallel programming models, OpenMP, Intel Cilk Plus and Intel TBB, in both single-programming and multiprogramming scenarios. We show that GPRM not only performs well for single workloads, but also outperforms the other models for multiprogramming workloads. There are three considerations regarding our task-based scheme: (i) it is implemented inside the parallel framework, not as a separate layer; (ii) it improves performance without the need to change the number of threads for each application; and (iii) it can be further tuned and improved, not only for GPRM applications but also for other, equivalent parallel programming models.
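The "Steal Locally, Share Globally" policy can be pictured as a two-level task pool: each worker first drains its own deque, then steals from a sibling ("locally"), and only then pulls from a shared pool ("globally"). The sketch below is a hypothetical serial simulation of that policy, not GPRM's actual runtime code; the `Worker` class and `run` function are illustrative names.

```python
import collections

class Worker:
    """One worker with a private deque of tasks (an illustrative model;
    GPRM's real runtime is not reproduced here)."""
    def __init__(self):
        self.deque = collections.deque()

def run(workers, global_pool):
    """Drive all tasks to completion: each worker runs its own work first,
    steals from a sibling's deque next ('steal locally'), and only then
    takes work from the shared pool ('share globally')."""
    done = []
    while True:
        progressed = False
        for w in workers:
            if w.deque:                      # own work first
                done.append(w.deque.popleft()())
                progressed = True
            else:                            # steal locally from a sibling
                victim = next((v for v in workers
                               if v is not w and v.deque), None)
                if victim is not None:
                    done.append(victim.deque.pop()())
                    progressed = True
                elif global_pool:            # fall back to the global pool
                    done.append(global_pool.popleft()())
                    progressed = True
        if not progressed:
            return done
```

A real implementation would run the workers on separate threads with lock-free deques; the serial loop above only shows the order in which the three task sources are consulted.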
Ever since discussions about a possible quantum computer arose, quantum simulation has been at the forefront of prospective applications, being a task that promises quantum advantage. Recent advances have made it feasible to simulate complex molecules using Variational Quantum Eigensolvers or to study the dynamics of many-body spin Hamiltonians. These simulations have the potential to yield valuable outcomes through the application of error mitigation techniques. Simulating smaller models is also of great importance and, in the current Noisy Intermediate Scale Quantum era, is more feasible since it is less prone to errors. The objective of this work is to examine the theoretical background and the circuit implementation of a quantum tunneling simulation, with an emphasis on hardware considerations. This study presents the theoretical background required for such an implementation and highlights the main stages of its development. Building on classic approaches to quantum tunneling simulation, it aims to improve the results of such simulations by employing two error mitigation techniques, Zero Noise Extrapolation and Readout Error Mitigation, in conjunction with multiprogramming of the quantum chip, a technique used to address the hardware under-utilization problem that arises in such contexts. With a focus on hardware run considerations for superconducting architectures, various circuit implementation alternatives are clarified. The role of the compiler, the need for hardware-aware design, the different error mitigation techniques and multiprogramming are discussed, yielding a final workflow tailored to the Noisy Intermediate Scale Quantum era.
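The two mitigation techniques named in the abstract can each be sketched in a few lines of classical post-processing. Zero Noise Extrapolation fits the expectation values measured at several amplified noise levels and extrapolates to the zero-noise limit; Readout Error Mitigation applies the inverse of a measured confusion matrix to the raw outcome probabilities. This is a minimal sketch assuming the noise-scaled expectation values and the confusion matrix have already been measured on hardware; it is not the paper's implementation.

```python
import numpy as np

def zero_noise_extrapolate(scale_factors, expectations, degree=1):
    """Fit a polynomial of the given degree to (noise scale, expectation)
    pairs and evaluate it at scale 0, the zero-noise limit."""
    coeffs = np.polyfit(scale_factors, expectations, degree)
    return float(np.polyval(coeffs, 0.0))

def mitigate_readout(noisy_probs, confusion):
    """Undo readout errors: solve M * p_ideal = p_noisy for the measured
    confusion matrix M, then clip and renormalize to a valid distribution."""
    p = np.linalg.solve(confusion, noisy_probs)
    p = np.clip(p, 0.0, None)
    return p / p.sum()
```

For example, expectation values 0.9, 0.8, 0.7 measured at noise scales 1, 2, 3 extrapolate linearly to 1.0 at scale 0; in practice one would choose the fit degree against the observed noise model.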
Noisy intermediate-scale quantum (NISQ) computers offered by quantum cloud providers are widely used for quantum computing (QC). Among them, superconducting quantum computers, with their high scalability and mature processing technology based on traditional silicon chips, have become the preferred platform for most commercial companies and research institutions developing QC. However, superconducting quantum computers suffer from fluctuations due to noisy environments. To maintain reliability across executions, calibration of the quantum processor is critically important. During the long procedure of calibrating physical quantum bits (qubits), quantum processors must be taken offline. In this work, we propose a real-time calibration framework (RCF) that executes quantum program tasks and calibrates in-demand qubits simultaneously, without interrupting the quantum processor. Across a widely used NISQ evaluation benchmark suite, QASMBench, RCF achieves up to 18% reliability improvement for applications. For reliability on different physical qubits, RCF achieves an average gain of 15.7% (up to 36.7%). For cloud quantum machines, throughput can be improved to up to 9.5 tasks per minute (6.5 on average), based on the baseline calibration time. In conclusion, RCF offers a reliable solution for large-scale, long-serving quantum machines.
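The core idea, running calibration on qubits that pending program tasks do not occupy so the processor never goes fully offline, can be illustrated with a greedy allocator. Everything below (the job tuples, the calibration queue, the `schedule` function) is a hypothetical toy model for illustration, not the framework's actual interface.

```python
def schedule(num_qubits, jobs, calib_queue):
    """Greedily place each (job_id, width) job on free physical qubits,
    then let calibration run on queued qubits no placed job is using,
    so execution and calibration proceed side by side."""
    busy = set()
    placed = []
    for job_id, width in jobs:
        free = [q for q in range(num_qubits) if q not in busy]
        if len(free) >= width:
            alloc = free[:width]
            busy.update(alloc)
            placed.append((job_id, alloc))
    calibrating = [q for q in calib_queue if q not in busy]
    return placed, calibrating
```

A production framework would additionally weigh qubit quality, connectivity, and calibration urgency when choosing which qubits to hand to jobs versus calibration.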
In the noisy intermediate-scale quantum (NISQ) era, the idea of quantum multiprogramming, running multiple quantum circuits (QCs) simultaneously on the same hardware, helps to improve the throughput of quantum computation. However, the crosstalk, unwanted interference between qubits on NISQ processors, may cause performance degradation when using multiprogramming. To address this challenge, we introduce palloq (parallel allocation of QCs), a novel compilation protocol. Palloq improves the performance of quantum multiprogramming on NISQ processors, while paying attention to 1) the combination of QCs chosen for parallel execution and 2) the assignment of program qubit variables to physical qubits, to reduce unwanted interference among the active set of QCs. We also propose a software-based crosstalk detection protocol using a new combination of randomized benchmarking methods. Our method successfully characterizes the suitability of hardware for multiprogramming with relatively low detection costs. We found a tradeoff between the success rate and execution time of the multiprogramming. Our results will be of value when device throughput becomes a significant bottleneck. Until service providers have enough quantum processors available to more than meet demand, this approach will be attractive to the service providers and users who want to optimize job management and throughput of the processor.
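Palloq's second concern, assigning concurrent circuits to physical-qubit regions so that crosstalk among the active regions is minimized, can be illustrated with a brute-force allocator over a pre-characterized crosstalk matrix. This is a toy stand-in for the paper's protocol: the region granularity, the symmetric crosstalk matrix, and the exhaustive search are all simplifying assumptions.

```python
from itertools import permutations

def allocate(circuits, regions, crosstalk):
    """Try every assignment of circuits to distinct regions and keep the
    one whose active regions have the least total pairwise crosstalk
    (feasible only for small region counts; shown for clarity)."""
    best, best_cost = None, float("inf")
    for perm in permutations(regions, len(circuits)):
        active = list(perm)
        cost = sum(crosstalk[a][b]
                   for i, a in enumerate(active)
                   for b in active[i + 1:])
        if cost < best_cost:
            best, best_cost = dict(zip(circuits, perm)), cost
    return best, best_cost
```

On real devices the crosstalk entries would come from a characterization pass such as the randomized-benchmarking-based detection the abstract describes, and a heuristic search would replace the exhaustive one.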
The working set model for program behavior was invented in 1965. It has stood the test of time in virtual memory management for over 50 years. It is considered the ideal for managing memory in operating systems and caches. Its superior performance was based on the principle of locality, which was discovered at the same time; locality is the observed tendency of programs to use distinct subsets of their pages over extended periods of time. This tutorial traces the development of working set theory from its origins to the present day. We will discuss the principle of locality and its experimental verification. We will show why working set memory management resists thrashing and generates near-optimal system throughput. We will present the powerful, linear-time algorithms for computing working set statistics and applying them to the design of memory systems. We will debunk several myths about locality and the performance of memory systems. We will conclude with a discussion of the application of the working set model in parallel systems, modern shared CPU caches, network edge caches, and inventory and logistics management.
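The working set W(t, τ) is the set of distinct pages referenced in the window of the last τ references ending at time t, and its size can be tracked in a single linear pass by noting when a page's last reference drops out of the window. A minimal sketch of that idea (the tutorial's own algorithms are more general):

```python
def working_set_sizes(trace, tau):
    """Return |W(t, tau)| for each time t in one pass over the reference
    trace, using each page's last-reference time: a page leaves the
    working set exactly when its last reference falls out of the window."""
    last_ref = {}
    size = 0
    sizes = []
    for t, page in enumerate(trace):
        old_t = t - tau
        if old_t >= 0:
            old_page = trace[old_t]
            # the page expires only if it was not referenced again since
            if last_ref.get(old_page) == old_t:
                size -= 1
                del last_ref[old_page]
        if page not in last_ref:
            size += 1
        last_ref[page] = t
        sizes.append(size)
    return sizes
```

For the trace a, a, b, c, a with τ = 3, the sizes are 1, 1, 2, 3, 3: the second reference to `a` does not grow the set, and at the final step the window holds b, c, a.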
The exact response time analysis for fixed priority scheduling (FPS) in lowest-priority-first-based feasibility tests is commonly required as part of system design tools. This letter proposes an efficient method for this, which we name the incremental lower bound (ILB) calculation method. Compared to the best previously known algorithm, the incremental calculation method, ILB reduces the feasibility-test iterations and run times by more than 38% and 20%, respectively, regardless of utilization and the number of tasks in the task sets.
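The exact analysis this letter accelerates is the classic fixed-point recurrence R_i = C_i + Σ_{j ∈ hp(i)} ⌈R_i / T_j⌉ · C_j. The sketch below implements the plain iteration (the baseline, not the proposed ILB refinement, whose details are in the letter), assuming implicit deadlines equal to periods.

```python
import math

def response_time(tasks, i):
    """Worst-case response time of task i under fixed-priority scheduling.
    tasks is a list of (C, T) pairs sorted from highest to lowest priority;
    returns None if the response time exceeds the period (infeasible)."""
    c_i, t_i = tasks[i]
    r = c_i
    while True:
        # interference from all higher-priority tasks released within r
        interference = sum(math.ceil(r / t_j) * c_j
                           for c_j, t_j in tasks[:i])
        nxt = c_i + interference
        if nxt > t_i:       # deadline (= period) missed
            return None
        if nxt == r:        # fixed point reached
            return r
        r = nxt
```

For the task set (C, T) = (1, 4), (2, 6), (3, 12), the iteration converges to response times 1, 3 and 10, so every task meets its deadline.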
In the current approach to automotive electronic system design, multicore processors have prevailed as the way to achieve high computing performance at low thermal dissipation. Multicore processors offer functional parallelism that helps in meeting the safety-critical requirements of vehicles. The number of Electronic Control Units (ECUs) in high-end cars can be reduced by consolidating more functions into a multicore ECU. The AUTOSAR stack has been designed to support applications developed for multicore ECUs. The real challenge lies in adopting new design methods while developing sophisticated applications under multicore constraints. It is imperative to make the most of the multicore computational capability to enhance the overall performance of ECUs. In this context, the scheduling of real-time multitasking software components by the operating system is one of the challenging issues to be addressed. Here, the state-of-the-art scheduling algorithm is reviewed and its merits and limitations are identified. A hybrid scheduler is proposed, tested and compared with the state-of-the-art algorithm, and offers better performance in terms of CPU utilization, average response time and deadline miss rate in both normal and high-load conditions.
The growing need for extracting information from large graphs has been pushing the development of parallel graph algorithms. However, the highly irregular structure of real-world graphs limits the performance and energy improvements of graph applications. In this paper, we show that, in most cases, using all the available cores of the multiprocessor is not the best option in terms of the aforementioned non-functional requirements. Based on that, we propose GraphKat, a framework that enables the simultaneous processing of several algorithms/graphs instead of executing them serially (i.e., one after another), increasing efficiency in terms of performance and energy. GraphKat works in two steps: (i) it characterizes the graph applications with a specific number of threads based on their efficiency levels; and (ii) it defines the execution order of all graph applications on the target system. Experimental results on three multicore processors (Intel and AMD) show that GraphKat improves the overall system efficiency in terms of performance (up to 434.26×) and energy saving (up to 245.21×), and reduces the graph applications' execution time (up to 17.70×) and energy consumption (up to 6.64×) compared to the default execution of parallel applications on HPC systems.
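The characterization step, picking a per-application thread count so that the remaining cores can host another graph application concurrently, can be illustrated with a parallel-efficiency cutoff. The profiles dictionary, the threshold value and the `efficient_threads` function are illustrative assumptions, a simplified model of GraphKat's heuristic rather than the paper's exact method.

```python
def efficient_threads(profiles, threshold=0.7):
    """For each application, choose the largest thread count whose
    parallel efficiency (speedup / threads) still meets the threshold,
    leaving the remaining cores free for a co-scheduled application."""
    choice = {}
    for app, speedups in profiles.items():   # speedups: {threads: speedup}
        ok = [t for t, s in speedups.items() if s / t >= threshold]
        choice[app] = max(ok) if ok else 1   # fall back to a single thread
    return choice
```

For instance, an application whose speedup flattens at 2.4× on 4 cores would be capped at 2 threads under a 0.7 efficiency threshold, freeing the other cores for a better-scaling co-runner.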
In the present-day scenario, cloud computing is an attractive subject for IT and non-IT personnel. It is a service-oriented, pay-per-use computational model. Cloud has working models with service-oriented delivery mecha...
详细信息