Cloud computing, the fastest-growing IT technology, offers flexibility and scalability with pay-per-use models, but it raises concerns about data security due to third-party storage and online accessibility. Effective...
ISBN: 9798400708893 (print)
The primary targets for improving the efficiency of large-scale matrix factorization are reducing synchronization, overlapping communication with computation, and improving load balance. In recent years, tiled algorithms with task parallelism on multicore shared-memory systems have become well established as efficient methods for conducting fine-grained computations on smaller tiles. Moreover, they give the runtime system a flexible execution order in many situations. However, traditional hybrid MPI and OpenMP programs for distributed-memory systems use a fork-join model for the threads in each process, so thread-parallel computing tasks alternate with sequential communication tasks. In this paper, we incorporate task parallelism and low-rank approximation into a hybrid task-based Cholesky factorization in a distributed environment and propose some low-rank variants. We evaluate the performance of our programs on both full-rank and low-rank inputs and report the pros and cons of the proposed programs.
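As a rough illustration of the tiled, task-based formulation the abstract refers to, the sketch below enumerates the tile kernels of a standard right-looking tiled Cholesky factorization. It is not the authors' program: the function name `tiled_cholesky_tasks` and the tuple layout are assumptions made for this example, and the low-rank variants and MPI communication are not modeled.

```python
def tiled_cholesky_tasks(nt):
    """List the tile kernels of a right-looking tiled Cholesky on an nt x nt tile grid.

    Each task is (kernel, output_tile, input_tiles). Hypothetical layout for illustration.
    """
    tasks = []
    for k in range(nt):
        # Factor the diagonal tile.
        tasks.append(("POTRF", (k, k), []))
        # Triangular solves for the tiles below the diagonal in column k.
        for i in range(k + 1, nt):
            tasks.append(("TRSM", (i, k), [(k, k)]))
        # Update the trailing submatrix (lower triangle only).
        for i in range(k + 1, nt):
            tasks.append(("SYRK", (i, i), [(i, k)]))
            for j in range(k + 1, i):
                tasks.append(("GEMM", (i, j), [(i, k), (j, k)]))
    return tasks


if __name__ == "__main__":
    for task in tiled_cholesky_tasks(3):
        print(task)
```

A task runtime can derive dependencies from the tiles each kernel reads and writes, which is what allows independent kernels from different steps to run out of loop order.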
ISBN: 9798350365627; 9798350365610 (print)
Task-based execution frameworks, such as parallel programming libraries, computational workflow systems, and function-as-a-service platforms, enable the composition of distinct tasks into a single, unified application designed to achieve a computational goal and abstract the parallel and distributed execution of those tasks on arbitrary hardware. Research into these task executors has accelerated as computational sciences increasingly need to take advantage of parallel compute and/or heterogeneous hardware. However, the lack of evaluation standards makes it challenging to compare and contrast novel systems against existing implementations. Here, we introduce TAPS, the Task Performance Suite, to support continued research in distributed task executor frameworks. TAPS provides (1) a unified, modular interface for writing and evaluating applications using arbitrary execution frameworks and data management systems and (2) an initial set of reference synthetic and real-world science applications. We discuss how the design of TAPS supports the reliable evaluation of frameworks and demonstrate TAPS through a survey of benchmarks using the provided reference applications.
ISBN: 9798350352917 (digital)
ISBN: 9798350352924; 9798350352917 (print)
Remote Memory Access (RMA) enables direct access to remote memory to achieve high performance for HPC applications. However, most modern parallel programming models lack schemes for the remote process to detect the completion of RMA operations. Many previous works have proposed programming models and extensions to notify the communication peer, but they did not solve the problems of multi-NIC aggregation, portability, hardware-software co-design, and usability. In this work, we propose a Unified Notifiable RMA (UNR) library for HPC to address these challenges. In addition, we demonstrate the best practice of utilizing UNR within a real-world scientific application, PowerLLEL. We deployed UNR across four HPC systems, each with a different interconnect. The results show that PowerLLEL powered by UNR achieves up to a 36% acceleration on 1728 nodes of the Tianhe-Xingyi supercomputing system.
We explore the landscape of distributed machine learning, focusing on advancements, challenges, and potential future directions in this rapidly evolving field. We delve into the motivation for distributed machine lear...
ISBN: 9798350386783; 9798350386776 (print)
Given the contemporary complexities of digital forensics, distributed computing has become imperative for effective password cracking. Among open-source password-cracking software, hashcat stands out for its exceptional speed and its extensive repertoire of supported hash formats. Traditionally, hashcat supports distributed hash cracking via overlays. In this paper, we show how to parallelize hashcat in a straightforward way by introducing the Message Passing Interface (MPI), and we provide a working solution for performing different cracking attacks. Experimental results of multiple cracking tasks demonstrate that the proposed approach is effective.
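The abstract does not say how work is divided among MPI processes. One common approach, shown here only as an assumption, is a static split of the candidate keyspace, with each rank passing its own offset and count to hashcat's `--skip` and `--limit` options. The helper name `partition_keyspace` is hypothetical, and the MPI calls themselves are left out.

```python
def partition_keyspace(keyspace, num_ranks):
    """Return one (skip, limit) pair per rank, covering [0, keyspace) without overlap.

    Hypothetical helper: 'skip' is the starting offset and 'limit' the number of
    candidates, matching the meaning of hashcat's --skip and --limit options.
    """
    base, extra = divmod(keyspace, num_ranks)
    chunks = []
    skip = 0
    for rank in range(num_ranks):
        # The first 'extra' ranks take one additional candidate each.
        limit = base + (1 if rank < extra else 0)
        chunks.append((skip, limit))
        skip += limit
    return chunks


if __name__ == "__main__":
    print(partition_keyspace(10, 3))
```

In an MPI program, each rank would take the entry at its own rank index. A static split like this ignores differences in device speed, so a real system may prefer smaller chunks handed out on demand.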
ISBN: 9798400717932 (print)
The geographically distributed edge servers can naturally draw power from nearby renewable energy (RE) generators. Complemented by the dynamic scheduling of energy storage batteries, edge service providers (ESPs) can thus build low- or even zero-carbon edge computing systems. Nevertheless, the distributed and heterogeneous nature of edge computing systems, as well as the limited information sharing among ESPs, leads to a more complex battery planning problem than that in cloud computing. The unpredictability of RE resources further complicates the problem, making conventional model-based approaches ineffective. To this end, we propose a multi-agent deep reinforcement learning (MADRL) approach for the independent decision making of individual ESPs. Particularly, MADRL takes privacy into account by ensuring that no sensitive information is disclosed among ESPs. For better model training, we further customize the invalid action masking and develop action transformation techniques based on segmented linear optimization. Extensive experiments demonstrate that, with our proposed approach, the overall carbon emission of edge computing systems can be significantly reduced (by over 60%) while maintaining acceptable operation costs in battery scheduling.
ISBN: 9783031814037; 9783031814044 (print)
Blockchain technology is characterized by its distributed, decentralized, and immutable ledger system, which serves as a fundamental platform for managing smart contract transactions (SCTs). However, these SCTs undergo sequential validation within a block, which introduces performance bottlenecks in the blockchain. In response, this paper introduces a framework called the Multi-Bin Parallel Scheduler (MBPS), designed to parallelize blockchain smart contract transactions and leverage the capabilities of multicore systems. Our proposed framework facilitates concurrent execution of SCTs, enhancing performance by allowing non-conflicting transactions to be processed simultaneously while preserving deterministic order. The framework comprises three stages: conflict detection, bin creation, and execution. We evaluated our MBPS framework in Hyperledger Sawtooth v1.2.6, revealing substantial performance enhancements compared to existing parallel SCT execution frameworks across various smart contract applications. This research contributes to the ongoing optimization efforts in blockchain technology, demonstrating its potential for scalability and efficiency in real-world scenarios.
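The abstract names conflict detection, bin creation, and execution but gives no algorithmic detail. The sketch below shows one generic way to bin transactions by read/write-set conflicts so that each bin can run in parallel and the bins run in sequence. It is an assumption-laden illustration, not the MBPS algorithm; the `reads`/`writes` dictionary layout and both function names are hypothetical.

```python
def conflicts(a, b):
    """Two transactions conflict if one writes a key the other reads or writes."""
    return bool(a["writes"] & (b["reads"] | b["writes"]) or b["writes"] & a["reads"])


def build_bins(txs):
    """Assign each transaction to the bin right after the latest earlier conflict.

    Transactions inside one bin are mutually non-conflicting, and every pair of
    conflicting transactions keeps its original block order across bins.
    """
    bin_of = []  # bin index chosen for each transaction, in block order
    bins = []    # bins[level] holds the indices of transactions in that bin
    for idx, tx in enumerate(txs):
        level = 0
        for prev in range(idx):
            if conflicts(txs[prev], tx):
                level = max(level, bin_of[prev] + 1)
        bin_of.append(level)
        if level == len(bins):
            bins.append([])
        bins[level].append(idx)
    return bins


if __name__ == "__main__":
    block = [
        {"reads": {"a"}, "writes": {"a"}},
        {"reads": {"b"}, "writes": {"b"}},
        {"reads": {"a"}, "writes": {"c"}},
        {"reads": {"c"}, "writes": {"b"}},
    ]
    print(build_bins(block))
```

Executing the bins one after another, with the transactions of each bin dispatched to a thread or process pool, would yield the same final state as sequential execution under these assumptions.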
ISBN: 9781538674628 (print)
Distributed computing enables Internet of Vehicles (IoV) services by collaboratively utilizing the computing resources of the network edge and the vehicles. However, computing interruptions caused by frequent edge network handoffs and a severe shortage of computing resources are two problems in providing IoV services. High altitude platform station (HAPS) computing can be a promising addition to existing distributed computing frameworks due to its wide coverage and strong computational capabilities. In this regard, this paper proposes an adaptive scheme in a new distributed computing framework that involves HAPS computing to deal with these two problems of the IoV. Based on the diverse demands of vehicles, network dynamics, and the time-sensitivity of handoffs, the proposed scheme flexibly divides each task into three parts and assigns them to the vehicle, roadside units (RSUs), and a HAPS to perform synchronous computing. The proposed scheme also constrains the computing of tasks at RSUs such that they are completed before handoffs, avoiding the risk of computing interruptions. We formulate a delay minimization problem that considers the task-splitting ratio, transmit power, bandwidth allocation, and computing resource allocation. To solve the problem, variable replacement and successive convex approximation-based methods are proposed. The simulation results show that this scheme not only flexibly avoids the negative effects caused by handoffs but also improves the delay performance and maintains delay stability.
Serverless computing enables a new way of building and scaling cloud applications by allowing developers to write fine-grained functions. The execution duration of a cloud function is typically short, usually ranging ...