ISBN: 9780769530994 (print)
Parallel I/O is one of the key technologies for high-performance computers. First, this paper introduces the I/O systems of typical machines on the newest Top500 list. Second, a new distributed shared parallel I/O system for high-performance computers (DSPIO) is put forward, and some key technologies implemented in the system will be discussed. Finally, a prototype system is built. The experimental results show that this architecture offers high I/O bandwidth and good scalability, and is well suited to high-performance computing.
ISBN: 9781665414852 (print)
With the rapid rise in communication delay and download speeds for wireless communication devices on Fifth Generation (5G) wireless networks, mobile edge servers may use the high-speed local link to reduce the demand on their core networks by caching or prefetching content. Information-Centric Networking (ICN) with Mobile Edge Computing (MEC) aims to upgrade the system to overcome the lack of network capacity. The aim is to provide a network service that is better suited to today's needs (particularly resource allocation and mobility) and resilient to interruptions and failures. Cache resource allocation under mobility is one of the most problematic and practically significant issues in both Information-Centric Networking and Mobile Edge Computing. Although several studies have addressed its optimization, little consideration has been devoted to minimizing user energy consumption to improve caching efficiency and service quality. To address this deficiency, this study focuses on optimized cache resource allocation for nearby users and connected networks, with minimum time delay and energy consumption, while taking mobility into account. To this end, the problem of caching resource allocation (CRA) is divided into two subproblems: the problem of Base Station caching ability (BSCC) and the challenge of Request MAtching (RMAT). BSCC and RMAT are then solved one at a time, and their solutions are merged into a CRA solution.
ISBN: 9781665497473 (print)
HPC systems are becoming more complicated because the performance improvement of processors has slowed down. One promising approach is the Domain-Specific Language (DSL), which provides a productive environment for creating highly efficient programs without pain. However, existing DSL platforms themselves often lack portability and cost DSL developers great effort. To solve this issue, we propose an Aspect-Oriented Programming (AOP) based DSL construction platform, enabling developers to build a DSL platform by combining Aspect modules. Aspect modules manage abstracted application flow, data structure, and memory access on our platform. Therefore, developers can create any DSL platform whose target application has the attributes that the abstraction assumes, which are attributes HPC applications usually have. This study implemented a prototype platform that can handle the MPI and OpenMP layers. The prototype supports three types of applications (Structured-Grid, Unstructured-Grid, and Particle Simulation). We then evaluated the overheads incurred in achieving the platform's flexibility and productivity.
ISBN: 0769526403 (print)
Flooding and random walk are two basic mechanisms for blind search in unstructured peer-to-peer overlays. Although these mechanisms have been widely studied experimentally and via simulations, they have not been analytically modeled. Time overhead, message overhead, and success rate are often used as metrics for search schemes. This paper shows that node coverage is an important metric for estimating performance metrics such as the message efficiency, success rate, and object recall of a blind search. The paper then presents two simple models to analyze node coverage in random graph overlays. These models are useful for setting query parameters, evaluating search efficiency, and estimating object replication on a statistical basis.
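As a rough illustration of what node coverage means for the two blind-search mechanisms, the sketch below measures it empirically on a random graph overlay. It is not the paper's analytical model: the function names, the G(n, p) construction, and the simplification that a flooding node forwards to every neighbor (including the one it heard the query from) are all assumptions made for this example.

```python
import random
from collections import deque


def random_graph(n, p, seed=0):
    """Build an Erdos-Renyi G(n, p) overlay as an adjacency dict (assumed topology)."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def flood_coverage(adj, source, ttl):
    """Return (distinct nodes reached, messages sent) for TTL-limited flooding."""
    seen = {source}
    frontier = deque([(source, 0)])
    messages = 0
    while frontier:
        node, depth = frontier.popleft()
        if depth == ttl:
            continue  # TTL exhausted: this node does not forward further
        for neighbor in adj[node]:
            messages += 1  # simplification: every forward costs one message
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return len(seen), messages


def walk_coverage(adj, source, steps, seed=0):
    """Return (distinct nodes visited, messages sent) for a single random walker."""
    rng = random.Random(seed)
    seen = {source}
    node = source
    used = 0
    for _ in range(steps):
        neighbors = sorted(adj[node])
        if not neighbors:
            break  # isolated node: the walk cannot continue
        node = rng.choice(neighbors)
        seen.add(node)
        used += 1  # one message per hop
    return len(seen), used


if __name__ == "__main__":
    overlay = random_graph(n=200, p=0.03, seed=1)
    print("flooding, TTL=3:", flood_coverage(overlay, 0, 3))
    print("random walk, 100 steps:", walk_coverage(overlay, 0, 100))
```

Dividing coverage by messages gives a crude message-efficiency figure for each mechanism, which is the kind of quantity the abstract says node coverage helps estimate.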
ISBN: 0769526950 (print)
DNA sequence assembly is a fundamental part of biological computing. However, most large-scale sequence assemblies require intensive computing power and huge storage. To speed up the assembly process, we propose a method for large-scale DNA sequence assembly using a computing grid. The central idea of our method is to first cluster the input fragment set into many non-intersecting subsets using k-mers and then to distribute them to all nodes of the grid-computing system. Our method achieves an accuracy of more than 92% on the test data sets under the simulated grid-computing system while requiring less time and less storage. Our method can efficiently process large-scale DNA sequence assembly by taking advantage of the huge storage and computing capacity of a computing grid.
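The cluster-then-distribute idea can be sketched as follows. The abstract does not say how k-mers drive the clustering, so this example assumes the simplest rule: two fragments belong to the same subset if they share at least one k-mer, resolved transitively with a union-find structure. The greedy node assignment, the function names, and the omission of reverse-complement handling are likewise assumptions for illustration only.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def cluster_fragments(fragments, k):
    """Group fragment indices into disjoint subsets linked by shared k-mers."""
    parent = list(range(len(fragments)))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_owner = {}  # k-mer -> index of the first fragment that contained it
    for idx, frag in enumerate(fragments):
        for km in kmers(frag, k):
            if km in first_owner:
                root_a, root_b = find(idx), find(first_owner[km])
                if root_a != root_b:
                    parent[root_a] = root_b  # merge the two subsets
            else:
                first_owner[km] = idx

    clusters = defaultdict(list)
    for idx in range(len(fragments)):
        clusters[find(idx)].append(idx)
    return sorted(clusters.values())


def assign_to_nodes(clusters, num_nodes):
    """Greedy balancing (assumed policy): largest subset goes to the least-loaded node."""
    loads = [0] * num_nodes
    plan = [[] for _ in range(num_nodes)]
    for cluster in sorted(clusters, key=len, reverse=True):
        target = loads.index(min(loads))
        plan[target].append(cluster)
        loads[target] += len(cluster)
    return plan


if __name__ == "__main__":
    reads = ["ACGTAC", "GTACGG", "TTTTTT", "CGGAAA"]
    subsets = cluster_fragments(reads, k=3)
    print(subsets)
    print(assign_to_nodes(subsets, num_nodes=2))
```

Because the subsets share no k-mers, each grid node can assemble its subsets without consulting the others, which is what makes the distribution step straightforward.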
Mobile devices have limited resources, including short battery life, storage capacity, and processor performance. This limits the applications that can run on them. Mobile applications can be partitioned so that some of t...
ISBN: 3540240136 (print)
An effective approach to accelerating applications is to execute them in parallel. The values of program variables exhibit value locality. Data value reuse can enhance application performance by eliminating repeated calculations. We propose using data value reuse and speculative parallelism in software to execute existing sequential applications in parallel. This study profiles the value locality that exists in the method arguments of benchmark programs, and evaluates the performance improvements obtained by applying data value reuse and speculative parallelism.
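Data value reuse on method arguments resembles memoization: when a method is called again with argument values it has already seen, the stored result is returned instead of being recomputed. The sketch below is a minimal illustration under that reading, with a counter that profiles how often arguments repeat. The decorator name `reuse_values` and the sample function `poly` are hypothetical, speculative parallelism is not modeled, and the technique is only valid for side-effect-free functions with hashable arguments.

```python
from functools import wraps


def reuse_values(func):
    """Cache results by argument values and count how often a result is reused."""
    table = {}  # argument tuple -> previously computed result
    stats = {"calls": 0, "reused": 0}

    @wraps(func)
    def wrapper(*args):
        stats["calls"] += 1
        if args in table:
            stats["reused"] += 1  # value locality hit: skip the computation
            return table[args]
        result = func(*args)
        table[args] = result
        return result

    wrapper.stats = stats
    return wrapper


@reuse_values
def poly(x, y):
    """Stand-in for an expensive, side-effect-free method."""
    return x * x + 3 * y


results = [poly(x, y) for x, y in [(1, 2), (1, 2), (4, 0), (1, 2)]]
reuse_rate = poly.stats["reused"] / poly.stats["calls"]

if __name__ == "__main__":
    print(results, reuse_rate)
```

The reuse rate is a simple proxy for the value locality of a method's arguments: the higher it is, the more calls can be skipped.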
ISBN: 9781538637906 (print)
Nowadays, in-memory data analytic platforms, such as Spark, are widely adopted in big data processing. Proper memory capacity configuration has proven to be an effective way to guarantee workload performance on such platforms. Currently, Spark configures the memory capacity for workloads statically, based on user specifications. However, lacking deep knowledge of the target platform and workload characteristics, non-expert users often configure excessive memory capacity to be safe, which reduces memory utilization significantly. On the other hand, as memory requirements differ widely among workloads, there is no one-size-fits-all solution for memory capacity configuration. To address these issues, we propose WSMC, a workload-specific memory capacity configuration approach for Spark workloads, which guides users on memory capacity configuration with an accurate prediction of the workload's memory requirement under various input data sizes and parameter settings. First, WSMC classifies in-memory computing workloads into four categories according to the workloads' Data Expansion Ratio. Second, WSMC establishes a memory requirement prediction model that takes into account the input data size, the shuffle data size, the parallelism of the workloads, and the data block size. For an ad-hoc workload, WSMC can profile its Data Expansion Ratio with small-sized input data and decide which category the workload falls into. Users can then determine an accurate configuration in accordance with the corresponding memory requirement prediction. Through comprehensive evaluations with SparkBench workloads, we found that, compared with the default configuration, configuration with the guide of WSMC can save over 40% memory capacity with only a slight degradation in workload performance (5%), and compared to the proper configuration found manually, the configuration with the guide
We define a method to automatically synthesize efficient distributed implementations from high-level global choreographies. A global choreography describes the execution and communication logic between a set of provided processes, which are described by their interfaces. At the choreography level, the operations include multiparty communications, choice, loop, and branching. A choreography is master triggered: it has one master to trigger its execution. This allows us to automatically generate conflict-free distributed implementations without controllers. The behavior of the synthesized implementations follows the behavior of the choreographies. In addition, the absence of controllers ensures the efficiency of the implementation and reduces the communication needed at runtime. Moreover, we define a translation of the distributed implementations to equivalent Promela versions. The translation allows the distributed system to be verified against behavioral properties. We implemented a Java prototype to validate the approach and applied it to automatically synthesize micro-service architectures. We also illustrate our method on the automatic synthesis of a verified distributed buying system. (C) 2020 Elsevier Inc. All rights reserved.