Skeletal parallelism offers a good trade-off between programming productivity and execution efficiency. In this style of parallelism, an application is a composition of algorithmic skeletons. An algorithmic skeleton c...
ISBN: 9780889867741 (print)
Program analysis supporting software development is often part of edit-compile cycles, and precise program analysis is time consuming. With the availability of parallel processing power on desktop computers, parallelization is a way to speed up program analysis. This paper introduces a parallelization schema for program analysis that can be translated to parallel machines using standard scheduling techniques. First benchmarks analyzing a number of Java programs indicate that the schema scales well for up to 8 processors, but not very well for 128 processors. These results are a first step towards more precise program analysis in Integrated Development Environments utilizing the computational power of today's custom computers.
We propose a new approach, called cluster-based search (CBS), for scheduling large task graphs in parallel on a heterogeneous cluster of workstations connected by a high-speed network (e.g., using an ATM switch at OC-3 speed). The CBS algorithm uses a parallel random neighborhood search, which refines multiple different initial schedules simultaneously on different workstations. The workstations communicate periodically to exchange the best solutions found so far, in order to direct the search to more promising regions of the search space. Heterogeneity of machines is exploited by biased partitioning of the search space. The parallel random neighborhood search is fault-tolerant in that the workload of a failed workstation is automatically redistributed to other workstations so that the search can continue. We have implemented the CBS algorithm as a core function of our ongoing development of SSI middleware for a Sun workstation cluster.
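The search loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' CBS implementation: the task graph, machine speeds, the "move one task" neighborhood, and all function names are hypothetical. The workstations are simulated sequentially in one process, communication costs are ignored, and the biased partitioning and fault-tolerance mechanisms are left out. It only shows several initial schedules being refined by random neighborhood moves, with a periodic exchange of the best schedule found so far.

```python
import random

# Hypothetical toy instance: task -> (work units, predecessor tasks).
TASKS = {
    "a": (4, []),
    "b": (3, ["a"]),
    "c": (2, ["a"]),
    "d": (5, ["b", "c"]),
    "e": (1, ["c"]),
}
ORDER = ["a", "b", "c", "d", "e"]  # a fixed topological order of the graph
SPEEDS = [1.0, 2.0]                # assumed heterogeneous machine speeds


def makespan(assign):
    """Finish time when tasks run in ORDER on their assigned machines."""
    machine_free = [0.0] * len(SPEEDS)
    finish = {}
    for task in ORDER:
        work, preds = TASKS[task]
        m = assign[task]
        ready = max((finish[p] for p in preds), default=0.0)
        start = max(ready, machine_free[m])
        finish[task] = start + work / SPEEDS[m]
        machine_free[m] = finish[task]
    return max(finish.values())


def random_neighbor(assign, rng):
    """Neighborhood move: reassign one random task to a random machine."""
    new = dict(assign)
    new[rng.choice(ORDER)] = rng.randrange(len(SPEEDS))
    return new


def cbs_sketch(workers=4, rounds=5, steps_per_round=50, seed=0):
    rng = random.Random(seed)
    # Each simulated workstation starts from its own random schedule.
    current = [
        {t: rng.randrange(len(SPEEDS)) for t in ORDER} for _ in range(workers)
    ]
    best = current[0]
    for _ in range(rounds):
        for w in range(workers):
            for _ in range(steps_per_round):
                cand = random_neighbor(current[w], rng)
                # Accept moves that do not worsen the makespan.
                if makespan(cand) <= makespan(current[w]):
                    current[w] = cand
        # Periodic exchange: everyone continues from the best schedule so far.
        best = min(current, key=makespan)
        current = [dict(best) for _ in range(workers)]
    return best, makespan(best)


if __name__ == "__main__":
    schedule, length = cbs_sketch()
    print(schedule, length)
```

In this sketch every simulated workstation restarts from the single best schedule after each exchange. That is the simplest possible exchange policy and only one of several plausible readings of "exchange their best solutions"; the abstract does not specify the policy.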
Fault tolerance with active replication and load balancing are two complementary techniques. Their marriage not only enhances a distributed system's robustness, but also improves its efficiency. This paper analyzes the pros and cons of both techniques and presents a novel load-balancing framework for fault-tolerant systems with active replication. The hierarchical architecture is described in detail. Further, three potential task scheduler group selection methods are presented, and their advantages and disadvantages are compared.
The design, implementation and experimental study of an efficient VSM-based parallel text retrieval method oriented to PC-Cluster environments are presented in this paper. Our approach is based on direct VSM-based ind...
ISBN: 9781605585871 (print)
As the computing capability of high-performance computers is improved by increasing the number of computing elements, how to utilize the available computing resources becomes an important issue. Different strategies for solving a problem on a multi-processing system can yield markedly different performance. In this paper, we propose a method to predict the performance of parallel applications. The method describes the parallel features of multi-processing systems hierarchically and evaluates solutions based on that description. In this way, programmers can find the better solution for an application before actual programming.
Although deep learning takes researchers out of complicated feature engineering, designing practical deep learning models is still a complicated process. Neural Architecture Search (NAS) is famous for automating the d...
ISBN: 9780889867048 (print)
Large-scale graph problems are becoming increasingly important in science and engineering. The irregular, sparse instances are especially challenging to solve on cache-based architectures, as they are known to incur erratic memory access patterns. Yet many of the algorithms also exhibit some degree of regularity in their memory accesses. It is important to characterize the locality behavior in order to bridge the gap between algorithm and architecture. In our study we quantify the locality of several fundamental graph algorithms, both sequential and parallel, and correlate our observations with the algorithmic design. Our study of locality behavior brings insight into the impact of different cache architectures on the performance of both sequential and parallel graph algorithms.
ISBN: 9780889867048 (print)
The severe energy constraints of wireless sensor networks (WSNs) require energy-efficient communication protocols in order to fulfill the objectives of the application. Cross-layer design is a technique that can potentially improve the overall performance of WSNs by jointly optimizing and exploiting the interactions between the various layers of the network protocol stack. In this paper, we propose a cross-layer framework design for the Embedded Middleware in Mobility Applications (EMMA) project. This optimization-agent-based framework design provides efficient data exchange between the various protocol layers via a state repository to improve the performance of WSN applications in terms of memory consumption and processing overhead.