Smart phones are starting to find use in mission-critical applications, such as search-and-rescue operations, wherein the mission capabilities are realized by deploying a collaborating set of services across a group of smart phones involved in the mission. Since these missions are deployed in environments where replenishing resources, such as smart phone batteries, is hard, it is necessary to maximize the lifespan of the mission while also maintaining its real-time quality of service (QoS) requirements. To address these requirements, this paper presents a deployment framework called Smart Deploy, which integrates bin packing heuristics with evolutionary algorithms to produce near-optimal, inexpensive-to-compute deployment solutions that maximize the lifespan of smart phone-based mission-critical applications. The paper evaluates the merits of deployments produced by Smart Deploy for a search-and-rescue mission comprising a heterogeneous mix of smart phones by integrating a worst-fit bin packing heuristic with particle swarm optimization and a genetic algorithm. Results of our experiments indicate that missions deployed using Smart Deploy have a lifespan 20% to 162% greater than those deployed using the bin packing heuristic or evolutionary algorithms alone. Although Smart Deploy is slightly slower than the other algorithms, the slower speed is acceptable for offline computation of deployments.
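The worst-fit bin packing heuristic named in the abstract always places the next item into the bin with the most remaining capacity. A minimal sketch of that idea, where phones are "bins" with a capacity budget and services are "items" with a resource demand (the data model here is illustrative, not Smart Deploy's actual representation):

```python
def worst_fit(services, capacities):
    """Assign each service (resource demand) to the phone with the most
    remaining capacity; return a phone index per service, or None when
    no phone can host the service."""
    remaining = list(capacities)
    placement = []
    for demand in services:
        # worst-fit: pick the phone with the largest remaining capacity
        idx = max(range(len(remaining)), key=lambda i: remaining[i])
        if remaining[idx] < demand:
            placement.append(None)  # does not fit anywhere
        else:
            remaining[idx] -= demand
            placement.append(idx)
    return placement

print(worst_fit([30, 20, 40], [50, 70]))  # [1, 0, 1]
```

Spreading load across the emptiest phones in this way tends to balance battery drain, which is why such a heuristic makes a reasonable seed for the evolutionary search described above.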
This paper describes an architecture and FPGA synthesis tool chain for building specialized, energy-saving coprocessors called Irregular Code Energy Reducers (ICERs) for a wide range of unmodified C programs. FPGAs are increasingly used to build large-scale systems, and many large software systems contain relatively little code that is amenable to automatic, semi-automatic, or even manual parallelization. Whereas accelerator approaches have traditionally achieved energy benefits as a side effect of increasing performance via parallel execution, ICERs aim to achieve energy gains even on code with little exploitable parallelism. Traditional approaches to automatically generating accelerators from existing software rely on inferring parallel execution from serial code, so they face the same code analysis challenges as parallelizing compilers. In contrast, because the ICER approach targets energy rather than performance, it easily scales to large, irregular applications that are poor candidates for traditional acceleration. Our results show that, compared to a baseline system with soft processor cores, ICERs can reduce energy consumption by up to 9.5x for the code they target and 2.8x for whole applications.
As cloud services proliferate, it becomes difficult to facilitate service composition and testing in clouds. In traditional service-oriented computing, service composition and testing are carried out independently. This paper proposes a new approach to managing services in the cloud that facilitates both service composition and testing. The approach uses service implementation selection to facilitate service composition, similar to Google's Guice and Spring tools; applies the group testing technique to identify the oracle; and uses the established oracle to perform continuous testing of new services or compositions. The paper extends the existing concept of template-based service composition and focuses on testing the same workflow of service composition. In addition, all of these testing processes can be executed in parallel, and the paper illustrates how to apply a service-level MapReduce technique to accelerate the testing process.
Product derivation is an essential part of the Software Product Line (SPL) development process. The paper proposes a model transformation for automatically deriving a UML model of a specific product from the UML model of a product line. This work is part of a larger project aiming to integrate performance analysis into SPL model-driven development. The SPL source model is expressed in UML extended with two separate profiles: a "product line" profile from the literature for specifying the commonality and variability between products, and the MARTE profile recently standardized by the OMG for performance annotations. The automatic derivation of a concrete product model from a given feature configuration is enabled by the mapping between features in the feature model and their realizations in the design model. The paper proposes an efficient mapping technique that aims to minimize the amount of explicit feature annotation in the UML design model of the SPL. Implicit feature mapping is inferred during product derivation from the relationships between annotated and non-annotated model elements, as defined in the UML metamodel and its well-formedness rules. The transformation is realized in the Atlas Transformation Language (ATL) and illustrated with an e-commerce case study that models structural and behavioural SPL views.
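The derivation step can be pictured as pruning the SPL model down to the elements whose feature annotations appear in the chosen configuration, with non-annotated elements kept or dropped based on the annotated elements they depend on. The real transformation is written in ATL over UML models; the Python sketch below, with hypothetical element names from an e-commerce-style model, only illustrates the implicit-mapping idea:

```python
def derive_product(elements, depends_on, feature_of, config):
    """Keep an element if its feature annotation is in the configuration;
    a non-annotated element survives only while every element it depends
    on survives (a simplified take on implicit feature mapping)."""
    kept = set()
    for e in elements:
        feature = feature_of.get(e)          # None means not annotated
        if feature is None or feature in config:
            kept.add(e)
    # propagate removals to dependent, non-annotated elements
    changed = True
    while changed:
        changed = False
        for e in list(kept):
            if any(dep not in kept for dep in depends_on.get(e, [])):
                kept.discard(e)
                changed = True
    return kept

elements = ["Cart", "Checkout", "GiftWrap", "WrapDialog"]
feature_of = {"GiftWrap": "gift_wrapping"}        # only one explicit annotation
depends_on = {"WrapDialog": ["GiftWrap"], "Checkout": ["Cart"]}
print(sorted(derive_product(elements, depends_on, feature_of, set())))
# ['Cart', 'Checkout']
```

Note how `WrapDialog` is removed without carrying its own annotation, purely because its dependency `GiftWrap` was excluded, which is the point of minimizing explicit annotations.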
Clustering approaches are increasingly used in educational data mining because web-based educational systems generate large volumes of data. To address the premature convergence and computational cost of the traditional K-means algorithm, this paper proposes an improved K-means clustering algorithm based on a cooperative PSO framework. Cooperative PSO has proven effective for large-scale and complex problems via a divide-and-conquer strategy that simulates coevolutionary techniques found in nature. A K-means algorithm built on a cooperative PSO framework is therefore effective for clustering educational data.
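For reference, the plain K-means core whose initialization-sensitive, premature convergence the cooperative PSO framework is meant to overcome looks like this (a minimal 1-D sketch, not the paper's implementation):

```python
import random

def kmeans(points, k, iters=100, seed=0):
    """Plain K-means on 1-D points. Random initialization is exactly the
    weakness a cooperative PSO search would replace with a global search
    over centroid positions."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # assign each point to its nearest centroid
            i = min(range(k), key=lambda c: (p - centroids[c]) ** 2)
            clusters[i].append(p)
        new = [sum(c) / len(c) if c else centroids[i]
               for i, c in enumerate(clusters)]
        if new == centroids:
            break  # converged (possibly to a local optimum)
        centroids = new
    return sorted(centroids)

print([round(c, 3) for c in kmeans([1.0, 1.2, 0.8, 9.0, 9.5, 8.5], 2)])
# [1.0, 9.0]
```

A cooperative PSO variant would split the centroid vector across cooperating swarms and evolve each part against the others, trading extra evaluations for a better chance of escaping such local optima.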
Optimal scheduling of parallel tasks with precedence constraints is critical for achieving high performance in heterogeneous computing systems. Application scheduling is known to be NP-complete in the general case, and the complexity of the problem increases when task scheduling must be done in a heterogeneous environment. This paper presents a recursive task scheduling algorithm for a bounded number of heterogeneous processors on a network of heterogeneous systems. It is a three-phase task scheduling algorithm. The task-prioritizing phase computes the upward rank of each task and assigns priorities to all tasks. The processor selection phase schedules the tasks onto the processors that give the latest start time for each task. The moving phase moves all possible tasks until the start time of the entry task is zero. The performance of the algorithm is illustrated by comparing its schedule length ratio and frequency of best results with those of existing effective scheduling algorithms: Heterogeneous Earliest Finish Time and Iterative List Scheduling.
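The upward rank used in the task-prioritizing phase (as in HEFT) is defined recursively: a task's rank is its execution cost plus the maximum, over its successors, of the communication cost to that successor plus the successor's rank. A small sketch with an illustrative four-task DAG (the graph and costs are made up for the example):

```python
def upward_rank(tasks, succ, cost, comm):
    """rank_u(t) = cost[t] + max over successors s of (comm[t,s] + rank_u(s));
    exit tasks get rank_u = cost[t]."""
    memo = {}
    def rank(t):
        if t not in memo:
            memo[t] = cost[t] + max(
                (comm[(t, s)] + rank(s) for s in succ.get(t, [])),
                default=0)  # exit task: no successors
        return memo[t]
    return {t: rank(t) for t in tasks}

succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
cost = {"A": 2, "B": 3, "C": 4, "D": 1}
comm = {("A", "B"): 1, ("A", "C"): 1, ("B", "D"): 2, ("C", "D"): 1}
ranks = upward_rank(["A", "B", "C", "D"], succ, cost, comm)
print(sorted(ranks, key=ranks.get, reverse=True))  # ['A', 'B', 'C', 'D']
```

Scheduling tasks in decreasing upward-rank order guarantees every task is considered after all of its predecessors, which is what the subsequent processor-selection phase relies on.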
Estimating the worst-case execution time (WCET) of real-time embedded systems is essential for verifying their correct functioning. Traditionally, the WCET of a program is estimated assuming availability of the program's binary, which is disassembled to reconstruct the program, and in some cases its source code, to derive useful high-level execution information. However, in certain scenarios the program's owner requires that the binary not be reverse-engineered to protect intellectual property, and in extreme situations the binary is not available for analysis at all, in which case it is substituted by program-execution traces. In this paper we show that we can obtain WCET estimates for programs from runtime-generated or owner-provided time-stamped execution traces, without access to the source code and without reverse-engineering the binaries. We show that we can provide very accurate WCET estimations using both integer linear programming (ILP) and constraint logic programming (CLP). Our method generates safe and tight WCET estimations for all the benchmarks used in the evaluation.
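To make the trace-based idea concrete, here is a deliberately simplified bound computed directly from time-stamped traces: take each block's worst observed duration and largest observed execution count, then sum their products. This is only a crude stand-in for the paper's ILP/CLP formulation (which would encode flow constraints between blocks), and the trace format is assumed:

```python
def wcet_bound(traces):
    """Conservative WCET bound from time-stamped execution traces.
    Each trace is a list of (block_id, start_time, end_time) entries.
    Bound = sum over blocks of (max observed count) * (worst duration)."""
    worst_dur, max_count = {}, {}
    for trace in traces:
        counts = {}
        for block, start, end in trace:
            worst_dur[block] = max(worst_dur.get(block, 0), end - start)
            counts[block] = counts.get(block, 0) + 1
        for block, n in counts.items():
            max_count[block] = max(max_count.get(block, 0), n)
    return sum(max_count[b] * worst_dur[b] for b in worst_dur)

traces = [
    [("entry", 0, 2), ("loop", 2, 5), ("loop", 5, 7), ("exit", 7, 8)],
    [("entry", 0, 3), ("loop", 3, 5), ("exit", 5, 6)],
]
print(wcet_bound(traces))  # 1*3 + 2*3 + 1*1 = 10
```

An ILP-based formulation would instead maximize the total time subject to structural constraints on block execution counts, yielding tighter bounds than this per-block worst case; the point here is just that everything needed comes from the timestamps, not the binary.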
For real-time applications that consist of a massive number of rules, partitioning the rules to support parallel processing is important. This paper proposes a suite of algorithms called GAPCM for parallel processing of massive rule sets. Considering even distribution, minimal waiting time, and minimal inter-processor communication, we propose three algorithms for subnet allocation and apply them to association rule mining.
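Of the three criteria, even distribution on its own can be approximated by a classic greedy balancing scheme: assign each rule, heaviest first, to the currently least-loaded processor. This sketch is illustrative only and is not GAPCM itself, which also weighs waiting time and inter-processor communication:

```python
import heapq

def partition_rules(rule_costs, n_procs):
    """Greedy longest-processing-time partitioning: rules (heaviest first)
    go to the least-loaded processor, approximating even distribution."""
    heap = [(0, p) for p in range(n_procs)]  # (current load, processor id)
    heapq.heapify(heap)
    assignment = {}
    for rule, c in sorted(rule_costs.items(), key=lambda kv: -kv[1]):
        load, p = heapq.heappop(heap)   # least-loaded processor
        assignment[rule] = p
        heapq.heappush(heap, (load + c, p))
    return assignment

rules = {"r1": 5, "r2": 4, "r3": 3, "r4": 3, "r5": 1}
print(partition_rules(rules, 2))  # processor loads end up even (8 and 8)
```

Adding the communication criterion would change the objective: rules that share a subnet would be pulled toward the same processor even at some cost to balance.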
Lately, the use of GPUs has become dominant in high performance computing systems for computer graphics. However, since there is no "good for everything" solution, GPUs also have drawbacks that make them not the best choice in certain scenarios: a poor performance-per-watt ratio, the difficulty of rewriting code to exploit parallelism, and synchronization issues between computing cores, for example. In this work, we present the R-GRID approach, based on the grid computing paradigm, with the purpose of integrating heterogeneous reconfigurable devices under the umbrella of the distributed object paradigm. R-GRID aims to offer developers without hardware experience an easy way to build image processing applications using a component model. Deployment, communication, resource sharing, data access, and replication of the processing cores are handled in an automatic and transparent manner, so coarse-grained parallelism can be exploited effortlessly in R-GRID, accelerating image processing operations.
Cluster computing is a distributed parallel computing system. As a small cloud, a cluster is mainly used for high-availability, high-reliability, or high-performance computing. Kusu, the first open-source software of its kind developed by Platform Computing, is a tool for deploying computing clusters. Kusu has the advantages of operating and managing the cluster easily, deploying cluster nodes rapidly, and supporting multiple Linux operating systems. This paper describes the Kusu cluster concepts, benefits, and structure, and explains how to deploy nodes in detail.