For distributed networks that will be mass-produced, such as the computer systems in modern vehicles, it is crucial to find cost-efficient hardware. A distributed network in a vehicle consists of several ECUs (electronic control units). In this paper we consider the amount of memory needed for these ECUs: they should contain enough memory to survive several software generations, without incurring the unnecessary cost of too much memory. Our earlier work shows that UML component diagrams can be used to collect enough information to estimate memory size using a functional size measurement method. This paper replicates our earlier experiment with more software components of a different type, and we compare the results from the two experiments.
Service-oriented architecture is emerging as one of the primary research areas in software engineering and one of the key technologies for the integration of enterprise information systems and the development of distributed software systems. Complexity is an important aspect of software quality assessment and must be appropriately addressed in service-oriented architecture. In this paper we introduce the features of service-oriented systems relevant to the analysis of a system's complexity. We analyze the key aspects of measuring service-oriented systems. Building on existing product-complexity metrics for service-oriented infrastructures, we improve the complexity metrics by considering factors that influence the complexity of service-oriented systems. Finally, we propose a set of complexity metrics for service-oriented systems.
In this paper, an interface between MapReduce and different storage systems is proposed. Through this interface, a MapReduce-based computing platform can access various file systems without modifying the existing computing system, which simplifies the construction of distributed applications. Different file systems can be configured and switched quickly through the interface. The interface has also been integrated into Hadoop for our experiments. The results show that the interface allows different storage systems to be switched, while data access efficiency improves with increasing data volume. Further experiments with larger data volumes will be carried out, and deeper development of the interface is under way in the computing platform of our work.
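The idea of decoupling the computation from the file system can be sketched with a small pluggable-backend interface. This is a minimal illustration, not the paper's actual API: the class and method names (`StorageBackend`, `InMemoryFS`, `run_wordcount`) are invented for the example, and a toy in-memory backend stands in for HDFS or a local file system.

```python
from abc import ABC, abstractmethod

class StorageBackend(ABC):
    """Uniform interface a MapReduce-style runtime could program against."""
    @abstractmethod
    def read(self, path: str) -> bytes: ...
    @abstractmethod
    def write(self, path: str, data: bytes) -> None: ...

class InMemoryFS(StorageBackend):
    """Toy backend standing in for HDFS, a local FS, etc."""
    def __init__(self):
        self._files = {}
    def read(self, path):
        return self._files[path]
    def write(self, path, data):
        self._files[path] = data

def run_wordcount(fs: StorageBackend, in_path: str, out_path: str) -> None:
    """A job that only sees the StorageBackend interface, so swapping
    file systems needs no change to the job itself."""
    counts = {}
    for word in fs.read(in_path).decode().split():
        counts[word] = counts.get(word, 0) + 1
    fs.write(out_path, repr(sorted(counts.items())).encode())

fs = InMemoryFS()
fs.write("input.txt", b"a b a")
run_wordcount(fs, "input.txt", "out.txt")
print(fs.read("out.txt").decode())  # [('a', 2), ('b', 1)]
```

Switching storage then amounts to passing a different `StorageBackend` implementation, which is the configuration-driven switch the abstract describes.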
With the wide adoption of multi-core processor based systems, there is a need for benchmarking such systems at both the application and operating system levels. Developing benchmarks for multi-core systems is a cumbersome task due to the underlying parallel architecture and the complexity of parallel programming paradigms. In this paper, we introduce the multi-core processor architecture and communication (MPAC) benchmarking library, which provides a common infrastructure for developing specification-driven micro-benchmarks, application benchmarks, and network traffic load generators. We describe the software architecture of MPAC and demonstrate its efficacy by implementing the specifications of the well-known Stream and Netperf micro-benchmarks. We use these benchmarks to validate MPAC-based performance measurements for a single thread on Intel, AMD, and Cavium multi-core processor based platforms. We also develop a CPU micro-benchmark using our own specifications. In addition, we extend these micro-benchmarks through the MPAC library to measure the scaling characteristics of our target multi-core processor based platforms.
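A specification-driven micro-benchmark of the Stream flavor can be sketched in a few lines: time a memory "copy" kernel over several repetitions and report the best observed bandwidth. This is only a toy stand-in for the Stream specification that MPAC implements, not MPAC itself; the function name and parameters are invented for illustration.

```python
import time

def stream_copy_bandwidth(n: int = 10_000_000, reps: int = 3) -> float:
    """Stream-style 'copy' kernel: time copying n bytes, keep the best
    of several repetitions, and report MB/s (2*n bytes moved: one read
    plus one write per element)."""
    a = bytearray(n)
    best = float("inf")
    for _ in range(reps):
        t0 = time.perf_counter()
        b = bytes(a)                     # the copy under measurement
        best = min(best, time.perf_counter() - t0)
        assert len(b) == n               # keep the copy live
    return (2 * n) / best / 1e6

mb_s = stream_copy_bandwidth()
print(f"copy bandwidth: {mb_s:.0f} MB/s")
```

Repeating the kernel and taking the best run, as here, is the usual way such specifications suppress warm-up and scheduling noise; a scaling study would run one such kernel per thread and sum the rates.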
Node churn can have a severe impact on the performance of P2P applications. In this paper, we consider the design of reliable P2P networks that can provide predictable performance. We exploit the experimental finding that the age of a node can be a reliable predictor of longer residual lifetime to develop mechanisms that organize the network around these more reliable nodes. We propose two protocols, TrebleCast and TrebleCast*, to implement reliable overlay networks. These protocols dynamically create reliable layers of peers by moving nodes with higher expected lifetime to the center of the overlay. These more reliable layers can then be called upon to deliver predictable performance in the presence of churn.
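The core selection step, choosing the oldest peers as the reliable inner layer, can be sketched simply. This is an illustration of the age-as-predictor idea, not the actual TrebleCast or TrebleCast* protocol; the data shapes and function name are invented.

```python
import heapq

def reliable_core(peers: list[tuple[str, float]], k: int) -> list[tuple[str, float]]:
    """Pick the k oldest peers as the overlay's reliable inner layer,
    using node age (seconds in the system) as a proxy for longer
    residual lifetime."""
    return heapq.nlargest(k, peers, key=lambda p: p[1])

peers = [("n1", 30), ("n2", 900), ("n3", 5), ("n4", 400)]
print(reliable_core(peers, 2))  # [('n2', 900), ('n4', 400)]
```

In a running overlay this selection would be re-evaluated as ages grow and nodes churn, gradually migrating long-lived nodes toward the center as the abstract describes.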
High-level synthesis is the process of balancing the distribution of RTL components throughout the execution of applications. However, many balancing and optimization opportunities exist below RTL. In this paper, a coarse-grain reconfigurable RTL component that combines a multiplier and a number of additions is presented and incorporated into high-level synthesis. The gate-level synthesis methodology for this component imposes practically no extra hardware compared with a normal multiplier, while its incorporation into high-level synthesis is handled by a scheduling post-processor. Following this approach, components that would remain idle in certain control steps work full-time in two different modes, without any reconfiguration overhead on the critical path of the application. The results obtained with different DSP benchmarks show a maximum performance gain of almost 70% with a 45% datapath area gain.
The master/worker pattern is widely used to construct cross-domain, large-scale computing infrastructures. The applications supported by this kind of infrastructure usually feature long-running and speculative execution. A fault recovery mechanism is important for them, especially in the wide-area network environment, which consists of error-prone components. Inter-node cooperation is needed to make the recovery process more efficient. The traditional log-based rollback recovery mechanism, which features independent recovery, cannot fulfill the global cooperation requirement because exchanging a large amount of logs wastes bandwidth and slows application data transfer. In this paper, we propose a two-phase log-based recovery mechanism that offers benefits such as space saving and global optimization, and that can be used as a complement to current log-based rollback recovery approaches in some specific situations. We have demonstrated the use of this mechanism in the Drug Discovery Grid environment, which is supported by China National Grid. Experimental results demonstrate the efficiency of this mechanism.
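The principle underlying log-based rollback recovery is deterministic replay: if every message a worker consumed is logged, any replacement worker that replays the same log reaches the same state. A minimal sketch of that invariant, with an invented state model (an integer counter and integer messages), is shown below; the paper's two-phase variant adds inter-node cooperation on top of this basic mechanism.

```python
def apply_msg(state: int, msg: int) -> int:
    """Deterministic state transition: given the same state and message,
    every replica computes the same next state."""
    return state + msg

def recover(initial_state: int, log: list[int]) -> int:
    """Rebuild a failed worker's state by replaying its message log."""
    state = initial_state
    for msg in log:
        state = apply_msg(state, msg)
    return state

# Two independent replays of the same log converge to the same state.
print(recover(0, [1, 2, 3]))  # 6
```

The bandwidth cost the abstract criticizes comes from shipping such logs between nodes; a cooperative scheme can instead decide globally which logs are worth exchanging.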
The overhead caused by virtualization makes it difficult to apply VMs in applications that require high degrees of both performance isolation and efficiency, such as high-performance computing. In this paper, we present a lightweight virtual machine named Solo. It greatly simplifies the design of the VMM by letting most privileged instructions, except I/O operations, bypass the VMM. Solo allows the VM to run directly on hardware at the highest privilege level, thereby greatly reducing the overhead caused by virtualization. Our evaluation shows that Solo not only guarantees VM performance isolation, but also improves VM performance to the level of a traditional OS, and thus meets the requirements of high-performance applications without special hardware support.
This paper explores the challenges in implementing a message passing interface usable on systems with data-parallel processors. As a case study, we design and implement on NVIDIA GPUs the "DCGN" API, which is similar to MPI and allows full access to the underlying architecture. We introduce the notion of data-parallel thread-groups as a way to map resources to MPI ranks. We use a method that also allows the data-parallel processors to run autonomously from user-written CPU code. In order to facilitate communication, we use a sleep-based polling system to store and retrieve messages. Unlike previous systems, our method provides both performance and flexibility. By running a test suite of applications with different communication requirements, we find that a tolerable amount of overhead is incurred, somewhere between one and five percent depending on the application, and we indicate the locations where this overhead accumulates. We conclude that with innovations in chipsets and drivers, this overhead will be mitigated, providing performance similar to typical CPU-based MPI implementations while supporting fully-dynamic communication.
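The sleep-based polling pattern the abstract mentions can be sketched host-side in a few lines: a receiver repeatedly checks a mailbox and sleeps between checks rather than blocking, trading a bounded amount of latency for not tying up a thread. This is a generic illustration of the pattern in Python, not DCGN's actual CPU/GPU implementation; the class name and intervals are invented.

```python
import queue
import threading
import time

class PollingMailbox:
    """Sleep-based polling store: senders deposit messages, and the
    receiver polls at a fixed interval instead of blocking."""
    def __init__(self, poll_interval: float = 0.01):
        self._q = queue.Queue()
        self._poll_interval = poll_interval

    def send(self, msg):
        self._q.put(msg)

    def recv(self, timeout: float = 1.0):
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            try:
                return self._q.get_nowait()      # poll, don't block
            except queue.Empty:
                time.sleep(self._poll_interval)  # sleep between polls
        raise TimeoutError("no message within timeout")

box = PollingMailbox()
threading.Timer(0.05, box.send, args=("rank0->rank1",)).start()
print(box.recv())  # rank0->rank1
```

The poll interval is the knob behind the one-to-five-percent overhead trade-off: shorter intervals lower message latency but burn more CPU checking an empty mailbox.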
A premier goal of resource allocators in virtualization environments is to control the relative resource consumption of the different virtual machines, and moreover, to be able to change the relative allocations at will. However, it is not clear what it means to provide a certain fraction of the machine when multiple resources are involved. We suggest that a promising interpretation is to identify the system bottleneck at each instant, and to enforce the desired allocation on that device. This in turn induces an efficient allocation of the other devices.
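The proposed interpretation, find the instantaneous bottleneck device and enforce the desired shares there, can be sketched as follows. The data shapes (per-VM, per-device demand fractions) and function name are invented for illustration; the paper's allocator works on live measurements, not a static snapshot.

```python
def enforce_on_bottleneck(demands: dict, shares: dict):
    """demands: {vm: {device: fraction of that device demanded}};
    shares: {vm: desired fraction of 'the machine'}.
    Identify the device with the highest aggregate demand (the current
    bottleneck) and return the per-VM allocation to enforce on it."""
    devices = next(iter(demands.values())).keys()
    totals = {d: sum(vm[d] for vm in demands.values()) for d in devices}
    bottleneck = max(totals, key=totals.get)
    return bottleneck, {vm: shares[vm] for vm in demands}

demands = {"vm1": {"cpu": 0.9, "disk": 0.2},
           "vm2": {"cpu": 0.8, "disk": 0.3}}
shares = {"vm1": 0.5, "vm2": 0.5}
print(enforce_on_bottleneck(demands, shares))  # ('cpu', {'vm1': 0.5, 'vm2': 0.5})
```

Once the bottleneck device is scheduled according to the desired shares, the non-saturated devices need no explicit enforcement: the VMs' consumption of them is paced by the bottleneck, which is the induced efficient allocation the abstract refers to.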