Parallax, a new operating system, implements scalable, distributed, and parallel computing to take advantage of the new generation of 64-bit multi-core processors. Parallax uses the Distributed Intelligent Managed Element (DIME) network architecture, which incorporates a signaling network overlay and allows parallelism in resource configuration, monitoring, analysis, and on-the-fly reconfiguration based on workload variations, business priorities, and the latency constraints of the distributed software components. A workflow is implemented as a set of tasks organized in a directed acyclic graph (DAG) and executed by a managed network of DIMEs. These tasks, depending on user requirements, are programmed and executed as loadable modules in each DIME. Parallax is implemented in assembly language at the lowest level for efficiency and provides a C/C++ programming API for higher-level programming.
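To make the DAG-of-tasks execution model concrete, the sketch below runs a small workflow in topological order. It is a minimal single-process illustration, not the Parallax/DIME API: the Task struct, the run_dag scheduler, and the three-task workflow are all assumptions made for the example.

```cpp
#include <cstdio>
#include <functional>
#include <queue>
#include <vector>

// Minimal sketch of a DAG-of-tasks workflow: each task runs once all of
// its predecessors have completed (Kahn's topological order).
struct Task {
    std::function<void()> body;
    std::vector<int> successors; // edges of the DAG
    int pending = 0;             // unmet dependencies
};

void run_dag(std::vector<Task>& tasks) {
    std::queue<int> ready;
    for (int i = 0; i < (int)tasks.size(); ++i)
        if (tasks[i].pending == 0) ready.push(i);
    while (!ready.empty()) {
        int t = ready.front(); ready.pop();
        tasks[t].body(); // in Parallax this would be a loadable module on a DIME
        for (int s : tasks[t].successors)
            if (--tasks[s].pending == 0) ready.push(s);
    }
}

int main() {
    // A three-task workflow: A -> B, A -> C
    std::vector<Task> tasks(3);
    tasks[0] = {[] { std::puts("A"); }, {1, 2}, 0};
    tasks[1] = {[] { std::puts("B"); }, {}, 1};
    tasks[2] = {[] { std::puts("C"); }, {}, 1};
    run_dag(tasks);
}
```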
Superior and fast semantic comparison improves the quality of web search. Semantic comparison involves dot-product computation over large sparse tensors, which is time-consuming and expensive. In this paper we present a low-power parallel architecture that consumes only 15.41 Watts and demonstrates a speed-up on the order of 10^5 compared to a contemporary hardware design, and on the order of 10^4 compared to a purely software approach. Such a high-performance, low-power architecture can be used in semantic routers to elegantly implement energy-efficient distributed search engines.
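The dot-product kernel at the heart of such semantic comparison is easy to sketch in software. The coordinate-list representation and two-pointer merge below are a generic illustration of why sparse formats pay off on mostly-zero tensors, not the paper's hardware design.

```cpp
#include <cstdio>
#include <utility>
#include <vector>

// Sketch: dot product of two sparse vectors stored as sorted (index, value)
// pairs. Only coordinates present in both operands contribute, so the cost
// scales with the number of non-zeros rather than the full dimension.
using Sparse = std::vector<std::pair<long, double>>;

double sparse_dot(const Sparse& a, const Sparse& b) {
    double sum = 0.0;
    size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i].first < b[j].first)      ++i;
        else if (a[i].first > b[j].first) ++j;
        else sum += a[i++].second * b[j++].second;
    }
    return sum;
}

int main() {
    Sparse a = {{2, 1.5}, {7, 2.0}, {40, 0.5}};
    Sparse b = {{7, 4.0}, {40, 2.0}, {90, 3.0}};
    std::printf("%g\n", sparse_dot(a, b)); // 2.0*4.0 + 0.5*2.0 = 9
}
```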
Formal verification is becoming a fundamental step of safety-critical and model-based software development. As part of the verification process, model checking is one of the current advanced techniques for analyzing the behavior of a system. In this paper, we examine an existing parallel model checking algorithm and propose improvements that eliminate some of its computational bottlenecks. Our measurements show that the resulting algorithm has better scalability and performance than both the former parallel approach and the sequential algorithm.
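Although the abstract does not detail the algorithm, the core of explicit-state model checking is a reachability sweep over the state space, which is what parallel model checkers distribute across workers. The sketch below shows only that sequential baseline; violates, successors, and the integer state encoding are illustrative assumptions.

```cpp
#include <cstdio>
#include <queue>
#include <set>
#include <vector>

// Sketch of the exploration at the core of explicit-state model checking:
// breadth-first reachability from the initial state, flagging any state
// that violates a safety property.
using State = int;

bool violates(State s) { return s == 42; }  // illustrative "bad" state

std::vector<State> successors(State s) {    // illustrative transition relation
    return { (s + 1) % 100, (s * 2) % 100 };
}

bool safe(State init) {
    std::set<State> visited{init};
    std::queue<State> frontier;
    frontier.push(init);
    while (!frontier.empty()) {
        State s = frontier.front(); frontier.pop();
        if (violates(s)) return false;      // counterexample reached
        for (State n : successors(s))
            if (visited.insert(n).second) frontier.push(n);
    }
    return true;
}

int main() { std::printf("%s\n", safe(1) ? "safe" : "counterexample found"); }
```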
Current distributed computing systems built from commodity computers, such as Networks of Workstations (NOW), are obliged to deploy multicore processors to raise their performance. However, because multicore processors did not exist when traditional standard programming models and APIs for distributed computing such as MPI and PVM were designed, these traditional models are not well suited to programming multicore processors. In this paper, we argue in favor of a powerful programming model called the task-oriented programming model, which has recently been used for programming applications on both multicore processors and distributed computing systems such as computational grids. We argue that, because of its simplicity and its ability to scale applications automatically, the task-oriented programming model fits the requirements of programming multicore-enabled systems better than traditional models such as message passing or multi-threading.
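A minimal sketch of the task-oriented style, assuming C++'s std::async as the task runtime: the programmer submits independent units of work and lets the runtime map them onto the available cores, instead of hand-managing threads or message passing.

```cpp
#include <cstdio>
#include <future>
#include <numeric>
#include <vector>

// Task-oriented style: express independent work items; the runtime decides
// where and when they execute, so the program scales with the core count.
long sum_range(const std::vector<long>& v, size_t lo, size_t hi) {
    return std::accumulate(v.begin() + lo, v.begin() + hi, 0L);
}

int main() {
    std::vector<long> data(1'000'000, 1);
    const size_t tasks = 8, chunk = data.size() / tasks;

    std::vector<std::future<long>> parts;
    for (size_t t = 0; t < tasks; ++t)             // submit tasks, not threads
        parts.push_back(std::async(std::launch::async, sum_range,
                                   std::cref(data), t * chunk, (t + 1) * chunk));

    long total = 0;
    for (auto& f : parts) total += f.get();        // runtime handled scheduling
    std::printf("%ld\n", total);                   // prints 1000000
}
```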
The exponential increase in the generation and collection of data has led us into a new era of data analysis and information extraction. Conventional systems based on general-purpose processors are unable to keep pace with the heavy computational requirements of data mining techniques. High-performance co-processors such as GPUs and FPGAs have the potential to handle large computational workloads. In this paper, we present a scalable framework aimed at providing a platform for developing and using high-performance data mining applications on heterogeneous platforms. The framework incorporates a software infrastructure and a library of high-performance kernels, and it includes a variety of optimizations which increase the throughput of applications. The framework spans multiple technologies, including R, GPUs, multi-core CPUs, MPI, and parallel netCDF, harnessing their capabilities for high-performance computation. This paper also introduces the concept of interleaving GPU kernels from multiple applications, which provides significant performance gains. Thus, in comparison to other tools available for data mining, our framework provides an easy-to-use and scalable environment for both application development and execution. The framework is available as a software package which can easily be integrated into the R programming environment.
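The kernel-interleaving idea can be sketched abstractly: instead of draining one application's kernel queue before starting the next, a dispatcher round-robins across queues so independent work can overlap. The callable-based model below is an assumption made for illustration; on a real GPU each queue would correspond to its own stream.

```cpp
#include <cstdio>
#include <deque>
#include <functional>
#include <string>
#include <vector>

// Sketch of kernel interleaving: the dispatcher issues one kernel per
// application per round, rather than running each application's queue to
// completion, so the device can stay busy with independent work.
using Kernel = std::function<void()>;

void interleave(std::vector<std::deque<Kernel>>& apps) {
    bool any = true;
    while (any) {
        any = false;
        for (auto& q : apps) {          // one kernel per app per round
            if (q.empty()) continue;
            q.front()();
            q.pop_front();
            any = true;
        }
    }
}

int main() {
    auto k = [](std::string tag) {
        return [tag] { std::printf("kernel %s\n", tag.c_str()); };
    };
    std::vector<std::deque<Kernel>> apps = {
        {k("A1"), k("A2")},
        {k("B1"), k("B2"), k("B3")},
    };
    interleave(apps); // A1 B1 A2 B2 B3
}
```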
Computational Intelligence Systems (CIS) are a class of advanced software that holds an important position in solving single-objective, reverse/inverse, and multi-objective design problems in engineering. This paper hybridises a CIS for optimisation with the concept of Nash-Equilibrium as an optimisation pre-conditioner to accelerate the optimisation process. The hybridised CIS (Hybrid Intelligence System), coupled to a Finite Element Analysis (FEA) tool and a Computer Aided Design (CAD) system, GiD, is applied to solve an inverse engineering design problem: the reconstruction of High Lift Systems (HLS). Numerical results obtained by the hybridised CIS are compared to those obtained by the original CIS. The benefits of using the concept of Nash-Equilibrium are clearly demonstrated in terms of solution accuracy and optimisation efficiency.
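A minimal sketch of the pre-conditioning idea, assuming a toy convex objective split between two players: each player repeatedly optimises its own variable while the other's is frozen, and the resulting Nash equilibrium serves as a warm start for the global optimiser.

```cpp
#include <cmath>
#include <cstdio>

// Nash-Equilibrium as a pre-conditioner: design variables are split between
// two "players" who alternate best responses until a fixed point is reached.
// Objective (illustrative): f(x, y) = (x - 3)^2 + (y + 1)^2 + 0.5 * x * y
double best_x(double y) { return (6.0 - 0.5 * y) / 2.0; }  // argmin_x f(x, y)
double best_y(double x) { return (-2.0 - 0.5 * x) / 2.0; } // argmin_y f(x, y)

int main() {
    double x = 0.0, y = 0.0;
    for (int it = 0; it < 100; ++it) {            // alternating best responses
        double nx = best_x(y), ny = best_y(nx);
        if (std::fabs(nx - x) + std::fabs(ny - y) < 1e-12) { x = nx; y = ny; break; }
        x = nx; y = ny;
    }
    // (x, y) now approximates the Nash equilibrium; the global optimiser
    // would be started from this point. (For this convex toy objective the
    // equilibrium happens to coincide with the true minimum.)
    std::printf("equilibrium: x = %.6f, y = %.6f\n", x, y);
}
```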
The seemingly interminable dwindling of technology feature sizes well into the nano-scale regime has afforded computer architects an abundance of computational resources on a single chip. The Chip Multi-Processor (CMP) paradigm is now seen as the de facto architecture for years to come. However, in order to efficiently exploit the increasing number of on-chip processing cores, it is imperative to achieve and maintain efficient utilization of the resources at run time. Uneven and skewed distribution of workloads squanders CMP resources and may even lead to such undesired effects as traffic and temperature hotspots. While existing techniques rely mostly on software for load balancing and exploit hardware mainly for synchronization, we demonstrate that there are wider opportunities for hardware support of load balancing in CMP systems. Based on this observation, this paper proposes IsoNet, a conflict-free dynamic load distribution engine that aggressively exploits hardware to reinforce massively parallel computation in many-core settings. Moreover, the proposed architecture provides extensive fault tolerance against both CPU faults and intra-IsoNet faults. The hardware takes charge of both (1) the management of the list of jobs to be executed, and (2) the transfer of jobs between processing elements to maintain load balance. Experimental results show that, unlike the existing popular techniques of blocking and job stealing, IsoNet is scalable to as many as 1024 processing cores.
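The balancing policy itself can be sketched in software, even though IsoNet realizes it in dedicated logic. The queue representation and migration loop below are illustrative assumptions: jobs move from the most to the least loaded core until queue lengths even out.

```cpp
#include <algorithm>
#include <cstdio>
#include <deque>
#include <vector>

// Software analogue of the load-distribution step the paper moves into
// hardware: find the most and least loaded cores and migrate one job at a
// time until queue lengths are balanced.
void rebalance(std::vector<std::deque<int>>& queues) {
    for (;;) {
        auto max_q = std::max_element(queues.begin(), queues.end(),
            [](const auto& a, const auto& b) { return a.size() < b.size(); });
        auto min_q = std::min_element(queues.begin(), queues.end(),
            [](const auto& a, const auto& b) { return a.size() < b.size(); });
        if (max_q->size() <= min_q->size() + 1) return; // balanced
        min_q->push_back(max_q->back());                // migrate one job
        max_q->pop_back();
    }
}

int main() {
    std::vector<std::deque<int>> queues = {{1, 2, 3, 4, 5, 6}, {}, {7}, {8, 9}};
    rebalance(queues);
    for (size_t c = 0; c < queues.size(); ++c)
        std::printf("core %zu: %zu jobs\n", c, queues[c].size());
}
```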
We propose a systematic approach to building reliable distributed applications. The main objective of this approach is to consider reliability from application inception to completion, adding reliability patterns along the lifecycle and in all architectural layers. We start by enumerating the possible failures of the application, considering every activity in its use cases. The identified failures are then handled by applying reliability policies. We evaluate the benefits of this approach and compare it to others.
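As an illustration of attaching a reliability policy to a single activity, the sketch below wraps an activity in a retry-then-fallback handler. The policy, activity, and failure mode are hypothetical; the abstract does not specify the paper's own catalogue of patterns.

```cpp
#include <cstdio>
#include <functional>
#include <stdexcept>

// Sketch of applying a reliability policy to one activity of a use case:
// an identified failure mode is handled by retrying the activity a bounded
// number of times, then degrading gracefully via a fallback.
bool with_retry(const std::function<void()>& activity,
                const std::function<void()>& fallback, int attempts) {
    for (int i = 0; i < attempts; ++i) {
        try {
            activity();
            return true;               // activity succeeded
        } catch (const std::exception& e) {
            std::printf("attempt %d failed: %s\n", i + 1, e.what());
        }
    }
    fallback();                        // all retries exhausted
    return false;
}

int main() {
    int calls = 0;
    with_retry(
        [&] { if (++calls < 3) throw std::runtime_error("network timeout"); },
        [] { std::puts("serving cached result"); },
        5);
}
```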
In this paper, snoopy cache coherence protocols with an update strategy are studied. Since the descriptions of these protocols differ considerably across references, we offer a common description in order to show the simplicity and comparability of the protocols. The protocols are evaluated at the simulation level using the Limes simulator, which includes the Dragon protocol; we added the WTU and Firefly protocols to it. As the simulation results show, the precision of the block-sharing information in a multiprocessor system is the major reason for the differing performance of update snoopy cache coherence protocols: the more precise this information is, the higher the performance, but also the higher the implementation cost.
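The common behaviour shared by these update protocols can be sketched in a few lines: on a write, the new value is broadcast on the bus and every snooping cache holding the block updates its copy in place. The simulation below is a deliberately simplified illustration, not the Limes model; the real protocols differ in how precisely they track whether a block is actually shared.

```cpp
#include <cstdio>
#include <map>
#include <vector>

// Sketch of the update strategy shared by Dragon, Firefly and WTU: a write
// is broadcast on the bus and sharers update their copies in place,
// instead of invalidating them as write-invalidate protocols do.
struct Cache {
    std::map<int, int> lines; // block address -> value
};

void write(std::vector<Cache>& caches, int writer, int addr, int value) {
    caches[writer].lines[addr] = value;
    for (size_t c = 0; c < caches.size(); ++c)      // snoopers on the bus
        if ((int)c != writer && caches[c].lines.count(addr))
            caches[c].lines[addr] = value;          // update, don't invalidate
}

int main() {
    std::vector<Cache> caches(3);
    caches[0].lines[0x10] = 1;  // CPU0 and CPU2 share block 0x10
    caches[2].lines[0x10] = 1;
    write(caches, 0, 0x10, 7);  // CPU0 writes; CPU2's copy is updated
    std::printf("CPU2 sees %d\n", caches[2].lines[0x10]); // prints 7
}
```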
Bayesian networks (BN) are probabilistic graphical models which are widely utilized in modeling complex biological interactions in the cell. Learning the structure of a BN is an NP-hard problem and existing exact and heuristic solutions do not scale to large enough domains to allow for meaningful modeling of many biological processes. In this work, we present efficient parallel algorithms which push the scale of both exact and heuristic BN structure learning. We demonstrate the applicability of our methods by implementations on an IBM Blue Gene/L and an AMD Opteron cluster, and discuss their significance for future applications to systems biology.
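One source of the parallelism such algorithms exploit is that, for a fixed child variable, candidate parent sets can be scored independently. The sketch below fans that scoring out across threads; the score function is a stand-in rather than a real Bayesian scoring criterion, and the paper's cluster-scale MPI distribution is only approximated here by std::async.

```cpp
#include <cstdio>
#include <future>
#include <vector>

// Sketch of parallel BN structure learning: candidate parent sets for one
// child variable are scored concurrently, and the best-scoring set wins.
double score(const std::vector<int>& parents) {
    double s = 0.0;                     // illustrative: rewards cheap parents,
    for (int p : parents) s += 1.0 / (1 + p);
    return s - 0.3 * parents.size();    // penalizes larger sets
}

int main() {
    std::vector<std::vector<int>> candidates = {{}, {0}, {1}, {0, 1}, {0, 2}};

    std::vector<std::future<double>> futs;
    for (const auto& c : candidates)    // independent scoring jobs
        futs.push_back(std::async(std::launch::async, score, std::cref(c)));

    size_t best = 0;
    double best_score = futs[0].get();
    for (size_t i = 1; i < futs.size(); ++i) {
        double s = futs[i].get();
        if (s > best_score) { best_score = s; best = i; }
    }
    std::printf("best candidate parent set: #%zu (score %.3f)\n", best, best_score);
}
```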