ISBN (print): 9781479986484
In the era of big data, the volume of semantic data is growing rapidly. Large-scale semantic data contains a great deal of significant but often implicit information that must be derived by reasoning. Reasoning over semantic data is challenging. On one hand, traditional single-node reasoning systems can hardly cope with such large amounts of data because of resource limitations. On the other hand, existing large-scale reasoning systems are neither very efficient nor very scalable because of the complexity of the reasoning process. In this paper, we propose Cichlid, an efficient distributed reasoning engine for the widely used RDFS and OWL Horst rule sets. Cichlid is built on top of Spark and implements parallel reasoning algorithms with the Spark RDD programming model. We optimize the parallel RDFS reasoning algorithm in three respects: the data partition model, the execution order of the reasoning rules, and the removal of duplicated data. For the parallel OWL reasoning process, we optimize the most time-consuming parts: large-scale data joins, transitive closure computation, and equivalence relation computation. In addition to these algorithm-level optimizations, we also optimize Spark's internal execution mechanism by proposing an off-heap memory storage mechanism for RDDs. This system-level optimization patch has been accepted and integrated into Apache Spark 1.0. Experimental results show that Cichlid is around 10 times faster on average than state-of-the-art distributed reasoning systems on both large-scale synthetic and real-world benchmarks. The proposed reasoning algorithms and engine also achieve excellent scalability and fault tolerance.
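To make the kind of rule-based derivation described in this abstract concrete, here is a minimal single-machine sketch in Python. It is not Cichlid's algorithm and does not use Spark. It only applies two standard RDFS entailment rules (rdfs11, transitivity of `rdfs:subClassOf`, and rdfs9, type inheritance) until no new triples appear. The function name `rdfs_closure` and the sample data are assumptions made for illustration.

```python
SUB = "rdfs:subClassOf"
TYPE = "rdf:type"

def rdfs_closure(triples):
    """Apply rdfs11 and rdfs9 repeatedly until a fixpoint is reached.

    `triples` is an iterable of (subject, predicate, object) tuples.
    Hypothetical helper for illustration; a distributed engine would
    express the two joins below as parallel operations instead.
    """
    inferred = set(triples)
    while True:
        sub = {(s, o) for s, p, o in inferred if p == SUB}
        typ = {(s, o) for s, p, o in inferred if p == TYPE}
        new = set()
        # rdfs11: (a subClassOf b), (b subClassOf d) => (a subClassOf d)
        for a, b in sub:
            for c, d in sub:
                if b == c:
                    new.add((a, SUB, d))
        # rdfs9: (x type c), (c subClassOf d) => (x type d)
        for x, c in typ:
            for c2, d in sub:
                if c == c2:
                    new.add((x, TYPE, d))
        # Drop triples that are already known (duplicate removal).
        new -= inferred
        if not new:
            return inferred
        inferred |= new

facts = {
    ("Cat", SUB, "Mammal"),
    ("Mammal", SUB, "Animal"),
    ("tom", TYPE, "Cat"),
}
closure = rdfs_closure(facts)
```

On this toy input the closure adds three implicit triples: `Cat` is a subclass of `Animal`, and `tom` has types `Mammal` and `Animal`. The nested loops are quadratic joins, which hints at why joins and transitive closure are the expensive steps at scale.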
ISBN (print): 0818675829
We discuss the emergent Web-based distributed environments for HPCC on the NII, with a focus on Java as an enabling technology. We start with a review of the past, present, and near-term future of the 'Java phenomenon', set against the background of some related earlier approaches to a distributed interpretative virtual machine architecture.
ISBN (print): 9781424437511
Desktop Grids are rapidly gaining popularity as a cost-effective computing platform for the execution of applications with extensive computing needs. As opposed to grids and clusters, these systems are characterized by a non-dedicated infrastructure. This unique characteristic needs to be considered when developing resource management strategies for Desktop Grids. Several frameworks for the performance evaluation of resource management strategies have been suggested for grids. However, similar projects for Desktop Grids are still lacking. This paper presents MGST, the first performance testing framework for Desktop Grids. We discuss the design of the tool and show how it can be used to analyze and improve the performance of an existing Desktop Grid scheduling policy.
ISBN (print): 9781538643686
Floating point arithmetic, as specified in the IEEE standard, is used extensively in programs for science and engineering. This use is expanding rapidly into other domains, for example with the growing application of machine learning everywhere. While floating point arithmetic often appears to be arithmetic using real numbers, or at least numbers in scientific notation, it actually has a wide range of gotchas. Compiler and hardware implementations of floating point inject additional surprises. This complexity is only increasing as different levels of precision are becoming more common and there are even proposals to automatically reduce program precision (reducing power/energy and increasing performance) when results are deemed "good enough." Are software developers who depend on floating point aware of these issues? Do they understand how floating point can bite them? To find out, we conducted an anonymous study of different groups from academia, national labs, and industry. The participants in our sample did only slightly better than chance in correctly identifying key unusual behaviors of the floating point standard, and poorly understood which compiler and architectural optimizations were nonstandard. These surprising results and others strongly suggest caution in the face of the expanding complexity and use of floating point arithmetic.
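A few well-known IEEE 754 double-precision behaviors illustrate the kind of "gotchas" this abstract refers to. The Python sketch below is only an illustration. It is not taken from the study's questionnaire, and the function name `float_surprises` is an assumption.

```python
import math

def float_surprises():
    """Evaluate a few comparisons whose results often surprise developers."""
    nan = float("nan")
    return {
        # 0.1 and 0.2 have no exact binary representation.
        "0.1 + 0.2 == 0.3": 0.1 + 0.2 == 0.3,
        # NaN compares unequal to everything, including itself.
        "nan == nan": nan == nan,
        # Addition is not associative once rounding is involved.
        "(0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3)":
            (0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3),
        # Near 1e16 adjacent doubles are 2 apart, so adding 1 is absorbed.
        "1e16 + 1.0 == 1e16": 1e16 + 1.0 == 1e16,
        # Signed zeros compare equal but carry different signs.
        "-0.0 == 0.0": -0.0 == 0.0,
        "sign of -0.0 is negative": math.copysign(1.0, -0.0) == -1.0,
    }

for claim, result in float_surprises().items():
    print(f"{claim}: {result}")
```

The first three comparisons evaluate to `False` and the last three to `True`, which is the opposite of what real-number arithmetic would suggest for most of them.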
ISBN (print): 9781424437511
Distributed storage systems have become popular for handling the enormous amounts of data in network-centric systems. A distributed storage system provides client processes with the abstraction of a shared variable that satisfies some consistency and reliability properties. Typically, these properties are ensured through a replication-based implementation. This paper presents an algorithm for a replicated read-write register that can tolerate Byzantine failures of some of the replica servers. The targeted consistency condition is a version of regularity that supports multiple writers. Although regularity is weaker than the more frequently supported condition of atomicity, it is still strong enough to be useful in some important applications. By weakening the consistency condition, the algorithm can support multiple writers more efficiently than the known multi-writer algorithms for atomic consistency.
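The abstract does not give the algorithm itself, so the following Python sketch shows only a generic building block that Byzantine-tolerant replicated registers commonly rely on: a reader accepts a timestamped value only if at least f + 1 replicas report the same pair, so that at least one correct replica vouches for it. The function `byzantine_read`, the reply format, and the single integer timestamp are assumptions for illustration, not the paper's protocol.

```python
from collections import Counter

def byzantine_read(replies, f):
    """Pick the newest (timestamp, value) pair vouched for by >= f + 1 replicas.

    `replies` is a list of (timestamp, value) tuples, one per responding
    replica; `f` is the assumed maximum number of Byzantine replicas.
    Returns None when no pair has enough support.
    """
    counts = Counter(replies)
    vouched = [pair for pair, n in counts.items() if n >= f + 1]
    if not vouched:
        return None
    # Among sufficiently supported pairs, prefer the highest timestamp.
    return max(vouched, key=lambda pair: pair[0])

# One faulty replica reports a forged high timestamp; it is outvoted.
replies = [(2, "b"), (2, "b"), (1, "a"), (9, "forged")]
result = byzantine_read(replies, f=1)
```

Here `result` is `(2, "b")`: the forged pair has the highest timestamp but only one supporter, so it is ignored. A full protocol would also need write phases, quorum sizing, and handling of concurrent writers, none of which this sketch covers.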
ISBN (print): 9781665438193
Cyber-physical systems (CPS) are computer systems with integrated software and physical components that ideally interact seamlessly with the real world and with each other. While the use of distributed CPS has grown rapidly over the past decade, so has the need for efficient methods to ascertain the reliability of these systems by validating their correctness. Since exhaustively validating the correctness of a distributed CPS is usually infeasible, many modern validation methods involve run-time verification of distributed CPS based on safety properties. Our work focuses on developing time- and resource-efficient assurance techniques that can run in parallel with the execution of these systems to ensure reliability.
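As a rough illustration of run-time verification against a safety property, the Python sketch below merges per-node event traces by timestamp and reports the first state that violates a given predicate. It is a simplified stand-in, not the authors' technique: it assumes every node's trace is already sorted and that all nodes share a synchronized clock, which real distributed CPS generally cannot guarantee. The names `first_violation` and `is_safe` are hypothetical.

```python
import heapq

def first_violation(node_traces, is_safe):
    """Return the earliest (timestamp, node, state) event that breaks the property.

    `node_traces` is a list of per-node event lists, each sorted by timestamp,
    with events shaped as (timestamp, node_id, state). `is_safe` is a predicate
    over a single state. Returns None if every observed state is safe.
    """
    # Lazily interleave the sorted traces into one global timeline.
    merged = heapq.merge(*node_traces, key=lambda event: event[0])
    for timestamp, node, state in merged:
        if not is_safe(state):
            return (timestamp, node, state)
    return None

# Safety property (assumed for the example): temperature never exceeds 100.
traces = [
    [(1, "a", 20), (4, "a", 95)],
    [(2, "b", 30), (3, "b", 101)],
]
violation = first_violation(traces, lambda temp: temp <= 100)
```

With these traces, `violation` is `(3, "b", 101)`. Because `heapq.merge` is lazy, the monitor stops at the first violating event instead of scanning the full traces.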
ISBN (print): 0818675829
DAISy (Distributed Array of Inexpensive Systems) is a 16-node PC cluster running a full UNIX-compatible operating system. The network media used include standard 10Mb/s (10BASE-2) Ethernet (used for client node NFS mounts and any client node interactive work users find necessary) and switched 100Mb/s (100BASE-TX) Fast Ethernet (used for user program message passing traffic). The DAISy cluster is used to investigate the viability of commodity PC technology for scientific and engineering computations traditionally performed on 'supercomputers' and, more recently, on high-performance RISC workstations and clusters of RISC workstations. Performance analysis of the various single-node subsystems was carried out, along with performance analysis of the cluster as a whole on a number of parallel applications. The results show that the performance of the current 90MHz Pentium CPUs and motherboards is well within the range of many low-end workstations offered by traditional workstation vendors.
ISBN (print): 9781479941162
Self-adaptive clouds extend regular cloud platforms upstream with special autonomy features dedicated to handling increasing workload and service failures. Identifying such features is not necessarily an easy task. Sometimes they are stated explicitly in QoS requirements or in preliminary material available to requirements engineers. Often, though, they are implicit, so the autonomy features have to be captured. This paper elaborates on a methodology for capturing autonomy requirements for self-adaptive clouds with ARE, the Autonomy Requirements Engineering approach. In this approach, autonomy features are detected as special self-* objectives backed up by different capabilities and quality characteristics.
This paper describes compiler techniques that can translate standard OpenMP applications into code for distributed computer systems. OpenMP has emerged as an important model and language extension for shared-memory parallel programming. However, despite OpenMP's success on these platforms, it is not currently being used on distributed systems. The long-term goal of our project is to quantify the degree to which such a use is possible and to develop supporting compiler techniques. Our present compiler techniques translate OpenMP programs into a form suitable for execution on a software DSM system. We have implemented a compiler that performs this basic translation, and we have studied a number of hand optimizations that improve the baseline performance. Our approach complements related efforts that have proposed language extensions for efficient execution of OpenMP programs on distributed systems. Our results show that, while kernel benchmarks can show high efficiency of OpenMP programs on distributed systems, full applications need careful consideration of shared data access patterns. A naive translation (similar to OpenMP compilers for SMPs) leads to acceptable performance in only very few applications. However, additional optimizations, including access privatization, selective touch, and dynamic scheduling, result in a 31% average improvement on our benchmarks.
The concept of software architecture, also called system structure or system configuration, is especially important for designing complex software systems, as it provides a model of the large-scale structural properties of systems. Module interconnection languages (MILs) introduced the idea of creating program modules and connecting them to form larger structures. However, MILs do not support the description of important architectural elements. A new class of description languages, referred to as architectural description languages (ADLs), has recently emerged. Most ADLs, however, support only the description of static software architectures and not dynamic or reconfigurable software architectures. A further limitation of current ADLs is that they focus mainly on the formal notation and usually do not offer proof systems and tools to enable designers to formally verify the properties of their designs. We have developed the ZCL framework, a formal framework specified in Z, to describe and reason about dynamic distributed software architectures. In this paper, we use a simple case study - the client-server system - to demonstrate how our formal framework ZCL can be used to specify and verify reconfigurable software architectures.