ISBN: (print) 9781479914067
The traditional parallel FP-growth algorithm (PFP) suffers from two major limitations: it requires multiple database scans (i.e., high I/O cost) and incurs high inter-processor communication cost. We therefore design and implement a parallel association rule mining method based on cloud computing. The algorithm adopts a separation strategy so that each local database is visited only once, which reduces I/O and inter-processor communication overhead. In addition, the MapReduce model is used to handle mining over very large data sets, and computation is executed on the node where the data is stored, which avoids transmitting large amounts of data over the network and reduces communication overhead. Experimental results on a Hadoop cluster built from ordinary PCs verify that the proposed cloud-based algorithm offers higher efficiency and good speedup.
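The single-scan, compute-where-the-data-lives idea can be illustrated with a minimal sketch. It is not the paper's algorithm: it covers only the item-counting stage that FP-growth-style methods start from, and it simulates map and reduce with plain Python rather than Hadoop. The names `map_count`, `reduce_counts`, `frequent_items`, and the toy `shards` are all hypothetical.

```python
from collections import Counter
from functools import reduce


def map_count(shard):
    """Map step (assumed): scan one local shard exactly once and count items."""
    counts = Counter()
    for transaction in shard:
        counts.update(set(transaction))  # count each item once per transaction
    return counts


def reduce_counts(partials):
    """Reduce step (assumed): merge the small per-shard count tables."""
    return reduce(lambda a, b: a + b, partials, Counter())


def frequent_items(shards, min_support):
    """Keep the items whose global count reaches the support threshold."""
    totals = reduce_counts(map_count(shard) for shard in shards)
    return {item: n for item, n in totals.items() if n >= min_support}


# Two hypothetical local databases, each stored on a different node.
shards = [
    [["a", "b"], ["b", "c"]],
    [["a", "b", "c"], ["b"]],
]
result = frequent_items(shards, min_support=2)
```

Only the per-shard count tables cross the simulated network here, not the transactions themselves, which is the communication saving the abstract describes.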
ISBN: (print) 9781479961245
We study computing maximum flows for real-world networks. In contrast to prior studies, our implementation is bulk synchronous. Our algorithm applies push and relabel operations on all active vertices in parallel, and maintains a preflow through a delayed flow update approach with handshakes between flow pushing and flow receiving. In our implementation, the heuristics are no longer entangled with the push and relabel operations as in prior implementations. We apply two heuristics well known for the sequential algorithm, namely global relabel and gap relabel, to our parallel implementation. Experiments on networks constructed for computer vision images show that our parallel implementation on the target 8-core machine is up to 8.6 times faster than the best sequential implementation. We also propose a new augmenting-path-based heuristic for small-world graphs. On large social networks with up to billions of edges, our implementation achieves close to 40 times parallel speedup.
ISBN: (print) 9783642385407
The proceedings contain 18 papers. The topics discussed include: semantically aware contention management for distributed applications; FITCH: supporting adaptive replicated services in the cloud; network forensics for cloud computing; dynamic deployment of sensing experiments in the wild using smartphones; AJITTS: adaptive just-in-time transaction scheduling; strategies for generating and evaluating large-scale powerlaw-distributed P2P overlays; ambient clouds: reactive asynchronous collections for mobile ad hoc network applications; bandwidth prediction in the face of asymmetry; a scalable benchmark as a service platform; failure analysis and modeling in large multi-site infrastructures; evaluating the price of consistency in distributed file storage services; an effective scalable SQL engine for NoSQL databases; and EZ: towards efficient asynchronous protocol gateway construction.
ISBN: (print) 9780769550947
More and more users are joining social networks. A precise social trust value is critical for applications such as recommender systems. For a given user, the egocentric network consists of the user, his friends, and the social relationships between him and other users. We propose an algorithm for inferring dynamic trust based on trust chains and interactions. Indirect trust values are calculated from direct trust values and trust chains in the egocentric network. As the social network evolves, dynamic trust values are derived from the interactions between a user and his friends and from trust references. A statistical analysis of trust values in a social network shows that the algorithm improves the accuracy of trust transitivity and trust value computation compared with a classical trust algorithm.
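The abstract does not state how direct trust values are combined along a chain. The sketch below shows one common convention, purely as an assumption: trust is multiplied along each chain, and the strongest chain is taken as the inferred value. The function names, the users, and the trust values are all hypothetical.

```python
def chain_trust(direct, chain):
    """Multiply direct trust values along one chain (assumed propagation rule)."""
    value = 1.0
    for src, dst in zip(chain, chain[1:]):
        value *= direct[(src, dst)]
    return value


def indirect_trust(direct, chains):
    """Use the strongest chain as the inferred trust (assumed aggregation rule)."""
    return max(chain_trust(direct, chain) for chain in chains)


# Hypothetical direct trust values inside alice's egocentric network.
direct = {
    ("alice", "bob"): 0.5,
    ("bob", "dave"): 0.5,
    ("alice", "carol"): 1.0,
    ("carol", "dave"): 0.125,
}
chains = [["alice", "bob", "dave"], ["alice", "carol", "dave"]]
trust = indirect_trust(direct, chains)
```

Under these assumptions, a dynamic variant would update the entries of `direct` as new interactions are observed and then recompute the chains.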
ISBN: (print) 9780769550886
Signature-based intrusion detection systems have been widely deployed in current network environments to defend against various attacks, but the expensive process of signature matching is a major bottleneck for these detection systems. Thus, a high-performance signature matching scheme is of great importance for a signature-based IDS. In our previous work, we developed an exclusive signature matching scheme that aims to identify a mismatch instead of locating an accurate match, and it achieved good results in experiments. With the advent of cloud computing, IDS as a service (IDSaaS) has been proposed as an alternative that offloads expensive operations, such as signature matching, to the cloud. In this paper, we attempt to design a parallel model to conduct exclusive signature matching in a cloud. In the evaluation, we implemented our model in a cloud environment and investigated its performance compared with Snort. The experimental results indicate that our proposed model can achieve promising performance in such a cloud environment.
ISBN: (print) 9783642408199; 9783642408205
The growing size of multiprocessor systems increases their vulnerability to component failures. Fault diagnosis is the process of identifying faulty processors in a system through self-testing, and diagnosability is an important parameter for measuring the reliability of an interconnection network. As a new measure of fault tolerance, conditional diagnosability can better evaluate the real diagnosability of interconnection networks. In this paper, we derive the conditional diagnosability of multiprocessor systems based on Complete Josephus Cubes CJC(n) (n >= 8) under the comparison model.
In order to develop reliable applications for parallel machines, programming languages and systems need to provide for flexible parallel programming coordination techniques. Barriers, clocks and phasers constitute pro...
ISBN: (print) 9780769550886
The performance of parallel distributed data management systems becomes increasingly important with the rise of Big Data. Parallel joins have been widely studied in both the parallel processing and the database communities. Nevertheless, most of the algorithms developed so far do not consider data skew, which naturally exists in various applications. State-of-the-art methods designed to handle this problem are based on extensions to either of the two prevalent conventional approaches to parallel joins: the hash-based and duplication-based frameworks. In this paper, we introduce a novel parallel join framework, query-based distributed join (QbDJ), for handling data skew on distributed architectures. Further, we present an efficient implementation of the method based on the asynchronous partitioned global address space (APGAS) parallel programming model. We evaluate the performance of our approach on a cluster of 192 cores (16 nodes) and datasets of 1 billion tuples with different skews. The results show that the method is scalable, and that it runs faster with less network communication than the state-of-the-art PRPD approach in [1] under high data skew.
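A small sketch shows why data skew hurts the conventional hash-based framework. It illustrates the problem only, not QbDJ: join keys are assigned to workers by hashing, so a key that occurs very often overloads a single worker. The function `hash_partition` and the toy key lists are hypothetical.

```python
from collections import Counter


def hash_partition(keys, workers):
    """Count how many tuples each worker receives under hash partitioning."""
    load = Counter()
    for key in keys:
        load[hash(key) % workers] += 1  # all tuples with the same key go to one worker
    return [load[w] for w in range(workers)]


# Hypothetical join keys: one uniform column and one skewed column.
uniform_keys = list(range(8))
skewed_keys = [0] * 5 + [1, 2, 3]

uniform_load = hash_partition(uniform_keys, workers=4)
skewed_load = hash_partition(skewed_keys, workers=4)
```

With the skewed column, one worker receives five of the eight tuples, while the uniform column spreads them evenly at two per worker.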
Computer forensics involves the collection, analysis, and reporting of information about security incidents and computer-based criminal activity. Cloud computing poses new challenges for the forensics process. This p...
ISBN: (print) 9783642408199; 9783642408205
In a cloud computing environment, virtual machine (VM) allocation is an important task for providing infrastructure services. Generally, the datacenters on which a cloud computing platform runs are distributed over a wide area network. Therefore, communication cost should be taken into consideration when allocating VMs across servers in multiple datacenters. A network-aware VM allocation algorithm for the cloud is developed. It tries to minimize the communication cost and latency between servers while satisfying users' requirements on the number of VMs, VM configurations, and communication bandwidth. Specifically, a two-dimensional knapsack algorithm is applied to solve this problem. The algorithm is evaluated and compared with other algorithms through experiments, which show satisfactory results.
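The abstract names a two-dimensional knapsack algorithm but does not give its formulation. As a minimal sketch, assume each VM request consumes two resources (CPU and memory here) and carries a benefit score, and that a server has fixed capacities in both dimensions. The function `knapsack_2d`, the choice of resources, and the numbers are all hypothetical.

```python
def knapsack_2d(items, cap_cpu, cap_mem):
    """0/1 knapsack with two capacity constraints.

    items: list of (cpu, mem, value) tuples with non-negative integer cpu and mem.
    Returns the best total value that fits within both capacities.
    """
    best = [[0] * (cap_mem + 1) for _ in range(cap_cpu + 1)]
    for cpu, mem, value in items:
        # Iterate capacities downwards so each item is used at most once.
        for c in range(cap_cpu, cpu - 1, -1):
            for m in range(cap_mem, mem - 1, -1):
                candidate = best[c - cpu][m - mem] + value
                if candidate > best[c][m]:
                    best[c][m] = candidate
    return best[cap_cpu][cap_mem]


# Hypothetical VM requests: (CPU cores, memory in GB, benefit score).
vms = [(2, 4, 3), (1, 2, 2), (3, 3, 4)]
best_value = knapsack_2d(vms, cap_cpu=4, cap_mem=6)
```

In this toy instance the second and third requests fit together (4 cores, 5 GB) for a total score of 6. A network-aware variant would presumably fold communication cost into the score, which this sketch does not attempt.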