MapReduce and its open-source implementation Hadoop are now widely deployed for big data analysis. Because MapReduce runs over large clusters of machines, data transfer often becomes a bottleneck in job processing. In this paper, we explore the influence of data transfer on job processing performance and analyze how congestion at disk I/O and/or network I/O caused by data transfer degrades job performance. Based on this analysis, we extend Hadoop's Heartbeat messages to carry the real-time system status of each machine, such as disk I/O and link usage rate. This enhancement makes Hadoop's scheduler aware of each machine's workload so that it can make more accurate scheduling decisions. Experiments evaluate the effectiveness of the enhanced scheduling methods, and a discussion compares the proposed scheduling policies.
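The load-aware scheduling idea in this abstract can be illustrated with a minimal sketch. It is not Hadoop's API or the authors' policy: the `Heartbeat` fields, the `pick_node` function, and the 0.8 utilization cutoff are all hypothetical names and values chosen for illustration. The sketch assumes each heartbeat reports free task slots plus disk and link utilization, and that the scheduler prefers the node whose busiest resource is least loaded.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Heartbeat:
    """Hypothetical heartbeat payload; field names are assumptions."""
    node: str
    free_slots: int
    disk_util: float  # fraction of disk I/O capacity in use, 0.0 to 1.0
    link_util: float  # fraction of network link capacity in use, 0.0 to 1.0


def pick_node(heartbeats: List[Heartbeat], max_util: float = 0.8) -> Optional[str]:
    """Return the least-loaded node that has a free slot, or None."""
    candidates = [
        hb for hb in heartbeats
        if hb.free_slots > 0 and max(hb.disk_util, hb.link_util) < max_util
    ]
    if not candidates:
        return None
    # Rank nodes by their most congested resource (disk or link).
    best = min(candidates, key=lambda hb: max(hb.disk_util, hb.link_util))
    return best.node
```

Ranking by the maximum of the two utilizations is one simple design choice; a weighted sum or separate thresholds per resource would be equally plausible policies.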
We are witnessing the advent of personal manufacturing, where home users and small and medium enterprises manufacture products locally, at the point and time of need. The impressively fast adoption of these technologies indicates this approach to manufacturing can become a key enabler of the real-time economy of the future. In this paper, we contribute a secure and dependable infrastructure and architecture for that new paradigm. Our solution leverages physical limitations of the computational process into a defense strategy that makes distributed file storage and transfer highly secure. The main idea is to replace asymmetric or public-key encryption functions with an unkeyed cryptographic hash function that is collision, second-preimage, and preimage resistant. Such a cryptosystem does not have an inverse function. We challenge each block hash against the full hash table to recreate the original message. To illustrate the approach, we describe secured protocols that provide a number of desirable properties during both data storage and streaming. Similar to proof-of-work blockchain consensus algorithms, we parameterized the solution based on the amount of infrastructure available. Experiments show the proposed method can recalculate hashes for a 3-dimensional matrix of 256^3 at an average of 14 revisions per second, and one revision every 5 minutes for a bigger matrix of 4096^3. The increase in cloud infrastructure cost is insignificant compared to the level of protection offered.
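One possible reading of "challenging each block hash against the full hash table" is sketched below. This is an illustrative toy, not the authors' protocol: it assumes one-byte blocks, SHA-256 as the hash, and a shared `nonce`, and the function names are hypothetical. Because the hash has no inverse, the receiver recovers each block by looking its hash up in a precomputed table of every possible block.

```python
import hashlib
from typing import Dict, List


def block_hash(block: bytes, nonce: bytes) -> bytes:
    """Hash one block together with a shared nonce (assumed setup)."""
    return hashlib.sha256(nonce + block).digest()


def build_table(nonce: bytes) -> Dict[bytes, bytes]:
    """Precompute hash -> block for all 256 possible one-byte blocks."""
    return {block_hash(bytes([v]), nonce): bytes([v]) for v in range(256)}


def encode(message: bytes, nonce: bytes) -> List[bytes]:
    """Replace every one-byte block of the message with its hash."""
    return [block_hash(message[i:i + 1], nonce) for i in range(len(message))]


def decode(hashes: List[bytes], table: Dict[bytes, bytes]) -> bytes:
    """Recover the message by looking each hash up in the full table."""
    return b"".join(table[h] for h in hashes)
```

The table size grows exponentially with the block size, which is where the amount of available infrastructure would enter as a parameter in a scheme of this kind.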
Blockchain, as an emerging decentralized architecture and distributed computing paradigm underlying Bitcoin and other cryptocurrencies, has attracted intensive attention in both research and applications recently. Blockchain, especially powered by chain-coded smart contracts, has the full potential of revolutionizing increasingly centralized cyber-physical-social systems (CPSSs) for constructions and applications, and reshaping traditional knowledge automation workflows. The key advantage of blockchain technology lies in the fact that it can enable the establishment of secured, trusted, and decentralized autonomous ecosystems for various scenarios, especially for better usage of the legacy devices, infrastructure, and resources.
RPYFMM is a software package for the efficient evaluation of the potential field governed by the Rotne-Prager-Yamakawa (RPY) tensor interactions in biomolecular hydrodynamics simulations. In our algorithm, the RPY tensor is decomposed as a linear combination of four Laplace interactions, each of which is evaluated using the adaptive fast multipole method (FMM) (Greengard and Rokhlin, 1997) where the exponential expansions are applied to diagonalize the multipole-to-local translation operators. RPYFMM offers a unified execution on both shared and distributed memory computers by leveraging the DASHMM library (DeBuhr et al., 2016, 2018). Preliminary numerical results show that the interactions for a molecular system of 15 million particles (beads) can be computed within one second on a Cray XC30 cluster using 12,288 cores, while achieving approximately 54% strong-scaling efficiency. Program summary Program Title: RPYFMM: Parallel Adaptive FMM for RPY Tensor Program Files doi: http://***/10.17632/zpbjvy8whp.1 Licensing provisions: BSD 3-clause Programming language: C++ Nature of problem: Evaluate the Rotne-Prager-Yamakawa tensor matrix-vector multiplications describing the hydrodynamics interaction in biomolecular systems. Solution method: The Rotne-Prager-Yamakawa tensor is decomposed as a linear combination of four Laplace interactions, each of which is evaluated using the new version of adaptive fast multipole method [1]. Additional Comments: RPYFMM is built on top of the DASHMM library and the Asynchronous Multi-Tasking HPX-5 runtime system. DASHMM is automatically downloaded during installation and HPX-5 is available at http://***/. (C) 2018 Elsevier B.V. All rights reserved.
Simultaneous correction of nonuniform attenuation and detector response was implemented in single-photon-emission computed tomography (SPECT) image reconstruction. A ray-driven projector-backprojector that exactly models attenuation in the reconstructed image slice and the spatially variant detector response was developed and used in the iterative maximum-likelihood algorithm for the correction. A computer-generated heart-lung phantom was used in simulation studies to compare the simultaneous correction method with an intrinsic attenuation correction method using a smoothing filter, an intrinsic attenuation correction method using a deconvolution filter, and a modified Chang attenuation correction method using a nonuniform attenuation distribution. The results demonstrate that the present method provides more accurate quantitation and superior image quality.
Matrix factorization, when the matrix has missing values, has become one of the leading techniques for recommender systems. To handle web-scale datasets with millions of users and billions of ratings, scalability becomes an important issue. Alternating least squares (ALS) and stochastic gradient descent (SGD) are two popular approaches to compute matrix factorization, and there has been a recent flurry of activity to parallelize these algorithms. However, due to the cubic time complexity in the target rank, ALS is not scalable to large-scale datasets. On the other hand, SGD conducts efficient updates but usually suffers from slow convergence that is sensitive to the parameters. Coordinate descent, a classical optimization approach, has been used for many other large-scale problems, but its application to matrix factorization for recommender systems has not been thoroughly explored. In this paper, we show that coordinate descent-based methods have a more efficient update rule compared to ALS and have faster and more stable convergence than SGD. We study different update sequences and propose the CCD++ algorithm, which updates rank-one factors one by one. In addition, CCD++ can be easily parallelized on both multi-core and distributed systems. We empirically show that CCD++ is much faster than ALS and SGD in both settings. As an example, with a synthetic dataset containing 14.6 billion ratings, on a distributed memory cluster with 64 processors, to deliver the desired test RMSE, CCD++ is 49 times faster than SGD and 20 times faster than ALS. When the number of processors is increased to 256, CCD++ takes only 16 s and is still 40 times faster than SGD and 20 times faster than ALS.
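The rank-one update scheme described in this abstract can be sketched as follows. This is a small pure-Python illustration under stated assumptions, not the authors' implementation: it uses plain L2 regularization with a single `lam`, a dictionary of observed ratings, and hypothetical function names `ccdpp` and `rmse`. For each rank-one factor, the factor's contribution is added back to the residual, the two vectors are refined by closed-form coordinate updates over observed entries, and the residual is updated again.

```python
import random
from typing import Dict, List, Tuple

Ratings = Dict[Tuple[int, int], float]


def ccdpp(ratings: Ratings, m: int, n: int, k: int = 2, lam: float = 0.1,
          outer: int = 20, inner: int = 3, seed: int = 0):
    """Factor observed ratings as W (m x k) times H^T (n x k), one rank-one factor at a time."""
    rng = random.Random(seed)
    W = [[0.0] * k for _ in range(m)]
    H = [[rng.uniform(-0.5, 0.5) for _ in range(k)] for _ in range(n)]
    rows: List[List[int]] = [[] for _ in range(m)]  # observed columns per row
    cols: List[List[int]] = [[] for _ in range(n)]  # observed rows per column
    for (i, j) in ratings:
        rows[i].append(j)
        cols[j].append(i)
    R = dict(ratings)  # residual; equals the ratings because W starts at zero
    for _ in range(outer):
        for t in range(k):
            u = [W[i][t] for i in range(m)]
            v = [H[j][t] for j in range(n)]
            for (i, j) in R:  # add the t-th factor back into the residual
                R[(i, j)] += u[i] * v[j]
            for _ in range(inner):
                for i in range(m):  # closed-form update of each u_i
                    num = sum(R[(i, j)] * v[j] for j in rows[i])
                    u[i] = num / (lam + sum(v[j] ** 2 for j in rows[i]))
                for j in range(n):  # closed-form update of each v_j
                    num = sum(R[(i, j)] * u[i] for i in cols[j])
                    v[j] = num / (lam + sum(u[i] ** 2 for i in cols[j]))
            for i in range(m):
                W[i][t] = u[i]
            for j in range(n):
                H[j][t] = v[j]
            for (i, j) in R:  # subtract the refreshed factor again
                R[(i, j)] -= u[i] * v[j]
    return W, H


def rmse(ratings: Ratings, W, H) -> float:
    """Root-mean-square error over the observed entries."""
    err = sum((r - sum(a * b for a, b in zip(W[i], H[j]))) ** 2
              for (i, j), r in ratings.items())
    return (err / len(ratings)) ** 0.5
```

Each coordinate update costs time proportional to the number of observed entries in its row or column, which is the source of the cheaper update compared with solving a k-by-k system per row as in ALS.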
A site broadcasting its local value to all other sites in a fault-prone environment is a basic paradigm in the development of reliable distributed systems. Time complexity lower limits and network connectivity requirements for reliable broadcast protocols in point-to-point communication networks are well known. Here, examination focuses on the reliable broadcast problem in distributed systems with broadcast networks as the fundamental communication architecture. It is demonstrated how properties of such network architectures can be employed to effectively restrict the externally visible behavior of faulty processors. These methods are used to derive simple protocols that implement reliable broadcast in only 2 rounds, independent of the failure upper bounds.
Secure data aggregation schemes are widely adopted in wireless sensor networks, not only to minimize energy and bandwidth consumption, but also to enhance security. Statistics obtained from data aggregation schemes often fall into three categories, i.e., distributive, algebraic, and holistic. In practice, a wide range of reasonable aggregation queries are combinations of several different statistics. Providing multi-functional aggregation support is also a primary demand for data preprocessing in data mining. However, most existing secure aggregation schemes focus on only a single type of statistic. Some statistics, especially holistic ones (e.g., median), are often difficult to compute efficiently in a distributed mode even without considering the security issue. In this paper, we first propose a new Multi-functiOnal secure Data Aggregation scheme (MODA), which encodes raw data into well-defined vectors to provide value-preservation, order-preservation and context-preservation, thus offering the building blocks for multi-functional aggregation. A homomorphic encryption scheme is adopted to enable in-ciphertext aggregation and end-to-end security. Then, two enhanced and complementary schemes are proposed based on MODA, namely, RandOm selected encryption based Data Aggregation (RODA) and COmpression based Data Aggregation (CODA). RODA can significantly reduce the communication cost at the expense of slightly lower but acceptable security on a leaf node, while CODA can dramatically reduce the communication cost at the expense of lower aggregation accuracy. The performance results obtained from theoretical analysis and experimental evaluation on three real datasets under different scenarios demonstrate that our schemes achieve performance superior to the most closely related work. (C) 2017 Elsevier B.V. All rights reserved.
This paper considers the leader-based consensus of heterogeneous multiple agents with nonlinear uncertain systems. Based on the information obtained from the following agents' neighbors, leader observers are designed by the following agents to estimate the leader's states and nonlinear dynamics. Then, to achieve leader-based consensus, adaptive distributed controllers are designed for the following agents to track the corresponding leader observers. The effectiveness of the leader observers and distributed consensus controllers is illustrated by formal proof and simulation.
This work addresses the time history analysis of structures subjected to dynamic loads using high performance computing environments. Structural mechanics, parallel computing, and object-oriented programming methodologies are integrated to design and implement frameworks for parallel and sequential transient finite element analysis (TFE++ and PTFE++). The object-oriented approach is employed to facilitate extensibility, reusability, maintainability and simplicity of the resulting software. Parallel processing concepts and algorithms are used in the design of PTFE++. An application has been developed to demonstrate the developed frameworks. It has been found that PTFE++ provides an efficient way to analyze large structural systems. (C) 2002 Elsevier Science Ltd. All rights reserved.