Storing highly skewed data in a distributed system has become a very frequent issue, in particular with the emergence of the Semantic Web and Big Data. This often leads to biased data dissemination among nodes. Addressing load imbalance is necessary, especially to minimize response time and to avoid the workload being handled by only one or a few nodes. Our contribution aims at dynamically managing load imbalance by allowing multiple hash functions on different peers, while maintaining consistency of the overlay. Our experiments, on highly skewed data sets from the Semantic Web, show that we can distribute data on at least 300 times more peers than when not using any load balancing strategy.
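The effect of skew on hash-based placement can be illustrated with a small sketch. It is not the method of the paper: the triple layout, the `hot_keys` set, the salt `"h2:"` and the choice to rehash on subject plus predicate are all assumptions made for illustration, and the overlay consistency that the abstract mentions is not modeled at all.

```python
import hashlib
from collections import Counter

def peer_for(value: str, n_peers: int, salt: str = "") -> int:
    """Map a string to a peer index with a salted SHA-256 hash."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_peers

def place_single_hash(triples, n_peers):
    """Place every (subject, predicate, object) triple by its predicate only."""
    return Counter(peer_for(p, n_peers) for (_s, p, _o) in triples)

def place_with_rehash(triples, n_peers, hot_keys):
    """Hypothetical variant: triples with a hot predicate are placed by a
    second hash function that also covers the subject."""
    load = Counter()
    for s, p, _o in triples:
        if p in hot_keys:
            load[peer_for(s + "|" + p, n_peers, salt="h2:")] += 1
        else:
            load[peer_for(p, n_peers)] += 1
    return load

# Synthetic skewed data set: 90% of the triples share one predicate.
triples = [(f"s{i}", "rdf:type", "Thing") for i in range(9000)]
triples += [(f"s{i}", f"p{i % 50}", "x") for i in range(1000)]

single = place_single_hash(triples, 64)
rehashed = place_with_rehash(triples, 64, {"rdf:type"})
print("max load, single hash:", max(single.values()))
print("max load, with rehash:", max(rehashed.values()))
```

With a single hash function, every triple carrying the hot predicate lands on the same peer. The second hash function spreads those triples across the overlay, at the price of a lookup needing to know which function applies to which key.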
The emergence of Big Data applications provides new challenges in data management, such as the processing and movement of masses of data. Volunteer computing has proven itself as a distributed paradigm that can fully support Big Data generation. This paradigm uses a large number of heterogeneous and unreliable Internet-connected hosts to provide petascale computing power for scientific projects. With the increase in data size and in the number of devices that can potentially join a volunteer computing project, the host bandwidth can become a main hindrance to the analysis of the data generated by these projects, especially if the analysis is a concurrent approach based on either in-situ or in-transit processing. In this paper, we propose a bandwidth model for volunteer computing projects based on real trace data taken from the Docking@Home project, with more than 280,000 hosts over a 5-year period. We validate the proposed statistical model using model-based and simulation-based techniques. Our modeling provides us with valuable insights into the concurrent integration of data generation with in-situ and in-transit analysis in the volunteer computing paradigm.
Mathematical optimization algorithms are ubiquitous in computational science and engineering, where the objective function of the optimization problem involves a complicated computer model predicting relevant phenomena of a scientific or engineering system of interest. Therefore, in this area of mathematical software, it is indispensable to combine software for optimization with software for simulation, which are typically developed independently of each other by members of separate scientific communities. From a software engineering point of view, the situation becomes even more challenging when the simulation software is developed using a parallel programming paradigm without taking into consideration that it will be executed within an optimization context. The EFCOSS environment alleviates some of these problems by serving as an interfacing layer between optimization software and simulation software. In this paper, we show the software design of those parts of EFCOSS that are relevant to the integration of simulation software involving different parallel programming paradigms. The parallel programming paradigms supported by EFCOSS include MPI for distributed memory and OpenMP for shared memory. In addition, the simulation software can be executed on a remote parallel computer.
ISBN: (print) 9783662438800; 9783662438794
Geometrically nonlinear forced vibrations of three-dimensional structures, due to harmonic excitations, are investigated in the frequency domain. Structures of elastic materials are considered, and the discretized equation of motion is derived by the finite element method, using the Elmer software. The shooting and continuation methods are applied to the resulting large-scale FEM system by using scalable parallel solvers. Periodic steady-state solutions are of interest, and their computation is achieved by these two techniques. The periodic solutions are obtained by the shooting method, i.e. by solving a two-point boundary value problem defined by the periodicity condition. For that purpose, a time integration scheme, such as Newmark's method, is used, and the correction of the initial guess is accomplished through a Newton-Raphson method. The next solution of the bifurcation diagram is obtained by the arc-length continuation method. A prediction for the new point of the bifurcation diagram is defined by using the previous solution, and the new solution is obtained by correcting the prediction, i.e. by the shooting method. The main objective of the current work is to investigate the potential of the proposed methods for the efficient computation of the bifurcation diagrams of large-scale dynamical systems, which result from the discretization in space of real-life structures, achieved by appropriate numerical techniques and parallel algorithms.
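The shooting step described in this abstract can be sketched on a toy problem. The example below is only an illustration under stated assumptions: it uses a single-degree-of-freedom Duffing-type oscillator with made-up coefficients instead of a large FEM system, a classical Runge-Kutta integrator in place of Newmark's method, and a finite-difference Jacobian in the Newton-Raphson correction. Arc-length continuation is not shown.

```python
import math

def rhs(t, x, v, damping=0.2, cubic=0.05, force=0.2, omega=1.2):
    """x'' + damping*x' + x + cubic*x^3 = force*cos(omega*t), as a first-order system."""
    return v, -damping * v - x - cubic * x**3 + force * math.cos(omega * t)

def flow_one_period(x0, v0, omega=1.2, steps=400):
    """Integrate over one forcing period with RK4 (stand-in for Newmark)."""
    h = (2.0 * math.pi / omega) / steps
    t, x, v = 0.0, x0, v0
    for _ in range(steps):
        k1x, k1v = rhs(t, x, v)
        k2x, k2v = rhs(t + h / 2, x + h / 2 * k1x, v + h / 2 * k1v)
        k3x, k3v = rhs(t + h / 2, x + h / 2 * k2x, v + h / 2 * k2v)
        k4x, k4v = rhs(t + h, x + h * k3x, v + h * k3v)
        x += h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        t += h
    return x, v

def residual(x0, v0):
    """Periodicity condition: state after one period minus initial state."""
    xT, vT = flow_one_period(x0, v0)
    return xT - x0, vT - v0

def shoot(x0=0.0, v0=0.0, tol=1e-10, max_iter=20, eps=1e-6):
    """Newton-Raphson correction of the initial state until it is periodic."""
    for _ in range(max_iter):
        r1, r2 = residual(x0, v0)
        if math.hypot(r1, r2) < tol:
            break
        a1, a2 = residual(x0 + eps, v0)
        b1, b2 = residual(x0, v0 + eps)
        j11, j21 = (a1 - r1) / eps, (a2 - r2) / eps  # d(residual)/d(x0)
        j12, j22 = (b1 - r1) / eps, (b2 - r2) / eps  # d(residual)/d(v0)
        det = j11 * j22 - j12 * j21
        # Solve J * (dx, dv) = -(r1, r2) with Cramer's rule.
        x0 += (-r1 * j22 + r2 * j12) / det
        v0 += (-r2 * j11 + r1 * j21) / det
    return x0, v0

x_star, v_star = shoot()
print("periodic initial state:", x_star, v_star)
print("residual norm:", math.hypot(*residual(x_star, v_star)))
```

For a system with many degrees of freedom the Jacobian would not be formed column by column as it is here, which is one reason the abstract points to scalable parallel solvers.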
The performance of sparse matrix-vector multiplication (SMVM) on a parallel system is strongly affected by the distribution of data among its components. Two costs arise from the data mapping method used: arithmetic and communication. The communication cost often dominates the arithmetic cost, and the gap between these costs tends to increase. Therefore, finding a mapping method that reduces the communication cost is of high importance. On the other hand, the load distribution among the processing units must not be sacrificed. In this paper, a data mapping method is proposed for SMVM on Network-on-Chip which achieves a balanced workload and reduces the communication cost. Afterwards, an FPGA-based architecture is introduced which is designed to fit the proposed data mapping method.
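The two costs named in this abstract can be made concrete with a minimal sketch. It assumes a matrix in CSR format, a simple greedy row assignment that balances nonzeros per unit, and a communication count defined as the number of reads of `x[j]` from a unit other than the one owning row `j`. These choices are illustrative assumptions and are not the mapping method proposed in the paper.

```python
def csr_matvec(indptr, indices, data, x):
    """Compute y = A @ x for a matrix A stored in CSR format."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        for k in range(indptr[row], indptr[row + 1]):
            y[row] += data[k] * x[indices[k]]
    return y

def assign_rows_greedy(indptr, n_units):
    """Give each row to the currently least-loaded unit (load = nonzeros)."""
    load = [0] * n_units
    owner = []
    for row in range(len(indptr) - 1):
        unit = load.index(min(load))
        owner.append(unit)
        load[unit] += indptr[row + 1] - indptr[row]
    return owner, load

def remote_reads(indptr, indices, owner):
    """Count reads of x[j] that cross units, assuming x[j] lives with row j
    (square matrix)."""
    count = 0
    for row in range(len(indptr) - 1):
        for k in range(indptr[row], indptr[row + 1]):
            if owner[indices[k]] != owner[row]:
                count += 1
    return count

# 4x4 example: [[4,0,0,1],[0,3,0,0],[2,0,5,0],[0,0,0,6]]
indptr = [0, 2, 3, 5, 6]
indices = [0, 3, 1, 0, 2, 3]
data = [4, 1, 3, 2, 5, 6]
x = [1, 2, 3, 4]

y = csr_matvec(indptr, indices, data, x)
owner, load = assign_rows_greedy(indptr, 2)
print(y, owner, load, remote_reads(indptr, indices, owner))
```

The greedy assignment only looks at the arithmetic load, so the remote-read count is whatever falls out of it. A mapping that also takes the column indices into account could trade a little balance for fewer remote reads.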
ISBN: (print) 9783662441961; 9783662441954
Recently, we have shown that a translating bar on which blindfolded participants position their hand is perceived as also rotating. Here, we investigated whether such an illusory rotation would also be found if a sphere or a plane (i.e. a stimulus without a clear orientation) was used as the translating stimulus. We indeed found similar rotation biases: on average, a stimulus that translates over a distance of 60 cm has to rotate 25° to be perceived as non-rotating. An additional research question was whether the biases were caused by the same underlying biasing egocentric reference frame. To our surprise, the correlations between the sizes of the biases of the individual participants in the various conditions were not high and mostly not even significant. This was possibly due to day-to-day variations, but clearly, more research is needed to answer this second research question.
ISBN: (print) 9783662433522; 9783662433515
Many healthcare units are creating cloud strategies and migration plans in order to exploit the benefits of cloud-based computing. This generally involves collaboration between healthcare specialists and data management researchers to create a new wave of healthcare technology and services. However, in many cases the technology pioneers are ahead of government policies, as cloud-based storage of healthcare data is not yet permissible in many jurisdictions. One approach is to store anonymised data on the cloud and maintain all identifying data locally. At login time, a simple protocol can be developed to allow clinicians to combine both sets of data for selected patients for the current session. However, the management of off-cloud identifying data requires a framework to ensure sharing and availability of data within clinics and the ability to share data between users in remote clinics. In this paper, we introduce the PACE healthcare architecture, which uses a combination of Cloud and Peer-to-Peer technologies to model healthcare units or clinics where off-cloud data is accessible to all, and where exchange of data between remote healthcare units is also facilitated.
K-Means, a simple but effective clustering algorithm, is widely used in the data mining, machine learning and computer vision communities. The K-Means algorithm consists of the initialization of cluster centers followed by iteration. The initial cluster centers have a great impact on the clustering result and on algorithm efficiency. More appropriate initial centers can bring k-Means closer to the optimum solution and lead to much quicker convergence. In this paper, we propose a novel clustering algorithm, Kmms, whose name abbreviates k-Means and Mean Shift. It is a density-based algorithm. Experiments show that our algorithm not only requires less initialization time than other density-based algorithms, but also achieves better clustering quality and higher efficiency. Compared with the popular k-Means++ algorithm, our method achieves comparable accuracy, and in most cases even better. Furthermore, we parallelize the Kmms algorithm with OpenMP in both the initialization and iteration steps, and prove the convergence of the algorithm.
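To show what a seeding step looks like, here is a minimal sketch of the k-Means++ initialization that the abstract uses as a baseline. It is not Kmms, whose density-based initialization is not described in enough detail here to reproduce. The function names, the fixed `seed` and the sample points are assumptions for illustration, and the sketch expects at least `k` distinct points.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def kmeanspp_init(points, k, seed=0):
    """Pick k initial centers: the first uniformly at random, each further
    one with probability proportional to its squared distance to the
    nearest center chosen so far."""
    rng = random.Random(seed)
    centers = [rng.choice(points)]
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        nxt = rng.choices(points, weights=d2, k=1)[0]
        centers.append(nxt)
        # Keep, for every point, the distance to its nearest center.
        d2 = [min(old, sq_dist(p, nxt)) for old, p in zip(d2, points)]
    return centers

# Three loose groups of distinct 2-D points.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (5.0, 5.0), (5.1, 4.9), (4.9, 5.2),
          (10.0, 0.0), (10.2, 0.1), (9.9, -0.1)]
centers = kmeanspp_init(points, 3)
print(centers)
```

Points already chosen as centers have weight zero and cannot be drawn again, so the returned centers are distinct. The usual assignment and update iterations of k-Means would then start from `centers`.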
ISBN: (print) 9783662438800; 9783662438794
In this article we present a performance study of our finite element package Hierarchical Hybrid Grids (HHG) on current European supercomputers. HHG is designed to close the gap between the flexibility of finite elements and the efficiency of geometric multigrid by using a compromise between structured and unstructured grids. A coarse input finite element mesh is refined in a structured way, resulting in semi-structured meshes. Within this article we compare and analyze the efficiencies of the stencil-based code on those clusters.
TOUGH2 is a general-purpose numerical simulation program for multi-dimensional, multiphase, multicomponent fluid flows, heat transfer and contaminant transport in porous and fractured media. It has been used worldwide for geothermal reservoir engineering, nuclear waste isolation, environmental assessment and remediation, and modeling flow and transport in variably saturated media. TOUGH2 is very computationally intensive, and the accuracy and scope of a simulation are limited by the amount of processing power available on a single computer. This makes it an ideal candidate for parallel computing, where more CPU power and memory are available. Furthermore, TOUGH2's main computational unit is a linear equation solver, and in parallel computing a lot of effort has been spent on developing highly efficient parallel linear equation solvers. In this paper, we present TOUGH2-PETSc, a parallel implementation of TOUGH2 that uses PETSc to solve the linear systems in TOUGH2. PETSc is a library of high-performance linear and non-linear equation solvers that has been thoroughly tested at scale. Based on TOUGH2 and PETSc, TOUGH2-PETSc gives TOUGH2 users the potential to perform larger-scale and higher-resolution simulations. Experimental results demonstrate that the parallel TOUGH2-PETSc shows improved performance over the sequential version.