The paper considers an algorithm for computing the sparse QR decomposition of a specially ordered rectangular matrix. The decomposition is based on block sparse Householder transformations. To order the computations, a nested dissection (ND) type ordering for the sparsity pattern of the matrix A^T A can be used, where A is the original rectangular matrix. For mesh-based problems, the ordering can be constructed starting from an appropriate volume partitioning of the computational mesh. Parallel computations are based on sparse QR decompositions for sets of rows with an additional zero block at the beginning. The suggested algorithm is planned to be used as the main computational kernel in parallel iterative algorithms, being developed by the author, for solving systems of linear algebraic equations (SLAEs) and least squares problems. The corresponding algorithms will be based on composition of subspaces represented by sparse bases.
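As a point of reference for the Householder step mentioned above, the following is a minimal dense, unblocked sketch. It is not the block sparse algorithm of the paper; the function name `householder_qr` and the list-of-lists matrix layout are assumptions made for illustration only.

```python
import math

def householder_qr(a):
    """Dense Householder QR for an m x n matrix (m >= n), given as a list of rows.

    Returns (r, vs): r is the m x n upper-triangular factor, vs holds the
    Householder vectors that implicitly define Q. Illustrative only: no
    sparsity, blocking, or ordering is exploited here.
    """
    m, n = len(a), len(a[0])
    r = [row[:] for row in a]          # work on a copy
    vs = []
    for k in range(n):
        x = [r[i][k] for i in range(k, m)]
        norm = math.sqrt(sum(t * t for t in x))
        v = x[:]
        if norm > 0:
            # choose the sign that avoids cancellation
            v[0] += math.copysign(norm, x[0])
        vnorm2 = sum(t * t for t in v)
        vs.append(v)
        if vnorm2 == 0:
            continue                   # column already zero below the diagonal
        # apply H = I - 2 v v^T / (v^T v) to the trailing columns
        for j in range(k, n):
            dot = sum(v[i] * r[k + i][j] for i in range(m - k))
            scale = 2.0 * dot / vnorm2
            for i in range(m - k):
                r[k + i][j] -= scale * v[i]
    return r, vs
```

Because Q is orthogonal, the computed factor satisfies R^T R = A^T A up to rounding, which is why the sparsity of A^T A is the natural object to order.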
In the era of big data, we are deluged with massive graph data emerging from numerous social and scientific applications. In most cases, graph data are generated as lists of edges (edge list), where an edge denotes a li...
ISBN: (print) 9781509025367
The linkage disequilibrium (LD) method is applied in research on population genetics inference, LD mapping, haplotype diversity analysis, and so on. Soybean genotypes are adopted as the data source, and a parallel linkage disequilibrium algorithm is implemented with OpenMP. In this algorithm, single nucleotide polymorphism (SNP) sites are divided into groups using sliding windows; LD between alleles at adjacent sites within a window is calculated in parallel for each chromosome, and the LD results are stored. The serial and parallel algorithms are compared and analyzed on experimental data. The results show that OpenMP parallelization can effectively improve the efficiency of linkage disequilibrium analysis, which is of practical significance for processing massive biological data.
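The sliding-window decomposition described above can be sketched as follows. The abstract does not say which LD statistic is used, so this sketch assumes the common r² measure on phased, biallelic haplotypes coded as 0/1; the function names and the thread pool (standing in for OpenMP) are likewise assumptions. In CPython the threads illustrate the task decomposition rather than deliver a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor

def r_squared(a, b):
    """r^2 between two biallelic sites given as equal-length 0/1 haplotype lists."""
    n = len(a)
    p_a = sum(a) / n
    p_b = sum(b) / n
    p_ab = sum(x & y for x, y in zip(a, b)) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # monomorphic site: r^2 is undefined, reported as 0 here
    d = p_ab - p_a * p_b
    return d * d / denom

def window_ld(sites, start, size):
    """All pairwise r^2 values among the sites in window [start, start + size)."""
    end = min(start + size, len(sites))
    return {(i, j): r_squared(sites[i], sites[j])
            for i in range(start, end) for j in range(i + 1, end)}

def sliding_window_ld(sites, size, step=1, workers=4):
    """Compute LD per window in parallel and collect the results in one dict."""
    starts = range(0, max(len(sites) - size + 1, 1), step)
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for part in pool.map(lambda s: window_ld(sites, s, size), starts):
            results.update(part)
    return results
```

Each window is an independent task, which is the property that makes a parallel loop over windows straightforward.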
The bioavailability of transition metals in sediments often depends on redox conditions in the sediment. We explored how the physicochemistry and toxicity of anoxic Cu-amended sediments changed as they aged (i.e., naturally oxidized) in a flow-through flume. We amended two sediments (Dow and Ocoee) with Cu, incubated the sediments in a flow-through flume, and measured sediment physicochemistry and toxicity over 213 days. As sediments aged, oxygen penetrated sediment to a greater depth, the relative abundance of Fe oxides increased in surface and deep sediments, and the concentration of acid volatile sulfide (AVS) declined in Ocoee surface sediments. The total pool of Cu in sediments did not change during aging, but porewater Cu and Cu bound to amorphous Fe oxides decreased while Cu associated with crystalline Fe oxides increased. The dose-response of the epibenthic amphipod Hyalella azteca to sediment total Cu changed over time, with older sediments being less toxic than freshly spiked sediments. We observed a strong dose-response relationship between porewater Cu and H. azteca growth across all sampling periods, and measurable declines in relative growth rates were observed at concentrations below interstitial water criteria established by the U.S. EPA. Further, solid-phase bioavailability models based on AVS and organic carbon were overprotective and poorly predicted toxicity in aged sediments. We suggest that sediment quality criteria for Cu are best established from measurement of Cu in pore water rather than by estimating bioavailable Cu from the various solid-phase ligands, which vary temporally and spatially.
In recent years, detecting dense sub-graphs that are known as communities in massive graphs has been a common issue in different fields of science. It provides the facility of studying complex graphs by simplifying th...
In this paper, we present a novel method that exploits the parallel capability of multi-core processors to speed up the well-known Count-Min sketch algorithm. The proposed parallel Count-Min sketch algorithm distributes the input data stream equally among sub-threads, each of which uses the original Count-Min sketch algorithm to process its sub-stream. The counters in each local Count-Min sketch whose frequency increments exceed a pre-defined threshold are sent to a merging thread, which returns estimated frequencies satisfying the (ε, δ)-approximation requirement. Experiments with real traffic traces demonstrate the excellent performance as well as the effects of the parameters. The parallel Count-Min sketch algorithm achieves near-linear speedup at the cost of greater memory use.
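A simplified sketch of the split-and-merge idea is given below. It is not the authors' method: the threshold-based forwarding of counters to the merging thread is omitted, and whole local sketches are merged by element-wise addition instead, which is valid because a Count-Min sketch is linear in its input. The class and function names are assumptions. For an (ε, δ) guarantee the standard sizing is width = ⌈e/ε⌉ and depth = ⌈ln(1/δ)⌉.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

class CountMin:
    """Plain Count-Min sketch with `depth` rows of `width` counters."""

    def __init__(self, width, depth):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # one salted hash per row; deterministic across runs
        digest = hashlib.blake2b(repr(item).encode(), digest_size=8,
                                 salt=row.to_bytes(8, "little")).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # never underestimates the true frequency
        return min(self.table[row][self._index(item, row)]
                   for row in range(self.depth))

    def merge(self, other):
        # element-wise sum; requires identical width, depth, and hashes
        for mine, theirs in zip(self.table, other.table):
            for i, value in enumerate(theirs):
                mine[i] += value

def parallel_count_min(stream, width, depth, workers=4):
    """Split the stream round-robin, sketch each part, then merge."""
    chunks = [stream[i::workers] for i in range(workers)]

    def build(chunk):
        local = CountMin(width, depth)
        for item in chunk:
            local.add(item)
        return local

    merged = CountMin(width, depth)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for local in pool.map(build, chunks):
            merged.merge(local)
    return merged
```

Keeping one local sketch per worker is also where the extra memory use noted in the abstract would come from in a design like this.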
ISBN: (print) 9781467390064
Graphlets represent small induced subgraphs and are becoming increasingly important for a variety of applications. Despite the importance of the local subgraph (graphlet) counting problem, existing work focuses mainly on counting graphlets globally over the entire graph. These global counts have been used for tasks such as graph classification as well as for understanding and summarizing the fundamental structural patterns in graphs. In contrast, this work proposes an accurate, efficient, and scalable parallel framework for the more challenging problem of counting graphlets locally for a given edge or set of edges. The local graphlet counts provide a topologically rigorous characterization of the local structure surrounding an edge. The aim of this work is to obtain the count of every graphlet of size k for each edge. The framework gives rise to efficient, parallel, and accurate unbiased estimation methods with provable error bounds, as well as exact algorithms for counting graphlets locally. Experiments demonstrate the effectiveness of the proposed exact and estimation methods on various datasets. In particular, the exact methods show strong scaling results (11-16x on 16 cores). Moreover, our estimation framework is accurate with error less than 5% on average.
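To make the notion of local (per-edge) counts concrete, here is a minimal exact sketch for the smallest case, k = 3, where the connected graphlets are the triangle and the open 2-path (wedge). It assumes a simple undirected graph given as a list of distinct edges with no self-loops; the function name and output layout are illustrative and are not taken from the framework described above.

```python
def edge_graphlet_counts(edges):
    """For each edge (u, v), count the induced 3-node graphlets containing it."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    counts = {}
    for u, v in edges:
        # a common neighbour closes a triangle on the edge
        tri = len(adj[u] & adj[v])
        # a neighbour of exactly one endpoint gives an induced 2-path
        wedges = (len(adj[u]) - 1 - tri) + (len(adj[v]) - 1 - tri)
        counts[(u, v)] = {"triangle": tri, "wedge": wedges}
    return counts
```

Each edge is processed independently of the others, so the outer loop over edges is the natural unit to distribute across cores.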
Reconfigurable models have been shown to be very powerful in solving many problems faster than non-reconfigurable models. WECPAR is a reconfigurable model that has a point-to-point reconfigurable interconnection with wires between neighboring processors. This paper studies several aspects of WECPAR. We first consider solving the list ranking problem on WECPAR. Some of the results obtained show that ranking one element in a list of elements can be solved on WECPAR in time. Also, on , ranking a list of elements can be done in time. Then, we assess the relative computational power of WECPAR and transfer a large body of algorithms to work directly on WECPAR. We introduce several simulation algorithms between WECPAR and well-known models such as PRAM and RMBM. The simulation algorithms show that a PRIORITY CRCW PRAM of processors and shared memory locations can be simulated on WECPAR in time. Also, we show that a PRIORITY CRCW basic RMBM(, of processors and buses can be simulated on WECPAR in time. This directly migrates a large number of algorithms to work on WECPAR with the simulation overhead.
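For readers unfamiliar with list ranking, the problem is to compute, for every node of a linked list, its distance to the tail. The sketch below uses classic PRAM-style pointer jumping, simulated sequentially; it is not the WECPAR algorithm, and the function name and the convention that the tail points to itself are assumptions.

```python
def list_rank(succ):
    """Rank every node of a linked list by pointer jumping.

    succ[i] is the index of the node after i; the tail points to itself.
    Returns (rank, rounds), where rank[i] is the distance from i to the tail
    and rounds is the number of synchronous jumping steps performed.
    """
    n = len(succ)
    rank = [0 if succ[i] == i else 1 for i in range(n)]
    nxt = list(succ)
    rounds = 0
    # stop once every pointer has reached the tail
    while any(nxt[i] != nxt[nxt[i]] for i in range(n)):
        # synchronous step: both lists are built from the old values,
        # as if all processors read before any of them writes
        rank = [rank[i] + rank[nxt[i]] for i in range(n)]
        nxt = [nxt[nxt[i]] for i in range(n)]
        rounds += 1
    return rank, rounds
```

Each round doubles the distance every pointer spans, so a list of n nodes needs about log2(n) rounds.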
In recent years, probabilistic data management has received a lot of attention due to several applications that deal with uncertain data: RFID systems, sensor networks, data cleaning, scientific and biomedical data management, and approximate schema mappings. Query evaluation is a challenging problem in probabilistic databases and has been proved to be #P-hard. A general method for query evaluation is based on the lineage of the query and reduces the query evaluation problem to computing the probability of a propositional formula. The main approaches proposed in the literature to approximate the confidence computation of probabilistic queries are based on Monte Carlo simulation or on formula compilation into decision diagrams (e.g., d-trees). The former runs in polynomial time but requires too many iterations, while the latter is polynomial for easy queries but may be exponential in the worst case. We designed a new optimized Monte Carlo algorithm that drastically reduces the number of iterations, and we propose an efficient parallel version that we implemented on a GPU. Thanks to the high degree of parallelism provided by the GPU, combined with the linear speedup of our algorithm, we significantly reduced the long running time required by a sequential Monte Carlo algorithm. Experimental results show that our algorithm is so efficient as to be comparable with the formula compilation approach, but with the significant advantage of avoiding exponential behavior.
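The reduction to the probability of a propositional formula can be illustrated with the naive Monte Carlo baseline, which is not the optimized algorithm of the paper. The sketch assumes the lineage is a monotone DNF (a list of clauses, each a list of variable names) over independent tuples with known marginal probabilities; the function name and this input format are assumptions.

```python
import random

def estimate_dnf_probability(clauses, prob, samples=20000, seed=0):
    """Estimate P(formula is true) by sampling possible worlds.

    clauses: list of clauses, each a list of variable names (all positive).
    prob:    dict mapping each variable to its independent probability.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        # draw one possible world: each tuple is present with its probability
        world = {var: rng.random() < p for var, p in prob.items()}
        # the DNF holds if every variable of at least one clause is present
        if any(all(world[v] for v in clause) for clause in clauses):
            hits += 1
    return hits / samples
```

For the lineage (x1 ∧ x2) ∨ x3 with every probability set to 0.5, the exact answer is 1 − (1 − 0.25)(1 − 0.5) = 0.625, which the estimate approaches as the sample count grows. The samples are independent of one another, which is what makes this family of methods a good fit for a GPU.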
We present a parallel algorithm for computing an equilibrium path in a large-scale economic growth model. We exploit the special block structure of the nonlinear systems of equations common in such models. Our algorithm is based on an iterative method of Gauss-Seidel type, with prices of different time periods calculated simultaneously rather than recursively. We have implemented the parallel algorithm in OpenMP and MPI programming environments. The numerical results show that speedup improves almost linearly as the number of nodes increases. Different methods for solving an individual block (Newton-type methods, Krylov subspace methods, and trust-region methods) give similar results for the speedup.