Strongly correlated electron materials are materials in which the electron-electron interaction is so strong that it is of primary importance in determining how the materials behave. These materials may be tuned to pass through a metal-insulator transition as a function of chemical doping, pressure, or temperature. This tunability makes them good candidates for so-called "smart materials" that may change their properties in response to the particular needs of a device. We employ a massively parallel algorithm to solve exactly for the response of these strongly correlated materials to the presence of a large electric field, including all nonlinear and non-equilibrium effects. The work was performed as a CAP Phase II project on the ERDC XT3 computer in the winter of 2006. The code displayed good scalability up to approximately 3000 processors, although most production runs employed about 1500 processors because of machine load.
In this paper, we propose a new term weighting scheme called term frequency-inverse corpus frequency (TF-ICF). It does not require term frequency information from other documents within the document collection and thus enables us to generate the document vectors of N streaming documents in linear time. In the context of a machine learning application, unsupervised document clustering, we evaluated the effectiveness of the proposed approach against five widely used term weighting schemes through extensive experimentation. Our results show that TF-ICF produces document clusters of comparable quality to those generated by the widely recognized term weighting schemes, while being significantly faster than those methods.
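The key property of such a scheme can be sketched as follows. The exact weighting formula here, tf multiplied by log((N+1)/(cf+1)) with corpus frequencies taken from a fixed independent corpus, is an assumption for illustration, and the names `CORPUS_SIZE` and `corpus_freq` are hypothetical placeholders; the point is that each document is weighted using only precomputed static statistics, so streaming documents never require rescanning the collection.

```python
import math

# Hypothetical static corpus statistics (assumed for illustration): the
# frequencies come from an independent corpus, not from the stream itself.
CORPUS_SIZE = 1000
corpus_freq = {"data": 400, "mining": 120, "quantum": 3}

def tf_icf(doc_tokens):
    """Weight each term by tf * log((N+1)/(cf+1)) using only static corpus counts,
    so a streaming document can be vectorized without seeing other documents."""
    weights = {}
    for t in doc_tokens:
        weights[t] = weights.get(t, 0) + 1          # raw term frequency
    for t in weights:
        cf = corpus_freq.get(t, 0)                  # corpus frequency (0 if unseen)
        weights[t] *= math.log((CORPUS_SIZE + 1) / (cf + 1))
    return weights

vec = tf_icf(["data", "mining", "data", "quantum"])
```

Rare terms such as "quantum" receive a much larger weight than common terms such as "data", mirroring the behaviour of inverse document frequency without needing the collection's own statistics.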
Modern forensic analytics applications, like network traffic analysis, perform high-performance hypothesis testing, knowledge discovery, and data mining on very large datasets. One essential strategy for reducing the time required by these operations is to select only the most relevant data records for a given computation. In this paper, we present a set of parallel algorithms that demonstrate how an efficient selection mechanism - bitmap indexing - significantly speeds up a common analysis task, namely, computing conditional histograms on very large datasets. We present a thorough study of the performance characteristics of the parallel conditional histogram algorithms. As a case study, we compute conditional histograms for detecting distributed scans hidden in a dataset consisting of approximately 2.5 billion network connection records. We show that these conditional histograms can be computed on an interactive time scale (i.e., in seconds). We also show how to progressively modify the selection criteria to narrow the analysis and find the sources of the distributed scans.
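A minimal sketch of the selection mechanism, assuming the simplest form of bitmap indexing: one uncompressed bitmap per distinct column value, with Python integers standing in for bit vectors. A production system would use compressed bitmaps and parallel execution, but the structure of the query, bitwise AND for selection followed by population counts for the histogram, is the same.

```python
from collections import defaultdict

def build_bitmap_index(column):
    """One bitmap (a Python int used as a bit vector) per distinct value:
    bit r is set in index[v] iff row r of the column holds value v."""
    index = defaultdict(int)
    for row, value in enumerate(column):
        index[value] |= 1 << row
    return index

def conditional_histogram(target_index, condition_bitmap):
    """Histogram of the target column over only the rows selected by the
    condition: one AND plus one popcount per distinct target value."""
    return {v: bin(bm & condition_bitmap).count("1")
            for v, bm in target_index.items()
            if bm & condition_bitmap}

# Toy connection records: destination port and source host per row.
ports = [80, 22, 80, 443, 22, 80]
hosts = ["a", "a", "b", "a", "b", "a"]
port_idx = build_bitmap_index(ports)
host_idx = build_bitmap_index(hosts)

# Conditional histogram: distribution of ports contacted by host "a".
hist = conditional_histogram(port_idx, host_idx["a"])
```

Narrowing the analysis, as the paper describes, amounts to ANDing additional condition bitmaps into `condition_bitmap` before recomputing the histogram.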
Today, due to the wide variety of existing parallel systems consisting of collections of heterogeneous machines, it is very difficult for a user to solve a target problem with a single algorithm or to write portable programs that perform well on multiple computing platforms. The inherent heterogeneity and network diversity of such environments make modeling communications for high-performance computing applications a great challenge. Our objective in this work is to propose a generic framework, based on communication models and adaptive techniques, for predicting communication performance on cluster-based hierarchical platforms. Toward this goal, we introduce the concept of a poly-model of communications: a set of techniques for better modeling communications in terms of the characteristics of the hardware resources of the target parallel system. We apply this methodology to collective communication operations and show that the framework yields significant performance gains by determining the best model-algorithm combination for the given problem and architecture parameters.
Frequent itemset mining is a classic problem in data mining. It is an unsupervised process that consists in finding frequent patterns (or itemsets) hidden in large volumes of data in order to produce compact summaries or models of the database. These models are typically used to generate association rules, but recently they have also been used in far-reaching domains like e-commerce and bioinformatics. Because databases are increasing in both dimension (number of attributes) and size (number of records), one of the main issues for a frequent itemset mining algorithm is the ability to analyze very large databases. Sequential algorithms do not have this ability, especially in terms of run-time performance, for such very large databases. Therefore, we must rely on high-performance parallel and distributed computing. We present new parallel algorithms for frequent itemset mining. Their efficiency is demonstrated through a series of experiments on different parallel environments, ranging from shared-memory multiprocessor machines to a set of SMP clusters connected through a high-speed network. We also briefly discuss an application of our algorithms to the analysis of large databases collected by a Brazilian Web portal.
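For readers unfamiliar with the underlying problem, a minimal sequential sketch, a bare level-wise Apriori without the candidate-pruning step, and not the paper's parallel algorithms, illustrates what frequent itemset mining computes:

```python
def apriori(transactions, min_support):
    """Minimal level-wise Apriori: returns every itemset occurring in at
    least `min_support` transactions, mapped to its support count."""
    transactions = [frozenset(t) for t in transactions]

    def count(candidates):
        return {c: sum(1 for t in transactions if c <= t) for c in candidates}

    # Frequent 1-itemsets seed the level-wise search.
    items = {frozenset([i]) for t in transactions for i in t}
    freq = {c: n for c, n in count(items).items() if n >= min_support}
    result = dict(freq)
    k = 2
    while freq:
        # Join step: combine frequent (k-1)-itemsets into k-candidates.
        keys = list(freq)
        candidates = {a | b for a in keys for b in keys if len(a | b) == k}
        freq = {c: n for c, n in count(candidates).items() if n >= min_support}
        result.update(freq)
        k += 1
    return result

patterns = apriori([{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}],
                   min_support=2)
```

Each pass over the database counts one candidate level, which is exactly the work that the parallel algorithms in the paper distribute across processors.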
The knapsack problem is a famous NP-hard problem whose solution usually requires not only exponential time but also exponential space; for this reason it is very important in cryptography and number theory. Building on the two-list algorithm and the three-list algorithm, this paper proposes a parallel three-list algorithm for solving the knapsack problem. To avoid possible memory conflicts, the algorithm adopts a divide-and-conquer method together with parallel merging free of memory conflicts. The proposed algorithm needs O(2^(3n/8)) time when O(2^(3n/8)) shared memory units and O(2^(n/4)) processors are available to find a solution to an n-element knapsack instance. Comparison with previous work shows that this is the first EREW parallel algorithm that can solve knapsack instances in less than O(2^(n/2)) time with hardware resources also below O(2^(n/2)); it thus improves on previous results and may have some impact on research into knapsack-based public-key cryptosystems.
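The two-list idea the algorithm builds on can be sketched for the subset-sum form of the knapsack problem. This is the classic Horowitz-Sahni meet-in-the-middle scheme, shown sequentially for clarity; it is not the paper's three-list parallel algorithm, but it is the source of the O(2^(n/2)) baseline the paper improves on.

```python
def subset_sums(items):
    """All (sum, mask) pairs over subsets of `items` (2^len(items) entries)."""
    sums = [(0, 0)]
    for i, w in enumerate(items):
        sums += [(s + w, m | (1 << i)) for s, m in sums]
    return sums

def two_list_knapsack(weights, target):
    """Horowitz-Sahni meet-in-the-middle: split the items in half, enumerate
    each half's subset sums (two lists of size 2^(n/2) instead of one of
    size 2^n), and look for a pair of half-sums that reaches the target."""
    half = len(weights) // 2
    left = subset_sums(weights[:half])
    right = {s: m for s, m in subset_sums(weights[half:])}
    for s, m in left:
        if target - s in right:
            # Combine the two half-masks into one selection bitmask.
            return m | (right[target - s] << half)
    return None

mask = two_list_knapsack([7, 3, 5, 12, 9, 2], target=17)
```

The returned bitmask encodes which weights are selected; the three-list approach refines this search so that both time and space fall below the O(2^(n/2)) of the two-list scheme.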
This paper describes how to achieve a desired speedup by careful selection of appropriate algorithms for parallelization. Our target simulation is the Total Airport and Airspace Model (TAAM), a worldwide standard for aviation analysis. TAAM is designed as a sequential program, and we have increased its speed by incorporating multithreaded algorithms with minimal changes to the underlying simulation architecture. Our method was to identify algorithms that are bottlenecks in the computation and that can be executed concurrently, producing a hybrid sequential and parallel simulation. Our results show a performance gain that varied between 14% and 33%.
The bandwidth allocation problem in ATM networks is NP-complete. This paper presents a novel generalized particle approach (GPA) to optimizing bandwidth allocation and QoS parameters for ATM networks. The GPA transforms the optimization of ATM networks into the kinematics and dynamics of numerous particles in a force field. The GPA has many advantages, including high parallelism, multi-objective optimization, multi-type coordination, and ease of hardware implementation. During ATM network optimization, the GPA can deal with a variety of random and emergent phenomena, such as congestion, failure, and interaction. This paper also establishes the GPA's properties regarding correctness, convergence, and stability. Simulations have shown the effectiveness and suitability of the GPA for the optimization of ATM networks.
ISBN (print): 3540287000
The modified Gram-Schmidt algorithm (MGS) is used in many fields of computational science as a basic building block for problems in numerical linear algebra. In this paper we describe different parallel implementations (blocked and unblocked) of the MGS algorithm and show how overlapping computation and communication can increase performance by up to 38 percent on the two cluster platforms used for performance evaluation.
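For reference, a minimal sequential sketch of the unblocked MGS algorithm itself, in pure Python for clarity; the blocked and parallel variants the paper studies build on this same column-by-column update pattern:

```python
import math

def mgs_qr(A):
    """Modified Gram-Schmidt QR factorization of an m x n matrix A (list of
    rows), returning Q (m x n, orthonormal columns) and R (n x n, upper
    triangular) with A = Q R. Each new q vector is immediately projected
    out of all remaining columns, which is what distinguishes the 'modified'
    variant from classical Gram-Schmidt and improves numerical stability."""
    m, n = len(A), len(A[0])
    V = [row[:] for row in A]                 # working copy, updated in place
    Q = [[0.0] * n for _ in range(m)]
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        R[j][j] = math.sqrt(sum(V[i][j] ** 2 for i in range(m)))
        for i in range(m):
            Q[i][j] = V[i][j] / R[j][j]       # normalize column j
        for k in range(j + 1, n):
            R[j][k] = sum(Q[i][j] * V[i][k] for i in range(m))
            for i in range(m):
                V[i][k] -= R[j][k] * Q[i][j]  # remove q_j from column k now
    return Q, R

Q, R = mgs_qr([[1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
```

The inner updates over the trailing columns are independent per column, which is precisely what the blocked and parallel implementations exploit.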
We focus on the interplay between the choice of partition (problem decomposition) and the corresponding rate of convergence of parallel numerical algorithms. Using a specific algorithm, for which the numerics depend upon the partition, we demonstrate that the rate of convergence can depend strongly on the choice of partition. This dependence is shown to be a function of the algorithm and of the choice of problem. Information gleaned from tests using various 2-way partitions leads to new partitions for which some degree of convergence robustness is exhibited. Incorporating a known correction for approximate Schur complements into the original algorithm yields a modified parallel algorithm which, numerical experiments indicate, achieves robust convergence behaviour with respect to the choice of partition. We conclude that tests of a parallel algorithm which vary the method of partitioning can provide constructive information regarding the robustness of the algorithm, as well as guidance for modifying the algorithm or the choice of partitioning method to make the overall computation more robust. Copyright (C) 2003 John Wiley & Sons, Ltd.