ISBN (print): 9781538633373
Social media is a very important factor in analyzing modern society as a whole, including its values, norms, and behaviors, because it is part of our everyday life. This study analyzes social media so that users can define their own preferences for following (analyzing) a specific social media source. A web application has been developed that allows a user to follow specific Facebook accounts and categorize the posts on those accounts according to user-defined taxonomies. The results of this study are various reports generated from the Facebook posts and their statistics, clustered according to the user-defined taxonomies. The benefit of this project is that any user can track in real time when people are talking about a given topic, which gives better insight into society as a whole: its values, norms, what people find interesting, and more. The tool is also useful for companies that want to track user feedback on their products on social networks.
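The categorization step described above can be illustrated with a minimal sketch. It assumes a taxonomy is simply a mapping from category names to keyword sets and that a post belongs to every category whose keywords it mentions; the abstract does not say how the application actually matches posts, so `TAXONOMY`, `categorize`, and `report` are hypothetical names and the sample posts are made up.

```python
from collections import Counter

# Hypothetical user-defined taxonomy: category -> keywords (not from the paper).
TAXONOMY = {
    "sports": {"football", "match", "goal"},
    "politics": {"election", "parliament", "vote"},
}

def categorize(post, taxonomy):
    """Return the sorted categories whose keywords appear in the post text."""
    words = {w.strip(".,!?;:").lower() for w in post.split()}
    return sorted(cat for cat, kws in taxonomy.items() if words & kws)

def report(posts, taxonomy):
    """Count how many posts fall under each category."""
    counts = Counter()
    for post in posts:
        counts.update(categorize(post, taxonomy))
    return dict(counts)

# Made-up sample posts, for illustration only.
posts = [
    "Great goal in last night's match!",
    "Election day: remember to vote.",
    "The match was delayed by the vote in parliament.",
]
print(report(posts, TAXONOMY))
```

A real deployment would fetch posts from the followed accounts and would likely need stemming or phrase matching; plain keyword lookup is only the simplest possible stand-in.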
ISBN (print): 9781538617762
Although the advancement of cyber technologies in sensing, communication and smart measurement devices has significantly enhanced power system security and reliability, the system's dependency on data communications makes it vulnerable to cyber-attacks. Coordinated false data injection (FDI) attacks manipulate power system measurements in a way that emulates the real behaviour of the system and remains unobservable, which misleads the state estimation process and may result in power outages and even system blackouts. In this paper, a robust dynamic state estimation (DSE) algorithm is proposed and implemented on the massively parallel architecture of a graphics processing unit (GPU). Numerical simulations on the IEEE 118-bus system demonstrate the efficiency and accuracy of the proposed mechanism.
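Why a coordinated FDI attack can remain unobservable is easiest to see in the standard static, linearized measurement model. That model is an assumption made here for illustration; the paper itself works with dynamic state estimation, and the symbols $H$, $W$, $a$, and $c$ below are not taken from the abstract.

```latex
% Measurement model, weighted least-squares estimate, and residual
z = Hx + e, \qquad
\hat{x} = (H^{\top} W H)^{-1} H^{\top} W z, \qquad
r = z - H\hat{x}

% Attacked measurements with an attack vector in the column space of H
z_a = z + a, \qquad a = Hc

% The estimate shifts by c while the residual is unchanged
\hat{x}_a = (H^{\top} W H)^{-1} H^{\top} W (z + Hc) = \hat{x} + c, \qquad
r_a = z + Hc - H(\hat{x} + c) = r
```

Because the residual is identical with and without the attack, a detector that thresholds $\lVert r \rVert$ cannot distinguish the two cases, which is the sense in which such an attack emulates the real behaviour of the system.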
ISBN (print): 9781509047154
Image processing can be done on the CPU or on a Graphics Processing Unit (GPU), using sequential or parallel programming respectively. Each programming paradigm has its own strengths. This paper analyses the performance of various basic image processing algorithms on the GPU as well as the CPU. Images with a range of dimensions were used for testing. The results show that the usability of the GPU for image processing problems depends highly on the nature of the problem and on the size of the problem domain.
ISBN (print): 9780769561493
In recent years, the search for scalability in computing power has led to very complex parallel computer architectures, which require greater control of storage and computation resources to utilize all the available hardware capacity for optimal performance. New solutions at the level of programming languages and models have increased the reliance on and need for threads. A system with a huge number of threads can face problems with thread micro-management, smooth scaling between data and task parallelism, portability, and consistency. We present TCF++, a new concurrent C/C++ language extension that generalizes the idea of threads with so-called thick control flows. As opposed to threading, thick control flows provide a way to orchestrate computation using a lower number of independent actors, dynamically adapting to problem size. The language extension approach was chosen to support mixing with legacy code. We qualitatively analyze the new language's eligibility and explain its idiomatic use with a selection of core parallel algorithm kernels.
ISBN (print): 9781450345934
Existing graph-processing frameworks let users develop efficient implementations for many graph problems, but none of them supports efficient bucketing of vertices, which is needed for bucketing-based graph algorithms such as Δ-stepping and approximate set cover. Motivated by the lack of simple, scalable, and efficient implementations of bucketing-based algorithms, we develop the Julienne framework, which extends a recent shared-memory graph processing framework called Ligra with an interface for maintaining a collection of buckets under vertex insertions and bucket deletions. We provide a theoretically efficient parallel implementation of our bucketing interface and study several bucketing-based algorithms that make use of it (either bucketing by remaining degree or by distance) to improve performance: the peeling algorithm for k-core (coreness), Δ-stepping, weighted breadth-first search, and approximate set cover. The implementations are all simple and concise (under 100 lines of code). Using our interface, we develop the first work-efficient parallel algorithm for k-core in the literature with nontrivial parallelism. We experimentally show that our bucketing implementation scales well and achieves high throughput on both synthetic and real-world workloads. Furthermore, the bucketing-based algorithms written in Julienne achieve up to 43x speedup on 72 cores with hyper-threading over well-tuned sequential baselines, significantly outperform existing work-inefficient implementations in Ligra, and either outperform or are competitive with existing special-purpose parallel codes for the same problem. We experimentally study our implementations on the largest publicly available graphs and show that they scale well in practice, processing real-world graphs with billions of edges in seconds, and hundreds of billions of edges in a few minutes. As far as we know, this is the first time that graphs at this scale have been analyzed in the main memory of a single multicore m
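To make "bucketing by remaining degree" concrete, here is a minimal sequential sketch of the classic peeling algorithm for coreness. It is not Julienne's interface or its parallel algorithm; the function name `coreness` and the adjacency-dictionary input format are assumptions chosen for illustration.

```python
def coreness(adj):
    """Sequential bucket-based peeling; adj maps vertex -> set of neighbours."""
    deg = {v: len(ns) for v, ns in adj.items()}
    max_deg = max(deg.values(), default=0)
    # buckets[d] holds the not-yet-peeled vertices whose remaining degree is d.
    buckets = [set() for _ in range(max_deg + 1)]
    for v, d in deg.items():
        buckets[d].add(v)
    core = {}
    k = 0
    while k <= max_deg:
        if not buckets[k]:
            k += 1  # bucket exhausted, move on to the next degree
            continue
        v = buckets[k].pop()
        core[v] = k  # v is peeled at level k
        for u in adj[v]:
            # Skip peeled neighbours and those already at the current level.
            if u in core or deg[u] <= k:
                continue
            buckets[deg[u]].remove(u)
            deg[u] -= 1
            buckets[deg[u]].add(u)
    return core

# A triangle (0, 1, 2) with one pendant vertex 3 attached to vertex 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(sorted(coreness(graph).items()))  # [(0, 2), (1, 2), (2, 2), (3, 1)]
```

A parallel version would extract a whole bucket at once and apply the degree updates for all of its vertices together, which is the kind of operation the bucketing interface described in the abstract is meant to support.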
ISBN (print): 9783319589435; 9783319589428
The paper presents a practical approach for building high-level services for teaching parallel and distributed computing (PDC) based on the Everest platform. Originally designed for publishing computing applications, the platform is suitable for rapid development of services for running different types of parallel programs on high-performance resources, as well as services for evaluating practical assignments. As was demonstrated by using Everest to teach two introductory PDC courses, the proposed approach helps to enhance students' practical experience while avoiding low-level interfaces and providing the level of automation necessary for scaling a course to a large number of students. In contrast to other solutions, the Platform as a Service model used here lets other PDC educators quickly reuse this approach without installing the platform.
ISBN (print): 9781450354905
We present an algorithm to compute the intersection of two 3D triangulated meshes. It has applications in GIS, CAD and Additive Manufacturing, and was developed to process big datasets quickly and correctly. The speed comes from simple regular data structures that parallelize very well. The correctness comes from using multiple-precision rational arithmetic to prevent roundoff errors and the resulting topological inconsistencies, and symbolic perturbation (simulation of simplicity) to handle special cases (geometric degeneracies). To simplify the symbolic perturbation, the algorithm employs only orientation predicates. This paper focuses on the challenges of implementing symbolic perturbation and their solutions. Our preliminary implementation has intersected two objects totalling 8M triangles in 11 elapsed seconds on a dual 8-core Xeon. The competing LibiGL took 248 seconds and CGAL took 2726 seconds. Our software is freely available for nonprofit research.
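The role of rational arithmetic in an orientation predicate can be sketched with Python's `fractions.Fraction`. This is an illustrative stand-in rather than the authors' implementation: the name `orient3d` and the tuple-based point format are assumptions, and the sketch covers only the exact-arithmetic part. It returns 0 for degenerate input, which is exactly the case symbolic perturbation would go on to resolve.

```python
from fractions import Fraction

def orient3d(a, b, c, d):
    """Sign of det[b-a, c-a, d-a]: +1, -1, or 0 for coplanar points."""
    # Convert every coordinate to an exact rational before subtracting.
    (ux, uy, uz), (vx, vy, vz), (wx, wy, wz) = (
        tuple(Fraction(p[i]) - Fraction(a[i]) for i in range(3))
        for p in (b, c, d)
    )
    det = (ux * (vy * wz - vz * wy)
           - uy * (vx * wz - vz * wx)
           + uz * (vx * wy - vy * wx))
    return (det > 0) - (det < 0)

# Decimal strings are parsed exactly, so no roundoff enters the computation.
a = ("0.1", "0.1", "0.1")
b = ("0.2", "0.2", "0.2")
c = ("0.3", "0.3", "0.3")
print(orient3d(a, b, c, (1, 0, 0)))  # a, b, c are collinear, so this prints 0
print(orient3d((0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)))  # prints 1
```

Passing coordinates as strings or integers keeps the input exact; a float such as `0.1` would be converted to the binary value it actually stores, which is exact but not equal to one tenth.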
ISBN (digital): 9783319682105
ISBN (print): 9783319682105; 9783319682099
The Many Integrated Core (MIC) architecture combines dozens of reduced x86 cores onto a single chip to offer high degrees of parallelism. Parallel user applications executed across the many cores of one or more MICs require a series of tasks related to data sharing and synchronization with the host. In this work, we build a real CPU+MIC heterogeneous cluster and analyze its performance behavior by examining different communication methods, such as message passing and remote direct memory access. Our evaluation results and in-depth studies reveal that (i) aggregating small messages can improve network bandwidth without violating latency restrictions, (ii) while MICs can execute hundreds of hardware cores, the highest network throughput is achieved when only 4 to 6 point-to-point connections are established for data communication, and (iii) data communication over multiple point-to-point connections between the host and MICs introduces severe load imbalance, which needs to be optimized for future heterogeneous computing.
ISBN (print): 9781509015603
Solving large-scale problems in a variety of scientific and engineering fields requires efficient hierarchical methods to exploit parallelism. In this paper we present optimizations to enhance the performance of parallel N-body simulations (NBS) using the Barnes-Hut approximation on a 60-core MIC accelerator. We focus on two sources of performance degradation in NBS: (1) semi-static parallelism, which leads to dynamic load imbalance, and (2) the processing of very large data sets that exceed the cache capacity. The first proposed optimization balances the load dynamically by using the load computed in one iteration as an estimate of the load in the next iteration, which helps to distribute the load evenly for that iteration. The second proposed optimization subdivides the data into well-adjusted chunks to enhance data reuse in shared caches. The proposed optimizations are tested on a 60-core MIC accelerator. Evaluation results show that the optimized NBS achieves a speedup of up to 33% due to dynamic load balancing and 260% due to enhanced reuse of cached data.
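The idea of using the load measured in one iteration as the estimate for the next can be sketched as follows. The abstract does not describe the actual partitioning scheme, so this sketch swaps in a simple greedy heuristic (heaviest body first, assigned to the least-loaded worker); `assign_bodies` and `prev_cost` are hypothetical names, and the per-body cost is assumed to be something like the number of interactions counted in the previous iteration.

```python
import heapq

def assign_bodies(prev_cost, workers):
    """Greedy assignment: heaviest body first, always to the least-loaded worker."""
    heap = [(0, w) for w in range(workers)]  # (accumulated cost, worker id)
    assignment = [[] for _ in range(workers)]
    # Visit bodies in order of decreasing cost from the previous iteration.
    for body in sorted(range(len(prev_cost)), key=lambda b: -prev_cost[b]):
        load, w = heapq.heappop(heap)
        assignment[w].append(body)
        heapq.heappush(heap, (load + prev_cost[body], w))
    return assignment

# Made-up per-body costs from a previous iteration, for illustration only.
prev_cost = [9, 1, 1, 1, 8, 2, 3, 7]
parts = assign_bodies(prev_cost, workers=2)
print([sum(prev_cost[b] for b in part) for part in parts])  # [16, 16]
```

In a simulation loop, the costs measured while computing forces in iteration t would be passed to `assign_bodies` before iteration t+1. A production code would more likely keep each worker's bodies spatially contiguous to preserve cache locality, which this greedy sketch ignores.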
ISBN (print): 9781538632840
We present measured results for the parallelization efficiency of a code for numerical 2D electromagnetic analysis. The code is based on the method of moments and uses higher-order basis functions with a Galerkin testing procedure. The linear algebra functions within the code are from the Intel MKL library, and the code uses OpenMP for parallelization. A workstation with up to 40 hyper-threads is used for the numerical experiments. The calculation of monostatic radar cross-sections of a 2D structure is used as a test example.
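The abstract does not state how parallelization efficiency is defined. Assuming the usual definitions, with $T(p)$ the measured wall-clock time on $p$ threads, speedup and efficiency are:

```latex
S(p) = \frac{T(1)}{T(p)}, \qquad
E(p) = \frac{S(p)}{p} = \frac{T(1)}{p \, T(p)}
```

Under these definitions, $E(p) = 1$ corresponds to ideal linear scaling, and values below 1 reflect serial fractions, synchronization overhead, or threads contending for shared hardware.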