ISBN: (print) 9781509035403
With the rapid development of information, communication, computer, and control technology, the smart grid has become the direction of development for the electric power industry. The ultimate goal of the smart grid is to build a panoramic real-time system that covers the whole production process of the power system. However, current power systems struggle to meet the dispatching department's demand for storing and processing large-scale data. To address this, this paper develops the Mysql-CIM model, realizes distributed cloud storage of power network data, and develops an application for parallel topology processing of the power network. A case study verifies that the Mysql-CIM model fully meets the smart grid's requirements for system reliability, availability, and high throughput. The CIM-based parallel topology processing application achieves fast network topology processing and topology island formation, with running times improved to the millisecond scale, greatly improving the efficiency of the system.
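Topology island formation, as mentioned in the abstract, amounts to grouping buses that are electrically connected through closed branches. The following is a minimal sequential sketch using union-find, not the paper's parallel CIM-based implementation; the bus names, the `(from, to, closed)` branch tuples, and the function names are all hypothetical.

```python
from collections import defaultdict


def find(parent, x):
    # Follow parent links to the root, halving the path as we go.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def topology_islands(nodes, branches):
    """Group nodes into islands connected by closed branches.

    branches is a list of (from_node, to_node, closed) tuples (assumed format).
    """
    parent = {n: n for n in nodes}
    for a, b, closed in branches:
        if not closed:
            continue  # an open switch does not connect its terminals
        ra, rb = find(parent, a), find(parent, b)
        if ra != rb:
            parent[rb] = ra
    islands = defaultdict(list)
    for n in nodes:
        islands[find(parent, n)].append(n)
    return sorted(sorted(group) for group in islands.values())


# Hypothetical five-bus network with one open branch.
nodes = ["B1", "B2", "B3", "B4", "B5"]
branches = [("B1", "B2", True), ("B2", "B3", False), ("B3", "B4", True)]
islands = topology_islands(nodes, branches)
```

Here the open branch between B2 and B3 splits the network, and B5 forms an island of its own because no branch touches it.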
Current shared-memory multi-core systems require powerful software and hardware techniques to support high-performance parallel computation and consistency simultaneously. Transactional memory significantly improves performance by avoiding thread synchronization and lock overhead, and transaction scheduling also clearly influences the performance of transactional memory. In this paper, we study the fairness of transaction scheduling using the Lazy Snapshot Algorithm. Fair scheduling aims to balance the two transaction types: read-only and update transactions. We support the fairness of the scheduling procedure with a machine learning technique that improves fairness decisions according to the transactions' history. Our experiments show that the throughput of the Lazy Snapshot Algorithm improves with machine learning support, and that learning significantly affects performance when update transactions last much longer than read-only ones. We also study several machine learning techniques to investigate the accuracy of the fairness decisions. The K-Nearest Neighbor technique shows higher accuracy and better suitability for our problem than the Support Vector Machine and the Hidden Markov Model.
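To illustrate how a K-Nearest Neighbor classifier could turn transaction history into a scheduling decision, here is a minimal sketch. The two features (fraction of read-only transactions in a recent window, and the ratio of update to read-only duration), the labels, and the sample values are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def knn_predict(history, query, k=3):
    """Return the majority label among the k history points nearest to query."""
    nearest = sorted(history, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical history: (read_only_fraction, update_to_read_duration_ratio) -> decision
history = [
    ((0.90, 8.0), "favor_update"),
    ((0.80, 7.0), "favor_update"),
    ((0.85, 9.0), "favor_update"),
    ((0.20, 1.0), "favor_read_only"),
    ((0.30, 1.5), "favor_read_only"),
    ((0.10, 2.0), "favor_read_only"),
]

decision_long_updates = knn_predict(history, (0.75, 7.5))
decision_short_updates = knn_predict(history, (0.25, 1.2))
```

The features are left unscaled for brevity; in practice they would likely need normalization so that one feature does not dominate the distance.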
Various methods have been proposed for image enhancement. Some perform well in specific application areas, but most suffer from artifacts caused by over-enhancement. To overcome this problem, we introduce a new image enhancement technique, Bilateral Histogram Equalization with Pre-processing (BHEP), which uses the harmonic mean to divide the histogram of the image. We performed both qualitative and quantitative experiments, and the results show that BHEP creates fewer artifacts in several standard images than existing state-of-the-art image enhancement techniques.
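The idea of dividing a histogram at the harmonic mean and equalizing each part separately can be sketched as follows. This is a generic bi-histogram equalization sketch, not BHEP itself: the pre-processing step is omitted, and the `+1` shift that keeps zero-valued pixels from causing a division by zero is an assumption, as are all function names.

```python
from collections import Counter


def harmonic_mean_split(pixels):
    # Shift intensities by 1 so that zero-valued pixels are allowed (assumption).
    hm = len(pixels) / sum(1.0 / (p + 1) for p in pixels) - 1
    return int(hm)


def equalize_range(subset, lo, hi):
    """Map each intensity in subset into [lo, hi] using the subset's own CDF."""
    hist = Counter(subset)
    total = len(subset)
    mapping, cumulative = {}, 0
    for level in sorted(hist):
        cumulative += hist[level]
        mapping[level] = lo + round((hi - lo) * cumulative / total)
    return mapping


def bi_histogram_equalize(pixels, max_level=255):
    """Equalize the parts below and above the harmonic-mean split independently."""
    split = harmonic_mean_split(pixels)
    lower = [p for p in pixels if p <= split]
    upper = [p for p in pixels if p > split]
    mapping = {}
    if lower:
        mapping.update(equalize_range(lower, 0, split))
    if upper:
        mapping.update(equalize_range(upper, split + 1, max_level))
    return [mapping[p] for p in pixels], split


# Hypothetical flattened 8-bit image.
pixels = [10, 10, 20, 200, 220, 240]
enhanced, split = bi_histogram_equalize(pixels)
```

Because each part is stretched only within its own range, dark pixels stay at or below the split and bright pixels stay above it, which limits the brightness shift that global equalization can introduce.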
ISBN: (print) 9781509028245
In this paper, we present a bottom-up approach to parallel anisotropic mesh generation by building a mesh generator from principles. Applications focusing on high-lift design or dynamic stall, as well as numerical methods and modeling test cases, still focus on two dimensions. Our push-button parallel mesh generation approach can generate high-fidelity unstructured meshes with anisotropic boundary layers for use in computational fluid dynamics. The anisotropy requirement adds a level of complexity to a parallel meshing algorithm by making computation depend on the local alignment of elements, which in turn is dictated by the geometric boundaries and the density functions. Our experimental results show 70% parallel efficiency over the fastest sequential isotropic mesh generator on 256 distributed-memory nodes.
Transactional memory (TM) has become progressively more widespread, especially as hardware transactional memory (HTM) implementations become increasingly available. In this paper, we focus on Restricted Transactional Memory (RTM) in Intel's Haswell processor and show that the performance of RTM varies across applications. While RTM enhances the performance of some applications relative to software transactional memory (STM), it degrades the performance of others. We exploit this variability and present an adaptive system, a static approach that switches between HTM and STM at transaction granularity. By incorporating a decision tree prediction module, we are able to predict the optimum TM system for a given transaction based on its characteristics. Our adaptive system supports both HTM and STM with the aim of increasing an application's performance. We show that our adaptive system has an average overall speedup of 20.82% over both TM systems.
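As a rough illustration of predicting a TM system from transaction characteristics, the sketch below trains a depth-one decision tree (a decision stump) by exhaustive search. It is not the paper's prediction module: the features (write-set and read-set sizes in cache lines), the training samples, and the labels are hypothetical.

```python
def train_stump(samples):
    """Find the single feature/threshold split with the fewest training errors.

    samples is a list of (feature_tuple, label) pairs.
    Returns (feature_index, threshold, label_if_at_or_below, label_if_above).
    """
    labels = sorted({label for _, label in samples})
    best = None
    for f in range(len(samples[0][0])):
        values = sorted({x[f] for x, _ in samples})
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            for left in labels:
                for right in labels:
                    errors = sum(
                        1 for x, y in samples
                        if (left if x[f] <= threshold else right) != y
                    )
                    if best is None or errors < best[0]:
                        best = (errors, f, threshold, left, right)
    return best[1:]


def predict(stump, x):
    f, threshold, left, right = stump
    return left if x[f] <= threshold else right


# Hypothetical profile: (write_set_cache_lines, read_set_cache_lines) -> faster system
samples = [
    ((4, 10), "HTM"), ((8, 30), "HTM"), ((16, 20), "HTM"),
    ((600, 40), "STM"), ((900, 15), "STM"), ((1200, 25), "STM"),
]
stump = train_stump(samples)
small_tx = predict(stump, (32, 12))
large_tx = predict(stump, (2000, 5))
```

In this made-up data the learned split falls on the write-set size, reflecting the general expectation that transactions exceeding hardware buffering capacity tend to abort under HTM; a real module would use a deeper tree and measured features.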
ISBN: (print) 9781467398121
A static method for balancing computational loads in parallel implementations of the finite-difference time-domain method is presented. The procedure is fairly straightforward and computationally inexpensive, thus providing an attractive alternative to optimization techniques. The method is described for partitioning in a single mesh dimension, but it is shown that it can also be adapted to 2D and 3D partitioning in an approximate way, with good results. It is applicable to both homogeneous and heterogeneous parallel architectures, and can also be used for balancing memory on distributed-memory architectures.
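One simple way to picture static balancing along a single mesh dimension is to give each processor a slab whose thickness is proportional to its relative speed. The sketch below does this with largest-remainder rounding; it is an illustrative scheme under that assumption, not necessarily the procedure the paper describes, and the function name and speed values are hypothetical.

```python
def partition_slabs(n_planes, speeds):
    """Split n_planes mesh planes into slabs proportional to processor speeds."""
    total_speed = sum(speeds)
    ideal = [n_planes * s / total_speed for s in speeds]
    counts = [int(x) for x in ideal]
    leftover = n_planes - sum(counts)
    # Give the remaining planes to the processors with the largest fractional share.
    by_remainder = sorted(
        range(len(speeds)), key=lambda i: ideal[i] - counts[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts


# Heterogeneous example: the third processor is assumed twice as fast.
heterogeneous = partition_slabs(100, [1.0, 1.0, 2.0])
# Homogeneous example where the planes do not divide evenly.
homogeneous = partition_slabs(10, [1.0, 1.0, 1.0])
```

The same proportional split could be driven by available memory instead of speed when the goal is to balance memory on distributed-memory machines.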
Despite the recent advances in computer vision and the proliferation of applications for tracking, image classification, and video analysis, very little applied work has been done to improve techniques for underwater v...
One of the main challenges for computer architects is how to hide the high average memory access latency from the processor. In this context, Hybrid Memory Cubes (HMCs) can provide substantial energy and bandwidth improvements compared to traditional memory organizations. However, it is not clear how this reduced average memory access latency will impact the last-level cache (LLC). For applications with high cache miss ratios, the latency of searching for data inside the cache negatively impacts performance, and the importance of this overhead depends on the memory access latency. In this paper, we evaluate the importance of the L3 cache in a high-performance processor using HMC, and also explore chip-area tradeoffs between cache size and the number of processor cores. We show that the high bandwidth provided by HMC memories can eliminate the need for L3 caches, removing hardware and making room for more processing power. Our evaluations show that, compared to DDR3 memories, performance increased by 37% and the EDP improved by 12% over a wide range of parallel applications while maintaining the same original chip area.
The programming of heterogeneous clusters is inherently complex, as these architectures require programmers to manage both distributed memory and computational units of very different natures. Fortunately, there has been extensive research on frameworks that raise the level of abstraction of cluster-based applications, enabling programming models that are much more convenient than the traditional one based on message passing. One such proposal is the Hierarchically Tiled Array (HTA), a data type that represents globally distributed arrays on which a wide range of data-parallel operations can be performed. In this paper, we explore for the first time the development of heterogeneous applications for clusters using HTAs. In order to use a high-level API for the heterogeneous parts of the application as well, we developed them using the Heterogeneous Programming Library (HPL), which operates on top of OpenCL but provides much better programmability. Our experiments show that this approach is a very attractive alternative, as it obtains large programmability benefits with respect to a traditional implementation based on MPI and OpenCL, while presenting average performance overheads of only around 2%.
ISBN: (print) 9781509028245
Big data decision-making techniques take advantage of large-scale data to extract important insights from them. One of the most important classes of such techniques falls in the domain of graph applications, where data segments and their inherent relationships are represented as vertices and edges. Efficiently processing large-scale graphs involves many subtle tradeoffs and is still regarded as an open-ended problem. Furthermore, as modern data centers move towards increased heterogeneity, the traditional assumption of homogeneous environments in current graph processing frameworks is no longer valid. Prior work estimates the graph processing power of heterogeneous machines by simply reading hardware configurations, which leads to suboptimal load balancing. In this paper, we propose a profiling methodology leveraging synthetic graphs for capturing a node's computational capability and guiding graph partitioning in heterogeneous environments with minimal overheads. We show that by sampling the execution of applications on synthetic graphs following a power-law distribution, the computing capabilities of heterogeneous clusters can be captured accurately (