The performance of sparse matrix-vector multiplication (SMVM) on a parallel system is strongly affected by the distribution of data among its components. The chosen data mapping method gives rise to two costs: arithmetic and communication. The communication cost often dominates the arithmetic cost, and the gap between these costs tends to increase. Therefore, finding a mapping method that reduces the communication cost is of high importance. On the other hand, load balance among the processing units must not be sacrificed. In this paper, a data mapping method is proposed for SMVM on a Network-on-Chip that achieves a balanced workload and reduces the communication cost. Afterwards, an FPGA-based architecture designed to fit the proposed data mapping method is introduced.
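The trade-off between load balance and communication cost described in this abstract can be illustrated with a minimal sketch. It is not the mapping method proposed in the paper, which the abstract does not specify. It is a generic greedy row partitioning, and the dictionary-based matrix layout, the function names, and the assumption that x[j] is stored with the owner of row j are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Set, Tuple

# Toy sparse matrix layout (assumption): {row: {col: value}}.
SparseMatrix = Dict[int, Dict[int, float]]


def partition_rows_by_nnz(matrix: SparseMatrix, n_rows: int, n_parts: int) -> List[int]:
    """Greedy mapping: heaviest rows first, each to the currently lightest part."""
    owner = [0] * n_rows
    load = [0] * n_parts
    rows = sorted(range(n_rows), key=lambda r: -len(matrix.get(r, {})))
    for r in rows:
        p = min(range(n_parts), key=lambda q: load[q])
        owner[r] = p
        load[p] += len(matrix.get(r, {}))
    return owner


def part_loads(matrix: SparseMatrix, owner: List[int], n_parts: int) -> List[int]:
    """Arithmetic cost proxy: number of nonzeros handled by each part."""
    load = [0] * n_parts
    for r, cols in matrix.items():
        load[owner[r]] += len(cols)
    return load


def communication_volume(matrix: SparseMatrix, owner: List[int]) -> int:
    """Communication cost proxy for a square matrix.

    Assumes x[j] lives on the part that owns row j, and counts the
    distinct (part, column) pairs where a part must fetch a remote x[j].
    """
    needed: Set[Tuple[int, int]] = set()
    for r, cols in matrix.items():
        for c in cols:
            if owner[c] != owner[r]:
                needed.add((owner[r], c))
    return len(needed)


# 4x4 example with one dense row and three short rows.
A: SparseMatrix = {
    0: {0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0},
    1: {1: 1.0},
    2: {2: 1.0, 3: 1.0},
    3: {3: 1.0},
}
owner = partition_rows_by_nnz(A, n_rows=4, n_parts=2)
loads = part_loads(A, owner, n_parts=2)
volume = communication_volume(A, owner)
```

In this example the nonzeros are split evenly between the two parts, yet the part holding the dense row still has to fetch three remote vector entries, which is the kind of communication cost a mapping method would try to reduce without giving up the balance.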
Automatic generation of parallel unit tests is an efficient and systematic way of identifying data races inside a program. In order to be effective, parallel unit tests have to be analysed by race detectors. However, e...
K-Means, a simple but effective clustering algorithm, is widely used in the data mining, machine learning and computer vision communities. The K-Means algorithm consists of initialization of the cluster centers followed by iteration. The initial cluster centers have a great impact on the clustering result and on algorithm efficiency. More appropriate initial centers bring K-Means closer to the optimum solution and can lead to much quicker convergence. In this paper, we propose a novel clustering algorithm, Kmms, whose name abbreviates K-Means and Mean Shift. It is a density-based algorithm. Experiments show that our algorithm not only requires less initialization time than other density-based algorithms, but also achieves better clustering quality and higher efficiency. Compared with the popular k-Means++ algorithm, our method achieves comparable accuracy, and in most cases better. Furthermore, we parallelize the Kmms algorithm with OpenMP in both the initialization and iteration steps and prove the convergence of the algorithm.
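To make concrete why the choice of initial centers matters, the following is a minimal sketch of the k-Means++ seeding step, the baseline this abstract compares against. It is not the Kmms algorithm, whose Mean Shift based initialization is not detailed here. The function names, the tuple-based point type and the fixed seed are assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points: Sequence[Point], k: int, seed: int = 0) -> List[Point]:
    """Pick k initial centers with k-Means++ style D^2 weighting.

    The first center is drawn uniformly; each later center is drawn with
    probability proportional to its squared distance to the nearest
    center chosen so far, which spreads the seeds across the data.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    rng = random.Random(seed)
    centers = [rng.choice(points)]
    # d2[i] is the squared distance from points[i] to its nearest center.
    d2 = [squared_distance(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            raise ValueError("fewer than k distinct points")
        nxt = rng.choices(points, weights=d2, k=1)[0]
        centers.append(nxt)
        d2 = [min(d, squared_distance(p, nxt)) for d, p in zip(d2, points)]
    return centers


# Two well-separated pairs of points; three seeds are requested.
data = [(0.0, 0.0), (0.1, 0.0), (10.0, 10.0), (10.1, 10.0)]
centers = kmeans_pp_init(data, k=3, seed=1)
```

Because an already chosen point has weight zero, the seeds are always distinct data points, and far-away regions are much more likely to receive a seed than points next to an existing center.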
ISBN: (print) 9783037859803
The proceedings contain 26 papers. The special focus in this conference is on Terotechnology. The topics include: the material properties of different zones of joints welded using a laser; tribological properties of diamond-like carbon coatings; quantitative analysis of the polymer/metal powder magnetic composites compacts structure; the average friction coefficient of laser textured surfaces of silicon carbide identified by RSM methodology; evaluation of the state for the material of the live steam superheater pipe coils of V degree; fatigue strength of ductile iron in ultra-high cycle regime; influence of laser surface texturing on scuffing resistance of sliding pairs; contour error of the 3-DoF hydraulic translational parallel manipulator; the impact of the power plant unit start-up scheme on the pollution load; impact of the parameters of laser-vibration treatment on the roughness of aluminium melts; research of the elastic properties of bellows made in SLS technology; the centrifugal pump with the impeller supported in sealing clearances; enhancing safety and security of networked FPGA-based embedded systems; laser texturing, spark erosion and sanding of the surfaces and their practical applications in heat exchange devices; the influence of electrospark and laser treatment upon corrosive resistance of carbon steel; laser cold ablation as a cutting edge method of forming silicon wafers used in solar cells; laser technologies in microsystems; modelling of the mechanical state of a diamond particle in the metallic matrix; application the 3D image analysis techniques for assessment the quality of material surface layer before and after laser treatment; modeling of errors counting system for PCB soldered in the wave soldering technology; the optimization of the technological process with the fuzzy regression; factorial approach to assessment of GPU computational efficiency in surrogate models and on the possibility of the estimation of the depth of a keyhole formed during laser weldings.
TOUGH2 is a general-purpose numerical simulation program for multi-dimensional, multiphase, multicomponent fluid flows, heat transfer and contaminant transport in porous and fractured media. It has been used worldwide for geothermal reservoir engineering, nuclear waste isolation, environmental assessment and remediation, and modeling flow and transport in variably saturated media. TOUGH2 is very computationally intensive, and the accuracy and scope of a simulation are limited by the amount of processing power available on a single computer. This makes it an ideal candidate for parallel computing, which makes more CPU power and memory available. Furthermore, TOUGH2's main computational unit is a linear equation solver. In parallel computing, a lot of effort has been spent on developing highly efficient parallel linear equation solvers. In this paper, we present TOUGH2-PETSc, a parallel implementation of TOUGH2 that uses PETSc to solve the linear systems in TOUGH2. PETSc is a library of high-performance linear and non-linear equation solvers that has been thoroughly tested at scale. Based on TOUGH2 and PETSc, TOUGH2-PETSc gives TOUGH2 users the potential to perform larger scale and higher resolution simulations. Experimental results demonstrate that the parallel TOUGH2-PETSc shows improved performance over the sequential version.
Due to their inherent parallel and non-deterministic nature, P system implementations require vast computing and storage resources. This significantly limits their applications, even more so when the calculation of al...
Many industries nowadays base management and decision making on artificial neural networks. However, the major drawback of neural networks lies in their time and computational complexity. The problem of computational complexity could be eliminated by sharing the computing load across multiple computing nodes. This article focuses on the architectural design of a distributed system that aims to solve large neural networks. The article describes GPGPU technology and then gives an overview of methods for speeding up the calculation and distribution of artificial neural networks. The main section describes the design of a model architecture and the algorithm that allows correct data distribution across the computational nodes.
ISBN: (print) 9783800735785
The design and construction of a 500 VA microinverter for photovoltaic applications is presented. The developed microinverter is capable of operating as a standalone AC voltage source for small loads or, alternatively, as a grid-parallel system. Possible applications are therefore small off-grid installations as well as installations where space constraints do not allow enough modules to be combined to use string inverters. The design of the microinverter is based on a two-stage concept. Two alternative topologies for the DC-DC converter stage have been considered. A prototype using a current-fed push-pull converter and a full-bridge output stage has been built and was shown to be operational. The prototype has achieved a maximum overall efficiency of 93.9 %, which includes all sensing, control and driving losses. SiC MOSFETs have been utilized to illustrate the capability of wide-band-gap devices.
In this paper, we propose new variants of unsupervised and competitive learning algorithms designed to deal with temporal sequences. These algorithms combine features of Spiking Neural Networks (SNNs) with the advantages of the hierarchical self-organizing map (HSOM). The first variant, named Hierarchical Dynamic Recurrent Spiking Self-Organizing Map (HD-RSSOM), is characterized by the integration of a temporal controller component that regulates the firing activity of the spiking neurons. The second variant is a hierarchical model that represents a multi-layer extension of the HD-RSSOM model. The case study for the proposed HSOM variants is phoneme and word recognition in continuous speech. The applied HSOM variants serve as tools for developing intelligent systems and pursuing artificial intelligence applications.
ISBN: (print) 9781479938384
Grid computing promotes resource sharing, dynamic computational resource allocation and distributed data access from disjoint application domains, and allows various service providers to meet different demands efficiently. As user demand for resources increases, planning for guaranteed Quality of Service (QoS) is a challenging task. The objective of this paper is to propose an agent-based grid framework and a QoS Time Based Scheduling (QTBS) algorithm for effective task scheduling in the grid computing paradigm. The algorithm is simulated in the GridSim toolkit, and the results show that the proposed work gives better results in makespan, resource utilization rate and load balancing level than algorithms such as QoS guided Weighted Mean Time Min (QWMTM), QoS guided MinMin, and the Max-Min and Min-Min heuristic algorithms.
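For readers unfamiliar with the baselines named in this abstract, the following is a minimal sketch of the classic Min-Min and Max-Min heuristics and of the makespan metric. It does not implement QTBS or any QoS guidance, which the abstract does not describe. The expected-time-to-compute matrix `etc`, the function name and the example values are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def list_schedule(etc: List[List[float]], pick_largest: bool = False) -> Tuple[Dict[int, int], float]:
    """Schedule independent tasks with Min-Min (default) or Max-Min.

    etc[t][m] is the expected time to compute task t on machine m.
    Each round finds, for every unscheduled task, its earliest possible
    completion time; Min-Min then commits the task with the smallest such
    time, Max-Min the task with the largest. Returns the task-to-machine
    assignment and the makespan (latest machine finish time).
    """
    n_machines = len(etc[0])
    ready = [0.0] * n_machines  # time at which each machine becomes free
    assignment: Dict[int, int] = {}
    unscheduled = set(range(len(etc)))
    while unscheduled:
        best: Dict[int, Tuple[float, int]] = {}
        for t in sorted(unscheduled):
            m = min(range(n_machines), key=lambda j: ready[j] + etc[t][j])
            best[t] = (ready[m] + etc[t][m], m)
        chooser = max if pick_largest else min
        task = chooser(best, key=lambda t: best[t][0])
        finish, machine = best[task]
        assignment[task] = machine
        ready[machine] = finish
        unscheduled.remove(task)
    return assignment, max(ready)


# Three tasks on two machines; task 2 is much longer than the others.
etc = [
    [2, 4],
    [3, 1],
    [10, 12],
]
min_min_assignment, min_min_makespan = list_schedule(etc)
max_min_assignment, max_min_makespan = list_schedule(etc, pick_largest=True)
```

In this small example Max-Min places the long task first and reaches a makespan of 10, while Min-Min reaches 12, which illustrates why the order in which tasks are committed affects both makespan and load balance.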