ISBN (print): 9780769561493
Machine learning, especially deep learning, is revolutionizing how many engineering problems are solved. Three critical ingredients are needed to apply deep machine learning to significant real-world problems: (i) large data sets; (ii) software to implement deep learning; and (iii) significant computing cycles. This paper discusses the state of each ingredient, with a specific focus on (a) how deep learning can be applied to large-scale social network analysis and (b) the computing resources required to make such analyses feasible.
ISBN (print): 9781479982004
We are quickly reaching a limit on the number of transistors that can be squeezed onto a single chip. This has led to a scramble for new nanotechnologies and the subsequent emergence of new computing architectures capable of exploiting these nano-devices. The memristor is a promising More-than-Moore device because of its unique ability to store and manipulate data on the same device. In this paper, we propose a flexible architecture of memristive crossbar networks for computing Boolean formulas. Our design closes the gap between processor and memory found in von Neumann architectures by using the crossbar both to store data and to perform Boolean computations. We demonstrate the effectiveness of our approach on practically important computations, including parallel Boolean matrix multiplication.
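For readers unfamiliar with the operation named at the end of this abstract, the following is a minimal software sketch of Boolean matrix multiplication. It is not the paper's crossbar design; the function name `boolean_matmul` is hypothetical and chosen for illustration only. Each output entry depends on one row and one column, which is why the computation lends itself to parallel evaluation.

```python
def boolean_matmul(a, b):
    """Boolean matrix product: c[i][j] = OR over k of (a[i][k] AND b[k][j]).

    `a` and `b` are lists of rows containing truthy/falsy values.
    Illustrative reference implementation only (hypothetical helper).
    """
    if not a or not b or len(a[0]) != len(b):
        raise ValueError("inner dimensions must match")
    columns = list(zip(*b))  # transpose b once so each column is a tuple
    # Every (row, column) pair is independent of the others, so the
    # entries of the result could be computed concurrently.
    return [
        [any(x and y for x, y in zip(row, col)) for col in columns]
        for row in a
    ]


a = [[True, False], [True, True]]
b = [[False, True], [True, False]]
c = boolean_matmul(a, b)
```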
Today's genomic experiments have to process the so-called "biological big data" that is now reaching the size of Terabytes and Petabytes. To process this huge amount of data, scientists may require weeks...
ISBN (print): 9780819490841
Sclera recognition using a line descriptor can achieve recognition accuracy comparable to that of iris recognition in visible wavelengths. However, this method is too time-consuming to be implemented in a real-time system. In this paper, we propose a GPU-based parallel computing approach to reduce the sclera recognition time. We define a new descriptor that adds information about the KD tree structure and the sclera edge. The registration and matching task is divided into subtasks of various sizes according to their computational complexity. Each set of affine transform parameters is generated by searching the KD tree. Texture memory, constant memory, and shared memory are used to store templates and transform matrices. The experimental results show that the proposed method, executed on a GPU, speeds up sclera matching by a factor of hundreds without loss of accuracy.
ISBN (print): 9783642143892
Real-time syntactic pattern recognition imposes strict computing time constraints on newly developed techniques. Recently, a method for analyzing hand postures of the Polish Sign Language based on ETPL(k) graph grammars (Flasinski: Patt. Recogn. 26 (1993), 1-16; Theor. Comp. Sci. 201 (1998), 189-231) has been constructed. To make the implemented system more practical for users, research into parallelizing the pattern recognition process has been carried out. Possible task distribution techniques have been tested, which has allowed us to define an optimal parallelization strategy. The results are presented in the paper.
Author: Ishikawa, M.
Univ Tokyo, Grad Sch Engn, Dept Math Engn & Informat Phys, Bunkyo Ku, Tokyo 1138656, Japan
ISBN (print): 081943759X
Optical interconnections and integrated optoelectronic devices are promising candidates for solving the lack of interconnection bandwidth between large-scale integrated circuits (LSIs). However, although two-dimensional parallel devices such as vertical-cavity surface-emitting lasers (VCSELs) and spatial light modulators (SLMs) offer massive parallelism, their capabilities are not fully utilized without an appropriate way to achieve high-speed processing. From the viewpoint of systems and algorithms, improving the physical layer with optical technologies requires reconsidering the architectural design and the algorithms so that sufficient performance improvements can be obtained. In this paper, we present several of our optoelectronic parallel computing systems, including a two-layer pipelined parallel system called OCULAR-II. The system uses VCSELs and a phase-modulation SLM to realize free-space reconfigurable optical interconnects. The algorithmic approach is discussed, including optimal load allocation for optically interconnected systems and a novel database management algorithm. The alignment of optical beams, one of the most important technological challenges, is also investigated.
ISBN (print): 9783540681052
Three-dimensional variational assimilation (3D-Var) is currently the most commonly used technique for generating an analysis that provides more consistent initial conditions for numerical weather prediction (NWP). The Global and Regional Assimilation Prediction System (GRAPES) is a new-generation NWP system in China, in which 3D-Var is one of the main components and plays an important role in the direct assimilation of non-conventional observations. In this study, the principal theory and serial implementation of GRAPES 3D-Var are first introduced, and then the details of its distributed parallel computing algorithm are discussed, including data partitioning strategies, data communication strategies, and stagger parallelization strategies. Finally, parallel experimental results on a 16-CPU cluster platform are presented; the numerical simulations show that the parallel strategies can be combined to achieve considerable load balancing and good performance.
ISBN (digital): 9783319325576
ISBN (print): 9783319325576; 9783319325569
In this paper, we study the parallelization of a Cartesian-grid-based treecode algorithm for evaluating electrostatic potentials in a charged particle system. The treecode algorithm uses a far-field Taylor expansion to compute O(N log N) particle-cluster interactions in place of the O(N^2) particle-particle interactions. The treecode algorithm is implemented with MPI-based parallelization. We design schemes that optimize the implementation adaptively according to the particle locations. The numerical results show high parallel efficiency. These optimized schemes are further extended to accelerate GMRES iteration in solving the boundary integral Poisson-Boltzmann equation, in which the discretized linear algebraic system resembles the interactions of the charged system.
ISBN (print): 9781424450756
This paper introduces a parallel implementation of the Numerical Manifold Method on multiprocessor platforms. First, the computing performance for a class of rock engineering problems is analyzed. Because solving the simultaneous equations is the most time-consuming step, we chose a parallelized Jacobi iterative method to speed up the computation. The implementation of the parallel Jacobi iterative method with OpenMP is described. A series of experiments shows that our parallel algorithm is an effective way to improve computing performance for this class of engineering problems.
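As a rough illustration of why the Jacobi method suits parallelization, here is a minimal serial sketch in Python. It is not the paper's OpenMP implementation; the function name `jacobi`, the tolerance, and the iteration cap are assumptions made for this example. Within one sweep, every component of the new iterate is computed only from the previous iterate, so the per-row updates are independent and could be distributed across threads.

```python
def jacobi(a, b, tol=1e-10, max_iter=1000):
    """Solve a x = b with the Jacobi iteration (illustrative sketch).

    `a` is a square matrix given as a list of rows and `b` a list of
    right-hand-side values. Convergence is guaranteed, for example,
    when `a` is strictly diagonally dominant.
    """
    n = len(b)
    x = [0.0] * n  # initial guess (an assumption of this sketch)
    for _ in range(max_iter):
        # Each x_new[i] reads only the old vector x, so the n updates
        # in this sweep are mutually independent.
        x_new = [
            (b[i] - sum(a[i][j] * x[j] for j in range(n) if j != i)) / a[i][i]
            for i in range(n)
        ]
        if max(abs(x_new[i] - x[i]) for i in range(n)) < tol:
            return x_new
        x = x_new
    raise RuntimeError("Jacobi iteration did not converge")


# Small strictly diagonally dominant system with exact solution (1, 2).
solution = jacobi([[4.0, 1.0], [1.0, 3.0]], [6.0, 7.0])
```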
ISBN (print): 0769515126
This paper presents investigations into the parallel computation of non-ideal 3-D detonation wave propagation on a high-performance computer based on the CC-NUMA architecture. After analyzing and testing the previous serial program, the computations of the curvature and of the first-order and second-order differences were identified as the main targets for parallelization. Several techniques were applied to convert the serial program into a parallel program, such as a "Divide and Conquer" strategy and load balancing. Numerical simulation with the parallel program yields a large increase in the computing speed for non-ideal 3-D detonation wave propagation.