ISBN (print): 078037889X
To accelerate the execution of most DSP (Digital Signal Processing) algorithms, such as FFT, FIR, and vector operations, while keeping the flexibility of the chip, a reconfigurable architecture for DSP (named ReDAr) is proposed and implemented; it will ultimately be applied to the radar system of Automatic Navigation Equipment. By analyzing these algorithms, the structure of the Reconfigurable Processing Element (RPE), the crossbar interconnect network, the memory organization, the host controlling strategy, and the data sequencing scheme of the architecture are conceived, and parts of them, including the RPE, the crossbar, and the data sequencer, are reconfigurable. After configuration, the architecture can be interconnected into a parallel and pipelined framework that closely matches the algorithms and behaves like dedicated hardware. In simulation, the performance of these algorithms mapped onto this architecture is comparable to that of algorithm-specific chips on the market and satisfies the requirements of the targeted application.
ISBN (print): 3540341412
mpF is a new parallel extension of Fortran 90. It was developed on the basis of experience gained in developing and using the mpC parallel programming language. The paper compares the programming models of mpC and mpF.
ISBN (print): 9781450377072
Text classification is a classic topic in natural language processing. In this study, we propose an attention model with multi-layer supervision for this task. In our model, the previous context vector is directly used as attention to select the required features, and multi-layer supervision is used for text classification, i.e., the prediction losses are combined across all layers in the global cost function. The main contribution of our model is that the context vector is used not only as attention but also as a representation of the input text for classification at each layer. We conducted experiments on five benchmark text classification data sets, and the results indicate that our model can improve classification performance on most of them.
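The phrase "the prediction losses are combined across all layers in the global cost function" can be illustrated with a minimal sketch. The abstract does not say how the losses are combined, so the weighted sum below, the cross-entropy loss, and the names `cross_entropy` and `multi_layer_loss` are all assumptions made for illustration, not the authors' formulation.

```python
import math


def cross_entropy(logits, label):
    """Cross-entropy of one example from raw class scores (stable log-sum-exp)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum - logits[label]


def multi_layer_loss(per_layer_logits, label, weights=None):
    """Combine the per-layer prediction losses into one global cost.

    Assumption: a weighted sum with equal weights by default; the abstract
    only states that the losses are combined.
    """
    if weights is None:
        weights = [1.0] * len(per_layer_logits)
    return sum(w * cross_entropy(logits, label)
               for w, logits in zip(weights, per_layer_logits))


# Three layers, each emitting its own class scores for the same input text.
layers = [[0.2, 0.1, -0.3], [0.9, 0.0, -0.5], [2.0, -1.0, -1.5]]
total = multi_layer_loss(layers, label=0)
```

Because every layer contributes a loss term, each layer's context vector receives a direct training signal rather than relying only on gradients from the final layer.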
Two O(log2 n) time algorithms are proposed for computing the dominators and constructing the dominator tree of a directed acyclic graph, G=(V, E), |V|=n, |E|=m. The parallel computation used is a practical SIMD hyperc...
ISBN (print): 3540219463
The paper presents a new method for efficient fine-grain computations on distributed-memory computers. RDMA (Remote Direct Memory Access) communication is applied, which provides direct access to the memories of remote processing nodes. To obtain high RDMA efficiency for fine-grain computations with very frequent transmissions of small messages, a specially designed structure of RDMA rotating buffers (RB) is introduced. It allows the available communication bandwidth to be fully exploited by providing a special communication control infrastructure that is prepared and activated in a program before the actual computations start. As an example of a fine-grain problem implemented with the RDMA rotating buffers, the execution of the discrete Fast Fourier Transform (FFT) is presented. The binary-exchange algorithm of FFT is examined, showing the efficiency of the RB method in comparison to standard MPI communication.
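To show why the binary-exchange FFT generates frequent small messages, here is a minimal single-process sketch of its communication pattern. It is an illustration under stated assumptions: each simulated rank holds exactly one point, the "exchange" is a plain list read standing in for a message, and neither RDMA rotating buffers nor MPI is modelled. The function name `binary_exchange_fft` is hypothetical.

```python
import cmath


def binary_exchange_fft(x):
    """Decimation-in-frequency FFT with one point per simulated rank.

    At each stage, rank r exchanges its value with partner r XOR half,
    where half = n/2, n/4, ..., 1. The input length must be a power of two.
    """
    n = len(x)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    values = [complex(v) for v in x]
    half = n // 2
    while half >= 1:
        # Simulated exchange: every rank receives its partner's current value.
        received = [values[rank ^ half] for rank in range(n)]
        updated = []
        for rank in range(n):
            if rank & half == 0:
                # Lower partner keeps the sum.
                updated.append(values[rank] + received[rank])
            else:
                # Upper partner keeps the twiddled difference.
                w = cmath.exp(-2j * cmath.pi * (rank % half) / (2 * half))
                updated.append((received[rank] - values[rank]) * w)
        values = updated
        half //= 2
    # Results end up in bit-reversed order; permute back to natural order.
    bits = n.bit_length() - 1
    out = [0j] * n
    for rank in range(n):
        rev = int(format(rank, f"0{bits}b")[::-1], 2) if bits else 0
        out[rev] = values[rank]
    return out


spectrum = binary_exchange_fft([1, 2, 3, 4, 0, -1, 2.5, 7])
```

With n points on n ranks there are log2(n) stages, and in each stage every rank sends and receives a single value, which is the fine-grain traffic pattern the abstract describes.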
The paper describes a parallel 3D DEM simulation of the compaction of heterogeneous material. Static domain decomposition and message-passing inter-processor communication have been implemented in the DEM code. A novel algo...
ISBN (print): 9781538663929
Raman spectrometry is a technique for detecting chemical products through a number of representative peaks found in a spectrum image or a numeric data series. The Raman spectrometer generates a CSV file or an image of a curve, which is the result of the diagnosis of the product. Analyzing the spectrum peaks makes it possible to identify the chemical origin of the product concerned. Scientists perform this operation manually, which makes it difficult and time-consuming. Graphics Processing Units (GPUs) make the processing faster and more efficient thanks to their multicore architecture. The aim of the present paper is to propose a new GPU-based approach that automates molecule detection using image-processing techniques with a parallel OpenCL implementation. We propose two parallel solutions, which are compared with each other. We apply the approach to analyze ionic liquid and biomaterial samples.
ISBN (print): 9781728101200
In the field of industrial automation, the traditional master/slave real-time data processing (SCADA) system has been unable to cope with the current massive data volumes and diversified business demands in terms of throughput, real-time behavior, and scalability. This paper presents a decentralized-management SCADA cluster solution based on a decentralized real-time data space; object partitioning, distributed real-time data processing, dynamic data migration, and decentralized real-time data management are discussed. Finally, the scheme is applied to develop a system and verify it.
In this paper, aimed at understanding the effects of resource utilization on the performance of multi-way joins in shared-nothing database systems, we introduce a performance model for the left and right-deep linea...
In this paper, two domain decomposition formulations are presented in conjunction with the preconditioned conjugate gradient (PCG) method for the solution of large-scale problems in solid and structural mechanics. In the first approach, the PCG method is applied to the global coefficient matrix, while in the second approach it is applied to the interface problem after eliminating the internal degrees of freedom. For both implementations, a subdomain-by-subdomain (SBS) polynomial preconditioner is employed, based on the local information of each subdomain. The approximate inverse of the global coefficient matrix or of the Schur complement matrix, which acts as the preconditioner, is expressed by a truncated Neumann series, resulting in an additive-type local preconditioner. Block-type preconditioning, where full elimination is performed inside each block, is also studied and compared with the proposed polynomial preconditioning. Copyright (C) Civil-Comp Limited and Elsevier Science Limited.
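The idea of using a truncated Neumann series as the approximate inverse inside PCG can be sketched in a few lines. This is a simplified illustration, not the paper's subdomain-by-subdomain scheme: it assumes a single global matrix, a diagonal (Jacobi) splitting A = D(I - G) with G = I - D^-1 A, and the approximation A^-1 ≈ (I + G + ... + G^m) D^-1. The function names and the test matrix are hypothetical.

```python
def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def neumann_preconditioner(A, terms):
    """Return r -> z with z = sum_{k=0}^{terms} (I - D^-1 A)^k D^-1 r."""
    d = [A[i][i] for i in range(len(A))]

    def apply(r):
        term = [ri / di for ri, di in zip(r, d)]  # k = 0 term: D^-1 r
        z = term[:]
        for _ in range(terms):
            a_term = matvec(A, term)
            # Next series term: (I - D^-1 A) applied to the previous term.
            term = [t - at / di for t, at, di in zip(term, a_term, d)]
            z = [zi + ti for zi, ti in zip(z, term)]
        return z

    return apply


def pcg(A, b, precond, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradient for a symmetric positive definite A."""
    x = [0.0] * len(b)
    r = b[:]
    z = precond(r)
    p = z[:]
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5
    for iteration in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, iteration
        z = precond(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Hypothetical test problem: 1D Laplacian tridiag(-1, 2, -1), right-hand side of ones.
n = 20
A = [[2.0 if i == j else -1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
     for i in range(n)]
b = [1.0] * n
x, iterations = pcg(A, b, neumann_preconditioner(A, terms=2))
```

Applying the preconditioner needs only matrix-vector products and a diagonal scaling, which is what makes polynomial preconditioners attractive when each subdomain works from local information.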