Plane Lucid is an extension of Lucid, a language based on intensional logic. It allows the values of expressions in a program to vary in space as well as in time, and it provides spatial and temporal operators that combine values from different contexts (that is, different points in space and time). As an application of Plane Lucid, an intensional 3-D spreadsheet has been designed in which Plane Lucid serves as the definition language. The spreadsheet is treated as a single entity (called the spreadsheet variable) that varies in the spatial and temporal dimensions; the values of its cells are the values of the spreadsheet variable at different spatial and temporal points.
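The idea of a single spreadsheet variable whose value depends on a point in space and time can be sketched in Python. This is only an illustration under stated assumptions: an expression is modelled as a function from a context (x, y, t) to a value, `fby` follows the usual Lucid "followed by" operator, and the names `Context`, `next_t`, `left`, `coord_sum`, and `sheet` are hypothetical and are not taken from Plane Lucid itself.

```python
from typing import Callable, NamedTuple


class Context(NamedTuple):
    """A point in two spatial dimensions and one temporal dimension."""
    x: int
    y: int
    t: int


# An expression denotes a function from contexts to values.
Intension = Callable[[Context], int]


def const(value: int) -> Intension:
    """An expression with the same value in every context."""
    return lambda c: value


def plus(a: Intension, b: Intension) -> Intension:
    """Pointwise addition: combine the two values found in the same context."""
    return lambda c: a(c) + b(c)


def fby(a: Intension, b: Intension) -> Intension:
    """Lucid-style 'followed by': a at time 0, then b shifted one step later."""
    return lambda c: a(c._replace(t=0)) if c.t == 0 else b(c._replace(t=c.t - 1))


def next_t(e: Intension) -> Intension:
    """Temporal operator: the value of e one time step ahead."""
    return lambda c: e(c._replace(t=c.t + 1))


def left(e: Intension) -> Intension:
    """Hypothetical spatial operator: the value of e in the cell at x - 1."""
    return lambda c: e(c._replace(x=c.x - 1))


def coord_sum(c: Context) -> int:
    """Initial cell contents: each cell holds x + y."""
    return c.x + c.y


def sheet(c: Context) -> int:
    """The spreadsheet variable: starts as x + y and grows by 1 per time step."""
    return fby(coord_sum, plus(sheet, const(1)))(c)


if __name__ == "__main__":
    print(sheet(Context(x=2, y=3, t=0)))        # cell (2, 3) at time 0
    print(sheet(Context(x=2, y=3, t=4)))        # same cell four steps later
    print(left(sheet)(Context(x=2, y=3, t=4)))  # its left neighbour at time 4
```

Reading a cell is then just evaluating `sheet` at a particular context, which mirrors the abstract's description of cell values as values of the spreadsheet variable at different points.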
Hierarchical signal flow graphs (HSFGs) are used to illustrate the computations and the dataflow required for the block regularised parameter estimation algorithm. This algorithm protects the parameter estimation from numerical difficulties that arise when the data are insufficiently exciting or when the behaviour of the underlying model is unknown. HSFGs aid the user's understanding of the algorithm because they clearly show how it differs from exponentially weighted recursive least squares, and they also allow the user to develop fast, efficient parallel algorithms easily, as demonstrated.
This paper introduces a novel sequencer for controlling computational machines and for structured DMA (direct memory access) applications. It focuses mainly on applications using a 2-dimensional memory organization, where most of the inherent speed-up is obtained. A classification scheme of computational sequencing patterns and storage schemes is derived. In the context of application-specific computing, the paper illustrates the sequencer's usefulness, especially for data sequencing, recalling previously published examples as far as needed for completeness. The paper also discusses how the new sequencer hardware provides substantial speed-up compared with traditional sequencing hardware.
Modern high energy physics experiments require massively parallel special-purpose computers (triggers) to reduce the extremely large primary dataflow to manageable amounts. We present a prototype processing ASIC intended as the basic computational unit in a first-level calorimeter trigger for the ATLAS collider detector to be built at CERN, Switzerland. The proposed trigger is a compact, highly parallel pipelined system with 4096 systolic processors partitioned into 256 weakly interacting custom-designed ASICs. Local results from these ASICs are then merged by a second, less complex type of ASIC. Data is received at 800 Mbit/s by bipolar input circuits, while the processing is performed in CMOS at 320 MHz, using the true single phase clocking (TSPC) scheme. This method promotes fast and compact implementations well suited for pipelined bit-serial applications. A 0.5 µm BiCMOS process with 4 metal layers was chosen for the implementation.
Pattern Matching (PM) over network packet flows for Network Intrusion Detection/Prevention Systems is becoming increasingly performance sensitive because of the rapid growth of Internet applications in terms of data volume. Meanwhile, modern multicore platforms are becoming performance competitive with traditional hardware solutions for PM. However, because network flow sizes are unbalanced, the traditional flow-based data-parallel processing/programming model cannot fully exploit the computing power of multicore platforms and results in poor performance scalability. In this paper, a novel parallel inspection model, Dynamic Differentiated Distributed Detection (D4), is proposed. D4 deploys distributed parallel operations by adding one more dimension to workload partitioning and allocation. It proposes an effective and efficient scheme that pre-partitions the pattern set in several candidate ways, called "Detection Modes", and lets multiple candidate PM methods handle the respective subsets; the most suitable Detection Mode is selected for each incoming flow at run time, and the workload is dynamically allocated among multiple CPU cores. Experimental results on a real-world pattern set and traffic traces show that D4 scales much better than traditional schemes by better balancing the load among the processors while avoiding unnecessary overheads.
Authors: R. Ammar, B. Qin
U-155, Computer Science & Engineering Department, University of Connecticut, Storrs, CT, USA
A flow-analysis technique for real-time parallel computations at the computation level is presented. The technique is based on reducing the given parallel computation to a sequential one and then applying one of the available flow-analysis techniques for sequential computations. An example is given.
The authors present a new template-matching algorithm with good recognition performance. However, this new algorithm exhibits a complex, four-dimensional wavefront architecture. For VLSI implementation, therefore, reduced architectures with fewer connections and processors need to be derived. For this purpose, the authors develop a systematic reduction methodology for manually mapping wavefront computations from a high dimension to a low dimension. The methodology consists of seven steps. Based on it, the authors derive several two-dimensional architectures for the new template-matching algorithm that are suitable for VLSI implementation, and they have simulated one of these architectures on the Intel Hypercube Machine iPSC/2.
The use of search algorithms for test data generation has seen many successful results. For structural criteria such as branch coverage, heuristics have been designed to help the search. The most common heuristic is the approach level (usually represented as an integer), which rewards test cases whose executions get close (in the control flow graph) to the target branch. To solve the constraints of the predicates in the control flow graph, the branch distance is commonly employed. These two measures are linearly combined. Because the approach level is more important, the branch distance is normalised, often in the range [0, 1]. In this paper, we analyse different types of normalising functions. We found that the one usually employed in the literature has several flaws. We hence propose a different normalising function that is very simple and does not suffer from these limitations. We carried out empirical and analytical analyses to compare these two functions. In particular, we studied their effect on two commonly used search algorithms, namely Simulated Annealing and Genetic Algorithms.
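The linear combination of approach level and normalised branch distance can be sketched as follows. This is a minimal illustration, not the paper's implementation: the abstract does not name the normalising functions it compares, so the two shown here (an exponential form and a ratio form) are assumptions chosen only because each maps a non-negative distance into a bounded range below 1, and the names `normalise_exponential`, `normalise_ratio`, and `fitness` are hypothetical.

```python
from typing import Callable


def normalise_exponential(distance: float, alpha: float = 1.001) -> float:
    """Assumed normalising function: 1 - alpha**(-distance), with alpha > 1."""
    return 1.0 - alpha ** (-distance)


def normalise_ratio(distance: float, beta: float = 1.0) -> float:
    """Assumed normalising function: distance / (distance + beta), with beta > 0."""
    return distance / (distance + beta)


def fitness(
    approach_level: int,
    branch_distance: float,
    normalise: Callable[[float], float] = normalise_ratio,
) -> float:
    """Fitness to minimise: approach level plus normalised branch distance.

    Because the normalised distance stays below 1, a test case with a lower
    approach level ranks better than one with a higher approach level,
    whatever their branch distances are.
    """
    if approach_level < 0 or branch_distance < 0:
        raise ValueError("approach level and branch distance must be non-negative")
    return approach_level + normalise(branch_distance)


if __name__ == "__main__":
    # Two candidates at the same approach level: the smaller distance wins.
    print(fitness(1, 5.0) < fitness(1, 50.0))
    # A candidate one level closer wins even with a much larger distance.
    print(fitness(1, 1000.0) < fitness(2, 0.0))
```

Passing the normalising function as a parameter keeps the comparison between candidate functions to a one-argument change, which is the kind of experiment the abstract describes.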
Armstrong III is a 20-node multicomputer that is currently operational. In addition to a RISC processor, each node contains reconfigurable resources implemented with FPGAs. The in-circuit reprogrammability of static-RAM-based FPGAs allows the computational capabilities of a node to be dynamically matched to the computational requirements of an application. Most reconfigurable computers in existence today rely solely on a large number of FPGAs to perform computations. In contrast, the paper demonstrates the utility of a small number of FPGAs coupled to a RISC processor with a simple interconnect. It describes a substantive example application that performs HMM training for speech recognition on the reconfigurable platform.