Two methods that have been studied for solving large, sparse sets of algebraic equations, the multiple factoring method and the W-matrix method, are shown to be two independent ways of explaining equivalent computational procedures. The forward and backward substitution parts of these methods are investigated using parallel processing techniques on commercially available computers. Results are presented from testing the proposed methods on two local-memory machines, the Intel iPSC/1 and iPSC/860 hypercubes, and a shared-memory machine, the Sequent Symmetry S81. On the iPSC/1, which is characterized by a slow communication rate and high communication overhead for short messages, the best speedup obtained is less than 2.5, and that with only 8 of the 16 available processors in use. The iPSC/860, a more advanced model of the iPSC family, performs even worse as far as these parallel methods are concerned. Much better results were obtained on the Sequent Symmetry, where a speedup of 7.48 was achieved with 16 processors.
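The parallelism available in sparse forward and backward substitution is commonly exposed by level scheduling: unknowns with no dependency on one another are grouped into levels that can be solved concurrently. The following sketch (a generic illustration, not the paper's multiple factoring or W-matrix procedure) shows level scheduling for forward substitution on a dense-stored lower-triangular matrix:

```python
# Sketch of level-scheduled sparse forward substitution. Unknowns in the
# same level depend only on unknowns in earlier levels, so each level's
# solves are mutually independent and could run on separate processors.

def levels(cols):
    # cols[i] = list of j < i with L[i][j] != 0.
    lev = {}
    for i in range(len(cols)):
        lev[i] = 1 + max((lev[j] for j in cols[i]), default=-1)
    grouped = {}
    for i, l in lev.items():
        grouped.setdefault(l, []).append(i)
    return [grouped[l] for l in sorted(grouped)]

def forward_sub(L, b):
    # Solve L x = b level by level; within a level, each x[i] is
    # independent of the others (the sequential inner loop here stands
    # in for a parallel dispatch).
    n = len(b)
    cols = [[j for j in range(i) if L[i][j] != 0] for i in range(n)]
    x = [0.0] * n
    for level in levels(cols):
        for i in level:  # candidates for concurrent execution
            x[i] = (b[i] - sum(L[i][j] * x[j] for j in cols[i])) / L[i][i]
    return x
```

Backward substitution is symmetric, with dependencies running from high indices to low.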
Demands for applications requiring massive parallelism in symbolic environments have given rebirth to research in models labeled as neural networks. These models are made up of many simple nodes that are highly interconnected, such that computation takes place as data flows amongst the nodes of the network. To date, most models have proposed nodes based on simple analog functions, in which inputs are multiplied by weights and summed, the total then optionally being transformed by an arbitrary function at the node. Learning in these systems is accomplished by adjusting the weights on the input lines. This paper discusses the use of digital (boolean) nodes as a primitive building block in connectionist systems. Digital nodes naturally engender new paradigms and mechanisms for learning and processing in connectionist networks. The digital nodes are used as the basic building block of a class of models called ASOCS (Adaptive Self-Organizing Concurrent Systems). These models combine massive parallelism with the ability to adapt in a self-organizing fashion. Basic features of standard neural network learning algorithms and those proposed using digital nodes are compared and contrasted; the latter mechanisms can lead to vastly improved efficiency for many applications.
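The contrast between the two node types can be made concrete with a toy sketch (these definitions are illustrative only, not the actual ASOCS node model): an analog node learns by nudging continuous weights, while a digital node's adaptation is a discrete edit to the boolean function it computes.

```python
# Toy contrast between analog and digital (boolean) connectionist nodes.

def analog_node(weights, bias, inputs):
    # Classic weighted-sum unit: multiply inputs by weights, sum, and
    # apply a hard threshold. Learning would adjust `weights`.
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if s > 0 else 0

class BooleanNode:
    # A digital node stores an explicit boolean function over its input
    # patterns; adaptation edits the function rather than any weight.
    def __init__(self):
        self.table = {}
    def __call__(self, inputs):
        return self.table.get(tuple(inputs), 0)
    def learn(self, inputs, target):
        # A single discrete update fixes this input pattern exactly.
        self.table[tuple(inputs)] = target

node = BooleanNode()
node.learn((0, 1), 1)   # teach responses pattern by pattern
node.learn((1, 0), 1)
```

The one-shot, exact nature of the table edit is one source of the efficiency argument made for digital nodes.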
Modelling a protocol is difficult because it involves describing a two-dimensional relationship between the flow of control of many processes and the synchronized flow of data between those processes. This paper presents the use of a new technique, Deductive Systems, for the modelling of communication protocols. The strengths of such an approach for protocol modelling are the ease with which models can be modified, the rigorous analysis it enables of the constructed models, and the incremental way in which modelling and verification of a system can be performed. Starting from the fundamental definitions of Deductive Systems, it is shown how an extended Alternating Bit (AB) protocol can be modelled, with real-life conditions taken into consideration.
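For readers unfamiliar with the case study, the Alternating Bit protocol can be summarised operationally: the sender tags each frame with a one-bit sequence number and retransmits until the matching acknowledgement arrives, while the receiver uses the bit to discard duplicates. A minimal simulation (an illustration of the protocol itself, not of the paper's deductive-system model; the loss rate and seed are arbitrary) is:

```python
# Simulation of the Alternating Bit protocol over a lossy channel.
import random

def ab_transfer(messages, loss_rate=0.3, seed=1):
    rng = random.Random(seed)
    send_bit, expect_bit = 0, 0
    delivered = []
    for msg in messages:
        acked = False
        while not acked:
            if rng.random() < loss_rate:   # frame lost: sender times out
                continue
            if send_bit == expect_bit:     # new frame: deliver, flip bit
                delivered.append(msg)
                expect_bit ^= 1
            # A duplicate (bit mismatch) is discarded but re-acknowledged.
            if rng.random() < loss_rate:   # ack lost: sender retransmits
                continue
            acked = True
        send_bit ^= 1                      # alternate for the next message
    return delivered
```

Despite arbitrary frame and acknowledgement losses, every message is delivered exactly once and in order, which is the safety property a formal model of the protocol must establish.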
The Reverse Engineering group at EDS Research has developed software tools to mechanically assist in reengineering transaction processing applications. The authors apply the software tools to assist in converting a very large minicomputer application written in COBOL to run under CICS on an IBM mainframe. The two platforms provide very different user interfaces and computational environments: the user interacts with the minicomputer one field at a time, but interacts with CICS a full screen at a time. This and other major differences demand that any successful mechanical conversion strategy employ sophisticated feature extraction and restructuring techniques. The authors describe the problem of recovering the user interface specification and using the recovered specification to create the appropriate user interface in the target environment. Dataflow analysis and other formal analysis techniques appear to be too weak to guide the conversion; a priori programming knowledge must be encoded and applied to obtain a successful conversion.
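One small ingredient of recovering a user-interface specification is extracting the order in which a field-at-a-time program touches its screen fields, so they can later be regrouped into full-screen maps. The sketch below is a purely hypothetical illustration (not EDS's actual tools; the statement names and field names are invented) using a simple pattern match over COBOL-like source:

```python
# Hypothetical extraction of field interaction order from COBOL-like
# ACCEPT/DISPLAY statements, as a first step toward a screen map.
import re

def recover_field_order(cobol_lines):
    # Collect the verb and operand of each ACCEPT/DISPLAY in program order.
    pattern = re.compile(r"\b(ACCEPT|DISPLAY)\s+([A-Z0-9-]+)", re.IGNORECASE)
    spec = []
    for line in cobol_lines:
        m = pattern.search(line)
        if m:
            spec.append((m.group(1).upper(), m.group(2)))
    return spec

src = [
    "    DISPLAY CUST-NAME-PROMPT",
    "    ACCEPT CUST-NAME",
    "    DISPLAY CUST-ADDR-PROMPT",
    "    ACCEPT CUST-ADDR",
]
```

Real conversions need far more than this surface scan, which is precisely the abstract's point: purely syntactic or dataflow-based extraction is too weak without encoded programming knowledge.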
In this paper we study the capabilities required of memories to support the synthesis of designs using structured architectures. We explore the advantages of using multi-port memories with two write ports as an architectural component, over conventional memories with a single write port, in such a synthesis environment. A study of the memory resources available in some current Field Programmable Gate Arrays (FPGAs) is made. We then propose a multi-port memory structure that could be suitable for use in programmable structures such as FPGAs, to facilitate implementations of designs through high-level synthesis (HLS). The principal advantages of the proposed memory structure are its flexibility, its simplicity, and its ability to support more efficient execution of operations than existing memory structures.
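A behavioural model makes the benefit of a second write port concrete: two results can be stored in one cycle, with a conflict arising only when both writes target the same address. The class below is a generic sketch (not the paper's proposed structure or any specific FPGA primitive; the read-before-write ordering is an assumption):

```python
# Behavioural sketch of a memory with two write ports and one read port.
class DualWritePortMemory:
    def __init__(self, size):
        self.cells = [0] * size

    def cycle(self, write_a=None, write_b=None, read_addr=None):
        # write_a / write_b are (address, data) pairs or None.
        # Two same-cycle writes to one address are undefined in hardware;
        # here we surface that as an error.
        if write_a and write_b and write_a[0] == write_b[0]:
            raise ValueError("same-cycle write conflict at address %d"
                             % write_a[0])
        # Reads see the value from before this cycle's writes.
        out = self.cells[read_addr] if read_addr is not None else None
        for port in (write_a, write_b):
            if port:
                addr, data = port
                self.cells[addr] = data
        return out
```

With a single write port, the two stores would need two cycles, which is the scheduling pressure the proposed memory structure aims to relieve.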
In this paper a mechanism for adaptation of parallel computation is defined for dataflow computations in dynamic and heterogeneous environments. Our mechanism is especially useful in massively parallel multi-threaded computations as found in cluster or grid computing. By basing the state of executions on a dataflow graph, this approach shows extreme flexibility with respect to application-induced adaptation of parallel computation. This adaptation reflects the need to change runtime behavior in response to parameters observable over time. Specifically, it allows on-line adaptation of parallel execution in dynamic heterogeneous systems. We have implemented this mechanism in KAAPI (Kernel for Adaptative and Asynchronous Parallel Interface), and experimental results show that the induced overhead is small.
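The key idea, that the dataflow graph itself is the execution state, can be sketched in a few lines (a generic illustration, not the KAAPI API): at any point the state is just the set of computed values, so any "ready" node (all predecessors computed) can be handed to whichever resource is currently available.

```python
# Minimal dataflow-graph executor: state = which nodes have values.
def run_dataflow(deps, compute, values=None):
    # deps: node -> list of predecessor nodes.
    # compute: node -> function taking the list of predecessor values.
    values = dict(values or {})
    pending = [n for n in deps if n not in values]
    while pending:
        # Every node in `ready` is independent; a scheduler could place
        # each on a different (possibly heterogeneous) worker.
        ready = [n for n in pending if all(p in values for p in deps[n])]
        for n in ready:
            values[n] = compute[n]([values[p] for p in deps[n]])
        pending = [n for n in pending if n not in values]
    return values
```

Because `values` fully captures progress, adaptation (migrating, checkpointing, or re-partitioning work) amounts to moving pieces of this dictionary and graph between resources.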
We report on the computation of 3D volumetric optical flow on gated MRI datasets. We extend the 2D least squares and regularization approaches of Lucas and Kanade [4] and Horn and Schunck [3] and show flow fields (as XY and XZ 2D flows) for a beating heart. The flow captures not only the expansion and contraction of various parts of the heart but also its twisting motion.
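The least-squares extension follows the standard Lucas-Kanade construction, lifted to 3D: the brightness constancy constraint Ix·u + Iy·v + Iz·w + It = 0 over a local neighbourhood yields 3x3 normal equations for the flow (u, v, w). A self-contained sketch (a textbook-style illustration under that standard formulation, not the paper's exact implementation) is:

```python
# 3D Lucas-Kanade sketch: solve (A^T A) v = -A^T b for v = (u, v, w)
# from spatiotemporal gradients sampled in one volumetric neighbourhood.

def solve3(M, rhs):
    # Gaussian elimination with partial pivoting for a 3x3 system.
    n = 3
    A = [row[:] + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= f * A[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (A[r][n] - sum(A[r][c] * x[c] for c in range(r + 1, n))) / A[r][r]
    return x

def lucas_kanade_3d(gradients):
    # gradients: list of (Ix, Iy, Iz, It) samples from the neighbourhood.
    AtA = [[0.0] * 3 for _ in range(3)]
    Atb = [0.0] * 3
    for ix, iy, iz, it in gradients:
        g = (ix, iy, iz)
        for a in range(3):
            for b in range(3):
                AtA[a][b] += g[a] * g[b]
            Atb[a] -= g[a] * it
    return solve3(AtA, Atb)  # estimated (u, v, w)
```

The Horn-Schunck extension instead couples neighbouring flow estimates through a smoothness term and solves the resulting system iteratively.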
Plane Lucid is an extension of the language Lucid, a language based on intensional logic. The language allows values of expressions in a program to vary in space as well as in time; it provides spatial and temporal operators to combine values from different contexts (different points in space and time). As an application of Plane Lucid, an intensional 3-D spreadsheet has been designed in which Plane Lucid is the definition language of the spreadsheet. The spreadsheet is considered as a single entity (called the spreadsheet variable) which varies in spatial and temporal dimensions; values of cells in the spreadsheet are values of the spreadsheet variable at different spatial and temporal points.
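The intensional view can be rendered as a toy sketch (this is an illustrative model, not the actual Plane Lucid semantics): a spreadsheet variable is one function of a spatio-temporal context (x, y, t), and the classic Lucid temporal operator `fby` ("followed by") builds a stream from an initial value and a rule for later instants.

```python
# Toy intensional evaluation: values are functions of a context (x, y, t).

def fby(init, rest):
    # Lucid's "followed by": at t = 0 take `init`; at later instants take
    # `rest` evaluated one time step earlier.
    return lambda x, y, t: init(x, y, 0) if t == 0 else rest(x, y, t - 1)

def xs(x, y, t):
    return t + 1          # the stream 1, 2, 3, ... at every cell

def running_total(x, y, t):
    # Lucid-style definition: total = xs fby (total + next xs).
    step = lambda x2, y2, t2: running_total(x2, y2, t2) + xs(x2, y2, t2 + 1)
    return fby(xs, step)(x, y, t)
```

A cell's displayed value is just the spreadsheet variable sampled at that cell's spatial point and the current time; spatial operators (shifting x or y in the context) compose the same way `fby` shifts t.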
This paper introduces a powerful novel sequencer for controlling computational machines and for structured DMA (direct memory access) applications. It is mainly focused on applications using a 2-dimensional memory organization, from which most of the inherent speed-up is obtained. A classification scheme of computational sequencing patterns and storage schemes is derived. In the context of application-specific computing, the paper illustrates the sequencer's usefulness especially for data sequencing, recalling previously published examples as far as needed for completeness. The paper also discusses how the new sequencer hardware provides substantial speed-up compared to the use of traditional sequencing hardware.
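The notion of a sequencing pattern over a 2-dimensional memory organization can be illustrated with two common scan patterns (a generic sketch, not the paper's sequencer hardware or its classification scheme): the same (row, column) address space walked in different orders.

```python
# Generators producing address sequences over a 2-D memory organization.

def video_scan(rows, cols):
    # Raster order: every row left to right.
    for r in range(rows):
        for c in range(cols):
            yield (r, c)

def zigzag_scan(rows, cols):
    # Boustrophedon order: even rows left-to-right, odd rows right-to-left,
    # so consecutive accesses stay spatially adjacent.
    for r in range(rows):
        rng = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        for c in rng:
            yield (r, c)
```

A hardware sequencer generates such address streams autonomously, which is what frees the datapath or DMA engine from computing addresses itself.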
Hierarchical Signal Flow Graphs (HSFGs) are used to illustrate the computations and the dataflow required for the block regularised parameter estimation algorithm. This algorithm protects the parameter estimation from numerical difficulties associated with insufficiently exciting data or cases where the behaviour of the underlying model is unknown. HSFGs aid the user's understanding of the algorithm, as they clearly show how it differs from exponentially weighted recursive least squares, and also allow the user to develop fast, efficient parallel algorithms easily and effectively, as demonstrated.
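For reference, the baseline being compared against, exponentially weighted recursive least squares, can be sketched in its scalar form (a textbook illustration with arbitrary initial values, not the paper's block regularised algorithm): when the input u is persistently exciting this recursion is well behaved, and the regularised variant exists precisely to protect the update when it is not.

```python
# Scalar exponentially weighted RLS for the model y = theta * u + noise.
def ewrls(data, lam=0.99, theta0=0.0, p0=1000.0):
    # lam: forgetting factor; p0: large initial "covariance" (weak prior).
    theta, p = theta0, p0
    for u, y in data:
        k = p * u / (lam + u * u * p)   # gain
        theta += k * (y - u * theta)    # correct estimate by the residual
        p = (p - k * u * p) / lam       # covariance update with forgetting
    return theta
```

When u stays near zero for long stretches, the division by `lam` inflates `p` without bound, which is the numerical difficulty the regularised algorithm guards against.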