A qualitative description and a mathematical definition of the data-flow Petri-net model are presented. The data-flow Petri nets are used for the data-flow multiprocessor operation. They can advantageously replace flow graphs for validation, simulation, and scheduling. In the functional assembler language used, the source program describes the function to be processed and does not specify all the steps as do classical assemblers. An application for the control of a telemanipulator robot with image processing is given.
Packet scheduling is a critical component of router data paths because it allows routers to divide bandwidth intelligently between competing flows. A large number of scheduling algorithms annotate packets with time stamps and subsequently sort these packets according to their annotated time stamp values. For these algorithms the problems of efficiently tagging and sorting packets are still open to investigation. In this paper we propose a data structure and algorithm that reduce the latency of determining the packet with the smallest time stamp to a single memory access time and a small constant number of computation steps, independent of the number of flows serviced by the scheduler. The complexity of inserting a packet into our sorting data structure is logarithmic as a function of the ratio between the maximum packet size and minimum connection weight. The latency of inserting a packet can be hidden by performing insertions of several packets in parallel. This is the fastest sorting data structure known to us. One of the most efficient alternative implementation techniques, proposed by Chao et al. [20], is associated with logarithmic complexity of the scheduling decision time, as a function of the maximum value of packet time stamps. Our solution applies to many different packet fair queuing algorithms, including Weighted Fair Queuing [28], Self-Clocked Fair Queuing (SCFQ) [28], and Start Time Fair Queuing [4].
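To make the time-stamp approach concrete, here is a minimal sketch of SCFQ-style tagging and sorting. It is not the data structure proposed in the paper: it uses Python's standard binary heap (`heapq`), so extracting the smallest time stamp costs O(log n) in the number of queued packets rather than a single memory access. The class name, method names, and the example flows and weights are illustrative assumptions.

```python
import heapq
from itertools import count


class TimestampScheduler:
    """Toy SCFQ-style scheduler (illustrative, not the paper's structure)."""

    def __init__(self, weights):
        self.weights = dict(weights)                    # flow -> weight
        self.last_finish = {f: 0.0 for f in self.weights}
        self.virtual_time = 0.0                         # tag of packet in service
        self._heap = []                                 # (finish_tag, seq, flow, size)
        self._seq = count()                             # tie-breaker for equal tags

    def enqueue(self, flow, size):
        # SCFQ tag: start from the later of the current virtual time and the
        # flow's previous finish tag, then add size / weight.
        start = max(self.virtual_time, self.last_finish[flow])
        finish = start + size / self.weights[flow]
        self.last_finish[flow] = finish
        heapq.heappush(self._heap, (finish, next(self._seq), flow, size))
        return finish

    def dequeue(self):
        # Serve the packet with the smallest finish tag; its tag becomes
        # the new virtual time.
        finish, _, flow, size = heapq.heappop(self._heap)
        self.virtual_time = finish
        return flow, size, finish


sched = TimestampScheduler({"a": 1.0, "b": 2.0})
sched.enqueue("a", 100)
sched.enqueue("b", 100)
sched.enqueue("a", 100)
order = [sched.dequeue()[0] for _ in range(3)]
```

In this example flow `b` has twice the weight of flow `a`, so its packet receives the smaller tag and is served first even though it arrived second.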
A dynamic dataflow system is a group of interconnected dataflow-driven processor elements that solve one compound task. Dataflow systems are driven by the flow of data and effectively exploit the natural parallelism that arises when processing large amounts of data, for example in image and sound processing. The speed of a dynamic dataflow system depends on the interconnection network used, on how well the parallel architecture suits the task, and on the control of the operand matching process. This article deals with optimizing the control of the operand matching process. It describes the design of the part of the coordinating processor that controls operand matching.
A powerful and useful data-flow Visual Programming Language (DFVPL) must provide the programming constructs needed to deal with complex problems. The main purpose of this paper is to contribute to the debate on DFVPL constructs by presenting the solutions we devised for the VIPERS language.
The authors investigate and compare two ways of specifying stream relations (in particular, stream functions). The first uses relational programs, i.e., netlike program schemes in which the signature primitives are interpreted as relations over a given CPO. No stream domains are assumed; semantics is in fixed-point style. The second is through dataflow nets, i.e., nets whose nodes are interpreted as processes (computational stations). The authors prove the existence of an adequate dataflow interpreter for relational programs over all relations and its uniqueness. When dealing with functions, the interpreter is modular and obeys the Kahn principle. The authors identify two kinds of anomalies. The first (meagerness anomaly) is caused by the defect of the used processes (computational stations) and holds in fact for arbitrary input-output behaviors. The second (ambiguity anomaly) is rooted in the semantics of relational nets over arbitrary CPO. It is unavoidable in any extension beyond functional behaviors.
This paper is concerned with the compile-time (static) scheduling of dataflow graphs (DFGs) onto multiprocessor systems. It mainly concentrates on producing a rate-optimal time schedule that achieves the minimum iteration period, known as the iteration period bound. A combinatorial theory is developed to produce a rate-optimal time schedule for a fully specified DFG. The DFG is first converted to a critical graph by making all its circuits critical. Next, it is transformed into an acyclic graph through a sequence of circuit contractions. An algorithm is then proposed which achieves the time scheduling of the given DFG by first scheduling the acyclic graph, followed by a scheduling of the critical circuits in an order which is reverse to that of their contraction.
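For readers unfamiliar with the iteration period bound, it is commonly defined as the maximum, over all circuits of the DFG, of the total computation time on the circuit divided by the number of delays on it. The sketch below computes that quantity by brute-force enumeration of simple circuits. It is not the combinatorial method developed in the paper, and it is only practical for small graphs. The function name, the edge format `(src, dst, delays)`, and the example graph are assumptions made for illustration; computation times are taken to be integers.

```python
from fractions import Fraction


def iteration_bound(times, edges):
    """Return max over circuits of (sum of node times) / (sum of delays).

    times: dict mapping node name -> integer computation time
    edges: list of (src, dst, delays) with integer delay counts
    """
    adj = {}
    for src, dst, delays in edges:
        adj.setdefault(src, []).append((dst, delays))

    best = Fraction(0)

    def dfs(start, node, t_sum, d_sum, visited):
        nonlocal best
        for nxt, d in adj.get(node, []):
            if nxt == start:
                total_d = d_sum + d
                if total_d == 0:
                    raise ValueError("delay-free circuit: no valid schedule")
                best = max(best, Fraction(t_sum, total_d))
            elif nxt not in visited and nxt > start:
                # Only visit nodes larger than `start`, so each circuit is
                # enumerated once, from its smallest node.
                dfs(start, nxt, t_sum + times[nxt], d_sum + d, visited | {nxt})

    for s in sorted(times):
        dfs(s, s, times[s], 0, {s})
    return best


# Hypothetical three-node DFG with two circuits: A->B->A and A->B->C->A.
times = {"A": 2, "B": 4, "C": 3}
edges = [("A", "B", 0), ("B", "A", 2), ("B", "C", 0), ("C", "A", 2)]
bound = iteration_bound(times, edges)
```

Here the circuit A, B has ratio 6/2 and the circuit A, B, C has ratio 9/2, so the longer circuit is the critical one and sets the bound.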
A dataflow computer architecture achieves higher computing speeds than the traditional von Neumann architecture by exploiting software parallelism at the instruction level in a highly parallel system of processing hardware. Dataflow architectures execute dataflow graphs: directed graphs in which nodes represent instructions and interconnecting arcs define dataflow paths. One such dataflow architecture, the Manchester Dataflow Computer, has several rings which communicate via a common switch. This paper describes the simulation of a single such ring. The code is written in Modula-2 using object-oriented techniques. Partial performance figures obtained by the author's group are also presented. The paper concludes by suggesting several simulator enhancements.
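The execution rule for dataflow graphs described above (an instruction fires once all of its operands have arrived) can be illustrated with a toy interpreter. This sketch is not the Modula-2 simulator from the paper and does not model the Manchester ring or its switch. The function `run_dataflow`, the tuple formats, and the example graph are assumptions chosen for illustration.

```python
import operator
from collections import deque


def run_dataflow(nodes, arcs, inputs):
    """Fire each node once all of its operand tokens have arrived.

    nodes:  dict name -> (function, arity)
    arcs:   list of (src, dst, dst_port)
    inputs: list of initial tokens (dst, dst_port, value)
    """
    consumers = {}
    for src, dst, port in arcs:
        consumers.setdefault(src, []).append((dst, port))

    waiting = {}            # node -> {port: value} for partially matched operands
    tokens = deque(inputs)  # tokens in flight
    results = {}

    while tokens:
        dst, port, value = tokens.popleft()
        slots = waiting.setdefault(dst, {})
        slots[port] = value
        func, arity = nodes[dst]
        if len(slots) == arity:
            # All operands matched: fire the instruction.
            out = func(*(slots[p] for p in range(arity)))
            del waiting[dst]
            results[dst] = out
            # Send the result token along every outgoing arc.
            for nxt, nxt_port in consumers.get(dst, []):
                tokens.append((nxt, nxt_port, out))
    return results


# Hypothetical graph computing (a + b) * (a - b) for a = 5, b = 3.
nodes = {
    "add": (operator.add, 2),
    "sub": (operator.sub, 2),
    "mul": (operator.mul, 2),
}
arcs = [("add", "mul", 0), ("sub", "mul", 1)]
inputs = [("add", 0, 5), ("add", 1, 3), ("sub", 0, 5), ("sub", 1, 3)]
results = run_dataflow(nodes, arcs, inputs)
```

The `add` and `sub` nodes have no dependency on each other, so a parallel machine could fire them at the same time. The sequential queue here only imitates that ordering freedom.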
An algorithm is presented to determine two lower bounds in functional pipelined data path synthesis. Given an iteration time constraint and a task initiation latency, the algorithm computes a lower bound on the number of functional units required to execute the dataflow graph (DFG) of a loop body, and given a resource constraint and a task initiation latency the algorithm computes a lower bound on the number of time steps required to execute the DFG. The lower bounds not only greatly reduce the size of the solution space, but also provide a means to measure the proximity of the final solution to an optimal one. The bounds are computed in polynomial time; therefore the algorithm is very effective, especially for large DFGs. Experiments indicate that the lower bound is very tight. For all of the test cases the difference between our solution and the optimal solution is not greater than one.
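As a point of comparison, the simplest lower bound on functional units in a pipelined loop comes from utilization alone: if a new iteration starts every L time steps, each unit offers L busy steps per iteration, so N operations of d steps each need at least ceil(N * d / L) units. The sketch below computes only this basic bound. It ignores the iteration time constraint and data dependencies, so it is weaker than, and not a reproduction of, the algorithm in the paper. The function name and the example operation counts are assumptions.

```python
from collections import Counter


def min_units(op_types, durations, initiation_latency):
    """Utilization-based lower bound on functional units per operation type.

    op_types:           iterable with one type name per operation in the loop body
    durations:          dict type -> time steps one operation occupies a unit
    initiation_latency: time steps between starts of successive iterations
    """
    counts = Counter(op_types)
    bound = {}
    for kind, n in counts.items():
        busy_steps = n * durations[kind]
        # Integer ceiling division avoids floating-point rounding.
        bound[kind] = -(-busy_steps // initiation_latency)
    return bound


# Hypothetical loop body: 8 two-step multiplications and 7 one-step additions.
ops = ["mul"] * 8 + ["add"] * 7
units = min_units(ops, {"mul": 2, "add": 1}, initiation_latency=4)
```

With an initiation latency of 4, the multiplications need 16 busy steps per iteration and the additions need 7, which gives a bound of 4 multipliers and 2 adders.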
Many modern computer systems are distributed over space. Well-known examples are the Internet of Things and IBM's TrueNorth for deep learning applications. At the Asynchronous Research Center (ARC) at Portland State University we build distributed hardware systems using self-timed computation and delay-insensitive communication. Where appropriate, self-timed hardware operations can reduce average and peak power, energy, latency, and electromagnetic interference. Alternatively, self-timed operations can increase throughput, tolerance to delay variations, scalability, and manufacturability. The design of complex hardware systems requires design automation and support for test, debug, and product characterization. This thesis focuses on design compilation and test support for dataflow applications. Both parts are necessary to go from self-timed circuits to large-scale hardware systems. As part of the research in design compilation, the ARCwelder compiler designed by Willem Mallon (previously with NXP and Philips Handshake Solutions) was extended. The key to testing distributed systems, including self-timed systems, is to identify the actions in the systems. In distributed systems there is no such thing as a global action. To test, debug, characterize, and even initialize distributed systems, it is necessary to control the local actions. The designs developed at the ARC separate the actions from the states. As part of the research in test and debug, a special circuit to control actions, called MrGO, was implemented. A scan and JTAG test interface was also implemented. The test implementations have been built into two silicon test experiments, called Weaver and Anvil, and were used successfully for testing, debug, and performance characterizations.
We present a new algorithm for cluster-oriented scheduling (COS) in pipelined data path synthesis. First, cluster structure information is extracted from the corresponding linear graph representation of a dataflow graph (DFG). Then, scheduling and allocation globally interact with the cluster information under the timing constraints of pipeline latency and data initialization interval (DII). Using this technique, the costs of registers, multiplexors, and interconnection were reduced dramatically in both the FIR and elliptic filter examples.