Vector quantization (VQ) is widely used for color image and video compression. However, its high computational overhead prohibits many applications in real-time systems. This paper presents a novel method to accelerate the full-search VQ algorithm by adding a quantized color pack extension (QCPX) instruction set architecture (ISA). QCPX not only supports a packed 16-bit YCbCr data format but also obtains performance and code density improvements by operating on three-color pixels in parallel within a 16-bit width. To measure the execution performance of the QCPX ISA, it is evaluated on a SIMD pixel array platform developed at Georgia Tech. In addition, by varying the grain size (pixels per processing element, PPE), this study measures the impact of QCPX in the presence of different levels of data parallelism. Simulation results indicate that the QCPX version achieves speedups from 27% to 297% over the non-QCPX version, with the most impressive improvements (>200%) occurring above the communication-bound 16 PPE granularity. QCPX also reduces average PE idle cycles by 45%. QCPX can be incorporated in a range of architectures, from current ILP processors to future massively data-parallel machines.
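As a point of reference for the full-search VQ algorithm mentioned above, the following is a minimal scalar sketch of the encoding step: every pixel is compared against every codeword and the index of the nearest one is kept. The `(Y, Cb, Cr)` tuple representation, the squared-Euclidean distance, and the function names are assumptions for illustration; the abstract does not give the bit layout of the packed 16-bit format or the QCPX instructions, so neither is modeled here.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # assumed (Y, Cb, Cr) triple, not the packed 16-bit format


def squared_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Squared Euclidean distance between two color vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def full_search_vq(pixels: Sequence[Pixel], codebook: Sequence[Pixel]) -> List[int]:
    """Return, for each pixel, the index of the nearest codeword.

    Full search means every codeword is examined for every pixel,
    so the cost is O(len(pixels) * len(codebook)) distance evaluations.
    """
    if not codebook:
        raise ValueError("codebook must not be empty")
    indices = []
    for p in pixels:
        best_index = 0
        best_dist = squared_distance(p, codebook[0])
        for i in range(1, len(codebook)):
            d = squared_distance(p, codebook[i])
            if d < best_dist:  # strict '<' keeps the lowest index on ties
                best_index, best_dist = i, d
        indices.append(best_index)
    return indices
```

The inner loop over the three components of each pixel is the part that a packed data format could, in principle, process in a single operation.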
Rapid changes in service, network, and user demand have left conventional distributed information systems mostly unprepared to satisfy users' requests continuously and in a timely manner. A high-assurance information system architecture sustained by Push/Pull mobile agents, called Faded Information Field (FIF), has been proposed to achieve adaptability and real-time properties under continuously changing system conditions. When the service offer changes, the information environment is restructured to match the current user demand in terms of volume and trend. However, once the structure is set, the users' Pull mobile agents (pull-MAs) must still be able to adapt autonomously to rapidly changing demand conditions. The goal of autonomous navigation of the pull-MAs is therefore to assure adaptive migration and distribution of the pull-MAs' processing, so as to avoid congestion in certain regions of the FIF when the user demand for information changes. In this paper, the concept and realization of autonomous Process or Go Navigation technology is proposed with the goal of preserving the timeliness of the system under local congestion conditions in the FIF. Its effectiveness in fairly reducing the users' average response time under changing demand conditions has been confirmed by simulation.
In mobile computing environments, distributed applications are provided over unstable wireless networks. The goal of our research is to offer service stability through rapid failure detection and recovery by switching to the service management required by each client application, through the cooperation of client middleware and the overlay network. This paper describes adaptive monitoring (AM), which detects failures rapidly with only a slight increase in network load, and the Monitoring Information Notification Protocol (MINP), which transfers failure information efficiently. Experiments on a testbed hosting real Web Services confirm that our system can detect a failure and switch to an alternative service within the recovery time demanded by most applications. We also confirm that the proposed method is effective in reducing network load as well as in satisfying application requirements.
In this paper, an efficient hybrid MAC-layer protocol for dependable multi-channel networking structures is presented and evaluated. The new protocol handles C sub-channels in total: c physically or frequency-divided sub-channels for the information (data) packets and one request sub-channel, on which the request packets are transmitted (C = c + 1). The proposed mechanism for accessing the request sub-channel slots is a TDMA-like scheme. An effective reservation mechanism based on the transmitted requests is used for deterministic access to the slots of the information sub-channels, mainly at high loads. Additionally, a random-access ALOHA-like protocol is selected for dynamic access to the Free Information Slots (FISs) under low traffic conditions, when the stations transmit no requests. This combination offers bounded packet delay and high throughput versus mean delay performance under mixed (asynchronous/synchronous) traffic. In addition, the proposed multi-channel networking structure offers high operational reliability due to the existing channel redundancy.
An overview is given of several aspects of the Cadena analysis facilities related to slicing and partial evaluation. First, it is illustrated how developers can declare various intra-component dependence properties using a lightweight specification formalism. Second, it is explained how different forms of slicing can be carried out on dependence for CCM designs constructed from component connection information and intra-component dependency specifications. Third, the often-overlooked connections between the symbolic evaluation strategies used in traditional partial evaluation and the state-space exploration strategies used in explicit-state model checking are summarized. Fourth, it is described how projections of CCM designs can be obtained using a form of partial evaluation driven by an extensible explicit-state model-checking engine called Bogor.
Real-time signal, image, and control applications have very strict time constraints, which call for the use of several powerful numerical calculation units. The aim of our project is to develop a fast and automatic prototyping process dedicated to parallel architectures made of both a PC and several latest-generation Texas Instruments digital signal processors (TMS320C6X DSPs). The process is based on SynDEx, a CAD software tool that improves algorithm implementation on multiprocessor architectures by finding the best match between an algorithm and an architecture. SynDEx kernels for automatic generation of dedicated PC and DSP code have been developed with the new SynDEx functionalities. A full coding application illustrates the results. The application is an image compression algorithm called LAR (locally adaptive resolution).
Motion detection systems for visual surveillance and monitoring purposes have aroused interest in the computer video community for many years. The main task of these applications is to identify (and track) moving targets. Usually, these applications require that a large number of parameters be tuned in order to work properly. In the traffic monitoring application we have developed, about thirty parameters of the detection algorithm were considered for optimization. Genetic algorithms (GAs) are an optimization technique that searches from a population of solutions rather than from a single point. Although they are usually very time-consuming, they have a high intrinsic parallelism. Accordingly, this paper shows how a distributed implementation of a GA over a network of workstations can successfully accomplish the parameter optimization task within a motion detection system and achieve excellent performance in a reduced amount of time.
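The population-based search described above can be sketched as follows. This is a generic GA under stated assumptions, not the paper's implementation: parameters are normalized to [0, 1], selection is simple truncation with elitism, and the names `run_ga` and `map_fn` are hypothetical. The `map_fn` argument marks where the intrinsic parallelism lies: fitness evaluations within a generation are independent, so a distributed map over workstations could be substituted for the built-in `map`.

```python
import random
from typing import Callable, List, Tuple


def run_ga(
    fitness: Callable[[List[float]], float],
    n_params: int,
    pop_size: int = 20,        # must be at least 2
    generations: int = 30,
    mutation_rate: float = 0.1,
    seed: int = 0,
    map_fn=map,                # swap in a parallel/distributed map here
) -> Tuple[List[float], float]:
    """Maximize `fitness` over parameter vectors in [0, 1]^n_params."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_params)] for _ in range(pop_size)]
    best: List[float] = []
    best_score = float("-inf")

    for _ in range(generations):
        # Fitness evaluations are independent, hence trivially parallel.
        scores = list(map_fn(fitness, population))
        ranked = sorted(zip(scores, population), key=lambda sp: sp[0], reverse=True)
        if ranked[0][0] > best_score:
            best_score, best = ranked[0][0], ranked[0][1][:]

        # Truncation selection: the better half become parents.
        parents = [ind for _, ind in ranked[: max(2, pop_size // 2)]]
        children = [ranked[0][1][:]]  # elitism: carry the best individual over
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_params) if n_params > 1 else 0
            child = a[:cut] + b[cut:]  # one-point crossover
            # Gaussian mutation, clamped back into [0, 1].
            child = [
                min(1.0, max(0.0, g + rng.gauss(0.0, 0.1)))
                if rng.random() < mutation_rate else g
                for g in child
            ]
            children.append(child)
        population = children

    return best, best_score
```

In a setting like the one described, `fitness` would run the detection algorithm with the candidate parameters and score its output, which is the expensive step that motivates distributing the evaluations.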
Dynamic traffic management methods constitute the intelligent core of Intelligent Transportation Systems (ITS). For these methods to be effective and deployable in real time, there is a need to develop models that predict future traffic conditions in a computational time much shorter than real time. In this paper, we report on parallel implementations for a class of Dynamic Traffic Assignment (DTA) models known as macroscopic DTA models. This class of models possesses mathematical formulations that are solved using various algorithms. Two parallel decomposition strategies, one based on network topology and one based on time, are investigated and implemented in a distributed-memory environment. Numerical results show that for the network-topology-based decomposition strategy, a speed-up of 5 is observed when the number of processors is 10, and the asymptotic speed-up is about 10. For the time-based decomposition strategy, a speed-up of 6.5 is observed when the number of processors is 10, and the asymptotic speed-up is about 25.
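One way to read the reported speed-ups is through Amdahl's law, which is an assumption made here for illustration and not a model stated in the abstract. Under that law, an asymptotic speed-up of S implies a serial fraction of 1/S, and the speed-up on p processors is 1 / (s + (1 - s) / p). The sketch below computes what this simple model would predict at 10 processors from the two asymptotic figures.

```python
def serial_fraction_from_asymptote(asymptotic_speedup: float) -> float:
    """Serial fraction s implied by Amdahl's law when speed-up tends to 1/s."""
    if asymptotic_speedup <= 0:
        raise ValueError("asymptotic speed-up must be positive")
    return 1.0 / asymptotic_speedup


def amdahl_speedup(serial_fraction: float, processors: int) -> float:
    """Amdahl's law: speed-up on `processors` with the given serial fraction."""
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


# Asymptotic speed-ups quoted in the abstract: about 10 (topology), about 25 (time).
topology_at_10 = amdahl_speedup(serial_fraction_from_asymptote(10.0), 10)
time_at_10 = amdahl_speedup(serial_fraction_from_asymptote(25.0), 10)
```

With these inputs the model gives roughly 5.3 and 7.4 at 10 processors, against the observed 5 and 6.5. The observed values sit somewhat below the idealized figures, which would be consistent with overheads such as communication that Amdahl's law does not account for, although the abstract does not attribute the gap to any specific cause.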
As many programs and much content, such as movie files, are delivered via the Internet, copies are often stored in distributed servers in order to reduce the load on the original servers, to ease network congestion, and to decrease response time. To retrieve an object file, existing methods simply select one or more servers. Such methods divide a file into equal pieces whose size is determined a priori. This approach is not practical for networks that offer variable bandwidth. In order to better utilize variable bandwidth, we propose an adaptive downloading method. We evaluate it by experiments conducted on the Internet. The results show that the new method is effective and that it will become an important network control technology for assurance.
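The contrast between equal, a priori pieces and bandwidth-aware pieces can be illustrated with a small sketch. The abstract does not describe how the proposed method sizes its pieces, so the proportional rule below is only an assumption, and `allocate_ranges` and its inputs are hypothetical names: each server is assigned a contiguous byte range whose length is proportional to its most recently measured throughput.

```python
from typing import List, Sequence, Tuple


def allocate_ranges(file_size: int, throughputs: Sequence[float]) -> List[Tuple[int, int]]:
    """Split [0, file_size) into half-open byte ranges, one per server.

    Range lengths are proportional to each server's measured throughput
    (any consistent unit). The last range absorbs rounding so the ranges
    always cover the whole file without gaps or overlap.
    """
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    if not throughputs or any(t < 0 for t in throughputs):
        raise ValueError("throughputs must be non-empty and non-negative")
    total = sum(throughputs)
    if total <= 0:
        raise ValueError("at least one server must have positive throughput")

    ranges = []
    start = 0
    cumulative = 0.0
    for i, t in enumerate(throughputs):
        cumulative += t
        if i == len(throughputs) - 1:
            end = file_size
        else:
            end = int(file_size * cumulative / total)
        ranges.append((start, end))
        start = end
    return ranges
```

An adaptive scheme along these lines would re-measure throughput during the transfer and call the function again on the bytes still outstanding, rather than fixing the split once at the start.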
ISBN: (print) 0769518702
Dynamic voltage scaling has been widely acknowledged as a powerful technique for trading off power consumption against delay in processors. Recently, variable-frequency (and variable-voltage) parallel and serial links have also been proposed, which can save link power consumption by exploiting variations in bandwidth requirements. In this paper, we address joint dynamic voltage scaling for variable-voltage processors and communication links in such systems. We propose a scheduling algorithm for real-time applications, with both data flow and control flow information captured. It performs efficient routing of communication events over multiple hops, as well as efficient slack allocation among heterogeneous processors and communication links, to maximize energy savings while meeting all real-time constraints.
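The power and delay trade-off behind dynamic voltage scaling can be shown with a single-task sketch. It rests on the common first-order assumption that dynamic energy per cycle scales with the square of the supply voltage; the operating points, function names, and numbers are hypothetical, and the sketch does not attempt the joint processor and link scheduling or slack allocation that the abstract describes.

```python
from typing import Optional, Sequence, Tuple

OperatingPoint = Tuple[float, float]  # (frequency in Hz, supply voltage in V)


def pick_operating_point(
    cycles: float, deadline_s: float, points: Sequence[OperatingPoint]
) -> Optional[OperatingPoint]:
    """Return the lowest-voltage point that still meets the deadline, or None."""
    feasible = [(f, v) for f, v in points if cycles / f <= deadline_s]
    if not feasible:
        return None
    return min(feasible, key=lambda fv: fv[1])


def relative_energy(cycles: float, voltage: float) -> float:
    """First-order dynamic energy, up to a constant factor: cycles * V^2."""
    return cycles * voltage ** 2


# Hypothetical operating points for illustration only.
points = [(100e6, 1.0), (200e6, 1.4), (400e6, 1.8)]
chosen = pick_operating_point(cycles=2e6, deadline_s=0.015, points=points)
```

Here the 100 MHz point would take 0.02 s and miss the 0.015 s deadline, so the 200 MHz, 1.4 V point is chosen; using the available slack this way costs less energy in this model than running at 400 MHz and 1.8 V.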