In this paper, we introduce a generic model to deal with the event matching problem of content-based publish/subscribe systems over structured P2P overlays. In this model, we claim that there are three methods (event-oriented, subscription-oriented and hybrid) to make all matched (event, subscription) pairs meet in a system. By theoretically analyzing the inherent problems of both the event-oriented and subscription-oriented methods, we propose PEM (Popularity-based Event Matching), a variant of the hybrid method. PEM can achieve a better trade-off between the event processing load and the subscription storage load of a system. PEM has been verified through both mathematical and simulation-based evaluation.
In this paper, we present an automatic synthesis framework to map loop nests to processor arrays with local memories on FPGAs. An affine transformation approach is first proposed to address the space-time mapping problem. Then a data-driven architecture model is introduced to enable automatic generation of processor arrays by extracting this model from the transformed loop nests. Techniques including memory allocation, communication generation and control generation are presented. Synthesizable RTL code can be easily generated from the architecture model built by these techniques. A preliminary synthesis tool is implemented based on PLUTO, an automatic polyhedral source-to-source transformation and parallelization framework.
We present a high-performance and memory-efficient hardware implementation of matrix multiplication for dense matrices of any size on FPGA devices. By applying a series of transformations and optimizations to the original serial algorithm, we obtain an I/O- and memory-optimized block algorithm for matrix multiplication on FPGAs. A linear array of processing elements (PEs) is proposed to implement this block algorithm. We show a significant reduction in hardware resource consumption compared with related work, while increasing the clock frequency. Moreover, the memory requirement can be reduced from O(S^2) to O(S), where S is the block size. Therefore, more PEs can be integrated into the same FPGA device.
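The idea of a block algorithm for matrix multiplication can be illustrated in software with generic loop tiling. The sketch below is only an assumption-laden illustration: the function name `blocked_matmul` and the parameter `block` (standing in for the block size S) are hypothetical, and it does not reproduce the paper's PE array or its O(S) memory scheme.

```python
def blocked_matmul(a, b, block):
    """Multiply a (n x k) by b (k x m), both lists of lists, one tile at a time."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    # Walk over tiles of the output (i0, j0) and of the shared dimension (k0).
    for i0 in range(0, n, block):
        for j0 in range(0, m, block):
            for k0 in range(0, k, block):
                # Inner loops touch at most block x block elements of each operand.
                for i in range(i0, min(i0 + block, n)):
                    for kk in range(k0, min(k0 + block, k)):
                        a_ik = a[i][kk]
                        for j in range(j0, min(j0 + block, m)):
                            c[i][j] += a_ik * b[kk][j]
    return c
```

Because the tile bounds are clipped with `min`, the matrix dimensions do not need to be multiples of the block size, which mirrors the abstract's "dense matrices of any size".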
ISBN: (print) 9781424472611; 9780769540597
The networked application environment has motivated the development of multitasking operating systems for sensor networks and other low-power electronic devices, but their multitasking capability is severely limited because traditional stack management techniques perform poorly on small-memory systems. In this paper, we show that combining binary translation and a new kernel runtime can lead to efficient OS designs on resource-constrained platforms. We introduce SenSmart, a multitasking OS for sensor networks, and present new OS design techniques for supporting preemptive multi-task scheduling, memory isolation, and versatile stack management. We have implemented SenSmart on MICA2/MICAz motes. Evaluation shows that SenSmart performs efficient binary translation and demonstrates a significantly better capability in managing concurrent tasks than other sensornet operating systems.
To efficiently perform large matrix LU decomposition on FPGAs with limited local memory, the original algorithm needs to be blocked. In this paper, we propose a block LU decomposition algorithm for FPGAs, which is applicable to matrices of arbitrary size. We introduce a high-performance hardware design, which mainly consists of a linear array of processing elements (PEs), to implement our block LU decomposition algorithm. A total of 36 PEs can be integrated into a Xilinx Virtex-5 xc5vlx330 FPGA on our self-designed PCI-Express card, reaching a sustained performance of 8.50 GFLOPS at 133 MHz, which outperforms previous work.
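What "blocking" LU decomposition means can be sketched with the textbook right-looking block variant. This is a minimal software illustration under stated assumptions: no pivoting is performed (so every pivot must be nonzero), the function name `block_lu` is hypothetical, and the abstract does not say whether the paper's algorithm follows this particular scheme.

```python
def block_lu(a, block):
    """Right-looking block LU without pivoting (assumes nonzero pivots).

    Returns one matrix holding U on and above the diagonal and the
    strictly lower part of unit-lower-triangular L below it.
    """
    n = len(a)
    lu = [row[:] for row in a]
    for k0 in range(0, n, block):
        k1 = min(k0 + block, n)
        # Step 1: unblocked LU of the panel (columns k0..k1-1, rows k0..n-1).
        for k in range(k0, k1):
            pivot = lu[k][k]
            for i in range(k + 1, n):
                lu[i][k] /= pivot
                for j in range(k + 1, k1):
                    lu[i][j] -= lu[i][k] * lu[k][j]
        # Step 2: forward substitution with L11 to form the block row U12.
        for k in range(k0, k1):
            for i in range(k + 1, k1):
                for j in range(k1, n):
                    lu[i][j] -= lu[i][k] * lu[k][j]
        # Step 3: update the trailing submatrix, A22 -= L21 * U12.
        for i in range(k1, n):
            for k in range(k0, k1):
                l_ik = lu[i][k]
                for j in range(k1, n):
                    lu[i][j] -= l_ik * lu[k][j]
    return lu
```

Each outer iteration only needs the current panel, the matching block row, and one tile of the trailing submatrix at a time, which is the property that makes blocked variants attractive when local memory is limited.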
The multi-island single electron transistor (MISET) is a kind of single electron transistor (SET) that has the advantage of operating at room temperature. A novel semi-empirical compact model for the MISET is proposed. The new approach combines the orthodox theory of single electron tunneling for a single Coulomb island with a novel empirical analysis for a chain of Coulomb islands. The model is verified against the Monte Carlo method in the SIMON simulator and is much faster than the traditional multi-island SET simulator, which is an advantage for large-scale multi-island SET circuit simulation.
ISBN: (print) 9781424477050
Encryption technology has become an important mechanism for securing data stored in outsourced databases. However, querying encrypted data efficiently is difficult, and many researchers have taken up the problem. To solve it, an encrypted schema based on the PostgreSQL DBMS is proposed. Through a security dictionary and extended SQL, the approach implements encrypted storage and efficient querying over the encrypted data in outsourced databases. Experimental results validate the efficiency and feasibility of our approach.
According to Moore's law, the complexity of VLSI circuits has doubled approximately every two years, making simulation the major bottleneck in the circuit design process. Parallel and distributed simulation can be applied as a fast, cost-effective approach to simulating large, complex circuits. In this paper, a simple yet effective simulated annealing-based approach is proposed to optimize the choice of a time window for optimistic parallel simulation. We chose gate-level circuit simulation as our experimental vehicle. Our results show up to a 52% improvement in simulation time using our simulated annealing algorithm. To the best of our knowledge, this is the first time that simulated annealing (SA) has been applied to optimize the performance of Time Warp simulations.
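How simulated annealing might search for a time window can be sketched as follows. Everything specific here is an assumption for illustration: the function `anneal_window`, its parameters, the cooling schedule, and the integer window range are hypothetical, and `cost` stands in for a measured simulation time that the abstract does not specify.

```python
import math
import random


def anneal_window(cost, lo, hi, start, steps=2000, temp=1.0, cooling=0.995, seed=0):
    """Search the integer range [lo, hi] for a window size with low cost.

    cost is a caller-supplied function (hypothetical here) that maps a
    window size to a number to minimize, such as a measured run time.
    """
    rng = random.Random(seed)
    current = best = start
    current_cost = best_cost = cost(start)
    step = max(1, (hi - lo) // 10)  # largest move tried per iteration
    for _ in range(steps):
        # Propose a nearby window size, clamped to the allowed range.
        candidate = min(hi, max(lo, current + rng.randint(-step, step)))
        delta = cost(candidate) - current_cost
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature falls.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, current_cost + delta
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        temp *= cooling
    return best, best_cost


def toy_cost(window):
    """Made-up cost curve with a minimum at window == 40, for demonstration only."""
    return (window - 40) ** 2 / 100.0 + 5.0
```

A call such as `anneal_window(toy_cost, 1, 200, 150)` returns the best window found and its cost. In a real setting each `cost` evaluation would be an expensive simulation run, so the number of steps would need to be far smaller than in this toy.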
ISBN: (print) 9781424435432; 9781424435449
The effect of quantum capacitance (QC) on the switching speed of T-CNFETs (Tunneling Carbon Nanotube Field Effect Transistors) is investigated by simulation using the NEGF (Non-Equilibrium Green's Function) formalism. First, based on the analytical expression of the inverse subthreshold slope (SS), a quantitative study of the source/drain lead doping level is made to obtain an averaged SS that is as small as possible. Then the impact of QC on gate control, and thus on SS, is studied. Lastly, for the first time, approaches for restricting the impact of QC are investigated, such as tuning the drain-source bias condition or the CNT diameter, which are governed by the analytical expression presented at the end of this paper. The modeling results reveal that QC in the T-CNFET has a negative impact on both SS and the on-current, and that this impact can be restricted effectively with a proper choice of drain-source voltage and CNT chirality.