ISBN (print): 9782839918442
This paper introduces cycle-reconfigurable modules that enhance FPGA architectures with efficient support for dynamic data accesses, that is, data accesses whose size and location are known only at runtime. The proposed module adopts new reconfiguration strategies based on dynamic FIFOs, dynamic caches, and dynamic shared memories to significantly reduce configuration generation and routing complexity. We develop a prototype FPGA chip with the proposed cycle-reconfigurable module in SMIC 130-nm technology. The integrated module occupies less chip area than 39 CLBs and reconfigures thousands of runtime connections in 1.2 ns. Applications for large-scale sorting, sparse matrix-vector multiplication, and Memcached are developed. The proposed modules enable 1.4-fold and 11-fold reductions in area-delay product compared with the same applications mapped to previous architectures and conventional FPGAs, respectively.
ISBN (print): 9781538631201
Network robustness describes how well the vertices of a network are connected to each other, which keeps the network strong and sustainable. Changes in network robustness may reveal events as well as periodic trend patterns that affect the interactions among vertices. Evaluating network robustness can help many applications, such as event detection, disease transmission, and network security. Many metrics exist for evaluating the robustness of networks, for example node connectivity, edge connectivity, algebraic connectivity, graph expansion, and R-energy. Choosing a reasonable metric that effectively measures network robustness in real applications is therefore a natural and pressing problem. In this paper, based on some general principles, we design and implement a benchmark, namely BMNR, for network robustness metrics. The benchmark consists of a graph generator, graph attack, and robustness metric evaluation. We find that R-energy can evaluate both connected and disconnected graphs and can be computed more efficiently.
ISBN (print): 9781509067428
The AXIOM platform is built to execute an application not only on a single board but also, in a distributed fashion, on multiple boards. While this is a classic problem with some solutions in the unconstrained case, it becomes interesting for embedded computing and cyber-physical systems, where we aim to accelerate applications while maintaining energy efficiency and easy programmability. Currently, the AXIOM platform consists of a custom board based on the Xilinx Zynq UltraScale+ ZU9EG, which incorporates the largest FPGA available on that System-on-Chip at the moment, four 64-bit ARM cores and two 32-bit ARM cores, up to 32 GiB of main memory, and several 12.5 Gbit/s transceivers. We relied on this hardware to develop our novel concept, which exploits dataflow execution in multiple ways for programs written in an OpenMP extension known as OmpSs. A key aspect is the adopted memory consistency model, which frees the programmer from managing the communication among nodes. The lower level of our communication stack relies on a fast interconnect based on inexpensive USB-C type connectors rather than on proprietary interfaces. The reconfigurable logic provides a complete Network Interface Card (NIC) to allow fast routing of the data and code of the system. We envision many applications for this platform, although we are currently focused on developing two basic scenarios based on Smart-Home and Smart-Videosurveillance. Our initial results confirm good scalability of the platform and a speed-up compared to other programming models such as Cilk and OpenMPI.
ISBN (print): 9781538605493
Support Vector Machine (SVM) is a linear binary classifier that requires a kernel function to handle non-linear problems. Most previous SVM implementations for embedded systems in the literature targeted a specific application, and their analyses were limited to comparisons with software implementations. The impact of different application datasets on SVM hardware performance was not analyzed. In this work, we propose a parameterizable, fully pipelined linear kernel architecture. It is prototyped and analyzed on an Altera Cyclone IV platform, and the results are verified against an equivalent software model. Further analysis determines the effect of the number of features and the number of support vectors on the performance of the hardware architecture. In our proposed linear kernel implementation, the number of features determines the maximum operating frequency and the amount of logic resource utilization, whereas the number of support vectors determines the amount of on-chip memory usage and the throughput of the system.
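To make the roles of the two parameters concrete, here is a minimal software sketch of a linear-kernel SVM decision function. It is not the paper's hardware architecture; the function names and the convention that each dual coefficient already holds the product of the Lagrange multiplier and the class label are assumptions made for illustration. The inner dot product runs over the features, and the outer accumulation runs over the support vectors.

```python
from typing import Sequence


def linear_kernel(x: Sequence[float], z: Sequence[float]) -> float:
    """Dot product of two feature vectors (one multiply-add per feature)."""
    return sum(a * b for a, b in zip(x, z))


def svm_decision(
    x: Sequence[float],
    support_vectors: Sequence[Sequence[float]],
    dual_coefs: Sequence[float],
    bias: float,
) -> float:
    """Evaluate f(x) = sum_i coef_i * <sv_i, x> + bias.

    Assumption: dual_coefs[i] already equals alpha_i * y_i.
    """
    acc = bias
    # One kernel evaluation per stored support vector.
    for sv, coef in zip(support_vectors, dual_coefs):
        acc += coef * linear_kernel(sv, x)
    return acc


def svm_predict(
    x: Sequence[float],
    support_vectors: Sequence[Sequence[float]],
    dual_coefs: Sequence[float],
    bias: float,
) -> int:
    """Return the class label +1 or -1 from the sign of the decision value."""
    return 1 if svm_decision(x, support_vectors, dual_coefs, bias) >= 0 else -1
```

In this sketch the work per classification grows with the product of the number of support vectors and the number of features, and the support vectors are the data that must be stored, which is consistent with the dependence on both parameters that the abstract reports.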
ISBN (print): 9781538615652
Convolutional dictionary learning (CDL) has great potential to "learn" rich sparse representations from training datasets by training translation-invariant filters. However, the performance of applying filters learned by CDL to inverse problems has not yet been fully realized, because the training data preprocessing applied in the training stage is not fully compensated for in the testing stage. We propose CDL using Adaptive Contrast Enhancement (CDL-ACE), which additionally models the preprocessing in CDL, and an image denoising model that uses the filters learned by CDL-ACE. For CDL-ACE, we apply a practically feasible and convergent Block Proximal Gradient method using Majorizer (BPG-M) with a momentum coefficient formula and an adaptive restarting rule. Numerical experiments show that, for strong additive white Gaussian noise, the proposed image denoiser using filters learned by CDL outperforms existing image denoising methods based on Wiener filtering and total variation, and that filters learned by CDL-ACE further improve the denoiser.
RoadRunneR-128 is a recently proposed lightweight, Feistel-type, bit-slice block cipher with a block size of 64 bits and a key length of 128 bits. RoadRunneR is specifically designed to offer a better performance in res...
ISBN (print): 9781538615652
This paper examines the problem of local bandwidth recovery for non-stationary stochastic signals when the only available information is given in terms of level crossings. The use of multiple level crossings is a fundamental paradigm of the recently investigated event-based sampling approach. Level crossings allow us to exploit local signal properties and to avoid unnecessarily fast sampling when the local signal intensity is low. The paper proposes a least-squares-based method for estimating the local intensity from level crossings for the class of signals that are time-warped versions of stationary, bandlimited Gaussian processes. This result is then related to the concept of the local mean bandwidth and finally to the local power bandwidth. A smooth convolution estimate of the local intensity is proposed, and a positivity-corrected version is introduced. The latter is obtained first by a truncation argument and then by the method of alternating projections onto convex sets.
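As a rough illustration of the kind of data the estimator works from, the following sketch extracts level-crossing events from a uniformly sampled signal. It is only an assumed, simplified detector and not the paper's estimation method; the function name and the rule that a crossing is recorded when two consecutive samples lie strictly on opposite sides of a level are illustrative choices.

```python
from typing import List, Sequence, Tuple


def level_crossings(
    samples: Sequence[float], levels: Sequence[float]
) -> List[Tuple[int, float]]:
    """Return (index, level) events for every level crossed between
    samples[index - 1] and samples[index].

    A crossing is recorded only when the two consecutive samples lie
    strictly on opposite sides of the level, so a sample that merely
    touches a level does not produce an event.
    """
    events: List[Tuple[int, float]] = []
    for i in range(1, len(samples)):
        prev, cur = samples[i - 1], samples[i]
        for level in levels:
            if (prev - level) * (cur - level) < 0:
                events.append((i, level))
    return events
```

A nearly constant stretch of signal produces no events in this sketch, while a rapidly varying stretch produces many, which is the property that lets event-based sampling avoid unnecessarily fast sampling where the local intensity is low.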
ISBN (print): 9781538605424
Identifying vulnerabilities in software systems is crucial to minimizing the damage that results from malicious exploits and software failures. This often requires proper identification of vulnerable execution paths that contain program vulnerabilities or bugs. However, with the rapid rise in software complexity, it has become notoriously difficult to identify such vulnerable paths by exhaustively searching the entire program execution space. In this paper, we propose StatSym, a novel, automated Statistics-Guided Symbolic Execution framework that integrates the swiftness of statistical inference and the rigor of symbolic execution techniques to achieve precision, agility, and scalability in vulnerable program path discovery. Our solution first leverages statistical analysis of program runtime information to construct predicates that are indicative of potential vulnerability in programs. These statistically identified paths, along with the associated predicates, effectively drive a symbolic execution engine to verify the presence of vulnerable paths and reduce their time to solution. We evaluate StatSym on four real-world applications from diverse domains: polymorph, CTree, Grep, and thttpd. Results show that StatSym is able to assist the symbolic executor, KLEE, in identifying the vulnerable paths in all four cases, whereas pure symbolic execution fails in three out of four applications due to memory space overrun.
ISBN (print): 9781509060009
This paper introduces a parametric design of a new 3D compliant parallel manipulator based on a pantograph linkage for micro/nano applications. Furthermore, the modal shapes and natural frequencies are analyzed as functions of the flexure joint parameters, which is crucial for controller selection and design and for geometry optimization. The new compliant manipulator provides decoupled 3-DOF translational motion with fixed orientation of the end effector, and it has a high workspace-to-size ratio. The modified manipulator aims to enlarge the workspace by increasing the magnification factors of the input motion and by reducing the parasitic motion and geometric stiffening of the original manipulator. The main parameters that affect the performance of the compliant manipulator are determined from the results of finite element analysis performed using ANSYS software. The results demonstrate the improvements of the proposed manipulator in terms of workspace size, magnification factors, joint stiffening, and parasitic motions.
ISBN (print): 9781509046034
Target image detection based on the rapid serial visual presentation (RSVP) paradigm is a typical brain-computer interface with various applications, such as image retrieval. In an RSVP paradigm, the P300 component is detected to determine the target image, which requires high-precision single-trial P300 detection methods. However, compared to multi-trial P300 detection methods, the performance of single-trial methods is always relatively lower. In this paper, we propose a novel paradigm, triple-RSVP, for EEG-based target image detection. In the triple-RSVP, three images appear at the same time and the target image appears three times, so that multi-trial P300 classification methods can be used to improve the detection accuracy. Experimental results show that the accuracy of the triple-RSVP is superior to that of standard RSVP (single RSVP) and dual-RSVP (Wilcoxon signed rank test, p < 0.05).