Search Results - Inner Mongolia University Library

International Conference on Parallel and Distributed Systems (ICPADS)

Authors: Dongni Han, Shixiong Xu, Li Chen, Lei Huang. Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China; Computer Science Department, Prairie View A&M University, USA

Stencil computations are at the core of a wide range of scientific and engineering applications. Much effort has gone into improving the efficiency of stencil calculations on different platforms, but unfortunately the results are not easy to reuse. In this paper we present PADS, a PAttern-Driven Stencil compiler-based tool, and a simple tuning system to reuse those well-optimized methods and codes. We also suggest extensions to OpenMP that describe high-level data structures in order to facilitate recognition of various stencil computation patterns. PADS allows programmers to rewrite stencil kernels or reuse source-to-source translator outputs as optimized stencil template codes with related tuning parameters. In addition, PADS consists of an OpenMP-to-CUDA translator and a code generator that uses the optimized template codes. It also obtains architecture-specific parameters to tune stencils across different GPU platforms. To demonstrate the flexibility and performance portability of our system, we illustrate four different stencil computations: a Laplacian operator with the Jacobi iterative method, a divergence operator, a three-dimensional 25-point stencil, and a 2D heat equation using the ADI method with periodic boundary conditions. PADS succeeds in generating all four stencil codes using different optimization strategies and delivers a promising performance improvement.

Keywords: Pattern matching, Kernel, Tuning, Libraries, Optimization, Generators
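
For readers unfamiliar with the first of the four stencils named in the abstract above, the following is a minimal sketch of a 5-point Laplacian stencil driven by Jacobi iteration. It is plain serial Python, not the OpenMP or CUDA code that PADS generates; the function names and the boundary setup in `make_grid` are assumptions made purely for illustration.

```python
def jacobi_step(grid):
    """Return a new grid after one Jacobi sweep of the 5-point stencil.

    Each interior point becomes the average of its four neighbours,
    read from the OLD grid (that is what makes it Jacobi rather than
    Gauss-Seidel). Boundary values are copied unchanged.
    """
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return new


def jacobi_solve(grid, sweeps):
    """Apply a fixed number of Jacobi sweeps (no convergence test)."""
    for _ in range(sweeps):
        grid = jacobi_step(grid)
    return grid


def make_grid(n, top=1.0):
    """Hypothetical test setup: n x n zeros with a fixed top boundary."""
    grid = [[0.0] * n for _ in range(n)]
    grid[0] = [top] * n
    return grid
```

Calling `jacobi_solve(make_grid(4), 200)` relaxes the four interior points toward the steady state of the discrete Laplace equation. The doubly nested loop over neighbours is the access pattern that stencil tools aim to recognize and map onto parallel hardware.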

Empirical design bugs prediction for verification

Design, Automation and Test in Europe Conference and Exhibition

Authors: Qi Guo, Tianshi Chen, Haihua Shen, Yunji Chen, Yue Wu, Weiwu Hu. Chinese Academy of Sciences, Beijing, China; Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China

The coverage model is the main technique for evaluating the thoroughness of dynamic verification of a Design-under-Verification (DUV). However, rather than achieving high coverage, the essential purpose of verification is to expose as many bugs as possible. In this paper, we propose a novel verification methodology that leverages early bug prediction for a DUV to guide and assess the related verification process. Specifically, this methodology uses predictive models built upon artificial neural networks (ANNs), which are capable of modeling the relationship between the high-level attributes of a design and its associated bug information. To evaluate the performance of the constructed predictive model, we conduct experiments on several open source projects. Moreover, we demonstrate the usability and effectiveness of the proposed methodology by elaborating on experiences from our industrial practice. Finally, we discuss the application of our methodology.

Keywords: Computer bugs, Predictive models, Complexity theory, Measurement, Training data, Training, Correlation

Cross-layer optimized placement and routing for FPGA soft error mitigation

Design, Automation and Test in Europe Conference and Exhibition

Authors: Keheng Huang, Yu Hu, Xiaowei Li. Chinese Academy of Sciences, Beijing, China; Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China

As the feature size of FPGAs shrinks to nanometers, soft errors are becoming an increasingly important concern for SRAM-based FPGAs. Because they do not consider the application-level impact, existing reliability-oriented placement and routing approaches analyze the soft error rate (SER) only at the physical level and consequently complete the design with suboptimal soft error mitigation. Our analysis shows that the statistical variation of the application-level factor is significant. Hence, in this work, we first propose a cube-based analysis to efficiently and accurately evaluate the application-level factor. We then propose a cross-layer optimized placement and routing algorithm that reduces the SER by combining the application-level and physical-level factors. Experimental results show that the average difference in the application-level factor between our cube-based method and Monte Carlo golden simulation is less than 0.01. Moreover, compared with the baseline VPR placement and routing technique, the cross-layer optimized placement and routing algorithm can reduce the SER by 14% with no area or performance overhead.

Keywords: Routing, Wires, Circuit faults, Field programmable gate arrays, Monte Carlo methods, Algorithm design and analysis, Accuracy

GPU-accelerated fault simulation and its new applications

International Symposium on VLSI Design, Automation and Test

Authors: Huawei Li, Dawen Xu, Kwang-Ting Cheng. Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; Department of Electrical and Computer Engineering, University of California, Santa Barbara, CA, USA

GPUs have recently been explored as a new general-purpose computing platform that is suitable for accelerating compute-intensive EDA applications. In this paper we describe a GPU-based one- to n-detection fault simulator for both stuck-at and transition faults, which demonstrates a 20X speedup over a commercial CPU-based fault simulator. We further show new fault-simulation-based test selection applications enabled by this accelerated fault simulation. Our results demonstrate that the tests selected by these applications achieve higher fault coverage for 1-to-n detections with steeper fault coverage curves, as well as better delay test quality, in comparison with tests deterministically generated by commercial ATPG tools.

Keywords: Circuit faults, Graphics processing unit, Instruction sets, Delay, Integrated circuit modeling, Logic gates, Computational modeling
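
To make the notion of n-detection in the abstract above concrete, here is a small serial sketch that counts, for each stuck-at fault, how many test patterns detect it. It is not the authors' GPU simulator and does not cover transition faults; the three-input circuit `y = (a AND b) OR c` and all function names are assumptions chosen for illustration.

```python
from itertools import product

# Nets of the hypothetical example circuit y = (a AND b) OR c.
NETS = ["a", "b", "c", "n1", "y"]


def simulate(pattern, fault=None):
    """Evaluate the circuit; `fault` is (net, stuck_value) or None."""
    def val(net, computed):
        # A faulty net ignores its computed value and holds the stuck value.
        if fault is not None and fault[0] == net:
            return fault[1]
        return computed

    a = val("a", pattern[0])
    b = val("b", pattern[1])
    c = val("c", pattern[2])
    n1 = val("n1", a & b)
    return val("y", n1 | c)


def detection_counts(patterns):
    """Map each stuck-at fault to the number of patterns that detect it."""
    faults = [(net, v) for net in NETS for v in (0, 1)]
    counts = {f: 0 for f in faults}
    for p in patterns:
        good = simulate(p)
        for f in faults:
            # A pattern detects a fault when the faulty output differs.
            if simulate(p, f) != good:
                counts[f] += 1
    return counts


def n_detected(counts, n):
    """Faults detected by at least n patterns (n-detection)."""
    return [f for f, k in counts.items() if k >= n]
```

For example, `detection_counts(list(product((0, 1), repeat=3)))` runs all eight input patterns. The inner loop over faults is independent for every (pattern, fault) pair, which is the kind of data parallelism a GPU implementation can exploit.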

Dynamic Resource Allocation Based on User experience in Virtualized Servers

Procedia Engineering, 2011, vol. 15, pp. 3780-3784

Authors: Wei Zhang, Jiajun Liu, Ying Song, Mingfa Zhu, Limin Xiao, Yuzhong Sun, Li Ruan. State Key Laboratory of Software Development Environment, Beijing 100191, China; School of Computer Science and Engineering, Beihang University, Beijing 100191, China; Key Laboratory of Computer System and Architecture, Chinese Academy of Sciences, Beijing; Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China

Web workloads are known to vary dynamically with time, which poses a challenge to resource allocation among applications. In this paper, we argue that existing dynamic resource allocation based on resource utilization has some drawbacks in virtualized servers. Dynamic resource allocation based directly on real-time user experience is more reasonable and also has practical significance. To address the problem, we propose a system architecture that combines real-time measurement and analysis of user experience for resource allocation. We evaluate our proposal using Webbench. The experimental results show that these techniques can judiciously allocate system resources.

Keywords: resource allocation, virtualized servers, user experience

On multicast throughput scaling of hybrid wireless networks with general node density

Computer Networks, 2011, vol. 55, no. 15, pp. 3548-3561

Authors: Cheng Wang, Changjun Jiang, Xiang-Yang Li, Yunhao Liu. Department of Computer Science and Technology, Tongji University, Shanghai, China; Key Laboratory of Embedded System and Service Computing, Ministry of Education, China; Department of Computer Science, Illinois Institute of Technology, Chicago, IL 60616, United States; TNLIST, School of Software, Tsinghua University, China

In this paper, we consider hybrid wireless networks with a general node density $\lambda \in [1, n]$, where $n$ ad hoc nodes are uniformly distributed and $m$ base stations (BSs) are regularly placed in a square region $\mathcal{A}(n, A) = [1, \sqrt{A}] \times [1, \sqrt{A}]$ with $A \in [1, n]$. We focus on multicast sessions in which each ad hoc node, as a user, randomly chooses $d$ ad hoc nodes as its destinations. Specifically, when $d = 1$ (or $d = n - 1$), a multicast session is essentially a unicast (or broadcast) session. We study the asymptotic multicast throughput of such a hybrid wireless network for different cases in terms of $m \in [1, n]$ and $d \in [1, n]$, as $n \to \infty$. To be specific, we design two types of multicast schemes, called the hybrid scheme and the BS-based scheme, respectively. For the hybrid scheme, there are two alternative routing backbones: sparse backbones and dense backbones. In particular, according to different regimes of the node density $\lambda = n/A$, we derive thresholds in terms of $m$ and $d$. Depending on these thresholds, we determine which scheme is preferred for better network throughput.

An ultrasound system for tumor detection in soft tissues using low transient pulse

IEEE International Conference on Automation Science and Engineering (CASE)

Authors: Ashish R. Ratnakar, MengChu Zhou. Department of Electrical and Computer Engineering, New Jersey Institute of Technology, Newark, NJ, USA; MoE Key Laboratory of Embedded System and Service Computing, Tongji University, Shanghai, China

This work presents a method to detect the size and location of a tumor in soft tissues using ultrasound. Quantitative ultrasound is used to send an ultrasound signal from a transmitter to multiple receivers. The received signal is analyzed to differentiate echogenic tumors, echolucent tumors, and non-tumor samples, and its delay is studied to determine the size and location of the tumor. The proposed system uses the Low Transient Pulse (LTP) technique and is implemented using Field Programmable Gate Array (FPGA) and Digital Signal Processor (DSP) technologies. In this co-design architecture, the DSP analyzes the received demodulated signal at a lower speed, while the FPGA runs at a higher speed to generate the LTP signal and demodulate the bandpass ultrasonic signal. This work elaborates on the implementation of a Quadrature Amplitude Modulation (QAM) receiver on the FPGA for the signal received from an ultrasound detector. LTP is applied to the tumor samples through the transmitter, and the signal received at an ultrasonic receiver is passed through QAM to obtain different maxima, which are then used to compute the location and size of the tumor on the DSP. This dual-platform co-design demonstrates a good application of an FPGA/DSP platform for LTP generation and received-signal processing.

Keywords: Digital signal processing, Field programmable gate arrays, Tumors, Ultrasonic imaging, Clocks, Quadrature amplitude modulation, Receivers

An ultra-fast hybrid simulation framework for ASIP

IEEE International Conference on Electronics, Circuits and Systems (ICECS)

Authors: Ji Qiu, Xiang Gao, Yifei Jiang, Xu Xiao. Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences (CAS), Beijing, China; Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL, USA

The Instruction Set Simulator (ISS) plays an important role in pre-silicon software development for ASIPs. However, traditional simulation is too slow to effectively support full-scale software development. In this paper, we propose a hybrid simulation framework that improves on previous simulation methods by aggressively using host machine resources. This is achieved by categorizing the instructions of an ASIP application into two types, custom and basic instructions, via binary instrumentation. In hybrid simulation, only custom instructions are simulated on the ISS, while basic instructions are executed natively, and therefore quickly, on the host machine. We implement this framework for an industrial ASIP to validate our approach. Experimental results show that when the implemented ISS, named GS-Sim, is applied to practical multimedia decoders, an average simulation speed of up to 1058.5 MIPS can be achieved, which is 34.7 times that of the state-of-the-art dynamic binary translation simulator and is, to the best of our knowledge, the fastest.

Keywords: Instruments, Context, Programming, Decoding, Switches, Prototypes, Context modeling

Improved condition for controllability of strongly dependent strict minimal siphons in Petri nets

IEEE International Conference on Networking, Sensing and Control

Authors: GuanJun Liu, ChangJun Jiang, MengChu Zhou. Key Laboratory of the MoE for Embedded System and Service Computing, University of Tongji, Shanghai, China; Department of Electrical and Computer Engineering, New Jersey Institute of Technology, Newark, NJ, USA

Li and Zhou propose an important concept for Petri nets: elementary siphons. They partition siphons into elementary and dependent ones. The controllability of the latter can be ensured by proper control of the former. They give a sufficient condition to decide whether a dependent siphon is controlled by its elementary ones in S³PR. However, this condition is so loose that in many cases the controllability of a dependent strict minimal siphon (SMS) cannot be determined even though it is actually controlled. In this paper, we propose an improved condition to decide the controllability of strongly dependent SMS.

Keywords: Controllability, System recovery, Nickel, Flexible manufacturing systems, Petri nets, Sufficient conditions, Control theory
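
As background for the abstract above, a set of places S is a siphon when every transition that puts tokens into S also takes a token from S, so an emptied siphon stays empty. The sketch below checks that standard definition and brute-force minimality on a small net. It does not implement the elementary/dependent classification or the controllability condition from the paper, and the additional "strict" requirement of an SMS is not checked; the net `PRE`/`POST` and the function names are assumptions for illustration.

```python
from itertools import combinations

# Hypothetical example net: for each transition, its input places (PRE)
# and output places (POST).
PRE = {"t1": {"p1"}, "t2": {"p2"}, "t3": {"p3"}}
POST = {"t1": {"p2"}, "t2": {"p1"}, "t3": {"p1"}}


def is_siphon(places, pre, post):
    """True if every transition with an output in S also has an input in S."""
    s = set(places)
    if not s:
        return False  # the empty set is excluded by convention
    return all(bool(pre[t] & s) for t in pre if post[t] & s)


def is_minimal_siphon(places, pre, post):
    """True if S is a siphon and no proper non-empty subset is one.

    Brute force over subsets, so only suitable for very small nets.
    """
    s = sorted(places)
    if not is_siphon(s, pre, post):
        return False
    for k in range(1, len(s)):
        for subset in combinations(s, k):
            if is_siphon(subset, pre, post):
                return False
    return True
```

In the example net, `{"p1", "p2", "p3"}` is a siphon but not a minimal one, because `{"p3"}` (a place with no input transition) is already a siphon on its own.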

Reader Activation Scheduling in Multi-reader RFID Systems: A Study of General Case

International Symposium on Parallel and Distributed Processing (IPDPS)

Authors: Shaojie Tang, Cheng Wang, Xiang-Yang Li, Changjun Jiang. Department of Computer Science, Illinois Institute of Technology, Chicago, IL, USA; Key Laboratory of Embedded System and Service Computing, Ministry of Education, Shanghai, China; Department of Computer Science, University of Tongji, Shanghai, China

Radio frequency identification (RFID) is a technology in which a reader device can "sense" the presence of a nearby object by reading a tag device attached to the object. To guarantee coverage quality, multiple RFID readers can be deployed in a given region. In this paper, we consider the problem of activation scheduling for readers in a multi-reader environment. In particular, we try to design a schedule for readers that maximizes the number of served tags per time slot while avoiding various kinds of interference. We first develop a centralized algorithm under the assumption that different readers may have different interference and interrogation radii. Next, we propose a novel algorithm that does not need any location information about the readers. Finally, we extend the previous algorithm to a distributed setting to suit the case where no central entity exists. We conduct extensive simulations to study the performance of our proposed algorithms. Our evaluation results corroborate our theoretical analysis.

Keywords: Interference, Radiofrequency identification, Schedules, Scheduling, Processor scheduling, Geometry, Algorithm design and analysis
