Search Results - Inner Mongolia University Library

A Power and Area Optimization Approach of Mixed Polarity Reed-Muller Expression for Incompletely Specified Boolean Functions

Journal of Computer Science & Technology, 2017, Vol. 32, No. 2, pp. 297-311

Authors: Zhen-Xue He Li-Min Xiao Li Ruan Fei Gu Zhi-Sheng Huo Guang-Jun Qin Ming-Fa Zhu F Long-Bing Zhang Rui Liu Xiang Wang State Key Laboratory of Software Development Environment Beihang University Beijing 100191 China School of Computer Science and Engineering Beihang University Beijing 100191 China School of Electronic and Information Engineering Beihang University Beijing 100191 China State Key Laboratory of Computer Architecture Institute of Computing Technology Chinese Academy of Sciences Beijing 100190 China University of Chinese Academy of Sciences Beijing 100049 China National Engineering Research Center for Science and Technology Resources Sharing Service Beihang University Beijing 100191 China

The power and area optimization of Reed-Muller (RM) circuits has attracted wide attention. However, almost none of the existing power and area optimization approaches can obtain all the Pareto optimal solutions of the original problem while remaining efficient. Moreover, they do not consider don't care terms, which prevents the circuit performance from being optimized further. In this paper, we propose a power and area optimization approach of mixed polarity RM expression (MPRM) for incompletely specified Boolean functions based on Non-Dominated Sorting Genetic Algorithm II (NSGA-II). Firstly, the incompletely specified Boolean function is transformed into a zero polarity incompletely specified MPRM (ISMPRM) by using a novel ISMPRM acquisition algorithm. Secondly, the polarity and the allocation of don't care terms of the ISMPRM are encoded as a chromosome. Lastly, the Pareto optimal solutions are obtained by using NSGA-II, in which the MPRM corresponding to a given chromosome is obtained by using a chromosome conversion algorithm. The results on incompletely specified Boolean functions and MCNC benchmark circuits show that a significant power and area improvement can be made compared with the existing power and area optimization approaches of RM circuits.

Keywords: power and area optimization; Reed-Muller (RM) circuit; Pareto optimal solution; don't care term; chromosome conversion
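As a reading aid for the abstract above, the following minimal sketch shows what "Pareto optimal" means when both power and area are minimized. It is not the paper's NSGA-II procedure or its chromosome conversion algorithm; the names `dominates` and `pareto_front` and the candidate values are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# A candidate circuit is summarized as (power, area); both are minimized.
# The numbers below are made up for illustration, not taken from the paper.
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """Return True if a is no worse than b in every objective and strictly better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates (input order preserved)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


candidates = [(10.0, 5.0), (8.0, 7.0), (9.0, 6.0), (11.0, 6.0), (8.0, 8.0)]
front = pareto_front(candidates)
print(front)
```

Here `(11.0, 6.0)` and `(8.0, 8.0)` are dropped because another candidate is at least as good in both objectives. A multi-objective method such as NSGA-II aims to return the whole surviving set rather than a single weighted compromise.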


Author Correction: BigNeuron: a resource to benchmark and predict performance of algorithms for automated tracing of neurons in light microscopy datasets

Nature Methods, 2024, Vol. 21, No. 10, p. 1959

Authors: Linus Manubens-Gil Zhi Zhou Hanbo Chen Arvind Ramanathan Xiaoxiao Liu Yufeng Liu Alessandro Bria Todd Gillette Zongcai Ruan Jian Yang Miroslav Radojević Ting Zhao Li Cheng Lei Qu Siqi Liu Kristofer E Bouchard Lin Gu Weidong Cai Shuiwang Ji Badrinath Roysam Ching-Wei Wang Hongchuan Yu Amos Sironi Daniel Maxim Iascone Jie Zhou Erhan Bas Eduardo Conde-Sousa Paulo Aguiar Xiang Li Yujie Li Sumit Nanda Yuan Wang Leila Muresan Pascal Fua Bing Ye Hai-Yan He Jochen F Staiger Manuel Peter Daniel N Cox Michel Simonneau Marcel Oberlaender Gregory Jefferis Kei Ito Paloma Gonzalez-Bellido Jinhyun Kim Edwin Rubel Hollis T Cline Hongkui Zeng Aljoscha Nern Ann-Shyn Chiang Jianhua Yao Jane Roskams Rick Livesey Janine Stevens Tianming Liu Chinh Dang Yike Guo Ning Zhong Georgia Tourassi Sean Hill Michael Hawrylycz Christof Koch Erik Meijering Giorgio A Ascoli Hanchuan Peng Institute for Brain and Intelligence Southeast University Nanjing China. Microsoft Corporation Redmond WA USA. Tencent AI Lab Bellevue WA USA. Computing Environment and Life Sciences Directorate Argonne National Laboratory Lemont IL USA. Kaya Medical Seattle WA USA. University of Cassino and Southern Lazio Cassino Italy. Center for Neural Informatics Structures and Plasticity Krasnow Institute for Advanced Study George Mason University Fairfax VA USA. Faculty of Information Technology Beijing University of Technology Beijing China. Beijing International Collaboration Base on Brain Informatics and Wisdom Services Beijing China. Nuctech Netherlands Rotterdam the Netherlands. Janelia Research Campus Howard Hughes Medical Institute Ashburn VA USA. Department of Electrical and Computer Engineering University of Alberta Edmonton Alberta Canada. Ministry of Education Key Laboratory of Intelligent Computation and Signal Processing Anhui University Hefei China. Paige AI New York NY USA. Scientific Data Division and Biological Systems and Engineering Division Lawrence Berkeley National Lab Berkeley CA USA.
Helen Wills Neuroscience Institute and Redwood Center for Theoretical Neuroscience UC Berkeley Berkeley CA USA. RIKEN AIP Tokyo Japan. Research Center for Advanced Science and Technology (RCAST) The University of Tokyo Tokyo Japan. School of Computer Science University of Sydney Sydney New South Wales Australia. Texas A&M University College Station TX USA. Cullen College of Engineering University of Houston Houston TX USA. Graduate Institute of Biomedical Engineering National Taiwan University of Science and Technology Taipei Taiwan. National Centre for Computer Animation Bournemouth University Poole UK. PROPHESEE Paris France. Department of Neuroscience Columbia University New York NY USA. Mortimer B. Zuckerman Mind Brain Behavior Institute Columbia University New York NY USA. Department of Computer Science Northern Illinois Universit


Find Your Online Social Friends from Mobile Internet Traffic

IEEE International Conference on Network Infrastructure and Digital Content (IC-NIDC)

Authors: Yi Zhang Yuanyuan Qiao Yanting Zhang Nanfei Shu Yizhe Song Jie Yang Beijing Key Laboratory of Network System Architecture and Convergence Beijing University of Posts and Telecommunications Beijing China Technology Research Institute Aisino Corporation Beijing China School of Electronic Engineering and Computer Science Queen Mary University of London London England

An increasing amount of mobile Internet traffic is being produced, and it contains ample personal information related to user mobility and website browsing behavior. Prior research has attempted to recommend friends based on the Global Positioning System (GPS) in location-based social networks (LBSN). However, friend recommendation in general social networks according to position from the base station is relatively understudied. This paper introduces a novel feature set extracted from mobile Internet traffic according to base station location and Uniform Resource Locator (URL). We train classification models using these features to predict friendship between pairs of Weibo users. Results show that base station location and URL, each used alone, can already effectively reflect friendships even in a general social network. We further show that by fusing the two features together, the model obtains even better performance. Finally, we demonstrate that the location and URL features improve prediction performance over using only common friends.

Keywords: Uniform resource locators; Feature extraction; Social network services; Base stations; Internet; Global Positioning System; Predictive models


Cooperative communication based connectivity recovery for UAV networks

arXiv, 2018

Authors: Tian, Wen; Jiao, Zhenzhen; Liu, Min; Zhang, Meng; Li, Dong. State Key Laboratory of Computer Architecture Institute of Computing Technology Chinese Academy of Sciences Beijing 100190 China University of Chinese Academy of Sciences Beijing 100049 China RDA FOA ART-CN1 Corporate Technology Siemens Ltd. China Beijing 100102 China

UAV networks often partition into separate clusters due to high node and link dynamics. As a result, network connectivity recovery is an important issue in this area. Existing solutions always require excessive node movement and thus lead to low recovery efficiency in terms of time and energy consumption. In this paper, we study, for the first time, how to utilize cooperative communication technology to improve connectivity recovery efficiency in UAV networks. We propose a Cooperative Communication based Connectivity Recovery algorithm for UAV Networks, named C3RUN. The key novelty is that C3RUN not only uses cooperative communication to enlarge a node's communication range and thus quickly repair network connectivity, but also enables nodes to proactively move to better places to ensure the establishment of cooperative communication links. We conduct extensive simulations to evaluate the performance of C3RUN. The simulation results reveal that, compared with existing work, C3RUN not only achieves connectivity recovery with fewer nodes moving over shorter distances, but also always finishes recovery in less time. Furthermore, C3RUN can achieve a 100% success ratio for connectivity recovery. Copyright © 2018, The Authors. All rights reserved.

Keywords: Unmanned aerial vehicles (UAV)


An introduction to CPU and DSP design in China

Science China (Information Sciences), 2016, Vol. 59, No. 1, pp. 58-65

Authors: Weiwu HU Yifu ZHANG Jie FU State Key Laboratory of Computer Architecture Institute of Computing Technology Chinese Academy of Sciences Loongson Technology Corporation Limited School of Computer and Control Engineering University of Chinese Academy of Sciences Institute of Computing Technology Chinese Academy of Sciences

In recent years, China has witnessed considerable achievements in the production of domestically-designed CPUs and DSPs. Owing to fifteen years of hard work that began in 2001, significant progress has been made in Chinese domestic CPUs and DSPs, primarily represented by Loongson and Shen Wei *** parts of the CPU design techniques are comparable to the world's most advanced designs. A special issue published in Scientia Sinica Informationis in April 2015 is dedicated to exhibiting the technical advancements in Chinese domestically-designed CPUs and DSPs. The content in this issue describes the design and optimization of high-performance processors and the key technologies in processor development; these include high-performance micro-architecture design, many-core and multi-core design, radiation hardening design, high-performance physical design, complex chip verification, and binary translation technology. We hope that the articles we collected will promote understanding of CPU/DSP progress in China. Moreover, we believe that the future of Chinese domestic CPU/DSP processors is quite promising.

Keywords: Chinese domestic CPUs and DSPs; Loongson CPU; Shen Wei CPU; YHFT DSP; BWDSP


Modeling the Impact of DVFS on Performance of Applications in Datacenter

Ruan Jian Xue Bao/Journal of Software, 2017, Vol. 28, No. 4, pp. 845-859

Authors: Li, Deng-Hui; Zhao, Jia-Cheng; Cui, Hui-Min; Feng, Xiao-Bing. State Key Laboratory of Computer Architecture Institute of Computing Technology The Chinese Academy of Sciences Beijing 100190 China School of Computer and Control Engineering University of Chinese Academy of Sciences Beijing 100049 China

Datacenters are built to house massive internet services at an affordable price. Both Op-ex (long-term operational expenditure) and Cap-ex (one-time construction costs) are directly impacted by datacenter power consumption. Thus, DVFS (dynamic voltage and frequency scaling) is widely adopted to improve per-node energy efficiency. However, it is well known but has not yet been fully explored that such schemes affect an application's power consumption and performance simultaneously. This paper focuses on the impact of DVFS on the performance of an application and proposes an analytical model to quantitatively characterize the relationship between an application's performance and a processor's frequency, which can be leveraged to predict the performance of an application at any frequency. Specifically, according to the different memory subsystem resources accessed, the instructions of an application are divided into two parts: on-chip instructions and off-chip instructions, which can be modeled independently. On-chip instructions are instructions that only access on-chip resources; their execution time is frequency-relevant and can be modeled using a linear function. Off-chip instructions are instructions that access the main memory; their execution time is dominated by memory access latency and is frequency-irrelevant. By dividing and modeling the two parts, a quantitative model can be obtained between the execution time of an application and the frequency of a processor. Evaluations using two different platforms and all benchmarks of SPEC 2006 show that the derived models are very precise, with an average prediction error of less than 1.34%. © Copyright 2017, Institute of Software, the Chinese Academy of Sciences. All rights reserved.

Keywords: Energy efficiency
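The two-part model described in the abstract above can be illustrated with a small sketch. The abstract does not give the exact functional form, so this assumes, purely for illustration, that on-chip time scales as a fixed cycle count divided by frequency and that off-chip time is a constant, giving T(f) = a / f + b. The function names and the synthetic measurements are hypothetical and are not the paper's model or data.

```python
from typing import List, Tuple


def fit_time_model(freqs_ghz: List[float], times_s: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of T(f) = a / f + b, which is linear in x = 1 / f.

    a approximates the frequency-dependent (on-chip) work,
    b approximates the frequency-independent (off-chip) time.
    """
    xs = [1.0 / f for f in freqs_ghz]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_t = sum(times_s) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, times_s))
    a = sxt / sxx
    b = mean_t - a * mean_x
    return a, b


def predict_time(a: float, b: float, freq_ghz: float) -> float:
    """Predict execution time at a frequency that was not measured."""
    return a / freq_ghz + b


# Synthetic measurements generated from a = 6.0, b = 2.0 (made-up values).
freqs = [1.0, 1.5, 2.0, 3.0]
times = [6.0 / f + 2.0 for f in freqs]

a, b = fit_time_model(freqs, times)
print(a, b, predict_time(a, b, 2.5))
```

Under this assumption, a memory-bound application shows a large `b` relative to `a / f`, so lowering the frequency costs little execution time, while a compute-bound application shows the opposite.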


Understanding the GPU Microarchitecture to Achieve Bare-Metal Performance Tuning

ACM SIGPLAN Notices, 2017, Vol. 52, No. 8, pp. 31-43

Authors: Zhang, Xiuxia; Tan, Guangming; Xue, Shuangbai; Li, Jiajia; Zhou, Keren; Chen, Mingyu. State Key Laboratory of Computer Architecture Institute of Computing Technology Chinese Academy of Sciences China University of Chinese Academy of Sciences China Georgia Institute of Technology United States

In this paper, we present a methodology to understand GPU microarchitectural features and improve performance for compute-intensive kernels. The methodology relies on a reverse engineering approach to crack the GPU ISA encodings in order to build a GPU assembler. An assembly microbenchmark suite correlates microarchitectural features with their performance factors to uncover instruction-level and memory hierarchy preferences. We use SGEMM as a running example to show the ways to achieve bare-metal performance tuning. The performance boost is achieved by tuning FFMA throughput by activating dual-issue, eliminating register bank conflicts, adding non-FFMA instructions with little penalty, and choosing the proper width of global/shared load instructions. On NVIDIA Kepler K20m, we develop a faster SGEMM with 3.1 Tflop/s performance and 88% efficiency; the performance is 15% higher than cuBLAS 7.0. Applying these optimizations to convolution, the implementation gains a 39%-62% performance improvement compared with cuDNN 4.0. The toolchain is an attempt to automatically crack different GPU ISA encodings and build an assembler adaptively for the purpose of performance enhancements to applications on GPUs. © 2017 ACM.

Keywords: Reverse engineering


Wide Operational Range Processor Power Delivery Design for Both Super-Threshold Voltage and Near-Threshold Voltage Computing

Journal of Computer Science & Technology, 2016, Vol. 31, No. 2, pp. 253-266

Authors: Xin He Gui-Hai Yan Yin-He Han Xiao-Wei Li State Key Laboratory of Computer Architecture Institute of Computing Technology Chinese Academy of Sciences Beijing 100190 China University of Chinese Academy of Sciences Beijing 100049 China

The load power range of modern processors is greatly enlarged because many advanced power management techniques are employed, such as dynamic voltage frequency scaling, Turbo Boosting, and near-threshold voltage (NTV) technologies. However, because the efficiency of power delivery varies greatly with different load conditions, conventional power delivery designs cannot maintain high efficiency over the entire voltage spectrum, and the gained power saving may be offset by power loss in power delivery. We propose SuperRange, a wide operational range power delivery unit. SuperRange complements the power delivery capability of the on-chip voltage regulator and the off-chip voltage regulator. On top of SuperRange, we analyze its power conversion characteristics and propose a voltage regulator (VR) aware power management algorithm. Moreover, as more and more cores are integrated on a single chip, multiple SuperRange units can serve as basic building blocks to build, in a highly scalable way, a more powerful power delivery subsystem with larger power capacity. Experimental results show that the SuperRange unit offers 1x and 1.3x higher power conversion efficiency (PCE) than two other conventional power delivery schemes in the NTV region and exhibits an average 70% PCE over the entire operational range. It also exhibits superior resilience to power-constrained systems.

Keywords: voltage regulator; power delivery; near-threshold computing; multicore processor


Towards memory and computation efficient graph processing on spark

IEEE International Conference on Big Data

Authors: Xinhui Tian Yuanqing Guo Jianfeng Zhan Lei Wang University of Chinese Academy of Sciences China Chinese Academy of Sciences State Key Laboratory of Computer Architecture (Institute of Computing Technology

Algorithms for large-scale natural graph processing can be categorized into two types based on their value propagation behaviors: the unidirectional value propagation (UVP) algorithms and the bidirectional value propagation (BVP) algorithms. How vertices interact with their neighbors also differs between the two algorithm types, which demands different system design choices. However, current distributed graph processing systems usually try to support both types in one general-purpose framework. Such a system design cannot promise good performance and low resource consumption for both types. In particular, for UVP algorithms, current systems cannot guarantee low memory footprint, computation efficiency and communication efficiency at the same time. In this paper, we propose a new graph processing engine on Spark, GraphV, which is specially designed for the unidirectional value propagation algorithms and can satisfy all the above requirements for this type of algorithm. To retain generality for other algorithms, we also build a dual-engine framework by integrating GraphV with Spark's existing graph processing engine GraphX. The main design choices of GraphV include a cheap propagation-related partitioner, a one-step computation model, and a locality-aware local graph layout. According to the experimental results, GraphV is faster than GraphX by factors of 1.2x-3.1x, with much less resource consumption. The source code of GraphV will be publicly available from http://***/GraphV.

Keywords: Algorithm design and analysis; Computational modeling; Sparks; Engines; Partitioning algorithms; Mirrors; Synchronization


Towards memory-efficient processing-in-memory architecture for convolutional neural networks

ACM SIGPLAN Notices, 2017, Vol. 52, No. 5, pp. 81-90

Authors: Wang, Yi; Zhang, Mingxu; Yang, Jing. College of Computer Science and Software Engineering Shenzhen University China Experimental and Innovation Practice Center Harbin Institute of Technology Shenzhen China State Key Laboratory of Computer Architecture Institute of Computing Technology Chinese Academy of Sciences China

Convolutional neural networks (CNNs) are widely adopted in artificial intelligence systems. In contrast to conventional computing-centric applications, the computational and memory resources of CNN applications are mixed together in the network weights. This incurs a significant amount of data movement, especially for high-dimensional convolutions. Although recent embedded 3D-stacked Processing-in-Memory (PIM) architecture alleviates this memory bottleneck to provide fast near-data processing, memory is still a limiting factor of the entire system. An unsolved key challenge is how to efficiently allocate convolutions to 3D-stacked PIM to combine the advantages of both neural and computational processing. This paper presents Memolution, a compiler-based memory-efficient data allocation strategy for convolutional neural networks on PIM architecture. Memolution offers thread-level parallelism that can fully exploit the computational power of PIM architecture. The objective is to capture the characteristics of neural network applications and present a hardware-independent design to transparently allocate CNN applications onto the underlying hardware resources provided by PIM. We demonstrate the viability of the proposed technique using a variety of realistic convolutional neural network applications. Our extensive evaluations show that Memolution significantly improves performance and cache utilization compared to the baseline scheme. © 2017 ACM.

Keywords: Memory architecture

