Search Results - Inner Mongolia University Library

IEEE International Conference on Computer Design: VLSI in Computers and Processors (ICCD)

Authors: Xiaoyu Zhang, Xiaoming Chen, Yinhe Han

Affiliations: State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences; School of Computer Science and Technology, University of Chinese Academy of Sciences

The performance gap between the processors and the main memory is continuously widening, known as the memory wall bottleneck. Emerging nonvolatile devices have the ability of in-memory processing, and thus, have the potential to partially alleviate the memory wall bottleneck. People have adopted nonvolatile devices to build various accelerators that are targeted at different problems and applications. In this work, we adopt one of the emerging nonvolatile devices, the ferroelectric field-effect transistor (FeFET), to build a multifunctional in-memory processing unit, which is named FeMAT. From a structural point of view, FeMAT is an FeFET-based memory array composed of 3T-based cells. From a functional point of view, FeMAT not only is a nonvolatile memory, but also can perform some logic operations (i.e., the processing-in-memory (PIM) mode), binary convolutions (i.e., the binary convolutional neural network (BCNN) acceleration mode) and content searching (i.e., the ternary content-addressable memory (TCAM) mode) in the memory. These functions are seamlessly fused into the FeFET-based memory array and can be configured online without changing the circuit structure. Superior energy efficiency is demonstrated by our experiments and comparisons with a resistive random-access memory (ReRAM) based equivalence, as well as a TCAM and a BCNN accelerator based on complementary metal-oxide-semiconductor (CMOS) devices.

Optimization Space Exploration of Hardware Design for CRYSTALS-KYBER

Asian Test Symposium (ATS)

Authors: Yixuan Zhao, Zhiteng Chao, Jing Ye, Wen Wang, Yuan Cao, Shuai Chen, Xiaowei Li, Huawei Li

Affiliations: State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China; University of Chinese Academy of Sciences, China; Yale University, US; Hohai University, China; Rock Solid Security Lab FiberHome Co. Ltd., China

ISBN (digital): 9781728174679

ISBN (print): 9781728174686

Public key cryptography is important in the global digital communication infrastructure. However, the emergence of quantum computers and Shor's algorithm has greatly threatened the security of public key cryptography. CRYSTALS-KYBER, a lattice-based key encapsulation mechanism (KEM), passed three rounds of a global solicitation for post-quantum cryptography algorithms held by the National Institute of Standards and Technology (NIST). This paper explores the implementation and optimization space of hardware design for the CRYSTALS-KYBER algorithm. We analyze its software code, try different strategies to optimize the hardware implementation, and conduct a comparative analysis in terms of area and speed. The experimental results show that the performance can be greatly improved by moderately optimizing the loops. In comparison with the best results of the work [12], our optimizations improve the performance by up to 74.6% for the encapsulation algorithm and 54.4% for the decapsulation algorithm.

Keywords: Cryptography; Optimization; Hardware; Pipeline processing; Encapsulation; Computers; Space exploration

Indexing Techniques of Distributed Ordered Tables: A Survey and Analysis

Journal of Computer Science & Technology, 2018, Vol. 33, No. 1, pp. 169-189

Authors: Chen Feng, Chun-Dian Li, Rui Li

Affiliations: State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China; Tencent Inc., Beijing 100080, China

Many NoSQL (Not Only SQL) databases have been proposed to store and query huge amounts of data. Some of them, such as BigTable, PNUTS, and HBase, can be modeled as distributed ordered tables (DOTs). Many additional indexing techniques have been presented to support queries on non-key columns for DOTs. However, there has been no comprehensive analysis or comparison of these techniques, which makes it difficult for users to select or propose a proper indexing technique for a given workload. This paper proposes a taxonomy based on six indexing issues to classify indexing techniques on DOTs and provides a comprehensive review of the state-of-the-art techniques. Based on the taxonomy, we propose a performance model named QSModel to estimate the query time and storage cost of these techniques and run experiments on a practical workload from Tencent to evaluate this model. The results show that the maximum error rates of the query time and storage cost are 24.2% and 9.8% respectively. Furthermore, we propose IndexComparator, an open source project that implements representative indexing techniques. Therefore, users can select the best-fit indexing technique based on both theoretical analysis and practical experiments.

Keywords: database; Not Only SQL (NoSQL); range query; indexing

AIoT Bench: Towards Comprehensive Benchmarking Mobile and Embedded Device Intelligence

1st International Symposium on Benchmarking, Measuring, and Optimization, Bench 2018

Authors: Luo, Chunjie; Zhang, Fan; Huang, Cheng; Xiong, Xingwang; Chen, Jianan; Wang, Lei; Gao, Wanling; Ye, Hainan; Wu, Tong; Zhou, Runsong; Zhan, Jianfeng

Affiliations: State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China; Beijing Academy of Frontier Science and Technology, Beijing, China; Dover, United Kingdom; China National Institute of Metrology, Beijing, China; China Software Testing Center, Beijing, China

ISBN (print): 9783030328122

Due to increasing amounts of data and compute resources, deep learning has achieved many successes in various domains. Recently, researchers and engineers have made efforts to apply intelligent algorithms to mobile and embedded devices. In this paper, we propose a benchmark suite, AIoT Bench, to evaluate the AI ability of mobile and embedded devices. Our benchmark (1) covers different application domains, e.g. image recognition, speech recognition and natural language processing; (2) covers different platforms, including Android and Raspberry Pi; (3) covers different development frameworks, including TensorFlow and Caffe2; (4) offers both end-to-end application workloads and micro workloads. © 2019, Springer Nature Switzerland AG.

Keywords: Benchmarking

An automatic debugging tool of instruction-driven multicore systems with synchronization points

arXiv, 2019

Authors: Luo, Yuzhe; Yu, Xin

Affiliations: State Key Laboratory of Computer Architecture, Institute of Computing Technology, CAS; University of Chinese Academy of Sciences, Beijing, China; Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China

Tracing back the instruction execution sequence to debug a multicore system can be very time-consuming because the relationships of the instructions can be very complex. For instructions that cannot be checked by the environment immediately after their executions, the errors they triggered can propagate through the instruction execution sequence. Our task is to find the error-triggered instructions automatically. This paper presents an automatic debugging tool that can leverage the synchronization points in the instruction execution sequences of the multicore system being verified to locate the instruction which results in simulation error automatically. To evaluate the performance of the debugging tool, we analyze the complexity of the algorithms and count the number of instructions executed to locate the aimed instruction. Copyright © 2019, The Authors. All rights reserved.

Keywords: Errors

CPicker: Leveraging Performance-Equivalent Configurations to Improve Data Center Energy Efficiency

Journal of Computer Science & Technology, 2018, Vol. 33, No. 1, pp. 131-144

Authors: Fa-Qiang Sun, Gui-Hai Yan, Xin He, Hua-Wei Li, Yin-He Han

Affiliations: State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; National Computer Network Emergency Response Technical Team of China, Beijing 100029, China; University of Chinese Academy of Sciences, Beijing 100049, China

The poor energy proportionality of servers is seen as the principal source of the low energy efficiency of modern data centers. We find that different resource configurations of an application lead to similar performance, but have distinct energy consumption. We call this phenomenon "performance-equivalent resource configurations (PERC)", and its performance range is called the equivalent region (ER). Based on PERC, one basic idea for improving energy efficiency is to select the most efficient configuration from PERC for each application. However, this cannot give every application its optimal solution when thousands of applications run simultaneously on resource-bounded servers. Here we propose a heuristic scheme, CPicker, based on genetic programming to improve the energy efficiency of servers. To speed up convergence, CPicker initializes a high-quality population by first choosing configurations from regions that have high energy variation. Experiments show that CPicker obtains above 17% energy efficiency improvement compared with the greedy approach, and less than 4% efficiency loss compared with the oracle case.

Keywords: performance equivalence; energy efficiency; data center; power management; dynamic voltage and frequency scaling (DVFS)

The roles of urban buildings and vegetation in adjusting seasonal and daily air temperature

4th ISPRS Geospatial Week 2019

Authors: Lan, Y.; Huang, Z.; Guo, R.; Zhan, Q.

Affiliations: Research Institute for Smart Cities, School of Architecture and Urban Planning, Shenzhen University, Shenzhen, China; Shenzhen Key Laboratory of Spatial Information Smart Sensing and Services, Shenzhen University, China; Guangdong Key Laboratory of Urban Informatics, Shenzhen University, China; National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University, China; School of Urban Design, Wuhan University, Wuhan, China

Exploring the spatiotemporal patterns of the relationships between urban indicators and urban temperature is essential to improve mitigation effectiveness when we intend to adjust the built environment to moderate the urban thermal environment. In this study, RS, GIS technology and statistical methods were used to investigate the spatiotemporal patterns of the impacts of urban buildings and vegetation on Air Temperature (AT). Building Density (BD) and Normalized Difference Vegetation Index (NDVI) are the indicators for urban buildings and vegetation respectively. The objectives of this study are: 1) to determine an appropriate scale for examining the building-AT relationships and vegetation-AT relationships; 2) to explore the seasonal and daily characteristics of these relationships; and 3) to compare the effects of urban buildings and vegetation. The results show that, for both summer and winter, a scale of 200–250 m is optimal for examining building-AT relationships, and 960–1020 m is the desirable scale for studying vegetation-AT relationships. Based on the optimal scales, we find that both buildings and vegetation significantly impact only night-time temperature, in both summer and winter. For seasonal comparison, the building-AT relationships and vegetation-AT relationships are relatively stronger in summer than in winter, as indicated by the R-square of the regression results. When comparing the effects of urban buildings and vegetation, we find that increasing vegetation is more effective than reducing buildings to achieve the same air temperature reduction. Our findings are conducive to generating space-time targeted Urban Heat Island (UHI) mitigation strategies. © Authors 2019.

Keywords: Vegetation

ACG-Engine: An Inference Accelerator for Content Generative Neural Networks

IEEE International Conference on Computer-Aided Design

Authors: Haobo Xu, Ying Wang, Yujie Wang, Jiajun Li, Bosheng Liu, Yinhe Han

Affiliations: University of Chinese Academy of Sciences; State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences; Research Center for Intelligent Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences

The technological breakthrough in Generative Adversarial Networks (GAN) has propelled the advancement of content generative applications such as AI-based paintings, style transfer, and music composition. However, in contrast to previous deep learning models for prediction and categorization, generative networks generally rely on instance normalization (IN) layers for better feature distribution, which perform significantly better than batch normalization (BN) in image style-transfer, image-to-image translation, etc. Unlike batch or group normalization, which can be fused into convolutional layers and ignored during the network inference stage, an instance normalization layer induces intensive computation and memory access. However, prior deep learning accelerator designs for traditional Neural Networks and Generative Adversarial Networks mostly focus on the acceleration of convolution and deconvolution layers but lack support for IN operations, which could become a performance bottleneck on edge devices with insufficient computational power. To address this problem, we propose an inference accelerator for content generation (ACG-Engine) that aims to support the fundamental operations of generative networks, including convolution layers, deconvolution layers, and specifically instance normalization layers. We performed a hardware-aware mathematical transformation of the IN operation for lower computational complexity and better memory-friendliness, so that it can be efficiently mapped to the classic 2D processing element array. Owing to the proposed optimization techniques, ACG-Engine achieves a 4.56X speedup and improves power efficiency by up to 29X compared to the prior baseline acceleration scheme in generative network acceleration. In addition, ACG-Engine can achieve performance comparable to the classic CNN-specific accelerators with negligible power consumption and area overhead.

Keywords: Performance evaluation; Deep learning; Deconvolution; Convolution; Neural networks; Generative adversarial networks; Acceleration

C-MIDN: Coupled Multiple Instance Detection Network With Segmentation Guidance for Weakly Supervised Object Detection

International Conference on Computer Vision (ICCV)

Authors: Gao Yan, Boxiao Liu, Nan Guo, Xiaochun Ye, Fang Wan, Haihang You, Dongrui Fan

Affiliations: Institute of Computing Technology, Beijing, China; State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China

ISBN (digital): 9781728148038

ISBN (print): 9781728148045

Weakly supervised object detection (WSOD) that only needs image-level annotations has obtained much attention recently. By combining convolutional neural network with multiple instance learning method, Multiple Instance Detection Network (MIDN) has become the most popular method to address the WSOD problem and been adopted as the initial model in many works. We argue that MIDN inclines to converge to the most discriminative object parts, which limits the performance of methods based on it. In this paper, we propose a novel Coupled Multiple Instance Detection Network (C-MIDN) to address this problem. Specifically, we use a pair of MIDNs, which work in a complementary manner with proposal removal. The localization information of the MIDNs is further coupled to obtain tighter bounding boxes and localize multiple objects. We also introduce a Segmentation Guided Proposal Removal (SGPR) algorithm to guarantee the MIL constraint after the removal and ensure the robustness of C-MIDN. Through a simple implementation of the C-MIDN with online detector refinement, we obtain 53.6% and 50.3% mAP on the challenging PASCAL VOC 2007 and 2012 benchmarks respectively, which significantly outperform the previous state-of-the-arts.

Keywords: Proposals; Detectors; Image segmentation; Semantics; Couplings; Training; Feature extraction

Paddy rice methane emissions across Monsoon Asia

Remote Sensing of Environment, 2023, Vol. 284

Authors: Ouyang, Zutao; Jackson, Robert B.; McNicol, Gavin; Fluet-Chouinard, Etienne; Runkle, Benjamin R.K.; Papale, Dario; Knox, Sara H.; Cooley, Sarah; Delwiche, Kyle B.; Feron, Sarah; Irvin, Jeremy Andrew; Malhotra, Avni; Muddasir, Muhammad; Sabbatini, Simone; Alberto, Ma. Carmelita R.; Cescatti, Alessandro; Chen, Chi-Ling; Dong, Jinwei; Fong, Bryant N.; Guo, Haiqiang; Hao, Lu; Iwata, Hiroki; Jia, Qingyu; Ju, Weimin; Kang, Minseok; Li, Hong; Kim, Joon; Reba, Michele L.; Nayak, Amaresh Kumar; Roberti, Debora Regina; Ryu, Youngryel; Swain, Chinmaya Kumar; Tsuang, Benjei; Xiao, Xiangming; Yuan, Wenping; Zhang, Geli; Zhang, Yongguang

Affiliations: Department of Earth System Science, Stanford University, Stanford, CA, United States; Woods Institute for the Environment, Stanford University, Stanford, CA, United States; Precourt Institute for Energy, Stanford University, Stanford, CA, United States; Department of Earth and Environmental Sciences, University of Illinois Chicago, Chicago, IL, United States; Department of Biological and Agricultural Engineering, University of Arkansas, Fayetteville, AR, United States; University of Tuscia, Viterbo, Italy; IAFES division, Viterbo, Italy; Department of Geography, The University of British Columbia, Vancouver, BC, Canada; Department of Geography, University of Oregon, Eugene, OR, United States; Department of Environmental Science Policy And Management, UC Berkeley, Berkeley, CA, United States; Campus Fryslan, University of Groningen, Groningen, Netherlands; Department of Computer Science, Stanford University, Stanford, CA, United States; Department of Geography, University of Zurich, Zurich, Switzerland; International Rice Research Institute, Laguna, Philippines; Ispra, Italy; Taiwan Agricultural Research Institute, Taichung, Wufeng, Taiwan; Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing, China; USDA ARS Delta Water Management Research Unit, Jonesboro, AR, United States; Fudan University, Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering and Coastal Ecosystems Research Station of the Yangtze River Estuary, Shanghai, China; Jiangsu Key Laboratory of Agricultural Meteorology, Nanjing University of Information Science and Technology, Nanjing, China; Department of Environmental Science, Shinshu University, Matsumoto, Japan; China Meteorological Administration, Institute of Atmospheric Environment, Shenyang, China; International Institute for Earth System Sciences, Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing University, Jiangsu, Nanjing 210023, China; National Center for AgroM

Although rice cultivation is one of the most important agricultural sources of methane (CH4) and contributes ∼8% of total global anthropogenic emissions, large discrepancies remain among estimates of global CH4 emissions from rice cultivation (ranging from 18 to 115 Tg CH4 yr−1) due to a lack of observational constraints. The spatial distribution of paddy-rice emissions has been assessed at regional-to-global scales by bottom-up inventories and land surface models over coarse spatial resolution (e.g., > 0.5°) or spatial units (e.g., agro-ecological zones). However, high-resolution CH4 flux estimates capable of capturing the effects of local climate and management practices on emissions, as well as replicating in situ data, remain challenging to produce because of the scarcity of high-resolution maps of paddy-rice and insufficient understanding of CH4 predictors. Here, we combine paddy-rice methane-flux data from 23 global eddy covariance sites and MODIS remote sensing data with machine learning to 1) evaluate data-driven model performance and variable importance for predicting rice CH4 fluxes; and 2) produce gridded up-scaling estimates of rice CH4 emissions at 5000-m resolution across Monsoon Asia, where ∼87% of global rice area is cultivated and ∼90% of global rice production occurs. Our random-forest model achieved Nash-Sutcliffe Efficiency values of 0.59 and 0.69 for 8-day CH4 fluxes and site mean CH4 fluxes respectively, with land surface temperature, biomass and water-availability-related indices as the most important predictors. We estimate the average annual (winter fallow season excluded) paddy rice CH4 emissions throughout Monsoon Asia to be 20.6 ± 1.1 Tg yr−1 for 2001–2015, which is at the lower range of previous inventory-based estimates (20–32 CH4 Tg yr−1). Our estimates also suggest that CH4 emissions from paddy rice in this region have been declining from 2007 through 2015 following declines in both paddy-rice growing area and emission rates per unit

Keywords: Climate change
