Search Results - Inner Mongolia University Library

IEEE Symposium on High-Performance Computer Architecture

Authors: Xiaoming Chen, Yinhe Han, Yu Wang

Affiliations: Center for Intelligent Computing Systems, State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China; Department of Electronic Engineering, Tsinghua University, Beijing, China

ISBN: (digital) 9781728161495

ISBN: (print) 9781728161501

In current convolutional neural network (CNN) accelerators, communication (i.e., memory access) dominates the energy consumption. This work provides comprehensive analysis and methodologies to minimize the communication for CNN accelerators. For the off-chip communication, we derive the theoretical lower bound for any convolutional layer and propose a dataflow to reach the lower bound. This fundamental problem has never been solved by prior studies. The on-chip communication is minimized based on an elaborate workload and storage mapping scheme. We additionally design a communication-optimal CNN accelerator architecture. Evaluations based on 65 nm technology demonstrate that the proposed architecture nearly reaches the theoretical minimum communication in a three-level memory hierarchy and that it is computation dominant. The gap between the energy efficiency of our accelerator and the theoretical best value is only 37-87%.

Keywords: System-on-chip; Convolution; Random access memory; Convolutional codes; Memory management; Microsoft Windows; Kernel

Mixed reality architecture in space habitats

70th International Astronautical Congress, IAC 2019

Authors: Basu, Tamalee; Bannova, Olga; Camba, Jorge D.

Affiliations: Sasakawa International Center for Space Architecture, University of Houston, 4800 Calhoun Rd, Houston, TX 77004, United States; Department of Computer Graphics Technology, Purdue University, 401 N Grant St, West Lafayette, IN 47907, United States

From assisting with assembling the Orion capsule to using highly immersive virtual environments for astronaut training, MR technologies provide a powerful mechanism to alter the perception of the physical world and deliver realistic personalized visual stimuli to users. In this paper, we discuss a novel strategy to utilize MR technologies as a design element to enhance the interior architecture of the space habitat and enrich the inhabitants' personal experience. We discuss two scenarios that entail long-duration missions as well as a customized experience for space tourists in the Low Earth Orbit (LEO). A series of spacecraft volumetric studies of the ergonomics associated with the application of MR technologies are reported. Physical, virtual and combined experiences are mapped within the volumes with respect to crew ConOps. The experiences are then analyzed and translated to architectural design requirements that inform criteria for the development of personalized MR-based interventions. For the first scenario, NASA's 500 days on the surface of Mars mission is considered, which requires 600 additional days in microgravity transit inside the Deep Space Transfer vehicle, a 7.2 m wide hard-shell module. In this scenario, MR experiences are used as a stress countermeasure to help a crew of four to sustain psychological and behavioral health, maintain productivity, and stimulate teamwork and performance. This is accomplished by providing novelty in the habitat as well as designing content that can increase the volumetric perception of the environment. The second scenario is presented in the context of space tourism where habitats with minimum physical interior design elements can be transformed into comfortable habitable personal environments. Bigelow Space Operations' B330 was selected as a reference site for a 12-day LEO tourism mission. We discuss a design approach that provides tourists with a high level of comfort by using projection-based MR technologies to cus

Keywords: Mixed reality

Optimal Eco-driving Control of Autonomous and Electric Trucks in Adaptation to Highway Topography: Energy Minimization and Battery Life Extension

arXiv, 2020

Authors: Zhang, Yongzhi; Qu, Xiaobo; Tong, Lang

Affiliations: The College of Mechanical and Vehicle Engineering, Chongqing University, Chongqing 400044, China; The Department of Architecture and Civil Engineering, Chalmers University of Technology, Gothenburg 41296, Sweden; The School of Electrical and Computer Engineering, Cornell University, Ithaca, NY 14853, United States

This paper develops a model to plan energy-efficient speed trajectories of electric trucks in real time by taking into account the information of topography and traffic ahead of the vehicle. In this real time control model, a novel state-space model is first developed to capture vehicle speed, acceleration, and state of charge. An energy minimization problem is then formulated and solved by an alternating direction method of multipliers (ADMM) that exploits the structure of the problem. A model predictive control (MPC) framework is further employed to deal with topographic and traffic uncertainties in real-time. An empirical study is finally conducted on the performance of the proposed eco-driving algorithm and its impact on battery degradation. The simulation results show that the energy consumption by using the developed method is reduced by up to 5.05%, and the battery life extended by more than 100% compared to benchmarking solutions. Copyright © 2020, The Authors. All rights reserved.

Keywords: Topography

Creating a radiological database for automatic liver segmentation using artificial intelligence.

European Journal of Surgical Oncology, 2022, vol. 48, no. 2, pp. e147-e148

Authors: Girnyi, Sergii; Tomasz, Dziubich; Brzeski, Adam; Cychnerski, Jan; Świetlik, Dariusz; Woźniak, Jakub; Szczecińska, Weronika; Jaśkiewicz, Janusz; Zielinski, Jacek

Affiliations: Medical University of Gdansk, Department of Surgical Oncology, Gdansk, Poland; Gdansk University of Technology, Faculty of Electronics, Telecommunications and Informatics, Department of Computer Architecture, Gdansk, Poland; Medical University of Gdansk, Department of Biostatistics and Neural Networks, Gdansk, Poland; Medical University of Gdansk, Gdansk, Poland

Bandwidth-Aware Last-Level Caching: Efficiently Coordinating Off-Chip Read and Write Bandwidth

IEEE International Conference on Computer Design: VLSI in Computers and Processors (ICCD)

Authors: Mainak Chaudhuri, Jayesh Gaur, Sreenivas Subramoney

Affiliations: Department of Computer Science and Engineering, Indian Institute of Technology Kanpur; Processor Architecture Research Lab, Intel Corporation

The last two decades have witnessed a large number of proposals on the last-level cache (LLC) replacement policy aiming to minimize the number of LLC read misses. Another independent large body of work has explored mechanisms to address the inefficiencies arising from the DRAM writes introduced by the LLC replacement policy. These DRAM scheduling proposals, however, leave the LLC replacement policy unchanged and, as a result, miss the opportunity of synergistically shaping and scheduling the DRAM write bandwidth demand. In this paper, we argue that DRAM read and write bandwidth demands must be coordinated carefully from the LLC side and hence, introduce bandwidth-awareness in the LLC policy. Our bandwidth-aware LLC policy proposal enables long uninterrupted stretches of DRAM reads while maintaining the efficiency of the last-level cache and controlling precisely when and for how long writes can demand DRAM bandwidth. Our proposal comfortably outperforms the state-of-the-art eager DRAM write scheduling proposals and bridges 75% of the performance gap between the baseline and a hypothetical system that deploys an unbounded DRAM write buffer.

A Non-Stop Double Buffering Mechanism for Dataflow Architecture

Journal of Computer Science & Technology, 2018, vol. 33, no. 1, pp. 145-157

Authors: Xu Tan, Xiao-Wei Shen, Xiao-Chun Ye, Da Wang, Dong-Rui Fan, Lunkai Zhang, Wen-Ming Li, Zhi-Min Zhang, Zhi-Min Tang

Affiliations: State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; School of Computer and Control Engineering, University of Chinese Academy of Sciences, Beijing 100049, China; State Key Laboratory of Mathematical Engineering and Advanced Computing, Wuxi 214125, China; Department of Computer Science, The University of Chicago, Chicago, IL 60637, U.S.A.

Double buffering is an effective mechanism to hide the latency of data transfers between on-chip and off-chip memory. However, in dataflow architecture, the swapping of two buffers during the execution of many tiles decreases the performance because of repetitive filling and draining of the dataflow accelerator. In this work, we propose a non-stop double buffering mechanism for dataflow architecture. The proposed non-stop mechanism assigns tiles to the processing element array without stopping the execution of processing elements through optimizing control logic in dataflow architecture. Moreover, we propose a work-flow program to cooperate with the non-stop double buffering mechanism. After optimizations both on control logic and on the work-flow program, the filling and draining of the array need to be done only once across the execution of all tiles belonging to the same dataflow graph. Experimental results show that the proposed double buffering mechanism for dataflow architecture achieves a 16.2% average efficiency improvement over that without the optimization.

Keywords: non-stop double buffering; dataflow architecture; high-performance computing

Lightweight Task-Oriented Semantic Communication Empowered by Large-Scale AI Models

IEEE Transactions on Vehicular Technology, 2025

Authors: Liu, Chuanhong; Guo, Caili; Yang, Yang; Chen, Mingzhe; Quek, Tony Q. S.

Affiliations: Beijing University of Posts and Telecommunications, Beijing Key Laboratory of Network System Architecture and Convergence, School of Information and Communication Engineering, Beijing 100876, China; Beijing University of Posts and Telecommunications, Beijing Laboratory of Advanced Information Networks, School of Information and Communication Engineering, Beijing 100876, China; University of Miami, Department of Electrical and Computer Engineering, Institute for Data Science and Computing, Coral Gables, FL, United States; Singapore University of Technology and Design, Dept. of Information Systems Technology and Design, 487372, Singapore

Recent studies have focused on leveraging large-scale artificial intelligence (LAI) models to improve semantic representation and compression capabilities. However, the substantial computational demands of LAI models pose significant challenges for real-time communication scenarios. To address this, this paper proposes utilizing knowledge distillation (KD) techniques to extract and condense knowledge from LAI models, effectively reducing model complexity and computation latency. Nevertheless, the inherent complexity of LAI models leads to prolonged inference times during distillation, while their lack of channel awareness compromises the distillation performance. These limitations make standard KD methods unsuitable for task-oriented semantic communication scenarios. To address these issues, we propose a fast distillation method featuring a pre-stored compression mechanism that eliminates the need for repetitive inference, significantly improving efficiency. Furthermore, a channel adaptive module is incorporated to dynamically adjust the transmitted semantic information based on varying channel conditions, enhancing communication reliability and adaptability. In addition, an information bottleneck-based loss function is derived to guide the fast distillation process. Simulation results verify that the proposed scheme outperforms baselines in terms of task accuracy, model size, computation latency, and training data requirements. © 1967-2012 IEEE.

Keywords: NP-hard

VNet: A Versatile Network for Efficient Real-Time Semantic Segmentation

IEEE International Conference on Computer Design: VLSI in Computers and Processors (ICCD)

Authors: Ning Lin, Hang Lu, Jingliang Gao, Shunjie Qiao, Xiaowei Li

Affiliations: State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences; Department of Computer Science, University of Hong Kong

Many recent excellent methods for efficient real-time semantic segmentation are of low precision and heavily rely on multiple GPUs for training. In this paper, we rethink the critical factors affecting the accuracy of efficient segmentation models. Previous works usually reduce the input resolution prior to training the parameters of models by cropping or resizing the images. On the contrary, our empirical study shows that the reduced images lose important content information and details, which are vital to high precision. However, the previous methods are unable to train on the original high-resolution images due to memory-limited GPUs. To tackle this problem, we propose a novel versatile network (VNet), which employs a reversible mechanism and asymmetric convolution to achieve high efficiency and extremely low memory consumption in backward propagation. In particular, we keep all the detailed spatial information of the input images without cropping or resizing to pursue decent prediction accuracy. It is worth noting that VNet can train on multiple 1024×2048 high-resolution images on only one standard GPU card. Under the same conditions, our model achieves a new state-of-the-art result on the Cityscapes dataset. Specifically, it can process 1024×2048 high-resolution inputs at a rate of 37.4 and 15.5 frames per second (fps) on a standard GPU and an edge device, respectively, with only 0.16 million parameters.

A Fast and Compact Invertible Sketch for Network-Wide Heavy Flow Detection

arXiv, 2019

Authors: Tang, Lu; Huang, Qun; Lee, Patrick P.C.

Affiliations: Department of Computer Science and Engineering, Chinese University of Hong Kong, Hong Kong; State Key Lab of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences

Fast detection of heavy flows (e.g., heavy hitters and heavy changers) in massive network traffic is challenging due to the stringent requirements of fast packet processing and limited resource availability. Invertible sketches are summary data structures that can recover heavy flows with small memory footprints and bounded errors, yet existing invertible sketches incur high memory access overhead that leads to performance degradation. We present MV-Sketch, a fast and compact invertible sketch that supports heavy flow detection with small and static memory allocation. MV-Sketch tracks candidate heavy flows inside the sketch data structure via the idea of majority voting, such that it incurs small memory access overhead in both update and query operations, while achieving high detection accuracy. We present theoretical analysis on the memory usage, performance, and accuracy of MV-Sketch in both local and network-wide scenarios. We further show how MV-Sketch can be implemented and deployed on P4-based programmable switches subject to hardware deployment constraints. We conduct evaluation in both software and hardware environments. Trace-driven evaluation in software shows that MV-Sketch achieves higher accuracy than existing invertible sketches, with up to 3.38× throughput gain. We also show how to boost the performance of MV-Sketch with SIMD instructions. Furthermore, we evaluate MV-Sketch on a Barefoot Tofino switch and show how MV-Sketch achieves line-rate measurement with limited hardware resource overhead. Copyright © 2019, The Authors. All rights reserved.
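The majority-voting idea described in the abstract above can be illustrated with a minimal Python sketch. It assumes each bucket keeps a total count, one candidate key, and a vote counter that is updated in the style of the Boyer-Moore majority vote. The class name `MVSketch`, the SHA-256 row hashing, the table dimensions, and the sample flow keys are assumptions made for this illustration; they are not taken from the authors' implementation.

```python
import hashlib


class MVSketch:
    """Toy invertible sketch: every bucket stores [total, candidate, votes]."""

    def __init__(self, rows=4, width=64):
        self.rows = rows
        self.width = width
        # total: sum of all values hashed to the bucket
        # candidate: key currently winning the majority vote
        # votes: vote counter for the candidate
        self.buckets = [[[0, None, 0] for _ in range(width)] for _ in range(rows)]

    def _index(self, row, key):
        # Hypothetical per-row hash; a real deployment would use a faster hash.
        digest = hashlib.sha256(f"{row}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def update(self, key, value=1):
        for row in range(self.rows):
            bucket = self.buckets[row][self._index(row, key)]
            bucket[0] += value
            if bucket[1] == key:
                bucket[2] += value
            else:
                bucket[2] -= value
                if bucket[2] < 0:
                    # The new key out-votes the old candidate and replaces it.
                    bucket[1] = key
                    bucket[2] = -bucket[2]

    def estimate(self, key):
        # Each row yields an estimate of the key's count; keep the smallest one.
        per_row = []
        for row in range(self.rows):
            total, candidate, votes = self.buckets[row][self._index(row, key)]
            if candidate == key:
                per_row.append((total + votes) // 2)
            else:
                per_row.append((total - votes) // 2)
        return min(per_row)

    def heavy_flows(self, threshold):
        # Invertibility: candidate keys are read back directly from the buckets.
        candidates = {
            bucket[1]
            for row in self.buckets
            for bucket in row
            if bucket[1] is not None and bucket[0] >= threshold
        }
        estimates = {key: self.estimate(key) for key in candidates}
        return {key: est for key, est in estimates.items() if est >= threshold}


sketch = MVSketch(rows=4, width=64)
for i in range(500):
    sketch.update("10.0.0.1")  # one heavy flow, 500 packets
    if i < 200:
        sketch.update(f"192.168.1.{i}")  # 200 small flows, one packet each
heavy = sketch.heavy_flows(threshold=300)
```

Because the candidate key lives inside each bucket, heavy flows can be recovered by scanning the table, without storing every flow key separately. Each update touches one bucket per row, which is consistent with the small memory access overhead the abstract claims.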

Keywords: Memory architecture

Crack identification using extended IsoGeometric analysis and particle swarm optimization

7th International Conference on Fracture Fatigue and Wear, FFW 2018

Authors: Khatir, Samir; Wahab, Magd Abdel; Benaissa, Brahim; Köppen, Mario

Affiliations: Department of Electrical Energy, Metals, Mechanical Constructions and Systems, Faculty of Engineering and Architecture, Ghent University, Ghent, Belgium; Graduate School of Life Science and Systems Engineering, Kyushu Institute of Technology, Kitakyushu, Japan; Graduate School of Computer Science and Systems Engineering, Kyushu Institute of Technology, Kitakyushu, Japan

ISBN: (print) 9789811304101

The eXtended isogeometric analysis (X-IGA) combined with particle swarm optimization (PSO) is used for crack identification in two-dimensional linear elastic problems, formulated as an inverse problem. A fracture mechanics test under mode II loading is performed. The X-IGA combines the advantages of the eXtended Finite Element Method (X-FEM) and Isogeometric Analysis (IGA). The objective function minimizes the gap between the calculated and measured displacements. Convergence studies at various positions of the crack on the plate are carried out, and the results show that the proposed technique can detect damage with minimum accuracy 95% for the position and maximum accuracy 98%. © Springer Nature Singapore Pte Ltd. 2019.
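As a rough illustration of the inverse-problem setup in the abstract above, the following Python sketch runs a basic particle swarm over candidate crack positions and minimizes the squared gap between calculated and measured displacements. Everything specific here is hypothetical: the `forward_model` function is a toy stand-in for an X-IGA solver, and the sensor positions, swarm parameters, and "true" crack position are invented for the example.

```python
import math
import random

# Hypothetical sensor locations on a unit plate.
SENSORS = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]


def forward_model(crack):
    """Toy stand-in for the X-IGA solver: crack position -> sensor readings."""
    cx, cy = crack
    return [math.exp(-((cx - sx) ** 2 + (cy - sy) ** 2)) for sx, sy in SENSORS]


def objective(crack, measured):
    """Sum of squared gaps between calculated and measured displacements."""
    calculated = forward_model(crack)
    return sum((c - m) ** 2 for c, m in zip(calculated, measured))


def pso(measured, n_particles=30, n_iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    pos = [[rng.random(), rng.random()] for _ in range(n_particles)]
    vel = [[0.0, 0.0] for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]  # personal best positions
    best_val = [objective(p, measured) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best so far
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(2):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (
                    w * vel[i][d]
                    + c1 * r1 * (best_pos[i][d] - pos[i][d])
                    + c2 * r2 * (g_pos[d] - pos[i][d])
                )
                # Keep particles inside the unit plate.
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            val = objective(pos[i], measured)
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


true_crack = (0.3, 0.6)  # invented ground truth for the demonstration
measured = forward_model(true_crack)
estimate, residual = pso(measured)
```

In an actual study the forward model would be the expensive numerical solver, so the number of objective evaluations (particles times iterations) becomes the main cost to budget.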

Keywords: Inverse problem PSO and crack identification and plate 2D XIGA
