Nowadays, the use of unmanned aerial vehicles (UAVs) as fog access points (F-APs) is of high practical value to future fog radio access networks (F-RANs). Compared to the F-AP, a UAV-enabled F-AP possesses stronger ...
ISBN (print): 9798400700958
As FPGAs become ubiquitous compute platforms, existing research has focused on enabling virtualization features to facilitate fine-grained FPGA sharing. We employ an overlay architecture which enables arbitrary, independent user logic to share portions of a single FPGA by dividing the FPGA into independently reconfigurable slots. We then explore scheduling possibilities to effectively time- and space-multiplex the virtualized FPGA by introducing Nimblock. The Nimblock scheduling algorithm balances application priorities and performance degradation to improve response time and reduce deadline violations. Unlike other algorithms, Nimblock explores pre-emption as a scheduling parameter to dynamically change resource allocations, and automatically allocates resources to enable suitable parallelism for an application without additional user input. In our exploration, we evaluate five scheduling algorithms: a baseline, three existing algorithms, and our novel Nimblock algorithm. We demonstrate system feasibility by realizing the complete system on a Xilinx ZCU106 FPGA and evaluating on a set of real-world benchmarks. In our results, we achieve up to 5.7x lower average response times when compared to a no-sharing and no-virtualization scheduling algorithm and up to 2.1x average response time improvement over competitive scheduling algorithms that support sharing within our virtualization environment. We additionally demonstrate up to 49% fewer deadline violations and up to 2.6x lower tail response times when compared to other high-performance algorithms.
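As a hedged illustration of the scheduling idea this abstract describes, the sketch below models slot-based preemptive scheduling in Python: waiting tasks are ranked by an urgency score blending priority, deadline slack, and observed slowdown, and a waiting task may preempt the least urgent running one. The `Task` fields, the `urgency` formula, and the greedy preemption rule are illustrative assumptions, not Nimblock's published algorithm.

```python
# Minimal sketch of a preemptive, priority-aware scheduler over
# reconfigurable FPGA slots. All names and formulas are assumptions.
from dataclasses import dataclass, field
import heapq

@dataclass(order=True)
class Task:
    score: float                        # lower = more urgent (heap key)
    name: str = field(compare=False)
    priority: int = field(compare=False)
    slots_needed: int = field(compare=False)
    deadline: float = field(compare=False)

def urgency(priority, deadline, now, slowdown):
    # Blend static priority, deadline slack, and observed degradation
    # (an illustrative formula, not the paper's).
    return (deadline - now) / priority + slowdown

def schedule(ready, running, free_slots):
    """Greedy pass: admit urgent tasks into free slots, preempting the
    least urgent running task when a waiting task outranks it."""
    heapq.heapify(ready)
    while ready:
        task = heapq.heappop(ready)
        if task.slots_needed <= free_slots:
            running.append(task)
            free_slots -= task.slots_needed
        elif running and task.score < max(r.score for r in running):
            victim = max(running, key=lambda r: r.score)
            running.remove(victim)          # preempt: reclaim its slots
            free_slots += victim.slots_needed
            heapq.heappush(ready, victim)   # victim waits again
            heapq.heappush(ready, task)     # retry with freed slots
        else:
            heapq.heappush(ready, task)     # nothing else fits now
            break
    return running, free_slots

# Tiny demo: a tight-deadline task preempts a low-priority one.
now = 0.0
a = Task(urgency(1, 100.0, now, 0.0), "batch", 1, 3, 100.0)
b = Task(urgency(8, 5.0, now, 0.5), "interactive", 8, 2, 5.0)
running, free = schedule([b], *schedule([a], [], 4))
print([t.name for t in running], "free slots:", free)  # ['interactive'] 2
```

In a real system a pass like this would rerun each scheduling epoch with recomputed urgency scores, so degradation feedback steers future allocations.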
Most learned B-frame codecs with hierarchical temporal prediction suffer from the domain shift issue caused by the discrepancy between the Group-of-Pictures (GOP) sizes used for training and testing. As such, the motion estima...
ISBN (print): 9798350304817
The proceedings contain 72 papers. The topics discussed include: InsightsSumm - summarization of ITOps incidents through in-context prompt engineering; fine-grained heterogeneous execution framework with energy aware scheduling; learning representations on logs for AIOps; EN-Beats: a novel ensemble learning-based method for multiple resource predictions in cloud; the case for the anonymization of offloaded computation; deep reinforcement learning in cloud elasticity through offline learning and return based scaling; demystifying deep learning in predictive monitoring for cloud-native SLOs; Storm-RTS: stream processing with stable performance for multi-cloud and cloud-edge; blaze: a high-performance, scalable, and efficient data transfer framework with configurable and extensible features; and Kepler: a framework to calculate the energy consumption of containerized applications.
ISBN (print): 9781665420273
General matrix multiply (GEMM) is an important operation in broad applications, especially the thriving deep neural networks. To achieve low power consumption for GEMM, researchers have already leveraged unary computing, which manipulates bitstreams with extremely simple logic. However, existing unary architectures do not generalize well to the varying GEMM configurations of versatile applications and are incompatible with the binary computing stack, making it challenging to execute unary GEMM effortlessly. In this work, we address the problem by architecting a hybrid unary-binary systolic array, uSystolic, to inherit the legacy binary data scheduling with slow (thus power-efficient) data movement, i.e., data bytes crawl out from memory to drive uSystolic. uSystolic exhibits tremendous area and power improvements as a joint effect of 1) a low-power computing kernel, 2) spatial-temporal bitstream reuse, and 3) on-chip SRAM elimination. For the evaluated edge computing scenario, compared with the binary parallel design, the rate-coded uSystolic reduces the systolic array area and total on-chip area by 59.0% and 91.3%, with the on-chip energy and power efficiency improved by up to 112.2x and 44.8x for AlexNet.
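For readers unfamiliar with unary computing, the toy sketch below shows the core principle the abstract relies on: under rate coding, a value in [0, 1] becomes the density of 1s in a bitstream, so a single AND gate approximates a multiplication. This is a generic software model of rate-coded arithmetic, not uSystolic's hybrid unary-binary datapath.

```python
# Toy illustration of rate-coded unary arithmetic: a value in [0, 1] is
# encoded as the density of 1s in a bitstream, so one AND gate per lane
# approximates a multiply. This shows the principle behind unary GEMM
# kernels, not uSystolic's actual circuit.
import random

def to_bitstream(value, length=2048):
    """Rate coding: each bit is 1 with probability `value`."""
    return [1 if random.random() < value else 0 for _ in range(length)]

def unary_mul(a_bits, b_bits):
    """AND of two independent rate-coded streams estimates a * b."""
    return sum(x & y for x, y in zip(a_bits, b_bits)) / len(a_bits)

def unary_dot(a_vec, b_vec, length=2048):
    """Approximate dot product for vectors with entries in [0, 1]."""
    return sum(unary_mul(to_bitstream(a, length), to_bitstream(b, length))
               for a, b in zip(a_vec, b_vec))

# 0.5 * 0.25 ~= 0.125; the error shrinks as the stream length grows.
print(unary_mul(to_bitstream(0.5), to_bitstream(0.25)))
print(unary_dot([0.9, 0.1, 0.4], [0.2, 0.8, 0.5]))  # ~= 0.46
```

The extreme simplicity of the per-bit logic is what yields the power savings; the cost, which architectures like uSystolic must manage, is the long bitstreams needed for accuracy.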
ISBN (print): 9781665476522
Database Management Systems (DBMS) have become an essential tool for industry and research and are often a significant component of data centers. There have been many efforts to accelerate DBMS application performance. One of the most explored techniques is the use of vector processing. Unfortunately, conventional vector architectures have not been able to exploit the full potential of DBMS acceleration. In this paper, we present VAQUERO, our Scratchpad-based Vector Accelerator for QUEry pROcessing. VAQUERO improves the efficiency of vector architectures for DBMS operations such as data aggregation and hash joins featuring lookup tables. Lookup tables are significant contributors to performance bottlenecks in DBMS processing, suffering from insufficient ISA support in the form of scatter-gather instructions. VAQUERO introduces a novel Advanced Scratchpad Memory specifically designed with two mapping modes, direct and associative. These mapping modes enable VAQUERO to accelerate real-world databases with workload sizes that significantly exceed the scratchpad memory capacity. Additionally, the associative mode allows VAQUERO to be used with DBMS operators that rely on hashed keys, e.g., hash-join and hash-aggregate. VAQUERO has been designed around general DBMS algorithm requirements instead of being based on a particular database organization. For this reason, VAQUERO is capable of accelerating DBMS operators for both row- and column-oriented databases. In this paper, we evaluate the efficiency of VAQUERO using two highly optimized popular open-source DBMS, namely the row-based PostgreSQL and the column-based MonetDB. We implemented VAQUERO at the RTL level and prototyped it, by performing Place&Route, at the 7nm technology node. VAQUERO incurs a modest 0.15% area overhead compared with an Intel Ice Lake processor. Our evaluation shows that VAQUERO significantly outperforms PostgreSQL and MonetDB by 2.09x and 3.32x respectively, when processing operators and queries.
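A minimal software model of the two mapping modes might look like the following; the class shape, write-through policy, and evict-on-miss refill are illustrative assumptions, not VAQUERO's actual Advanced Scratchpad Memory design. The distinction modeled here is that direct mode indexes with the raw (dense integer) key, while associative mode hashes arbitrary keys and relies on tag checks, which is what lets workloads larger than the scratchpad still resolve correctly against backing memory.

```python
# Minimal two-mode scratchpad lookup, loosely inspired by the direct-
# and associative-mode description in the abstract. Names and policies
# are assumptions for illustration only.
class Scratchpad:
    def __init__(self, capacity, mode="direct", backing=None):
        self.capacity = capacity
        self.mode = mode                    # "direct" or "associative"
        self.lines = [None] * capacity      # each line holds (key, value)
        self.backing = backing if backing is not None else {}  # main memory

    def _slot(self, key):
        raw = key if self.mode == "direct" else hash(key)
        return raw % self.capacity

    def lookup(self, key):
        """Gather: return the value for key, refilling on a miss."""
        line = self.lines[self._slot(key)]
        if line is not None and line[0] == key:
            return line[1]                  # scratchpad hit
        value = self.backing.get(key)       # miss: fetch from main memory
        self.lines[self._slot(key)] = (key, value)  # refill, evicting
        return value

    def update(self, key, value):
        """Scatter: write the value; write-through keeps backing correct."""
        self.lines[self._slot(key)] = (key, value)
        self.backing[key] = value

# Hash-aggregate (per-key count) over more keys than the scratchpad holds:
spm = Scratchpad(capacity=8, mode="associative")
for key in [3, 7, 3, 42, 7, 99, 3]:
    spm.update(key, (spm.lookup(key) or 0) + 1)
print(spm.backing)                          # {3: 3, 7: 2, 42: 1, 99: 1}
```

The write-through choice in this toy keeps aggregates correct even when hot lines are evicted, mirroring why a scratchpad can serve tables far larger than its capacity.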
The amazing success of deep neural networks benefits from the rise of big data. As deep learning models grow larger in scale than ever before, their requirements for memory bandwidth are growing at a tremendous pace...
ISBN (print): 9798400700958
Although serverless computing is a popular paradigm, current serverless environments have high overheads. Recently, it has been shown that serverless workloads frequently exhibit bursts of invocations of the same function. Such a pattern is not handled well by current platforms. Supporting it efficiently can speed up serverless execution substantially. In this paper, we target this dominant pattern with a new serverless platform design named MXFaaS. MXFaaS improves function performance by efficiently multiplexing (i.e., sharing) processor cycles, I/O bandwidth, and memory/processor state between concurrently executing invocations of the same function. MXFaaS introduces a new container abstraction called MXContainer. To enable efficient use of processor cycles, an MXContainer carefully helps schedule same-function invocations for minimal response time. To enable efficient use of I/O bandwidth, an MXContainer coalesces remote storage accesses and remote function calls from same-function invocations. Finally, to enable efficient use of memory/processor state, an MXContainer first initializes the state of its container and only later, on demand, spawns a process per function invocation, so that all invocations can share unmodified memory state and hence minimize the memory footprint. We implement MXFaaS in two serverless platforms and run diverse serverless benchmarks. With MXFaaS, serverless environments are much more efficient. Compared to a state-of-the-art serverless environment, MXFaaS on average speeds up execution by 5.2x, reduces P99 tail latency by 7.4x, and improves throughput by 4.8x. In addition, it reduces the average memory usage by 3.4x.
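One of the mechanisms the abstract describes, coalescing remote storage accesses across same-function invocations, can be sketched in a few lines of asyncio Python. The `MXContainer` class below is a toy model under assumed names; the real MXFaaS operates at the container and OS level rather than inside one event loop.

```python
# Toy model of MXContainer-style I/O coalescing: concurrent invocations
# of the same function that read the same remote key share one fetch.
# The asyncio structure and all names here are illustrative assumptions.
import asyncio

class MXContainer:
    def __init__(self, fn, storage):
        self.fn = fn                        # the shared user function
        self.storage = storage              # remote storage stub
        self.inflight = {}                  # key -> pending fetch task

    async def _coalesced_get(self, key):
        """All concurrent invocations reading `key` await one fetch."""
        if key not in self.inflight:
            self.inflight[key] = asyncio.create_task(self.storage(key))
        try:
            return await self.inflight[key]
        finally:
            self.inflight.pop(key, None)    # first finisher cleans up

    async def invoke(self, request):
        payload = await self._coalesced_get(request["key"])
        return self.fn(payload, request)

async def main():
    async def storage(key):                 # stand-in for a remote store
        await asyncio.sleep(0.05)           # simulated network latency
        return f"blob:{key}"

    box = MXContainer(lambda blob, req: (req["id"], blob), storage)
    burst = [{"id": i, "key": "model"} for i in range(8)]  # same-key burst
    print(await asyncio.gather(*(box.invoke(r) for r in burst)))

asyncio.run(main())
```

In this run, all eight invocations in the burst complete after a single simulated storage round trip instead of eight, which is the bandwidth saving the coalescing mechanism targets.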
ISBN (print): 9798400700958
Coarse-grained reconfigurable architecture (CGRA) has become a promising candidate for data-intensive computing due to its flexibility and high energy efficiency. CGRA compilers map data flow graphs (DFGs) extracted from applications onto CGRAs, playing a fundamental role in fully exploiting hardware resources for acceleration. Yet existing compilers are time-consuming and cannot guarantee optimal results because they must traverse the enormous search spaces created by the spatio-temporal flexibility of CGRA structures and the complexity of DFGs. Inspired by the remarkable progress of reinforcement learning (RL) and Monte-Carlo tree search (MCTS) on real-world problems, we construct a compiler that can learn from past experience and comprehensively understand the target DFG and CGRA. In this paper, we propose MapZero, an architecture-aware compiler for CGRAs based on RL and MCTS: a framework that automatically extracts the characteristics of the DFG and the CGRA hardware and maps operations onto varied CGRA fabrics. We apply a Graph Attention Network to generate adaptive embeddings for DFGs and also model the functionality and interconnection status of the CGRA, aiming to train an RL agent to perform placement and routing intelligently. Experimental results show that MapZero generates superior-quality mappings and reduces compilation time by hundreds of times compared to state-of-the-art methods. MapZero finds high-quality mappings very quickly even when the feasible solution space is small and all other compilers fail. We also demonstrate the scalability and broad applicability of our framework.
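To make the search component concrete, here is a toy Monte-Carlo tree search loop for placing DFG operations onto processing elements. In MapZero the search is guided by a learned Graph Attention Network policy; in this sketch selection is plain UCT, the rollout is a single random expansion, and the `expand`/`reward` functions are placeholder assumptions, not the paper's cost model.

```python
# Toy MCTS for assigning DFG operations to CGRA processing elements.
# All names, the UCT constant, and the reward are assumptions.
import math, random

class Node:
    def __init__(self, placement, parent=None):
        self.placement = placement          # PE chosen for ops 0..k-1
        self.parent, self.children = parent, []
        self.visits, self.value = 0, 0.0

    def uct(self, c=1.4):
        if self.visits == 0:
            return float("inf")             # explore unvisited first
        return (self.value / self.visits +
                c * math.sqrt(math.log(self.parent.visits) / self.visits))

def mcts(root, expand, reward, iters=500):
    for _ in range(iters):
        node = root
        while node.children:                # 1. select by UCT
            node = max(node.children, key=Node.uct)
        for state in expand(node.placement):  # 2. expand
            node.children.append(Node(state, node))
        leaf = random.choice(node.children) if node.children else node
        r = reward(leaf.placement)          # 3. evaluate
        while leaf:                         # 4. backpropagate
            leaf.visits += 1
            leaf.value += r
            leaf = leaf.parent
    return max(root.children, key=lambda n: n.visits).placement

# Demo: place 3 ops on 4 PEs, rewarding consecutive ops on nearby PEs.
OPS, PES = 3, 4
expand = lambda p: [p + (pe,) for pe in range(PES)] if len(p) < OPS else []
reward = lambda p: -sum(abs(a - b) for a, b in zip(p, p[1:]))
print("best first placement:", mcts(Node(()), expand, reward))
```

Replacing the random rollout and hand-written reward with learned models is, at a high level, what lets an approach like MapZero prune the search space that exhaustive compilers must traverse.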