Various Internet of Things (IoT) devices bring great convenience to users. However, their privacy protection and overall performance still need improvement because of their limited storage and computation capabilities. Cloud ...
Compared with traditional relational database management systems (RDBMS), specialized graph databases (GDBs) can store and process graph data efficiently in both time and space. Hence, domains like social networks often use GDBs for data...
ISBN:
(Print) 9798400700958
Tensor-train (TT) decomposition enables ultra-high compression ratios, making deep neural network (DNN) accelerators based on this method very attractive. TIE, the state-of-the-art TT-based DNN accelerator, achieved high performance by leveraging a compact inference scheme to remove unnecessary computation and memory access. However, TIE increases memory costs for stage-wise intermediate results and additional intra-layer data transfer, leading to limited speedups even when the models are highly compressed. To unleash the full potential of TT decomposition, this paper proposes ETTE, an algorithm and hardware co-optimization framework for an Efficient Tensor-Train Engine. At the algorithm level, ETTE proposes new tensor core construction and computation ordering mechanisms to reduce stage-wise computation and storage costs at the same time. At the hardware level, ETTE proposes a lookahead-style across-stage processing scheme to eliminate unnecessary stage-wise data movement. By fully leveraging the decoupled input and output dimension factors, ETTE develops a low-cost, partition-free memory access scheme to efficiently support the desired matrix transformations. We demonstrate the effectiveness of ETTE by implementing a 16-PE hardware prototype in 28 nm CMOS technology. Compared with a GPU on various workloads, ETTE achieves 6.5x - 253.1x higher throughput and 189.2x - 9750.5x higher energy efficiency. Compared with the state-of-the-art DNN accelerators, ETTE brings 1.1x - 58.3x, 2.6x - 1170.4x, and 1.8x - 2098.2x improvements in throughput, energy efficiency, and area efficiency, respectively.
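To make the compression argument concrete, the sketch below (plain NumPy; the 256x256 layer shape, the factorizations, and the TT-ranks are illustrative assumptions, not ETTE's actual configuration) factorizes a fully connected layer into four TT cores and applies it without ever materializing the dense weight matrix:

import numpy as np

# Illustrative shapes: a 256x256 dense layer with input/output factors (4,4,4,4)
# and TT-ranks (1,8,8,8,1); these are assumptions for the sketch only.
m = n = (4, 4, 4, 4)                       # prod(m) = prod(n) = 256
r = (1, 8, 8, 8, 1)
rng = np.random.default_rng(0)
cores = [rng.standard_normal((r[k], m[k], n[k], r[k + 1])) for k in range(4)]

# Dense layer: 256*256 = 65,536 parameters; TT cores: 2,304 parameters (~28x fewer).
print(sum(c.size for c in cores))

x = rng.standard_normal(256)

# Apply the layer core by core, never forming the full 256x256 matrix.
y = np.einsum('piea,ajfb,bkgc,clhq,ijkl->efgh',
              *cores, x.reshape(m)).reshape(-1)

# Sanity check against the explicitly reconstructed dense matrix.
W = np.einsum('piea,ajfb,bkgc,clhq->ijklefgh', *cores).reshape(256, 256)
assert np.allclose(y, W.T @ x)

Roughly speaking, the stage-wise intermediate tensors produced by such a core-by-core evaluation are what TIE has to buffer, and what ETTE's reordered core construction and lookahead across-stage processing aim to shrink and keep moving.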
ISBN:
(Print) 9798350381603
Seismic data contains valuable information about the Earth's subsurface, which is useful in oil and gas (O&G) exploration. Seismic attributes are derived from seismic data to highlight relevant data structures and properties, improving geological or geophysical data interpretation. However, when calculated on large datasets, quite common in the O&G industry, these attributes may be computationally expensive in terms of computing power and memory capacity. Deep learning techniques can reduce these costs by avoiding direct attribute calculation. Some of these techniques may, however, be too complex, require large volumes of training data, and demand large computational capacity. This work shows that a conventional U-Net Convolutional Neural Network (CNN) model, with 31 million parameters, can be used to compute diverse seismic attributes directly from seismic data. The F3 dataset and attributes calculated on it were employed to train the models, each specialized in a specific attribute. The trained CNN models yield low prediction errors for most of the tested attributes. These results show that simple CNN models are able to infer seismic attributes with high accuracy.
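As a rough illustration of the approach (a minimal PyTorch sketch with random stand-in tensors; the paper's model is a conventional 31M-parameter U-Net trained on F3 patches, and the layer sizes below are assumptions), a small encoder-decoder with a skip connection can be trained to regress one attribute patch from an amplitude patch:

import torch
import torch.nn as nn

def block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU())

class TinyUNet(nn.Module):
    """Two-level U-Net-style regressor; far smaller than the paper's model."""
    def __init__(self):
        super().__init__()
        self.enc1, self.enc2 = block(1, 16), block(16, 32)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.dec1 = block(32, 16)        # upsampled features + skip connection
        self.head = nn.Conv2d(16, 1, 1)  # one output channel per attribute

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        d1 = self.dec1(torch.cat([self.up(e2), e1], dim=1))
        return self.head(d1)

model = TinyUNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
seismic = torch.randn(8, 1, 64, 64)      # stand-in amplitude patches
attribute = torch.randn(8, 1, 64, 64)    # stand-in target attribute (e.g. envelope)
loss = nn.functional.mse_loss(model(seismic), attribute)
opt.zero_grad(); loss.backward(); opt.step()

Training one such model per attribute, with the conventionally computed attribute as the regression target, mirrors the paper's one-specialized-model-per-attribute setup.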
ISBN:
(Print) 9798400700958
As FPGAs become ubiquitous compute platforms, existing research has focused on enabling virtualization features to facilitate fine-grained FPGA sharing. We employ an overlay architecture which enables arbitrary, independent user logic to share portions of a single FPGA by dividing the FPGA into independently reconfigurable slots. We then explore scheduling possibilities to effectively time- and space-multiplex the virtualized FPGA by introducing Nimblock. The Nimblock scheduling algorithm balances application priorities and performance degradation to improve response time and reduce deadline violations. Unlike other algorithms, Nimblock explores pre-emption as a scheduling parameter to dynamically change resource allocations, and it automatically allocates resources to enable suitable parallelism for an application without additional user input. In our exploration, we evaluate five scheduling algorithms: a baseline, three existing algorithms, and our novel Nimblock algorithm. We demonstrate system feasibility by realizing the complete system on a Xilinx ZCU106 FPGA and evaluating it on a set of real-world benchmarks. In our results, we achieve up to 5.7x lower average response times compared to a no-sharing, no-virtualization scheduling algorithm and up to 2.1x average response time improvement over competitive scheduling algorithms that support sharing within our virtualization environment. We additionally demonstrate up to 49% fewer deadline violations and up to 2.6x lower tail response times compared to other high-performance algorithms.
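The preemption idea can be illustrated with a toy slot scheduler (pure Python; the App fields, the priority convention, and the eviction rule below are assumptions for illustration, not the published Nimblock algorithm):

import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class App:
    priority: int                                    # lower value = more urgent
    name: str = field(compare=False, default="")
    slots_needed: int = field(compare=False, default=1)

def schedule(free_slots, ready, running):
    """Admit ready apps into free slots; preempt strictly lower-priority
    running apps when a more urgent app cannot otherwise be placed."""
    heapq.heapify(ready)
    while ready:
        app = heapq.heappop(ready)
        if app.slots_needed <= free_slots:
            running.append(app)
            free_slots -= app.slots_needed
            continue
        victims = sorted((a for a in running if a.priority > app.priority),
                         key=lambda a: a.priority, reverse=True)
        if free_slots + sum(v.slots_needed for v in victims) < app.slots_needed:
            heapq.heappush(ready, app)               # cannot place it yet
            break
        while free_slots < app.slots_needed:
            victim = victims.pop(0)
            running.remove(victim)
            heapq.heappush(ready, victim)            # preempted app re-queues
            free_slots += victim.slots_needed
        running.append(app)
        free_slots -= app.slots_needed
    return free_slots, running

# An urgent 2-slot app preempts a low-priority resident when only 1 slot is free.
free, running = schedule(1, [App(1, "vision", 2)], [App(5, "batch-sim", 2)])

A real scheduler would additionally weigh deadlines and the performance degradation caused by evicting or shrinking an application, which is exactly the balance Nimblock's evaluation is about.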
ISBN:
(Print) 9798400700958
Although serverless computing is a popular paradigm, current serverless environments have high overheads. Recently, it has been shown that serverless workloads frequently exhibit bursts of invocations of the same function. Such a pattern is not handled well in current platforms, and supporting it efficiently can speed up serverless execution substantially. In this paper, we target this dominant pattern with a new serverless platform design named MXFaaS. MXFaaS improves function performance by efficiently multiplexing (i.e., sharing) processor cycles, I/O bandwidth, and memory/processor state between concurrently executing invocations of the same function. MXFaaS introduces a new container abstraction called MXContainer. To enable efficient use of processor cycles, an MXContainer carefully helps schedule same-function invocations for minimal response time. To enable efficient use of I/O bandwidth, an MXContainer coalesces remote storage accesses and remote function calls from same-function invocations. Finally, to enable efficient use of memory/processor state, an MXContainer first initializes the state of its container and only later, on demand, spawns a process per function invocation, so that all invocations can share unmodified memory state and hence minimize memory footprint. We implement MXFaaS in two serverless platforms and run diverse serverless benchmarks. With MXFaaS, serverless environments are much more efficient. Compared to a state-of-the-art serverless environment, MXFaaS on average speeds up execution by 5.2x, reduces P99 tail latency by 7.4x, and improves throughput by 4.8x. In addition, it reduces average memory usage by 3.4x.
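The I/O-coalescing part of this design can be sketched in a few lines of asyncio (the fetch_remote stub and the key space are placeholders, not the MXFaaS API): concurrent invocations that request the same remote object share a single in-flight access.

import asyncio

_pending: dict[str, asyncio.Future] = {}     # key -> in-flight remote access

async def fetch_remote(key: str) -> bytes:
    await asyncio.sleep(0.1)                 # stand-in for a remote storage read
    return f"value-of-{key}".encode()

async def coalesced_get(key: str) -> bytes:
    fut = _pending.get(key)
    if fut is None:
        # The first invocation asking for this key issues the real access ...
        fut = asyncio.ensure_future(fetch_remote(key))
        _pending[key] = fut
        fut.add_done_callback(lambda _f: _pending.pop(key, None))
    # ... and every concurrent invocation of the same function awaits it.
    return await fut

async def main():
    # 100 concurrent invocations touch the same object; only one read is issued.
    results = await asyncio.gather(*(coalesced_get("user/42") for _ in range(100)))
    assert len(set(results)) == 1

asyncio.run(main())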
ISBN:
(Print) 9798400700958
Coarse-grained reconfigurable architecture (CGRA) has become a promising candidate for data-intensive computing due to its flexibility and high energy efficiency. CGRA compilers map data flow graphs (DFGs) extracted from applications onto CGRAs, playing a fundamental role in fully exploiting hardware resources for acceleration. Yet existing compilers are time-consuming and cannot guarantee optimal results due to the traversal of enormous search spaces brought about by the spatio-temporal flexibility of CGRA structures and the complexity of DFGs. Inspired by the remarkable progress of reinforcement learning (RL) and Monte-Carlo tree search (MCTS) on real-world problems, we consider constructing a compiler that can learn from past experiences and comprehensively understand the target DFG and CGRA. In this paper, we propose an architecture-aware compiler for CGRAs based on RL and MCTS, called MapZero - a framework to automatically extract the characteristics of the DFG and CGRA hardware and map operations onto varied CGRA fabrics. We apply a Graph Attention Network to generate adaptive embeddings for DFGs and also model the functionality and interconnection status of the CGRA, aiming to train an RL agent to perform placement and routing intelligently. Experimental results show that MapZero generates superior-quality mappings and reduces compilation time by hundreds of times compared to state-of-the-art methods. MapZero can find high-quality mappings very quickly even when the feasible solution space is so small that all other compilers fail. We also demonstrate the scalability and broad applicability of our framework.
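As a hint of how the DFG side could be embedded (a single-head attention layer written from scratch in PyTorch; this is an illustrative stand-in, not MapZero's actual Graph Attention Network or state encoding), each operation node attends over its neighbors to produce the features an RL placement agent would consume:

import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttention(nn.Module):
    """Single-head graph attention layer producing per-operation embeddings."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.W = nn.Linear(d_in, d_out, bias=False)
        self.a = nn.Linear(2 * d_out, 1, bias=False)

    def forward(self, x, adj):
        # x: (N, d_in) node features; adj: (N, N) adjacency with self-loops
        h = self.W(x)
        n = h.size(0)
        pair = torch.cat([h.unsqueeze(1).expand(n, n, -1),
                          h.unsqueeze(0).expand(n, n, -1)], dim=-1)
        e = F.leaky_relu(self.a(pair).squeeze(-1))          # attention logits
        e = e.masked_fill(adj == 0, float('-inf'))
        return torch.softmax(e, dim=-1) @ h                 # neighbor-weighted sum

# Toy 4-operation DFG: edges 0->1, 1->2, 1->3, plus self-loops.
adj = torch.eye(4) + torch.tensor([[0., 1, 0, 0],
                                   [0, 0, 1, 1],
                                   [0, 0, 0, 0],
                                   [0, 0, 0, 0]])
feat = torch.randn(4, 8)                    # per-op features (opcode, degree, ...)
emb = GraphAttention(8, 16)(feat, adj)      # (4, 16) embeddings for the agent

In the paper, such embeddings are paired with a model of the CGRA's functional units and interconnect state so that the agent, guided by MCTS, can score candidate placement and routing moves.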
Compute nodes in modern HPC systems are growing in size and their hardware has become ever more diverse. Still, many HPC centers allocate the resources of full nodes exclusively to avoid contention, despite the associ...
ISBN:
(Print) 9798400700958
Microservices are emerging as a popular cloud-computing paradigm. Microservice environments execute typically-short service requests that interact with one another via remote procedure calls (often across machines) and are subject to stringent tail-latency constraints. In contrast, current processors are designed for traditional monolithic applications. They support global hardware cache coherence, provide large caches, incorporate microarchitecture for long-running, predictable applications (such as advanced prefetching), and are optimized to minimize average latency rather than tail latency. To address this imbalance, this paper proposes µManycore, an architecture optimized for cloud-native microservice environments. Based on a characterization of microservice applications, µManycore is designed to minimize unnecessary microarchitecture and mitigate overheads to reduce tail latency. Indeed, rather than supporting manycore-wide hardware cache coherence, µManycore has multiple small hardware cache-coherent domains, called villages. Clusters of villages are interconnected with an on-package leaf-spine network, which has many redundant, low-hop-count paths between clusters. To minimize latency overheads, µManycore schedules and queues service requests in hardware and includes hardware support to save and restore process state when performing a context switch. Our simulation-based results show that µManycore delivers high performance. A cluster of 10 servers with a 1024-core µManycore in each server delivers 3.7x lower average latency, 15.5x higher throughput, and, importantly, 10.4x lower tail latency than a cluster with iso-power conventional server-class multicores. Similarly good results are attained compared to a cluster with power-hungry iso-area conventional server-class multicores.
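A back-of-the-envelope way to see the redundancy a leaf-spine fabric provides (the topology sizes below are illustrative assumptions, not µManycore's exact configuration): with every leaf switch connected to every spine switch, any two clusters of villages are joined by one two-hop route per spine.

import itertools

def leaf_spine_paths(num_leaves: int, num_spines: int):
    """Return {(leaf_a, leaf_b): all 2-hop routes, one through each spine}."""
    return {(a, b): [(f"leaf{a}", f"spine{s}", f"leaf{b}")
                     for s in range(num_spines)]
            for a, b in itertools.combinations(range(num_leaves), 2)}

paths = leaf_spine_paths(num_leaves=8, num_spines=4)
print(len(paths[(0, 1)]), "redundant 2-hop routes between any pair of clusters")

The many equal-length routes keep hop counts low and spread request traffic, which is part of how the design attacks tail latency alongside hardware request queueing and fast context save/restore.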
ISBN:
(Print) 9798350384147
The proceedings contain 16 papers. The topics discussed include: a low-power and real-time neural-rendering dense SLAM processor with 3-level hierarchical sparsity exploitation; reinforcement learning hardware accelerator using cache-based memoization for optimized Q-table selection; branch divergence-aware flexible approximating technique on GPUs; a 22 nm 10 TOPS mixed-precision neural network SoC for image processing with energy-efficient dilated convolution support; bit-separable radix-4 booth multiplier for power-efficient CNN accelerator; a microservice scheduler for heterogeneous resources on the edge-cloud computing continuum; power-efficient acceleration of GCNs on coarse-grained linear arrays; and MRCA: multi-grained reconfigurable cryptographic accelerator for diverse security requirements.