Search Results - Inner Mongolia University Library

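Help

Field codes:

T=title (book title, article title), A=author (responsible party), K=subject term, P=publication name, PU=publisher name, O=organization (author affiliation, degree-granting institution, patent applicant), L=Chinese Library Classification number, C=discipline classification number, U=all fields, Y=year (year of publication, degree year, standard release year)

Search rules:

AND means "and"; OR means "or"; NOT means "does not contain". Operators must be uppercase, with one space on each side.

Search examples:

Example 1: (K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016
Example 2: P=计算机应用与软件 AND (U=C++ OR U=Basic) NOT K=Visual AND Y=2011-2016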
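
The rules above can be illustrated with a small string-building sketch. It is not part of the library's site: the helper names `term`, `group`, and `combine` are hypothetical, and the code only assembles an expression string that follows the stated conventions (known field codes, uppercase operators, one space on each side). It reproduces the first example above.

```python
# Hypothetical helpers (not an API of the library's search system).
# They only build expression strings that follow the rules stated above.

FIELD_CODES = {"T", "A", "K", "P", "PU", "O", "L", "C", "U", "Y"}
OPERATORS = {"AND", "OR", "NOT"}


def term(field: str, value: str) -> str:
    """Return a single FIELD=value term, rejecting unknown field codes."""
    if field not in FIELD_CODES:
        raise ValueError(f"unknown field code: {field!r}")
    return f"{field}={value}"


def group(expression: str) -> str:
    """Wrap an expression in parentheses."""
    return f"({expression})"


def combine(operator: str, *parts: str) -> str:
    """Join parts with an uppercase operator, one space on each side."""
    if operator not in OPERATORS:
        raise ValueError(f"operator must be one of {sorted(OPERATORS)}")
    return f" {operator} ".join(parts)


# Rebuild the first example from the help text.
example_one = combine(
    "AND",
    group(combine("OR", term("K", "图书馆学"), term("K", "情报学"))),
    term("A", "范并思"),
    term("Y", "1982-2016"),
)
print(example_one)
```

Passing a lowercase operator such as `"and"` raises `ValueError`, which mirrors the requirement that operators be written in uppercase.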

Refine search results

Document type

16,569 conference papers
127 journal articles
13 books

Collection scope

16,709 electronic documents
0 print holdings

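Discipline classification

8,337 records: Engineering
- 7,321 records: Computer Science and Technology...
- 3,627 records: Software Engineering
- 1,914 records: Electrical Engineering
- 1,256 records: Information and Communication Engineering
- 1,010 records: Electronic Science and Technology (may...
- 554 records: Control Science and Engineering
- 310 records: Power Engineering and Engineering Thermo...
- 222 records: Instrument Science and Technology
- 218 records: Mechanical Engineering
- 212 records: Bioengineering
- 172 records: Cyberspace Security
- 167 records: Optical Engineering
- 153 records: Biomedical Engineering (may be awarded...
- 124 records: Architecture
- 113 records: Materials Science and Engineering (may...
- 112 records: Safety Science and Engineering
- 111 records: Environmental Science and Engineering (may...
- 98 records: Transportation Engineering
1,816 records: Science
- 1,120 records: Mathematics
- 354 records: Physics
- 259 records: Systems Science
- 240 records: Biology
- 196 records: Statistics (may be awarded in Science, ...
- 119 records: Chemistry
1,329 records: Management
- 1,046 records: Management Science and Engineering (may...
- 497 records: Business Administration
- 382 records: Library, Information and Archives Manage...
148 records: Economics
- 147 records: Applied Economics
140 records: Medicine
- 111 records: Clinical Medicine
107 records: Law
- 92 records: Sociology
58 records: Agriculture
29 records: Education
22 records: Literature
8 records: Military Science
3 records: Art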

Subject

5,426 records: computer archite...
2,015 records: hardware
1,287 records: high performance...
1,175 records: computational mo...
978 records: application soft...
928 records: parallel process...
895 records: concurrent compu...
892 records: computer science
798 records: bandwidth
703 records: throughput
669 records: delay
613 records: field programmab...
582 records: distributed comp...
572 records: costs
530 records: computer network...
511 records: scalability
496 records: cloud computing
455 records: runtime
441 records: kernel
436 records: grid computing

Institution

78 records: university of ch...
37 records: school of comput...
36 records: ibm thomas j. wa...
36 records: mathematics and ...
32 records: college of compu...
29 records: college of compu...
28 records: georgia inst tec...
28 records: department of co...
28 records: state key labora...
28 records: institute of com...
27 records: tsinghua univers...
26 records: school of comput...
26 records: department of co...
23 records: school of comput...
21 records: univ chinese aca...
21 records: mathematics and ...
20 records: intel corp santa...
19 records: georgia institut...
19 records: oak ridge nation...
19 records: barcelona superc...

Author

31 records: dhabaleswar k. p...
17 records: wayne luk
17 records: dongarra jack
17 records: hwu wen-mei w.
17 records: yan solihin
17 records: mateo valero
17 records: nam sung kim
17 records: hari subramoni
16 records: jason cong
16 records: ninghui sun
16 records: onur mutlu
16 records: navaux philippe ...
16 records: dally william j.
16 records: chong frederic t...
15 records: wang lei
15 records: yu wang
15 records: zomaya albert y.
15 records: jack dongarra
15 records: kim nam sung
14 records: lei wang

Language

16,537 records: English
116 records: Other
54 records: Chinese
1 record: Spanish
1 record: Portuguese

Search query: "Any field=IEEE International Symposium on Computer Architecture and High Performance Computing"

16,709 records in total; showing records 281-290

Energy-Efficient High-Performance Photonic Backplane Network for Rack-Scale Computing Systems

IEEE Computer Society Annual Symposium on VLSI (ISVLSI)

Authors: Feng, Jun; Chen, Shixi; Zhang, Jiaxu; Fu, Yuxiang; Xu, Jiang
Affiliations: Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China; Hong Kong Univ Sci & Technol, Guangzhou, Peoples R China

ISBN (digital): 9781665466059

ISBN (print): 9781665466059

With the mainstream paradigm in high-performance computer architecture shifting considerably from single-core systems toward multi-core computing systems, system performance and energy efficiency are extensively challenged by the communication capacity among processing units and memories/storage. Emerging silicon photonics, on the other hand, promises exceedingly high bandwidth, low latency, and low energy consumption by enabling optical interconnects between entities. However, the inherent distinction between optical interconnects and electrical ones requires fundamentally different methods to compose a system-level communication network. This paper proposes a systematic approach to optimizing the overall energy-delay product of the photonic backplane network for rack-scale computing systems. The experimental results show that a system using the proposed 2-layer photonic backplane network with floorplan optimization improves system performance by 3x and system performance per energy consumption by around 46%, while maintaining good scalability.

Keywords: high-performance computing systems; rack-scale computing systems; silicon photonics

Genomics-GPU: A Benchmark Suite for GPU-accelerated Genome Analysis

IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)

Authors: Liu, Zhuren; Zhang, Shouzhe; Garrigus, Justin; Zhao, Hui
Affiliation: Univ North Texas, Dept Comp Sci & Engn, Denton, TX 76203 USA

ISBN (print): 9798350397390

Genomic analysis is the study of genes, which includes the identification, measurement, or comparison of genomic features. Genomics research is of great importance to our society because it can be used to detect diseases, create vaccines, and develop drugs and treatments. As a type of general-purpose accelerator with massive parallel processing capability, GPUs have recently been used for genomics analysis. Developing GPU-based hardware and software frameworks for genome analysis is becoming a promising research area. To support this type of research, benchmarks are needed that can feature representative, concurrent, and diverse applications running on GPUs. In this work, we created a benchmark suite called Genomics-GPU, which contains 10 widely-used genomic analysis applications. It covers genome comparison, matching, and clustering for DNAs and RNAs. We also adapted these applications to exploit CUDA Dynamic Parallelism (CDP), a recent advanced feature supporting dynamic GPU programming, to further improve performance. Our benchmark suite can serve as a basis for algorithm optimization and also facilitate GPU architecture development for genomics analysis.

Keywords: genomics; bioinformatics; benchmarking; GPU accelerated computing; genome analysis; computer architecture

Mapping quantum algorithms to multi-core quantum computing architectures

56th IEEE International Symposium on Circuits and Systems (ISCAS)

Authors: Ovide, Anabel; Rodrigo, Santiago; Bandic, Medina; Van Someren, Hans; Feld, Sebastian; Abadal, Sergi; Alarcon, Eduard; Almudever, Carmen G.
Affiliations: Univ Politecn Valencia, Valencia, Spain; Tech Univ Catalonia BarcelonaTech, Barcelona, Spain; Delft Univ Technol, Delft, Netherlands

ISBN (print): 9781665451093

Current monolithic quantum computer architectures have limited scalability. One promising approach for scaling them up is to use a modular or multi-core architecture, in which different quantum processors (cores) are connected via quantum and classical links. This new architectural design poses new challenges such as the expensive inter-core communication. To reduce these movements when executing a quantum algorithm, an efficient mapping technique is required. In this paper, a detailed critical discussion of the quantum circuit mapping problem for multi-core quantum computing architectures is provided. In addition, we further explore the performance of a mapping method, which is formulated as a partitioning over time graph problem, by performing an architectural scalability analysis.

Keywords: scalability; quantum computing systems; multicore quantum computers; mapping of quantum algorithms

Geo-Distributed Analytical Streaming Architecture for IoT Platforms

2024 IEEE International Conference on Cluster Computing, CLUSTER 2024

Authors: Hoseiny Farahabady, M. Reza; Zomaya, Albert Y.
Affiliation: The University of Sydney, School of Computer Science, Center for Distributed and High Performance Computing, Sydney NSW 2006, Australia

ISBN (print): 9798350358711

The surge in real-time IoT data introduces scalability and computational challenges, necessitating advanced architectural and technological solutions. Streamed data processing is increasingly adopted across industries to enhance operational efficiency by extracting insights from vast, unstructured datasets. However, complex analytical tasks, such as multi-join queries, often require stateful iterative calculations on high-volume, high-velocity data, which challenges conventional programming models like MapReduce. This paper introduces an architectural model enabling application developers to create intricate streaming computational logic within an IoT platform. Our architecture supports scalable applications across distributed edge-tier nodes, particularly for iterative analytical operations on streamed data. We discuss core concepts and a timestamp model (borrowed from the timely data-flow concept) attached to data items circulating between computational blocks, which can execute concurrently on different edge-tier nodes. Additionally, we detail a buffer management mechanism that dynamically adjusts memory size in each computational block on nodes with limited capacity. This mechanism considers application performance requirements and runtime conditions to optimize buffer sizes. Performance evaluation against cloud-tier alternatives confirms the effectiveness of our solution. Experimental results show a significant reduction in p-99 delay compared to cloud-tier deployment with a database engine for analytical applications involving multi-join operations. © 2024 IEEE.

Keywords: MapReduce

Implementation of Radio Wave Propagation using RT Cores and Consideration of Programming Models

37th IEEE International Parallel and Distributed Processing Symposium (IPDPS)

Authors: Hashinoki, Shinya; Ohshima, Satoshi; Katagiri, Takahiro; Nagai, Torn; Hoshino, Tetsuya
Affiliations: Nagoya Univ, Grad Sch Informat, Nagoya, Aichi, Japan; Nagoya Univ, Informat Technol Ctr, Nagoya, Aichi, Japan

ISBN (print): 9798350311990

With the NVIDIA Turing architecture generation, several NVIDIA graphics processing units (GPUs) have introduced ray tracing acceleration hardware (RT cores). Ray tracing processing can be regarded as a simulation of wave and particle propagation, collision, and reflection. Therefore, it is expected to be applied to computational science and high-performance computing. However, few studies have been conducted using RT cores. The purpose of this research is to demonstrate the use of RT cores in the scientific and technical computing fields. We implemented a radio wave propagation loss calculation with the programmable ray tracing application framework OptiX and evaluated its performance. Furthermore, we investigated the challenges of reducing the description of framework-specific settings and the needs of hardware allocation. In the simple two-sphere experiment, the RT core implementation showed the highest performance. Moreover, the acceleration scaled superlinearly between (10000, 5000) and (20000, 10000). In the experiment with a sphere and planes, the performance achieved by the RT cores was up to approximately 390 times higher than the parallel execution of the BVH search algorithm. We also proved that a large number of RT cores yielded higher performance. In the open data problem space experiment, we evaluated various GPUs and revealed that a larger number of RT cores is effective. These results show that RT cores are sufficiently effective for radio propagation calculations with an adequate number of ray projections. Through this research, we contributed to the use of RT cores in computational science by proposing an implementation method for ray tracing applications and revealing the effects of RT cores in radio wave propagation loss calculations.

Keywords: high-performance computing; ray tracing; RT core; radio wave propagation

High Performance Optimized and Unified Bus Monitor for Distributed Avionics Architecture Validation under Hardware-in-Loop Simulation Framework

12th International Conference on Current Trends in Advanced Computing, ICCTAC 2024

Authors: Karvande, Rajesh Shankar; Madhavi, Tatineni
Affiliations: 'f' RCI, DRDO, TS, Hyderabad, India; EECE, GITAM, TS, Hyderabad, India

ISBN (print): 9798350395808

The avionics systems of the aerospace and defense industry are designed with a complex distributed architecture. Every system is intelligent and processes data on its own. The subsystems communicate with the onboard computer, which is the main unit of the avionics system. This data exchange is based mostly on serial communication with a maximum rate of 1 Mbps. Many data communication protocols are available; among them, the most popular is MIL-STD-1553 because of its dual-redundant and simple architecture. The data traffic on the MIL-STD-1553 bus is monitored and recorded by the Bus Monitor. The configuration was one Bus Monitor per communication node. In a multi-mode avionics configuration, more than one Bus Monitor must be integrated. These Bus Monitors are configured separately, with no synchronization among them. A single-computer-based, multi-node optimized Bus Monitor for distributed architectures has been designed and developed. The new design and development process of the optimized multi-mode Bus Monitor on a real-time Linux platform is explained in this paper. Further, a unified Bus Monitor for the configurations of a number of aerospace projects has been developed and deployed on a high-performance processor architecture with an optimized embedded computer. The development has been done for a Hardware-in-Loop Simulation facility that is used for validation of avionics software. © 2024 IEEE.

Keywords: Buses

Application Defined On-chip Networks for Heterogeneous Chiplets: An Implementation Perspective

28th Annual IEEE International Symposium on High-Performance Computer Architecture (HPCA)

Authors: Wang, Tianqi; Feng, Fan; Xiang, Shaolin; Li, Qi; Xia, Jing
Affiliation: Huawei, Shenzhen, Peoples R China

ISBN (print): 9781665420273

With the help of advanced packaging technologies to integrate multiple chips (e.g., CPU, AI, IO), a chiplet-based SoC design process can enable fast system construction. However, designing the network-on-chip used within individual chiplets and across chiplets is extremely challenging. We introduce the design process and methodology of a bufferless multi-ring NoC for heterogeneous chiplet-based SoCs. Our design is portable and can be used in diverse scenarios, such as Server-CPU, AI-Processor, and Baseband-Processor. The co-design of the application, architecture, and implementation is the key to making the system power-efficient and high-performance. We determined many architectural design choices by reflecting an analysis of a set of target applications by application teams and several physical implementation constraints provided by development teams. In this paper, we present the pragmatic practice of our co-design effort for the NoC. As a result, the system has been proven to achieve 16TB/s bandwidth in an AI processor and low latency in a server CPU with nearly one hundred cores.

Keywords: Design methodology; Computer architecture; Bandwidth; Network-on-chip; Packaging; Energy efficiency; Servers

Performance Analysis and Optimization of Nvidia H100 Confidential Computing for AI Workloads

22nd IEEE International Symposium on Parallel and Distributed Processing with Applications, ISPA 2024

Authors: Tan, Yifan; Mi, Zeyu
Affiliation: Institute of Parallel and Distributed Systems, SEIEE, Shanghai Jiao Tong University, China

ISBN (print): 9798331509712

NVIDIA's H100 Confidential Computing (CC) counters the security hazards inherent in cloud AI workloads. It enforces data encryption to achieve data confidentiality, which leads to substantial throughput reductions as high as 93% in various AI workloads (such as TensorRT, PEFT and vLLM). Confronting this substantial overhead issue, we first delve into the underlying causes through meticulous analysis. This groundwork enables us to devise an innovative runtime system that operates seamlessly in the background, completely transparent to end-users. The cornerstone of our system lies in leveraging multiple encryption workers. Experiments demonstrate that our solution effectively reduces the throughput drop to less than 28.1%. © 2024 IEEE.

Keywords: Confidential computing; GPU; H100; LLM

PIMCloud: QoS-Aware Resource Management of Latency-Critical Applications in Clouds with Processing-in-Memory

28th Annual IEEE International Symposium on High-Performance Computer Architecture (HPCA)

Authors: Chen, Shuang; Jiang, Yi; Delimitrou, Christina; Martinez, Jose F.
Affiliation: Cornell Univ, Comp Syst Lab, Ithaca, NY 14850 USA

ISBN (print): 9781665420273

The slowdown of Moore's Law, combined with advances in 3D stacking of logic and memory, has pushed architects to revisit the concept of processing-in-memory (PIM) to overcome the memory wall bottleneck. This PIM renaissance finds itself in a very different computing landscape from the one twenty years ago, as more and more computation shifts to the cloud. Most PIM architecture papers still focus on best-effort applications, while PIM's impact on latency-critical cloud applications is not well understood. This paper explores how datacenters can exploit PIM architectures in the context of latency-critical applications. We adopt a general-purpose cloud server with HBM-based, 3D stacked logic+memory modules, and study the impact of PIM on six diverse interactive cloud applications. We reveal the previously neglected opportunity that PIM presents to these services, and show the importance of properly managing PIM-related resources to meet the QoS targets of interactive services and maximize resource efficiency. Then, we present PIMCloud, a QoS-aware resource manager designed for cloud systems with PIM, allowing colocation of multiple latency-critical and best-effort applications. We show that PIMCloud efficiently manages PIM resources: it (1) improves effective machine utilization by up to 70% and 85% (average 24% and 33%) under 2-app and 3-app mixes, compared to the best state-of-the-art manager; (2) helps latency-critical applications meet QoS; and (3) adapts to varying load patterns.

Keywords: Three-dimensional displays; Stacking; Quality of service; Computer architecture; Resource management; Servers

AFS: Accurate, Fast, and Scalable Error-Decoding for Fault-Tolerant Quantum Computers

28th Annual IEEE International Symposium on High-Performance Computer Architecture (HPCA)

Authors: Das, Poulami; Pattison, Christopher A.; Manne, Srilatha; Carmean, Douglas M.; Svore, Krysta M.; Qureshi, Moinuddin; Delfosse, Nicolas
Affiliations: Georgia Inst Technol, Atlanta, GA 30332 USA; CALTECH, Pasadena, CA 91125 USA; Microsoft Quantum, Redmond, WA USA; Microsoft Res, Redmond, WA USA

ISBN (print): 9781665420273

Quantum computers promise computational advantages for many important problems across various application domains. Unfortunately, physical quantum devices are highly susceptible to errors that limit us from running most of these quantum applications. Quantum Error Correction (QEC) codes are required to implement Fault-Tolerant Quantum Computers (FTQC) on which computations can be performed without encountering errors. Error decoding is a critical component of quantum error correction and is responsible for transforming a set of qubit measurements generated by the QEC code, called the syndrome, into error locations and error types. For the feasibility of implementation, error decoders must not only identify errors with high accuracy, but also be fast and scalable to a large number of qubits. Unfortunately, most prior work on error decoding has focused primarily on accuracy and has relied on software implementations that are too slow to be of practical use. Furthermore, these studies only look at designing a single decoder and do not analyze the challenges involved in scaling the storage and bandwidth requirements when performing error correction in large systems with thousands of qubits. In this paper, we present AFS, an accurate, fast, and scalable decoder architecture that is designed to operate in the context of systems with hundreds of logical qubits. We present the hardware implementation of AFS, which is based on the Union Find decoding algorithm and employs a three-stage pipelined design. AFS provides orders of magnitude higher accuracy compared to recent SFQ-based hardware decoders (logical error rate of 6 x 10^-10 for physical error rate of 10^-3) and low decoding latency (42ns on average), while being robust to measurement errors introduced while extracting syndromes during the QEC cycles. We also reduce the amount of decoding hardware required to perform QEC simultaneously on all the logical qubits by co-designing the micro-architectur

Keywords: Quantum computing; Quantum error correction; Fault-tolerant quantum computing; Decoding; Union-Find decoding; Surface codes

235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region. Postal code: 010021