ISBN: (Print) 9781467395243
The proceedings contain 50 papers. The topics discussed include: performance improvement via always-abort HTM; DRUT: an efficient turbo boost solution via load balancing in decoupled look-ahead architecture; proxy benchmarks for emerging big-data workloads; lightweight provenance service for high-performance computing; application-driven near-data processing for similarity search; improving datacenter efficiency through partitioning-aware scheduling; the liberation day of nondeterministic programs; location-aware computation mapping for manycore processors; DaQueue: a data-aware work-queue design for GPGPUs; accelerate GPU concurrent kernel execution by mitigating memory pipeline stalls; design space exploration for performance optimization of deep neural networks on shared memory accelerators; cutting the fat: speeding up RBM for fast deep learning through generalized redundancy elimination; statement reordering to alleviate register pressure for stencils on GPUs; NUMA-aware power management for chip multiprocessors; elastic reconfiguration for heterogeneous NoCs with BiNoCHS; Leeway: addressing variability in dead-block prediction for last-level caches; application clustering policies to address system fairness with Intel's cache allocation technology; Graphie: large-scale asynchronous graph traversals on just a GPU; weak memory models: balancing definitional simplicity and implementation flexibility; exploiting asymmetric SIMD register configurations in ARM-to-x86 dynamic binary translation; and a generalized framework for automatic scripting language parallelization.
ISBN: (Print) 9798400708435
As deep neural networks (DNNs) become increasingly large and complicated, pruning techniques have been proposed to lower their memory footprint and make inference more efficient. The most critical kernel for executing pruned sparse DNNs on GPUs is Sparse-dense Matrix Multiplication (SpMM). Although recent tensor compilers can generate high-performance SpMM code, they often spend a long time iteratively searching candidate configurations, which slows down the cycle of exploring better DNN architectures or pruning algorithms. In this paper, we propose EC-SpMM to efficiently generate high-performance SpMM kernels for sparse DNN inference. Based on an analysis of the layout of nonzero elements, a characterization of the GPU architecture, and a rank-based cost model, EC-SpMM effectively reduces the search space and eliminates likely low-performance candidates. Experimental results show that EC-SpMM reduces compilation time by a factor of 35x, while the performance of the generated SpMM kernels is comparable to, or even better than, the state-of-the-art sparse tensor compiling solution.
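The abstract's core operation can be illustrated with a minimal reference sketch of the SpMM kernel being tuned: a sparse matrix in CSR format multiplied by a dense matrix. This is a plain-Python toy for clarity only; the kernels EC-SpMM generates are tiled, compiler-generated GPU code, and the matrix values here are illustrative.

```python
def spmm_csr(indptr, indices, data, B, n_cols):
    """Row-wise CSR SpMM: C = A @ B, where sparse A is (indptr, indices, data)."""
    n_rows = len(indptr) - 1
    C = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        # Iterate only over the nonzeros of row i of A.
        for k in range(indptr[i], indptr[i + 1]):
            j, a = indices[k], data[k]
            for c in range(n_cols):
                C[i][c] += a * B[j][c]
    return C

# A = [[1, 0], [0, 2]] in CSR form; B is a 2x2 dense matrix.
indptr, indices, data = [0, 1, 2], [0, 1], [1.0, 2.0]
B = [[1.0, 2.0], [3.0, 4.0]]
print(spmm_csr(indptr, indices, data, B, 2))  # [[1.0, 2.0], [6.0, 8.0]]
```

The search space a tensor compiler explores for this kernel (tile shapes, loop orders, load vectorization) is what EC-SpMM prunes with its rank-based cost model.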
ISBN: (Print) 9783959772693
The proceedings contain 8 papers. The topics discussed include: ByteNite: a new business model for grid computing; challenges and opportunities in C/C++ source-to-source compilation; Rust-encoded stream ciphers on a RISC-V parallel ultra-low-power processor; an evaluation of the state-of-the-art software and hardware implementations of BIKE; MonTM: monitoring-based thermal management for mixed-criticality systems; dynamic power consumption of the full posit processing unit: analysis and experiments; and adjacent LSTM-based page scheduling for hybrid DRAM/NVM memory systems.
ISBN: (Print) 9798400710797
Hardware development critically depends on cycle-accurate RTL simulation. However, as chip complexity increases, conventional single-threaded simulation becomes impractical due to stagnant single-core performance. PARENDI is an RTL simulator that addresses this challenge by exploiting the abundant fine-grained parallelism inherent in RTL simulation and efficiently mapping it onto the massively parallel Graphcore IPU (Intelligence Processing Unit) architecture. PARENDI scales up to 5,888 cores on 4 Graphcore IPU sockets, allowing large RTL designs to run up to 4x faster than on the most powerful state-of-the-art x64 multicore systems. To achieve this performance, we developed new partitioning and compilation techniques and carefully quantified the synchronization, communication, and computation costs of parallel RTL simulation. The paper comprehensively analyzes these factors and details the strategies that PARENDI uses to optimize them.
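The partitioning problem the abstract mentions can be sketched in miniature: assign fine-grained simulation tasks, each with a per-cycle computation cost, to cores so that the most loaded core is as light as possible. This is a generic longest-processing-time-first greedy heuristic under illustrative task names and costs, not PARENDI's actual partitioner, which also models synchronization and communication.

```python
import heapq

def partition(task_costs, n_cores):
    """Greedy LPT assignment: place each task (heaviest first) on the
    currently least-loaded core. Returns (load, core_id, tasks) triples."""
    cores = [(0, i, []) for i in range(n_cores)]  # (load, core_id, task list)
    heapq.heapify(cores)
    for task, cost in sorted(task_costs.items(), key=lambda t: -t[1]):
        load, i, tasks = heapq.heappop(cores)     # least-loaded core
        tasks.append(task)
        heapq.heappush(cores, (load + cost, i, tasks))
    return list(cores)

# Hypothetical per-cycle costs of RTL processes in a small design.
costs = {"alu": 8, "decode": 5, "fetch": 4, "lsu": 7, "regfile": 3}
for load, i, tasks in sorted(partition(costs, 2), key=lambda c: c[1]):
    print(f"core {i}: load={load} tasks={tasks}")
```

With thousands of IPU cores, the quality of this assignment directly bounds the per-cycle critical path, which is why the paper quantifies computation cost alongside communication and synchronization.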
Neutral atom arrays, particularly the reconfigurable field programmable qubit arrays (FPQA) with atom movement, show strong promise for quantum computing. FPQA has a dynamic qubit connectivity, facilitating cost-effec...
ISBN: (Print) 9798400718021
The proceedings contain 19 papers. The topics discussed include: structures and techniques for streaming dynamic graph processing on decentralized message-driven systems; interference-aware function inlining for code size reduction; the rewriting of DataRaceBench benchmark for OpenCL program validations; support post quantum cryptography with SIMD everywhere on RISC-V architectures; substitution of kernel functions based on pattern matching on schedule trees; fusing depthwise and pointwise convolutions for efficient inference on GPUs; design of a decentralized Web3 access interface; a distributed particle swarm optimization algorithm based on Apache Spark for asynchronous parallel training of deep neural networks; and graph federated learning with center moment constraints for node classification.
ISBN: (Print) 9798350380415; 9798350380408
Photonic computing, known for its high bandwidth and energy efficiency, harnesses physical phenomena in the optical domain to accelerate a wide range of computational operations such as dot products, matrix multiplication, Fourier transforms, 1D convolution, and more. However, this multitude of computational operations poses challenges in mapping realistic neural network workloads onto the underlying photonic hardware. This complexity requires extensive expertise and laborious programming, impeding the practical adoption and deployment of photonic acceleration. To address this gap, we propose an end-to-end compilation framework, the Photonic Compiler Collection (PCC), which automates the mapping of high-level deep neural network (DNN) specifications onto target architectures of photonic-electronic accelerators. Additionally, we present a method to streamline neural network workloads by leveraging the multi-level intermediate representation (MLIR) and compiler optimization techniques targeting photonic-specific patterns. Moreover, we conduct a comprehensive case study illustrating the integration of a typical computational operator, the Mach-Zehnder Interferometer (MZI) mesh, into PCC. Our experimental results demonstrate that PCC achieves up to a 4x speedup on DNN workloads compared with hand-crafted implementations. In summary, our proposed framework offers a practical and automated solution for compiling, optimizing, and flexibly supporting newer photonic device operators. We anticipate that it will significantly accelerate the development and deployment of photonic applications in real-world AI scenarios.
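The pattern-driven lowering described above can be illustrated with a toy rewrite pass: high-level DNN ops are matched against photonic-specific patterns and rewritten to accelerator primitives, while unmatched ops fall back to the electronic host. The op names, pattern table, and `mzi_mesh`/`cpu` dialect prefixes are all hypothetical stand-ins, not PCC's actual MLIR dialects.

```python
# Hypothetical pattern table: high-level op -> photonic primitive.
PHOTONIC_PATTERNS = {
    "matmul": "mzi_mesh.matmul",   # lowered to a Mach-Zehnder interferometer mesh
    "conv1d": "photonic.conv1d",
    "fft":    "photonic.fourier",
}

def lower(ops):
    """Rewrite each (op, args) pair; ops without a photonic pattern stay on the host."""
    return [(PHOTONIC_PATTERNS.get(op, f"cpu.{op}"), args) for op, args in ops]

# A tiny two-layer MLP graph: matmuls go to the MZI mesh, ReLU stays on the CPU.
graph = [("matmul", ("x", "w1")), ("relu", ("h",)), ("matmul", ("h", "w2"))]
print(lower(graph))
```

A real MLIR-based flow would express this as dialect conversion patterns with legality checks rather than a dictionary lookup, but the partitioning decision it makes is the same.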
Within CPSS, traditional blockchain technology encounters scalability and flexibility issues. With its multi-chain structure, parallel blockchain provides an innovative way to address these problems. This paper examin...
In this study, the efficiency of parallel and serial computation techniques for aggregating data from diverse regions is investigated. Parallel computation breaks data into smaller segments and assigns each segment to...
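The scheme described above can be sketched in a few lines: split the data into segments, reduce each segment independently on a worker, then combine the partial results. This is a generic illustration (thread pool, summation as the aggregate), not the study's actual implementation.

```python
from concurrent.futures import ThreadPoolExecutor

def segment_sum(segment):
    """Reduce one segment independently; in the study, one worker per region."""
    return sum(segment)

def parallel_aggregate(data, n_segments=4):
    """Split data into segments, aggregate each in parallel, combine the partials."""
    step = max(1, len(data) // n_segments)
    segments = [data[i:i + step] for i in range(0, len(data), step)]
    with ThreadPoolExecutor(max_workers=n_segments) as pool:
        return sum(pool.map(segment_sum, segments))

print(parallel_aggregate(list(range(100))))  # 4950, same result as serial sum
```

The serial baseline is simply `sum(data)`; the parallel version trades the overhead of splitting and combining for concurrent per-segment work.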