The KNN (k-nearest neighbor) algorithm is an important method that exhibits great performance in many fields. It is a commonly used step in graph convolutional networks (GCN) when graph structure is not available. Howeve...
ISBN: (Print) 9798400708930
The current sharding schemes show some shortcomings, such as poor performance in handling cross-shard transactions and inefficient transaction verification. A sharding blockchain system with output shard Batch processing and Parallel transaction Verification (BPPV-Chain) is proposed in this study. The core idea of the proposed scheme is as follows. The output shard verifies and processes the input availability certificates generated by the input shards in a batch manner, and it generates the transaction availability certificates for the different input shards when handling cross-shard transactions. The input shard unlocks or spends UTXOs according to the transaction availability certificates to complete the cross-shard collaboration. On that basis, the communication complexity of cross-shard transactions is reduced. Moreover, a parallel transaction verification scheme is presented to increase the efficiency of transaction verification. In this scheme, UTXOs are verified serially to prevent double spending, while the signatures and values of multiple transactions are checked in parallel. As indicated by the experimental results, BPPV-Chain outperforms existing sharding blockchain systems, especially when the percentage of cross-shard transactions does not exceed 80%. Furthermore, BPPV-Chain ensures linear growth of throughput as the number of shards increases, confirming its scalability.
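The split between stateful and stateless checks in the parallel verification scheme described above can be illustrated with a short sketch. The data structures and thread-pool approach below are illustrative assumptions, not BPPV-Chain's actual implementation: signature and value checks run in parallel because they touch no shared state, while UTXO spending is applied serially to rule out double spends.

```python
# Hypothetical sketch of serial UTXO checking plus parallel stateless checks;
# names and data structures are illustrative, not taken from BPPV-Chain itself.
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

@dataclass
class Tx:
    tx_id: str
    inputs: list          # identifiers of the UTXOs this transaction spends
    input_value: int
    output_value: int
    signature_ok: bool    # stand-in for a real signature check

def check_stateless(tx: Tx) -> bool:
    """Signature and value checks touch no shared state, so they can run in parallel."""
    return tx.signature_ok and tx.input_value >= tx.output_value

def verify_batch(txs: list, utxo_set: set) -> list:
    """Verify a batch: stateless checks in parallel, UTXO spending serially."""
    with ThreadPoolExecutor() as pool:
        stateless_ok = list(pool.map(check_stateless, txs))

    accepted = []
    for tx, ok in zip(txs, stateless_ok):
        if not ok:
            continue
        # Serial pass over the shared UTXO set prevents double spending.
        if all(u in utxo_set for u in tx.inputs):
            for u in tx.inputs:
                utxo_set.remove(u)
            accepted.append(tx.tx_id)
    return accepted

# Usage: the second transaction tries to double-spend "u1" and is rejected.
utxos = {"u1", "u2"}
batch = [Tx("t1", ["u1"], 10, 9, True), Tx("t2", ["u1"], 10, 8, True)]
print(verify_batch(batch, utxos))   # -> ['t1']
```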
ISBN: (Print) 9783031693434; 9783031693441
Modern organizations place increasing emphasis on production job scheduling in order to keep their performance indicators stable. This research addresses the dynamic scheduling of jobs in a production system with a single input queue and parallel machines. The processing times and the times between job arrivals are assumed to be probabilistic. Jobs belong to different classes, and a due date is assigned to each job at the time of its arrival. A machine needs to be set up every time it switches production from one job class to another. This article considers a set of alternative priority rules for dynamic job scheduling using discrete-event simulation. The priority heuristics are compared with respect to several performance metrics in a series of simulation experiments. The behaviour of the scheduling heuristics is assessed under the influence of various parameters. Moreover, managerial insights for scheduling decisions in industry are offered based on the numerical results.
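A minimal sketch of how such alternative priority rules could be expressed and compared follows. The rule names (FIFO, SPT, EDD, and a setup-avoiding rule) and the job fields are assumptions for illustration; the paper's actual rule set and simulation model may differ.

```python
# Illustrative dispatching by alternative priority rules for a single input queue.
import random
from dataclasses import dataclass

@dataclass
class Job:
    job_class: str
    arrival: float
    processing_time: float
    due_date: float

PRIORITY_RULES = {
    "FIFO": lambda job, last_class: job.arrival,
    "SPT":  lambda job, last_class: job.processing_time,   # shortest processing time
    "EDD":  lambda job, last_class: job.due_date,           # earliest due date
    # Prefer jobs of the class already set up on the machine to avoid a setup.
    "SIMSET": lambda job, last_class: (job.job_class != last_class, job.due_date),
}

def next_job(queue, rule, last_class):
    """Pick the next job from the input queue according to the chosen rule."""
    return min(queue, key=lambda j: PRIORITY_RULES[rule](j, last_class))

# Usage: generate a few jobs with probabilistic attributes and dispatch one per rule.
random.seed(0)
queue = [Job(random.choice("AB"), 0.0, random.expovariate(1.0),
             random.uniform(5, 20)) for _ in range(5)]
for rule in PRIORITY_RULES:
    print(rule, next_job(queue, rule, last_class="A"))
```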
ISBN: (Print) 9798350386783; 9798350386776
The popularity of multicore processors and the rise of High Performance Computing as a Service (HPCaaS) have made parallel programming essential to fully utilize the performance of multicore systems. OpenMP, a widely adopted shared-memory parallel programming model, is favored for its ease of use. However, assisting and accelerating the automation of its parallelization remains challenging. Although existing automation tools such as Cetus and DiscoPoP simplify parallelization, they still have limitations when dealing with complex data dependencies and control flows. Inspired by the success of deep learning in the field of Natural Language Processing (NLP), this study adopts a Transformer-based model to tackle the problem of automatically parallelizing code with OpenMP directives. We propose a novel Transformer-based multimodal model, ParaMP, to improve the accuracy of OpenMP directive classification. The ParaMP model not only takes into account the sequential features of the code text but also incorporates the code's structural features, enriching the model's input features by representing the Abstract Syntax Trees (ASTs) corresponding to the code in the form of binary trees. In addition, we built the BTCode dataset, which contains a large number of C/C++ code snippets and their corresponding simplified AST representations, to provide a basis for model training. Experimental evaluation shows that our model outperforms other existing automated tools and models in key performance metrics such as F1 score and recall. This study demonstrates a significant improvement in the accuracy of OpenMP directive classification achieved by combining the sequential and structural features of code text, providing valuable insight into applying deep learning techniques to programming tasks.
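The abstract does not spell out the exact AST-to-binary-tree transformation ParaMP uses, but one common way to encode an n-ary AST as a binary tree is the left-child/right-sibling scheme sketched below; the node labels and the tiny example AST are illustrative assumptions.

```python
# Left-child/right-sibling encoding of an n-ary AST as a binary tree (sketch).
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ASTNode:
    label: str
    children: List["ASTNode"]

@dataclass
class BinNode:
    label: str
    left: Optional["BinNode"] = None    # first child
    right: Optional["BinNode"] = None   # next sibling

def to_binary(node: ASTNode) -> BinNode:
    b = BinNode(node.label)
    prev = None
    for child in node.children:
        cb = to_binary(child)
        if prev is None:
            b.left = cb           # first child hangs on the left
        else:
            prev.right = cb       # later children chain as right siblings
        prev = cb
    return b

# Usage: a tiny AST for a loop body such as "for (...) a[i] = b[i] + c[i];"
ast = ASTNode("ForStmt", [
    ASTNode("Init", []), ASTNode("Cond", []), ASTNode("Inc", []),
    ASTNode("Assign", [ASTNode("a[i]", []), ASTNode("b[i]+c[i]", [])]),
])
print(to_binary(ast).left.label)   # -> "Init"
```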
Monte Carlo (MC) methods, due to their strong geometric simulation capabilities, comprehensive physical modeling, and minimal simulation approximation, are widely applied in areas such as radiation transport, physical...
ISBN: (Print) 9783031751691; 9783031751707
Adders are essential components of modern digital circuits, and their primary design goal is to achieve high speed. However, power consumption and chip area are also important considerations in modern circuit design. Optimizing digital adder performance plays a crucial role in enhancing the speed of binary operations within complex circuits. Various architectures address the carry propagation bottleneck, each with its own strengths and weaknesses. Choosing the most appropriate architecture depends on the specific application requirements, ensuring optimal performance within the available resource constraints. This paper provides a comprehensive analysis of various adder topologies and their performance characteristics. By carefully considering the trade-offs between delay, power consumption, and area, engineers can choose the optimal architecture for their specific application requirements, leading to significant improvements in digital system performance and efficiency. The analyzed adder topologies include the Ripple Carry Adder (RCA), Carry Lookahead Adder (CLA), Carry Skip Adder (CSK), Carry Select Adder (CSLA), Carry Increment Adder (CIA), Brent-Kung Adder (BKA), and Kogge-Stone Adder. The analysis is conducted using HDL on the Xilinx ISE 14.7 platform.
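The carry propagation bottleneck these topologies attack can be seen in a small bit-level model. The Python sketch below is an illustration, not the HDL used in the paper's evaluation: it contrasts a ripple-carry adder, where each carry waits on the previous one, with a carry-lookahead adder, where generate/propagate terms let every carry be derived from the initial carry.

```python
# Bit-level contrast of ripple-carry vs. carry-lookahead addition (LSB first).
def ripple_carry_add(a_bits, b_bits):
    """Each carry depends on the previous one: delay grows linearly with width."""
    carry, s = 0, []
    for a, b in zip(a_bits, b_bits):
        s.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return s, carry

def carry_lookahead_add(a_bits, b_bits):
    """Generate/propagate terms let all carries be computed from the initial carry."""
    g = [a & b for a, b in zip(a_bits, b_bits)]   # generate
    p = [a ^ b for a, b in zip(a_bits, b_bits)]   # propagate
    carries = [0]
    for i in range(len(a_bits)):
        # c[i+1] = g[i] OR (p[i] AND c[i]); in hardware these expressions are
        # flattened into wide two-level logic rather than evaluated sequentially.
        carries.append(g[i] | (p[i] & carries[i]))
    s = [p[i] ^ carries[i] for i in range(len(a_bits))]
    return s, carries[-1]

# Usage: 4-bit add of 11 + 6 = 17; both adders produce sum [1,0,0,0], carry 1.
a, b = [1, 1, 0, 1], [0, 1, 1, 0]
print(ripple_carry_add(a, b), carry_lookahead_add(a, b))
```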
ISBN: (Print) 9783031562075; 9783031562082
The paper is devoted to an analysis and comparison of the development of new high-performance computers and of the improvement and development of new, more reliable versions of the Danish Eulerian model for the computer study of the transport of air pollutants over Europe and the surrounding areas, the study of some economic and agricultural problems, regional and global climate change, etc.
ISBN: (Print) 9781665469586
In recent years, IoT devices have become widespread, and energy-efficient coarse-grained reconfigurable architectures (CGRAs) have attracted attention. CGRAs comprise several processing units called processing elements (PEs) arranged in a two-dimensional array. The operations of the PEs and the interconnections between them are adaptively changed depending on the target application, which contributes to higher energy efficiency compared to general-purpose processors. The application kernel executed on a CGRA is represented as a data flow graph (DFG), and CGRA compilers are responsible for mapping the DFG onto the PE array. Thus, mapping algorithms significantly influence the performance and power efficiency of CGRAs as well as the compile time. This paper proposes POCOCO, a compiler framework for CGRAs that can use pre-optimized subgraph mappings, which reduces the compiler optimization task. To leverage the subgraph mappings, we extend an existing mapping method based on a genetic algorithm. Experiments on three architectures demonstrated that the proposed method reduces the optimization time by 48% on average for the best case of the three architectures.
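One way pre-optimized subgraph mappings could be fed into a genetic-algorithm mapper is to seed the initial population with library placements, as in the hypothetical sketch below; the library contents, matching rule, and array dimensions are illustrative assumptions, not the actual POCOCO implementation.

```python
# Hypothetical seeding of a GA population for DFG-to-PE-array mapping using a
# library of pre-optimized subgraph placements (illustration only).
import random

SUBGRAPH_LIBRARY = {
    # frozenset of DFG node labels -> pre-optimized relative PE placement
    frozenset({"load", "mul", "add"}): {"load": (0, 0), "mul": (0, 1), "add": (1, 1)},
}

def seed_population(dfg_nodes, rows, cols, size=20):
    """Build an initial GA population, reusing library placements when they match."""
    population = []
    for _ in range(size):
        individual = {}
        for subgraph, placement in SUBGRAPH_LIBRARY.items():
            if subgraph <= set(dfg_nodes):
                # Drop the pre-optimized placement at a random offset on the array.
                dr, dc = random.randrange(rows), random.randrange(cols)
                for node, (r, c) in placement.items():
                    individual[node] = ((r + dr) % rows, (c + dc) % cols)
        # Remaining nodes get random PEs; crossover/mutation would refine them.
        for node in dfg_nodes:
            individual.setdefault(node, (random.randrange(rows), random.randrange(cols)))
        population.append(individual)
    return population

# Usage: seed a small population for a four-node DFG on a 4x4 PE array.
print(seed_population(["load", "mul", "add", "store"], rows=4, cols=4, size=2))
```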
ISBN: (Print) 9798350337662
CPU-based inference can be deployed as an alternative to off-chip accelerators. In this context, emerging vector architectures are a promising option owing to their high efficiency. Yet the large design space of convolutional algorithms and hardware implementations makes the selection of design options challenging. In this paper, we present our ongoing research into co-designing future vector architectures for CPU-based Convolutional Neural Network (CNN) inference, focusing on the im2col+GEMM and Winograd kernels. Using the Gem5 simulator, we explore the impact of several hardware microarchitectural features, including (i) vector lanes, (ii) vector lengths, (iii) cache sizes, and (iv) options for integrating the vector unit into the CPU pipeline. In the context of im2col+GEMM, we study the impact of several BLIS-like algorithmic optimizations, such as (1) utilization of vector registers, (2) loop unrolling, (3) loop reordering, (4) manual vectorization, (5) prefetching, and (6) packing of matrices, on the RISC-V Vector Extension and ARM-SVE ISAs. We use the YOLOv3 and VGG16 network models for our evaluation. Our co-design study shows that BLIS-like optimizations are not beneficial to all types of vector microarchitectures. We additionally demonstrate that longer vector lengths (of at least 8192 bits) and larger caches (of 256 MB) can boost performance by 5x with our optimized CNN kernels, compared to a 512-bit vector length and 1 MB of L2 cache. In the context of Winograd, we present our novel approach of inter-tile parallelization across the input/output channels, using 8x8 tiles per channel to vectorize the algorithm on vector-length-agnostic (VLA) architectures. Our method exploits longer vector lengths and offers high memory reuse, resulting in performance improvements of up to 2.4x for non-strided convolutional layers with 3x3 kernel size, compared to our optimized im2col+GEMM approach on the Fujitsu A64FX processor. Our co-design study furthermore reveals that W
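To make the im2col+GEMM formulation referenced above concrete, the NumPy sketch below shows how a convolution is unfolded into a single matrix multiplication; shapes and parameter names are illustrative, and this is not the paper's vectorized kernel.

```python
# Minimal im2col + GEMM formulation of a 2-D convolution (single image, no padding).
import numpy as np

def im2col(x, kh, kw, stride=1):
    """Unfold a (C, H, W) input into a (C*kh*kw, out_h*out_w) matrix of patches."""
    c, h, w = x.shape
    out_h = (h - kh) // stride + 1
    out_w = (w - kw) // stride + 1
    cols = np.empty((c * kh * kw, out_h * out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            patch = x[:, i*stride:i*stride+kh, j*stride:j*stride+kw]
            cols[:, i * out_w + j] = patch.reshape(-1)
    return cols, out_h, out_w

def conv2d_im2col(x, weights, stride=1):
    """Convolution as GEMM: (out_ch, C*kh*kw) @ (C*kh*kw, out_h*out_w)."""
    out_ch, c, kh, kw = weights.shape
    cols, out_h, out_w = im2col(x, kh, kw, stride)
    return (weights.reshape(out_ch, -1) @ cols).reshape(out_ch, out_h, out_w)

# Usage: 3x3 kernels over a small 3-channel input, as in a YOLOv3/VGG16-style layer.
x = np.random.rand(3, 8, 8).astype(np.float32)
w = np.random.rand(16, 3, 3, 3).astype(np.float32)
print(conv2d_im2col(x, w).shape)   # -> (16, 6, 6)
```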
This research work presents a decentralized parallel blockchain-based agricultural product traceability system, which aims to enhance information security and data model efficiency, and achieve real-time tracking of a...