Search Results - Inner Mongolia University Library


Search query: Any field = "3rd International Conference on Parallel and Distributed Processing and Applications"

3,059 records in total; showing records 341-350.

GPU Optimization of Biological Macromolecule Multi-tilt Electron Tomography Reconstruction Algorithm

19th International Conference on Advanced Intelligent Computing Technology and Applications (ICIC)

Authors: Fu, Zi-Ang; Wan, Xiaohua; Zhang, Fa

Affiliations: Lanzhou Univ Sch Informat Sci & Engn Lanzhou 730000 Gansu Peoples R China; Chinese Acad Sci Inst Comp Technol Beijing 100190 Peoples R China

ISBN (digital): 9789819947492

ISBN (print): 9789819947485; 9789819947492

Three-dimensional (3D) reconstruction in cryo-electron tomography (cryo-ET) plays an important role in studying in situ biological macromolecular structures at the nanometer level. Owing to limited tilt angle, 3D reconstruction of cryo-ET always suffers from a "missing wedge" problem which causes severe accuracy degradation. Multi-tilt reconstruction is an effective method to reduce artifacts and suppress the effect of the missing wedge. As the number of tilt series increases, large size data causes high computation and huge memory overhead. Limited by the memory, multi-tilt reconstruction cannot be performed in parallel on GPUs, especially when the image size reaches 1 K, 2 K, or even larger. To optimize large-scale multi-tilt reconstruction of cryo-ET, we propose a new GPU-based large-scale multi-tilt tomographic reconstruction algorithm (GM-SIRT). Furthermore, we design a two-level data partition strategy in GM-SIRT to greatly reduce the memory required in the whole reconstructing process. Experimental results show that the performance of the GM-SIRT algorithm has been significantly improved compared with DM-SIRT, the distributed multi-tilt reconstruction algorithm on the CPU cluster. The acceleration ratio is over 300%, and the memory requirement decreases to only one-third of that of DM-SIRT when the image size reaches 2 K.

Keywords: Graphics processing unit

HPC and GPU solutions for radio interferometry using RICK

33rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, PDP 2025

Authors: De Rubeis, Emanuele; Gheller, Claudio; Lacopo, Giovanni; Tornatore, Luca; Elahi, Pascal Jahan; Cytowski, Maciej; Taffoni, Giuliano; Varetto, Ugo

Affiliations: Università di Bologna Istituto di Radioastronomia Dipartimento di Fisica e Astronomia INAF via Gobetti 93/2 Bologna I-40129 Italy; Istituto di Radioastronomia INAF Via P. Gobetti 101 Bologna 40129 Italy; Università degli studi di Trieste INAF Astronomical Observatory of Trieste Trieste Italy; INAF Astronomical Observatory of Trieste via GB Tiepolo 11 Trieste 34143 Italy; Pawsey Supercomputing Centre 1 Bryce Avenue Kensington WA 6151 Australia

ISBN (print): 9798331524937

Since the last decade, radio astronomy has started a new era: the advent of the Square Kilometer Array (SKA), preceded by its pathfinders, will produce a huge amount of data that will be hard to process with a traditional approach. This means that the current state-of-the-art software for data reduction and imaging will have to be re-modeled to face such data challenge. In order to manage such an increase in data size and computational requirements, scientists need to exploit modern high-performance computing (HPC) architectures. In particular, heterogeneous systems, based on complex combinations of CPUs, accelerators, high-speed networks and composite storage devices need to be used in an efficient and effective way. In this paper, we present an overview on Radio Imaging Code Kernels (RICK [1], [2], [3]), a code able to perform the most computationally demanding steps of w-stacking gridder algorithm exploiting distributed parallelism and GPU acceleration. GPU offloading is possible through CUDA, HIP, and OpenMP, aiming at the largest possible usability among multiple architectures. After detailing the (multi-)GPU approach to the problem and listing all the new code implementations, we analyze its performances considering both the computational and communication workload. We will show how the full, distributed GPU offload of the code, first of its kind and crucial to deal with increasingly large interferometric data, represents not only an extremely fast and optimized approach, but also the greenest one if compared to its parallel CPU counterpart. This code, now publicly available, has been tested with a wide variety of modern interferometers and SKA pathfinders. This represents, to date, the first example of radio imaging software fully enabled to GPUs, becoming a potential state-of-the-art approach for the upcoming SKA. Finally, we will also present the future perspectives about the code, planned to be converted into a library and possibly be used by any of the most

Keywords: Parallel architectures

QR Factorization of Block Low-Rank Matrices on Multi-instance GPU

23rd International Conference on Parallel and Distributed Computing, Applications, and Technologies, PDCAT 2022

Authors: Ohshima, Satoshi; Ida, Akihiro; Yokota, Rio; Yamazaki, Ichitaro

Affiliations: Information Technology Center Nagoya University Aichi Japan; Research Institute for Value-Added-Information Generation Japan Agency for Marine-Earth Science and Technology Kanagawa Japan; Global Scientific Information and Computing Center Tokyo Institute of Technology Tokyo Japan; Scalable Algorithms Department Sandia National Laboratories New Mexico United States

ISBN (print): 9783031299261

The QR factorization, which is a fundamental operation in linear algebra, is used extensively in scientific simulations. The acceleration and memory reduction of it are important research targets. QR factorization using block low-rank matrices (BLR-QR) has previously been proposed to address this issue. In this study, we consider its implementation on a GPU. Current CPUs and GPUs have numerous computational cores and the performance consists of the total performance of them. Therefore, the degree of parallelism of the target calculation is important for obtaining high performance. By contrast, many applications, including BLR-QR, do not have sufficient parallelism. Batched computation has attracted attention for achieving high performance in such calculations. However, the use of it requires major code rewriting and is extremely laborious. Thus, we propose the use of the multi-instance GPU (MIG) feature of current GPUs. Using MIG, we succeeded in obtaining a 53.3% time reduction over the CPU and 77.6% over the GPU without MIG. From the above result, we succeeded in demonstrating rapid implementation of BLR-QR on MIG and usefulness of MIG. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Keywords: Graphics processing unit

A Transpose-free Three-dimensional FFT Algorithm on ARM CPUs

23rd IEEE International Conference on High Performance Computing and Communications, 7th IEEE International Conference on Data Science and Systems, 19th IEEE International Conference on Smart City and 7th IEEE International Conference on Dependability in Sensor, Cloud and Big Data Systems and Applications, HPCC-DSS-SmartCity-DependSys 2021

Authors: Chen, Tun; Jia, Haipeng; Li, Zhihao; Li, Chendi; Zhang, Yunquan

Affiliations: Skl of Computer Architecture Institute of Computing Technology Chinese Academy of Sciences Beijing China; University of Chinese Academy of Sciences Beijing China; Huawei Technologies Co. Ltd Shenzhen China

ISBN (print): 9781665494571

According to the traditional multi-dimensional FFT, memory layouts of high-dimensional data are discontinuous. Transposition is introduced to keep high-dimensional data continuous in memory. However, transposition increases memory access and is a hot spot for multi-dimensional FFT. This paper proposes an optimization framework to eliminate explicit transpositions and optimize the three-dimensional (3D) FFT. This framework includes three research points: 1) it combines the width-first and breadth-first search to optimize the butterfly network of the one-dimensional (1D) FFT; 2) it adopts a column-order algorithm to eliminate data transposition; 3) it adopts a cache-aware blocking algorithm to better use the hardware resources of the ARM architecture. Based on this optimized framework, a multi-dimensional FFT library named MDFFT is implemented. The experiments demonstrate that MDFFT generally performs better than FFTW and ARMPL on ARM CPUs. © 2021 IEEE.

Keywords: Three-dimensional displays; Smart cities; Layout; Parallel processing; Libraries; Hardware; Optimization

TAPA-CS: Enabling Scalable Accelerator Design on Distributed HBM-FPGAs

29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2024

Authors: Prakriya, Neha; Chi, Yuze; Basalama, Suhail; Song, Linghao; Cong, Jason

Affiliations: UCLA Los Angeles United States

ISBN (print): 9798400703867

Despite the increasing adoption of FPGAs in compute clouds, there remains a significant gap in programming tools and abstractions which can leverage network-connected, cloud-scale, multi-die FPGAs to generate accelerators with high frequency and throughput. We propose TAPA-CS, a task-parallel dataflow programming framework which automatically partitions and compiles a large design across a cluster of FPGAs while achieving high frequency and throughput. TAPA-CS has three main contributions. First, it is an open-source framework which allows users to leverage virtually "unlimited" accelerator fabric, high-bandwidth memory (HBM), and on-chip memory. Second, given as input a large design, TAPA-CS automatically partitions the design to map to multiple FPGAs, while ensuring congestion control, resource balancing, and overlapping of communication and computation. Third, TAPA-CS couples coarse-grained floorplanning with interconnect pipelining at the inter- and intra-FPGA levels to ensure high frequency. FPGAs in our multi-FPGA testbed communicate through a high-speed 100Gbps Ethernet infrastructure. We have evaluated the performance of TAPA-CS on designs, including systolic-array based CNNs, graph processing workloads such as page rank, stencil applications, and KNN. On average, the 2-, 3-, and 4-FPGA designs are 2.1×, 3.2×, and 4.4× faster than the single FPGA baselines generated through Vitis HLS. TAPA-CS also achieves a frequency improvement between 11%-116% compared with Vitis HLS. © 2024 Copyright held by the owner/author(s).

Keywords: Field programmable gate arrays (FPGA)

Covariances computation in the Gaia AVU-GSR Parallel Solver with I/O techniques: a performance study as a function of writing cycle length

33rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, PDP 2025

Authors: Cesare, Valentina; Becciani, Ugo; Vecchiato, Alberto; Lattanzi, Mario Gilberto; Aldinucci, Marco; Bucciarelli, Beatrice

Affiliations: National Institute for Astrophysics Astrophysical Observatory of Catania Italy; Astrophysical Observatory of Turin National Institute for Astrophysics Italy; University of Turin Department of Computer Science Italy

ISBN (print): 9798331524937

The solver module of the Astrometric Verification Unit Global Sphere Reconstruction (AVU-GSR) pipeline aims to find the astrometric parameters of ∼10^8 stars in the Milky Way, the attitude and instrumental settings of the Gaia satellite, and the parametrized post-Newtonian parameter γ with a resolution of 10-100 micro-arcseconds. To perform this task, the code, which runs in production on Leonardo CINECA infrastructure, solves a system of linear equations with the iterative LSQR algorithm, where the coefficient matrix is large (10-50 TB) and sparse and the iterations stop when least square convergence is reached. The solver was ported to GPU with CUDA, obtaining a ∼14x acceleration factor over an original version CPU-parallelized with OpenMP. This work concentrates on a code section dedicated to covariances calculation, representing an important scientific task for Gaia mission, since the problem's unknowns present strong correlations. Given the number of unknowns at mission end, the variances-covariances matrix is expected to occupy ∼1 EB, which represents a substantial "Big Data" issue. To compute a subset of the total covariances, we defined an I/O-based pipeline made of two jobs. The first job, the LSQR, writes the files every itnCovCP iterations, and the second job reads them and calculates the corresponding covariances. The two jobs can be launched either in sequence or concurrently. Previous studies demonstrated that the covariances calculation does not significantly slow down the AVU-GSR production up to ∼3 × 10^7 covariances. Here we investigate the performance of the covariances pipeline as a function of itnCovCP. The results show that writing smaller files more frequently or writing larger files less frequently does not affect the global performance of the solver, whose speed only depends on the number of covariances to calculate and of system unknowns. © 2025 IEEE.

Keywords: Covariance matrix

Zero-Shot Face Swapping with De-identification Adversarial Learning

22nd International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT 2021)

Authors: Li, Huifang; Li, Yidong; Liu, Jiaming; Hong, Zhibin; Hu, Tianshu; Ren, Yan

Affiliations: Beijing Jiaotong Univ Sch Comp & Informat Technol Beijing 100044 Peoples R China; Baidu Inc Baidu Technol Pk Bldg 2 Xibeiwang East Rd Beijing 100193 Peoples R China; QI ANXIN Technol Grp Inc Beijing 100044 Peoples R China

ISBN (print): 9783030967727; 9783030967710

In this paper, we propose a Zero-shot Face Swapping Network (ZFSNet) to swap novel identities where no training data is available, which is very practical. In contrast to many existing methods that consist of several stages, the proposed model can generate images containing the unseen identity in a single forward pass without fine-tuning. To achieve it, based on the basic encoder-decoder framework, we propose an additional de-identification (De-ID) module after the encoder to remove the source identity information, which contributes to removing the source identity retaining in the encoding stream and improves the model's generalization capability. Then we introduce an attention component (ASSM) to blend the encoded source feature and the target identity feature adaptively. It amplifies proper local details and helps the decoder attend to the related identity feature. Extensive experiments evaluated on the synthesized and real images demonstrate that the proposed modules are effective in zero-shot face swapping. In addition, we also evaluate our framework on zero-shot facial expression translation to show its versatility and flexibility.

Keywords: Face swapping; Facial expression translation; Adversarial learning

AP3: Adaptive Power Prediction Framework based on Spatial Partition Multi-Phase Model

Authors: Chen, Juan; Ou, Zhixin; Guo, Yifei; Qi, Xinxin; Sun, Yuyang; Deng, Lin; Chen, Hongyu; Lin, Zihan

Affiliations: College of Computer National University of Defense Technology Changsha China

ISBN (print): 9781665494571

The accuracy of processor power modeling is an important foundation for power management and optimization on parallel computing systems. It is difficult to build a high-accuracy instantaneous CPU/DRAM power prediction model. One of the main reasons for the low accuracy is that the processor architecture sometimes greatly influences the accuracy of power prediction. For example, the accuracy of a processor power model is affected by inaccurate modeling of the uncore part of the processor. Another reason comes from the limitation of static models for instantaneous power prediction. Despite the existence of various optional linear/nonlinear models, the fixed training set and model coefficients are insufficient to make high-accuracy instantaneous predictions for various target programs. Aiming at the above two issues, this paper proposes an Adaptive Power Prediction framework based on spatial Partition multi-phase model (AP$^{3}$). Spatial partition mainly solves the impact of uncore power on the prediction accuracy, and adaptability solves the limitation of the static model on the power prediction accuracy. According to the experimental results on both ARM-based and x86-based processor platforms, AP$^{3}$ greatly increases the CPU and DRAM power instantaneous prediction accuracy. Spatial partition reduces the prediction error (MRE) by 0.3%-8.2% compared to the previous single model, while adaptive update further reduces the error (MRE) by 1.7%-7.1% compared to the previous static model. © 2021 IEEE.

Keywords: Training; Adaptation models; Smart cities; Multicore processing; Computational modeling; Power system management; Random access memory

Blockchain and the General Data Protection Regulation: Healthcare Data Processing

23rd International Conference on Computational Science and Its Applications, ICCSA 2023

Authors: Perchinunno, Paola; Massari, Antonella; L’Abbate, Samuela; Crocetta, Corrado

Affiliations: Department of Economics Management and Business Law University of Bari "Aldo Moro" Bari Italy; Department of Humanities Research and Innovation University of Bari "Aldo Moro" Bari Italy

ISBN (print): 9783031371103

The General Data Protection Regulation (GDPR) of the European Union became binding in May 2018. The objective of the GDPR is essentially twofold. On the one hand, it seeks to facilitate the free movement of personal data between the various EU Member States, and, on the other hand, it establishes a framework for the protection of fundamental rights, based on the right to data protection as set out in Article 8 of the Charter of Fundamental Rights. The European Parliament has declared that the Blockchain must be considered a "tool that strengthens the autonomy of citizens by giving them the opportunity to control their data and decide which ones to share in the register, as well as the ability to choose who can see such data", thus favoring the transparency of transactions. Blockchain (or distributed register technology – DLT) technologies and their potential for the European Union’s digital single market have been widely discussed in recent years. It has been argued that blockchain technologies could be a suitable tool to achieve some of the goals of GDPR. Blockchains can be designed to allow data sharing, and improve transparency on data access. This study analyzes the relationship between blockchain and GDPR, to highlight existing problems and study possible solutions in relation to the processing of health data. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Keywords: Transparency

Radio Number of the Cartesian Product of Stars and Middle Graph of Cycles

ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing (SNPD)

Authors: Linlin Cui; Feng Li

Affiliations: Computer College Qinghai Normal University Xi’ning China

ISBN (digital): 9798350391954

ISBN (print): 9798350391961

With the rapid development of wireless communication network, channel is the most important part in the communication process, and the problem of reasonable channel assignment becomes increasingly serious. To solve this problem, we study the radio label problem of graph. The graph is usually used as the channel assignment modeling of wireless communication, and the channel assignment problem of the network is simulated by the vertex labeling problem of the graph. Various applications of radio labeling, such as frequency assignment in mobile communication systems, signal processing, parallel and distributed computing, circuit and sensor network design, play an important role in the channel assignment process of wireless communication networks. The channel assignment in the network is converted to the vertex labeling problem of the graph. The maximum radio label of the graph is called its span, and the minimum possible span is called the radio number of the graph. The aim is to find an optimal radio label to reduce the channel utilization rate in the network, so as to reduce the interference in the process of network communication. In this paper, we mainly study the topology of the Cartesian product of stars with $\mathbf{n}$ vertices and the middle graph of cycles, where $m \geq 3$. We simulate the channel assignment of a wireless communication network with the same structure as it, obtain the lower bound of its radio label, and determine optimal radio label.

Keywords: Wireless communication; Stars; Interference; Channel allocation; Signal processing; Mobile communication; Radiofrequency integrated circuits


Address: No. 235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region. Postal code: 010021
