Search Results - Inner Mongolia University Library

4th International Conference on Computer Science and Application Engineering, CSAE 2020

Authors: Xu, Chunlei; Zhuang, Weijin (State Grid Jiangsu Electric Power Co. Ltd, Nanjing, China; China Electric Power Research Institute, Nanjing, China)

ISBN: (Print) 9781450377720

In recent years, driven by advances in hardware technology, the computing power and programmability of GPUs have developed rapidly. With their highly parallel computing characteristics, GPUs are no longer limited to everyday graphics processing tasks and are increasingly used in the wider field of high-performance general-purpose computing. One of the hotspots in the field of high-performance parallel computing is MapReduce, a massive data processing framework. Through clusters of inexpensive ordinary computers, we can obtain large-scale data computing capabilities that were previously available only on expensive large servers. However, most existing MapReduce systems run on CPU clusters, and the computing performance of a single node is limited. Therefore, this paper proposes a parallel computing framework based on GPU clusters and MapReduce, and validates the effectiveness of the framework through experiments. The experiments show that the framework can complete the work and achieves a significant speedup for large-scale applications. © 2020 ACM.

Keywords: MapReduce

S-GAT: Accelerating Graph Attention Networks Inference on FPGA Platform with Shift Operation

26th IEEE International Conference on Parallel and Distributed Systems (IEEE ICPADS)

Authors: Yan, Weian; Tong, Weiqin; Zhi, Xiaoli (Shanghai Univ, Sch Comp Engn & Sci, Shanghai, Peoples R China)

ISBN: (Print) 9781728190747

Deep learning has been successful in many fields such as acoustics, image processing, and natural language processing. However, due to the unique characteristics of graphs, applying deep learning to general graph data is not easy. Graph Attention Networks (GATs) show the best performance in multiple authoritative node classification benchmark tests (both transductive and inductive). The purpose of this research is to design and implement an FPGA-based accelerator for graph attention networks, called S-GAT, that achieves excellent acceleration and energy efficiency without losing accuracy, and does not rely on DSPs or large amounts of on-chip memory. We design S-GAT with software and hardware co-optimization. Specifically, we use model compression and feature quantization to reduce the model size, and use shift addition units (SAUs) to convert multiplication into shift operations to further reduce the computation requirements. We integrate the above optimizations into a universal hardware pipeline for various GAT structures. Finally, we evaluate our design on an Inspur F10A board with an Intel Arria 10 GX1150 and 16 GB DDR3 memory. Experimental results show that S-GAT achieves a 7.34 times speedup over an Nvidia Tesla V100 and 593 times over a Xeon CPU Gold 5115 while maintaining accuracy, with 48 times and 2400 times better energy efficiency respectively.

Keywords: Graph Attention Networks; Inference accelerating; Heterogeneous computing; Field Programmable Gate Array; Parallel computing; Energy efficiency; Shift operation

A Lightweight Sound Diagnosis Model for Transformer Discharge Fault Based on Knowledge Distillation with Supercomputing

International Conference on Computer Science and Management Technology (ICCSMT)

Authors: Yujiang Long; Wei Wei; Ce Wang (Information Center of Guizhou Power Grid Co. Ltd., Guiyang, China)

Transformer discharge is a serious power system fault that can lead to catastrophic accidents. Intelligent analysis of discharge faults from collected transformer sound provides a non-invasive and sustainable health monitoring method. This paper proposes a lightweight sound fault diagnosis model based on knowledge distillation. The method uses the computing power of a supercomputing platform to train multiple teacher models in parallel, and then distills the knowledge of the teacher models into a lightweight student model, which not only achieves good classification accuracy but can also be easily deployed to edge computing platforms. The experimental results show that the accuracy of our method is 95.65%, with only 1.28M parameters and only 0.35 GFLOPs of computation, which demonstrates the superiority of the method.

PDC Course Development and Assessment Process for the betterment of Teaching-Learning Process

IEEE International Conference on High Performance Computing Workshops (HiPCW)

Author: Neelima Bayyapu (Department of Computer Science and Engineering, Manipal Institute of Technology (MIT), Manipal Academy of Higher Education (MAHE), Manipal, Karnataka, India)

Outcome-based education (OBE) is being adopted in most engineering colleges across India. The process is accredited by a national body, the National Board of Accreditation (NBA). Accreditation involves continuously defining Program Outcomes (POs) and Course Outcomes (COs), which are evaluated to check the OBE outcome level. The course evaluation covers course content, the methods used to evaluate students, teaching pedagogy, Bloom's taxonomy mapping of learning levels, feedback from stakeholders, and results and outcome analysis. If the required level is not attained, the COs and POs are reviewed and the curriculum is revised accordingly. Faculty face many challenges in creating and assessing a curriculum to NBA standards. These challenges are unique and non-trivial for PDC/HPC (parallel and distributed computing/high-performance computing) courses. Preparing an OBE-based PDC/HPC course involves many stakeholders and brainstorming over multiple sessions. Many universities across the world have recently begun adopting PDC/HPC courses. With the intention of creating a pointer for developing, delivering and reviewing a PDC/HPC course, this paper presents the course development process for the benefit of various stakeholders, especially PDC/HPC educators and institutions. This research paper presents the undergraduate teaching experience of a parallel computing (PC) course with a critical evaluation based on Course Outcomes (COs) and Bloom's taxonomy mapping of learning levels. This paper compiles a list of challenges faced by PDC/HPC educators and stakeholders, focusing on the Indian education scenario. It lists the activities and suggestions that can be applied to address these challenges suitably. The research and teaching experience-based discussions and strategies proposed in this paper help other PDC/HPC educators and stakeholders in the CS (Computer Sci

Keywords: High performance computing; Heuristic algorithms; Education; Taxonomy; Focusing; Parallel processing; Accreditation

Improving the energy efficiency of data-intensive applications running on clusters

International Journal of Parallel, Emergent and Distributed Systems, 2020, Vol. 35, No. 3, pp. 246-259

Authors: Liu, Weifeng; Zhou, Jie; Gong, Bin; Dai, Hongjun; Guo, Meng (Univ Jinan, Sch Informat Sci & Engn, Jinan, Peoples R China; Informat & Telecommun Co, State Grid Shandong Elect Power Co, Jinan, Peoples R China; Shandong Univ, Sch Comp Sci & Technol, Jinan, Peoples R China; Natl Supercomp Ctr Jinan, Shandong Comp Sci Ctr, Jinan, Peoples R China)

As an alternative to traditional computing architecture, cloud computing is now growing rapidly. However, it is generally based on models such as cluster computing. Supercomputers are becoming more and more powerful, helping scientists gain a more in-depth understanding of the world. At the same time, clusters of commodity servers have become mainstream in the IT industry, powering not only large Internet services but also a growing number of data-intensive scientific applications, such as MPI-based deep learning applications. To reduce energy cost, more and more effort is being made to reduce the energy consumption of HPC systems. Because I/O accesses account for a large portion of the execution time of data-intensive applications, it is critical to design energy-aware parallel I/O functions to address the challenges of HPC energy efficiency. As the de facto standard for designing parallel applications in cluster environments, the Message Passing Interface (MPI) is widely used in high performance computing. Obtaining energy consumption information for MPI applications is therefore critical for improving the energy efficiency of HPC systems. In this work we first present our energy measurement tool, a software framework that eases energy data collection in cluster environments. We then present an approach that can optimise the energy efficiency of parallel I/O operations. The energy scheduling algorithm is evaluated in a cluster.

Keywords: parallel computing; MPI; energy measurement; energy modeling

Research on Virtual Power Plants Participating in Ancillary Service Market

International Conference on Electrical Engineering and Control Science (IC2ECS)

Authors: Guohui Lan; Zitao Zhang; Minxing Guo; Li Lan; Ran Lyu; Su Wang (Institute of Economics and Technology, State Grid Shanghai Electric Power Company, Shanghai, China; School of Electrical Engineering, Southeast University, Nanjing, China)

Under the “Double Carbon” strategy, the pace of new energy installation and grid connection has steadily increased. To better absorb new energy power generation and ensure the safe and stable operation of the power system, ancillary service resources such as frequency regulation and peak shaving are urgently needed. A virtual power plant aggregates the distributed energy scattered across the power grid through advanced communication, computing, dispatching, market and other means, turning it into a “power generation system” that can be uniformly dispatched, follow dispatching instructions, and participate in the ancillary service market. Firstly, the structure and development of virtual power plants are introduced. Secondly, the types of virtual power plants participating in the ancillary service market are analyzed. Thirdly, the actual participation of virtual power plants in ancillary service markets for frequency regulation and peak shaving is compared and analyzed with reference to market access conditions, compensation mechanisms and allocation mechanisms, which lays a theoretical foundation for different provinces to further improve the mechanisms for virtual power plants participating in ancillary service markets. Finally, in light of China’s national conditions, suggestions are proposed on building market mechanisms for virtual power plants to participate in ancillary services.

Keywords: Frequency modulation; Urban areas; Virtual power plants; Dispatching; Regulation; Power grids; Telecommunication computing

Distributed optimal voltage regulation for distribution networks with DGs at the energy delivery grid edge with limited information exchange

2020 IEEE International Conference on Power Systems Technology, POWERCON 2020

Authors: Razeghi-Jahromi, Mohammad; Coats, David; Stoupis, James (ABB Corporate Research, Electrical Systems Department, Raleigh, United States)

ISBN: (Print) 9781728163505

This paper proposes an algorithm for distributed optimal voltage regulation of distribution networks with distributed generators (DGs) at the grid edge. We first introduce a distributed recursive algorithm that estimates the sensitivity parameters of the network without any information exchange between the edge computing devices. Then, we cast the distributed voltage control problem as a distributed linear program that can be solved efficiently on edge computing devices to minimally adjust DG generation. The algorithm is tested on EPRI Test Ckt24, and the results are compared with the case in which data exchange between the edge devices is available. © 2020 IEEE.

Keywords: Voltage control

An Efficient Work-Stealing Scheduler for Task Dependency Graph

26th IEEE International Conference on Parallel and Distributed Systems (IEEE ICPADS)

Authors: Lin, Chun-Xun; Huang, Tsung-Wei; Wong, Martin D. F. (Univ Illinois, Dept Elect & Comp Engn, Urbana, IL 61801, USA; Univ Utah, Dept Elect & Comp Engn, Salt Lake City, UT, USA)

ISBN: (Print) 9781728190747

Work-stealing is a key component of many parallel task graph libraries such as Intel Threading Building Blocks (TBB) FlowGraph, Microsoft Task Parallel Library (TPL) ***, Cpp-Taskflow, and Nabbit. However, designing a correct and effective work-stealing scheduler is a notoriously difficult job, due to subtle implementation details of concurrency controls and decentralized coordination between threads. This problem becomes even more challenging when striving for optimal thread usage in handling parallel workloads with complex task graphs. As a result, we introduce in this paper an effective work-stealing scheduler for execution of task dependency graphs. Our scheduler adopts a simple and efficient strategy to adapt the number of working threads to available task parallelism at any time during the graph execution. Our strategy is provably good in preventing resource underutilization and simultaneously minimizing resource waste when tasks are scarce. We have evaluated our scheduler on both micro-benchmarks and a real-world circuit timing analysis workload, and demonstrated promising results over existing methods in terms of runtime, energy efficiency, and throughput.

Keywords: task dependency graph; work stealing; parallel computing; scheduling; multithreading

ICASEA 2021 - 3rd 2021 International Conference on Advance of Sustainable Engineering and its Application

3rd International Conference on Advance of Sustainable Engineering and its Application, ICASEA 2021

ISBN: (Print) 9781665497367

The proceedings contain 42 papers. The topics discussed include: evaluation of turbulence and non-Newtonian blood rheology models through an FDA nozzle; diagnosing of air compressor faults using frequency data driven approach; experimental analysis modelling of tower solar chimney; an experimental study to improve heat transfer rate in a double pipe heat exchanger using helical tape; study the effect of storage capacity on the performance of swimming pool heating system; improvement the heat dissipation by using different integral finned tubes for cross flow heat exchanger; design a new smart monitoring micro-grid photovoltaic system network based on mobile technology; a low-cost real-time monitoring system for the river level in Wasit province; cloud-based parallel computing system via single-client multi-hash single-server multi-thread; concatenated turbo polar codes: an overview; and practical work for a stand-alone photovoltaic system: efficient MPPT using neural network approach.

The Design of FIR Filter Based on Improved DA and Implementation to High-Speed Ground Penetrating Radar System

16th IEEE International Wireless Communications and Mobile Computing Conference (IEEE IWCMC)

Authors: Li, Jixi; Bai, Xu; Han, Shuai; Yu, Yue (Harbin Inst Technol, Sch Elect & Informat Engn, Harbin, Peoples R China; China Acad Informat & Commun Technol, Beijing, Peoples R China)

ISBN: (Print) 9781728131290

As one of the basic components of digital signal processing, digital finite impulse response (FIR) filters are widely used in image processing, speech recognition, and many other fields. This paper proposes an improved distributed algorithm (DA) to implement high-order digital FIR filters with less logical delay and hardware utilization. Firstly, the parallel DA is designed and then improved by look-up-table (LUT) decomposition. Secondly, the improved DA FIR filters are implemented on a Xilinx Kintex-7 FPGA chip and used in a high-speed ground penetrating radar (GPR) system to process radar signals. Finally, the performance of DA filters with different orders and structures is analyzed and compared, taking logical delay and hardware utilization as the key indicators. We conclude that the parallel DA with LUT decomposition can implement high-order filters more effectively than traditional structures.

Keywords: FIR filter; Look-up-table; Distributed Algorithm; FPGA; Parallel processing
