ISBN (Print): 9783959771818
The proceedings contain 5 papers. The topics discussed include: towards adaptive multi-alternative process network; BifurKTM: approximately consistent distributed transactional memory for GPUs; the impact of precision tuning on embedded systems performance: a case study on field-oriented control; resource aware GPU scheduling in kubernetes infrastructure; and HPC application cloudification: the streamflow toolkit.
An optimised droop control method is proposed to achieve state-of-charge (SoC) balance among parallel-connected distributed energy storage units in an islanded DC microgrid, accounting for differences in line impedance, initial SoC and capacity among the units. Droop control is the basic strategy for load sharing in DC microgrid applications, but load-sharing accuracy degrades under the conventional droop method because line impedances are unmatched in practice. Meanwhile, the initial SoC values and capacities of the distributed energy storage units usually differ, so their states of charge cannot be balanced. To prolong the lifetime of the distributed energy storage units and avoid overuse of any single unit, an optimised droop control strategy based on a sample-and-hold element is designed: by adaptively modifying the droop coefficient, both accurate load sharing and balanced SoC among the units are obtained. Finally, the performance of the proposed control scheme is assessed through a series of cases on the RTDS Technologies real-time digital simulator (RTDS), and its effectiveness is verified.
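The abstract does not give the exact control law, so the following is a minimal Python sketch, assuming a standard droop law v_i = v_ref - r_i * i_i and a hypothetical SoC-weighted coefficient r_i = r0 / SoC_i^n; the paper's sample-and-hold scheme and RTDS validation are not reproduced here.

```python
# Minimal sketch, assuming droop law v_i = v_ref - r_i * i_i and a hypothetical
# coefficient rule r_i = r0 / soc_i**n: units with more remaining charge get a
# smaller droop resistance and therefore take a larger share of the load.

def droop_currents(v_ref, r_droop, r_line, i_load):
    """Solve the parallel network: each unit sees v_bus = v_ref - (r_d + r_l) * i,
    and the unit currents must sum to the load current (KCL at the bus)."""
    g = [1.0 / (rd + rl) for rd, rl in zip(r_droop, r_line)]  # branch conductances
    v_bus = v_ref - i_load / sum(g)
    return [(v_ref - v_bus) * gi for gi in g]

def simulate(hours=2.0, dt=1.0):
    v_ref, r0, n = 48.0, 0.5, 3        # nominal bus voltage, base droop, SoC exponent
    r_line = [0.05, 0.15]              # unmatched line impedances (ohm)
    cap_ah = [20.0, 10.0]              # different capacities (Ah)
    soc = [0.9, 0.6]                   # different initial SoC
    i_load = 5.0                       # constant load current (A)
    for _ in range(int(hours * 3600 / dt)):
        r_droop = [r0 / max(s, 0.05) ** n for s in soc]   # adaptive droop coefficient
        currents = droop_currents(v_ref, r_droop, r_line, i_load)
        # each unit is depleted according to its share of the load
        soc = [s - i * dt / (c * 3600.0) for s, i, c in zip(soc, currents, cap_ah)]
    return soc

if __name__ == "__main__":
    print("final SoC:", simulate())    # the SoC gap between the two units narrows
```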
ISBN (Digital): 9798350388916
ISBN (Print): 9798350388923
This paper proposes an optimized framework for data lakes intended to improve data rate, response time, and capacity in big data environments. Traditional data lakes commonly suffer from slow data access and from scalability problems that worsen as data volumes grow. The proposed method counters these challenges through distributed data structures, improved indexing, concurrent processing and in-memory computing. The results show a 40% reduction in retrieval time for large datasets and a 50% improvement in query response time under heavy data loads. The in-memory processing system exhibited up to a 60% gain in data throughput and achieved better scalability, with shorter query response times as more parallel processing nodes were added. The experimental evidence shows that the proposed method not only improves performance but also offers a cost-efficient solution for organizations dealing with real-time big data analytics. Future work may explore more advanced machine learning to improve the predictability of the analytical approaches used and to optimize resource allocation for data lakes deployed in the cloud.
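The framework itself is not shown in the abstract; as a rough, hypothetical illustration of the ingredients it names (distributed partitioning, indexing, parallel in-memory access), here is a toy Python sketch with invented names such as ToyDataLake and Partition.

```python
# Illustrative sketch only: a toy partitioned, in-memory store with a
# per-partition hash index and a thread pool that probes partitions concurrently.
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

class Partition:
    def __init__(self):
        self.rows = []                        # in-memory row store
        self.index = defaultdict(list)        # key value -> row positions

    def insert(self, row, key_field):
        self.index[row[key_field]].append(len(self.rows))
        self.rows.append(row)

    def lookup(self, value):
        return [self.rows[i] for i in self.index.get(value, [])]

class ToyDataLake:
    def __init__(self, n_partitions=8, key_field="user_id"):
        self.key_field = key_field
        self.partitions = [Partition() for _ in range(n_partitions)]
        self._next = 0

    def insert(self, row):
        # round-robin placement spreads rows, so lookups fan out over all partitions
        self.partitions[self._next % len(self.partitions)].insert(row, self.key_field)
        self._next += 1

    def query(self, value):
        # each probe is index-backed; the thread pool scans partitions concurrently
        with ThreadPoolExecutor(max_workers=len(self.partitions)) as pool:
            chunks = pool.map(lambda part: part.lookup(value), self.partitions)
        return [row for chunk in chunks for row in chunk]

if __name__ == "__main__":
    lake = ToyDataLake()
    for i in range(10_000):
        lake.insert({"user_id": i % 100, "event": f"evt-{i}"})
    print(len(lake.query(42)))                # 100 matching rows
```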
Gaussian geostatistical space-time modeling is an effective tool for performing statistical inference of field data evolving in space and time, generalizing spatial modeling alone at the cost of the greater complexity...
ISBN (Print): 9781728163383
Recent developments in deep learning have significantly improved the quality of synthesized singing voice audio. However, prominent neural singing voice synthesis systems suffer from slow inference due to their autoregressive design. Inspired by MLP-Mixer, a novel architecture introduced in the vision literature for attention-free image classification, we propose MLP Singer, a parallel Korean singing voice synthesis system. To the best of our knowledge, this is the first work that uses an entirely MLP-based architecture for voice synthesis. Listening tests demonstrate that MLP Singer outperforms a larger autoregressive GAN-based system in both audio quality and synthesis speed. In particular, MLP Singer achieves real-time factors of up to 200 on CPUs and 3400 on GPUs, enabling order-of-magnitude faster generation in both environments.
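For intuition about the MLP-Mixer building block the system is based on, here is a minimal PyTorch sketch of one Mixer block with token-mixing and channel-mixing MLPs; MLP Singer's actual layer sizes, depth, and input features (assumed here to be mel-spectrogram frames) may differ.

```python
# A minimal MLP-Mixer block in PyTorch, for intuition only.
import torch
import torch.nn as nn

class MixerBlock(nn.Module):
    def __init__(self, seq_len, dim, token_hidden=256, channel_hidden=512):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        # token-mixing MLP: mixes information across time steps, applied per channel
        self.token_mlp = nn.Sequential(
            nn.Linear(seq_len, token_hidden), nn.GELU(), nn.Linear(token_hidden, seq_len)
        )
        self.norm2 = nn.LayerNorm(dim)
        # channel-mixing MLP: mixes information across features, applied per time step
        self.channel_mlp = nn.Sequential(
            nn.Linear(dim, channel_hidden), nn.GELU(), nn.Linear(channel_hidden, dim)
        )

    def forward(self, x):                      # x: (batch, seq_len, dim)
        y = self.norm1(x).transpose(1, 2)      # (batch, dim, seq_len) for token mixing
        x = x + self.token_mlp(y).transpose(1, 2)
        x = x + self.channel_mlp(self.norm2(x))
        return x

if __name__ == "__main__":
    frames = torch.randn(2, 400, 80)           # e.g. 400 frames of 80-dim mel features
    print(MixerBlock(seq_len=400, dim=80)(frames).shape)   # torch.Size([2, 400, 80])
```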
Digital image processing is widely used in various fields of science, such as medicine – X-ray analysis, magnetic resonance imaging, computed tomography, cosmology – collecting information from satellites, their tra...
ISBN (Print): 9781665454117
We present distributed task fusion, a run-time optimization for task-based runtimes operating on parallel and heterogeneous systems. Distributed task fusion dynamically buffers, analyzes, and fuses asynchronously evaluated distributed operations, reducing the overheads inherent in scheduling distributed tasks in implicitly parallel frameworks and runtimes. We identify the constraints under which distributed task fusion is permissible and describe an implementation in Legate, a domain-agnostic library for constructing portable and scalable task-based libraries. We present performance results using cuNumeric, a Legate library that enables scalable execution of NumPy pipelines on parallel and heterogeneous systems. We realize speedups of up to 1.5x with task fusion enabled on up to 32 P100 GPUs, demonstrating efficient execution of pipelines involving many successive fine-grained tasks. Finally, we discuss potential future work, including complementary optimizations that could yield additional performance improvements.
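As a toy, hypothetical illustration of the buffering-and-fusion idea (not Legate's implementation), the Python sketch below queues deferred element-wise operations and launches each contiguous fusable run as a single task, with a reduction acting as a fusion barrier.

```python
# Toy illustration: per-task scheduling overhead is paid once per fused group.
import numpy as np

class FusingQueue:
    def __init__(self):
        self._pending = []                     # buffered (name, fn, fusable) entries

    def submit(self, name, fn, fusable=True):
        self._pending.append((name, fn, fusable))

    def flush(self, data):
        launched, group = 0, []

        def launch_group():
            nonlocal data, launched
            if group:
                for _, fn, _ in group:         # one launch applies the fused chain
                    data = fn(data)
                launched += 1
                group.clear()

        for name, fn, fusable in self._pending:
            if fusable:
                group.append((name, fn, fusable))
            else:
                launch_group()                 # flush the fused run before the barrier
                data = fn(data)
                launched += 1
        launch_group()
        self._pending.clear()
        return data, launched

if __name__ == "__main__":
    q = FusingQueue()
    q.submit("add1", lambda a: a + 1)
    q.submit("scale", lambda a: a * 2)
    q.submit("relu", lambda a: np.maximum(a, 0))
    q.submit("sum", lambda a: a.sum(), fusable=False)      # reduction: not fusable here
    out, n_launches = q.flush(np.arange(-4, 4, dtype=float))
    print(out, "launches:", n_launches)        # three element-wise ops ran as one launch
```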
ISBN (Print): 9781450382984
The proceedings contain 16 papers. The topics discussed include: DPD-InfoGAN: differentially private distributed InfoGAN; towards optimal configuration of microservices; DistIR: an intermediate representation for optimizing distributed neural networks; towards a general framework for ML-based self-tuning databases; predicting CPU usage for proactive autoscaling; are we there yet? estimating training time for recommendation systems; Queen Jane approximately: enabling efficient neural network inference with context-adaptivity; AutoAblation: automated parallel ablation studies for deep learning; Vate: runtime adaptable probabilistic programming for Java; DISC: a dynamic shape compiler for machine learning workloads; and towards mitigating device heterogeneity in federated learning via adaptive model quantization.
Future Internet of Things (IoT)-driven applications will move from the cloud-centric IoT model to a hybrid distributed processing model, known as Fog computing, in which some of the involved computational tasks (e.g. real-time data analytics) are partially moved to the edge of the network to reduce latency and improve network efficiency. Fog computing has recently generated significant research interest for IoT applications; however, there is still no ideal approach or framework for supporting parallel and fault-tolerant execution of tasks while collectively utilizing resource-constrained Fog devices. To address this issue, this paper proposes an Akka-based framework built on the Actor Model for designing and executing distributed Fog applications. The Actor Model was conceived as a universal paradigm for concurrent computation with additional requirements such as resiliency and scalability, and the Akka toolkit is a reference implementation of the model. Further, a Docker containerization approach is used to dynamically deploy the distributed applications on Fog networks. To validate the proposed actor-based framework, a wireless sensor network case study is designed and implemented, demonstrating the feasibility of conceiving applications on Fog networks. In addition, a detailed analysis shows the performance and parallelization efficiency of the proposed model on resource-constrained gateway and Fog devices.
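Akka itself is a Scala/Java toolkit and is not reproduced here; purely to illustrate the actor pattern the framework relies on, the following is a generic, hypothetical Python sketch in which each actor owns a mailbox and processes messages sequentially on its own thread.

```python
# Generic actor-model sketch: actor state is never touched by two handlers at once,
# because each actor handles its mailbox one message at a time on a dedicated thread.
import queue
import threading

class Actor:
    def __init__(self):
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def tell(self, message):                   # asynchronous, fire-and-forget send
        self._mailbox.put(message)

    def stop(self):                            # drain remaining messages, then exit
        self._mailbox.put(None)
        self._thread.join()

    def _run(self):
        while True:
            message = self._mailbox.get()
            if message is None:
                break
            self.receive(message)

    def receive(self, message):
        raise NotImplementedError

class SensorAggregator(Actor):
    """Hypothetical Fog-node actor that accumulates readings sent by sensor actors."""
    def __init__(self):
        self.readings = []
        super().__init__()

    def receive(self, message):
        self.readings.append(message)

if __name__ == "__main__":
    aggregator = SensorAggregator()
    for value in (21.5, 21.7, 22.0):
        aggregator.tell(("temperature", value))
    aggregator.stop()
    print(len(aggregator.readings), "readings processed")   # 3 readings processed
```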