ISBN (digital): 9798331527211
ISBN (print): 9798331527228
The growth of parallelism, together with increasing data volumes, has made more efficient memory management necessary, especially on shared-memory architectures. The goal is to present several improvements to memory management on shared-memory computing architectures, using Parallel Discrete Event Simulation (PDES) platforms as a case study. The presented solutions span from memory-hierarchy awareness in terms of cache/NUMA locality, to an incremental state saving mechanism exploiting write-protection, up to a prompt and memory-aware output collection mechanism, with results reported to validate our findings.
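Of the mechanisms listed above, write-protection-based incremental state saving lends itself to a compact illustration. The sketch below is a minimal, assumed POSIX implementation of the general technique (mprotect plus a SIGSEGV handler that logs and unprotects dirty pages), not the platform's actual code; the buffer size and the fixed page-log capacity are illustrative, and a production version would need more care around async-signal safety and multithreading.

```cpp
// Sketch: incremental state saving via page write-protection (POSIX).
// At the start of each event the state is write-protected; the first write to
// a page faults, and the handler saves the page's pre-image, marks it dirty,
// and unprotects it so the write can proceed. Only dirty pages are logged.
#include <csignal>
#include <cstdio>
#include <cstring>
#include <sys/mman.h>
#include <unistd.h>

static char*  g_state      = nullptr;     // simulation state buffer
static size_t g_state_size = 0;
static long   g_page       = 0;

static const int MAX_PAGES = 64;          // illustrative fixed log capacity
static int   g_dirty[MAX_PAGES];          // indices of pages written this event
static int   g_ndirty = 0;
static char* g_copies = nullptr;          // pre-image of each dirty page

static void segv_handler(int, siginfo_t* si, void*) {
    char* addr = static_cast<char*>(si->si_addr);
    if (addr < g_state || addr >= g_state + g_state_size) _exit(1);  // unrelated fault
    int page = static_cast<int>((addr - g_state) / g_page);
    std::memcpy(g_copies + g_ndirty * g_page, g_state + page * g_page, g_page); // save pre-image
    g_dirty[g_ndirty++] = page;
    mprotect(g_state + page * g_page, g_page, PROT_READ | PROT_WRITE);  // let the write proceed
}

static void begin_event() {               // re-protect the whole state, reset the log
    g_ndirty = 0;
    mprotect(g_state, g_state_size, PROT_READ);
}

int main() {
    g_page = sysconf(_SC_PAGESIZE);
    g_state_size = 4 * static_cast<size_t>(g_page);
    g_state = static_cast<char*>(mmap(nullptr, g_state_size, PROT_READ | PROT_WRITE,
                                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0));
    g_copies = new char[MAX_PAGES * g_page];

    struct sigaction sa {};
    sa.sa_flags = SA_SIGINFO;
    sa.sa_sigaction = segv_handler;
    sigaction(SIGSEGV, &sa, nullptr);

    begin_event();
    g_state[10] = 'a';            // first write to page 0 -> trapped, logged, unprotected
    g_state[11] = 'b';            // page 0 already writable -> no trap
    g_state[g_page + 3] = 'c';    // first write to page 1 -> trapped, logged
    std::printf("dirty pages this event: %d of %zu\n", g_ndirty, g_state_size / g_page);
    return 0;
}
```

At rollback, only the pages recorded in the dirty log would need to be restored from their saved pre-images.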
ISBN (print): 9798350364613; 9798350364606
The Edmonds Blossom algorithm is implemented here using depth-first search, which is intrinsically serial. By streamlining the code, our serial implementation is consistently three to five times faster than the previously fastest general graph matching code. By extracting parallelism across iterations of the algorithm, with coarse-grain locking, we are able to further reduce the run time on random regular graphs fourfold and obtain a two-fold reduction of run time on real-world graphs with similar topology. Solving very sparse graphs (average degree less than four) exhibiting community structure with eight threads led to a three-fold slowdown, but this slowdown turns into a marginal speedup once the average degree exceeds four. We conclude that our parallel coarse-grain locking implementation performs well when extracting parallelism from this augmenting-path-based algorithm and may work well for similar algorithms.
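The coarse-grain locking idea can be illustrated independently of the blossom machinery. In the hedged sketch below, worker threads search for augmenting opportunities concurrently but commit every change to the matching under one global mutex; for brevity the "search" only finds trivial length-one augmenting paths (a greedy matching), so the sketch shows the locking pattern rather than the Edmonds Blossom algorithm itself, and the graph data is made up.

```cpp
// Sketch of the coarse-grain locking pattern: concurrent searches, but each
// commit to the shared matching happens under a single global lock, with the
// preconditions re-checked under that lock.
#include <atomic>
#include <cstdio>
#include <mutex>
#include <thread>
#include <vector>

int main() {
    // Tiny undirected graph as an adjacency list (assumed example data).
    std::vector<std::vector<int>> adj = {{1, 2}, {0, 3}, {0, 3}, {1, 2, 4}, {3, 5}, {4}};
    std::vector<int> match(adj.size(), -1);   // match[v] == -1 -> v is free
    std::mutex commit_lock;                   // the coarse-grain lock
    std::atomic<int> next{0};                 // shared work counter: next vertex to try

    auto worker = [&]() {
        for (int v; (v = next.fetch_add(1)) < (int)adj.size(); ) {
            for (int u : adj[v]) {
                std::lock_guard<std::mutex> g(commit_lock);
                if (match[v] == -1 && match[u] == -1) {   // re-check under the lock
                    match[v] = u;
                    match[u] = v;
                    break;
                }
                if (match[v] != -1) break;                // v got matched by another thread
            }
        }
    };

    std::vector<std::thread> pool;
    for (int t = 0; t < 4; ++t) pool.emplace_back(worker);
    for (auto& th : pool) th.join();

    for (size_t v = 0; v < match.size(); ++v)
        if (match[v] > (int)v) std::printf("matched edge (%zu, %d)\n", v, match[v]);
    return 0;
}
```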
High-speed photonic reservoir computing (RC) has garnered significant interest in neuromorphic computing. However, existing reservoir layer (RL) architectures mostly rely on time-delayed feedback loops and use analog-to-digital converters for offline digital processing in the implementation of the readout layer, posing inherent limitations on their speed and capabilities. In this paper, we propose a non-feedback method that utilizes the pulse broadening effect induced by optical dispersion to implement a RL. By combining the multiplication of the modulator with the summation of the pulse temporal integration of the distributed-feedback laser diode, we successfully achieve the linear regression operation of the optoelectronic analog readout layer. Our proposed fully-analog feed-forward photonic RC (FF-PhRC) system is experimentally demonstrated to be effective in chaotic signal prediction, spoken digit recognition, and MNIST classification. Additionally, using wavelength-division multiplexing, our system manages to complete parallel tasks and improve processing capability up to 10 GHz per wavelength. The present work highlights the potential of FF-PhRC as a high-performance, high-speed computing tool for real-time neuromorphic computing. (c) 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement
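As a purely digital point of reference for what the analog readout computes, the sketch below fits a linear readout y = w·x to toy reservoir state vectors by gradient descent. In the paper this multiply-and-accumulate is carried out in the optical/electrical domain (modulator multiplication plus pulse integration) rather than in software; the data, dimensions, and learning rate here are illustrative assumptions.

```cpp
// Sketch: training a linear readout for a reservoir computer in software.
// X holds toy reservoir state vectors, y the target outputs; the readout
// weights w are fitted by least-squares gradient descent.
#include <cstdio>
#include <vector>

int main() {
    const int n_samples = 6, n_features = 3;
    double X[n_samples][n_features] = {          // assumed reservoir states
        {0.1, 0.7, 0.3}, {0.9, 0.2, 0.5}, {0.4, 0.4, 0.8},
        {0.6, 0.1, 0.2}, {0.3, 0.9, 0.6}, {0.8, 0.5, 0.1}};
    double y[n_samples] = {0.5, 1.2, 0.9, 0.7, 0.8, 1.1};  // assumed targets

    std::vector<double> w(n_features, 0.0);      // readout weights
    const double lr = 0.1;
    for (int epoch = 0; epoch < 2000; ++epoch)   // least-squares fit by SGD
        for (int s = 0; s < n_samples; ++s) {
            double pred = 0.0;
            for (int f = 0; f < n_features; ++f) pred += w[f] * X[s][f];
            double err = pred - y[s];
            for (int f = 0; f < n_features; ++f) w[f] -= lr * err * X[s][f];
        }

    std::printf("readout weights: %.3f %.3f %.3f\n", w[0], w[1], w[2]);
    return 0;
}
```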
Graphs are ubiquitous, and they can model unique characteristics and complex relations of real-life systems. Although using machine learning (ML) on graphs is promising, their raw representation is not suitable for ML algorithms. Graph embedding represents each node of a graph as a d-dimensional vector which is more suitable for ML tasks. However, the embedding process is expensive, and CPU-based tools do not scale to real-world graphs. In this work, we present GOSH, a GPU-based tool for embedding large-scale graphs with minimum hardware constraints. GOSH employs a novel graph coarsening algorithm to enhance the impact of updates and minimize the work for embedding. It also incorporates a decomposition schema that enables any arbitrarily large graph to be embedded with a single GPU. As a result, GOSH sets a new state-of-the-art in link prediction both in accuracy and speed, and delivers high-quality embeddings for node classification at a fraction of the time compared to the state-of-the-art. For instance, it can embed a graph with over 65 million vertices and 1.8 billion edges in less than 30 minutes on a single GPU.
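A single level of graph coarsening of the general kind used by embedding tools can be sketched as follows. This is a generic neighbor-collapsing step for illustration, not GOSH's actual coarsening algorithm, and the example graph is made up.

```cpp
// Sketch: one level of graph coarsening by neighbor collapsing. Each vertex is
// merged with the first still-unmerged neighbor encountered; edges are then
// projected onto the resulting super-vertices.
#include <algorithm>
#include <cstdio>
#include <set>
#include <vector>

int main() {
    int n = 6;
    std::vector<std::pair<int, int>> edges = {{0,1},{1,2},{2,3},{3,4},{4,5},{5,0},{1,4}};
    std::vector<std::vector<int>> adj(n);
    for (auto [u, v] : edges) { adj[u].push_back(v); adj[v].push_back(u); }

    // 1) Greedily merge each unmerged vertex with one unmerged neighbor.
    std::vector<int> leader(n, -1);
    for (int v = 0; v < n; ++v) {
        if (leader[v] != -1) continue;
        leader[v] = v;
        for (int u : adj[v])
            if (leader[u] == -1) { leader[u] = v; break; }
    }

    // 2) Renumber the leaders as super-vertices of the coarse graph.
    std::vector<int> id(n, -1);
    int coarse_n = 0;
    for (int v = 0; v < n; ++v)
        if (leader[v] == v) id[v] = coarse_n++;

    // 3) Project edges onto super-vertices, dropping self-loops and duplicates.
    std::set<std::pair<int, int>> coarse_edges;
    for (auto [u, v] : edges) {
        int cu = id[leader[u]], cv = id[leader[v]];
        if (cu != cv) coarse_edges.insert({std::min(cu, cv), std::max(cu, cv)});
    }
    std::printf("coarse graph: %d vertices, %zu edges\n", coarse_n, coarse_edges.size());
    return 0;
}
```

Repeating such a step produces the hierarchy of progressively smaller graphs on which embeddings are trained and then projected back to finer levels.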
With ever more complex functionalities being implemented in emerging real-time applications, multi-core systems are required to deliver high performance, with directed acyclic graphs (DAGs) being used to model functional dependencies. For a single DAG task, our previous work presented a concurrent provider and consumer (CPC) model that captures node-level dependency and parallelism, the two key factors of a DAG. Based on the CPC, scheduling and analysis methods were constructed to reduce the makespan and tighten the analytical bound of the task. However, the CPC-based methods cannot support multiple DAGs, as the interference between DAGs (i.e., inter-task interference) is not taken into account. To address this limitation, this article proposes a novel multi-DAG scheduling approach which specifies the number of cores a DAG can utilise so that it does not incur inter-task interference. This is achieved by modelling and understanding the workload distribution of the DAG and the system. By avoiding inter-task interference, the constructed schedule provides full compatibility for the CPC-based methods to be applied on each DAG and reduces the pessimism of the existing analysis. Experimental results show that the proposed multi-DAG method achieves an improvement of up to 80% in schedulability against the original work that it extends, and outperforms the existing multi-DAG methods by up to 60% in tightening the interference analysis.
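For context, a well-known baseline for dedicating cores to a DAG so that it suffers no inter-task interference is the federated-scheduling bound m = ceil((C - L) / (D - L)), where C is the DAG's total work, L its critical-path length, and D its deadline. The sketch below computes that classic bound with made-up parameters; the paper instead derives core counts from the DAG's workload distribution, so this is a reference point, not the proposed method.

```cpp
// Sketch: classic federated-style core allocation for a DAG task.
#include <cmath>
#include <cstdio>

// Minimum number of dedicated cores so a DAG with work C, critical-path
// length L, and deadline D meets its deadline under any work-conserving
// scheduler (heavy tasks); light tasks (C <= D) fit on one core.
int dedicated_cores(double C, double L, double D) {
    if (L > D) return -1;                    // infeasible: critical path exceeds deadline
    if (C <= D) return 1;                    // light task: runs sequentially in time
    return (int)std::ceil((C - L) / (D - L));
}

int main() {
    // Hypothetical DAG parameters (e.g., microseconds): C = 900, L = 200, D = 400.
    std::printf("cores needed: %d\n", dedicated_cores(900.0, 200.0, 400.0));  // prints 4
    return 0;
}
```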
ISBN (digital): 9783031617638
ISBN (print): 9783031617621; 9783031617638
DLA-Future implements an efficient GPU-enabled distributed eigenvalue solver using a software architecture based on the C++ std::execution concurrency proposal. The state-of-the-art linear algebra implementations LAPACK and ScaLAPACK were designed for legacy systems and employ fork-join parallelism, which can perform inefficiently on modern architectures. The benefits of task-based linear algebra implementations are significant. The reduction of synchronization points and the ease of overlapping computation with communication are two of the main benefits that lead to improved performance. In specific cases, the ability to schedule multiple algorithms concurrently yields a noticeable reduction of time-to-solution. We present the implementation of DLA-Future and the results on different types of systems starting from Piz Daint multicore and GPU partitions, moving to more recent architectures available in ALPS. The benchmark results are divided into two categories. The first contains a comparison of DLA-Future against widely used eigensolver implementations. The second category showcases the performance of the eigensolver in real applications. We present results generated with CP2K, where DLA-Future support was easily added thanks to the provided C API, which is compatible with the ScaLAPACK interface.
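A minimal taste of the sender/receiver style of composition is sketched below using the open-source stdexec reference implementation of std::execution; the header and namespace names are an assumption about the reader's setup, and DLA-Future itself is built on a separate P2300-style runtime. Two independent tasks run on a thread pool and a dependent task is chained after both, with no global fork-join barrier in between.

```cpp
// Sketch: task composition with P2300 senders via the stdexec reference
// implementation (assumed headers/namespaces). Two independent "panel" tasks
// run on a thread pool; an "update" task is chained after both complete.
#include <cstdio>
#include <utility>
#include <exec/static_thread_pool.hpp>
#include <stdexec/execution.hpp>

namespace ex = stdexec;

int main() {
    exec::static_thread_pool pool(4);
    auto sched = pool.get_scheduler();

    // Two independent tasks (stand-ins for, e.g., factorizing two panels).
    auto panel0 = ex::schedule(sched) | ex::then([] { return 10; });
    auto panel1 = ex::schedule(sched) | ex::then([] { return 32; });

    // A task that depends on both results (stand-in for a trailing update).
    auto update = ex::when_all(std::move(panel0), std::move(panel1))
                | ex::then([](int a, int b) { return a + b; });

    auto [sum] = ex::sync_wait(std::move(update)).value();
    std::printf("update consumed %d\n", sum);
    return 0;
}
```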
The Single Source Shortest Path (SSSP) problem is a classic graph theory problem that arises frequently in various practical scenarios; hence, many parallel algorithms have been developed to solve it. However, these algorithms operate on static graphs, whereas many real-world problems are best modeled as dynamic networks, where the structure of the network changes with time. This gap between the dynamic graph modeling and the assumed static graph model in the conventional SSSP algorithms motivates this work. We present a novel parallel algorithmic framework for updating the SSSP in large-scale dynamic networks and implement it on the shared-memory and GPU platforms. The basic idea is to identify the portion of the network affected by the changes and update the information in a rooted tree data structure that stores the edges of the network that are most relevant to the analysis. Extensive experimental evaluations on real-world and synthetic networks demonstrate that our proposed parallel updating algorithm is scalable and, in most cases, requires significantly less execution time than the state-of-the-art recomputing-from-scratch algorithms.
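The affected-region idea can be sketched compactly for edge insertions: only vertices whose distance actually improves are queued and relaxed, so the update cost is proportional to the changed part of the SSSP tree. The code below is a minimal sequential illustration with made-up data; edge deletions, which can lengthen distances and require marking an affected subtree, and the parallel/GPU aspects are omitted.

```cpp
// Sketch: incremental SSSP update after inserting an edge (u, v, w).
// dist[] and parent[] hold the current SSSP tree from a fixed source.
#include <cstdio>
#include <queue>
#include <vector>

struct Edge { int to; double w; };

void insert_and_update(std::vector<std::vector<Edge>>& adj,
                       std::vector<double>& dist, std::vector<int>& parent,
                       int u, int v, double w) {
    adj[u].push_back({v, w});                    // add the new edge
    if (dist[u] + w >= dist[v]) return;          // no shortest path changes
    dist[v] = dist[u] + w;                       // v roots the affected region
    parent[v] = u;
    std::queue<int> q;                           // propagate the improvement
    q.push(v);
    while (!q.empty()) {
        int x = q.front(); q.pop();
        for (const Edge& e : adj[x])
            if (dist[x] + e.w < dist[e.to]) {
                dist[e.to] = dist[x] + e.w;
                parent[e.to] = x;
                q.push(e.to);
            }
    }
}

int main() {
    // Small directed graph with a precomputed SSSP tree from source 0
    // (assumed example data): edges 0->1 (5), 0->2 (1), 1->3 (1), 2->1 (3).
    std::vector<std::vector<Edge>> adj = {{{1, 5}, {2, 1}}, {{3, 1}}, {{1, 3}}, {}};
    std::vector<double> dist = {0, 4, 1, 5};
    std::vector<int> parent = {-1, 2, 0, 1};
    insert_and_update(adj, dist, parent, 0, 3, 2.0);   // new edge 0->3 of weight 2
    for (size_t v = 0; v < dist.size(); ++v)
        std::printf("dist[%zu] = %.0f (parent %d)\n", v, dist[v], parent[v]);
    return 0;
}
```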
Parallel connection of multiple inverters is an important means of addressing the expansion, reserve, and protection of distributed power generation. In view of the shortcomings of traditional droop control methods, such as weak anti-interference ability, low tracking accuracy of the inverter output voltage, and serious circulating currents, a finite control set model predictive control (FCS-MPC) strategy for microgrid multi-inverter parallel systems based on Mixed Logical Dynamical (MLD) modeling is proposed. First, the MLD modeling method introduces logical variables, combining discrete events and continuous events into an overall differential equation, which makes the modeling more accurate. Then a predictive controller is designed based on the model, and constraints are added to the objective function. This not only handles real-time changes of the control system through online optimization, but also achieves higher tracking accuracy of the inverter output voltage and a lower total harmonic distortion (THD), and suppresses the circulating current between the inverters, yielding good dynamic performance. Finally, simulations are carried out in MATLAB/Simulink to verify the correctness of the model and the rationality of the proposed strategy. This paper aims to provide guidance for the design and optimal control of multi-inverter parallel systems.
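The core FCS-MPC loop (enumerate the finite set of switching states, predict one step ahead for each, apply the state with the lowest cost) can be illustrated with a generic single-leg current controller. The parameters below are assumed values and the plant is a simple RL model, not the MLD multi-inverter model used in the paper.

```cpp
// Sketch: finite-control-set MPC for current tracking on an RL load.
// Each step, the admissible switching levels are enumerated, the current is
// predicted one step ahead for each, and the level with the smallest
// tracking-error cost is applied.
#include <cmath>
#include <cstdio>

int main() {
    const double PI = 3.141592653589793;
    const double Ts = 50e-6, L = 10e-3, R = 0.5, Vdc = 400.0;  // assumed parameters
    const double levels[] = {-1.0, 0.0, 1.0};                  // finite control set
    double i = 0.0;                                            // load current

    for (int k = 0; k < 400; ++k) {
        double i_ref = 10.0 * std::sin(2 * PI * 50 * k * Ts);  // 50 Hz current reference
        double best_u = 0.0, best_cost = 1e30;
        for (double u : levels) {                              // enumerate switching states
            double i_pred = i + Ts / L * (u * Vdc - R * i);    // one-step prediction
            double cost = (i_ref - i_pred) * (i_ref - i_pred); // tracking-error cost
            if (cost < best_cost) { best_cost = cost; best_u = u; }
        }
        i = i + Ts / L * (best_u * Vdc - R * i);               // apply best state (model as plant)
        if (k % 100 == 0) std::printf("k=%3d  i=%6.2f A  ref=%6.2f A\n", k, i, i_ref);
    }
    return 0;
}
```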
ISBN (print): 9798400716836
This work explores "reverse engineering" organizational patterns from distributed machine control system (DMCS) patterns. The authors analyzed four core DMCS patterns (Separate real-time, Isolate Functionalities, Variable Manager, and Notifications), utilizing the similarity between communication structures within any organization and the architectural structures in the software architecture patterns. As a result, four new corresponding organizational patterns were written. In this paper, these patterns and an outline of the ideation process are presented.
The ability to accurately estimate job runtime properties allows a scheduler to schedule jobs effectively. State-of-the-art online cluster job schedulers use history-based learning, which uses past job execution information to estimate the runtime properties of newly arrived jobs. However, with fast-paced development in cluster technology (in both hardware and software) and changing user inputs, job runtime properties can change over time, which leads to inaccurate predictions. In this article, we explore the potential and limitations of real-time learning of job runtime properties, by proactively sampling and scheduling a small fraction of the tasks of each job. Such a task-sampling-based approach exploits the similarity among runtime properties of the tasks of the same job and is inherently immune to changing job behavior. Our analytical and experimental analysis of three production traces with different skew and job distributions shows that learning in space can be substantially more accurate. Our simulation and testbed evaluation on Azure, with the two learning approaches anchored in a generic job scheduler and driven by three production cluster job traces, shows that despite its online overhead, learning in space reduces the average job completion time (JCT) by 1.28x, 1.56x, and 1.32x compared to the prior-art history-based predictor. We further analyze the experimental results to give intuitive explanations of why learning in space outperforms learning in time in these experiments. Finally, we show how sampling-based learning can be extended to schedule DAG jobs and achieve similar speedups over the prior-art history-based predictor.
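The "learning in space" idea reduces to a simple estimator: run a small pilot subset of a job's tasks and use their observed durations to predict the rest. The sketch below shows that estimator with made-up task durations and an assumed 5% sampling ratio; the scheduling policy built on top of it is omitted.

```cpp
// Sketch: task-sampling-based runtime estimation. A pilot fraction of a job's
// tasks is run first; their mean duration estimates the per-task runtime of
// the remaining tasks, instead of relying on the history of earlier jobs.
#include <algorithm>
#include <cstdio>
#include <numeric>
#include <vector>

int main() {
    std::vector<double> task_runtimes(200);
    for (size_t t = 0; t < task_runtimes.size(); ++t)
        task_runtimes[t] = 8.0 + 0.02 * (t % 7);                     // stand-in durations (s)

    size_t pilot = std::max<size_t>(1, task_runtimes.size() / 20);   // sample ~5% of tasks
    double pilot_sum = std::accumulate(task_runtimes.begin(),
                                       task_runtimes.begin() + pilot, 0.0);
    double est_task = pilot_sum / pilot;                             // estimated per-task runtime

    double true_sum = std::accumulate(task_runtimes.begin(), task_runtimes.end(), 0.0);
    std::printf("estimated per-task runtime: %.2f s (true mean %.2f s)\n",
                est_task, true_sum / task_runtimes.size());
    return 0;
}
```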