ISBN (Print): 9781450368667
Traditionally, reducing complexity in Machine Learning promises benefits such as less overfitting. However, complexity control in Genetic Programming (GP) often means reducing the sizes of the evolving expressions, and past literature shows that size reduction does not necessarily reduce overfitting. In fact, whether size consistently represents complexity is itself debatable. Therefore, this paper proposes the evaluation time of an evolving model - the computational time required to evaluate a model on data - as the estimate of its complexity. Evaluation time depends upon the size, but crucially also on the composition of an evolving model, and can thus distil its underlying complexity. To discourage complexity, this paper takes an innovative approach that asynchronously evaluates multiple models concurrently. These models race to their completion; thus, models that finish earlier join the population earlier to breed further in a steady-state fashion. The computationally simpler models, even if less accurate, therefore get further chances to evolve before the more accurate yet expensive models join the population. Crucially, since evaluation times vary from one execution to another, this paper also shows how to significantly minimise this variation. The paper compares the proposed method on six challenging symbolic regression problems with both standard GP and GP with an effective bloat control method. The results demonstrate that the proposed asynchronous parallel GP (APGP) indeed produces individuals that are smaller, faster, and more accurate than those in standard GP. While GP with bloat control (GP+BC) produced smaller individuals, it did so at the cost of lower accuracy than APGP on both training and test data, thus questioning the overall benefits of bloat control. Also, while APGP took the fewest evaluations to match the training accuracy of GP, GP+BC took the most. These results, and the portability of evaluation time as an estimate of complexity, encourage f
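The racing idea can be sketched as follows. This is an illustrative stand-in, not the paper's implementation: a model's "cost" field simulates its expression evaluation time, and models that finish evaluating first rejoin the steady-state population first.

```python
import concurrent.futures
import time

def evaluate(model):
    """Simulate evaluating a model; sleep time stands in for evaluation cost."""
    time.sleep(model["cost"])
    return model

def async_steady_state(models, workers=4):
    """Return models in the order their asynchronous evaluations finish."""
    order = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(evaluate, m) for m in models]
        for fut in concurrent.futures.as_completed(futures):
            order.append(fut.result())  # earlier finishers would breed first
    return order

# A cheap model finishes its race first, even though both start together.
models = [{"name": "expensive", "cost": 0.2}, {"name": "cheap", "cost": 0.01}]
finish_order = async_steady_state(models)
```

In a full APGP loop, each finished individual would immediately be inserted into the population and become eligible as a parent, which is how the time bias toward simpler models arises.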
ISBN (Print): 9781450372367
To solve the inter-node bandwidth bottleneck in parallel computing systems, we propose a wavelength-routing inter-node interconnect, "Optical Hub". The physical topology of Optical Hub is a star network, which leads to advantages in terms of throughput, size, energy consumption, and lifetime cost. The logical topology is a full-mesh network, which leads to advantages in terms of latency and reliability. We introduce multi-path routings, which expand the effective bandwidth of a full-mesh topology such as Optical Hub, by replacing conventional MPI functions with our wrapper functions. We simulated the execution time of parallel benchmarks on a parallel computing system with Optical Hub using the parallel computing simulator SimGrid. As a result, we confirmed that the parallel computing system with Optical Hub can achieve higher performance and lower energy consumption than conventional ones. We also examined the scalability of Optical Hub and showed that recursive hierarchical configurations of Optical Hub can drastically reduce cable count for large numbers of nodes compared with Dragonfly networks.
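The wrapper-function idea can be illustrated with a toy sketch. The names `multipath_send` and `multipath_recv` are hypothetical, and plain Python lists stand in for MPI buffers: a message is split into one chunk per logical full-mesh path and reassembled at the receiver.

```python
# Hedged sketch of multi-path routing over a full-mesh logical topology.
# Real code would wrap MPI_Send/MPI_Recv (e.g. via mpi4py); lists stand in.

def multipath_send(payload, n_paths):
    """Split the payload into n_paths chunks, one per logical path."""
    size = len(payload) // n_paths + (len(payload) % n_paths > 0)
    return [payload[i:i + size] for i in range(0, len(payload), size)]

def multipath_recv(chunks):
    """Reassemble chunks received over the separate paths, in order."""
    out = []
    for c in chunks:
        out += c
    return out

msg = list(range(100))
chunks = multipath_send(msg, 4)     # four parallel paths carry 25 items each
restored = multipath_recv(chunks)
```

The bandwidth gain in the paper comes from the chunks traversing physically distinct wavelength routes concurrently; this sketch only shows the split/merge bookkeeping the wrappers must do.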
Background: Single-cell RNA-Sequencing (scRNA-Seq) has provided single-cell level insights into complex biological processes. However, the high frequency of gene expression detection failures in scRNA-Seq data makes it challenging to achieve reliable identification of cell types and Differentially Expressed Genes (DEG). Moreover, with the explosive growth of single-cell data using the 10x Genomics protocol, existing methods will soon reach the computation limit due to scalability issues. The single-cell transcriptomics field desperately needs new tools and frameworks to facilitate large-scale single-cell analysis. Results: In order to improve the accuracy, robustness, and speed of scRNA-Seq data processing, we propose a generalized zero-inflated negative binomial mixture model, "JOINT," that can perform probability-based cell-type discovery and DEG analysis simultaneously without the need for imputation. JOINT performs soft clustering for cell-type identification by computing the probability of individual cells, i.e., each cell can belong to multiple cell types with different probabilities. This is drastically different from existing hard-clustering methods, where each cell can belong to only one cell type. The soft-clustering component of the algorithm significantly improves the accuracy and robustness of single-cell analysis, especially when the scRNA-Seq datasets are noisy and contain a large number of dropout events. Moreover, JOINT is able to determine the optimal number of cell types automatically rather than requiring it to be specified empirically. The proposed model is an unsupervised learning problem, which is solved using the Expectation-Maximization (EM) algorithm. The EM algorithm is implemented in the TensorFlow deep learning framework, dramatically accelerating data analysis through parallel GPU computing. Conclusions: Taken together, the JOINT algorithm is accurate and efficient for large-scale scRNA-Seq data analysis via parallel computing. The Py
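The soft-clustering step can be illustrated with a generic EM sketch. A one-dimensional two-component Gaussian mixture here stands in for the paper's zero-inflated negative binomial model, so the distributions and parameter names are assumptions; the E-step/M-step structure and the per-cell membership probabilities are the shared idea.

```python
import math

# Hedged EM soft-clustering sketch (Gaussian stand-in, NOT the ZINB model).
def em_gmm(data, iters=50):
    mu = [min(data), max(data)]            # crude initialisation
    sigma = [1.0, 1.0]
    pi = [0.5, 0.5]
    resp = []
    for _ in range(iters):
        # E-step: soft responsibility of each component for each point.
        # The 1/sqrt(2*pi) constant cancels in the normalisation.
        resp = []
        for x in data:
            w = [pi[k] * math.exp(-0.5 * ((x - mu[k]) / sigma[k]) ** 2) / sigma[k]
                 for k in range(2)]
            s = sum(w)
            resp.append([wk / s for wk in w])
        # M-step: re-estimate parameters from the soft counts.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            sigma[k] = max(1e-3, math.sqrt(
                sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk))
            pi[k] = nk / len(data)
    return mu, resp

# Two clearly separated "expression levels"; each point keeps a probability
# of belonging to both clusters rather than a hard label.
data = [0.1, 0.2, 0.0, 5.0, 5.2, 4.9]
mu, resp = em_gmm(data)
```

In JOINT the same EM skeleton runs over the ZINB likelihood on GPUs via TensorFlow, which is where the scalability claim comes from.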
The integration of different energy resources from traditional power systems presents new challenges for real-time implementation and operation. In the last decade, a way has been sought to optimize the operation of small microgrids (SMGs) that have a great variety of energy sources (PV (photovoltaic) prosumers, Genset CHP (combined heat and power), etc.) with uncertainty in energy production that results in different market prices. For this reason, metaheuristic methods have been used to optimize the decision-making process for multiple players in local and external markets. Players in this network include nine agents: three consumers, three prosumers (consumers with PV capabilities), and three CHP generators. This article deploys metaheuristic algorithms with the objective of maximizing power market transactions and clearing price. Since metaheuristic optimization algorithms do not guarantee global optima, an exhaustive search is deployed to find global optima points. The exhaustive search algorithm is implemented using a parallel computing architecture to reach feasible results in a short amount of time. The global optimal result is used as an indicator to evaluate the performance of the different metaheuristic algorithms. The paper presents results, discussion, comparison, and recommendations regarding the proposed set of algorithms and performance tests.
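The parallel exhaustive search used as a ground-truth benchmark can be sketched as a chunked grid search. The objective below is a made-up stand-in for the market-clearing model, and threads stand in for the parallel computing architecture; only the chunk-and-reduce structure is the point.

```python
from concurrent.futures import ThreadPoolExecutor

def objective(price):
    """Toy stand-in for the market objective; global maximum at 0.42."""
    return -(price - 0.42) ** 2

def best_in_chunk(chunk):
    return max(chunk, key=objective)

def parallel_exhaustive(lo, hi, steps, workers=4):
    """Evaluate every grid point, split across workers, reduce to the best."""
    grid = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    size = len(grid) // workers + 1
    chunks = [grid[i:i + size] for i in range(0, len(grid), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        champions = list(pool.map(best_in_chunk, chunks))
    return max(champions, key=objective)

best = parallel_exhaustive(0.0, 1.0, 10000)
```

The metaheuristics would then be scored by how close their best solution lands to this exhaustively found optimum.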
ISBN (Print): 9783030451820; 9783030451837
Parallel computing contributes significantly to most disciplines for solving scientific problems such as partial differential equations (PDEs), load balancing, and deep learning. The primary characteristic of parallelism is its ability to improve performance on many different sets of computers. Consequently, many researchers continually expend effort to produce efficient parallel solutions for various problems such as the heat equation. The heat equation describes a natural phenomenon studied in fields such as mathematics and physics, and its associated model is usually defined by a set of partial differential equations (PDEs). This paper primarily aims to present two parallel programs for solving the heat equation, which has been discretized using the finite difference method (FDM). These programs have been implemented on different parallel platforms, namely SkelGIS and the Compute Unified Device Architecture (CUDA).
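As a serial reference for what such programs compute, the 1-D heat equation u_t = alpha * u_xx under an explicit FDM scheme can be sketched as follows; the grid sizes and constants are illustrative, and the parallel versions would distribute the inner spatial loop across SkelGIS skeletons or CUDA threads.

```python
def heat_step(u, alpha, dx, dt):
    """One explicit FDM time step with fixed (Dirichlet) zero boundaries."""
    r = alpha * dt / dx ** 2          # stability requires r <= 0.5
    new = u[:]
    for i in range(1, len(u) - 1):    # this loop is what gets parallelized
        new[i] = u[i] + r * (u[i - 1] - 2 * u[i] + u[i + 1])
    return new

# Initial heat spike in the middle of the rod, ends held at zero.
u = [0.0] * 21
u[10] = 1.0
for _ in range(100):
    u = heat_step(u, alpha=1.0, dx=1.0, dt=0.4)
```

Because each grid point depends only on its immediate neighbors from the previous step, every iteration of the inner loop is independent, which is exactly the structure GPU and skeleton frameworks exploit.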
Spatial neighboring analysis is an indispensable part of geo-raster spatial analysis. In the big data era, high-resolution raster data offer us abundant and valuable information, and also bring enormous computational challenges to the existing focal statistics algorithms. Simply employing the in-memory computing framework Spark to serve such applications might incur performance issues due to its lack of native support for spatial data. In this article, we present a Spark-based parallel computing approach for the focal algorithms of neighboring analysis. This approach implements efficient manipulation of large amounts of terrain data through three steps: (1) partitioning a raster digital elevation model (DEM) file into multiple square tile files by adopting a tile-based multifile storing strategy suitable for the Hadoop Distributed File System (HDFS), (2) performing the quintessential slope algorithm on these tile files using a dynamic calculation window (DCW) computing strategy, and (3) writing back and merging the calculation results into a whole raster file. Experiments with the digital elevation data of Australia show that the proposed computing approach can effectively improve the parallel performance of focal statistics algorithms. The results also show that the approach has almost the same calculation accuracy as that of ArcGIS. The proposed approach also exhibits good scalability when the number of Spark executors in clusters is increased.
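The focal kernel at the heart of step (2) can be sketched with a 3x3 focal mean. In the article, each Spark task would run such a kernel on one tile with a one-cell halo borrowed from neighboring tiles; this illustrative version uses plain Python lists for a single small raster and clips the window at the edges, in the spirit of the dynamic calculation window.

```python
def focal_mean(grid):
    """3x3 focal mean; the window shrinks at raster edges."""
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            vals = [grid[rr][cc]
                    for rr in range(max(0, r - 1), min(rows, r + 2))
                    for cc in range(max(0, c - 1), min(cols, c + 2))]
            out[r][c] = sum(vals) / len(vals)
    return out

# Toy "DEM" tile: a single spike smoothed by its neighborhood.
dem = [[1, 1, 1], [1, 10, 1], [1, 1, 1]]
smoothed = focal_mean(dem)
```

A slope algorithm replaces the mean with a finite-difference gradient over the same window, but the tiling, halo exchange, and merge logic are identical.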
We study different parallelization schemes for the stochastic dual dynamic programming (SDDP) algorithm. We propose a taxonomy for these parallel algorithms, which is based on the concept of parallelizing by scenario and parallelizing by node of the underlying stochastic process. We develop a synchronous and asynchronous version for each configuration. The parallelization strategy in the parallel-scenario configuration aims at parallelizing the Monte Carlo sampling procedure in the forward pass of the SDDP algorithm, and thus generates a large number of supporting hyperplanes in parallel. On the other hand, the parallel-node strategy aims at building a single hyperplane of the dynamic programming value function in parallel. The considered algorithms are implemented using Julia and JuMP on a high-performance computing cluster. We study the effectiveness of the methods in terms of achieving tight optimality gaps, as well as the scalability properties of the algorithms with respect to an increasing number of CPUs. In particular, we study the effects of the different parallelization strategies on performance when increasing the number of Monte Carlo samples in the forward pass, and demonstrate through numerical experiments that such an increase may be harmful. Our results indicate that a parallel-node strategy presents certain benefits as compared to a parallel-scenario configuration.
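The parallel-scenario idea can be sketched as follows. The stage subproblem is replaced by a toy convex value function V(x) = x^2, so each simulated forward pass returns one supporting hyperplane (cut) linearized at its sampled point; the names and objective are illustrative, not the authors' Julia/JuMP code.

```python
from concurrent.futures import ThreadPoolExecutor
import random

def forward_pass(seed):
    """One Monte Carlo forward pass: sample a trial point, return a cut.

    For V(x) = x**2, the cut at x0 is V(y) >= -x0**2 + 2*x0*y,
    returned as an (intercept, slope) pair.
    """
    rng = random.Random(seed)
    x0 = rng.uniform(0.0, 1.0)        # sampled scenario / trial state
    return (-x0 * x0, 2.0 * x0)

def parallel_scenario_pass(n_scenarios, workers=4):
    """Run the forward passes concurrently; each contributes one cut."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(forward_pass, range(n_scenarios)))

cuts = parallel_scenario_pass(8)
```

The parallel-node strategy, by contrast, would distribute the work of building a single one of these hyperplanes across workers rather than generating many hyperplanes at once.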
ISBN (Print): 9783030227418; 9783030227401
Large-scale scientific code plays an important role in scientific research. In order to facilitate module and element evaluation in scientific applications, we introduce a unit testing framework and describe the demand for module-based experiment customization. We then develop a parallel version of the unit testing framework to handle long-term simulations with large amounts of data. Specifically, we apply message-passing-based parallelization and I/O behavior optimization to improve the performance of the unit testing framework, and use profiling results to guide the parallel implementation. Finally, we present a case study on a litter decomposition experiment using a standalone module from a large-scale Earth System Model. This case study is also a good demonstration of the scalability, portability, and efficiency of the framework.
The real-time tracking of dim targets in space is mainly achieved through the correlation and prediction of detected points after the detection and calculation process. The on-board tracking calculation needs to be completed in milliseconds, and must reach the microsecond level at high frame rates. For real-time tracking of dim targets in space, it is necessary to achieve general-purpose acceleration of the tracking calculation across different space regions and complex backgrounds, which places high requirements on the engineering implementation architecture. This paper designs a Kalman filter calculation based on a digital-logic parallel acceleration architecture for the real-time, on-board solution of dim target tracking. A unified Vector Processing Element (VPE) architecture was established for the Kalman filtering matrix calculations, and a VPE-based array computing structure was designed to decompose the entire filtering process into a parallel, pipelined data stream. The prediction errors under different fixed-point bit widths were analyzed and derived, and guidance for selecting the optimal bit width based on the statistical results was provided. The entire design was implemented on Xilinx's XC7K325T, resulting in an energy efficiency improvement compared to previous designs. The single-iteration calculation time does not exceed 0.7 microseconds, which meets current high-frame-rate target tracking requirements. The effectiveness of this design has been verified through simulation of random trajectory data, which is consistent with the theoretical calculation error.
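A scalar sketch of the predict/update cycle that the VPE array pipelines is given below. It uses floating point and a 1-D constant-velocity model, whereas the paper's design computes the full matrix form in fixed-point arithmetic, so all names and constants here are illustrative.

```python
def kalman_step(x, v, p, z, dt=1.0, q=0.01, r=0.1):
    """One predict/update cycle for position x, velocity v, covariance p.

    z is a noisy position measurement; q and r are process and
    measurement noise variances (illustrative values).
    """
    # Predict: propagate the state and covariance forward one frame.
    x_pred = x + v * dt
    p_pred = p + q
    # Update: blend prediction and measurement via the Kalman gain.
    k = p_pred / (p_pred + r)
    x_new = x_pred + k * (z - x_pred)
    p_new = (1 - k) * p_pred
    return x_new, v, p_new

# Track a target moving at unit speed; measurements match the motion model.
x, v, p = 0.0, 1.0, 1.0
for t in range(1, 6):
    x, v, p = kalman_step(x, v, p, z=float(t))
```

In the paper, the matrix multiplications and the gain computation inside this cycle are decomposed across the VPE array so that each iteration streams through a fixed-point pipeline in under 0.7 microseconds.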
ISBN (Print): 9781538670248
In this paper, we propose a scalable massively parallel algorithm to solve the general mapping problem in large-scale networks in real time. The proposed parallel algorithm takes advantage of the GPU architecture and launches millions of workers to calculate values on a target network simultaneously. Threads are managed through the SIMT execution model, and target values are updated through atomic operations. Our experiments show the proposed algorithm can accomplish network mapping (finding importance weights for links in a real-world large-scale shared-mobility network) with more than 2 million weights within 1.82 μs (microsecond level), which is truly real-time. The algorithm's performance suggests that mapping computations may no longer be the bottleneck in highly dynamic network-centered problems, as the computations can be completed faster than the solid-state drive (SSD) read access latency. Compared to serial algorithms, the speedup is more than 12,000 times. The proposed algorithm is also scalable. Results on simulated data show that even when the network size grows exponentially, microsecond-level computing performance can still be obtained, and more than a 190,000-times speedup can be achieved. The proposed algorithm can serve as a cornerstone for ultra-fast processing of highly dynamic large-scale networks.
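A CPU sketch of the concurrent-update idea follows. A lock stands in for the GPU's hardware atomic operations, and the trip data and link names are made up for illustration; the point is that many workers can accumulate into shared link weights without losing updates.

```python
import threading

# Shared link-weight table, updated concurrently by many workers.
weights = {}
lock = threading.Lock()

def worker(trips):
    """Each worker counts link traversals from its share of the trip data."""
    for link in trips:
        with lock:                    # CPU stand-in for GPU atomicAdd
            weights[link] = weights.get(link, 0) + 1

trip_logs = [["a-b", "b-c"], ["a-b"], ["b-c", "a-b"]]
threads = [threading.Thread(target=worker, args=(t,)) for t in trip_logs]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

On a GPU, each SIMT thread would process one trip record and call an atomic add on the weight array directly, which removes the lock contention this sketch would suffer at scale.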