A sub-pixel digital image correlation (DIC) method with a path-independent displacement tracking strategy has been implemented on the NVIDIA compute unified device architecture (CUDA) for graphics processing unit (GPU) devices. Powered by parallel computing technology, this parallel DIC (paDIC) method, which combines an inverse compositional Gauss-Newton (IC-GN) algorithm for sub-pixel registration with a fast Fourier transform-based cross-correlation (FFT-CC) algorithm for integer-pixel initial guess estimation, achieves far higher computational efficiency than a DIC method running purely on a CPU. In experiments using simulated and real speckle images, paDIC reaches computation speeds of 1.66 × 10^5 POI/s (points of interest per second) and 1.13 × 10^5 POI/s respectively, 57-76 times faster than its sequential counterpart, without sacrificing accuracy or precision. To the best of our knowledge, this is the fastest computation speed of a sub-pixel DIC method reported to date.
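As a rough illustration of the FFT-CC step described above, the following NumPy sketch estimates the integer-pixel displacement between a reference subset and a target subset by locating the cross-correlation peak in Fourier space; the function name and the zero-mean normalization are my own choices, not details taken from the paper.

```python
import numpy as np

def fft_cc_integer_shift(ref_subset, tgt_subset):
    """Estimate the integer-pixel displacement of tgt_subset relative
    to ref_subset by locating the cross-correlation peak."""
    f = np.fft.fft2(ref_subset - ref_subset.mean())
    g = np.fft.fft2(tgt_subset - tgt_subset.mean())
    cc = np.fft.ifft2(f * np.conj(g)).real   # cross-correlation surface
    peak = np.unravel_index(np.argmax(cc), cc.shape)
    # Wrap indices so the shift is centered around zero.
    return tuple(p if p <= s // 2 else p - s for p, s in zip(peak, cc.shape))
```

A sub-pixel registration step such as IC-GN would then start from this integer-pixel guess; in the paper both stages run per point of interest on the GPU.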
With the development of the smart grid and the electricity market, the uncertainty of power flow is greatly aggravated, posing a great challenge to traditional expansion methods for distribution systems in satisfying future demands. In this paper, a data-driven multi-state distribution system expansion planning (DSEP) model is explored. Innovatively, the amplitudes and profiles of uncertain factors in distribution systems are considered separately. Based on massive historical measurement data, kernel density estimation and adaptive clustering are used to aggregate the typical amplitudes and profiles of time-varying variables, respectively, without prior knowledge. Consolidating all the uncertain factors, a multi-state model is established that extends DSEP into a deterministic initial planning stage and a probabilistic re-planning stage. The objective is to minimize the overall planning cost, which accounts for the initial planning costs and the expected costs of adapting the initial plans to other future states. In this way, the flexibility of DSEP is greatly enhanced and the extra investment caused by frequent plan adjustments is reduced. To avoid the rapid growth of CPU time due to the multi-state model, an integrated differential evolution and cross-entropy algorithm implemented on a three-hierarchy parallel platform is proposed. The feasibility of the proposed data-driven multi-state DSEP model and the parallel integrated solution method is demonstrated by numerical studies on a realistic 61-bus distribution system.
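As a hedged sketch of the amplitude/profile separation described above, the snippet below applies Gaussian kernel density estimation to daily peak loads and k-means to peak-normalized daily profiles; k-means merely stands in for the paper's adaptive clustering, and all names, shapes, and parameter counts are illustrative assumptions.

```python
import numpy as np
from scipy.stats import gaussian_kde
from sklearn.cluster import KMeans

def typical_states(history, n_profiles=4, n_amplitudes=3):
    """history: (days, 24) array of hourly measurements, peaks > 0 assumed."""
    peaks = history.max(axis=1)                  # daily amplitude values
    kde = gaussian_kde(peaks)                    # density over amplitudes
    grid = np.linspace(peaks.min(), peaks.max(), 200)
    density = kde(grid)
    amp_levels = grid[np.argsort(density)[-n_amplitudes:]]  # most probable levels
    profiles = history / peaks[:, None]          # shape-only daily profiles
    km = KMeans(n_clusters=n_profiles, n_init=10).fit(profiles)
    return np.sort(amp_levels), km.cluster_centers_
```

Combining each typical amplitude with each typical profile yields the discrete states over which the multi-state planning model can then be evaluated.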
Connected component labeling is a frequently used image processing task in many applications. Moreover, in recent years, the use of 3D image data has become widespread, for instance in 3D X-ray computed tomography and magnetic resonance imaging. However, because ordinary labeling algorithms use a large amount of memory and 3D images are generally large, labeling 3D image data can cause memory shortages. Furthermore, labeling a large image is time-consuming. In this paper, we propose a new memory-efficient connected component labeling algorithm for 3D images that uses parallel computing to accelerate the labeling process. In addition, we use a span matrix and a compressed label matrix to reduce memory usage, and an equivalence-chain approach to speed up the calculation. The algorithm offers two options, one favoring processing speed and one favoring memory savings. In experiments on real examples, the proposed algorithm with the speed-oriented option was faster and used less memory than the conventional label equivalence method. With the memory-efficient option, memory usage was further reduced to between one-eighth and one-thirteenth of that used by the label equivalence method while maintaining the same performance.
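To make the span idea concrete, here is a minimal 2D run-length labeling sketch with union-find label merging; the paper's 3D, parallel, compressed-label version goes well beyond this, so treat it only as an illustration of why labeling spans instead of individual voxels saves memory.

```python
def find(parent, x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]          # path halving
        x = parent[x]
    return x

def label_spans(img):
    """img: 2D boolean array. Returns spans as [row, start, end, label]."""
    spans, parent, prev = [], [], []
    for r, row in enumerate(img):
        cur, c = [], 0
        while c < len(row):
            if row[c]:
                s = c
                while c < len(row) and row[c]:
                    c += 1                     # run of foreground pixels
                lbl = len(parent); parent.append(lbl)
                for ps, pe, pl in prev:        # merge overlapping spans
                    if ps < c and pe > s:      # 4-connectivity overlap
                        ra, rb = find(parent, lbl), find(parent, pl)
                        if ra != rb:
                            parent[ra] = rb
                cur.append((s, c, lbl))
                spans.append([r, s, c, lbl])
            else:
                c += 1
        prev = cur
    for sp in spans:
        sp[3] = find(parent, sp[3])            # resolve final labels
    return spans
```

Because only one record is stored per run rather than per pixel, memory scales with the number of spans, which is the property the 3D algorithm exploits.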
In data-intensive parallel computing clusters, it is important to provide deadline-guaranteed service to jobs while minimizing resource usage (e.g., network bandwidth and energy). Under the current computing framework (which first allocates data and then schedules jobs), in a busy cluster with many jobs it is difficult to achieve high data locality (and hence low bandwidth consumption), deadline guarantees, and high energy savings simultaneously. We model the problem of simultaneously achieving these three objectives using integer programming. Given the NP-hardness of the problem, we propose a heuristic Cooperative job Scheduling and data Allocation method (CSA). CSA reverses the order of data allocation and job scheduling in the current computing framework. Scheduling jobs first enables CSA to proactively consolidate tasks that request more common data onto the same server when conducting deadline-aware scheduling, and to consolidate tasks onto as few servers as possible to maximize energy savings. This lets the subsequent data allocation step place a data block on the server that hosts most of that block's requester tasks, thus maximally enhancing data locality. To trade off data locality against energy savings with specified weights, CSA uses a cooperative recursive refinement process that repeatedly adjusts the job schedule and the data allocation schedule. We further propose two enhancement algorithms (a minimum k-cut data reallocation algorithm and a bipartite-based task reassignment algorithm) to further improve CSA's performance through additional data reallocation and task reassignment, respectively. Trace-driven experiments in simulation and on a real cluster show that CSA outperforms other schedulers in supplying deadline-guaranteed and resource-efficient service, and that each enhancement algorithm is effective in improving CSA.
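The toy sketch below captures only the ordering CSA relies on: tasks are scheduled first (greedily consolidating tasks that share requested data), and each data block is then placed on the server hosting most of its requesters. The greedy rule, the capacity model, and all names are assumptions of mine, not the paper's algorithm.

```python
from collections import Counter, defaultdict

def schedule_then_allocate(tasks, capacity):
    """tasks: {task_id: set(block_ids)}; capacity: max tasks per server."""
    servers, placement = defaultdict(list), {}
    # Job scheduling first: put each task on the server whose current
    # tasks share the most requested blocks with it.
    for t in sorted(tasks, key=lambda t: -len(tasks[t])):
        best, best_overlap = None, -1
        for s, members in servers.items():
            if len(members) >= capacity:
                continue
            overlap = sum(len(tasks[t] & tasks[m]) for m in members)
            if overlap > best_overlap:
                best, best_overlap = s, overlap
        if best is None:
            best = len(servers)                # open a new server
        servers[best].append(t)
        placement[t] = best
    # Data allocation second: each block goes where most requesters live.
    block_home = {}
    for b in {b for req in tasks.values() for b in req}:
        votes = Counter(placement[t] for t in tasks if b in tasks[t])
        block_home[b] = votes.most_common(1)[0][0]
    return placement, block_home
```

In the paper this placement would additionally respect deadlines and be refined recursively against the energy-saving objective.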
Entity linking is a central concern of automatic knowledge question answering and knowledge base population. Traditional collective entity linking approaches consider only one of entity context or the semantic relations between entities, and therefore tend to perform poorly on Web documents; the efficiency of collective entity linking also needs to be improved. This paper proposes a collective entity linking algorithm based on a topic model and a graph. The topic model represents mentions and candidate entities by their topic distributions, making full use of document context. Entity semantic relations are represented by document similarities computed through the topic model, and parallel computing is used to reduce the long running time caused by topic model construction. An entity graph is constructed according to the relations between entities in the knowledge graph, and Hypertext-Induced Topic Search (HITS) exploits this graph to compute hub and authority values for the candidate entities; the authority value is the basis for entity linking. Experimental results on an open-domain corpus (NLPCC2014) demonstrate the validity of the proposed method: it improves F1-measure by 5.2% over AGDISTIS on the NLPCC2014 corpus.
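As a small illustration of the HITS step mentioned above, this sketch iterates hub and authority scores over an entity-graph adjacency matrix; the normalization scheme, iteration count, and toy graph are arbitrary choices of mine.

```python
import numpy as np

def hits(adj, iters=50, eps=1e-12):
    """adj[i, j] = 1 if entity i links to entity j in the entity graph.
    Returns (hub, authority) score vectors."""
    hub = np.ones(adj.shape[0])
    auth = hub.copy()
    for _ in range(iters):
        auth = adj.T @ hub                      # authorities gather hub votes
        auth /= max(np.linalg.norm(auth), eps)
        hub = adj @ auth                        # hubs point at good authorities
        hub /= max(np.linalg.norm(hub), eps)
    return hub, auth

# Toy 4-candidate graph: candidates 0-2 all point to candidate 3.
adj = np.array([[0, 0, 0, 1],
                [0, 0, 0, 1],
                [0, 0, 0, 1],
                [0, 0, 0, 0]], dtype=float)
hub, auth = hits(adj)
best_candidate = int(np.argmax(auth))           # linking uses authority value
```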
In optical networks, ensuring high quality of transmission (QoT) is essential to prevent degradation of optical signals, especially when the signal strength falls below a specified threshold. While machine learning (ML) is widely used for QoT prediction, predicting QoT accurately for large-scale optical links presents challenges: traditional serial methods often result in high latency and reduced processing efficiency of optical channels. To solve this problem, this paper proposes a Dask-based P-FEDformer approach. A FEDformer-based predictor is first constructed, and QoT prediction for multiple channels is then carried out under the Dask parallel architecture. To improve prediction accuracy, a wavelet decomposition technique is employed. Simulation results demonstrate the method's effectiveness in handling large amounts of data, with a 60% improvement in time efficiency over serial execution while maintaining accurate QoT prediction.
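The per-channel fan-out under Dask can be sketched as below; `predict_channel` is a trivial persistence stand-in for the paper's FEDformer predictor with wavelet decomposition, and all names are illustrative.

```python
import numpy as np
import dask

def predict_channel(series):
    # Stand-in predictor: persistence forecast (repeat the last value).
    # The paper uses a FEDformer model with wavelet decomposition here.
    return series[-1]

def parallel_qot(channels):
    """channels: list of per-channel QoT time series (1D arrays)."""
    tasks = [dask.delayed(predict_channel)(c) for c in channels]
    return dask.compute(*tasks)                 # one prediction per channel

# Example: 1000 channels, 96 samples each.
preds = parallel_qot([np.random.rand(96) for _ in range(1000)])
```

Because the channels are independent, Dask can distribute them across workers, which is where the reported gain over serial execution comes from.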
Due to the sustained and rapid growth of big data and the demand for higher-accuracy solutions to application problems, the completion time of fixed-time big data tasks executing on the original parallel computing systems keeps growing. To meet a fixed completion-time requirement, the original parallel computing systems need to be scaled accordingly. This paper therefore studies an iso-time scaling method to guide the scaling of parallel computing systems. First, models of big data parallel tasks and parallel computing systems are built, and an algorithm is designed to calculate the completion time of big data parallel tasks. Second, based on the actual situation of most current computing centers, we put forward some reasonable hypotheses, make full use of backup computational nodes, and optimize the cost of scaling the parallel computing systems. A vertical scaling algorithm is then designed to upgrade computational nodes, and a horizontal scaling algorithm is designed to add computational nodes. The two scaling algorithms are compared in terms of time complexity, degree of parallelism, and utilization of the scaled system. Finally, simulation experiments show that our method keeps the completion time within the fixed limit as growing data-parallel tasks execute on the scaled systems, and that it achieves lower scaling costs than traditional methods.
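A minimal sketch of the iso-time idea: estimate completion time from node speeds and add backup nodes until a fixed deadline is met. The perfectly divisible workload model and all names are simplifying assumptions; the paper's algorithms additionally handle vertical scaling (node upgrades) and cost optimization.

```python
def completion_time(workload, node_speeds):
    """Perfectly divisible workload split in proportion to node speed."""
    return workload / sum(node_speeds)

def horizontal_scale(workload, node_speeds, new_node_speed, deadline):
    """Add identical backup nodes until the deadline is met."""
    assert new_node_speed > 0
    speeds = list(node_speeds)
    while completion_time(workload, speeds) > deadline:
        speeds.append(new_node_speed)           # bring a backup node online
    return speeds

# e.g. the workload grows from 1000 to 1800 units; keep T <= 10.
print(horizontal_scale(1800, [40, 40, 20], 20, deadline=10))
```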
Scalability is an important performance metric of parallel computing, but traditional scalability metrics each reflect the scalability of parallel computing from only one perspective, which makes it difficult to measure overall performance. This paper studies scalability metrics systematically. From the many performance parameters of parallel computing, a group of key parameters is chosen and normalized, and the area of the resulting Kiviat graph is used to characterize the overall performance of parallel computing. On this basis, a novel iso-area-of-performance scalability metric for parallel computing is proposed, and its relationship to the traditional metrics is analyzed. Finally, the new metric is applied to analyze the scalability of Cannon's matrix multiplication algorithm under the LogP model. The proposed metric is useful for improving parallel computing architectures and for tuning parallel algorithm design.
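The Kiviat-area idea can be illustrated directly: with k normalized parameters plotted on equally spaced radial axes, the enclosed polygon area is a single scalar summarizing overall performance, and iso-area scaling asks how system and problem size must grow to keep that area constant. The parameter choice in the example is mine, not the paper's.

```python
import math

def kiviat_area(values):
    """Area of the Kiviat (radar) polygon for normalized parameters
    in [0, 1], one per equally spaced axis (needs >= 3 axes)."""
    n = len(values)
    wedge = math.sin(2 * math.pi / n) / 2.0     # area factor per sector
    return wedge * sum(values[i] * values[(i + 1) % n] for i in range(n))

# e.g. normalized efficiency, speedup, utilization, inverse overhead
print(kiviat_area([0.9, 0.8, 0.85, 0.7]))
```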
Purpose: The purpose of this study is to develop a new parallel metaheuristic algorithm for solving unconstrained continuous optimization problems. Design/methodology/approach: The proposed method brings several metaheuristic algorithms together to form a coalition under the Weighted Superposition Attraction-Repulsion Algorithm (WSAR) in a parallel computing environment. It runs different single-solution-based metaheuristic algorithms in parallel and employs WSAR (a recently proposed swarm-intelligence-based optimizer) as the controller. Findings: The proposed approach is tested on the latest well-known unconstrained continuous optimization problems (CEC2020), and the results are compared with those of other optimization algorithms; the comparison demonstrates the efficiency of the proposed method. Originality/value: This study combines different metaheuristic algorithms so that their diverse characteristics yield satisfactory performance on optimization problems, while parallel execution shortens the run time. Thanks to its problem-independent structure, the proposed approach can be applied to any type of optimization problem.
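Below is a much-simplified sketch of the coalition structure: several single-solution searches run in parallel and a controller keeps the best incumbent. In the paper the members are different metaheuristics and the controller is WSAR; here identical hill climbers with different seeds and a best-pick rule stand in for both.

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def sphere(x):
    return float(np.sum(x * x))                 # toy objective to minimize

def hill_climb(seed, dim=10, iters=2000, step=0.1):
    """One coalition member: a simple single-solution local search."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, dim)
    for _ in range(iters):
        cand = x + rng.normal(0.0, step, dim)
        if sphere(cand) < sphere(x):
            x = cand
    return x

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        sols = list(pool.map(hill_climb, range(4)))  # members in parallel
    best = min(sols, key=sphere)                     # controller: keep best
    print(sphere(best))
```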
Evacuation simulation has the potential to be used as part of a decision support system during large-scale incidents to provide advice to incident commanders. To be viable in such applications, the simulation must run many times faster than real time. Parallel processing reduces run times for very large computational simulations by distributing the workload among a number of processors. This paper presents the development of a parallel version of the rule-based evacuation simulation software buildingEXODUS using domain decomposition. Four case studies (CS) were tested on a cluster of 10 Intel Core 2 Duo (dual core) 3.16 GHz CPUs. CS-1 involved an idealised large geometry with 20 exits, intended to illustrate the peak computational speedup of the parallel implementation, with a population of 100,000 agents; the peak computational speedup (PCS) was 14.6 and the peak real-time speedup (PRTS) was 4.0. CS-2 was a long area with a single exit and a population of 100,000 agents; the PCS was 13.2 and the PRTS was 17.2. CS-3 was a 50-storey high-rise building with a population of 8,000/16,000 agents; the PCS was 2.48/4.49 and the PRTS was 17.9/12.9. CS-4 was a large realistic urban area with 60,000/120,000 agents; the PCS was 5.3/6.89 and the PRTS was 5.31/3.0. This level of computational performance opens evacuation simulation to a range of innovative application areas such as real-time incident support, dynamic signage in smart buildings, and virtual training environments.
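As a toy illustration of domain decomposition, the snippet below partitions agents into spatial strips, advances each strip in a separate process, and collects the agents that cross a strip boundary for hand-off to the neighbour. Real evacuation geometry, behaviour rules, and load balancing are of course far richer than this one-dimensional drift model; every name here is illustrative.

```python
import numpy as np
from multiprocessing import Pool

def step_strip(args):
    """Advance every agent in one strip one cell toward the exit at x = 0."""
    positions, x_lo, x_hi = args
    moved = positions - 1                       # 1D drift toward the exit
    stay = moved[(moved >= x_lo) & (moved < x_hi)]   # still inside this strip
    emigrants = moved[moved < x_lo]             # crossed the left boundary
    return stay, emigrants                      # emigrants are handed off

if __name__ == "__main__":
    agents = np.random.randint(0, 400, size=100_000)
    bounds = [(0, 100), (100, 200), (200, 300), (300, 400)]
    strips = [(agents[(agents >= lo) & (agents < hi)], lo, hi)
              for lo, hi in bounds]
    with Pool(len(strips)) as pool:
        results = pool.map(step_strip, strips)  # one process per subdomain
    # A real simulation would merge each strip's emigrants into its
    # neighbour before the next time step.
```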