Matching medical image data is a key factor for appropriate computer-aided diagnosis. Over the past several decades, many image-processing technologies have been developed and discussed. However, most of these methods are only of theoretical interest because their time complexity is too high for realistic handling of the huge volumes of existing medical images. This paper presents a parallel-processing model for matching large collections of MR images. A feature vector for an MR image is defined by professionals in the area of neuroscience. A matching algorithm is then developed based on comparing these feature vectors. The algorithm is shown to be well suited to parallel processing and provides acceptable results. Experiments show that the overhead of synchronizing the parallel processes is less significant than the improvement in overall efficiency.
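The feature-vector matching described above parallelizes naturally, since distances to disjoint chunks of the database can be computed independently. A minimal sketch (not the paper's actual algorithm; the vector contents and chunking strategy are illustrative assumptions):

```python
from concurrent.futures import ThreadPoolExecutor
import math

def euclidean(a, b):
    """Distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def _best_in_chunk(query, indexed_chunk):
    """Return (distance, index) of the closest vector in one chunk."""
    return min((euclidean(query, vec), idx) for idx, vec in indexed_chunk)

def parallel_match(query, database, workers=4):
    """Find the index of the database vector closest to `query`,
    searching chunks of the database concurrently."""
    indexed = list(enumerate(database))
    size = max(1, len(indexed) // workers)
    chunks = [indexed[i:i + size] for i in range(0, len(indexed), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = pool.map(lambda c: _best_in_chunk(query, c), chunks)
    # each worker's local minimum is merged in a single final reduction,
    # which is the only synchronization point
    return min(partial)[1]
```

The single final `min` over per-chunk results is the synchronization overhead the abstract refers to; it is cheap relative to the distance computations themselves.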
ISBN:
(Print) 1932415262
The simplex algorithm for linear programming has two major variants: the original, or standard, method and the revised method. Today, virtually all serious implementations are based on the revised method because it is much faster for sparse LPs, which are the most common. However, the standard method has advantages as well. First, it is effective for dense problems. While dense problems are uncommon in general, they occur frequently in important applications such as wavelet decomposition, digital filter design, text categorization, and image processing. Second, the standard method can be easily and effectively extended to a coarse-grained, distributed algorithm. We look at distributed linear programming optimized especially for loosely coupled workstations.
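For reference, the standard (tableau) method the abstract contrasts with the revised method works on a dense tableau and pivots in place. A minimal serial sketch for maximizing c^T x subject to Ax <= b, x >= 0, assuming b >= 0 so the slack basis is feasible (the distributed version in the paper partitions columns of this same tableau; that partitioning is not shown here):

```python
import numpy as np

def simplex_standard(c, A, b, tol=1e-9):
    """Dense standard-simplex tableau with Dantzig pivoting."""
    m, n = A.shape
    # tableau layout: [A | I | b], objective row [-c | 0 | 0]
    T = np.zeros((m + 1, n + m + 1))
    T[:m, :n] = A
    T[:m, n:n + m] = np.eye(m)
    T[:m, -1] = b
    T[-1, :n] = -c
    basis = list(range(n, n + m))          # slacks start basic
    while True:
        col = int(np.argmin(T[-1, :-1]))   # most negative reduced cost
        if T[-1, col] >= -tol:
            break                          # optimal
        ratios = [T[i, -1] / T[i, col] if T[i, col] > tol else np.inf
                  for i in range(m)]
        row = int(np.argmin(ratios))
        if ratios[row] == np.inf:
            raise ValueError("unbounded LP")
        T[row] /= T[row, col]              # pivot: normalize pivot row,
        for i in range(m + 1):             # then eliminate the column
            if i != row:
                T[i] -= T[i, col] * T[row]
        basis[row] = col
    x = np.zeros(n)
    for i, bv in enumerate(basis):
        if bv < n:
            x[bv] = T[i, -1]
    return x, T[-1, -1]
```

Every pivot touches the whole dense tableau, which is exactly why the standard method distributes well: the column updates are independent and can be split across loosely coupled machines.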
ISBN:
(Print) 9781450397339
Distributed data-parallel training has been widely adopted for deep neural network (DNN) models. Although current deep learning (DL) frameworks scale well for dense models such as image classification models, we find that they have relatively low scalability for sparse models such as natural language processing (NLP) models with highly sparse embedding tables. Most existing works overlook the sparsity of model parameters and thus suffer significant but unnecessary communication overhead. In this paper, we propose EmbRace, an efficient communication framework that accelerates the communications of distributed training for sparse models. EmbRace introduces Sparsity-aware Hybrid Communication, which integrates AlltoAll and model parallelism into data-parallel training to reduce the communication overhead of highly sparse parameters. To effectively overlap sparse communication with both backward and forward computation, EmbRace further designs a 2D Communication Scheduling approach that optimizes the model computation procedure, relaxes the dependencies of embeddings, and schedules the sparse communication of each embedding row with a priority queue. We implemented a prototype of EmbRace on PyTorch and Horovod and conducted comprehensive evaluations with four representative NLP models. Experimental results show that EmbRace achieves up to 2.41x speedup over state-of-the-art distributed training baselines.
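The AlltoAll step for sparse embeddings amounts to bucketing each rank's needed rows by owner and exchanging the buckets. A toy single-process sketch (EmbRace's real implementation uses collective communication primitives; the modulo ownership rule and the simulated exchange below are illustrative assumptions):

```python
def bucket_rows(needed_rows, world_size):
    """Group requested embedding-row ids by owner rank (row % world_size),
    the bucketing step that precedes an AlltoAll exchange."""
    buckets = [[] for _ in range(world_size)]
    for r in sorted(set(needed_rows)):
        buckets[r % world_size].append(r)
    return buckets

def all_to_all(send_lists):
    """Simulated AlltoAll: rank dst receives, from every rank src,
    the bucket send_lists[src][dst]."""
    world = len(send_lists)
    return [[send_lists[src][dst] for src in range(world)]
            for dst in range(world)]
```

Only the rows a batch actually touches are exchanged, which is the source of the communication savings over dense AllReduce of the whole embedding table.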
Given an n x n binary image of white and black pixels, we present an optimal parallel algorithm for computing the distance transform and the nearest-feature transform under the Euclidean metric. The algorithm employs systolic computation to achieve O(n) running time on a linear array of n processors.
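As a reference for what the transform computes: each pixel receives the Euclidean distance to the nearest feature (black) pixel. A brute-force serial sketch, O(n^4) for clarity only, against which the paper's O(n)-time systolic array is the optimized alternative:

```python
import math

def euclidean_dt(image):
    """Exact Euclidean distance transform of a binary image
    (1 = feature pixel). Each output entry is the distance from
    that pixel to its nearest feature pixel."""
    rows, cols = len(image), len(image[0])
    feats = [(i, j) for i in range(rows) for j in range(cols) if image[i][j]]
    return [[min(math.hypot(i - fi, j - fj) for fi, fj in feats)
             for j in range(cols)] for i in range(rows)]
```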
The Discrete Wavelet Transform (DWT) is becoming a widely used tool in image processing and other data-analysis areas. A non-conventional variation of a spatio-temporal 3D DWT has been developed to analyze motion in time-sequential imagery. The computational complexity of this algorithm is Θ(n³), where n is the number of samples in each dimension of the input image sequence. Methods are needed to increase the speed of these computations for large data sets. Fortunately, wavelet decomposition is very amenable to parallelization. Coarse-grained parallel versions of this process have been designed and implemented on three different architectures: a distributed network of Sun SPARCstation 2 workstations; two Intel hypercubes (an iPSC/2 and an iPSC/860); and a Thinking Machines Corporation CM-5, a massively parallel SPMD machine. This non-conventional 3D wavelet decomposition is very well suited to coarse-grained implementation on parallel computers with proper load balancing. Close-to-linear speedup over serial implementations has been achieved on the distributed network, and near-linear speedup was obtained on the hypercubes and the CM-5 for a variety of image-processing applications.
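A separable 3D DWT applies a 1D analysis step along each axis in turn, producing eight subbands per level. A minimal sketch using the Haar wavelet (the paper's non-conventional spatio-temporal variant differs; the normalization and subband naming here are illustrative choices):

```python
import numpy as np

def haar_step(x, axis):
    """One Haar analysis step along one axis: pairwise averages
    (approximation) and differences (detail), each half-length."""
    lo_idx = range(0, x.shape[axis], 2)
    hi_idx = range(1, x.shape[axis], 2)
    a = np.take(x, lo_idx, axis=axis)
    b = np.take(x, hi_idx, axis=axis)
    return (a + b) / 2.0, (a - b) / 2.0

def dwt3d_level(vol):
    """One level of a separable 3D Haar DWT: filter along x, y, then t,
    yielding 8 subbands keyed by 'L'/'H' per axis (e.g. 'LLL', 'LLH')."""
    bands = {'': vol}
    for axis in range(3):
        split = {}
        for key, data in bands.items():
            lo, hi = haar_step(data, axis)
            split[key + 'L'] = lo
            split[key + 'H'] = hi
        bands = split
    return bands
```

Because each subband is computed independently once the axis filters run, the eight bands (and, coarser still, blocks of the input volume) can be handed to different processors, which is what makes the coarse-grained parallelization natural.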
ISBN:
(Print) 9781665435741
Domain adaptation for semantic segmentation is of vital significance since it enables effective knowledge transfer from a labeled source domain (i.e., synthetic data) to an unlabeled target domain (i.e., real images), where no effort is devoted to annotating target samples. Prior domain adaptation methods are mainly based on image-to-image translation models that minimize differences in image conditions between the source and target domains. However, there is no guarantee that feature representations of different classes in the target domain are well separated, resulting in poor discriminative representations. In this paper, we propose a unified learning pipeline, called Image Translation and Representation Alignment (ITRA), for domain-adaptive segmentation. Specifically, it first aligns an image in the source domain with a reference image in the target domain using an image style-transfer technique (e.g., CycleGAN); then a novel pixel-centroid triplet loss is designed to explicitly minimize the intra-class feature variance and maximize the inter-class feature margin. Once style transfer has been performed by the former step, the latter is easy to learn and further decreases the domain shift. Extensive experiments demonstrate that the proposed pipeline facilitates both image translation and representation alignment and significantly outperforms previous methods in both the GTA5 -> Cityscapes and SYNTHIA -> Cityscapes scenarios.
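One plausible reading of a pixel-centroid triplet loss: each pixel embedding is the anchor, its own class centroid is the positive, and the nearest other-class centroid is the negative. A toy NumPy sketch under that assumption (the paper's exact formulation, margins, and centroid update rule may differ):

```python
import numpy as np

def pixel_centroid_triplet_loss(features, labels, margin=1.0):
    """features: (N, D) pixel embeddings; labels: (N,) class ids.
    Pulls each pixel toward its class centroid (intra-class variance)
    and pushes it past the nearest other-class centroid by `margin`
    (inter-class margin)."""
    classes = np.unique(labels)
    centroids = {c: features[labels == c].mean(axis=0) for c in classes}
    total = 0.0
    for f, y in zip(features, labels):
        d_pos = np.linalg.norm(f - centroids[y])
        d_neg = min(np.linalg.norm(f - centroids[c])
                    for c in classes if c != y)
        total += max(0.0, d_pos - d_neg + margin)
    return total / len(features)
```

When classes are already separated by more than the margin, every hinge term is zero, so the loss only acts where target-domain features are still entangled.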
ISBN:
(Print) 9781424414369
This paper proposes a parallel solution for retrieving images from distributed data sources using perceptual grouping of block-based visual patterns. Grouping visual patterns into an image model based on the generalized Hough transform is one of the most powerful techniques for image analysis. However, real-time applications of this method have been prohibitive because of the computational intensity of similarity searching over a large centralized image collection. A query object is decomposed into non-overlapping blocks, each of which is represented as a visual pattern obtained by detecting the line edge in the block with a moment-preserving edge detector. A voting scheme based on the generalized Hough transform provides an object-search method that is invariant to translation, rotation, and scaling of the image data. In this work, we describe a heterogeneous cluster-oriented CBIR implementation. First, the workload of performing an object search is analyzed; then a new load-balancing algorithm for the CBIR system is presented. Simulation results show that the proposed method performs well and opens a new way to design a cost-effective CBIR system.
ISBN:
(Print) 9798350363074; 9798350363081
Several parallel and distributed data mining algorithms have been proposed in the literature to perform large-scale data analysis, overcoming the bottleneck of traditional methods on a single machine. The master-worker approach greatly simplifies the synchronization of all nodes, since only the master is responsible for it; however, it also presents several problems for large-scale data-analysis tasks involving thousands or millions of nodes. This paper presents a hierarchical (or multi-level) master-worker framework for iterative parallel data-analysis algorithms that overcomes the scalability issues affecting classic master-worker solutions. Specifically, the framework is composed of multiple merger and worker nodes organized in a k-ary tree structure, in which the workers sit at the leaves and the mergers at the root and the internal nodes of the tree.
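The core idea, multi-level merging instead of a single master collecting everything, can be sketched in a few lines: leaf (worker) results are combined k at a time by merger nodes, level by level, so no node ever handles more than k messages per round. A minimal sketch (the framework's actual node placement, communication, and iteration loop are not shown; `merge` is an assumed combinable reduction):

```python
def hierarchical_reduce(worker_results, k=2, merge=sum):
    """Combine per-worker partial results through a k-ary merger tree.
    Each round, groups of up to k values are merged by one merger node;
    rounds repeat until a single root value remains."""
    level = list(worker_results)
    while len(level) > 1:
        level = [merge(level[i:i + k]) for i in range(0, len(level), k)]
    return level[0]
```

With a flat master-worker scheme the master receives len(worker_results) messages; here each merger receives at most k, and the tree has only log_k levels, which is where the scalability gain comes from.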
ISBN:
(Print) 9789898533388
Targeting TB-scale time-varying scientific datasets, this paper presents a novel static load-balancing scheme based on information entropy to improve the efficiency of a parallel adaptive volume rendering algorithm. An information-theoretic model is proposed first; the information entropy of each data patch is then computed and taken as a pre-estimate of the computational cost of ray sampling. According to these cost estimates, the data patches are distributed to the processing cores in a balanced way, reducing load imbalance in parallel rendering. Compared with existing methods such as random assignment and ray estimation, the proposed entropy-based load-balancing scheme achieves a rendering speedup of 1.23 to 2.84. Its speedup performance and view independence make it the best choice for interactive volume rendering.
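The scheme above has two steps: score each patch by the Shannon entropy of its value histogram, then assign patches to cores so the estimated loads stay even. A minimal sketch (the greedy least-loaded assignment below is one standard way to realize the balanced distribution; the paper's exact policy may differ):

```python
import math
from collections import Counter

def patch_entropy(patch):
    """Shannon entropy (bits) of a patch's value histogram, used as a
    view-independent pre-estimate of its ray-sampling cost."""
    counts = Counter(patch)
    n = len(patch)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def assign_patches(patches, num_cores):
    """Greedy static assignment: sort patches by estimated cost
    (largest first) and give each to the currently least-loaded core."""
    loads = [0.0] * num_cores
    assignment = [[] for _ in range(num_cores)]
    by_cost = sorted(((patch_entropy(p), i) for i, p in enumerate(patches)),
                     reverse=True)
    for cost, idx in by_cost:
        core = loads.index(min(loads))
        loads[core] += cost
        assignment[core].append(idx)
    return assignment, loads
```

Because entropy depends only on the data, not on the camera, the assignment can be computed once per time step, which is the view independence the abstract highlights.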
ISBN:
(Print) 0769516807
Different parallelization methods vary in their system requirements, programming styles, efficiency in exploiting parallelism, and the application characteristics they can handle. Different applications can exhibit totally different performance gains depending on the parallelization method used. This paper compares OpenMP, MPI, and Strings (a distributed shared memory system) for parallelizing a complicated tribology problem. The problem size and computing infrastructure are varied, and their impact on each parallelization method is studied. All of the methods studied exhibit good performance improvements, demonstrating the benefits of applying parallelization techniques to applications in this field.