Search Results - Inner Mongolia University Library

Mathematical Modeling of the Dynamics of Nonequilibrium in Time Convection-Diffusion Processes in Domains with Free Boundaries

CYBERNETICS AND SYSTEMS ANALYSIS, 2016, Vol. 52, No. 3, pp. 427-440

Authors: Bulavatsky, V. M.; Bogaenko, V. A. (Natl Acad Sci Ukraine, VM Glushkov Cybernet Inst, Kiev, Ukraine)

A mathematical model is constructed that describes the dynamics of a fractional-differential, locally nonequilibrium in time convection-diffusion process of soluble substances in plain-vertical established filtration with a free boundary. The respective boundary-value problem is formulated, and a technique for deriving its approximate solution is outlined. Parallel algorithms for computation on cluster systems are developed, and the results of testing the parallel algorithms on GPUs and of numerical experiments simulating the dynamics of the migration process under study are presented.

Keywords: nonclassical diffusion models; nonequilibrium in time convection diffusion process; plain-vertical filtration in porous medium; diffusion equation with delay; fractional diffusion equation; boundary-value problems; approximate solutions; parallel algorithms

Optimization-based Computation with Spiking Neurons

International Joint Conference on Neural Networks

Authors: Stephen J. Verzi; Craig M. Vineyard; Eric D. Vugrin; Meghan Galiardi; Conrad D. James; James B. Aimone (Sandia National Laboratories, Albuquerque, NM 87185-1138)

ISBN: (print) 9781509061839

Considerable effort is currently being spent designing neuromorphic hardware for addressing challenging problems in a variety of pattern-matching applications. These neuromorphic systems offer low power architectures with intrinsically parallel and simple spiking neuron processing elements. Unfortunately, these new hardware architectures have been largely developed without a clear justification for using spiking neurons to compute quantities for problems of interest. Specifically, the use of spiking for encoding information in time has not been explored theoretically with complexity analysis to examine the operating conditions under which neuromorphic computing provides a computational advantage (time, space, power, etc.). In this paper, we present and formally analyze the use of temporal coding in a neural-inspired algorithm for optimization-based computation in neural spiking architectures.

Keywords: Computer architecture; Neurons; Optimization; Program processors; parallel algorithms; Algorithm design and analysis; Linear programming

EFFICIENT ALGORITHMS FOR ASSORTATIVE EDGE SWITCH IN LARGE LABELED NETWORKS

Simulation Multiconference

Authors: Hasanuzzaman Bhuiyan; Maleq Khan; Madhav Marathe (Department of Computer Science, Network Dynamics and Simulation Science Laboratory, Biocomplexity Institute of Virginia Tech; Department of Electrical Engineering and Computer Science, Texas A&M University-Kingsville)

ISBN: (print) 9781510838222

An assortative edge switch is an operation on a labeled network, where two edges are randomly selected and the end vertices are swapped with each other if the labels of the end vertices of the edges remain invariant. Assortative edge switch has important applications in studying the mixing pattern and dynamic behavior of social networks, modeling and analyzing dynamic networks, and generating random networks. In this paper, we present an efficient sequential algorithm and a distributed-memory parallel algorithm for assortative edge switch. To our knowledge, they are the first efficient algorithms for this problem. The dependencies among successive assortative edge switch operations, the requirement of maintaining the assortative coefficient invariant, keeping the network simple, and balancing the computation loads among the processors pose significant challenges in designing a parallel algorithm. Our parallel algorithm achieves a speedup of 68 - 772 with 1024 processors for a wide variety of networks.
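
The edge switch operation defined in this abstract can be illustrated with a small sequential sketch. It is not the authors' algorithm: the function names, the data layout (an edge list plus a set of `frozenset` edges), and the reading of "labels remain invariant" as "the multiset of end-vertex label pairs over the two edges is unchanged" are all assumptions made for illustration.

```python
import random
from collections import Counter


def label_pair(labels, u, v):
    """Unordered pair of end-vertex labels for the edge (u, v)."""
    a, b = labels[u], labels[v]
    return (a, b) if a <= b else (b, a)


def try_assortative_switch(edge_list, edge_set, labels, rng):
    """Attempt one switch in place; return True if it was applied.

    edge_list: list of (u, v) tuples; edge_set: set of frozenset({u, v}).
    Both describe the same simple undirected graph (assumed layout).
    """
    i, j = rng.sample(range(len(edge_list)), 2)
    (u1, v1), (u2, v2) = edge_list[i], edge_list[j]
    if rng.random() < 0.5:  # choose one of the two possible rewirings
        u2, v2 = v2, u2
    new1, new2 = (u1, v2), (u2, v1)
    # Keep the network simple: reject self-loops and parallel edges.
    if u1 == v2 or u2 == v1:
        return False
    if frozenset(new1) in edge_set or frozenset(new2) in edge_set:
        return False
    # Assumed assortative condition: label pairs of the two edges are preserved.
    before = Counter([label_pair(labels, u1, v1), label_pair(labels, u2, v2)])
    after = Counter([label_pair(labels, *new1), label_pair(labels, *new2)])
    if before != after:
        return False
    edge_set.discard(frozenset((u1, v1)))
    edge_set.discard(frozenset((u2, v2)))
    edge_set.add(frozenset(new1))
    edge_set.add(frozenset(new2))
    edge_list[i], edge_list[j] = new1, new2
    return True
```

Under these assumptions every accepted switch preserves each vertex's degree and the count of edges between each pair of labels. The successive switches depend on one another through `edge_set`, which hints at why the abstract calls parallelization challenging.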

Keywords: Assortative edge switch; Random network generation; Network dynamics; parallel algorithms; efficient algorithm; Switches; edges; Network; PROCESSOR; Edge of a figure or solid; telecommunication standards; cellular radio; LARGE gene; Speeding

Community Detection on the GPU

International Symposium on Parallel and Distributed Processing (IPDPS)

Authors: Md. Naim; Fredrik Manne; Mahantesh Halappanavar; Antonino Tumeo (Department of Informatics, University of Bergen, Bergen, Norway; Pacific Northwest National Laboratory, Richland, WA, USA)

We present and evaluate a new GPU algorithm based on the Louvain method for community detection. Our algorithm is the first for this problem that parallelizes the access to individual edges. In this way we can fine tune the load balance when processing networks with nodes of highly varying degrees. This is achieved by scaling the number of threads assigned to each node according to its degree. Extensive experiments show that we obtain speedups up to a factor of 270 compared to the sequential algorithm. The algorithm consistently outperforms other recent shared memory implementations and is only one order of magnitude slower than the current fastest parallel Louvain method running on a Blue Gene/Q supercomputer using more than 500K threads.
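
The idea of scaling the number of threads per node with its degree can be sketched on the host side as a bucketing step. This is only an illustration of the load-balancing idea: the power-of-two rule, the `min_threads` and `max_threads` bounds, and both function names are hypothetical and are not taken from the paper.

```python
def threads_for_degree(degree, min_threads=4, max_threads=128):
    """Smallest power-of-two multiple of min_threads that covers `degree`,
    capped at max_threads (hypothetical rule, for illustration only)."""
    t = min_threads
    while t < degree and t < max_threads:
        t *= 2
    return t


def bucket_nodes_by_threads(degrees):
    """Group node ids by the thread count they would be assigned."""
    buckets = {}
    for node, degree in enumerate(degrees):
        buckets.setdefault(threads_for_degree(degree), []).append(node)
    return buckets
```

Grouping nodes this way would let each bucket be processed with a uniform thread-group size, so low-degree nodes do not leave most of a large group idle and high-degree nodes are not serialized on a single thread.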

Keywords: Graphics processing units; Optimization; Instruction sets; Message systems; parallel algorithms; Computers; Clustering algorithms

Investigation of sparse matrix multiplication task for the PDCS “Buran”

East-West Design & Test Symposium (EWDTS)

Authors: N. N. Levchenko; A. S. Okunev; D. N. Zmejev (Institute for Design Problems in Microelectronics of Russian Academy of Sciences (IPPM RAS), Moscow, Russia)

In order to improve the efficiency of the sparse matrix multiplication task on a traditional cluster supercomputer, it is necessary to take into account different levels of parallelism when programming. To work around these problems, the dataflow computing model with a dynamically formed context and the architecture of the parallel dataflow computing system can be used. The article describes the implementation of a parallel algorithm for sparse matrix multiplication on the parallel dataflow computing system. Experiments performed on an emulator of the system demonstrate the promise of the dataflow computing model for this class of tasks.

Keywords: Sparse matrices; Computational modeling; Handheld computers; parallel algorithms; Programming; Context modeling

Adaptive loss-less data compression method optimized for GPU decompression

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2017, Vol. 29, No. 24

Authors: Funasaka, Shunji; Nakano, Koji; Ito, Yasuaki (Hiroshima Univ, Dept Informat Engn, Kagamiyama 1-4-1, Higashihiroshima, Japan)

There is no doubt that data compression is very important in computer engineering. However, most lossless data compression and decompression algorithms are very hard to parallelize, because they use dictionaries updated sequentially. The main contribution of this paper is to present a new lossless data compression method that we call adaptive loss-less (ALL) data compression. It is designed so that the data compression ratio is moderate, but decompression can be performed very efficiently on the graphics processing unit (GPU). This makes sense for applications such as training of deep learning, in which compressed archived data are decompressed many times. To show the potential of the ALL data compression method, we have evaluated the running time using five images and five text data sets and compared ALL with previously published lossless data compression methods implemented on the GPU: Gompresso, CULZSS, and LZW. The data compression ratio of ALL data compression is better than the others for eight of these 10 data sets. Also, our GPU implementation of ALL decompression on a GeForce GTX 1080 GPU runs 84.0 to 231 times faster than the CPU implementation on a Core i7-4790 CPU. Further, it runs 1.22 to 23.5 times faster than Gompresso, CULZSS, and LZW running on the same GPU.
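
The claim that sequentially updated dictionaries make decompression hard to parallelize can be seen in a plain LZW codec, one of the baseline methods named in the abstract. The sketch below is a textbook byte-oriented LZW with unbounded integer codes, written for illustration only. It is not the ALL method, and the function names are assumptions.

```python
def lzw_compress(data):
    """Encode bytes as a list of integer codes (unbounded code width)."""
    table = {bytes([i]): i for i in range(256)}
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate
        else:
            codes.append(table[current])
            table[candidate] = len(table)  # dictionary grows as input is read
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes):
    """Decode a list of LZW codes back to bytes."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # Code refers to the entry being built right now.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        output.append(entry)
        # Each new entry depends on the previously decoded string, so
        # code k cannot be decoded before codes 0..k-1 have been handled.
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(output)
```

The loop in `lzw_decompress` carries `previous` and `table` from one code to the next, which is the sequential dependency the abstract refers to.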

Keywords: GPGPU; lossless data compression; parallel algorithms; parallel prefix scan

Scalable k-means clustering via lightweight coresets

arXiv, 2017

Authors: Bachem, Olivier; Lucic, Mario; Krause, Andreas (Google Brain; ETH Zurich, Switzerland)

Coresets are compact representations of data sets such that models trained on a coreset are provably competitive with models trained on the full data set. As such, they have been successfully used to scale up clustering models to massive data sets. While existing approaches generally only allow for multiplicative approximation errors, we propose a novel notion of lightweight coresets that allows for both multiplicative and additive errors. We provide a single algorithm to construct lightweight coresets for k-means clustering as well as soft and hard Bregman clustering. The algorithm is substantially faster than existing constructions, embarrassingly parallel, and the resulting coresets are smaller. We further show that the proposed approach naturally generalizes to statistical k-means clustering and that, compared to existing results, it can be used to compute smaller summaries for empirical risk minimization. In extensive experiments, we demonstrate that the proposed algorithm outperforms existing data summarization strategies in practice. Copyright © 2017, The Authors. All rights reserved.
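
The abstract does not spell out the construction. The sketch below follows the sampling rule as recalled from the paper: an equal mixture of the uniform distribution and one proportional to squared distance from the data mean, with each sampled point weighted by the inverse of `m` times its sampling probability. Treat that rule, the function names, and the pure-Python data layout (points as tuples of floats) as assumptions to check against the paper.

```python
import random


def sampling_distribution(points):
    """q(x) = 1/2 * 1/n + 1/2 * d(x, mean)^2 / sum of d(x', mean)^2."""
    n = len(points)
    dim = len(points[0])
    mean = [sum(p[k] for p in points) / n for k in range(dim)]
    sq_dist = [sum((p[k] - mean[k]) ** 2 for k in range(dim)) for p in points]
    total = sum(sq_dist)
    if total == 0:  # all points identical: fall back to uniform sampling
        return [1.0 / n] * n
    return [0.5 / n + 0.5 * d / total for d in sq_dist]


def lightweight_coreset(points, m, rng):
    """Sample m weighted points; returns a list of (point, weight) pairs."""
    q = sampling_distribution(points)
    chosen = rng.choices(range(len(points)), weights=q, k=m)
    return [(points[i], 1.0 / (m * q[i])) for i in chosen]
```

Only the mean and one pass of squared distances are needed, and both are sums over the data, which is consistent with the abstract's description of the construction as embarrassingly parallel. A weighted k-means solver would then be run on the returned pairs.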

Keywords: parallel algorithms

Predicting Viral News Events in Online Media

IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum (IPDPSW)

Authors: Xiaoyan Lu; Boleslaw Szymanski (Computer Science Department, Rensselaer Polytechnic Institute, Troy, NY, USA)

Information diffusion and dissemination define critical dynamics observed in large complex networks. The underlying information propagation topology, however, is often hidden or incomplete because of the lack of explicit citations of the sources. We propose a scalable parallel algorithm to derive node embeddings in order to better understand information dissemination patterns and predict emergent cascades of viral events in online media. Unlike previous works, which concentrate on modeling the links of information propagation, our algorithm infers the topic-specific output influence and the input selectivity of nodes. The parallel algorithm iteratively merges local node embeddings in particular communities to obtain globally optimal results, so that the processing of cascades can be significantly accelerated. Based on the obtained latent representation of nodes, the emergent cascades of viral news events in online media can be successfully predicted with 80% accuracy at an early stage. Experimental results show that our parallel inference algorithm achieves a 10-fold acceleration and requires a low communication overhead, while the accuracy of the cascade size prediction is preserved.

Keywords: Stochastic processes; Prediction algorithms; Media; Inference algorithms; Delays; Adaptation models; parallel algorithms

Coupling brain-tumor biophysical models and diffeomorphic image registration

arXiv, 2017

Authors: Scheufele, Klaudius; Mang, Andreas; Gholami, Amir; Davatzikos, Christos; Biros, George; Mehl, Miriam (University of Stuttgart, IPVS, Universitätstraße 38, Stuttgart 70569; University of Houston, Department of Mathematics, 3551 Cullen Blvd., Houston, TX 77204-3008, United States; University of Texas, ICES, 201 East 24th St, Austin, TX 78712-1229, United States; University of California Berkeley, EECS, Berkeley, CA 94720-1776, United States; Department of Radiology, University of Pennsylvania School of Medicine, 3700 Hamilton Walk, Philadelphia, PA 19104, United States)

We present the SIBIA (Scalable Integrated Biophysics-based Image Analysis) framework for joint image registration and biophysical inversion and we apply it to analyse MR images of glioblastomas (primary brain tumors). Given the segmentation of a normal brain MRI and the segmentation of a cancer patient MRI, we wish to determine tumor growth parameters and a registration map so that if we "grow a tumor" (using our tumor model) in the normal segmented image and then register it to the segmented patient image, then the registration mismatch is as small as possible. We call this "the coupled problem" because it two-way couples the biophysical inversion and registration problems. In the image registration step we solve a large-deformation diffeomorphic registration problem parameterized by an Eulerian velocity field. In the biophysical inversion step we estimate parameters in a reaction-diffusion tumor growth model that is formulated as a partial differential equation (PDE). In SIBIA, we couple these two steps in an iterative manner. We first presented the components of SIBIA in "Gholami et al, Framework for Scalable Biophysics-based Image Analysis, IEEE/ACM Proceedings of the SC2017", in which we derived parallel distributed memory algorithms and software modules for the decoupled registration and biophysical inverse problems. In this paper, our contributions are the introduction of a PDE-constrained optimization formulation of the coupled problem, the derivation of the optimality conditions, and the derivation of a Picard iterative scheme for the solution of the coupled problem. In addition, we perform several tests to experimentally assess the performance of our method on synthetic and clinical datasets. We demonstrate the convergence of the SIBIA optimization solver in different usage scenarios. We demonstrate that using SIBIA, we can accurately solve the coupled problem in three dimensions ($256^3$ resolution) in a few minutes using 11 dual-x86 *** Codes 49K20, 49

Keywords: parallel algorithms

A Distributed Algorithm for Computing a Common Fixed Point of a Family of Paracontractions

IFAC-PapersOnLine, 2016, Vol. 49, No. 18, pp. 552-557

Authors: Fullmer, Daniel; Wang, Lili; Morse, A. Stephen (Department of Electrical Engineering, New Haven, CT 06520, United States)

A distributed algorithm is described for finding a common fixed point of a family of $m > 1$ nonlinear maps $M_i : \mathbb{R}^n \to \mathbb{R}^n$, assuming that each map is a paracontraction and that such a common fixed point exists. The common fixed point is simultaneously computed by $m$ agents, assuming each agent $i$ knows only $M_i$, the current estimates of the fixed point generated by its neighbors, and nothing more. Each agent recursively updates its estimate of the fixed point by utilizing the current estimates generated by each of its neighbors. Neighbor relations are characterized by a time-dependent directed graph $\mathbb{N}(t)$ whose vertices correspond to agents and whose arcs depict neighbor relations. It is shown that for any family of paracontractions $M_i$, $i \in \{1, 2, \ldots, m\}$, which has at least one common fixed point, and any sequence of strongly connected neighbor graphs $\mathbb{N}(t)$, $t = 1, 2, \ldots$, the algorithm causes all agent estimates to converge to a common fixed point. © 2016
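
The abstract does not state the update rule, so the sketch below assumes one natural form: each agent averages the current estimates of its neighbors (itself included) and then applies its own map. The maps used here are orthogonal projections onto lines in the plane, which are paracontractions in the Euclidean norm. The function names, the fixed ring graph, and the example lines are all illustrative assumptions.

```python
def make_line_projection(a, b):
    """Orthogonal projection onto the line {x : a . x = b} in the plane."""
    norm_sq = a[0] * a[0] + a[1] * a[1]

    def project(x):
        s = (a[0] * x[0] + a[1] * x[1] - b) / norm_sq
        return (x[0] - s * a[0], x[1] - s * a[1])

    return project


def distributed_fixed_point(maps, neighbors_at, x0, steps):
    """Run the assumed update x_i <- M_i(average of neighbor estimates).

    neighbors_at(t) returns, for time t, a dict mapping each agent i to the
    list of agents whose estimates it can read (assumed to include i).
    """
    x = list(x0)
    for t in range(steps):
        neighbors = neighbors_at(t)
        updated = []
        for i, apply_map in enumerate(maps):
            group = neighbors[i]
            average = tuple(
                sum(x[j][k] for j in group) / len(group)
                for k in range(len(x[i]))
            )
            updated.append(apply_map(average))
        x = updated
    return x


# Three lines that all pass through the point (1, 2).
maps = [
    make_line_projection((1.0, 0.0), 1.0),  # x = 1
    make_line_projection((0.0, 1.0), 2.0),  # y = 2
    make_line_projection((1.0, 1.0), 3.0),  # x + y = 3
]
ring = {0: [0, 2], 1: [1, 0], 2: [2, 1]}  # strongly connected directed ring
estimates = distributed_fixed_point(
    maps, lambda t: ring, [(5.0, -3.0), (0.0, 0.0), (-4.0, 7.0)], 200
)
```

No agent sees another agent's map, only its estimate. Passing a different `neighbors_at` function lets the graph change over time, as in the setting the abstract describes.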

Keywords: Directed graphs; parallel algorithms; Common fixed point; Current estimates; Neighbor graph; nonlinear; Nonlinear map; paracontraction; Strongly connected; Time dependent
