ISBN (print): 9783030941413; 9783030941406
This paper considers a model for classifying high school students by digital traces obtained from the VKontakte social network. The classification is based on the membership of social network users in communities, which number in the hundreds of thousands and therefore give rise to big data during analysis. The problem of working with big data is solved by parallelizing computations. The classification model was developed with the aim of recovering information from the digital traces of social network users. Using the trained model, VKontakte users were identified by place of residence (village or city of the Altai Territory) and by age (grade 9 or 11) among teenagers with incomplete information on the grade and place of study in their digital traces. The best prediction accuracy of the trained model was about 0.9. In the future, it is planned to build an extended classification model by including social network users of other age groups in the data sample, and to develop a decision support system for managing the university's admissions campaign.
ISBN (print): 9781728167602
A mathematical model of the two-dimensional nonlinear case of contaminant migration to filter traps in catalytic porous media under isothermal conditions is presented. The model takes into account the micro-scale and the meso/macro-scale factors of the mass transfer process. The numerical solution of the corresponding boundary value problem was obtained by the finite difference method. Parallel computing optimization was carried out for the numerical sweep method as well as for Orleans Grain management. The algorithmic complexity of the parallel block sweep method was established. The general architecture of the new application was described. The resulting software was implemented on the Microsoft Azure cloud computing platform. After a series of numerical experiments, the performance results of the optimization (speedup ratio and parallel efficiency) were collected and discussed.
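As a rough illustration of the terms used in this abstract, the sketch below shows the serial sweep (Thomas) method for a tridiagonal system, the kind of system a finite difference discretization typically produces, together with the standard definitions of speedup ratio and parallel efficiency. It is not the paper's parallel block sweep method or its Orleans-based implementation; the function names and argument layout are assumptions made for this example.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the serial sweep (Thomas) method.

    lower[i] multiplies x[i-1] (lower[0] is unused),
    diag[i] multiplies x[i],
    upper[i] multiplies x[i+1] (upper[-1] is unused).
    Assumes the system is well conditioned (e.g. diagonally dominant).
    """
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n

    # Forward sweep: eliminate the sub-diagonal.
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom if i < n - 1 else 0.0
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom

    # Backward sweep: substitute from the last unknown upwards.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


def speedup(t_serial, t_parallel):
    """Speedup ratio: serial run time divided by parallel run time."""
    return t_serial / t_parallel


def parallel_efficiency(t_serial, t_parallel, workers):
    """Parallel efficiency: speedup divided by the number of workers."""
    return speedup(t_serial, t_parallel) / workers
```

The forward sweep depends on the previous row at every step, which is why parallel variants usually split the system into blocks rather than parallelizing this loop directly.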
ISBN (print): 9781479925650
A pressure transient analysis model for a multiple hydraulically fractured horizontal well is proposed in this paper. It takes into account the adsorption and diffusion effects of gas, the compression sensitivity effect, wellbore storage, and the skin factor. Applying the Newman product principle, a semi-analytical pressure solution is presented. Because the infinite sum and integral in the solution are highly parallel, a GPU-based parallel algorithm is implemented to accelerate the calculation. Results on a platform with an i3 540 CPU and an NVIDIA GTX 550Ti GPU show an almost 40-fold speedup, which makes real-time bottomhole pressure calculation for a multiple fractured horizontal well achievable.
ISBN (print): 9781905088294
In this work, a computational procedure for the two-scale topology optimisation problem using parallel computing techniques is developed. The goal is to obtain the best structure and material simultaneously under the minimum compliance criterion. An algorithmic strategy is presented in a form suitable for parallelization. The parallel computing facility used is an IBM Cluster 1350 computing system comprising 70 computing nodes, each with 2 dual-core processors, for a total of 280 cores. Scalability studies are performed with mechanical structures of low to moderate dimensions, and finally a very demanding computational problem (the "bone" problem) shows the applicability of the model and methodology presented.
The scenario simulation analysis of water environmental emergencies is very important for risk prevention and control, and emergency *** quickly and accurately simulate the transport and diffusion process of high-intensity pollutants during sudden environmental water pollution events, in this study, a high-precision pollution transport and diffusion model for unstructured grids based on Compute Unified Device Architecture (CUDA) is *** finite volume method of a total variation diminishing limiter with the r-factor proposed by Kong is used to reduce numerical diffusion and oscillation errors in the simulation of pollutants under sharp concentration conditions, and graphics processing unit acceleration technology is used to improve computational *** advection diffusion process of the model is verified numerically using two benchmark cases, and the efficiency of the model is evaluated using an engineering *** results demonstrate that the model performs well in the simulation of material transport in the presence of sharp ***, it has high computational *** acceleration ratio is 46 times the single-thread acceleration effect of the original *** efficiency of the accelerated model meets the requirements of an engineering application, and the rapid early warning and assessment of water pollution accidents is achieved.
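To make the role of a total variation diminishing (TVD) limiter more concrete, here is a minimal sketch of a flux-limited finite volume step for linear advection on a periodic one-dimensional uniform grid. It uses the textbook minmod limiter and the usual ratio of consecutive differences, not the r-factor attributed to Kong, and it does not attempt unstructured grids or CUDA; all function names are assumptions for illustration only.

```python
def minmod_limiter(r):
    """Minmod flux limiter: 0 at extrema, at most 1 in smooth regions."""
    return max(0.0, min(1.0, r))


def advect_step(u, courant):
    """One explicit step of u_t + a*u_x = 0 with a > 0 on a periodic grid.

    courant = a*dt/dx and must satisfy 0 <= courant <= 1.
    flux[i] is the flux through face i+1/2, divided by a.
    """
    n = len(u)
    flux = [0.0] * n
    for i in range(n):
        du_down = u[(i + 1) % n] - u[i]   # downwind difference
        du_up = u[i] - u[i - 1]           # upwind difference (u[-1] wraps around)
        # r compares neighbouring slopes; the limited term vanishes when du_down is 0.
        phi = minmod_limiter(du_up / du_down) if du_down != 0.0 else 0.0
        # First-order upwind flux plus a limited second-order correction.
        flux[i] = u[i] + 0.5 * (1.0 - courant) * phi * du_down
    return [u[i] - courant * (flux[i] - flux[i - 1]) for i in range(n)]


def total_variation(u):
    """Sum of absolute jumps between neighbouring cells (periodic)."""
    return sum(abs(u[i] - u[i - 1]) for i in range(len(u)))
```

Near a sharp concentration front the limiter falls back to the first-order upwind flux, which suppresses oscillations; in smooth regions it restores the second-order correction, which reduces numerical diffusion.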
ISBN (print): 9781538637906
Accelerating the training of Deep Neural Networks (DNNs) has been a focus of the deep learning field, and there has been much research on accelerating deep learning on different platforms. Among these platforms, the Intel Xeon Phi coprocessor is a many-core platform that provides both strong programmability and high performance. However, previous work on Intel Many Integrated Core (MIC) focused on parallel computing on the MIC alone. In this paper, we speed up the training of DNNs applied to automatic speech recognition with a CPU+MIC architecture, in which training is executed on both the MIC and the CPU. We apply several optimization methods for I/O and calculation and set up experiments to validate these methods. With all methods combined, results show that our optimized algorithm achieves about a 20x speedup over the original sequential algorithm running on a single CPU core.
In this paper, we present a general survey of parallel computing. The main contents include parallel computer systems, which are the hardware platform of parallel computing; parallel algorithms, which are its theoretical basis; and parallel programming, which is its software support. After that, we also introduce some parallel applications and enabling technologies. We argue that parallel computing research should form an integrated methodology of "architecture algorithm programming application". Only in this way can parallel computing research develop continuously and become more realistic.
Many-core parallel computing and programming pose new challenges to formal specification and verification. This paper presents a semantic model for many-core parallel computing systems so that the systems can be modeled and verified in a manageable way. The model, called the Cylinder Computation Model (CCM), is based on projection constructs in Projection Temporal Logic (PTL) and the Modeling, Simulation and Verification Language (MSVL). To this end, the syntax and semantics of CCM are presented in detail. Further, some logic laws regarding CCM are given, and the normal form of CCM programs is formalized and proved. Moreover, the operational semantics of CCM and an algorithm for implementing CCM programs within MSVL are also demonstrated. Finally, an example, a simple word processor, is given to show how CCM works under the MSVL paradigm. (C) 2012 Elsevier B.V. All rights reserved.
Phased-array ultrasonic nondestructive evaluation is an effective tool for the safety assurance of key structural components. The paper presents a general post-processing methodology for phased-array ultrasonic inspection data. The methodology integrates three components: mapping of sampling points to the structure model, re-sampling from the non-uniformly distributed sampling points of the phased array to a uniform volume, and data fusion strategies for multiple channels. An adaptive method called spatially adaptive Gaussian splatting is proposed for data re-sampling and fusion, considering the reconstruction resolution and the local characteristics of ultrasonic sound paths. This adaptivity provides a viable approach to minimizing the effects of under-sampling, over-sampling, and holes, which are introduced by the non-uniformly distributed sampling points. The processing of large-scale data through segmentation and parallelization techniques is discussed in detail. The effectiveness and performance of the proposed methodology are investigated using actual phased-array ultrasonic testing data.
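The general idea of Gaussian splatting for re-sampling can be sketched in one dimension: each scattered sample spreads its value onto nearby grid nodes with a Gaussian weight, and each node takes the weight-normalized average. In the sketch below the per-sample width `sigma` stands in for the spatial adaptivity; how the paper actually chooses it from the reconstruction resolution and sound paths is not stated in the abstract, so `sigma`, `cutoff`, and the function name are hypothetical.

```python
import math


def gaussian_splat_1d(samples, grid, cutoff=3.0):
    """Re-sample scattered (position, value, sigma) triples onto grid nodes.

    Each sample contributes to nodes within cutoff*sigma of its position.
    Nodes that receive no contribution are returned as None (holes).
    """
    weighted_sum = [0.0] * len(grid)
    weight_total = [0.0] * len(grid)
    for position, value, sigma in samples:
        for k, node in enumerate(grid):
            distance = abs(node - position)
            if distance > cutoff * sigma:
                continue  # outside this sample's footprint
            weight = math.exp(-0.5 * (distance / sigma) ** 2)
            weighted_sum[k] += weight * value
            weight_total[k] += weight
    # Normalizing by the accumulated weight fuses overlapping samples
    # without biasing densely sampled regions.
    return [s / w if w > 0.0 else None
            for s, w in zip(weighted_sum, weight_total)]
```

A wider `sigma` in sparsely sampled regions fills gaps, while a narrower one in dense regions limits blurring; the accumulation over samples is independent per grid node, which is what makes segmentation and parallel processing natural for large volumes.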
Over the past decades, high-performance computing (HPC) has been widely used in various industries. In particular, the exponential growth in GPU (graphics processing unit) capability has been key to promoting the development of artificial intelligence in real-world use cases. When a GPU is used to accelerate parallel applications, handling its programmability, resource management, and scheduling to obtain optimized performance is non-trivial. Therefore, how to exploit GPU resources effectively and improve program performance has recently been a hot research topic. Benchmarks do not always give a good picture of the performance and details of parallel applications. The variety of hardware devices and the constantly updated parallel programs make performance analysis and modeling even more difficult. This dissertation makes four main contributions. First, we conduct a study on the GPU analytical performance model, which aims to estimate a suitable number of threads per block for performance improvement. Second, a novel method to alleviate the limitations of the GPU is proposed. This method offers a new way to optimize GPU performance at the block scheduling level. Third, we propose two parallel computing abstract models, namely the computational and programming models, which represent various computing paradigms based on Flynn's taxonomy and simplify the workload distribution characteristics. This framework provides a general way to create an analytical performance model. Finally, we validate the proposed abstract models and demonstrate their usefulness with real-world applications in AI (artificial intelligence) on a distributed GPU system. The analytical performance model for a CNN (convolutional neural network) application analyzes performance characteristics on multiple GPUs, enabling users to evaluate their techniques before running applications on the target machines.
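One simple ingredient of choosing the number of threads per block is theoretical occupancy, the fraction of a multiprocessor's warp slots that resident blocks can fill. The sketch below is not the dissertation's analytical model: it ignores register and shared-memory limits, and the default hardware limits are illustrative assumptions rather than the specification of any particular GPU.

```python
def occupancy(threads_per_block, max_threads_per_sm=2048,
              max_blocks_per_sm=32, warp_size=32):
    """Theoretical occupancy of one multiprocessor for a given block size.

    All hardware limits are assumed example values; register and
    shared-memory constraints are deliberately left out.
    """
    warps_per_block = -(-threads_per_block // warp_size)  # ceiling division
    max_warps = max_threads_per_sm // warp_size
    # Resident blocks are capped by both the block limit and the warp limit.
    blocks = min(max_blocks_per_sm, max_warps // warps_per_block)
    return blocks * warps_per_block / max_warps


def best_block_sizes(candidates, **limits):
    """Return the candidate block sizes with the highest occupancy."""
    scores = {size: occupancy(size, **limits) for size in candidates}
    best = max(scores.values())
    return [size for size in candidates if scores[size] == best]
```

Under these assumed limits, very small blocks hit the block-count cap before the warp slots are full, and block sizes that do not divide the warp capacity evenly leave slots idle, which is why an analytical estimate can narrow the search before any benchmarking.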