Collaborative filtering is among the most preferred techniques when implementing recommender systems. Recently, great interest has turned toward parallel and distributed implementations of collaborative filtering algorithms. This work is a survey of parallel and distributed collaborative filtering implementations, aiming to not only provide a comprehensive presentation of the field's development but also offer future research directions by highlighting the issues that need to be developed further.
Parallel programming is the future of computer science. Nowadays, the shift to parallel processing makes it even more useful. This research effort aims to support parallelism education on real-life target systems using production-oriented software tools. It begins with a survey of software environments for parallel programming. The surveyed environments are categorized according to their main function, and an identity is synthesized for each environment from its software and project attributes. Based on these identities and a set of suitable criteria, two groups of tools are selected: those of primary and those of secondary interest for this research. An analysis of functional characteristics is performed for both groups. From the first group, an open-source software environment is chosen as the base platform, to be enriched with education-oriented enhancements. The characteristics analysis is then used to propose a research and development framework whose target is the support of parallel programming on real-life target systems using production-oriented software environments.
This paper describes the implementation of transmission-line matrix (TLM) method algorithms on a massively parallel computer (DECmpp 12000), the technique of distributed computing in the UNIX environment, and the combination of TLM analysis with Prony's method as well as with autoregressive moving average (ARMA) digital signal processing for electromagnetic field modelling. By combining these advanced computation techniques, typical electromagnetic field modelling of microwave structures by TLM analysis can be accelerated by a few orders of magnitude.
Mapping of data between nonmatching meshes is a key ingredient of multiphysics simulations. Black-box data mapping, which only operates on clouds of mesh vertices without connectivity, enables modular software environ...
Large-scale deep learning models are trained distributedly due to memory and computing resource *** existing strategy generation approaches take optimal memory minimization as the *** fill in this gap, we propose a novel algorithm that generates optimal parallelism strategies with the constraint of minimal memory *** propose a novel redundant memory cost model to calculate the memory overhead of each operator in a given parallel *** generate the optimal parallelism strategy, we formulate the parallelism strategy search problem into an integer linear programming problem and use an efficient solver to find minimal-memory intra-operator parallelism ***, the proposed algorithm has been extended and implemented in a multi-dimensional parallel training framework and is characterized by high throughput and minimal memory *** results demonstrate that our approach achieves memory savings of up to 67% compared to the latest Megatron-LM strategies; in contrast, the gap between the throughput of our approach and its counterparts is not large.
One of the challenges brought by large-scale scientific applications is how to avoid remote storage access by collectively using sufficient local storage resources to hold the huge amounts of data generated by the simulation while providing high-performance I/O. DPFS, a distributed parallel file system, is designed and implemented to address this problem. DPFS collects locally distributed, unused storage resources as a supplement to the internal storage of parallel computing systems to satisfy the storage capacity requirements of large-scale applications. In addition, like other parallel file systems, DPFS provides striping mechanisms that divide a file into small pieces and distribute them across multiple storage devices for parallel data access. The unique feature of DPFS is that it provides three file levels, each corresponding to a file striping method. In addition to the traditional linear striping method, DPFS provides a novel multidimensional striping method that can solve the performance problems of linear striping for many popular access patterns. Other issues, such as load balancing and the user interface, are also addressed in DPFS. (C) 2004 Elsevier Inc. All rights reserved.
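To make the idea of linear striping concrete, the following is a minimal sketch of a round-robin layout, not DPFS's actual implementation. The function names `stripe_file` and `locate_offset`, and the parameters `stripe_size` and `num_devices`, are hypothetical and chosen only for illustration; the abstract does not describe DPFS's interface or its multidimensional striping scheme.

```python
def stripe_file(data, stripe_size, num_devices):
    """Split `data` into fixed-size stripes and assign them round-robin.

    Returns one bytes object per (hypothetical) storage device.
    """
    if stripe_size <= 0 or num_devices <= 0:
        raise ValueError("stripe_size and num_devices must be positive")
    devices = [bytearray() for _ in range(num_devices)]
    for start in range(0, len(data), stripe_size):
        stripe_index = start // stripe_size
        # Stripe k goes to device k mod num_devices.
        devices[stripe_index % num_devices] += data[start:start + stripe_size]
    return [bytes(d) for d in devices]


def locate_offset(offset, stripe_size, num_devices):
    """Map a byte offset in the file to (device index, offset on that device)."""
    if offset < 0 or stripe_size <= 0 or num_devices <= 0:
        raise ValueError("offset must be >= 0; sizes must be positive")
    stripe_index = offset // stripe_size
    device = stripe_index % num_devices
    # Number of full stripes already stored on that device before this one.
    stripes_before = stripe_index // num_devices
    return device, stripes_before * stripe_size + offset % stripe_size


if __name__ == "__main__":
    payload = bytes(range(11))
    layout = stripe_file(payload, stripe_size=2, num_devices=3)
    for i, chunk in enumerate(layout):
        print(f"device {i}: {list(chunk)}")
    print(locate_offset(7, stripe_size=2, num_devices=3))
```

With a layout like this, consecutive stripes sit on different devices, so a large sequential read can be served by several devices at once. Access patterns that stride through the file may repeatedly hit the same device, which is the kind of limitation the abstract attributes to linear striping.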
In this paper, we discuss a von Mises plasticity model with nonlinear isotropic hardening, assuming small strains, in a plane strain example of an internally pressurised thick-walled cylinder subjected to different loading conditions. The elastic deformation is modelled using the Navier-Cauchy equation. In regions where the von Mises stress exceeds the yield stress, corrections are made locally through a return mapping algorithm. We present a novel method that uses a Radial Basis Function-Finite Difference (RBF-FD) approach with Picard iteration to solve the system of nonlinear equations arising from plastic deformation. This technique eliminates the need to stabilise the divergence operator and avoids special positioning of the boundary nodes, while preserving the elegance of the meshless discretisation and avoiding the introduction of new parameters that would require tuning. The results of the proposed method are compared with analytical and Finite Element Method (FEM) solutions. They show that the proposed method achieves accuracy comparable to FEM while offering significant advantages in the treatment of complex geometries, without the need for conventional meshing or special treatment of boundary nodes or differential operators.
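The local correction step mentioned in the abstract can be illustrated with the textbook radial return scheme for von Mises plasticity. This is a hedged sketch, not the authors' code: it assumes linear isotropic hardening (the paper uses nonlinear hardening, which would generally require an iterative solve for the plastic multiplier), and the function and parameter names (`radial_return`, `shear_modulus`, `hardening_modulus`, `yield_stress`) are hypothetical.

```python
import math


def frobenius_norm(tensor):
    """Frobenius norm of a 3x3 tensor given as nested lists."""
    return math.sqrt(sum(x * x for row in tensor for x in row))


def radial_return(s_trial, alpha, shear_modulus, hardening_modulus, yield_stress):
    """Return-map a trial deviatoric stress onto the von Mises yield surface.

    Assumes LINEAR isotropic hardening (a simplification for illustration).
    s_trial: 3x3 trial deviatoric stress; alpha: accumulated plastic strain.
    Returns (deviatoric stress, updated alpha, plastic multiplier).
    """
    norm_trial = frobenius_norm(s_trial)
    # Radius of the yield surface in deviatoric stress space.
    radius = math.sqrt(2.0 / 3.0) * (yield_stress + hardening_modulus * alpha)
    f_trial = norm_trial - radius
    if f_trial <= 0.0:
        # Elastic step: the trial state is admissible and is kept unchanged.
        return [row[:] for row in s_trial], alpha, 0.0
    # Closed-form plastic multiplier for linear hardening.
    dgamma = f_trial / (2.0 * shear_modulus + (2.0 / 3.0) * hardening_modulus)
    # Scale the trial stress back along its own direction (the "radial" return).
    scale = 1.0 - 2.0 * shear_modulus * dgamma / norm_trial
    s_new = [[scale * x for x in row] for row in s_trial]
    alpha_new = alpha + math.sqrt(2.0 / 3.0) * dgamma
    return s_new, alpha_new, dgamma


if __name__ == "__main__":
    trial = [[200.0, 0.0, 0.0], [0.0, -100.0, 0.0], [0.0, 0.0, -100.0]]
    s, a, dg = radial_return(trial, 0.0, 80000.0, 1000.0, 100.0)
    residual = frobenius_norm(s) - math.sqrt(2.0 / 3.0) * (100.0 + 1000.0 * a)
    print(f"plastic multiplier: {dg:.6e}, yield residual: {residual:.2e}")
```

After a plastic step, the returned stress lies on the updated yield surface, so the yield residual is zero up to rounding. In a solver of the kind the abstract describes, a correction like this would be applied at each node where the trial von Mises stress exceeds the yield stress.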