Search results - Inner Mongolia University Library

Distributed Censored Quantile Regression

JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2023, vol. 32, no. 4, pp. 1685-1697

Authors: Sit, Tony; Xing, Yue (Chinese Univ Hong Kong, Dept Stat, Hong Kong, Peoples R China; Purdue Univ, Dept Stat, W Lafayette, IN, USA)

This article discusses an extension of censored quantile regression to a distributed setting. With the growing availability of massive datasets, it is often arduous to analyze all the data efficiently with limited computational facilities. Our proposed method, which attempts to overcome this challenge, comprises two key steps: (i) estimation of both the Kaplan-Meier estimator and the model coefficients in a parallel computing environment; (ii) aggregation of the coefficient estimates from individual machines. We study the upper limit of the order of the number of machines for this computing environment, which, if fulfilled, guarantees that the proposed estimator converges at a rate comparable to that of the oracle estimator. In addition, we provide two further modifications for distributed systems: (i) a communication-facilitated adaptation in the sense of Chen, Liu, and Zhang and (ii) a nonparametric counterpart along the direction of Kong and Xia for censored quantile regression. Numerical experiments are conducted to compare the proposed and the existing estimators. The promising results demonstrate the computational efficiency of the proposed methods. Finally, for practical concerns, a cross-validation procedure is also developed which can better select the hyperparameters for the proposed methodologies. Supplementary materials for this article are available online.
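
The two steps named in the abstract can be pictured with a small sketch. It is illustrative only: `kaplan_meier` is the textbook product-limit estimator, and `aggregate` assumes a plain average of per-machine coefficient vectors, a rule the abstract does not specify. Both function names are hypothetical and are not taken from the article.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Textbook Kaplan-Meier curve.

    times  : observed times (event or censoring)
    events : 1 if the event was observed, 0 if the observation was censored
    Returns a list of (t, S(t)) pairs at each distinct event time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)          # everyone leaves the risk set at their time
    at_risk = len(times)
    surv, curve = 1.0, []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= leaving[t]
    return curve


def aggregate(local_estimates):
    """Assumed aggregation rule: average coefficient vectors across machines."""
    k = len(local_estimates)
    return [sum(col) / k for col in zip(*local_estimates)]


if __name__ == "__main__":
    print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
    print(aggregate([[1.0, 2.0], [3.0, 4.0]]))
```

In a real distributed run each machine would compute its own curve and coefficients on its data shard, and only the coefficient vectors would be sent to the aggregator.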

Keywords: Distributed computing; Lifetime and survival analysis; Quantile estimation; Regression

A machine-learning-accelerated distributed LBFGS method for field development optimization: algorithm, validation, and applications

COMPUTATIONAL GEOSCIENCES, 2023, vol. 27, no. 3, pp. 425-450

Authors: Alpak, Faruk; Gao, Guohua; Florez, Horacio; Shi, Steve; Vink, Jeroen; Blom, Carl; Saaf, Fredrik; Wells, Terence (Shell Int Explorat & Prod Inc, Shell Woodcreek Campus, 150 N Dairy Ashford Rd, Houston, TX 77079, USA; Shell Global Solut US Inc, Houston, TX, USA; Shell Global Solut BV, The Hague, Netherlands)

We have developed a support vector regression (SVR) accelerated variant of the distributed derivative-free optimization (DFO) method using the limited-memory BFGS Hessian updating formulation (LBFGS) for subsurface field-development optimization problems. The SVR-enhanced distributed LBFGS (D-LBFGS) optimizer is designed to effectively locate multiple local optima of highly nonlinear optimization problems subject to numerical noise. It operates on both single- and multiple-objective field-development optimization problems. The basic D-LBFGS DFO optimizer runs multiple optimization threads in parallel and uses the linear interpolation method to approximate the sensitivity matrix of simulated responses with respect to optimized model parameters. However, this approach is less accurate and slows down convergence. In this paper, we implement an effective variant of the SVR method, namely epsilon-SVR, and integrate it into the D-LBFGS engine in synchronous mode within the framework of a versatile optimization library inside a next-generation reservoir simulation platform. Because epsilon-SVR has a closed-form predictive formulation, we analytically calculate the approximated objective function and its gradients with respect to the input model variables subject to optimization. We investigate two different methods to propose a new search point for each optimization thread in each iteration through seamless integration of epsilon-SVR with the D-LBFGS optimizer. The first method estimates the sensitivity matrix and the gradients directly using the analytical epsilon-SVR surrogate and then solves an LBFGS trust-region subproblem (TRS). The second method applies a trust-region search LBFGS method to optimize the approximated objective function using the analytical epsilon-SVR surrogate within a box-shaped trust region. We first show that epsilon-SVR provides accurate estimates of gradient vectors on a set of nonlinear analytical test problems. We then report the results of nume
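
The remark that epsilon-SVR has a closed-form predictor with analytic gradients can be illustrated as follows. This is a minimal sketch, not the authors' implementation: it assumes an RBF kernel (the abstract does not name the kernel) and takes the support vectors, dual coefficients, and bias as already fitted inputs. All function and parameter names are hypothetical.

```python
import math


def rbf(x, z, gamma):
    """RBF kernel exp(-gamma * ||x - z||^2) (kernel choice is an assumption)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def svr_predict(x, support, coef, bias, gamma):
    """Closed-form SVR predictor: f(x) = bias + sum_i coef_i * K(x, s_i)."""
    return bias + sum(c * rbf(x, s, gamma) for s, c in zip(support, coef))


def svr_gradient(x, support, coef, gamma):
    """Analytic gradient of the predictor with respect to the input x."""
    grad = [0.0] * len(x)
    for s, c in zip(support, coef):
        k = rbf(x, s, gamma)
        for j in range(len(x)):
            # d/dx_j exp(-gamma * ||x - s||^2) = -2 * gamma * (x_j - s_j) * K(x, s)
            grad[j] += -2.0 * gamma * c * k * (x[j] - s[j])
    return grad


if __name__ == "__main__":
    support = [[0.0, 0.0], [1.0, 2.0], [-1.0, 0.5]]   # made-up support vectors
    coef = [0.7, -1.2, 0.4]                           # made-up dual coefficients
    x = [0.2, -0.4]
    print(svr_predict(x, support, coef, bias=0.3, gamma=0.5))
    print(svr_gradient(x, support, coef, gamma=0.5))
```

Because the gradient is available in closed form, a surrogate of this kind can be differentiated without extra simulator runs, which is the property the abstract relies on.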

Keywords: Support vector regression; SVR; Machine learning; LBFGS; D-LBFGS; Optimization; Well location optimization; WLO; Distributed computing; Ensemble optimization

Data Utilization-Based Adaptive Data Management Method for Distributed Storage System in WAN Environment

Computer Systems Science & Engineering, 2023, vol. 46, no. 9, pp. 3457-3469

Authors: Sanghyuck Nam; Jaehwan Lee; Kyoungchan Kim; Mingyu Jo; Sangoh Park (School of Computer Science and Engineering, Chung-Ang University, Seoul 06974, Korea; Department of Computer Science and Engineering, Kongju National University, Cheonan 31080, Korea; Qucell Networks, Seongnam 13590, Korea)

Recently, research on a distributed storage system that efficiently manages a large amount of data has been actively conducted following data production and demand *** expansion limits exist for traditional standalone storage systems, such as I/O and file system ***, the existing distributed storage system does not consider where data is consumed and is more focused on data dissemination and optimizing the lookup cost of data *** this leads to system performance degradation due to low locality occurring in a Wide Area Network (WAN) environment with high network *** problem hinders deploying distributed storage systems to multiple data centers over *** lowers the scalability of distributed storage systems to accommodate data storage *** paper proposes a method for distributing data in a WAN environment considering network latency and data locality to solve this problem and increase overall system *** proposed distributed storage method monitors data utilization and locality to classify data temperature as hot, warm, and *** assigned data temperature, the proposed algorithm adaptively selects the appropriate data center and places data accordingly to overcome the excess latency from the WAN environment, leading to overall system performance *** paper also conducts simulations to evaluate the proposed and existing distributed storage *** result shows that our proposed method reduced latency by 38% compared to the existing ***, the proposed method in this paper can be used in large-scale distributed storage systems over a WAN environment to improve latency and performance compared to existing methods, such as consistent hashing.
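
For context, the baseline the abstract compares against, consistent hashing, can be sketched in a few lines. This is a generic illustration of that technique, not the paper's placement algorithm. The class name, the node labels, and the number of virtual points per node (`replicas`) are all hypothetical.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a large integer position on the ring."""
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class ConsistentHashRing:
    """Minimal consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, replicas=64):
        # Each node contributes `replicas` points so load spreads more evenly.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(replicas)
        )
        self._points = [h for h, _ in self._ring]

    def lookup(self, key: str) -> str:
        """Return the node owning the first ring point clockwise from the key."""
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


if __name__ == "__main__":
    ring = ConsistentHashRing(["dc-a", "dc-b", "dc-c"])
    for name in ("obj1", "obj2", "obj3"):
        print(name, "->", ring.lookup(name))
```

Placement here depends only on the key's hash, not on where the data is read, which is the locality blind spot the abstract attributes to existing systems.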

Keywords: Distributed system; Distributed storage; Distributed computing; Object storage

Synergistic feature selection and distributed classification framework for high-dimensional medical data analysis

METHODSX, 2025, vol. 14, article 103219

Authors: Dhinakaran, D.; Srinivasan, L.; Raja, S. Edwin; Valarmathi, K.; Nayagam, M. Gomathy (Vel Tech Rangarajan Dr Sagunthala R&D Inst Sci & T, Dept Comp Sci & Engn, Chennai, India; Dr NGP Inst Technol, Dept Comp Sci & Engn, Coimbatore, India; PSR Engn Coll, Dept Elect & Commun Engn, Sivakasi, India; Ramco Inst Technol, Dept Comp Sci & Business Syst, Rajapalayam, India)

Efficient and accurate feature selection and classification are key to improving decision-making in medical data analysis. Because medical datasets are large and complex, they give rise to problems such as computational complexity, limited memory space, and fewer correct classifications. To overcome these drawbacks, a new integrated algorithm is presented here: the Synergistic Kruskal-RFE Selector and Distributed Multi-Kernel Classification Framework (SKR-DMKCF). The architecture of SKR-DMKCF reduces dimensionality while preserving useful characteristics of the image, using recursive feature elimination and multi-kernel classification in a distributed environment. Detailed evaluations were performed on four broad medical datasets and established our performance advantage. The average feature reduction ratio was 89% for the proposed method, SKR-DMKCF, which outperformed all the compared methods, achieving the best average classification accuracy of 85.3%, precision of 81.5%, and recall of 84.7%. In the efficiency evaluation, memory usage was reduced by 25% compared to the existing methods, and the speed-up time also improved significantly, supporting scalability in resource-limited environments.
• Innovative Synergistic Kruskal-RFE Selector for efficient feature selection in medical datasets.
• Distributed Multi-Kernel Classification Framework achieving superior accuracy and computational efficiency.

Keywords: Medical data analysis; Feature selection; Distributed computing; Recursive feature elimination and classification

EFFECTIVE GRAPH REPRESENTATION SUPPORTING MULTI-AGENT DISTRIBUTED COMPUTING

INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, 2014, vol. 10, no. 1, pp. 101-113

Authors: Sedziwy, Adam (AGH Univ Sci & Technol, Dept Appl Comp Sci, Al Mickiewicza 30, PL-30059 Krakow, Poland)

Parallel processing is an effective approach to solving high-complexity problems that can be represented as a set of independent or loosely coupled subproblems. In the latter case, however, the critical factor for computation time is the overhead generated by communication among subtasks. Decomposing a graph-based computational problem transforms it into a set of subproblems to be processed in parallel. A decomposition method should guarantee good performance of parallel computations with respect to communication and synchronization among the agents managing a distributed representation of the considered system. In this paper we present a novel decomposition method that reduces coupling among subproblems and thus minimizes the required cooperation among agents. Comparison and performance tests are also included.

Keywords: Graph; Slashed form; Distributed computing; Multi-agent system; Lighting computations

High-Performance Simulations for Urban Planning: Implementing Parallel Distributed Multi-Agent Systems in MATSim

23rd International Symposium on Parallel and Distributed Computing (ISPDC)

Authors: Laudan, Janek; Heinrich, Paul; Nagel, Kai (Tech Univ Berlin, Transport Syst Planning, Berlin, Germany)

ISBN: (print) 9798350369205; 9798350369199

As semiconductor design approaches physical limits, computer processing speeds are stagnating. This poses significant challenges for traffic simulations, which are becoming more and more computationally demanding. To maintain fast execution times while accommodating more complex simulations, it is essential to utilize the parallel computing capabilities of modern hardware. This paper discusses the need for an updated architectural design in the MATSim traffic simulation framework to take advantage of parallel computing infrastructures. We introduce a prototype that adapts the existing traffic simulation logic to a distributed parallel algorithm. Extensive benchmarks have been conducted to evaluate the prototype's performance and identify its limitations. The results demonstrate that the prototype performs up to 100 times faster than the current implementation. Based on these findings, we advocate for the integration of a distributed traffic simulation within the MATSim framework and outline necessary steps to enhance the prototype.

Keywords: Multi-Agent Transport Simulation; Distributed computing; MPI; Parallel computing

An O(1)-rounds Deterministic Distributed Approximation Algorithm for the Traveling Salesman Problem in Congested Clique

20th Annual International Conference on Distributed Computing in Smart Sensor Systems and the Internet of Things (DCOSS-IoT)

Authors: Saikia, Parikshit (Natl Inst Technol Silchar, Assam, India)

ISBN: (print) 9798350369458; 9798350369441

We study the Traveling Salesman Problem (TSP) in the Congested Clique Model (CCM) of distributed computing. We present a deterministic distributed algorithm that computes a tour for the TSP using O(1) rounds and O(m) messages for a given undirected weighted complete graph of n nodes and m edges, with an approximation factor of 2 relative to the optimum. The TSP has wide applications in logistics, planning, manufacturing and testing of microchips, DNA sequencing, etc., and we claim that our proposed O(1)-rounds approximation algorithm for the TSP, which is fast and efficient, can also be used to minimize energy consumption in Wireless Sensor Networks.
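
As background for the factor-2 guarantee, the classical sequential 2-approximation builds a minimum spanning tree and visits its nodes in preorder. The sketch below shows that textbook technique only. It is not the paper's Congested Clique algorithm, and it assumes Euclidean points so that the triangle inequality holds. The function names are hypothetical.

```python
import math


def mst_preorder_tour(points):
    """Return (tour, mst_weight) using Prim's MST followed by a preorder walk."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n       # cheapest known edge connecting each node to the tree
    parent = [-1] * n
    children = [[] for _ in range(n)]
    best[0] = 0.0
    mst_weight = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        mst_weight += best[u]
        if parent[u] >= 0:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    # Iterative preorder traversal of the tree rooted at node 0.
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour, mst_weight


def tour_length(points, tour):
    """Length of the closed tour that returns to its starting node."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 1), (1, 0), (2, 0.5)]
    tour, w = mst_preorder_tour(pts)
    print(tour, tour_length(pts, tour), "<=", 2 * w)
```

The tour is never longer than twice the tree weight, and the tree weight is a lower bound on the optimal tour, which gives the factor of 2 in the metric case.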

Keywords: Traveling Salesman Problem; Distributed computing; Approximation; Congested Clique

Release 2.0 - NESIM-RT: A real-time distributed spiking neural network simulator

SOFTWAREX, 2024, vol. 26

Authors: Quintana, Fernando M.; de la Torre, Juan C.; Barcena-Gonzalez, Guillermo; Guerrero-Lebrero, Maria P.; Guerrero, Elisa (Univ Cadiz, Comp Sci Dept, Intelligent Syst Res Grp, Cadiz, Spain)

NESIM-RT is a specialized tool designed for simulating neuromorphic systems. In this new release we extend its capabilities to include state-of-the-art models like the AdexLIF and Izhikevich, and to incorporate dynamic synaptic mechanisms such as Spike-Timing Dependent Plasticity (STDP). With these new features, researchers can now observe in real time how different parameters influence these models and learning rules, thereby gaining deeper insights into neuronal function and network dynamics.
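
To make one of the named models concrete, the following sketch integrates a single Izhikevich neuron with forward Euler steps. It is a standalone illustration of the published model equations and does not use or reproduce the NESIM-RT code. The function name, step size, and the regular-spiking parameter defaults (a=0.02, b=0.2, c=-65, d=8) are choices made for this example.

```python
def simulate_izhikevich(current, t_ms=1000.0, dt=0.5, a=0.02, b=0.2, c=-65.0, d=8.0):
    """Simulate one Izhikevich neuron and return its spike times in ms.

    v' = 0.04 v^2 + 5 v + 140 - u + I
    u' = a (b v - u)
    On a spike (v >= 30 mV): v <- c, u <- u + d
    """
    v = c
    u = b * v
    spikes = []
    for k in range(int(t_ms / dt)):
        if v >= 30.0:
            spikes.append(k * dt)   # record the spike, then reset
            v = c
            u += d
        v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + current)
        u += dt * a * (b * v - u)
    return spikes


if __name__ == "__main__":
    print("spikes with I=10:", len(simulate_izhikevich(10.0)))
    print("spikes with I=0: ", len(simulate_izhikevich(0.0)))
```

Varying `a`, `b`, `c`, `d`, or the input current changes the firing pattern, which is the kind of parameter exploration the abstract describes.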

Keywords: Spiking neural network; Neuromorphic systems; Distributed computing; Synaptic plasticity

Distributed Matrix Computations With Low-Weight Encodings

IEEE JOURNAL ON SELECTED AREAS IN INFORMATION THEORY, 2023, vol. 4, pp. 363-378

Authors: Das, Anindya Bijoy; Ramamoorthy, Aditya; Love, David J.; Brinton, Christopher G. (Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47906, USA; Iowa State Univ, Dept Elect & Comp Engn, Ames, IA 50011, USA)

Straggler nodes are well-known bottlenecks of distributed matrix computations which induce reductions in computation/communication speeds. A common strategy for mitigating such stragglers is to incorporate Reed-Solomon based MDS (maximum distance separable) codes into the framework; this can achieve resilience against an optimal number of stragglers. However, these codes assign dense linear combinations of submatrices to the worker nodes. When the input matrices are sparse, these approaches increase the number of non-zero entries in the encoded matrices, which in turn adversely affects the worker computation time. In this work, we develop a distributed matrix computation approach where the assigned encoded submatrices are random linear combinations of a small number of submatrices. In addition to being well suited for sparse input matrices, our approach continues to have the optimal straggler resilience in a certain range of problem parameters. Moreover, compared to recent sparse matrix computation approaches, the search for a "good" set of random coefficients to promote numerical stability in our method is much more computationally efficient. We show that our approach can efficiently utilize partial computations done by slower worker nodes in a heterogeneous system which can enhance the overall computation speed. Numerical experiments conducted through Amazon Web Services (AWS) demonstrate up to 30% reduction in per worker node computation time and 100x faster encoding compared to the available methods.
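
The idea of straggler-resilient coded computation can be shown with a toy example: a matrix is split into two row blocks, a third worker receives their sum, and the product with a vector can be recovered from any two of the three workers. This is the simplest textbook (3, 2) code, given only as an illustration. It is not the low-weight random encoding proposed in the paper, and the function names are hypothetical.

```python
def matvec(M, x):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]


def encode(A):
    """Split A into two row blocks and add one parity block (A1 + A2).

    Assumes A has an even number of rows.
    """
    half = len(A) // 2
    A1, A2 = A[:half], A[half:]
    parity = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(A1, A2)]
    return [A1, A2, parity]


def decode(results):
    """Recover A @ x from the outputs of any two workers.

    results maps worker index (0, 1, or 2) to that worker's output vector.
    """
    if 0 in results and 1 in results:
        y1, y2 = results[0], results[1]
    elif 0 in results:
        y1 = results[0]
        y2 = [p - a for p, a in zip(results[2], y1)]   # A2 x = (A1 + A2) x - A1 x
    else:
        y2 = results[1]
        y1 = [p - b for p, b in zip(results[2], y2)]   # A1 x = (A1 + A2) x - A2 x
    return y1 + y2


if __name__ == "__main__":
    A = [[1, 2], [3, 4], [5, 6], [7, 8]]
    x = [1, -1]
    shards = encode(A)
    # Pretend worker 1 is a straggler and never reports back.
    partial = {0: matvec(shards[0], x), 2: matvec(shards[2], x)}
    print(decode(partial), matvec(A, x))
```

Note that the parity block is as dense as the sum of its inputs; limiting how many submatrices are combined per worker is the sparsity concern the abstract addresses.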

Keywords: Distributed computing; MDS codes; Stragglers; Condition number; Sparsity

SimE4KG: Distributed and Explainable Multi-Modal Semantic Similarity Estimation for Knowledge Graphs

INTERNATIONAL JOURNAL OF SEMANTIC COMPUTING, 2023, vol. 17, no. 2, pp. 199-221

Authors: Draschner, Carsten Felix; Jabeen, Hajira; Lehmann, Jens (Univ Bonn, Machine Learning & Artificial Intelligence Lab, Bonn, Germany; GESIS Leibniz Inst Social Sci, D-50667 Cologne, Germany; Amazon Alexa AI, Dresden, Germany)

In recent years, exciting sources of data have been modeled as knowledge graphs (KGs). This modeling represents both structural relationships and the entity-specific multi-modal data in KGs. In various data analytics pipelines and machine learning (ML), the task of semantic similarity estimation plays a significant role. Assigning similarity values to entity pairs is needed in recommendation systems, clustering, classification, entity matching/disambiguation and many others. Efficient and scalable frameworks are needed to handle the quadratic complexity of all-pair semantic similarity on Big Data KGs. Moreover, heterogeneous KGs demand multi-modal semantic similarity estimation to cover the versatile contents like categorical relations between classes or their attribute literals like strings, timestamps or numeric data. In this paper, we propose the SimE4KG framework as a resource providing generic open-source modules that perform semantic similarity estimation in multi-modal KGs. To justify the computational costs of similarity estimation, the SimE4KG generates reproducible, reusable and explainable results. The pipeline results are a native semantic RDF KG, including the experiment results, hyper-parameter setup and explanation of the results, like the most influential features. For fast and scalable execution in memory, we implemented the distributed approach using Apache Spark. The entire development of this framework is integrated into the holistic distributed Semantic ANalytics StAck (SANSA).
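
The quadratic cost of all-pair similarity mentioned in the abstract is easy to see in a small single-machine sketch. It assumes Jaccard similarity over sets of entity features, a measure chosen here for illustration since the abstract does not name one. It does not use SimE4KG, SANSA, or Apache Spark, and the entity and feature labels are made up.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def all_pairs_similarity(features):
    """Score every unordered entity pair: n * (n - 1) / 2 comparisons."""
    return {
        (e1, e2): jaccard(features[e1], features[e2])
        for e1, e2 in combinations(sorted(features), 2)
    }


if __name__ == "__main__":
    features = {
        "m1": {"actor:A", "genre:drama"},
        "m2": {"actor:A", "genre:comedy"},
        "m3": {"genre:comedy"},
    }
    for pair, score in all_pairs_similarity(features).items():
        print(pair, round(score, 3))
```

The number of comparisons grows quadratically with the number of entities, which is why the abstract argues for a distributed execution engine.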

Keywords: Semantic similarity; Knowledge graphs; Distributed computing; Explainable artificial intelligence; Scalable semantic processing; RDF; Apache Spark; Machine learning
