Distributed computing has attracted significant recent attention for speeding up large-scale computations by disseminating computational jobs from a central master node across several worker nodes/servers. However, wo...
ISBN (digital): 9798350368741
ISBN (print): 9798350368758
Pre-training has emerged as a dominant paradigm in graph representation learning to address data scarcity and generalization challenges. The majority of existing methods primarily focus on refining fine-tuning and prompting techniques to extract information from pre-trained models. However, the effectiveness of these approaches is contingent upon the quality of the pre-trained knowledge (i.e., latent representations). Inspired by the recent success in topological representation learning, we propose a novel pre-training strategy to capture and learn topological information of graphs. The key to the success of our strategy is to pre-train expressive Graph Neural Networks (GNNs) at the level of individual nodes while accounting for the key topological characteristics of a graph, so that GNNs become sufficiently powerful to effectively encode input graph information. The proposed model is designed to be seamlessly integrated with various downstream graph representation learning tasks. Data and code are available at https://***/pyliang-graph/Topology-Informed-Pre-training-of-Graph-Neural-Networks.
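The abstract does not specify which topological characteristics are used, so the following is only an illustrative sketch of the kind of node-level topological descriptor such a pre-training target could be built from: the local clustering coefficient, computed here from a plain adjacency-list dictionary (the graph and all names are made up for illustration).

```python
# Hedged sketch: one node-level topological descriptor (local clustering
# coefficient) that a topology-informed pre-training objective might use
# as a per-node target. Not the paper's actual method.

def clustering_coefficient(adj, v):
    """Fraction of a node's neighbour pairs that are themselves linked."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i, u in enumerate(nbrs)
                for w in nbrs[i + 1:] if w in adj[u])
    return 2.0 * links / (k * (k - 1))

# Toy graph: a triangle {0, 1, 2} with a pendant node 3 attached to 0.
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0]}
print(clustering_coefficient(adj, 1))  # 1.0: both neighbours of 1 are linked
print(clustering_coefficient(adj, 3))  # 0.0: a degree-1 node has no pairs
```

In a pre-training setup, such descriptors would serve as auxiliary regression targets for per-node GNN outputs, encouraging the encoder to capture local topology.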
Inferring causal relationships as directed acyclic graphs (DAGs) is an important but challenging problem. Differentiable Causal Discovery (DCD) is a promising approach to this problem, framing the search as a continuous optimization. But existing DCD methods are numerically unstable, with poor performance beyond tens of variables. In this paper, we propose Stable Differentiable Causal Discovery (SDCD), a new method that improves previous DCD methods in two ways: (1) It employs an alternative constraint for acyclicity; this constraint is more stable, both theoretically and empirically, and fast to compute. (2) It uses a training procedure tailored for sparse causal graphs, which are common in real-world scenarios. We first derive SDCD and prove its stability and correctness. We then evaluate it with both observational and interventional data and in both small-scale and large-scale settings. We find that SDCD outperforms existing methods in convergence speed and accuracy, and can scale to thousands of variables.
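SDCD's actual acyclicity constraint is not reproduced here; as a hedged sketch of the general DCD ingredient the abstract refers to, the snippet below implements a truncated-series acyclicity penalty of the NOTEARS/DAGMA family, h(W) = tr((I + W∘W/d)^d) − d, which is zero exactly when the weighted adjacency matrix W encodes no directed cycle. All matrices are toy examples.

```python
# Hedged sketch: a differentiable acyclicity penalty in the spirit of
# differentiable causal discovery. SDCD uses a different (more stable)
# constraint; this polynomial variant only illustrates the idea that
# acyclicity can be expressed as a smooth scalar penalty.

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def acyclicity_penalty(W):
    d = len(W)
    A = [[W[i][j] ** 2 for j in range(d)] for i in range(d)]      # A = W ∘ W
    M = [[(1.0 if i == j else 0.0) + A[i][j] / d for j in range(d)]
         for i in range(d)]                                        # I + A/d
    P = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    for _ in range(d):                                             # (I + A/d)^d
        P = matmul(P, M)
    return sum(P[i][i] for i in range(d)) - d                      # trace - d

dag    = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]   # 1 -> 2 -> 3: acyclic
cyclic = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]   # 1 -> 2 -> 3 -> 1: a cycle
print(acyclicity_penalty(dag))     # 0.0 for any DAG
print(acyclicity_penalty(cyclic))  # strictly positive
```

In a DCD method this scalar is added to the data-fit loss (e.g., via an augmented Lagrangian), so gradient descent is pushed toward acyclic solutions.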
Solving partial differential equations (PDEs) is a common task in numerical mathematics and scientific computing. Typical discretization schemes, for example, finite element (FE), finite volume (FV), or finite differe...
This paper studies the big data ecosystem and regulatory approaches to the technological issues arising in Kazakhstan. Key points are considered in terms of the importance and necessity of managing data at various level...
ISBN (digital): 9798350382846
ISBN (print): 9798350382853
In this paper we consider recovering combinatorial objects from many noisy observations. The first part of the paper concerns reconstructing trees from traces in the tree edit distance model. Previous work focused on reconstructing various classes of labelled trees, while our work gives reductions from the classic string reconstruction setting to unlabelled trees. In the second part of the paper we discuss combinatorial identities of the binary deletion channel on finite and infinite strings. We link probabilities of observing bits in a trace to derivatives of certain generating functions. We also give identities for the deletion channel, conditioned on traces having fixed length.
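The paper's generating-function identities concern exact trace probabilities; as a hedged illustration of the channel model itself, the sketch below simulates the binary deletion channel: each bit of the source string is deleted independently with probability q, and a "trace" is the concatenation of the surviving bits. The source string and deletion rate are made up for illustration.

```python
import random

# Hedged sketch: sampling traces from the binary deletion channel.
# This toy only produces empirical samples; it does not compute the
# exact probabilities the paper's identities describe.

def deletion_channel(s, q, rng):
    """Delete each bit of s independently with probability q."""
    return "".join(b for b in s if rng.random() >= q)

rng = random.Random(0)
source = "1011001"
traces = [deletion_channel(source, 0.3, rng) for _ in range(5)]
print(traces)  # each trace is a subsequence of the source
```

Averaging indicator statistics over many such traces is the empirical counterpart of the probabilities the paper ties to derivatives of generating functions.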
ISBN (digital): 9798331525439
ISBN (print): 9798331525446
The stock market has emerged as a revolutionary force, transforming global economies by democratizing wealth creation. With the use of Artificial Intelligence and Machine Learning, investors can navigate complicated financial terrain. Yet even as these tools improve predictive accuracy, the underlying complexities and varied market opinions can make effective decision-making difficult. This study emphasizes trend analysis and prediction of future values, enabling users to make smart investment decisions with little effort. Using a hybrid machine learning model, the method brings together several top-performing algorithms to improve predictive power and trend observation. The most important components of the project are Market Data Retrieval, Time Series Forecasting, Sentiment Analysis, and Data Visualization, all of which assist in creating thorough reports that aid strategic investment decisions.
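The abstract does not detail the hybrid model's components, so the following only sketches the simplest trend signal such a time-series pipeline might include: a moving-average crossover over closing prices. All price values are made up for illustration.

```python
# Hedged sketch: a moving-average crossover trend signal, one of the
# simplest building blocks a stock trend-analysis pipeline could use.
# Not the paper's hybrid model.

def moving_average(prices, window):
    """Simple moving average; returns one value per full window."""
    return [sum(prices[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(prices))]

closes = [10, 11, 12, 11, 13, 14, 15, 14, 16, 17]
short = moving_average(closes, 2)
long_ = moving_average(closes, 4)
# An upward trend is signalled when the short average sits above the long one.
print(short[-1], long_[-1])  # 16.5 15.5
```

A hybrid system would combine such signals with forecasts and sentiment scores before reporting a recommendation.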
We study distributed goodness-of-fit testing for discrete distributions under bandwidth and differential privacy constraints. Information-constrained distributed goodness-of-fit testing is a problem that has received co...
ISBN (digital): 9798350390971
ISBN (print): 9798350390988
This survey explores the synergistic potential of Large Language Models (LLMs) and Vector databases (VecDBs), a burgeoning but rapidly evolving research area. With the proliferation of LLMs comes a host of challenges, including hallucinations, outdated knowledge, prohibitive commercial application costs, and memory issues. VecDBs emerge as a compelling solution to these issues by offering an efficient means to store, retrieve, and manage the high-dimensional vector representations intrinsic to LLM operations. Through this nuanced review, we delineate the foundational principles of LLMs and VecDBs and critically analyze their integration’s impact on enhancing LLM functionalities. This discourse extends into a discussion on the speculative future developments in this domain, aiming to catalyze further research into optimizing the confluence of LLMs and VecDBs for advanced data handling and knowledge extraction capabilities.
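The retrieval primitive a vector database offers an LLM pipeline can be sketched in a few lines: nearest-neighbour search over embeddings by cosine similarity. Real VecDBs use approximate indexes (e.g., HNSW or IVF) rather than the brute-force scan below, and the document IDs and embedding vectors here are made up for illustration.

```python
import math

# Hedged sketch: brute-force cosine-similarity retrieval, illustrating
# the store/retrieve interface a vector database provides to an LLM
# pipeline. Production systems replace the linear scan with an ANN index.

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def top_k(query, store, k=1):
    """store: list of (doc_id, vector); returns ids ranked by similarity."""
    ranked = sorted(store, key=lambda item: cosine(query, item[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

store = [("doc_a", [1.0, 0.0, 0.0]),
         ("doc_b", [0.0, 1.0, 0.0]),
         ("doc_c", [0.7, 0.7, 0.0])]
print(top_k([0.9, 0.1, 0.0], store, k=2))  # ['doc_a', 'doc_c']
```

In a retrieval-augmented setup, the returned documents are injected into the LLM's prompt, which is how VecDBs mitigate hallucination and stale knowledge.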
Even though dropout is a popular regularization technique, its theoretical properties are not fully understood. In this paper we study dropout regularization in extended generalized linear models based on double expon...