As the complexity of machine learning (ML) model architectures increases, it is important to understand to what degree simpler and more efficient architectures can align with their complex counterparts. In this paper, we investigate the degree to which a Message Passing Neural Network (MPNN) can operate similarly to a Graph Transformer. We do this by training an MPNN to align with the intermediate embeddings of a Relational Transformer (RT). Throughout this process, we explore variations of the standard MPNN and assess the impact of different components on the degree of alignment. Our findings suggest that an MPNN can align with the RT, and that the components that most affect the alignment are the MPNN’s permutation-invariant aggregation function, virtual node and layer normalisation. Copyright 2024 by the author(s).
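As an illustration of what such an alignment objective can look like in practice, the following sketch trains a toy message-passing layer, with the three components named above (permutation-invariant sum aggregation, a virtual-node term and layer normalisation), against precomputed RT intermediate embeddings using a mean-squared-error loss. The layer design and the names MPNNLayer, alignment_loss and rt_embs are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class MPNNLayer(nn.Module):
    # Toy message-passing layer with the components studied in the abstract:
    # permutation-invariant (sum) aggregation, a virtual-node term, and LayerNorm.
    def __init__(self, dim):
        super().__init__()
        self.msg = nn.Linear(2 * dim, dim)
        self.upd = nn.Linear(2 * dim, dim)
        self.norm = nn.LayerNorm(dim)

    def forward(self, h, edge_index, vnode):
        src, dst = edge_index                                 # each of shape [num_edges]
        m = self.msg(torch.cat([h[src], h[dst]], dim=-1))     # per-edge messages
        agg = torch.zeros_like(h).index_add_(0, dst, m)       # sum messages per target node
        h = self.upd(torch.cat([h, agg + vnode], dim=-1))     # add broadcast virtual-node state
        return self.norm(h)

def alignment_loss(mpnn_embs, rt_embs):
    # MSE between the MPNN's and the Relational Transformer's per-layer embeddings.
    return sum(nn.functional.mse_loss(a, b) for a, b in zip(mpnn_embs, rt_embs))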
We explore graph rewiring methods that optimise commute time. Recent graph rewiring approaches facilitate long-range interactions in sparse graphs, making such rewirings commute-time-optimal on average. However, when an expert prior exists on which node pairs should or should not interact, a superior rewiring would favour short commute times between these privileged node pairs. We construct two synthetic datasets with known priors reflecting realistic settings, and use these to motivate two bespoke rewiring methods that incorporate the known prior. We investigate the regimes where our rewiring improves test performance on the synthetic datasets. Finally, we perform a case study on a real-world citation graph to investigate the practical implications of our work. Copyright 2024 by the author(s).
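For concreteness, commute times can be read off the pseudoinverse of the graph Laplacian, and a prior-aware rewiring can then add shortcut edges only between the privileged node pairs. The sketch below (numpy/networkx, nodes assumed to be labelled 0..n-1, and the names commute_times and rewire_with_prior made up for illustration) shows this generic idea rather than the paper's specific rewiring methods.

import numpy as np
import networkx as nx

def commute_times(G):
    # C(u, v) = vol(G) * (L+_uu + L+_vv - 2 * L+_uv), with L+ the Laplacian pseudoinverse.
    L = nx.laplacian_matrix(G).toarray().astype(float)
    L_pinv = np.linalg.pinv(L)
    vol = 2 * G.number_of_edges()
    d = np.diag(L_pinv)
    return vol * (d[:, None] + d[None, :] - 2 * L_pinv)

def rewire_with_prior(G, privileged_pairs, threshold):
    # Add a shortcut edge for each privileged pair whose commute time is too large.
    C = commute_times(G)
    H = G.copy()
    for u, v in privileged_pairs:
        if C[u, v] > threshold:
            H.add_edge(u, v)
    return H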
Identifying how much a model p̂_θ(Y|X) knows about the stochastic real-world process p(Y|X) it was trained on is important to ensure it avoids producing incorrect or "hallucinated" answers or taking unsafe actions. But this is difficult for generative models because probabilistic predictions do not distinguish between per-response noise (aleatoric uncertainty) and lack of knowledge about the process (epistemic uncertainty), and existing epistemic uncertainty quantification techniques tend to be overconfident when the model underfits. We propose a general strategy for teaching a model to both approximate p(Y|X) and also estimate the remaining gaps between p̂_θ(Y|X) and p(Y|X): train it to predict pairs of independent responses drawn from the true conditional distribution, allow it to "cheat" by observing one response while predicting the other, then measure how much it cheats. Remarkably, we prove that being good at cheating (i.e. cheating whenever it improves your prediction) is equivalent to being second-order calibrated, a principled extension of ordinary calibration that allows us to construct provably-correct frequentist confidence intervals for p(Y|X) and detect incorrect responses with high probability. We demonstrate empirically that our approach accurately estimates how much models don't know across ambiguous image classification, (synthetic) language modeling, and partially-observable navigation tasks, outperforming existing techniques. Copyright 2024 by the author(s)
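The sketch below illustrates the pair-prediction idea for a small classifier, under the assumption that a hypothetical model head produces logits over response pairs (y1, y2) for a given input. pair_nll is the pair-prediction training objective, and cheating_gap measures how much observing y1 changes the prediction of y2; both names and the total-variation measure are illustrative stand-ins for the paper's second-order-calibration machinery.

import torch
import torch.nn.functional as F

def pair_nll(pair_logits, y1, y2, num_classes):
    # Training objective: predict two independent responses (y1, y2) for the same x.
    # pair_logits: [batch, num_classes * num_classes]; y1, y2: [batch] class indices.
    logp = F.log_softmax(pair_logits.view(-1, num_classes * num_classes), dim=-1)
    return F.nll_loss(logp, y1 * num_classes + y2)

def cheating_gap(pair_logits, num_classes):
    # For one input: how much does observing y1 ("cheating") change the prediction of y2?
    joint = F.softmax(pair_logits.view(-1), dim=0).view(num_classes, num_classes)  # p(y1, y2 | x)
    marg_y2 = joint.sum(dim=0)                              # p(y2 | x), no cheating
    marg_y1 = joint.sum(dim=1, keepdim=True)                # p(y1 | x)
    cond_y2 = joint / marg_y1.clamp_min(1e-12)              # p(y2 | y1, x), with cheating
    # Expected total-variation distance between cheating and non-cheating predictions.
    return ((cond_y2 - marg_y2).abs().sum(dim=1) * marg_y1.squeeze(1)).sum() / 2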
Despite achieving promising fairness-error trade-offs, in-processing mitigation techniques for group fairness cannot be employed in numerous practical applications with limited computation resources or no access to the training pipeline of the prediction model. In these situations, post-processing is a viable alternative. However, current methods are tailored to specific problem settings and fairness definitions and hence are not as broadly applicable as in-processing. In this work, we propose a framework that turns any regularized in-processing method into a post-processing approach. This procedure prescribes a way to obtain post-processing techniques for a much broader range of problem settings than the prior post-processing literature. We show theoretically and through extensive experiments that our framework preserves the good fairness-error trade-offs achieved with in-processing and can improve over the effectiveness of prior post-processing methods. Finally, we demonstrate several advantages of a modular mitigation strategy that disentangles the training of the prediction model from the fairness mitigation, including better performance on tasks with partial group labels. Copyright 2024 by the author(s)
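One way to picture such a modular setup (a loose illustration only, not the procedure proposed in the paper): keep the prediction model frozen and fit a small post-hoc map on its outputs with the same kind of regularized objective an in-processing method would use. Here fairness_penalty stands in for any differentiable group-fairness regularizer, and fit_postprocessor and all other names are hypothetical.

import torch
import torch.nn as nn

def fit_postprocessor(frozen_logits, labels, groups, fairness_penalty, lam=1.0, steps=200):
    # frozen_logits: outputs of the frozen prediction model on held-out data.
    head = nn.Linear(frozen_logits.shape[1], frozen_logits.shape[1])  # post-hoc adjustment
    opt = torch.optim.Adam(head.parameters(), lr=1e-2)
    for _ in range(steps):
        out = head(frozen_logits)
        # Task loss plus a group-fairness regularizer, applied only post hoc.
        loss = nn.functional.cross_entropy(out, labels) + lam * fairness_penalty(out, groups)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return head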
Following natural language instructions by executing actions in digital environments (e.g. web-browsers and REST APIs) is a challenging task for language model (LM) agents. Unfortunately, LM agents often fail to generalize to new environments without human demonstrations. This work presents BAGEL, a method for bootstrapping LM agents without human supervision. BAGEL converts a seed set of randomly explored trajectories or synthetic instructions into demonstrations via round-trips between two noisy LM components: an LM labeler which converts a trajectory into a synthetic instruction, and a zero-shot LM agent which maps the synthetic instruction into a refined trajectory. By performing these round-trips iteratively, BAGEL quickly shifts the initial distribution of trajectories towards those that are well-described by natural language. We use BAGEL demonstrations to adapt a zero-shot LM agent at test time via in-context learning over retrieved demonstrations, and find improvements of over 2-13% absolute on ToolQA and MiniWob++, with up to 13× reduction in execution failures. Copyright 2024 by the author(s)
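A schematic of the round-trip loop described above, with placeholder callables lm_label (trajectory to synthetic instruction) and lm_act (instruction to refined trajectory) standing in for the two noisy LM components; the function name and loop structure are illustrative, not BAGEL's actual code.

def bagel_round_trips(seed_trajectories, lm_label, lm_act, num_rounds=3):
    demonstrations = []
    trajectories = list(seed_trajectories)
    for _ in range(num_rounds):
        refined = []
        for traj in trajectories:
            instruction = lm_label(traj)      # LM labeler: trajectory -> synthetic instruction
            new_traj = lm_act(instruction)    # zero-shot LM agent: instruction -> refined trajectory
            refined.append(new_traj)
            demonstrations.append((instruction, new_traj))
        trajectories = refined                # distribution drifts toward well-described behavior
    return demonstrations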
Langevin Dynamics is a Stochastic Differential Equation (SDE) central to sampling and generative modeling and is implemented via time discretization. Langevin Monte Carlo (LMC), based on the Euler-Maruyama discretizat...
A common method to study deep learning systems is to use simplified model representations: for example, using singular value decomposition to visualize the model's hidden states in a lower-dimensional space. This approach assumes that the results of these simplifications are faithful to the original model. Here, we illustrate an important caveat to this assumption: even if the simplified representations can accurately approximate the full model on the training set, they may fail to accurately capture the model's behavior out of distribution. We illustrate this by training Transformer models on controlled datasets with systematic generalization splits, including the Dyck balanced-parenthesis languages and a code completion task. We simplify these models using tools like dimensionality reduction and clustering, and then explicitly test how these simplified proxies match the behavior of the original model. We find consistent generalization gaps: cases in which the simplified proxies are more faithful to the original model on the in-distribution evaluations and less faithful on various tests of systematic generalization. This includes cases where the original model generalizes systematically but the simplified proxies fail, and cases where the simplified proxies generalize better. Together, our results raise questions about the extent to which mechanistic interpretations derived using tools like SVD can reliably predict what a model will do in novel situations. Copyright 2024 by the author(s)
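The kind of faithfulness check described above can be made concrete as follows (a minimal sketch, assuming a hypothetical readout function that maps hidden states to logits): build a rank-k proxy of a hidden-state matrix with truncated SVD and compare how often proxy-based predictions agree with the full model on in-distribution versus out-of-distribution inputs.

import numpy as np

def low_rank_proxy(H, k):
    # Rank-k reconstruction of the hidden-state matrix H via truncated SVD.
    U, S, Vt = np.linalg.svd(H, full_matrices=False)
    return (U[:, :k] * S[:k]) @ Vt[:k]

def agreement(H, readout, k):
    # Fraction of inputs where predictions from the simplified proxy match the full model.
    full_preds = readout(H).argmax(axis=-1)
    proxy_preds = readout(low_rank_proxy(H, k)).argmax(axis=-1)
    return (full_preds == proxy_preds).mean()

# A generalization gap in faithfulness is then agreement(H_id, readout, k) - agreement(H_ood, readout, k).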
Adaptive regularization based optimization methods such as full-matrix Adagrad which use gradient second-moment information hold significant potential for fast convergence in deep neural network (DNN) training, but ar...
Autoregressive Large Language Models (LLMs) have achieved impressive performance in language tasks but face two significant bottlenecks: (1) quadratic complexity in the attention module as the number of tokens increases, and (2) limited efficiency due to the sequential processing nature of autoregressive LLMs during generation. While linear attention and speculative decoding offer potential solutions, their applicability and synergistic potential for enhancing autoregressive LLMs remain uncertain. We conduct the first comprehensive study on the efficacy of existing linear attention methods for autoregressive LLMs, integrating them with speculative decoding. We introduce an augmentation technique for linear attention that ensures compatibility with speculative decoding, enabling more efficient training and serving of LLMs. Extensive experiments and ablation studies involving seven existing linear attention models and five encoder/decoder-based LLMs consistently validate the effectiveness of our augmented linearized LLMs. Notably, our approach achieves up to a 6.67 reduction in perplexity on the LLaMA model and up to a 2× speedup during generation compared to prior linear attention methods. Codes and models are available at https://***/GATECH-EIC/Linearized-LLM. Copyright 2024 by the author(s)
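For readers unfamiliar with why linear attention helps at generation time, the sketch below shows a generic causal linear-attention step with an elu+1 feature map: per-token running sums replace the quadratic attention matrix, so each new token is processed at constant cost in the sequence length. This is a textbook-style illustration, not the augmentation proposed in the paper.

import torch

def causal_linear_attention(q, k, v):
    # q, k, v: [seq, dim]; returns [seq, dim_v] using running sums instead of an attention matrix.
    phi = lambda x: torch.nn.functional.elu(x) + 1            # positive feature map
    q, k = phi(q), phi(k)
    kv_state = torch.zeros(k.shape[1], v.shape[1])            # running sum of outer(k_t, v_t)
    k_state = torch.zeros(k.shape[1])                         # running sum of k_t
    out = []
    for t in range(q.shape[0]):
        kv_state = kv_state + torch.outer(k[t], v[t])
        k_state = k_state + k[t]
        out.append(q[t] @ kv_state / (q[t] @ k_state + 1e-6)) # normalized attention output
    return torch.stack(out)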
We evaluate the performance disparity of the Whisper and MMS families of ASR models across the VoxPopuli and Common Voice multilingual datasets, with an eye toward intersectionality. Our two most important findings ar...