Search Results - Inner Mongolia University Library

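Help

Field codes:

T = title (book title, article title), A = author (responsible party), K = subject term, P = publication name, PU = publisher name, O = organization (author affiliation, degree-granting institution, patent applicant), L = Chinese Library Classification number, C = discipline classification number, U = all fields, Y = year (publication year, degree year, standard issue year)

Search rules:

AND means "and"; OR means "or"; NOT means "does not contain". (Operators must be uppercase, with one space on each side.)

Search examples:

Example 1: (K=图书馆学 OR K=情报学) AND A=范并思 AND Y=1982-2016
Example 2: P=计算机应用与软件 AND (U=C++ OR U=Basic) NOT K=Visual AND Y=2011-2016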
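The field codes and operators in the help text can be combined programmatically. The following is a minimal sketch that assembles the first example query as a string. The helper names `field`, `any_of`, and `all_of` are hypothetical and are not part of the library's search system; the sketch only formats text according to the stated rules (uppercase operators, one space on each side).

```python
def field(code, value):
    """Format one field term, e.g. K=图书馆学 (code is one of T, A, K, P, ...)."""
    return f"{code}={value}"


def any_of(*terms):
    """Join terms with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*terms):
    """Join terms with AND, one space on each side of the operator."""
    return " AND ".join(terms)


# Rebuild the first example query from the help text.
query = all_of(
    any_of(field("K", "图书馆学"), field("K", "情报学")),
    field("A", "范并思"),
    field("Y", "1982-2016"),
)
print(query)
```

Building the string from small helpers keeps the operator spacing and capitalization consistent, which the search rules require.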


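Refine search results

Document type

452 records: conference papers
27 records: journal articles

Holdings

479 records: electronic documents
0 items: print holdings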

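Discipline classification

376 records: Engineering
- 255 records: Computer Science and Technology...
- 233 records: Control Science and Engineering
- 86 records: Electrical Engineering
- 73 records: Software Engineering
- 51 records: Mechanical Engineering
- 30 records: Petroleum and Natural Gas Engineering
- 20 records: Bioengineering
- 17 records: Information and Communication Engineering
- 15 records: Mechanics (may be conferred in Engineering, Sci...
- 12 records: Biomedical Engineering (may be conferred...
- 9 records: Power Engineering and Engineering Thermo...
- 8 records: Electronic Science and Technology (may...
- 8 records: Transportation Engineering
- 6 records: Materials Science and Engineering (may...
- 6 records: Civil Engineering
- 6 records: Safety Science and Engineering
- 5 records: Chemical Engineering and Technology
- 5 records: Environmental Science and Engineering (may...
- 4 records: Architecture
- 4 records: Naval Architecture and Ocean Engineering
84 records: Science
- 40 records: Mathematics
- 36 records: Biology
- 28 records: Systems Science
- 20 records: Statistics (may be conferred in Science, ...
- 15 records: Physics
- 5 records: Chemistry
33 records: Management
- 28 records: Management Science and Engineering (may...
- 12 records: Business Administration
10 records: Education
- 10 records: Education
9 records: Medicine
3 records: Military Science
2 records: Economics
2 records: Law
1 record: Agriculture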

Topics

38 records: reinforcement le...
21 records: machine learning
18 records: neural networks
15 records: heuristic algori...
13 records: adaptive control
12 records: vehicle dynamics
12 records: control systems
12 records: dynamics
10 records: optimization
10 records: sliding mode con...
10 records: trajectory
10 records: deep reinforceme...
9 records: optimal control
9 records: robustness
8 records: deep learning
8 records: simulation
8 records: robotics
7 records: model predictive...
7 records: nonlinear system...
7 records: data models

Institutions

5 records: mit cambridge ma...
4 records: georgia inst tec...
4 records: univ calif berke...
3 records: univ calif san d...
3 records: univ penn philad...
3 records: stanford univ st...
3 records: univ michigan de...
2 records: ohio state univ ...
2 records: duke univ durham...
2 records: natl renewable e...
2 records: school of electr...
2 records: katholieke univ ...
2 records: univ penn dept e...
2 records: durban univ tech...
2 records: delft univ techn...
2 records: college of autom...
2 records: univ bonn autono...
2 records: zhejiang univ de...
2 records: school of law he...
2 records: carnegie mellon ...

Authors

3 records: vamvoudakis kyri...
3 records: zavlanos michael...
3 records: zhang baosen
3 records: li na
3 records: wang cong
3 records: hazan elad
3 records: cong wang
2 records: gokdag mustafa
2 records: soffker dirk
2 records: michiels w.
2 records: nakahira yorie
2 records: cui wenqi
2 records: zico kolter j.
2 records: tomizuka masayos...
2 records: fradkov alexande...
2 records: pravesh
2 records: levine sergey
2 records: minasyan edgar
2 records: pavlichenko dmyt...
2 records: fan chuchu

Language

466 records: English
9 records: Other
5 records: Chinese

Search condition: "Any field = 5th Annual Conference on Learning for Dynamics and Control"

479 records in total; showing records 21-30


Equilibria of Fully Decentralized Learning in Networked Systems

5th Annual Conference on Learning for Dynamics and Control

Authors: Jiang, Yan; Cui, Wenqi; Zhang, Baosen; Cortes, Jorge
Affiliations: Univ Washington Dept Elect & Comp Engn Seattle WA 98195 USA; Univ Calif San Diego Dept Mech & Aerosp Engn San Diego CA 92093 USA

Existing settings of decentralized learning either require players to have full information or the system to have certain special structure that may be hard to check and hinder their applicability to practical systems. To overcome this, we identify a structure that is simple to check for linear dynamical systems, where each player learns in a fully decentralized fashion to minimize its cost. We first establish the existence of pure strategy Nash equilibria in the resulting noncooperative game. We then conjecture that the Nash equilibrium is unique provided that the system satisfies an additional requirement on its structure. We also introduce a decentralized mechanism based on projected gradient descent to have agents learn the Nash equilibrium. Simulations on a 5-player game validate our results.

Keywords: Decentralized control; multi-agent learning; Nash equilibrium; noncooperative game
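To illustrate the idea of agents reaching a Nash equilibrium through projected gradient descent, here is a minimal sketch on a static quadratic game. It is not the paper's setting (which concerns linear dynamical systems) or the authors' mechanism: the cost form, the coefficients `a`, `b`, `c`, the box constraint, and the function names are all assumptions chosen for illustration. Each player `i` is assumed to minimize `0.5*a[i]*x[i]**2 + x[i]*sum(b[i][j]*x[j] for j != i) - c[i]*x[i]` using only the gradient with respect to its own action.

```python
def project(value, lower, upper):
    """Project a scalar onto the interval [lower, upper]."""
    return min(max(value, lower), upper)


def projected_gradient_play(a, b, c, lower, upper, step=0.1, iters=500):
    """Simultaneous projected gradient descent in an assumed quadratic game.

    Player i only uses the partial derivative of its own cost with respect
    to its own action x[i].
    """
    n = len(a)
    x = [0.0] * n
    for _ in range(iters):
        grads = [
            a[i] * x[i] + sum(b[i][j] * x[j] for j in range(n) if j != i) - c[i]
            for i in range(n)
        ]
        x = [project(x[i] - step * grads[i], lower, upper) for i in range(n)]
    return x


# Toy 3-player game with symmetric coupling (hypothetical numbers).
a = [2.0, 2.0, 2.0]
b = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]]
c = [1.0, 1.0, 1.0]
equilibrium = projected_gradient_play(a, b, c, lower=0.0, upper=1.0)
print(equilibrium)
```

In this toy game the interior equilibrium satisfies `2*x + 0.5*2*x = 1`, so every player's action approaches `1/3`; the projection never becomes active because the iterates stay inside `[0, 1]`.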


CT-DQN: Control-Tutored Deep Reinforcement Learning

5th Annual Conference on Learning for Dynamics and Control

Authors: De Lellis, Francesco; Coraggio, Marco; Russo, Giovanni; Musolesi, Mirco; di Bernardo, Mario
Affiliations: Univ Naples Federico II Naples Italy; Scuola Super Meridionale Naples Italy; Univ Salerno Salerno Italy; UCL London England; Univ Bologna Bologna Italy

One of the major challenges in Deep Reinforcement Learning for control is the need for extensive training to learn a policy. Motivated by this, we present the design of the Control-Tutored Deep Q-Networks (CT-DQN) algorithm, a Deep Reinforcement Learning algorithm that leverages a control tutor, i.e., an exogenous control law, to reduce learning time. The tutor can be designed using an approximate model of the system, without any assumption about the knowledge of the system dynamics. There is no expectation that it will be able to achieve the control objective if used standalone. During learning, the tutor occasionally suggests an action, thus partially guiding exploration. We validate our approach on three scenarios from OpenAI Gym: the inverted pendulum, lunar lander, and car racing. We demonstrate that CT-DQN is able to achieve better or equivalent data efficiency with respect to the classic function approximation solutions.

Keywords: Reinforcement learning based control; deep reinforcement learning; feedback control
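The abstract says the tutor "occasionally suggests an action, thus partially guiding exploration". One plausible way to picture this is an epsilon-greedy rule in which some exploratory steps take the tutor's suggestion instead of a uniformly random action. The sketch below is an assumption for illustration only; the abstract does not state the switching rule used in CT-DQN, and the names `select_action`, `epsilon`, and `tutor_prob` are hypothetical.

```python
import random


def select_action(q_values, tutor_action, rng, epsilon=0.1, tutor_prob=0.5):
    """Pick an action index given Q-value estimates and a tutor suggestion.

    With probability epsilon the agent explores; an exploratory step follows
    the tutor with probability tutor_prob and is uniformly random otherwise.
    All other steps are greedy with respect to q_values.
    """
    if rng.random() < epsilon:
        if rng.random() < tutor_prob:
            return tutor_action
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda action: q_values[action])


rng = random.Random(0)
greedy = select_action([0.1, 0.9, 0.3], tutor_action=2, rng=rng, epsilon=0.0)
tutored = select_action([0.1, 0.9, 0.3], 2, rng, epsilon=1.0, tutor_prob=1.0)
print(greedy, tutored)
```

Setting `tutor_prob` to zero recovers ordinary epsilon-greedy exploration, which makes it easy to compare tutored and untutored runs under the same code path.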


Full Gradient Deep Reinforcement Learning for Average-Reward Criterion

5th Annual Conference on Learning for Dynamics and Control

Authors: Pagare, Tejas; Borkar, Vivek; Avrachenkov, Konstantin
Affiliations: Indian Inst Technol Dept Elect Engn Mumbai 400076 Maharashtra India; INRIA Sophia Antipolis 2004 Route Lucioles BP93 F-06902 Valbonne France

We extend the provably convergent Full Gradient DQN algorithm for discounted reward Markov decision processes from Avrachenkov et al. (2021) to average reward problems. We experimentally compare widely used RVI Q-learning with recently proposed Differential Q-learning in the neural function approximation setting with Full Gradient DQN and DQN. We also extend this to learn Whittle indices for Markovian restless multi-armed bandits. We observe a better convergence rate of the proposed Full Gradient variant across different tasks.

Keywords: average reward Markov decision processes; Full Gradient DQN algorithm; restless bandits; Whittle index
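For readers unfamiliar with the average-reward setting, the following is a minimal tabular sketch of a Differential Q-learning update, in which the temporal-difference error subtracts a running estimate of the average reward instead of applying a discount factor. The paper works with neural function approximation and a Full Gradient variant; this tabular version, the function name, and the step sizes `alpha` and `eta` are assumptions for illustration.

```python
def differential_q_update(q, avg_reward, s, a, r, s_next, alpha=0.1, eta=0.1):
    """One tabular Differential Q-learning step (average-reward criterion).

    q is a list of per-state lists of action values and is updated in place.
    Returns the new average-reward estimate and the TD error.
    """
    td_error = r - avg_reward + max(q[s_next]) - q[s][a]
    q[s][a] += alpha * td_error
    avg_reward += eta * alpha * td_error
    return avg_reward, td_error


# Two states, two actions, all estimates initialised to zero.
q_table = [[0.0, 0.0], [0.0, 0.0]]
avg, delta = differential_q_update(
    q_table, avg_reward=0.0, s=0, a=1, r=1.0, s_next=1, alpha=0.5, eta=0.1
)
print(q_table, avg, delta)
```

The same TD error drives both the value update and the average-reward estimate, so no separate reference state is needed in this form.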


CLAS: Coordinating Multi-Robot Manipulation with Central Latent Action Spaces

5th Annual Conference on Learning for Dynamics and Control

Authors: Aljalbout, Elie; Karl, Maximilian; van der Smagt, Patrick
Affiliations: Volkswagen Grp Machine Learning Res Lab Munich Germany

Multi-robot manipulation tasks involve various control entities that can be separated into dynamically independent parts. A typical example of such real-world tasks is dual-arm manipulation. Learning to naively solve such tasks with reinforcement learning is often unfeasible due to the sample complexity and exploration requirements growing with the dimensionality of the action and state spaces. Instead, we would like to handle such environments as multi-agent systems and have several agents control parts of the whole. However, decentralizing the generation of actions requires coordination across agents through a channel limited to information central to the task. This paper proposes an approach to coordinating multi-robot manipulation through learned latent action spaces that are shared across different agents. We validate our method in simulated multi-robot manipulation tasks and demonstrate improvement over previous baselines in terms of sample efficiency and learning performance.

Keywords: Multi-robot manipulation; latent action spaces; reinforcement learning


Agile Catching with Whole-Body MPC and Blackbox Policy Learning

5th Annual Conference on Learning for Dynamics and Control

Authors: Abeyruwan, Saminda; Bewley, Alex; Boffi, Nicholas M.; Choromanski, Krzysztof; D'Ambrosio, David; Jain, Deepali; Sanketi, Pannag; Shankar, Anish; Sindhwani, Vikas; Singh, Sumeet; Slotine, Jean-Jacques; Tu, Stephen
Affiliations: Google Robot Mountain View CA 94043 USA

We address a benchmark task in agile robotics: catching objects thrown at high speed. This is a challenging task that involves tracking, intercepting, and cradling a thrown object with access only to visual observations of the object and the proprioceptive state of the robot, all within a fraction of a second. We present the relative merits of two fundamentally different solution strategies: (i) Model Predictive Control using accelerated constrained trajectory optimization, and (ii) Reinforcement Learning using zeroth-order optimization. We provide insights into various performance tradeoffs including sample efficiency, sim-to-real transfer, robustness to distribution shifts, and whole-body multimodality via extensive on-hardware experiments. We conclude with proposals on fusing "classical" and "learning-based" techniques for agile robot control. Videos of our experiments may be found here: https://***/view/agile-catching.

Keywords: Model predictive control
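The abstract mentions reinforcement learning "using zeroth-order optimization", that is, improving parameters using only objective evaluations and no analytic gradients. A generic sketch of that idea follows, using an antithetic Gaussian-perturbation gradient estimate on a toy quadratic objective. It is not the authors' training procedure; the function names, the `sphere` objective, and all hyperparameters are assumptions for illustration.

```python
import random


def zeroth_order_step(objective, theta, rng, sigma=0.1, step=0.05, samples=20):
    """One descent step using an antithetic finite-difference gradient estimate.

    Only objective values at perturbed parameters are used; no derivatives.
    """
    dim = len(theta)
    grad = [0.0] * dim
    for _ in range(samples):
        eps = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        plus = objective([t + sigma * e for t, e in zip(theta, eps)])
        minus = objective([t - sigma * e for t, e in zip(theta, eps)])
        scale = (plus - minus) / (2.0 * sigma * samples)
        for i in range(dim):
            grad[i] += scale * eps[i]
    return [t - step * g for t, g in zip(theta, grad)]


def sphere(theta):
    """Toy objective standing in for a negative episode return."""
    return sum(t * t for t in theta)


rng = random.Random(0)
theta = [1.0, -2.0]
initial_value = sphere(theta)
for _ in range(200):
    theta = zeroth_order_step(sphere, theta, rng)
print(initial_value, sphere(theta))
```

In a policy-learning setting, `objective` would be replaced by a rollout that returns the negative episode return for a given parameter vector, which is typically noisy and far more expensive to evaluate than this toy function.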


The Impact of the Geometric Properties of the Constraint Set in Safe Optimization with Bandit Feedback

5th Annual Conference on Learning for Dynamics and Control

Authors: Hutchinson, Spencer; Turan, Berkay; Alizadeh, Mahnoosh
Affiliations: Univ Calif Santa Barbara Santa Barbara CA 93106 USA

We consider a safe optimization problem with bandit feedback in which an agent sequentially chooses actions and observes responses from the environment, with the goal of maximizing an arbitrary function of the response while respecting stage-wise constraints. We propose an algorithm for this problem, and study how the geometric properties of the constraint set impact the regret of the algorithm. In order to do so, we introduce the notion of the sharpness of a particular constraint set, which characterizes the difficulty of performing learning within the constraint set in an uncertain setting. This concept of sharpness allows us to identify the class of constraint sets for which the proposed algorithm is guaranteed to enjoy sublinear regret. Simulation results for this algorithm support the sublinear regret bound and provide empirical evidence that the sharpness of the constraint set impacts the performance of the algorithm.

Keywords: Safe learning; Bandits; Optimization


Lie Group Forced Variational Integrator Networks for Learning and Control of Robot Systems

5th Annual Conference on Learning for Dynamics and Control

Authors: Duruisseaux, Valentin; Duong, Thai; Leok, Melvin; Atanasov, Nikolay
Affiliations: Univ Calif San Diego Dept Math La Jolla CA 92093 USA; Univ Calif San Diego Dept Elect & Comp Engn La Jolla CA 92093 USA

Incorporating prior knowledge of physics laws and structural properties of dynamical systems into the design of deep learning architectures has proven to be a powerful technique for improving their computational efficiency and generalization capacity. Learning accurate models of robot dynamics is critical for safe and stable control. Autonomous mobile robots, including wheeled, aerial, and underwater vehicles, can be modeled as controlled Lagrangian or Hamiltonian rigid-body systems evolving on matrix Lie groups. In this paper, we introduce a new structure-preserving deep learning architecture, the Lie group Forced Variational Integrator Network (LieFVIN), capable of learning controlled Lagrangian or Hamiltonian dynamics on Lie groups, either from position-velocity or position-only data. By design, LieFVINs preserve both the Lie group structure on which the dynamics evolve and the symplectic structure underlying the Hamiltonian or Lagrangian systems of interest. The proposed architecture learns surrogate discrete-time flow maps allowing accurate and fast prediction without numerical-integrator, neural-ODE, or adjoint techniques, which are needed for vector fields. Furthermore, the learnt discrete-time dynamics can be utilized with computationally scalable discrete-time (optimal) control strategies.

Keywords: Dynamics Learning; Variational Integrators; Symplectic Integrators; Structure-Preserving Neural Networks; Physics-Informed Machine Learning; Predictive Control; Lie Group Dynamics


Satellite Navigation and Coordination with Limited Information Sharing

5th Annual Conference on Learning for Dynamics and Control

Authors: Dolan, Sydney; Nayak, Siddharth; Balakrishnan, Hamsa
Affiliations: MIT Cambridge MA 02139 USA

We explore space traffic management as an application of collision-free navigation in multi-agent systems where vehicles have limited observation and communication ranges. We investigate the effectiveness of transferring a collision avoidance multi-agent reinforcement learning (MARL) model trained on a ground environment to a space one. We demonstrate that the transfer learning model outperforms a model that is trained directly on the space environment. Furthermore, we find that our approach works well even when we consider the perturbations to satellite dynamics caused by the Earth's oblateness. Finally, we show how our methods can be used to evaluate the benefits of information-sharing between satellite operators in order to improve coordination.

Keywords: transfer learning; multi-agent reinforcement learning; graph neural networks; space traffic management


Can Learning Deteriorate Control? Analyzing Computational Delays in Gaussian Process-Based Event-Triggered Online Learning

5th Annual Conference on Learning for Dynamics and Control

Authors: Dai, Xiaobing; Lederer, Armin; Yang, Zewen; Hirche, Sandra
Affiliations: Tech Univ Munich Chair Informat Oriented Control D-80333 Munich Germany

When the dynamics of systems are unknown, supervised machine learning techniques are commonly employed to infer models from data. Gaussian process (GP) regression is a particularly popular learning method for this purpose due to the existence of prediction error bounds. Moreover, GP models can be efficiently updated online, such that event-triggered online learning strategies can be pursued to ensure specified tracking accuracies. However, existing trigger conditions must be able to be evaluated at arbitrary times, which cannot be achieved in practice due to non-negligible computation times. Therefore, we first derive a delay-aware tracking error bound, which reveals an accuracy-delay trade-off. Based on this result, we propose a novel event trigger for GP-based online learning with computational delays, which we show to offer advantages over offline trained GP models for sufficiently small computation times. Finally, we demonstrate the effectiveness of the proposed event trigger for online learning in simulations.

Keywords: Gaussian process regression; learning-based control; computational delay; event-triggered learning; online learning
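As a rough illustration of event-triggered online learning with a Gaussian process, the sketch below adds a new training input only when the GP posterior standard deviation at that input exceeds a threshold. This is a plain uncertainty-threshold trigger, not the delay-aware trigger proposed in the paper; the unit-variance RBF kernel, the noise level, the threshold, and all function names are assumptions for illustration. Only inputs are tracked, because the posterior variance of a GP does not depend on the observed targets.

```python
import math


def rbf(x1, x2, length_scale=1.0):
    """Unit-variance squared-exponential kernel for scalar inputs."""
    return math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - tail) / aug[i][i]
    return z


def posterior_std(x_query, xs, noise_var=1e-4):
    """GP posterior standard deviation at x_query given training inputs xs."""
    if not xs:
        return 1.0  # prior standard deviation of the unit-variance kernel
    gram = [
        [rbf(p, q) + (noise_var if i == j else 0.0) for j, q in enumerate(xs)]
        for i, p in enumerate(xs)
    ]
    k_star = [rbf(x_query, p) for p in xs]
    weights = solve(gram, k_star)
    variance = rbf(x_query, x_query) - sum(k * w for k, w in zip(k_star, weights))
    return math.sqrt(max(variance, 0.0))


def event_triggered_inputs(stream, threshold):
    """Keep only the inputs at which the model was too uncertain."""
    xs = []
    for x in stream:
        if posterior_std(x, xs) > threshold:
            xs.append(x)
    return xs


selected = event_triggered_inputs([0.0, 0.0, 5.0], threshold=0.5)
print(selected)
```

The repeated input at `0.0` is skipped because the model is already confident there, while the distant input at `5.0` triggers an update. Each trigger evaluation here requires a linear solve, which hints at why computation time matters for such triggers in practice.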


Filter-Aware Model-Predictive Control

5th Annual Conference on Learning for Dynamics and Control

Authors: Kayalibay, Baris; Mirchev, Atanas; Agha, Ahmed; van der Smagt, Patrick; Bayer, Justin
Affiliations: Volkswagen Grp Machine Learning Res Lab Munich Germany

Partially-observable problems pose a trade-off between reducing costs and gathering information. They can be solved optimally by planning in belief space, but that is often prohibitively expensive. Model-predictive control (MPC) takes the alternative approach of using a state estimator to form a belief over the state, and then planning in state space. This ignores potential future observations during planning and, as a result, cannot actively increase or preserve the certainty of its own state estimate. We find a middle-ground between planning in belief space and completely ignoring its dynamics by only reasoning about its future accuracy. Our approach, filter-aware MPC, penalises the loss of information by what we call "trackability", the expected error of the state estimator. We show that model-based simulation allows condensing trackability into a neural network, which allows fast planning. In experiments involving visual navigation, realistic every-day environments and a two-link robot arm, we show that filter-aware MPC vastly improves on regular MPC.

Keywords: model-predictive control; partially-observable; dynamic programming


Address: No. 235 Daxue West Street, Saihan District, Hohhot, Inner Mongolia Autonomous Region. Postal code: 010021
