Search criteria: subject term = "Policy-gradient algorithms"
3 records
Policy Gradient Approaches for Multi-Objective Sequential Decision Making: A Comparison
IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Authors: Parisi, Simone; Pirotta, Matteo; Smacchia, Nicola; Bascetta, Luca; Restelli, Marcello
Affiliation: Politecn Milan, Dept Elect Informat & Bioengn, Piazza Leonardo da Vinci 32, I-20133 Milan, Italy
This paper investigates the use of policy gradient techniques to approximate the Pareto frontier in Multi-Objective Markov Decision Processes (MOMDPs). Despite the popularity of policy-gradient algorithms and the fact...
Fine-tuning text-to-SQL models with reinforcement-learning training objectives
Natural Language Processing Journal, 2025, Vol. 10
Authors: Xuan-Bang Nguyen; Xuan-Hieu Phan; Massimo Piccardi
Affiliations: University of Engineering and Technology, Vietnam National University, Hanoi, Viet Nam; FPT Technology Research Institute, FPT University, Hanoi, Viet Nam; Faculty of Engineering and Information Technology, University of Technology Sydney, Broadway, NSW 2007, Australia
Text-to-SQL is an important natural language processing task that helps users automatically convert natural language queries into formal SQL code. While transformer-based models have pushed text-to-SQL to unprecedente...
Cooperative Adaptive Cruise Control: A Reinforcement Learning Approach
IEEE Transactions on Intelligent Transportation Systems, 2011, Vol. 12, No. 4, pp. 1248-1260
Authors: Desjardins, Charles; Chaib-draa, Brahim
Affiliation: Univ Laval, Dept Comp Sci & Software Engn, Quebec City, PQ G1K 7P4, Canada
Recently, improvements in sensing, communicating, and computing technologies have led to the development of driver-assistance systems (DASs). Such systems aim at helping drivers by either providing a warning to reduce...