Search Results - Inner Mongolia University Library

Multi-Agent Reinforcement Learning with Information-sharing Constrained Policy Optimization for Global Cost Environment

IFAC-PapersOnLine, 2023, Vol. 56, No. 2, pp. 1558-1565

Authors: Yoshihiro Okawa, Hayato Dan, Natsuki Morita, Masatoshi Ogawa (Artificial Intelligence Laboratory, Fujitsu Limited, Kawasaki, Kanagawa 211-8588, Japan)

Multi-agent reinforcement learning (MARL) is a machine learning method that solves problems by using multiple learning agents in a data-driven manner. Because it uses multiple agents simultaneously, MARL has become an efficient solution to large-scale problems in a wide range of fields. However, as with general single-agent reinforcement learning, MARL requires trial and error during learning to acquire appropriate policies for each agent. Therefore, how to guarantee performance and constraint satisfaction in MARL is a critical issue for application to real-world problems. In this study, we propose an Information-sharing Constrained Policy Optimization (IsCPO) method for MARL that guarantees constraint satisfaction during learning. In detail, IsCPO sequentially updates the policies of multiple agents in random order while passing to the next agent the surrogate costs and KL-divergence used to evaluate the current and updated policies. In addition, if the shared information leaves no candidate policies to update to, IsCPO skips updating the policies of the remaining agents until the next iteration. As a result, IsCPO makes it possible to acquire individual suboptimal policies for the agents that satisfy constraints on global costs related to the state of the environment and the actions of multiple agents. We also introduce a practical algorithm for IsCPO that simplifies its implementation by adopting several mathematical approximations. Finally, we show the validity and effectiveness of the method through simulation results on a multiple cart-pole problem and a base station sleep control problem in a mobile network.

Keywords: Multi-agent reinforcement learning; learning algorithm; Constrained MDP; Multi-agent system; Base station sleep control; Mobile network

Machine learning facilitated business intelligence (Part I) Neural networks learning algorithms and applications

INDUSTRIAL MANAGEMENT & DATA SYSTEMS, 2019, Vol. 120, No. 1, pp. 164-195

Authors: Khan, Waqar Ahmed; Chung, S. H.; Awan, Muhammad Usman; Wen, Xin (Hong Kong Polytech Univ, Dept Ind & Syst Engn, Kowloon, Hong Kong, Peoples R China; Univ Punjab, Inst Qual & Technol Management, Lahore, Pakistan)

Purpose: The purpose of this paper is to conduct a comprehensive review of the noteworthy contributions made in the area of the feedforward neural network (FNN) to improve its generalization performance and convergence rate (learning speed); to identify new research directions that will help researchers to design new, simple and efficient algorithms and users to implement optimally designed FNNs for solving complex problems; and to explore the wide applications of the reviewed FNN algorithms in solving real-world management, engineering and health sciences problems and demonstrate the advantages of these algorithms in enhancing decision making for practical operations. Design/methodology/approach: The FNN has gained much popularity during the last three decades. Therefore, the authors have focused on algorithms proposed during the last three decades. The selected databases were searched with popular keywords: "generalization performance," "learning rate," "overfitting" and "fixed and cascade architecture." Combinations of the keywords were also used to get more relevant results. Duplicate articles, non-English articles, and articles that matched the keywords but were out of scope were discarded. Findings: The authors studied a total of 80 articles and classified them into six categories according to the nature of the algorithms proposed in these articles, which aimed at improving the generalization performance and convergence rate of FNNs. Reviewing and discussing all six categories would make the paper too long. Therefore, the authors further divided the six categories into two parts (i.e. Part I and Part II). The current paper, Part I, investigates two categories that focus on learning algorithms (i.e. gradient learning algorithms for network training and gradient-free learning algorithms). The remaining four categories, which mainly explore optimization techniques, are reviewed in Part II (i.e. optimization algorithms for learning rate, bias and varian

Keywords: Data analytics; Machine learning; learning algorithm; Feedforward neural network; Industrial management

On the Development and Performance Evaluation of Improved Radial Basis Function Neural Networks

IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2022, Vol. 52, No. 6, pp. 3873-3884

Authors: Panda, Sashmita; Panda, Ganapati (Indian Inst Technol Kharagpur, Dept GS Sanyal Sch Telecommun, Kharagpur 721302, W Bengal, India; CV Raman Global Univ, Dept Elect & Telecommun Engn, Bhubaneswar 752054, India)

This article deals with the development of four modified radial basis function neural network (RBFNN) models. The corresponding learning algorithms associated with the updating of internal parameters of the models are derived. Conventional inputs are used in the first and second modified RBFNN models (models 3 and 4), whereas exponential nonlinear inputs are used in the fifth and sixth RBFNN models to provide additional nonlinearity for achieving a better solution of nonlinear classification and direct and inverse modeling problems. To assess and compare the performance potential of the four proposed RBFNN models, one classification problem, one direct modeling problem, and one inverse modeling problem are solved through computer simulation-based experiments. For comparison and to assign a performance rank to each of the four modified RBFNN models, two conventional and commonly used RBFNN models (models 1 and 2) are also simulated. To assess the performance of the different models during the training phase of Examples 1 and 2, the root mean-square error (RMSE) value, the mean absolute deviation (MAD), and the number of iterations required to achieve convergence are obtained. For the third example, only the first two performance measures are found. During the testing or validation phase, the output responses of the different models of Example 2 are compared with the desired response. For Example 3, the bit-error rate (BER) plots are compared. The results demonstrate consistent ranks for all models across all three examples. In general, the ranks of models 1-6 are found to be 6, 4, 3, 2, 5, and 1, respectively. In essence, in terms of all performance measures, model M-6, with an exponential version of inputs and weights on both layers, occupies the first position, whereas model M-4, with conventional inputs, occupies the second position.

Keywords: Training; Computational modeling; Mathematical model; Biological system modeling; Inverse problems; Adaptation models; Radial basis function networks; Bit-error rate (BER); classification; direct modeling; inverse modeling; learning algorithm; mean absolute deviation (MAD); radial basis function neural network (RBFNN); root mean-square error (RMSE)
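
The two error measures named in the RBFNN abstract above can be illustrated with a minimal sketch. It assumes RMSE is the square root of the mean squared error and takes MAD to be the mean absolute value of the error between desired and model outputs; this record does not give the paper's exact definitions, and the function names `rmse` and `mad` are hypothetical.

```python
import math


def rmse(desired, predicted):
    """Root mean-square error between two equal-length sequences."""
    errors = [d - p for d, p in zip(desired, predicted)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mad(desired, predicted):
    """Mean absolute deviation, assumed here to be the mean of |error|."""
    errors = [abs(d - p) for d, p in zip(desired, predicted)]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    target = [0.0, 0.0]
    output = [3.0, 4.0]
    print(rmse(target, output), mad(target, output))
```

Both measures are zero for a perfect fit; RMSE penalizes large individual errors more heavily than MAD does.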

MPPT Control in an Offshore Wind Turbine Optimized with Genetic Algorithms and Unsupervised Neural Networks

19th International Conference on Artificial Intelligence Applications and Innovations (AIAI)

Authors: Munoz-Palomeque, Eduardo; Enrique Sierra-Garcia, Jesus; Santos, Matilde (Univ Burgos, Electromech Engn Dept, Burgos, Spain; Univ Complutense Madrid, Inst Knowledge Technol, Madrid, Spain)

ISBN: (print) 9783031341090; 9783031341076; 9783031341069

In this work, the control of a 1.5 MW offshore wind turbine (WT) for maximum power point tracking (MPPT) when the wind speed is below rated is studied. The implemented controller is designed using the general Direct Speed Control (DSC) scheme, in which artificial neural networks (ANN) are incorporated to close the control loop. The neural controller acts in an unsupervised mode, updating its weights by means of a learning algorithm. The optimal configuration parameters of the controller are determined by genetic algorithms. With this intelligent control strategy, the generator speed is regulated by varying the electromagnetic torque while adapting to external phenomena in real time. Then, the output power, through the power coefficient (Cp), reaches the maximum wind power generation in that region. The offshore WT model is subjected to external loads due to wind and waves, which increase the system complexity and produce tower vibrations, negatively impacting the control efficiency. Despite that, it is shown that the proposed controller operates with satisfactory results in terms of power generation and even reduces vibration. Compared with the OpenFAST embedded torque control for the same WT, it provides better results.

Keywords: DSC; Neural Networks; learning algorithm; MPPT; Offshore Wind Turbine; Genetic algorithms

Multi-agent reinforcement learning vibration control and trajectory planning of a double flexible beam coupling system

MECHANICAL SYSTEMS AND SIGNAL PROCESSING, 2023, Vol. 200, No. 1

Authors: Qiu, Zhi-cheng; Hu, Jun-fei; Zhang, Xian-min (South China Univ Technol, Sch Mech & Automot Engn, Guangzhou 510641, Peoples R China)

A multi-agent reinforcement learning vibration controller is designed for active vibration suppression of a movable double piezoelectric flexible beam coupling system, and the motion trajectory is optimized to minimize vibration excitation during motion and residual vibration after motion. The finite element method is used to model the system dynamics, and the actual model parameters are then identified by combining wavelet and intelligent optimization algorithms. The corrected piezoelectric driving model is used to train the counterfactual multi-agent reinforcement learning (COMARL) algorithm, and an excellent nonlinear controller for vibration control of piezoelectric actuators is obtained. The motion trajectory of the double flexible beam coupling system is designed by using the corrected motor-driven model. The optimal vibration suppression trajectory is obtained by using a tabu search algorithm. The simulation and experimental results show that the optimized trajectory greatly reduces the vibration excitation. The controller trained by the COMARL algorithm fully considers the influence of either beam in the system, and infers the contribution of the piezoelectric actuators to the completion of the overall task through counterfactual thinking. The control effect is better than that of PD control, especially for small-amplitude vibration suppression. The effectiveness of the COMARL controller is further verified by simultaneous piezoelectric control during trajectory motion. Vibrations during translational motion and at the end of motion are suppressed quickly.

Keywords: Double piezoelectric flexible beam coupling system; Model identification; Trajectory Planning; Vibration control; Counterfactual Multi-Agent reinforcement learning algorithm

Development and evaluation of adaptive metacognitive scaffolding for algorithm-learning system

IET SOFTWARE, 2019, Vol. 13, No. 4, pp. 305-312

Authors: Hidayah, Indriana; Adji, Teguh Bharata; Setiawan, Noor Akhmad (Univ Gadjah Mada, Fac Engn, Dept Elect Engn, Jl Grafika 2, Yogyakarta, Indonesia)

Adaptive metacognitive scaffolding is developed to provide learning assistance on an as-needed basis and thus advances the effectiveness of computer-based learning systems. Metacognitive scaffoldings have been developed for some science subjects, but not for algorithm-learning. Algorithm-learning differs from science learning in that it is more oriented to problem-solving; therefore, this study aims to describe the modelling, development, and evaluation of adaptive metacognitive scaffolding dedicated to encouraging algorithm-learning. In addition, the authors present a new approach to learner modelling to find students' metacognitive state. Adaptivity of the scaffolding is based on the learner modelling. To evaluate the effectiveness of the developed system, it is deployed in a real algorithm-learning classroom of 38 students. The class is randomly divided into two groups: experiment and control. Two parameters are measured from both groups, i.e. academic success and academic satisfaction. A non-parametric statistical test, i.e. the Mann-Whitney U-test (significance level 0.01), rejects the null hypothesis (U-value = 86.5 and U-critical = 101). This result verifies that the academic success of the experiment group is significantly higher than that of the control group. In addition, an academic satisfaction survey shows that adaptive scaffolding is valid in assisting students while learning with the system.

Keywords: computer aided instruction; statistical testing; adaptive metacognitive scaffolding; learning assistance; learning algorithm; learner modelling; algorithm-learning classroom; algorithm-learning system; computer-based learning systems; academic success; academic satisfaction; nonparametric statistical test; Mann-Whitney U-test
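
The decision reported in the abstract above (U-value = 86.5 against U-critical = 101) follows the usual table-based rule for the Mann-Whitney U-test: the null hypothesis is rejected when the observed U does not exceed the critical value. A minimal sketch of that rule is given below; the function names are hypothetical, the sample scores are made up for illustration, and only the two U values come from the abstract.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return the smaller of the two U statistics (ties count as 0.5)."""
    u_a = 0.0
    for x in sample_a:
        for y in sample_b:
            if x > y:
                u_a += 1.0
            elif x == y:
                u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return min(u_a, u_b)


def rejects_null(u_value, u_critical):
    """Table-based decision rule: reject when U <= U-critical."""
    return u_value <= u_critical


if __name__ == "__main__":
    # Values reported in the abstract.
    print(rejects_null(86.5, 101))
    # Made-up scores, only to show how U is computed from raw data.
    print(mann_whitney_u([1, 4], [2, 3]))
```

The pairwise count is quadratic in the sample sizes, which is fine for a class of 38 students; for larger data a rank-based computation would be the usual choice.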

A new enhanced learning approach to automatic image classification based on Salp Swarm Algorithm

COMPUTER SYSTEMS SCIENCE AND ENGINEERING, 2019, Vol. 34, No. 2, pp. 91-100

Authors: Nejad, Mohammad Behrouzian; Shiri, Mohammad Ebrahim (Islamic Azad Univ, Borujerd Branch, Dept Comp Engn, Borujerd, Iran; Amirkabir Univ, Dept Math & Comp Sci, Tehran, Iran)

In this paper we propose a new image classification technique. Noting that most research focuses on feature extraction in the frequency domain, location, and reduction of feature dimensions, we focus instead on the learning step in image classification. The main aim is to use heuristic methods to improve the estimator of the learning algorithm until it reaches the desired state, and to have the resulting model perform categorization automatically, without user intervention. To this end, a new learning approach based on the Salp Swarm Algorithm (SSA) is proposed, implemented, and evaluated with the Decision Tree, K-Nearest Neighbors, and Naive Bayes learning algorithms. The results demonstrate that using SSA improves the performance of the learning algorithms on all criteria in comparison with the traditional learning algorithms. For accuracy, sensitivity, classification error, and the F1 criterion, the best performance of the proposed model is obtained with the Decision Tree learning method, with values of 99.17%, 100%, 0.83%, and 95.65%, respectively. For the specificity and precision criteria, the best performance of the proposed model is obtained with the K-Nearest Neighbors learning method, with values of 100%.

Keywords: Image Mining; Image Classification; learning algorithm; Salp Swarm algorithm
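
The evaluation criteria listed in the abstract above (accuracy, sensitivity, specificity, precision, classification error, F1) are all derived from a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name and the example counts are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (each class occurs and
    at least one positive prediction is made).
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return {
        "accuracy": accuracy,
        "error": 1.0 - accuracy,   # classification error
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


if __name__ == "__main__":
    # Made-up counts for illustration only.
    for name, value in classification_metrics(tp=8, fp=1, tn=9, fn=2).items():
        print(f"{name}: {value:.4f}")
```

Accuracy and classification error always sum to 1 under these definitions, which matches the 99.17% and 0.83% pair reported in the abstract.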

Learning species-definite features from digital microscopic leather images

EXPERT SYSTEMS WITH APPLICATIONS, 2023, Vol. 224, No. 1

Authors: Varghese, Anjli; Jawahar, Malathy; Prince, A. Amalin (Birla Inst Technol & Sci, KK Birla Goa Campus, Zuarinagar 403726, Goa, India; Cent Leather Res Inst, Chennai 600020, India)

In the leather industry, identifying species of leather holds a significant step toward consistent global leather trade. Intertwining image processing and a learning algorithm with leather science can enhance the predictability of leather species. Hence, this paper aims to learn the pore-pattern variability between each species from digital microscopic leather images. These images undergo image pre-processing to generate leather images with highlighted pores, less susceptible to noise. This work also proposes an Entropy-based Otsu's thresholding with Component-area-histogram Analysis (EOCA) to achieve an adequate hair-pore segmentation, irrespective of any species. Goodness and discrepancy measures validate the generosity of the proposed EOCA method. Morphological, geometrical, and statistical features estimate the pattern-variability of each species. To ascertain the discriminatory behavior of these features, this work performs the feature classification using KNN, NB, DT, SVM, and MLP classifiers. The classification accuracy signifies the efficiency of the pre-processing and the proposed EOCA method in estimating species-definite features. The performance comparison determines MLP with 98.75% accuracy as an appropriate leather species learning model. Thus, the present work contributes to automatic leather species identification by learning and interpreting species-definite features. It also lays the design and development of a human-machine interactive platform to revolutionize the leather trade.

Keywords: Entropy-based Otsu's thresholding with Component-area-histogram Analysis (EOCA); Finished leather; Image processing; learning algorithm; Leather species prediction; MGS features
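
The EOCA method in the abstract above builds on Otsu's thresholding. As background only, the sketch below shows plain Otsu's method on a gray-level histogram, which picks the threshold that maximizes the between-class variance. It is not the entropy-based variant or the component-area-histogram analysis proposed in the paper, and the function name and example histogram are hypothetical.

```python
def otsu_threshold(hist):
    """Return the gray level t that maximizes between-class variance.

    hist[i] is the number of pixels with gray level i. Pixels with
    level <= t form one class and the remaining pixels the other.
    """
    total = sum(hist)
    weighted_total = sum(level * count for level, count in enumerate(hist))
    count_low = 0      # pixels at or below the candidate threshold
    sum_low = 0.0      # sum of gray levels of those pixels
    best_t, best_var = 0, -1.0
    for t, count in enumerate(hist):
        count_low += count
        sum_low += t * count
        count_high = total - count_low
        if count_low == 0 or count_high == 0:
            continue   # one class is empty, so there is nothing to compare
        mean_low = sum_low / count_low
        mean_high = (weighted_total - sum_low) / count_high
        # Proportional to the between-class variance (constant factor dropped).
        between_var = count_low * count_high * (mean_low - mean_high) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


if __name__ == "__main__":
    # Made-up bimodal histogram: dark pixels at level 10, bright at 200.
    hist = [0] * 256
    hist[10] = 100
    hist[200] = 100
    print(otsu_threshold(hist))
```

Any threshold between the two modes separates them equally well here; this sketch returns the first such level it encounters.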

On Solving Fractional Higher-Order Equations via Artificial Neural Networks

IRANIAN JOURNAL OF SCIENCE AND TECHNOLOGY TRANSACTION A-SCIENCE, 2022, Vol. 46, No. 2, pp. 535-545

Authors: Jafarian, Ahmad; Rezaei, Rezvan; Golmankhaneh, Alireza Khalili (Islamic Azad Univ, Dept Math, Urmia Branch, Orumiyeh, Iran; Islamic Azad Univ, Dept Phys, Urmia Branch, Orumiyeh, Iran)

Recently, different structures of artificial neural networks (ANNs) have been proposed for the modeling and simulation of many real-world complex phenomena. The current research is devoted to the numerical study of an ordinary linear fractional-order integro-differential equation of Volterra type. By substituting the unknown function with a suitable three-layered feed-forward neural architecture, this initial value fractional problem is converted approximately to a system of nonlinear minimization equations. Due to the complexity of the resulting problem, the back-propagation algorithm is employed, making small adjustments in the learning process. In other words, an iterative optimization algorithm based on the gradient descent method is constructed to approximate the solution of the original fractional problem. Moreover, some examples consisting of computer simulations are provided to demonstrate the accuracy and ability of the indicated iterative technique. The obtained numerical results show the efficiency and capability of the ANN approach in comparison with traditional methods.

Keywords: Higher-order linear integro-differential equation; Artificial neural networks approach; Caputo fractional derivative; learning algorithm; Cost function

Harnessing the Power of the GPT Model to Generate Adversarial Examples

2023 Congress in Computer Science, Computer Engineering, and Applied Computing, CSCE 2023

Authors: Jones, Rebet; Omar, Marwan; Mohammed, Derek (College of Computing, Capital Technology University, MD, United States; College of Computing, Illinois Institute of Technology, Chicago, United States; Saint Leo University, Cs Department, Saint Leo, United States)

ISBN: (print) 9798350327595

In this paper, we propose a method for generating adversarial examples in the text domain using GPT-2, a state-of-the-art language model. Our method employs an iterative algorithm to produce perturbations to input text samples, creating adversarial examples capable of fooling sentiment analysis models. We evaluate our approach on three widely-used benchmark datasets for sentiment analysis: Yelp, MR, and IMDB. Our results show that our approach can generate highly effective adversarial examples that significantly degrade the performance of sentiment analysis models. Specifically, we achieved a decrease in accuracy of up to 67.3% on the Yelp dataset, 68.1% on the MR dataset, and 52.5% on the IMDB dataset. We also discuss the limitations of our approach and the open challenges in this field. Overall, our study demonstrates the potential of GPT-2 for generating effective adversarial examples in natural language processing tasks. © 2023 IEEE.

Keywords: Accuracy; Adversarial examples; GPT; IMDB; learning algorithm; MR; Yelp
