Multiple instance learning (MIL) is considered a generalization of traditional supervised learning that deals with uncertainty in the information. Combined with the fact that, as in any other learning framework, classifier performance evaluation involves a trade-off between conflicting objectives, this makes the classification task less straightforward. This paper introduces a multi-objective proposal that works in a MIL scenario to obtain well-distributed Pareto solutions to multi-instance problems. The algorithm developed, Multi-Objective Grammar Guided Genetic Programming for Multiple Instances (MOG3P-MI), is based on grammar-guided genetic programming, which is a robust tool for classification. The proposal thus combines the advantages of grammar-guided genetic programming with the benefits of multi-objective approaches. First, a study of multi-objective optimization for MIL is carried out: three different extensions of MOG3P-MI are designed and implemented, and their performance is compared. This study allows us, on the one hand, to check the performance of multi-objective techniques in this learning paradigm and, on the other hand, to determine the most appropriate evolutionary process for MOG3P-MI. Then, MOG3P-MI is compared with some of the most significant proposals developed over the years in MIL. Computational experiments show that MOG3P-MI often obtains better results than the other algorithms, achieving the most accurate models. Moreover, the classifiers obtained are very comprehensible.
Association rule mining, an important data mining technique, has focused largely on the extraction of frequent patterns. Nevertheless, in some application domains it is interesting to discover patterns that do not occur frequently, even when they are strongly related. More specifically, this type of relation can be very appropriate in e-learning domains because of their intrinsically imbalanced nature. In these domains, the aim is to discover a small but interesting and useful set of rules that could hardly be extracted by traditional algorithms based on exhaustive search techniques. In this paper, we propose an evolutionary algorithm for mining rare class association rules from student usage data gathered from a Moodle system. We analyse how different parameters of the algorithm determine the rule characteristics, and provide some illustrative examples of the rules to show their interpretability and usefulness in e-learning environments. We also compare our approach to other existing algorithms for mining both rare and frequent association rules. Finally, an analysis of the rules mined is presented, which reveals information about students' unusual behaviour in relation to achieving bad or good marks.
The extraction of useful information for decision making is a challenge in many different domains. Association rule mining is one of the most important techniques in this field, discovering relationships of interest among patterns. Although the mining of association rules is an area of great interest for many researchers, the search for well-grouped continuous values remains a challenge: the aim is to discover rules that do not comprise patterns representing unnecessary ranges of values. Existing algorithms for mining association rules in continuous domains are mainly based on a non-deterministic search, requiring a high number of parameters to be optimised. These parameters hinder the mining process, and the algorithms themselves must be well understood by the data mining experts who want to use them. We therefore present a grammar-guided genetic programming algorithm that does not require as many parameters as other existing approaches and enables the discovery of quantitative association rules comprising small-size gaps. The algorithm is verified over a varied set of data, comparing the results to other association rule mining algorithms from several paradigms. Additionally, some resulting rules from different paradigms are analysed, demonstrating the effectiveness of our model for reducing gaps in numerical features.
ISBN: (print) 9783642130328
An innovative technique based on multi-objective grammar-guided genetic programming (MOG3P-MI) is proposed to detect the most relevant activities that a student needs to complete to pass a course, based on features extracted from logged data in a web-based education system. A more flexible representation of the available information, based on multiple instance learning, is used to prevent the appearance of a great number of missing values. Experimental results against the most relevant multiple instance learning proposals of recent years demonstrate that MOG3P-MI successfully improves accuracy by finding a balance between specificity and sensitivity values. Moreover, our proposal provides simple and clear classification rules, which are markedly useful for identifying the number, type and time of activities that a student should do within the web system to pass a course.
ISBN: (digital) 9783031020568
ISBN: (print) 9783031020568; 9783031020551
Human code is different from code generated by program search. We investigate whether properties of human-generated code can guide program search to improve the qualities of the generated programs, e.g., readability and performance. Here we focus on program search with grammatical evolution, which produces code that is structured differently from human-generated code, e.g., loops and conditions are hardly used. We use a large code corpus mined from the open software repository service GitHub and measure software metrics and properties describing the code base. We use this knowledge to guide the search by incorporating a new selection scheme that favors programs that are structurally similar to the programs in the GitHub code base. We find noticeable evidence that software metrics can help in guiding evolutionary search.
ISBN: (digital) 9783031020568
ISBN: (print) 9783031020568; 9783031020551
Cyber-Physical Systems (CPS) are prevalent in critical infrastructures and a prime target for cyber-attacks. Multivariate time series data generated by sensors and actuators of a CPS can be monitored for detecting cyber-attacks that introduce anomalies in those data. We use Signal Temporal Logic (STL) formulas to tightly describe the normal behavior of a CPS, identifying data instances that do not satisfy the formulas as anomalies. We learn an ensemble of STL formulas based on observed data, without any specific knowledge of the CPS being monitored. We propose an algorithm based on grammar-guided genetic programming (G3P) that learns the ensemble automatically in a single evolutionary run. We test the effectiveness of our data-driven proposal on two real-world datasets, finding that the proposed one-shot algorithm provides good detection performance.
This thesis addresses the problem of scheduling and managing work in logistics warehouses. Work management and scheduling is currently a frequently tackled problem for which, owing to its complexity, no simple and unambiguous solution exists. The problem needs to be solved because work efficiency is insufficient under higher warehouse load, as occurs, for example, during the Christmas season. The thesis describes methods used to solve this problem, focusing mainly on search algorithms and evolutionary algorithms, specifically grammar-guided genetic programming. The work management and scheduling problem is analysed on a simple theoretical example. The algorithm implemented to solve this problem was subjected to tests inspired by data from a real warehouse, as well as to synthetically created tests with a higher number of tasks and a higher number of workers. The synthetic tests were generated randomly. All tests were therefore run multiple times and the results averaged. The thesis concludes with the algorithm's success results, together with optimal parameter settings for various problem sizes and solution requirements. The algorithm used was extended with a fitness computation for each individual that takes into account the number of collisions when carrying out the individual tasks, and with the application of priority rules during the run of the genetic algorithm; some parts of the algorithm were also parallelized.