ISBN: (print) 9783642412783; 9783642412776
A learning-based exploration approach is proposed to escape from the basins of attraction of converged-to optima by selecting on what is termed the interestingness of a solution. This interestingness is based on the modeling error made by a surrogate model that is trained on all solutions encountered earlier during the search. Compared to multiple standard optimization runs, a learning-guided restart scheme that alternates between a quality optimization phase and an exploration phase directed by interestingness finds solutions that are more diverse and of higher quality.
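The interestingness criterion can be illustrated with a minimal sketch. The abstract does not specify the surrogate model, so the nearest-neighbour surrogate, the one-dimensional solutions, the toy objective, and the names `predict` and `interestingness` are all assumptions chosen only for illustration.

```python
def predict(archive, x):
    """Hypothetical surrogate: fitness of the nearest archived solution."""
    nearest_x, nearest_f = min(archive, key=lambda entry: abs(entry[0] - x))
    return nearest_f


def interestingness(archive, x, true_fitness):
    """Modeling error of the surrogate at x; larger means less well modeled."""
    return abs(true_fitness - predict(archive, x))


def fitness(x):
    # Toy objective used only for this example (an assumption).
    return x * x


# Archive of (solution, fitness) pairs encountered earlier in the search.
archive = [(0.0, fitness(0.0)), (1.0, fitness(1.0)), (4.0, fitness(4.0))]

# Score candidates by surrogate error and pick the one the model explains worst.
candidates = [0.9, 2.4]
scores = {x: interestingness(archive, x, fitness(x)) for x in candidates}
most_interesting = max(candidates, key=scores.get)
```

In an exploration phase of this kind, the candidate with the largest surrogate error would be preferred, since it lies in a region the model has not yet learned.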
Due to the increasing number of large data sets, efficient learning algorithms are necessary. Interpretability of the final model is also desirable, so that conclusions can be drawn from the model results. Prototype-based learning algorithms have recently been extended to proximity learners to analyze data given in non-standard data formats. The supervised methods of this type are of special interest but suffer from a large number of optimization parameters to model the prototypes. In this contribution we derive an efficient core-set-based preprocessing to restrict the number of model parameters to O(n/ε²), with n as the number of prototypes. Accordingly, the number of model parameters becomes independent of the size of the data set but scales with the requested precision ε of the core sets. Experimental results show that our approach does not significantly degrade the performance while significantly reducing the memory complexity.
A big data benchmark suite is urgently needed by customers, industry and academia. A number of prominent works from the last several years are reviewed, their characteristics are introduced and their shortcomings are analyzed. The authors also provide some suggestions on building the expected benchmark: component-based benchmarks and end-to-end benchmarks should be used together to test distinct tools and to test the system as a whole; workloads should be enriched with complex analytics to encompass different application scenarios; and metrics other than performance metrics should also be considered.
The Vehicle Routing Problem with Time Windows is an important task in logistics planning. The expenditure on employing labor, i.e., drivers for vehicles, accounts for most of the costs in this domain. We propose an initialized Ant Colony approach, IACO-VRPTW, with the primary goal (f_1) of reducing the number of vehicles needed to serve the customers and the second-priority goal (f_2) of decreasing the travel distance. Compared with methods that optimize f_2, IACO-VRPTW can reach or reduce f_1 in 8 out of 18 instances of the Solomon benchmark set, at the cost of slightly increasing the travel distance. IACO-VRPTW can effectively decrease the number of vehicles, travel distance and runtime compared with an ACO without initialization.
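The two-level priority of the objectives (vehicle count first, travel distance second) can be sketched as a lexicographic comparison. This is only an illustration of the ordering, not the IACO-VRPTW algorithm itself: the function names, the toy distance matrix, and the omission of time-window feasibility checks are assumptions.

```python
def route_length(route, dist, depot=0):
    """Length of one route that starts and ends at the depot."""
    stops = [depot] + list(route) + [depot]
    return sum(dist[a][b] for a, b in zip(stops, stops[1:]))


def objective(routes, dist):
    """Return (f1, f2): vehicles used, then total travel distance."""
    used = [route for route in routes if route]
    f1 = len(used)
    f2 = sum(route_length(route, dist) for route in used)
    return (f1, f2)


# Toy symmetric distance matrix (an assumption); index 0 is the depot.
dist = [
    [0, 2, 3],
    [2, 0, 1],
    [3, 1, 0],
]

two_vehicles = [[1], [2]]   # each customer served by its own vehicle
one_vehicle = [[1, 2]]      # both customers served by a single vehicle

# Python tuples compare lexicographically, so f1 dominates and f2 breaks ties.
best = min([two_vehicles, one_vehicle], key=lambda routes: objective(routes, dist))
```

Under this ordering a solution with fewer vehicles is preferred even when its travel distance is larger, which matches the trade-off reported in the abstract.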
It is difficult for the Extreme Learning Machine (ELM) to estimate the number of hidden nodes needed to match the learning data. In this paper, a novel pruning algorithm based on sensitivity analysis is proposed for ELM. A measure to estimate the necessary number of hidden-layer nodes is presented according to the defined sensitivity. When the measure is below the given threshold, the nodes with smaller sensitivities are removed from the existing network all together. Experimental results show that the proposed method can produce more compact neural networks than some other existing similar algorithms.
State-of-the-art classification algorithms suffer when the data is skewed towards one class. This led to the development of a number of techniques to cope with unbalanced data. However, as confirmed by our experimental comparison, no technique appears to work consistently better in all conditions. We propose to use a racing method to adaptively select the most appropriate strategy for a given unbalanced task. The results show that racing is able to adapt the choice of strategy to the specific nature of the unbalanced problem and to rapidly select the most appropriate strategy without compromising accuracy.
Data mining techniques usually require a flat data table as input. For categorical attributes, there is often no canonical flat data table, since they can often be considered at different levels of granularity (like continent, country or local region). Choosing the best level of granularity for a data mining task can be very tedious, especially when a larger number of attributes with different levels of granularity is involved. In this paper we propose two approaches to automatically select the granularity levels in the context of a naive Bayes classifier. The two approaches are based on the χ² independence test, including correction for multiple testing, and on the minimum description length principle.
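As a rough illustration of the first ingredient, the Pearson χ² statistic for a contingency table of attribute level versus class can be computed as follows. The abstract does not say which multiple-testing correction is used, so the Bonferroni adjustment here is an assumption, as are the function names and the example table. The sketch also assumes every row and column contains at least one observation.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


def bonferroni_alpha(alpha, num_tests):
    """Per-test significance level under a Bonferroni correction (assumed here)."""
    return alpha / num_tests


# Rows: two values of a categorical attribute; columns: two classes (toy counts).
stat, dof = chi_square_statistic([[10, 20], [20, 10]])
per_test_alpha = bonferroni_alpha(0.05, 5)
```

The statistic would then be compared against the χ² distribution with `dof` degrees of freedom at the corrected level to decide whether the attribute at that granularity is dependent on the class.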
Data mining applied to social media is gaining popularity. It is worth noting that not only services oriented toward socializing but also most e-commerce services lead to the formation of small communities. The results of their analysis are easier to implement. Besides, we can expect a better perception of the business by its own users; therefore the analysis of their behavior is justified. In this paper we introduce an algorithm that identifies particular customers among users of a given e-commerce service who are not logged in or not registered. The identification of a customer is based on data that was provided in order to complete the purchase. Customers rarely use exactly the same identification data each time. As a consequence, it is possible to check whether customers form a group of unrelated individuals or whether there are signs of social behavior.
The Linear Ordering Problem (LOP) is a combinatorial optimization problem which has been frequently addressed in the literature due to its numerous applications in diverse fields. In spite of its popularity, little is known about its complexity. In this paper we analyze the linear ordering problem, trying to identify features or characteristics of the instances that can provide useful insights into the difficulty of solving them. In particular, we introduce two different metrics, insert ratio and ubiquity ratio, that measure the difficulty of solving the LOP with local search algorithms using the insert neighborhood system. The experiments conducted demonstrate that the proposed metrics clearly correlate with the complexity of solving the LOP with a multistart local search algorithm.
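For readers unfamiliar with the setting, the LOP objective and a local search over the insert neighborhood can be sketched as below. The sketch uses the usual formulation (maximize the sum of entries above the diagonal of the permuted matrix) and a first-improvement strategy; the function names, the toy matrix, and the improvement strategy are assumptions, and the insert ratio and ubiquity ratio metrics are not reproduced because the abstract does not define them.

```python
def lop_value(matrix, perm):
    """Sum of matrix entries above the diagonal after reordering by perm."""
    n = len(perm)
    return sum(matrix[perm[i]][perm[j]] for i in range(n) for j in range(i + 1, n))


def insert_move(perm, src, dst):
    """Remove the item at position src and reinsert it at position dst."""
    moved = list(perm)
    item = moved.pop(src)
    moved.insert(dst, item)
    return moved


def local_search_insert(matrix, perm):
    """First-improvement local search; stops at an insert-neighborhood local optimum."""
    current = list(perm)
    current_value = lop_value(matrix, current)
    improved = True
    while improved:
        improved = False
        for src in range(len(current)):
            for dst in range(len(current)):
                if src == dst:
                    continue
                candidate = insert_move(current, src, dst)
                value = lop_value(matrix, candidate)
                if value > current_value:
                    current, current_value = candidate, value
                    improved = True
                    break
            if improved:
                break
    return current, current_value


# Toy 3x3 instance (an assumption, not taken from the paper).
matrix = [
    [0, 1, 5],
    [4, 0, 2],
    [3, 6, 0],
]
result = local_search_insert(matrix, [0, 1, 2])
```

A multistart variant would simply repeat `local_search_insert` from several random permutations and keep the best local optimum found.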
Authors: Cai, Bingqi; Liu, Jing (Xidian Univ, Minist Educ Key Lab Intelligent Percept & Image Understanding, Xian 710071, Peoples R China)
Four representations for resource-constrained project scheduling problems (RCPSPs) are studied using fitness landscape analysis. The fitness distance correlation (FDC) measure is used to analyze the landscapes. In the experiments, a study on the J30 benchmark problems is first presented to investigate which distance metric is more suitable for calculating FDC for RCPSPs. Then, the benchmark problems Patterson, J30, and J60 are used to evaluate the effect of the four encodings on the performance of evolutionary algorithms. Finally, a standard genetic algorithm is applied to verify the predictions made by the FDC. To the best of our knowledge, this is the first work to use FDC to study different encodings for RCPSPs.
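In its commonly used form, FDC is the Pearson correlation between the fitness of sampled solutions and their distance to the nearest known optimum. A minimal sketch follows; the function name and the toy sample are assumptions, and the choice of distance metric (the question the paper investigates) is left to the caller.

```python
import math


def fitness_distance_correlation(fitnesses, distances):
    """Pearson correlation between fitness values and distances to the optimum."""
    n = len(fitnesses)
    if n != len(distances) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    mean_f = sum(fitnesses) / n
    mean_d = sum(distances) / n
    cov = sum((f - mean_f) * (d - mean_d) for f, d in zip(fitnesses, distances)) / n
    std_f = math.sqrt(sum((f - mean_f) ** 2 for f in fitnesses) / n)
    std_d = math.sqrt(sum((d - mean_d) ** 2 for d in distances) / n)
    if std_f == 0 or std_d == 0:
        raise ValueError("FDC is undefined when either sample is constant")
    return cov / (std_f * std_d)


# Toy sample for a minimization problem such as makespan: cost grows with distance.
sample_fitness = [1.0, 2.0, 3.0, 4.0]
sample_distance = [2.0, 4.0, 6.0, 8.0]
fdc = fitness_distance_correlation(sample_fitness, sample_distance)
```

For a minimization problem, a value close to +1 suggests that cost decreases steadily as solutions approach the optimum, which is usually read as a landscape that is easier for evolutionary search.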