ISBN: 9783540921363 (print)
When the learning task is to build an accurate classifier, C4.5 and naive Bayes (NB) are two important algorithms because of their simplicity and high performance. In this paper, we present a combined classification algorithm based on C4.5 and NB, called C4.5-NB. In C4.5-NB, the class probability estimates of C4.5 and NB are weighted according to their classification accuracy on the training data. We tested C4.5-NB experimentally in the Weka system on all 36 UCI data sets selected by Weka and compared it with C4.5 and NB. The results show that C4.5-NB significantly outperforms both C4.5 and NB in classification accuracy. We also evaluated the ranking performance of C4.5-NB in terms of AUC (the area under the receiver operating characteristic curve); here too, C4.5-NB significantly outperforms C4.5 and NB.
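The accuracy-weighted combination described above can be illustrated with a minimal sketch. The abstract does not give the exact weighting formula, so the normalized weights below are an assumption, and the function names `combine_estimates` and `predict` are hypothetical rather than part of Weka or of the paper.

```python
def combine_estimates(p_tree, p_nb, acc_tree, acc_nb):
    """Blend two class-probability vectors, weighting each model by its
    training accuracy (assumed scheme: weights normalized to sum to 1)."""
    total = acc_tree + acc_nb
    if total <= 0:
        raise ValueError("at least one accuracy must be positive")
    return [(acc_tree * a + acc_nb * b) / total for a, b in zip(p_tree, p_nb)]


def predict(p_tree, p_nb, acc_tree, acc_nb):
    """Return the index of the class with the largest combined estimate."""
    combined = combine_estimates(p_tree, p_nb, acc_tree, acc_nb)
    return max(range(len(combined)), key=combined.__getitem__)


# Hypothetical estimates for a two-class problem.
combined = combine_estimates([0.7, 0.3], [0.4, 0.6], acc_tree=0.9, acc_nb=0.8)
label = predict([0.7, 0.3], [0.4, 0.6], acc_tree=0.9, acc_nb=0.8)
```

Because the weights are normalized, the combined vector remains a probability distribution whenever both inputs are.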
Identifying the ladder network parameters of a transformer winding is crucial for interpreting frequency response function data. The traditional identification method, based mainly on intelligent optimisation algorithms, is generally very time-consuming because of its heavy computation. This study proposes combining an intelligent algorithm with the Gauss-Newton iteration algorithm (GNIA) to improve optimisation efficiency while sharply reducing the calculation workload. The two methods complement each other well: the intelligent algorithm has excellent global search ability, while the search of the GNIA is directional and quantitative. This study solves three key problems for the combined algorithms. The first is the calculation of the least-squares correction to the network parameters in the iteration algorithm. The second is the treatment of the ill-conditioned Jacobian matrix in the iteration algorithm. The third is the determination of network parameters with zero sensitivity. Identification results on an isolated winding show that the combined algorithms obtain a more precise solution with far less computation.
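As a rough illustration of the local refinement stage, the sketch below applies a damped Gauss-Newton iteration to a toy two-parameter model y = a·exp(−b·t). The toy model, the function name `gauss_newton`, and the small diagonal damping term (one common way to handle an ill-conditioned Jacobian) are all assumptions; the abstract does not state which treatment the authors use, and this is not a ladder network model. The starting point stands in for the output of a global search.

```python
import math


def gauss_newton(ts, ys, a, b, damping=1e-8, iters=50, tol=1e-12):
    """Fit y = a * exp(-b * t) by Gauss-Newton with a small diagonal damping."""
    for _ in range(iters):
        jtj_aa = jtj_ab = jtj_bb = 0.0  # entries of J^T J
        jtr_a = jtr_b = 0.0             # entries of J^T r
        for t, y in zip(ts, ys):
            e = math.exp(-b * t)
            r = y - a * e               # residual
            ja = e                      # d(model)/da
            jb = -a * t * e             # d(model)/db
            jtj_aa += ja * ja
            jtj_ab += ja * jb
            jtj_bb += jb * jb
            jtr_a += ja * r
            jtr_b += jb * r
        # Least-squares correction from (J^T J + damping * I) delta = J^T r,
        # solved in closed form for the 2x2 case.
        m_aa = jtj_aa + damping
        m_bb = jtj_bb + damping
        det = m_aa * m_bb - jtj_ab * jtj_ab
        da = (m_bb * jtr_a - jtj_ab * jtr_b) / det
        db = (m_aa * jtr_b - jtj_ab * jtr_a) / det
        a += da
        b += db
        if da * da + db * db < tol:
            break
    return a, b


# Synthetic, noise-free data generated from a = 2.0, b = 0.5.
ts = [0.5 * i for i in range(9)]
ys = [2.0 * math.exp(-0.5 * t) for t in ts]
a_hat, b_hat = gauss_newton(ts, ys, a=1.5, b=0.4)
```

The damping keeps the normal equations solvable when J^T J is nearly singular, at the cost of slightly shorter steps.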
Images and patterns under a cycle conversion T⁻¹T are discussed and computed with combined algorithms, where the transformation is T:(ξ,η)→(x,y) with linear or nonlinear functions x = x(ξ,η) and y = y(ξ,η). A new Area Method is presented for images under T and T⁻¹T for linear transformations, and three combinations of the Splitting-Shooting Method and the Splitting-Integrating Method are proposed for images under T⁻¹T for linear and nonlinear transformations. Furthermore, both error analysis and graphical experiments demonstrate the importance of these combinations to computer vision, image processing, graphics and pattern recognition.
In this paper, we address the problem of finding k-nearest neighbors (KNN) in sequence databases using the edit distance. Unlike most existing work, which uses short, exact n-gram matches together with a filter-and-refine framework for KNN sequence search, our new approach uses longer but approximate n-gram matches as the basis for pruning KNN candidates. Based on this idea, we devise a pipeline framework over a two-level index for KNN search in sequence databases. By coupling this framework with several efficient filtering strategies, i.e. the frequency queue and the well-known Combined Algorithm (CA), our proposal offers several advantages over existing work, including 1) a large reduction in false positive candidates, which avoids heavy candidate verification overhead; 2) progressive result updates and early termination; and 3) good extensibility to parallel computation. We conduct extensive experiments on three real datasets to verify the superiority of the proposed framework.
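For orientation, the sketch below shows a plain baseline for KNN search under edit distance, not the pipeline framework or two-level index from the paper. It uses the standard dynamic-programming edit distance and a simple length-difference lower bound as the filter step; the function names `edit_distance` and `knn` are hypothetical.

```python
import heapq


def edit_distance(s, t):
    """Levenshtein distance via a two-row dynamic programme."""
    if len(s) < len(t):
        s, t = t, s
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (cs != ct)))  # substitution or match
        prev = cur
    return prev[-1]


def knn(query, database, k):
    """Return the k sequences closest to query as sorted (distance, sequence) pairs."""
    heap = []  # max-heap on distance, stored as (-distance, sequence)
    for seq in database:
        # Filter: the length difference is a lower bound on the edit distance,
        # so skip candidates that cannot beat the current k-th best.
        if len(heap) == k and abs(len(seq) - len(query)) >= -heap[0][0]:
            continue
        d = edit_distance(query, seq)  # refine: exact verification
        if len(heap) < k:
            heapq.heappush(heap, (-d, seq))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, seq))
    return sorted((-nd, seq) for nd, seq in heap)


neighbours = knn("kitten", ["sitting", "mitten", "kitchen", "bitten", "a"], k=2)
```

Every candidate that survives the filter still pays for a full distance computation, which is the verification overhead that stronger filters aim to reduce.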
Methods for representation of real numbers by Cantor, Lüroth, Engel series as well as Ostrogradsky algorithms and combined ones are considered. The combined algorithms are given a physical interpretation. Their perfor...
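As a small illustration of one of the representations named above, the sketch below computes the Engel expansion of a rational number in (0, 1], that is, a non-decreasing sequence a₁, a₂, ... with x = 1/a₁ + 1/(a₁a₂) + .... It follows the standard textbook definition rather than the combined algorithms of the paper, and the function names `engel_expansion` and `engel_sum` are hypothetical.

```python
from fractions import Fraction


def engel_expansion(x, max_terms=20):
    """Return the Engel denominators of a rational x with 0 < x <= 1."""
    x = Fraction(x)
    if not 0 < x <= 1:
        raise ValueError("x must lie in (0, 1]")
    terms = []
    while x > 0 and len(terms) < max_terms:
        a = -(-x.denominator // x.numerator)  # ceil(1 / x) in integer arithmetic
        terms.append(a)
        x = x * a - 1                         # remainder for the next term
    return terms


def engel_sum(terms):
    """Rebuild the number from its Engel denominators."""
    total, product = Fraction(0), 1
    for a in terms:
        product *= a
        total += Fraction(1, product)
    return total


terms = engel_expansion(Fraction(3, 7))
```

For a rational input the expansion terminates, so `engel_sum(terms)` reproduces the input exactly.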
ISBN: 9781665491822 (print)
Document layout analysis is an important part of document information processing systems and is essential for many applications, such as optical character recognition (OCR), machine translation, information retrieval, and structured data extraction from documents, as well as for digitizing paper documents and for classifying and identifying regions of document images. Document images contain a wealth of information; to extract and classify regions of interest automatically, the layout of each document image is analyzed before OCR and automatic transcription. However, existing algorithms are still limited by the variety of document layouts, variations in block positions, inter-class and within-class variations, and background noise. This paper first summarizes traditional learning algorithms based on run-length smoothing and projection-based segmentation, deep learning algorithms using recurrent convolutional neural networks and twin (Siamese) networks, and algorithms proposed in recent years that combine traditional learning and deep learning. It then highlights the current mainstream deep learning algorithms, the datasets commonly used in experiments, and how to access them. It also compares several algorithms on benchmark datasets and reports experimental results that show good robustness. Finally, directions for future research are discussed.
ISBN: 9783642161674 (digital), 9783642161667 (print)
With the rapid growth of the personal credit business, we have been seeking an effective risk assessment model that achieves low cost and more accurate decision-making. Over the past few years, so-called combined algorithms have appeared in many fields, but they have rarely been used in individual credit risk assessment. We therefore constructed a practical method based on combined algorithms and tested it empirically. The results show that the method achieves better accuracy than the BP neural network.