The web today hosts millions of datasets, and their number continues to grow at a rapid pace. These datasets are not standalone entities; rather, they are intricately connected through complex relationships. Se...
To distinguish subtle differences among fine-grained categories, a large number of well-labeled images is typically required. However, acquiring manual annotations for fine-grained categories is an extremely difficult task, as it usually demands professional knowledge. To this end, directly leveraging web images for learning fine-grained models becomes a natural choice. Nevertheless, due to the existence of label noise, this learning paradigm tends to perform poorly. In this work, we propose an end-to-end approach that combines dynamic loss correction and global sample selection to alleviate the problem of label noise. Specifically, we leverage the network to predict all samples, record the predictions over the most recent epochs, and calculate the uncertainty-based dynamic loss for global sample selection. Extensive experiments on three benchmark datasets demonstrate the effectiveness of our proposed approach. The source code of our approach has been released at: https://***/NUST-Machine-Intelligence-Laboratory/dlc.
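To make the recording-and-selection step concrete, below is a minimal, hypothetical PyTorch sketch of one way it could work: keep a ring buffer of each sample's softmax predictions over the last few epochs, score each sample by the variance of those predictions, and compute the loss only over the low-uncertainty fraction of each batch. The names (PredictionHistory, selected_loss), the variance-based uncertainty score, and the keep_ratio parameter are illustrative assumptions, not the authors' released implementation.

import torch
import torch.nn.functional as F

class PredictionHistory:
    """Ring buffer of softmax predictions for the last `window` epochs."""
    def __init__(self, num_samples: int, num_classes: int, window: int = 5):
        self.window = window
        self.probs = torch.zeros(window, num_samples, num_classes)
        self.epoch = 0

    def record(self, sample_idx: torch.Tensor, logits: torch.Tensor) -> None:
        # Store the current epoch's softmax outputs for these samples.
        self.probs[self.epoch % self.window, sample_idx] = \
            F.softmax(logits, dim=1).detach().cpu()

    def next_epoch(self) -> None:
        self.epoch += 1

    def uncertainty(self, sample_idx: torch.Tensor) -> torch.Tensor:
        # One plausible uncertainty proxy (an assumption here): total variance
        # of the class probabilities across the recorded epochs; unstable
        # predictions suggest a potentially mislabeled sample.
        hist = self.probs[:, sample_idx]               # (window, batch, classes)
        return hist.var(dim=0, unbiased=False).sum(dim=1)

def selected_loss(logits, targets, sample_idx, history, keep_ratio=0.7):
    """Cross-entropy over the keep_ratio fraction of the batch with the
    lowest uncertainty; a stand-in for the paper's global selection."""
    per_sample = F.cross_entropy(logits, targets, reduction="none")
    unc = history.uncertainty(sample_idx).to(per_sample.device)
    k = max(1, int(keep_ratio * len(per_sample)))
    keep = torch.topk(-unc, k).indices                 # smallest uncertainty
    return per_sample[keep].mean()

In a training loop, one would call history.record(...) on every batch, history.next_epoch() at each epoch boundary, and back-propagate selected_loss once enough epochs have been recorded.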
ISBN (Print): 9781728110516
AdaBoost is perhaps one of the most well-known ensemble learning algorithms. In simple terms, the idea in AdaBoost is to train a number of weak learners in an incremental fashion, where each new learner tries to focus more on the samples that were misclassified by the preceding classifiers. Consequently, in the presence of noisy data samples, the new learners will to some extent memorize the data, which in turn leads to an overfitted model. The main objective of this paper is to provide a generalized version of the AdaBoost algorithm that avoids overfitting and performs better when the data samples are corrupted with noise. To this end, we make use of another ensemble learning algorithm called ValidBoost [15], and introduce a mechanism to dynamically determine the thresholds for both the error rate of each classifier and the error rate in each iteration. These thresholds enable us to control the error rate of the algorithm. Experiments were conducted on several benchmark datasets, including web datasets such as the "Website Phishing Data Set" and the "Page Blocks Classification Data Set", to evaluate the performance of our proposed algorithm.
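As an illustration of the thresholding idea, below is a hypothetical NumPy/scikit-learn sketch of a standard AdaBoost loop (decision stumps as weak learners) that stops adding learners once a round's weighted error rate exceeds a dynamic threshold. The threshold schedule and the slack parameter are assumptions made for illustration; the paper's exact rule, derived from ValidBoost [15], may differ.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def boost_with_threshold(X, y, n_rounds=50, slack=0.45):
    """y must be in {-1, +1}. Returns (learners, alphas)."""
    n = len(y)
    w = np.full(n, 1.0 / n)                  # uniform initial sample weights
    learners, alphas = [], []
    for t in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = np.sum(w[pred != y])           # weighted error rate this round
        # Illustrative dynamic threshold: starts near 0.5 and tightens
        # with t, so later (more noise-prone) learners must be more accurate.
        threshold = slack * (1.0 - t / (2.0 * n_rounds))
        if err <= 0 or err >= threshold:
            break                            # reject this learner and stop
        alpha = 0.5 * np.log((1 - err) / err)
        w *= np.exp(-alpha * y * pred)       # upweight misclassified samples
        w /= w.sum()                         # renormalize the distribution
        learners.append(stump)
        alphas.append(alpha)
    return learners, alphas

def predict(X, learners, alphas):
    # Weighted majority vote of the accepted weak learners.
    score = sum(a * h.predict(X) for a, h in zip(alphas, learners))
    return np.sign(score)

Without the threshold check this reduces to plain discrete AdaBoost; the check is what caps how much influence noisy rounds can accumulate.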