The financial fraud detection problem involves the analysis of large financial datasets. The financial statement fraud detection process concentrates on two major aspects: first, identifying the financial variables and ratios, also termed features; second, applying data mining methods to classify organizations into two broad categories, fraudulent and non-fraudulent. If the input dataset contains a large number of irrelevant and correlated features, the computational load of the machine learning technique increases and the effectiveness of the classification outcomes decreases. The feature selection process selects a subset of the most significant attributes or variables that can represent the original data. This selected subset can help in learning the patterns in the data in much less time and more accurately, in order to produce useful information for decision-making. This article briefly reviews the methods applied in prior studies to select features for financial statement fraud detection. It also presents an approach to feature selection using correlation-based filter selection methods, in which feature selection is performed based on an ensemble model, and tests the outcome of the approach by applying mean ratio analysis to financial data of Indian companies.
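To make the idea of a correlation-based filter concrete, the following is a minimal sketch, not the article's actual procedure. It assumes Pearson correlation as the measure, keeps features that correlate with the fraud label, and drops any feature that is nearly collinear with one already kept. The function name `correlation_filter`, the thresholds, the feature names, and the synthetic data are all hypothetical, and the ensemble step and mean ratio analysis mentioned above are omitted.

```python
import numpy as np

def correlation_filter(X, y, names, relevance_min=0.3, redundancy_max=0.9):
    """Greedy correlation-based filter (illustrative thresholds, assumed)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Relevance: absolute Pearson correlation of each feature with the label.
    relevance = np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
    )
    # Redundancy: absolute pairwise correlation between features.
    redundancy = np.abs(np.corrcoef(X, rowvar=False))
    selected = []
    # Visit features from most to least relevant.
    for j in np.argsort(-relevance):
        if relevance[j] < relevance_min:
            break  # all remaining features are even less relevant
        # Keep the feature only if it is not redundant with one already kept.
        if all(redundancy[j, k] <= redundancy_max for k in selected):
            selected.append(j)
    return [names[j] for j in selected]

# Synthetic data standing in for financial ratios (hypothetical, not real data).
rng = np.random.default_rng(0)
n = 500
y = rng.integers(0, 2, size=n)                   # 1 = fraudulent, 0 = non-fraudulent
ratio_a = y + 0.5 * rng.standard_normal(n)       # relevant
ratio_b = ratio_a + 0.01 * rng.standard_normal(n)  # near-duplicate of ratio_a
ratio_c = rng.standard_normal(n)                 # irrelevant noise
ratio_d = -y + 0.5 * rng.standard_normal(n)      # relevant, not redundant
X = np.column_stack([ratio_a, ratio_b, ratio_c, ratio_d])
names = ["ratio_a", "ratio_b", "ratio_c", "ratio_d"]

selected = correlation_filter(X, y, names)
print(selected)
```

In this sketch only one of the two near-duplicate ratios survives, and the noise feature is filtered out for lack of relevance. The thresholds would need tuning on real data.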