The development of new data analytical methods remains a crucial factor in combating insurance fraud. Methods rooted in the research field of anomaly detection are considered promising candidates for this purpose. Commonly, a fraud data set contains both numeric and nominal attributes, where the latter, being easy to express, often encode valuable expert knowledge. For this reason, an anomaly detection method should be able to handle a mixture of different data types and return an anomaly score that is meaningful in the context of the business application. We propose the iForest(CAD) approach, which computes conditional anomaly scores useful for fraud detection. More specifically, anomaly detection is performed conditionally on well-defined data partitions that are created on the basis of selected numeric attributes and distinct combinations of values of selected nominal attributes. In this way, the resulting anomaly scores are computed with respect to a reference group of interest, and thus represent a meaningful score for domain experts. Because anomaly detection is performed conditionally, this approach can detect anomalies that would remain undiscovered in unconditional anomaly detection. Moreover, we present a case study in which we demonstrate the usefulness of our proposed approach on real-world workers' compensation claims received from a large European insurance organization. As a result, the iForest(CAD) approach is well accepted by domain experts for its effective detection of fraudulent claims.
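The idea of scoring each record against a reference group, rather than against the whole data set, can be illustrated with a minimal sketch. It is not the iForest(CAD) implementation: the isolation forest is replaced here by a simple robust z-score (distance from the median in units of the median absolute deviation) so that the example stays deterministic and dependency-free, and the partitions are built from nominal attributes only. The field names `injury` and `cost`, the helper names, and the toy claims are all hypothetical.

```python
from collections import defaultdict
from statistics import median


def robust_scores(values):
    """Score each value by |x - median| / MAD.

    Assumption: this is a stand-in for an isolation forest score.
    """
    med = median(values)
    # Fall back to 1.0 when the MAD is zero to avoid division by zero.
    mad = median([abs(v - med) for v in values]) or 1.0
    return [abs(v - med) / mad for v in values]


def conditional_scores(records, group_keys, value_key):
    """Score each record only against records with the same nominal values."""
    groups = defaultdict(list)
    for idx, rec in enumerate(records):
        groups[tuple(rec[k] for k in group_keys)].append(idx)

    scores = [0.0] * len(records)
    for indices in groups.values():
        group_scores = robust_scores([records[i][value_key] for i in indices])
        for i, score in zip(indices, group_scores):
            scores[i] = score
    return scores


# Hypothetical claims: the last cost is unremarkable overall,
# but unusual for the "sprain" reference group.
claims = (
    [{"injury": "sprain", "cost": c} for c in (90, 100, 110, 95, 105)]
    + [{"injury": "fracture", "cost": c} for c in (900, 1000, 1100, 950, 1050)]
    + [{"injury": "sprain", "cost": 500}]
)

unconditional = robust_scores([c["cost"] for c in claims])
conditional = conditional_scores(claims, ["injury"], "cost")
```

In this toy data the last claim sits exactly at the overall median cost, so the unconditional score does not flag it, while the score computed within the "sprain" partition is far larger than that of any other claim. Swapping `robust_scores` for an isolation forest fitted per partition would bring the sketch closer to the approach described in the abstract.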
In various industrial problems, sensor data are often used to detect the abnormal state of manufacturing systems. Sensor data are sometimes influenced by contextual variables that are not related to the system health status and may exhibit different behaviours depending on their values, even if the system is in a normal condition. In this case, a conditional anomaly detection method should be used to consider the effects of contextual variables. In this study, we propose a conditional anomaly detection method, particularly for high-dimensional and complex data, using a deep embedding kernel mixture network. The proposed method comprises embedding and kernel mixture networks. The embedding network learns low-dimensional embeddings from high-dimensional data, and the kernel mixture network models the distribution of the learned embeddings conditional on contextual variables. The two networks enable a flexible estimation of conditional density using the high expressive power of deep neural networks. The two networks are trained simultaneously such that the high-dimensional data are embedded into a low-dimensional space, to assist conditional density estimation. The effectiveness of the proposed model is demonstrated using real data examples from the UCI repository and a case study from a tire company.
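To make the notion of a conditional density score concrete, the following sketch estimates p(y | c) for a one-dimensional reading y and a one-dimensional contextual variable c, and uses the negative log-density as the anomaly score. It is only an illustration under strong assumptions: a fixed-bandwidth Gaussian kernel estimator stands in for the deep embedding and kernel mixture networks, no embedding step is performed, and the function names, bandwidths, and toy data are hypothetical.

```python
import math


def gauss(u, h):
    """Gaussian kernel with bandwidth h, evaluated at u."""
    return math.exp(-0.5 * (u / h) ** 2) / (h * math.sqrt(2 * math.pi))


def conditional_density(y, c, train, h_y=0.5, h_c=0.5):
    """Kernel estimate of p(y | c) from (context, reading) training pairs."""
    weights = [gauss(c - ci, h_c) for ci, _ in train]
    total = sum(weights)
    if total == 0.0:
        return 0.0  # no training context anywhere near c
    weighted = sum(w * gauss(y - yi, h_y) for w, (_, yi) in zip(weights, train))
    return weighted / total


def anomaly_score(y, c, train):
    """Negative log conditional density; higher means more anomalous."""
    return -math.log(conditional_density(y, c, train) + 1e-12)


# Hypothetical normal operation: the reading follows the context (y = 2c).
train = [(c / 10, 2 * c / 10) for c in range(0, 51)]

expected_reading = anomaly_score(6.0, 3.0, train)  # y matches c = 3
odd_reading = anomaly_score(2.0, 3.0, train)       # y fits c = 1, not c = 3
same_value_other_context = anomaly_score(2.0, 1.0, train)
```

The reading 2.0 lies well inside the overall range of normal readings, so an unconditional detector would have little reason to flag it; it scores as anomalous only once the context c = 3 is taken into account.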
Traditional anomaly detection tends to produce too many false positives in many problem *** this work, a Superimpose Rule-Based Classification Algorithm (SRBCA) is proposed for conditional anomaly *** algorithm is an enhancement of the traditional OneR *** traditional OneR can generate a set of rules from its attributes with multiple classes, compute the error rate and apply the rule to the attribute with the smallest ***, OneR has a disadvantage for one-class datasets, which contain values belonging to the normal *** enhanced algorithm, SRBCA, does not embody very complex rules similar to its ***, SRBCA includes the generation and application of rules from the one-class dataset in an n-dimensional space using *** method was used to evaluate the performance of the classifiers' accuracy, which involved training multiple subsets' behavioral and indicator attributes, superimposing rules and testing by using balanced and unbalanced class data to detect and label conditional anomaly data *** paper shows the comparison between SRBCA, One-Class Support Vector Machine (OCSVM) and other anomaly detection classification algorithms for conditional anomaly *** proves that the new method can handle one-class multivariate data for conditional anomaly detection with better accuracy.
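For reference, the traditional OneR baseline mentioned above can be sketched in a few lines: for every attribute, map each attribute value to its most frequent class, count the training errors of that rule set, and keep the attribute with the fewest errors. This is a minimal sketch of standard OneR only, not of SRBCA, whose rule generation and superimposition are not specified in enough detail here to reproduce. The function name and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def one_r(rows, labels):
    """Return (attribute_index, rule, errors) for the best single attribute.

    rule maps each value of the chosen attribute to its majority class.
    """
    best = None
    for attr in range(len(rows[0])):
        # Count class frequencies for every value of this attribute.
        counts = defaultdict(Counter)
        for row, label in zip(rows, labels):
            counts[row[attr]][label] += 1

        # One rule per attribute value: predict the most frequent class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(rule[row[attr]] != label for row, label in zip(rows, labels))

        if best is None or errors < best[2]:
            best = (attr, rule, errors)
    return best


# Hypothetical training data: (outlook, temperature) -> play?
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "hot"), ("rainy", "mild")]
labels = ["no", "no", "yes", "yes"]

attr, rule, errors = one_r(rows, labels)
```

The sketch also makes the one-class limitation visible: if every training label is the same (only normal data), every attribute yields zero errors and every rule predicts that single class, so OneR cannot separate anomalies on its own.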