ISBN: 9781450339513 (print)
Online reviews are an important source for consumers evaluating products and services on the Internet (e.g., Amazon, Yelp). However, a growing number of fraudulent reviewers write fake reviews to mislead users. To maximize their impact and share effort, many spam attacks are organized as campaigns run by a group of spammers. In this paper, we propose a new two-step method to discover spammer groups and their targeted products. First, we introduce NFS (Network Footprint Score), a new measure that quantifies the likelihood of products being spam campaign targets. Second, we carefully devise GroupStrainer to cluster spammers on a 2-hop subgraph induced by top-ranking products. Our approach has four key advantages: (i) unsupervised detection: both steps require no labeled data; (ii) adversarial robustness: we quantify statistical distortions in the review network, of which spammers have only a partial view, and avoid any side information that spammers can easily evade; (iii) sensemaking: the output facilitates the exploration of the nested hierarchy (i.e., organization) among the spammers; and (iv) scalability: both steps have complexity linear in network size, and GroupStrainer operates on a carefully induced subnetwork. We demonstrate the efficiency and effectiveness of our approach on both synthetic and real-world datasets from two different domains with millions of products and reviewers. Moreover, through case studies of our detected groups, we discover interesting strategies that spammers employ.
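The abstract does not spell out how the 2-hop subgraph is built, so the following is only a minimal sketch of one plausible reading: start from a set of top-ranking products, collect their reviewers (hop 1), then collect every product those reviewers have reviewed (hop 2). The function name `two_hop_subgraph`, the `(reviewer, product)` edge format, and the sample data are assumptions made for illustration; they are not taken from the paper, and the sketch covers neither NFS scoring nor GroupStrainer's clustering.

```python
from collections import defaultdict


def two_hop_subgraph(reviews, top_products):
    """Induce a 2-hop subgraph of a reviewer-product network.

    reviews: iterable of (reviewer, product) pairs (hypothetical format).
    top_products: seed products, e.g. those ranked highest by some score.
    Returns (reviewers, products, edges) of the induced subgraph.
    """
    by_product = defaultdict(set)   # product  -> reviewers who reviewed it
    by_reviewer = defaultdict(set)  # reviewer -> products they reviewed
    for reviewer, product in reviews:
        by_product[product].add(reviewer)
        by_reviewer[reviewer].add(product)

    # Hop 1: every reviewer of a seed product.
    reviewers = set()
    for product in top_products:
        reviewers |= by_product.get(product, set())

    # Hop 2: every product reviewed by those reviewers (seeds included).
    products = set()
    for reviewer in reviewers:
        products |= by_reviewer[reviewer]

    # Keep all review edges incident to the selected reviewers.
    edges = {(r, p) for r in reviewers for p in by_reviewer[r]}
    return reviewers, products, edges


# Toy data: reviewer "c" never touches the seed product, so it is excluded.
sample_reviews = [("a", "p1"), ("a", "p2"), ("b", "p1"), ("c", "p3"), ("b", "p4")]
reviewers, products, edges = two_hop_subgraph(sample_reviews, ["p1"])
```

Building both adjacency maps takes one pass over the review list, and the induced subgraph touches only edges incident to the selected reviewers, which is in keeping with the linear-complexity claim in the abstract.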
This paper examines a collection of assumptions used in the current literature on node anomaly detection in a network. The examination raises the question: What are anomalies in a network? Our attempt to answer this question has provided some interesting findings and led to some open questions. This is the first paper that formally defines anomalies in a network and introduces the concept of self-verifiability of a detector without ground truth in a network. These contributions enable existing detectors to be categorized into two types, according to whether or not they are self-verifiable. We suggest a method to evaluate self-verifiable detectors without ground truth, as an alternative to the existing evaluation method that relies on ground truth.