Data-intensive, large-scale distributed systems such as peer-to-peer (p2p) networks are finding a large number of applications in social networking, file sharing, and similar areas. Global data mining in such p2p environments can be very costly because of the scale and asynchronous nature of these networks. The cost increases further in the distributed data stream scenario, where peers rapidly receive a continuous sequence of transactions. In this paper, we develop an efficient local algorithm, p2p-FISM, for discovering the network-wide recent frequent itemsets. The algorithm works in a completely asynchronous manner, imposes low communication overhead (a necessity for scalability), transparently tolerates network topology changes, and quickly adapts to changes in the data stream. Experimental results are presented to corroborate the theoretical claims. © 2013 Elsevier B.V. All rights reserved.
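To make the term "recent frequent itemsets" concrete, the following is a minimal single-peer sketch in Python. It is not p2p-FISM and involves no network communication: it only counts itemsets over the most recent transactions seen by one peer. The function name `recent_frequent_itemsets` and the parameters `window_size`, `min_support`, and `max_len` are assumptions introduced for illustration and do not come from the abstract.

```python
from collections import Counter, deque
from itertools import combinations


def recent_frequent_itemsets(transactions, window_size, min_support, max_len=2):
    """Return itemsets that are frequent within the most recent transactions.

    Hypothetical illustration only: a brute-force count at a single peer,
    not the distributed p2p-FISM algorithm described in the abstract.
    `min_support` is a fraction of the window (e.g. 0.75 means 75%).
    """
    # A bounded deque keeps only the last `window_size` transactions.
    window = deque(transactions, maxlen=window_size)

    counts = Counter()
    for transaction in window:
        items = sorted(set(transaction))  # canonical order for itemset keys
        for size in range(1, max_len + 1):
            counts.update(combinations(items, size))

    threshold = min_support * len(window)
    return {itemset: n for itemset, n in counts.items() if n >= threshold}


stream = [
    {"a", "b"},
    {"a", "c"},
    {"a", "b", "c"},
    {"b", "c"},
    {"a", "b"},
]
result = recent_frequent_itemsets(stream, window_size=4, min_support=0.75)
```

With a window of four transactions, the oldest transaction in `stream` is dropped before counting, which is the sense in which the result is "recent". A network-wide version would additionally have to combine such local counts across peers, which is the part the paper's algorithm addresses.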
Spam appears in various forms, and the current trend in spamming is moving towards multimedia spam objects. Image spam is a new type of spam attack that attempts to bypass spam filters, which are mostly text-based. Spam reaches users in many ways, and it is usually countered by having a server filter out the spammers. This paper provides a fully distributed pattern recognition system within p2p networks that uses the distributed associative memory tree (DASMET) algorithm to detect spam; the system is cost-efficient and, unlike server-based systems, not prone to a single point of failure. The algorithm is scalable for large and frequently updated data sets, and is specifically designed for data sets that consist of similar occurring ***. We have evaluated our system against centralised state-of-the-art algorithms (NN, k-NN, naive Bayes, BpNN and RBFN) and distributed p2p-based algorithms (Ivote-DpV, ensemble k-NN, ensemble naive Bayes, and p2p-GN). The experimental results show that our method is highly accurate, with a 98 to 99% accuracy rate, and incurs a small number of messages: in the best case, it requires only two messages per recall test. In summary, our experimental results show that DASMET performs best among the distributed methods for spam detection while using a relatively small amount of resources.