Web services are commonly perceived as an environment that offers both opportunities and threats. One way to minimize threats in this environment is reputation evaluation, which can be computed, for example, from transaction feedback. However, the current feedback-based approach is inaccurate and ineffective because of inherent limitations (e.g., the feedback quality problem). Online reviews are the main source of feedback, and their quality varies greatly, mainly because (1) they have no standard expression format and (2) dishonest comments from malicious attacks may exist among them. To date, the review quality problem has not been well solved, which greatly reduces the value of reviews for service reputation evaluation. Therefore, we first present a novel approach that evaluates review quality in terms of multiple metrics. We then improve service reputation evaluation by basing it on the filtered reviews. Experimental results show the effectiveness and efficiency of our proposed approach compared with naive feedback-based approaches.
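As a rough illustration of reputation evaluation over filtered reviews, the sketch below drops reviews whose quality score falls under a cutoff and averages the remaining ratings, weighted by quality. The `(rating, quality)` pair layout, the `min_quality` cutoff, and the weighting rule are all assumptions made for illustration; the abstract does not specify the metrics or the aggregation actually used.

```python
def reputation(reviews, min_quality=0.5):
    """Quality-weighted mean rating over reviews that pass the quality filter.

    `reviews` is a list of (rating, quality) pairs; both fields and the
    `min_quality` cutoff are hypothetical, not taken from the paper.
    """
    kept = [(rating, quality) for rating, quality in reviews
            if quality >= min_quality]
    if not kept:
        return None  # no trustworthy feedback available
    total_weight = sum(quality for _, quality in kept)
    return sum(rating * quality for rating, quality in kept) / total_weight


# The third review has low quality and is filtered out.
reviews = [(5.0, 0.9), (4.0, 0.6), (1.0, 0.1)]
score = reputation(reviews)
naive = sum(rating for rating, _ in reviews) / len(reviews)
```

Here the filtered score is about 4.6, while the naive mean over all three ratings is pulled down by the low-quality review.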
On the internet, comprehensive lawyer information is scattered across separate information sources, which prevents web users from acquiring information effectively. In order to build a unified view of separated, heterogeneous, ...
The privacy-preserving data publication problem has attracted increasing attention in recent years. Much related research has addressed datasets with a single sensitive attribute. However, usually, ori...
As systems become more complex and workloads fluctuate more, it is very hard for a DBA to quickly analyze performance data and optimize the system; self-optimization is a promising technique. A data mi...
The paper describes the details of using J-SIM to simulate parallel recovery in main memory databases. In update-intensive main memory database systems, I/O is still the dominant performance bottleneck. A parallel recovery scheme for large-scale, update-intensive main memory database systems is proposed. Simulation provides a faster way of evaluating the new idea than implementing an actual system. J-SIM is an open source discrete time simulation software package. The simulation implementation using J-SIM is elaborated in terms of resource modeling, transaction processing system modeling, and workload modeling. Finally, analysis of the simulation results verifies the effectiveness of the parallel recovery scheme and demonstrates the feasibility of applying J-SIM to main memory database system simulation.
Recent research has focused on density queries for moving objects in highly dynamic scenarios. An area is dense if the number of moving objects it contains is above some threshold. Monitoring dense areas has applicati...
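The density definition above can be sketched with a simple grid count: bucket each object's position into a square cell and report the cells whose count exceeds the threshold. The uniform grid, the `cell_size` parameter, and the function name are assumptions for illustration only; the abstract does not say how areas are delimited or how moving objects are indexed.

```python
from collections import Counter


def dense_cells(positions, cell_size, threshold):
    """Return the grid cells that contain more than `threshold` objects.

    `positions` is an iterable of (x, y) coordinates at one time instant.
    The square grid is a hypothetical way to define an "area".
    """
    counts = Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y in positions
    )
    return {cell for cell, n in counts.items() if n > threshold}


# Three objects share cell (0, 0); one object sits alone in cell (3, 3).
positions = [(0.5, 0.5), (0.7, 0.2), (0.9, 0.9), (3.1, 3.2)]
result = dense_cells(positions, cell_size=1.0, threshold=2)
```

With a threshold of 2, only cell `(0, 0)` qualifies as dense. A monitoring system would rerun or incrementally update this count as objects move.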
In this paper, the author defines the Generalized Unique Game Problem (GUGP), where weights of the edges are allowed to be negative. Two special types of GUGP are illuminated, GUGP-NWA, where the weights of all edges are ...
Partial label learning is a weakly supervised learning framework in which each instance is associated with multiple candidate labels, among which only one is the ground-truth label. This paper proposes a unified formulation that employs proper label constraints for training models while simultaneously performing pseudo-labeling. Unlike existing partial label learning approaches that only leverage similarities in the feature space without utilizing label constraints, our pseudo-labeling process leverages similarities and differences in the feature space using the same candidate label constraints and then disambiguates noise labels. Experiments on artificial and real-world partial label datasets show that our approach significantly outperforms state-of-the-art counterparts on classification prediction.
Deep learning has shown significant improvements on various machine learning tasks by introducing a wide spectrum of neural network models. However, for these neural network models, it is necessary to label a tremendous amount of training data, which is prohibitively expensive in practice. In this paper, we propose the OnLine Machine Learning (OLML) database, which stores trained models and reuses these models in a new training task to achieve a better training effect with a small amount of training data. An efficient model reuse algorithm, AdaReuse, is developed in the OLML database. Specifically, AdaReuse first estimates the reuse potential of trained models from domain relatedness and model quality, through which a group of trained models with high reuse potential for the training task can be selected. Then, the multiple selected models are trained iteratively to encourage diverse models, with which a better training effect can be achieved by [...]. We evaluate AdaReuse on two types of natural language processing (NLP) tasks, and the results show that AdaReuse improves the training effect significantly compared with models trained from scratch when the training data is limited. Based on AdaReuse, we implement an OLML database prototype system that accepts a training task as an SQL-like query and automatically generates a training plan by selecting and reusing trained models. Studies are conducted to illustrate that the OLML database can properly store the trained models and reuse them efficiently in new training tasks.
Local differential privacy (LDP), a technique that employs unbiased statistical estimations instead of real data, is usually adopted in data collection, as it can protect every user’s privacy and prevent the leakage of sensitive information. The segment pairs method (SPM), multiple-channel method (MCM), and prefix extending method (PEM) are three known LDP protocols for heavy hitter identification as well as the frequency oracle (FO) problem with large domains. However, the low scalability of these three LDP algorithms often limits their [...]. Communication and computation strongly affect their [...], and excessive grouping or sharing of privacy budgets makes the results inaccurate. To address the abovementioned problems, this study proposes independent channel (IC) and mixed independent channel (MIC), which are efficient LDP protocols for FO with a large domain. We design a flexible method for splitting a large domain to reduce the number of [...]. We also employ the false positive rate with interaction to obtain an accurate [...]. Experiments demonstrate that IC outperforms all the existing solutions under the same privacy guarantee, while MIC performs well under a small privacy budget with the lowest communication cost.
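To make the frequency oracle idea concrete, the sketch below uses generalized randomized response (GRR), a standard baseline LDP protocol. It is not the IC or MIC protocol proposed in the abstract, and it does not address large domains. Each user reports the true value with probability p = e^ε / (e^ε + k − 1) and a uniformly chosen other value otherwise; the collector then inverts the expected counts to obtain unbiased frequency estimates. The function names, the toy domain, and the parameter values are assumptions for illustration.

```python
import math
import random
from collections import Counter


def grr_perturb(value, domain, epsilon, rng):
    """Client side: keep the true value with probability p, else lie."""
    k = len(domain)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p:
        return value
    others = [v for v in domain if v != value]
    return rng.choice(others)


def grr_estimate(reports, domain, epsilon):
    """Collector side: unbiased count estimate for every domain value."""
    k = len(domain)
    n = len(reports)
    e = math.exp(epsilon)
    p = e / (e + k - 1)   # probability of reporting the true value
    q = 1 / (e + k - 1)   # probability of reporting one specific other value
    counts = Counter(reports)
    # E[counts[v]] = n_v * p + (n - n_v) * q, solved for n_v.
    return {v: (counts[v] - n * q) / (p - q) for v in domain}


domain = ["a", "b", "c", "d"]
true_data = ["a"] * 10000 + ["b"] * 6000 + ["c"] * 3000 + ["d"] * 1000
rng = random.Random(0)  # fixed seed so the sketch is reproducible
reports = [grr_perturb(v, domain, 2.0, rng) for v in true_data]
estimates = grr_estimate(reports, domain, 2.0)
```

The estimates always sum to the number of reports, and each one approaches the true count as the number of users grows. GRR's accuracy degrades as the domain size k increases, which is the regime the large-domain protocols named above are designed for.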