In search engines, different users may search for different information by issuing the same query. To satisfy more users with limited search results, search result diversification re-ranks the results to cover as many user intents as possible. Most existing intent-aware diversification algorithms recognize user intents as subtopics, each of which is usually a word, a phrase, or a piece of description. In this paper, we leverage query facets to understand user intents in diversification, where each facet contains a group of words or phrases that explain an underlying intent of a query. We generate subtopics based on query facets and propose faceted diversification approaches. Experimental results on the public TREC 2009 dataset show that our faceted approaches outperform state-of-the-art diversification models.
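The re-ranking idea can be illustrated with a small greedy sketch. It is not the paper's algorithm: it uses a generic xQuAD-style objective, and every name here (`diversify`, `coverage`, `intent_weights`, `lam`) is hypothetical. It assumes each intent (for example, one derived from a query facet) has a weight, and each document has a score in [0, 1] for how well it covers that intent.

```python
def diversify(candidates, relevance, coverage, intent_weights, k, lam=0.5):
    """Greedy xQuAD-style re-ranking (illustrative assumption, not the paper's method).

    relevance[d]      -- relevance of document d to the query
    coverage[d][i]    -- how well d covers intent i, in [0, 1]
    intent_weights[i] -- importance of intent i
    lam               -- trade-off: 0 = pure relevance, 1 = pure diversity
    """
    selected = []
    remaining = list(candidates)
    # not_covered[i]: how much of intent i is still unsatisfied by selected docs
    not_covered = {i: 1.0 for i in intent_weights}
    while remaining and len(selected) < k:
        def score(d):
            novelty = sum(
                intent_weights[i] * coverage[d].get(i, 0.0) * not_covered[i]
                for i in intent_weights
            )
            return (1 - lam) * relevance[d] + lam * novelty
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
        # Intents covered by the chosen document count for less in later rounds
        for i in intent_weights:
            not_covered[i] *= 1.0 - coverage[best].get(i, 0.0)
    return selected


docs = ["d1", "d2", "d3"]
rel = {"d1": 0.9, "d2": 0.85, "d3": 0.5}
cov = {"d1": {"a": 1.0}, "d2": {"a": 1.0}, "d3": {"b": 1.0}}
weights = {"a": 0.5, "b": 0.5}
top2 = diversify(docs, rel, cov, weights, k=2)
```

In this toy input, `d2` is more relevant than `d3` but covers the same intent as `d1`, so the diversified top two are `d1` and `d3`. Setting `lam=0` falls back to ranking by relevance alone.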
Schema summarization on large-scale databases is a challenge. In a typical large database schema, a large proportion of the tables are closely connected through a few high-degree tables. It is thus difficult to separate these tables into clusters that represent different topics. Moreover, because a schema can be very large, the schema summary needs to be structured into multiple levels to improve usability. In this paper, we introduce a new schema summarization approach that uses community detection techniques from social networks. Our approach contains three steps. First, we use a community detection algorithm to divide a database schema into subject groups, each representing a specific subject. Second, we cluster the subject groups into abstract domains to form a multi-level navigation structure. Third, we discover representative tables in each cluster to label the schema summary. We evaluate our approach on Freebase, a real-world large-scale database. The results show that our approach can identify subject groups precisely. The generated abstract schema layers are very helpful for users exploring the database.
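As a rough illustration of the first and third steps, the sketch below scores a candidate grouping of tables with Newman modularity, a quality measure that many community detection algorithms optimize, and labels each group with its highest-degree table. The abstract does not say which detection algorithm or labeling rule the authors use, so both choices are assumptions, and the table names are invented.

```python
from collections import defaultdict

def modularity(edges, groups):
    """Newman modularity of a partition of an undirected schema graph.

    edges  -- list of (table, table) pairs, e.g. foreign-key links
    groups -- dict mapping group name to a list of tables
    """
    m = len(edges)
    group_of = {t: g for g, tables in groups.items() for t in tables}
    internal = defaultdict(int)    # edges with both ends inside a group
    degree_sum = defaultdict(int)  # sum of table degrees per group
    for u, v in edges:
        degree_sum[group_of[u]] += 1
        degree_sum[group_of[v]] += 1
        if group_of[u] == group_of[v]:
            internal[group_of[u]] += 1
    return sum(internal[g] / m - (degree_sum[g] / (2 * m)) ** 2 for g in groups)

def representative_tables(edges, groups):
    """Pick the highest-degree table in each group (ties broken by name)."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return {g: max(sorted(tables), key=lambda t: degree[t])
            for g, tables in groups.items()}


# Hypothetical schema: two tightly linked clusters joined by one link
edges = [("film", "actor"), ("film", "director"), ("actor", "director"),
         ("city", "country"), ("city", "region"), ("country", "region"),
         ("director", "city")]
groups = {"movies": ["film", "actor", "director"],
          "geo": ["city", "country", "region"]}
q = modularity(edges, groups)
labels = representative_tables(edges, groups)
```

A grouping that follows the two clusters scores higher than putting every table in one group, which is the signal a community detection algorithm would search for.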
This paper proposes an effective fusion of Analytic Hierarchy Process (AHP) and Grey Relational Analysis (GRA) approach for the risk evaluation in Mobile Commerce (MC) development. The hybrid method employs the comple...
Unmanned aerial vehicles (UAVs) have emerged as the potential aerial base stations (BSs) to improve terrestrial communications. However, the limited onboard energy and antenna power of a UAV restrict its communication...
The rapid development of the low-altitude economy (LAE) has significantly increased the utilization of autonomous aerial vehicles (AAVs) in various applications, necessitating efficient and secure communication methods among AAV swarms. In this work, we aim to introduce distributed collaborative beamforming (DCB) into AAV swarms and handle the eavesdropper collusion by controlling the corresponding signal distributions. Specifically, we consider a two-way DCB-enabled aerial communication between two AAV swarms and construct these swarms as two AAV virtual antenna arrays. Then, we minimize the two-way known secrecy capacity and maximum sidelobe level to avoid information leakage from the known and unknown eavesdroppers, respectively. Simultaneously, we also minimize the energy consumption of AAVs when constructing virtual antenna arrays. Due to the conflicting relationships between secure performance and energy efficiency, we consider these objectives by formulating a multi-objective optimization problem, which is NP-hard and with a large number of decision variables. Accordingly, we design a novel generative swarm intelligence (GenSI) framework to solve the problem with less overhead, which contains a conditional variational autoencoder (CVAE)-based generative method and a proposed powerful swarm intelligence algorithm. In this framework, CVAE can collect expert solutions obtained by the swarm intelligence algorithm in other environment states to explore characteristics and patterns, thereby directly generating high-quality initial solutions in new environment factors for the swarm intelligence algorithm to search solution space efficiently. Simulation results show that the proposed swarm intelligence algorithm outperforms other state-of-the-art baseline algorithms, and the GenSI can achieve similar optimization results by using far fewer iterations than the ordinary swarm intelligence algorithm. Experimental tests demonstrate that introducing the CVAE mechanism ach
Due to the rapid growth of social network services (SNSs), research into SNS continuance usage has recently emerged as an important issue in information systems adaption. This study develops an integrated model based on ...
In RFID application systems with multiple packaging layers, labeling the packaging relationships of objects in different packaging layers by encoding methods is an important technology field. Prefix-based labeling scheme is...
Partial label learning is a weakly supervised learning framework in which each instance is associated with multiple candidate labels, among which only one is the ground-truth *** paper proposes a unified formulation that employs proper label constraints for training models while simultaneously performing *** existing partial label learning approaches that only leverage similarities in the feature space without utilizing label constraints, our pseudo-labeling process leverages similarities and differences in the feature space using the same candidate label constraints and then disambiguates noise *** experiments on artificial and real-world partial label datasets show that our approach significantly outperforms state-of-the-art counterparts on classification prediction.
Deep learning has shown significant improvements on various machine learning tasks by introducing a wide spectrum of neural network ***, for these neural network models, it is necessary to label a tremendous amount of training data, which is prohibitively expensive in *** this paper, we propose OnLine Machine Learning (OLML) database which stores trained models and reuses these models in a new training task to achieve a better training effect with a small amount of training *** efficient model reuse algorithm AdaReuse is developed in the OLML ***, AdaReuse firstly estimates the reuse potential of trained models from domain relatedness and model quality, through which a group of trained models with high reuse potential for the training task could be selected ***, multi selected models will be trained iteratively to encourage diverse models, with which a better training effect could be achieved by *** evaluate AdaReuse on two types of natural language processing (NLP) tasks, and the results show AdaReuse could improve the training effect significantly compared with models training from scratch when the training data is *** on AdaReuse, we implement an OLML database prototype system which could accept a training task as an SQL-like query and automatically generate a training plan by selecting and reusing trained *** studies are conducted to illustrate the OLML database could properly store the trained models, and reuse the trained models efficiently in new training tasks.
Local differential privacy (LDP), which is a technique that employs unbiased statistical estimations instead of real data, is usually adopted in data collection, as it can protect every user’s privacy and prevent the leakage of sensitive *** segment pairs method (SPM), multiple-channel method (MCM) and prefix extending method (PEM) are three known LDP protocols for heavy hitter identification as well as the frequency oracle (FO) problem with large ***, the low scalability of these three LDP algorithms often limits their ***, communication and computation strongly affect their ***, excessive grouping or sharing of privacy budgets makes the results *** address the abovementioned problems, this study proposes independent channel (IC) and mixed independent channel (MIC), which are efficient LDP protocols for FO with a large *** design a flexible method for splitting a large domain to reduce the number of ***, we employ the false positive rate with interaction to obtain an accurate *** experiments demonstrate that IC outperforms all the existing solutions under the same privacy guarantee while MIC performs well under a small privacy budget with the lowest communication cost.
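To make the frequency oracle idea concrete, here is a minimal sketch of generalized randomized response (GRR), a standard small-domain LDP frequency oracle. It is not the IC or MIC protocol from the abstract, and the function names are hypothetical. Each user reports the true value with probability p = e^ε / (e^ε + k − 1) and any other specific value with probability q = 1 / (e^ε + k − 1). The collector then debiases the observed counts.

```python
import math
import random
from collections import Counter

def grr_perturb(value, domain, epsilon, rng):
    """Client side: keep the true value with probability p, otherwise
    report a uniformly chosen different value (domain needs >= 2 items)."""
    k = len(domain)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p:
        return value
    return rng.choice([v for v in domain if v != value])

def grr_estimate(reports, domain, epsilon):
    """Server side: unbiased count estimates from perturbed reports."""
    k, n = len(domain), len(reports)
    e = math.exp(epsilon)
    p = e / (e + k - 1)
    q = 1.0 / (e + k - 1)
    counts = Counter(reports)
    # E[counts[v]] = n_v * p + (n - n_v) * q, solved for n_v
    return {v: (counts[v] - n * q) / (p - q) for v in domain}


rng = random.Random(0)  # fixed seed so the sketch is repeatable
domain = ["a", "b", "c"]
true_data = ["a"] * 7000 + ["b"] * 2000 + ["c"] * 1000
reports = [grr_perturb(v, domain, 2.0, rng) for v in true_data]
estimates = grr_estimate(reports, domain, 2.0)
```

The estimates always sum to the number of reports, and each one approaches the true count as more users participate. The accuracy of GRR degrades as the domain size k grows, which is the large-domain setting the abstract is concerned with.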