With the widespread adoption of flash memory, the Flash Friendly File System (F2FS), designed for flash memory characteristics, has become widely used in large data centers. However, F2FS suffers from significant clea...
In recent years, research has illuminated the potency of implicit data processing in enhancing user preferences. Nevertheless, barriers remain in breaking through the constraints of implicit information. This study ai...
In the modern digital landscape, integrating geographic locations and textual descriptions within a geo-textual dataset enhances location-based services (LBS) via spatial keyword queries, as these queries combine spat...
Three-dimensional magnetic recording (3DMR) is a highly promising approach to achieving ultra-large data storage capacity in hard disk drives. One of the greatest challenges for 3DMR lies in performing sequential and ...
Partial label learning is a weakly supervised learning framework in which each instance is associated with multiple candidate labels, among which only one is the ground-truth label. This paper proposes a unified formulation that employs proper label constraints for training models while simultaneously performing pseudo-labeling. Unlike existing partial label learning approaches that only leverage similarities in the feature space without utilizing label constraints, our pseudo-labeling process leverages similarities and differences in the feature space using the same candidate label constraints and then disambiguates noise labels. Extensive experiments on artificial and real-world partial label datasets show that our approach significantly outperforms state-of-the-art counterparts on classification prediction.
Deep learning has shown significant improvements on various machine learning tasks by introducing a wide spectrum of neural network models. However, for these neural network models, it is necessary to label a tremendous amount of training data, which is prohibitively expensive in *** In this paper, we propose an OnLine Machine Learning (OLML) database which stores trained models and reuses these models in a new training task to achieve a better training effect with a small amount of training data. An efficient model reuse algorithm, AdaReuse, is developed in the OLML database. Specifically, AdaReuse first estimates the reuse potential of trained models from domain relatedness and model quality, through which a group of trained models with high reuse potential for the training task can be selected ***, multiple selected models are trained iteratively to encourage diverse models, with which a better training effect can be achieved by *** We evaluate AdaReuse on two types of natural language processing (NLP) tasks, and the results show that AdaReuse can improve the training effect significantly compared with models trained from scratch when the training data is *** Based on AdaReuse, we implement an OLML database prototype system which accepts a training task as an SQL-like query and automatically generates a training plan by selecting and reusing trained *** studies are conducted to illustrate that the OLML database can properly store the trained models and reuse them efficiently in new training tasks.
Local differential privacy (LDP), which is a technique that employs unbiased statistical estimations instead of real data, is usually adopted in data collection, as it can protect every user’s privacy and prevent the leakage of sensitive information. The segment pairs method (SPM), multiple-channel method (MCM) and prefix extending method (PEM) are three known LDP protocols for heavy hitter identification as well as the frequency oracle (FO) problem with large domains. However, the low scalability of these three LDP algorithms often limits their ***, communication and computation strongly affect their ***, excessive grouping or sharing of privacy budgets makes the results *** To address the abovementioned problems, this study proposes independent channel (IC) and mixed independent channel (MIC), which are efficient LDP protocols for FO with a large domain. We design a flexible method for splitting a large domain to reduce the number of ***, we employ the false positive rate with interaction to obtain an accurate *** experiments demonstrate that IC outperforms all the existing solutions under the same privacy guarantee, while MIC performs well under a small privacy budget with the lowest communication cost.
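To make the notion of a frequency oracle built on unbiased estimation concrete, here is a minimal sketch of generalized randomized response, a standard textbook LDP frequency oracle. It is not the IC, MIC, SPM, MCM or PEM protocol from the abstract, whose details are not given here; the function names, the four-item domain and the sample data are assumptions chosen for illustration.

```python
import math
import random
from collections import Counter


def grr_perturb(value, domain, epsilon, rng):
    """Client side: report the true value with probability p, otherwise a
    uniformly chosen different value (generalized randomized response)."""
    k = len(domain)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p:
        return value
    others = [v for v in domain if v != value]
    return rng.choice(others)


def grr_estimate(reports, domain, epsilon):
    """Server side: unbiased frequency estimate for every value in the domain."""
    k = len(domain)
    n = len(reports)
    e = math.exp(epsilon)
    p = e / (e + k - 1)   # probability of reporting the true value
    q = 1 / (e + k - 1)   # probability of reporting one specific other value
    counts = Counter(reports)
    # E[count_v / n] = f_v * p + (1 - f_v) * q, so invert that relation.
    return {v: (counts[v] / n - q) / (p - q) for v in domain}


def demo(seed=0):
    """Hypothetical data: 60% 'a', 30% 'b', 10% 'c', 0% 'd'."""
    rng = random.Random(seed)
    domain = ["a", "b", "c", "d"]
    true_data = ["a"] * 6000 + ["b"] * 3000 + ["c"] * 1000
    epsilon = 1.0
    reports = [grr_perturb(x, domain, epsilon, rng) for x in true_data]
    return grr_estimate(reports, domain, epsilon)
```

The probability of reporting the true value shrinks as the domain size k grows, so the estimates become noisier for large domains. That general weakness is one reason large-domain protocols split the domain into smaller pieces.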
Latent Dirichlet allocation (LDA) is a topic model widely used for discovering hidden semantics in massive text *** Collapsed Gibbs sampling (CGS), a widely used algorithm for learning the parameters of LDA, carries a risk of privacy ***, word count statistics and updates of latent topics in CGS, which are essential for parameter estimation, could be employed by adversaries to conduct effective membership inference attacks (MIAs). Until now, two kinds of methods have been exploited in CGS to defend against MIAs: adding noise to word count statistics and utilizing inherent privacy. These two kinds of methods have their respective *** Noise sampled from the Laplacian distribution sometimes produces negative word count statistics, which results in poor parameter estimation in *** inherent privacy can provide only weak guaranteed privacy when defending against MIAs. It is promising to propose an effective framework to obtain accurate parameter estimations with guaranteed differential privacy. The key issue in obtaining accurate parameter estimations when introducing differential privacy in CGS is making good use of the privacy budget such that a precise noise scale is *** is the first time that Rényi differential privacy (RDP) has been introduced into CGS, and we propose RDP-LDA, an effective framework for analyzing the privacy loss of any differentially private CGS. RDP-LDA can be used to derive a tighter upper bound of privacy loss than the overestimated results of existing differentially private CGS obtained by ε-*** RDP-LDA, we propose a novel truncated-Gaussian mechanism that keeps word count statistics *** we propose distribution perturbation, which can provide more rigorous guaranteed privacy than utilizing inherent *** validate that our proposed methods produce more accurate parameter estimation under the JS-divergence metric and obtain lower precision and recall when defending against MIAs.
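The abstract names JS-divergence as the accuracy metric but does not say how it is computed. The sketch below uses the standard Jensen-Shannon divergence between two discrete distributions, with natural logarithms. Applying it to a pair of topic-word distributions is an assumption for illustration, not the authors' evaluation code.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q); terms with p_i == 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: the average KL divergence of p and q
    from their midpoint distribution m. Symmetric and bounded by log(2)."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```

A smaller value means the distribution estimated under privacy noise is closer to the reference distribution. Identical distributions give 0, and distributions with disjoint support give the maximum, log(2).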
Multi-rotor unmanned aerial vehicles (UAVs) have been widely employed in various sensing tasks, e.g., environmental monitoring and disaster rescuing, many of which often require full coverage of terrestrial regions by...
Urbanization has resulted in growing ecological pressures on cities, necessitating assessments of urban ecological *** Long-term characterization of regional dynamics and drivers is critical for environmental *** This study proposes an enhanced ecological quality model (MRSEI) incorporating vegetation cover and EVI rather than just *** The MRSEI model was applied to analyse ecological quality in Yulin City during 2000-2018 using Landsat TM/OLI data on Google Earth *** detectors also quantified anthropogenic and environmental influences on the study area. The results are summarized as follows: (1) MRSEI showed an average correlation coefficient of 0.840 with other indices, demonstrating higher representativeness than individual *** principal component analysis indicated a 12.88% increase in explained *** also exhibited significantly improved identification of roads, villages, and unused lands over RSEI, better matching ground conditions, and suitability for regional ecological assessment. (2) During 2000-2020, the average MRSEI in Yulin City was 0.481, peaking at 0.518 in 2018, indicating general ecological improvement over ***, conditions were better in the southeast than *** 38.81% of the area showed significant improvement, 10.15% exhibited significant deterioration, concentrated in western Dingbian and Jingbian counties, highlighting areas requiring enhanced protection. (3) Ecological conditions in Yulin City remained stable over *** High-high clusters were concentrated in eastern counties (Qingjian, Wubao, Jia, Fugu) and central lower-altitude areas near Yokoyama and *** Low-low clusters predominated in the northern Yuyang desert and high-altitude western Dingbian regions. (4) Enhanced vegetation cover had the greatest influence in improving Yulin’s ecological *** was the most impactful environmental driver, while precipitation and land use change interactions showed the strongest combined *** cont