Machine learning (ML) technology is advancing rapidly, but the existing development process lacks standardization, and the quality of machine learning system development is difficult to guarantee. Requirement mode...
The Federated Averaging algorithm (FedAvg) is the preferred algorithm for federated learning (FL) because of its simplicity and low communication cost. However, if all clients' local data aren't independent and equa...
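The aggregation step that gives FedAvg its simplicity can be sketched as follows. This is a minimal illustration of the standard sample-count-weighted average of client parameters, not the method of the paper summarized above; the function name `fedavg` and the flat-list parameter representation are assumptions made for the example.

```python
from typing import List


def fedavg(client_weights: List[List[float]], client_sizes: List[int]) -> List[float]:
    """Average client parameter vectors, weighted by each client's sample count.

    `client_weights[k]` is the (flattened) parameter vector of client k and
    `client_sizes[k]` is the number of local training samples on that client.
    Both names are hypothetical and chosen only for this sketch.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(client_weights[0])
    aggregated = [0.0] * dim
    for weights, n_samples in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same parameter shape")
        # Each client contributes in proportion to its share of the data.
        for i, w in enumerate(weights):
            aggregated[i] += w * n_samples / total
    return aggregated


# Two clients: the second holds three times as much data as the first.
global_weights = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Because the weights depend only on sample counts, a client with a large but skewed local dataset pulls the global model toward its own distribution, which is one way to see why non-IID data is a concern for this scheme.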
Rumors spread rapidly on social media and have a detrimental effect on our lives, so detecting them is becoming increasingly important. Studying dynamic graphs has been shown to help capture temporal changes in information transmission and to reveal the evolution trends and pattern changes of events. However, current dynamic learning methods do not fully consider the interaction characteristics of the evolutionary process, which makes it difficult to capture the structural and semantic differences between them. To fully exploit the potential correlations in such temporal information, we propose a novel method for rumor detection named dynamic evolution characteristics learning (DECL). First, we partition the temporal snapshot sequences based on the propagation structure of rumors. Second, we adopt a multi-task graph contrastive learning method that enables the graph encoder to capture the essential features of rumors and to fully explore the temporal structural differences and semantic similarities between true rumor and false rumor events. Experimental results on three real-world social media datasets confirm the effectiveness of our model for rumor detection tasks.
ISBN: (print) 9798400705946
Major app stores have introduced privacy labels (e.g., Google Play's data safety section since July 2022), requiring app developers to provide their privacy disclosures, including the data types collected and shared by their apps and by the third-party SDKs they use. Third-party SDK providers have published guidance pages telling app developers which data types their SDKs use and must therefore be declared in the data safety section. The availability and correctness of these guidance pages are critical issues but have yet to receive any attention. This paper presents the first study of the guidance pages. We attempted to collect the guidance pages of 175 commercial SDKs widely used in Android apps and could not obtain them for 63% of the SDKs, suggesting that the majority have not provided guidance pages. Further, we develop a system that detects inconsistencies between the guidance pages and the actual data collection of SDKs. It uses machine learning and dynamic taint analysis to extract privacy practices from the guidance pages and the SDKs, and it analyzes the outcomes to detect the critical gap. We construct datasets of 47 guidance pages and 43 SDKs' 159 sample apps and evaluate the system. The system uncovered discrepancies related to location and identifiers in the guidance pages of eight SDKs. We also evaluate the machine learning model's accuracy on unknown guidance page contents. The results show that the model performs satisfactorily for updated guidance pages, and that its accuracy for newly posted ones increases as the model learns more. This study exposes the critical issues of the guidance pages and contributes tools and datasets to facilitate further research on guidance pages and privacy labels.
Financial fraud is a widespread problem that can cause significant economic losses. Traditional fraud detection methods often rely on manual audits and rules-based systems, which can be time-consuming and error-prone....
This study models the relationship between operating conditions and product yields in the hydrocracking process and develops a control framework for it. The data were collected from a hydrocracking unit in a Chinese refinery. Principal component analysis was used to reduce the number of input variables. Support vector machine, Gaussian process regression (GPR), and decision tree regression models were then developed to establish this relationship. The best model is GPR, whose Pearson correlation coefficient between the predicted and actual values is greater than 0.97 for all product yields. Shapley additive explanations were used to interpret the results of the GPR models. A control framework for the hydrocracking unit was then proposed based on these results. The results show that machine learning is a valuable tool for predicting the yield of hydrocracking products, and that the proposed control framework helps optimize hydrocracking product yields.
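The evaluation metric named above, the Pearson correlation coefficient between predicted and actual yields, can be computed as in the following sketch. The function name `pearson_r` and the sample values are illustrative assumptions; they are not data from the study.

```python
import math
from typing import Sequence


def pearson_r(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    if n != len(actual) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")

    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n

    # Covariance and variances (the common 1/n factor cancels in the ratio).
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    if var_p == 0 or var_a == 0:
        raise ValueError("correlation is undefined for a constant sequence")

    return cov / math.sqrt(var_p * var_a)


# Made-up yields (not from the paper) used only to exercise the function.
r = pearson_r([10.1, 20.3, 29.8, 40.2], [10.0, 20.0, 30.0, 40.0])
meets_threshold = r > 0.97  # the threshold reported in the abstract
```

A value near 1 indicates that predictions track the measured yields linearly; it does not by itself rule out a constant offset or scale error, so it is usually read alongside an error metric.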
ISBN: (print) 9781665467544
Machine learning is a type of artificial intelligence in which computers solve problems by learning from examples of real-world data. It encompasses several types of tasks, such as supervised, unsupervised, and reinforcement learning, and many hyperparameters must be tuned to achieve high accuracy, especially in image classification. The batch size, one of the most important hyperparameters, is the number of images processed in a single forward and backward pass. In this paper, we study supervised image classification while varying the batch size and the number of epochs. We characterize the effect of increasing the batch size on training time and how this relationship varies with the model being trained; the variation across models is extremely large. According to our results, a larger batch size does not always yield higher accuracy.
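The link between batch size and the number of parameter updates per epoch can be made concrete with a short sketch. The function name `iterations_per_epoch`, the `drop_last` flag, and the dataset size of 50,000 images are assumptions chosen for illustration, not details taken from the paper.

```python
def iterations_per_epoch(num_samples: int, batch_size: int, drop_last: bool = False) -> int:
    """Number of forward/backward passes needed to see the dataset once.

    If `drop_last` is True, a final incomplete batch is skipped; otherwise it
    counts as one extra (smaller) batch.
    """
    if num_samples < 0 or batch_size <= 0:
        raise ValueError("num_samples must be >= 0 and batch_size must be > 0")
    if drop_last:
        return num_samples // batch_size
    # Integer ceiling division avoids floating-point rounding.
    return (num_samples + batch_size - 1) // batch_size


# Hypothetical dataset of 50,000 images at several batch sizes.
for bs in (32, 128, 512):
    print(bs, iterations_per_epoch(50_000, bs))
```

Larger batches mean fewer updates per epoch, which tends to shorten wall-clock time per epoch on parallel hardware but also changes the optimization dynamics, consistent with the observation that a larger batch size does not always improve accuracy.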
Optical Character Recognition (OCR) has become quite well known in the last few years, because it has applications in many sectors. In this paper, we look at the basics of OCR and discuss a few popular datasets that c...
Email is the most popular form of official communication for business purposes. Despite the existence of other communication methods, email remains the most widely used. Today's environment necessitates automated em...
Big data is a new engine for advancing college education reform and development. Building a college education big data application is the premise and core of fully utilizing the value of educatio...