To deal with the challenge of information overload, in this paper, we propose a financial news recommendation algorithm which help users find the articles that are interesting to read. To settle the ambiguity problem,...
详细信息
To deal with the challenge of information overload, in this paper, we propose a financial news recommendation algorithm which help users find the articles that are interesting to read. To settle the ambiguity problem, a new presented OF-IDF method is employed to represent the unstructured text data in the form of key concepts, synonyms and synsets which are all stored in the domain ontology. For users, the recommendation algorithm build the profiles based on their behaviors to detect the genuine interests and predict current interests automatically and in real time by applying the thinking of relevance feedback. Finally, the experiment conducted on a financial news dataset demonstrates that the proposed algorithm significantly outperforms the performance of a traditional recommender.
Adding to societal changes today, are the miscellaneous bigdata produced in different fields. Coupled with these data is the appearance of risk management. Admittedly, to predict future trend by using these data is c...
详细信息
Adding to societal changes today, are the miscellaneous bigdata produced in different fields. Coupled with these data is the appearance of risk management. Admittedly, to predict future trend by using these data is conducive to make everything more efficient and easy. Now, no matter companies or individuals, they increasingly focus on identifying risks and managing them before risks. Effective risk management will lead them to deal with potential problems. This thesis focuses on risk management of flight delay area using big real time data. It proposes two different prediction models, one is called General Long Term Departure Prediction Model and the other is named as Improved Real Time Arrival Prediction Model. By studying the main factors lead to flight delay, this thesis takes weather, carrier, National Aviation System, security and previous late aircraft as analysis factors. By utilizing our models can do not only long time but also short term flight delay predictions. The results demonstrate goodness of fit. Besides the theory part, it also presents a practical and beautiful web application for real time flight arrival prediction based on our second model.
High blood pressure is a highly heritable and modifiable risk factor for cardiovascular disease. We report the largest genetic association study of blood pressure traits (systolic, diastolic and pulse pressure) to dat...
详细信息
High blood pressure is a highly heritable and modifiable risk factor for cardiovascular disease. We report the largest genetic association study of blood pressure traits (systolic, diastolic and pulse pressure) to date in over 1 million people of European ancestry. We identify 535 novel blood pressure loci that not only offer new biological insights into blood pressure regulation but also highlight shared genetic architecture between blood pressure and lifestyle exposures. Our findings identify new biological pathways for blood pressure regulation with potential for improved cardiovascular disease prevention in the future.
Knowledge Graphs (KGs) often suffer from incompleteness and this issue motivates the task of Knowledge Graph Completion (KGC). Traditional KGC models mainly concentrate on static KGs with a fixed set of entities and r...
详细信息
Knowledge Graphs (KGs) often suffer from incompleteness and this issue motivates the task of Knowledge Graph Completion (KGC). Traditional KGC models mainly concentrate on static KGs with a fixed set of entities and relations, or dynamic KGs with temporal characteristics, faltering in their generalization to constantly evolving KGs with possible irregular entity drift. Thus, in this paper, we propose a novel link prediction model based on the embedding representation to handle the incompleteness of KGs with entity drift, termed as DCEL. Unlike traditional link prediction, DCEL could generate precise embeddings for drifted entity without imposing any regular temporal characteristic. The drifted entity is added into the KG with its links to the existing entity predicted in an incremental fashion with no requirement to retrain the whole KG for computational efficiency. In terms of DCEL model, it fully takes advantages of unstructured textual description, and is composed of four modules, namely MRC (Machine Reading Comprehension), RCAA (Relation Constraint Attentive Aggregator), RSA (Relation Specific Alignment) and RCEO (Relation Constraint Embedding Optimization). Specifically, the MRC module is first employed to extract short texts from long and redundant descriptions. Then, RCAA is used to aggregate the embeddings of textual description of drifted entity and the pre-trained word embeddings learned from corpus to a single text-based entity embedding while shielding the impact of noise and irrelevant information. After that, RSA is applied to align the text-based entity embedding to graph-based space to obtain the corresponding graph-based entity embedding, and then the learned embeddings are fed into the gate structure to be optimized based on the RCEO to improve the accuracy of representation learning. Finally, the graph-based model TransE is used to perform link prediction for drifted entity. Extensive experiments conducted on benchmark datasets in terms of evaluat
Distributed Collaborative Machine Learning (DCML) has emerged in artificial intelligence-empowered edge computing environments, such as the Industrial Internet of Things (IIoT), to process tremendous data generated by...
详细信息
Distributed Collaborative Machine Learning (DCML) has emerged in artificial intelligence-empowered edge computing environments, such as the Industrial Internet of Things (IIoT), to process tremendous data generated by smart devices. However, parallel DCML frameworks require resource-constrained devices to update the entire Deep Neural Network (DNN) models and are vulnerable to reconstruction attacks. Concurrently, the serial DCML frameworks suffer from training efficiency problems due to their serial training nature. In this paper, we propose a Model Pruning-enabled Federated Split Learning framework (MP-FSL) to reduce resource consumption with a secure and efficient training scheme. Specifically, MP-FSL compresses DNN models by adaptive channel pruning and splits each compressed model into two parts that are assigned to the client and the server. Meanwhile, MP-FSL adopts a novel aggregation algorithm to aggregate the pruned heterogeneous models. We implement MP-FSL with a real FL platform to evaluate its performance. The experimental results show that MP-FSL outperforms the state-of-the-art frameworks in model accuracy by up to 1.35%, while concurrently reducing storage and computational resource consumption by up to 32.2% and 26.73%, respectively. These results demonstrate that MP-FSL is a comprehensive solution to the challenges faced by DCML, with superior performance in both reduced resource consumption and enhanced model performance.
The Anchor-based Multi-view Subspace Clustering (AMSC) has turned into a favourable tool for large-scale multi-view clustering. However, there still exist some limitations to the current AMSC approaches. First, they t...
详细信息
The Anchor-based Multi-view Subspace Clustering (AMSC) has turned into a favourable tool for large-scale multi-view clustering. However, there still exist some limitations to the current AMSC approaches. First, they typically recover anchor graph structure in the original linear space, restricting their feasibility for nonlinear scenarios. Second, they usually overlook the potential benefits of jointly capturing the inter-view and intra-view information for enhancing the anchor representation learning. Third, these approaches mostly perform anchor-based subspace learning by a specific matrix norm, neglecting the latent high-order correlation across different views. To overcome these limitations, this paper presents an efficient and effective approach termed Large-scale Tensorized Multi-view Kernel Subspace Clustering (LTKMSC). Different from the existing AMSC approaches, our LTKMSC approach exploits both inter-view and intra-view awareness for anchor-based representation building. Concretely, the low-rank tensor learning is leveraged to capture the high-order correlation (i.e., the inter-view complementary information) among distinct views, upon which the \(l_{1,2}\) norm is imposed to explore the intra-view anchor graph structure in each view. Moreover, the kernel learning technique is leveraged to explore the nonlinear anchor-sample relationships embedded in multiple views. With the unified objective function formulated, an efficient optimization algorithm that enjoys low computational complexity is further designed. Extensive experiments on a variety of multi-view datasets have confirmed the efficiency and effectiveness of our approach when compared with the other competitive approaches.
暂无评论