With the growth of social networks, a wide range of methodologies has been developed to describe users' personalities based solely on their language and social media habits. Persona prediction is currently a popular research topic. It examines consumer behaviour and records users' ideas, emotions, and so on. A substantial amount of research has been carried out in this field over the years. This survey provides an overview of the numerous methods that have been used to predict personality and behaviour from social media content. The capacity to anticipate users' personality attributes can help create a variety of specialised goods or services. The concluding section outlines future directions.
Authors: Wang, Feiyu; Zhou, Jian-Tao
Affiliations: College of Computer Science, Inner Mongolia University, Inner Mongolia, Hohhot, China; Engineering Research Center of Ecological Big Data, Ministry of Education; Natl. Loc. Jt. Eng. Research Center of Intelligent Information Processing Technology for Mongolian; Inner Mongolia Engineering Laboratory for Cloud Computing and Service Software; Inner Mongolia Key Laboratory of Social Computing and Data Processing; Inner Mongolia Engineering Laboratory for Big Data Analysis Technology, China
Cloud storage services have been used by most businesses and individual users. However, data loss, service interruptions and cyber attacks often lead to cloud storage services not being provided properly, and these in...
Inductive link prediction (ILP) aims to predict links for unseen entities in emerging knowledge graphs (KGs), reflecting the evolving nature of KGs. A more challenging scenario arises when an emerging KG consists only of unseen entities, with no edge connecting it to the original KG; such graphs are called disconnected emerging KGs (DEKGs). Existing studies of DEKGs focus only on predicting enclosing links, i.e., links inside the emerging KG. Bridging links, which carry evolutionary information from the original KG to the DEKG, have not been investigated so far. To fill this gap, we propose a novel model, DEKG-ILP (Disconnected Emerging Knowledge Graph Oriented Inductive Link Prediction), which consists of the following two components. (1) The CLRM module (Contrastive Learning-based Relation-specific Feature Modeling) extracts global relation-based semantic features that are shared between original KGs and DEKGs, using a novel sampling strategy. (2) The GSM module (GNN-based Subgraph Modeling) extracts the local subgraph topological information around each link in the KGs. Extensive experiments on several benchmark datasets demonstrate that DEKG-ILP achieves clear performance improvements over state-of-the-art methods for both enclosing and bridging link prediction.
This work focuses on diverse machine learning methods for big data analytics, which works to leverage predictive performance. The techniques like data preprocessing, dimensionality reduction, feature selection, model ...
Data reduction that does not remove accurate, correct rows is a crucial pre-processing step. Large datasets make it difficult to model data effectively or forecast results accurately. They also demand lengthy processing times, sophisticated software and thorough data cleaning. Incorrect and irrelevant rows may produce inaccurate results that impair the performance of the model, so it is crucial to detect and remove inaccurate data properly. The proposed algorithm first calculates the Initial Recall value of the dataset. It eliminates the least correlated features using the correlation matrix. Using the Gaussian curve, it then identifies and eliminates, for every column, the rows whose values lie beyond μ ± 3σ. Next, it takes the column with the highest standard deviation and selects the nearest 50% of values to the left and the nearest 50% of values to the right of that column's mean. It keeps only those rows and calculates the Final Recall value. A negligible difference between the Initial and Final Recall values implies that the removed rows had little or no impact on the dataset's final result. The algorithm is applied to three standard medical datasets: Pima Diabetes, Heart Attack and Breast Cancer. The largest number of rows eliminated was 231, for the Breast Cancer dataset.
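The row-reduction steps described in this abstract can be sketched roughly as follows. This is an illustrative reading, not the authors' implementation: the function names, the `min_abs_corr` threshold, and the interpretation of "nearest 50% left and right" (keep the closer half of the rows on each side of the mean) are assumptions. The recall computation is omitted because the abstract does not name the classifier used.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either column is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def drop_weak_features(rows, features, target, min_abs_corr=0.1):
    """Keep features whose |correlation with the target| reaches the
    threshold. The threshold value is a hypothetical choice."""
    ys = [r[target] for r in rows]
    return [f for f in features
            if abs(pearson([r[f] for r in rows], ys)) >= min_abs_corr]


def drop_three_sigma_outliers(rows, features):
    """Remove rows with any feature value outside mean +/- 3 * std."""
    bounds = {}
    for f in features:
        col = [r[f] for r in rows]
        mu, sigma = mean(col), pstdev(col)
        bounds[f] = (mu - 3 * sigma, mu + 3 * sigma)
    return [r for r in rows
            if all(bounds[f][0] <= r[f] <= bounds[f][1] for f in features)]


def keep_nearest_half_each_side(rows, features):
    """On the highest-std column, keep the half of the rows on each side
    of the mean that lie closest to it (assumed interpretation)."""
    widest = max(features, key=lambda f: pstdev([r[f] for r in rows]))
    mu = mean([r[widest] for r in rows])
    left = sorted((r for r in rows if r[widest] < mu),
                  key=lambda r: mu - r[widest])
    right = sorted((r for r in rows if r[widest] >= mu),
                   key=lambda r: r[widest] - mu)
    return left[:(len(left) + 1) // 2] + right[:(len(right) + 1) // 2]


def reduce_rows(rows, features, target):
    """Run the three reduction steps in the order given in the abstract.
    Assumes at least one feature survives the correlation filter."""
    kept = drop_weak_features(rows, features, target)
    rows = drop_three_sigma_outliers(rows, kept)
    return keep_nearest_half_each_side(rows, kept), kept
```

In practice, a classifier's recall would be measured before and after `reduce_rows`, and a small gap between the two values would suggest that the discarded rows carried little information.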
Recently, Machine Learning (ML) has become a widely accepted and rapidly evolving method, since it employs computational methods to teach machines and produce acceptable answers. The signi...
ISBN: 9781939133458 (print)
Key-value (KV) separation is renowned for significantly mitigating the write amplification inherent in traditional LSM trees. However, KV separation can increase the performance overhead of managing the Value region, especially for the garbage collection (GC) operation that is used to reclaim redundantly occupied space. In response, many efforts have been made to optimize the GC mechanism for KV separation. However, our analysis indicates that such solutions, which are based on trade-offs between CPU and I/O overheads, cannot simultaneously satisfy the three requirements of KV separated systems in terms of throughput, tail latency, and space usage. This limitation hinders their real-world ***. In this paper, we introduce AegonKV, a "three-birds-one-stone" solution that comprehensively enhances the throughput, tail latency, and space usage of KV separated systems. AegonKV first proposes a SmartSSD-based GC offloading mechanism to enable asynchronous GC operations without competing with LSM read/write for bandwidth or CPU. AegonKV leverages offload-friendly data structures and hardware/software execution logic to address the challenges of GC offloading. Experiments demonstrate that AegonKV achieves the largest throughput improvement of 1.28-3.3 times, a significant reduction of 37%-66% in tail latency, and 15%-85% in space overhead compared to existing KV separated systems.
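To make the background concrete, the toy sketch below illustrates generic KV separation and why garbage collection of the value region is needed. It is not AegonKV and does not model SmartSSD offloading: the class name, the in-memory dictionary standing in for the LSM tree, and the list standing in for the value log are all assumptions made for illustration.

```python
class KVSeparatedStore:
    """Toy key-value separated store: a small index holds value locations,
    while the values themselves live in an append-only log."""

    def __init__(self):
        self.index = {}       # key -> offset in value_log (stand-in for the LSM tree)
        self.value_log = []   # append-only list of (key, value) records

    def put(self, key, value):
        # Append the value to the log; the index only stores its offset.
        # Any older record for the same key becomes garbage.
        self.value_log.append((key, value))
        self.index[key] = len(self.value_log) - 1

    def get(self, key):
        offset = self.index.get(key)
        return None if offset is None else self.value_log[offset][1]

    def delete(self, key):
        # Dropping the index entry leaves the logged value as garbage.
        self.index.pop(key, None)

    def garbage_ratio(self):
        """Fraction of log records no longer referenced by the index."""
        if not self.value_log:
            return 0.0
        return 1 - len(self.index) / len(self.value_log)

    def collect_garbage(self):
        """Rewrite live records into a fresh log and return how many
        stale records were reclaimed."""
        new_log = []
        for offset, (key, value) in enumerate(self.value_log):
            if self.index.get(key) == offset:   # record is still live
                self.index[key] = len(new_log)
                new_log.append((key, value))
        reclaimed = len(self.value_log) - len(new_log)
        self.value_log = new_log
        return reclaimed
```

In this sketch `collect_garbage` runs in the foreground and touches both the log and the index, which hints at the kind of CPU and I/O contention that the abstract says a GC offloading design is meant to avoid.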
Machine learning-based improvements in anomaly detection, visualization, and segmentation are made possible by the growing digitization of medical imaging, which reduces the workload for medical specialists. Neverthel...
The traditional domain adaptation task is insufficient for mining inter-domain image context. We propose a solution from the perspective of image semantic-level retrieval. Image semantic-level retrieval is based on t...
ISBN: 9798350374957 (digital)
ISBN: 9798350374964 (print)
Effective communication is crucial for success in various interactions, including personal and online interviews. The proposed work aims to improve communication effectiveness and deepen the understanding of interactions by adding real-time capture of emotion and body language. Facial and body landmarks are analyzed using Google's MediaPipe to accurately identify frequently expressed emotions and frequently adopted postures. The proposed system uses three classifiers, namely Random Forest, Gradient Boosting, and Ridge Classifier, all integrated in one pipeline, which distinguishes this work. In the testing phase, the classifiers are checked to avoid misidentifying emotions or postures in different situations. One of the major novelties of this work is its ability to serve users of different languages, which makes it relevant anywhere in the world. This dynamic and robust system provides emotion analysis with an accuracy of 92%, enhancing understanding in various communication contexts and setting a new standard for interview and interaction analysis.