ISBN (digital): 9783031402869
ISBN (print): 9783031402852; 9783031402869
As social media platforms become increasingly strict in censoring user posts, aggressive and insulting language has gradually shifted to implicit expressions that use homophones, metaphors, and other forms of camouflage. This not only intensifies the satirical attacks of negative posts but also significantly reduces the effectiveness of models designed to detect offensive speech. This paper aims to achieve highly accurate detection of implicit offensive speech. We conducted targeted investigations of speech characteristics on Weibo, one of the largest social media platforms in China. Based on the identified features of implicit offensive speech, including semantic, emotional, metaphorical, and fallacy characteristics, this paper constructs a BERT-based multi-task learning model named BMA (BERT-Mate-Ambiguity) to accurately detect implicit offensive speech in the real world. Additionally, this paper establishes a dataset of Weibo posts that contain implicit offensive speech and conducts comparative, robustness, and ablation experiments. The effectiveness of the model is demonstrated by comparing it to existing models that perform well in this field. Finally, this paper discusses some of its limitations and proposes future research work.
ISBN (print): 9789819947515; 9789819947522
Personalized document-level sentiment analysis (PDSA) is important in various fields. Although various deep learning models for PDSA have been proposed, they fail to consider the correlations of rating behaviors between different users. In the real world, users may give different rating scores to the same product, but their rating behaviors tend to be correlated over a range of products. However, mining user correlation is very challenging due to real-world data sparsity, and so far no model utilizes user correlation for PDSA. To address these issues, we propose an architecture named User Correlation Mining (UCM). Specifically, UCM contains two components, namely the Similar User Cluster Module (SUCM) and the Triple Attributes BERT Model (TABM). SUCM is responsible for user clustering. It consists of two modules, namely a Latent Factor Model based on Neural Network (LFM-NN) and Spectral Clustering based on Pearson Correlation Coefficient (SC-PCC). LFM-NN predicts the missing values of the sparse user-product rating matrix. SC-PCC clusters users with high correlations to obtain the user cluster IDs. TABM is designed to classify the users' sentiment based on user cluster IDs, user IDs, product IDs, and user reviews. To evaluate the performance of UCM, extensive experiments are conducted on three real-world datasets, i.e., IMDB, Yelp13, and Yelp14. The experimental results show that our proposed architecture UCM outperforms the baselines.
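The observation that users can give different scores yet behave in a correlated way can be illustrated with a minimal Python sketch. It only computes the Pearson correlation coefficient between rating vectors and lists highly correlated pairs; it is not the SC-PCC module itself, which the abstract describes as spectral clustering on top of such correlations. The user names, ratings, the `correlated_pairs` helper, and the 0.9 threshold are all hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length rating lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant rater carries no correlation signal
    return cov / sqrt(var_x * var_y)

# Hypothetical ratings of the same four products by three users.
ratings = {
    "strict_user": [3, 1, 2, 4],
    "lenient_user": [4, 2, 3, 5],    # always one point higher than strict_user
    "contrary_user": [2, 4, 3, 1],   # opposite taste
}

def correlated_pairs(ratings, threshold=0.9):
    """List user pairs whose correlation reaches the (assumed) threshold."""
    users = sorted(ratings)
    return [
        (a, b)
        for i, a in enumerate(users)
        for b in users[i + 1:]
        if pearson(ratings[a], ratings[b]) >= threshold
    ]

print(correlated_pairs(ratings))
```

In this toy data the strict and lenient users never give the same score, yet their ratings move together, so they would be natural candidates for the same user cluster.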
ISBN (print): 9783031434297; 9783031434303
Accurately recognizing road intersections from cycling trajectories is vital to the quality of the digital maps that cycling navigation apps use. However, existing approaches mainly identify road intersections based on motor vehicles' trajectories, and they fail to tackle the unique challenges posed by cycling trajectories: (i) cycling trajectories of minor intersections and their adjacent road segments are quite sparse; (ii) turning behaviors occur in different areas at intersections of various sizes. To address these challenges, we propose a precision-enhanced road intersection recognition method using cycling trajectories, called PICT. First, to enhance the representations of minor intersections, a grid topology representation module is designed to extract intersection topology. Then, an intersection inference module based on multi-scale feature learning is put forward to correctly identify intersections of different scales. Finally, extensive comparative experiments on two real-world datasets demonstrate that PICT significantly outperforms the state-of-the-art methods by 52.13% in the F1-score of intersection recognition.
ISBN (print): 9783031284687; 9783031284694
This paper presents the approach employed by the team RoboBreizh to win the championship in the 2022 RoboCup@Home Social Standard Platform League (SSPL). RoboBreizh decided to limit itself to an entirely embedded system with no connection to the internet or external devices. This article describes the design of the embedded solutions, including the manager, navigation, dialog, and perception components. We present results from the competition that demonstrate the value of our proposal.
ISBN (print): 9783031284687; 9783031284694
This paper presents the team AutonOHM and their solutions to the challenges of the RoboCup@Work league. The hardware section covers the robot setup of Ohmn3, which was developed using knowledge from previous robots used by the team. The software section discusses custom solution approaches for the @Work navigation, perception, and manipulation tasks, as well as a control architecture for autonomous task completion.
ISBN (digital): 9789819947522
ISBN (print): 9789819947515; 9789819947522
The manifold tangent space-based algorithm has emerged as a promising approach for processing and recognizing high-dimensional data. In this study, we propose a new algorithm based on the manifold tangent space, called the manifold tangent space-based 2D-DLPP algorithm. This algorithm embeds the covariance matrix into the tangent space of the SPD manifold and utilizes Log-Euclidean Metric Learning (LEM) to fully extract feature information, thus enhancing the discriminative ability of 2D-DLPP. Comparative experiments were conducted to evaluate the algorithm, and the results showed superior recognition ability compared to other existing algorithms. Experiments also demonstrate that the algorithm can retain the local nonlinear structure of the manifold and improve the class separability of samples.
ISBN (print): 9783031334979; 9783031334986
Improving fairness by manipulating the preprocessing stages of classification pipelines is an active area of research, closely related to AutoML. We propose a genetic optimisation algorithm, FairPipes, which optimises for user-defined combinations of fairness and accuracy and for multiple definitions of fairness, providing flexibility in the fairness-accuracy trade-off. FairPipes heuristically searches through a large space of pipeline configurations, achieving near-optimality efficiently, presenting the user with an estimate of the solutions' Pareto front. We also observe that the optimal pipelines differ for different datasets, suggesting that no "universal best" pipeline exists and confirming that FairPipes fills a niche in the fairness-aware AutoML space.
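To make the notion of a Pareto front over fairness and accuracy concrete, here is a minimal Python sketch. It only filters a fixed list of candidates down to the non-dominated ones; it does not reproduce the genetic search of FairPipes. The pipeline names, the scores, and the use of a single "unfairness" number (lower is better) are hypothetical.

```python
def pareto_front(candidates):
    """Return the candidates that no other candidate dominates.

    Each candidate is (name, accuracy, unfairness):
    higher accuracy is better, lower unfairness is better.
    """
    front = []
    for name, acc, unf in candidates:
        dominated = any(
            # at least as good on both objectives, strictly better on one
            (a >= acc and u <= unf) and (a > acc or u < unf)
            for _, a, u in candidates
        )
        if not dominated:
            front.append((name, acc, unf))
    return front

# Hypothetical evaluation results for four preprocessing pipelines.
pipelines = [
    ("p1", 0.90, 0.20),
    ("p2", 0.85, 0.10),
    ("p3", 0.80, 0.15),  # worse than p2 on both objectives
    ("p4", 0.70, 0.05),
]

for name, acc, unf in pareto_front(pipelines):
    print(name, acc, unf)
```

Every pipeline left on the front represents a different trade-off, which is why a user-defined combination of fairness and accuracy is needed to pick one.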
ISBN (print): 9783031434174; 9783031434181
Geometric aspects of knowledge graph embedding models directly impact their capability to preserve knowledge from the original graph in the vector space. For example, the capability to preserve structural patterns such as hierarchies, loops, and paths, which are present as relational structures in a knowledge graph, depends on the underlying geometry. In recent years, temporal information has gained considerable attention from researchers. While non-Euclidean geometry, e.g., hyperbolic geometry, has been shown to work well in static knowledge graph embedding models for such relational structures, this does not hold for temporal information in knowledge graphs. This is due to the different characteristics of temporal information: time can mostly be seen as a linear construct, and using a geometry that is not suited to it can adversely affect performance. To address this research gap, we provide a novel temporal knowledge graph embedding model that combines different geometries: the non-temporal part of the knowledge is mapped to a hyperbolic space and the temporal part is mapped to a Euclidean space. Our extensive evaluations on several benchmark datasets show a significant performance improvement in comparison to state-of-the-art models.
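The difference between the two geometries can be seen in a small Python sketch. It implements the standard distance in the Poincaré ball model of hyperbolic space and the ordinary Euclidean distance. The abstract does not state the model's scoring function, so the `combined_distance` helper below, which simply adds the two distances, is a hypothetical illustration and not the authors' formulation.

```python
import math

def _sq_norm(w):
    return sum(x * x for x in w)

def poincare_distance(u, v):
    """Distance between two points inside the unit Poincare ball."""
    diff = _sq_norm([a - b for a, b in zip(u, v)])
    denom = (1 - _sq_norm(u)) * (1 - _sq_norm(v))
    return math.acosh(1 + 2 * diff / denom)

def euclidean_distance(u, v):
    return math.sqrt(_sq_norm([a - b for a, b in zip(u, v)]))

def combined_distance(static_u, static_v, time_u, time_v):
    """Hypothetical mix: hyperbolic for the static part, Euclidean for time."""
    return poincare_distance(static_u, static_v) + euclidean_distance(time_u, time_v)

origin = [0.0, 0.0]
for r in (0.5, 0.9, 0.99):
    point = [r, 0.0]
    # Hyperbolic distance grows without bound near the ball's boundary,
    # while Euclidean distance grows linearly in r.
    print(r, round(poincare_distance(origin, point), 3),
          round(euclidean_distance(origin, point), 3))
```

The rapid growth of hyperbolic distance toward the boundary is what makes it a good fit for tree-like hierarchies, whereas the linear growth of Euclidean distance matches the linear nature of time described in the abstract.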
ISBN (print): 9783031407246; 9783031407253
The use of renewable energies to fight climate change is increasing. One of the most popular sources is solar energy, which can be harvested in two forms: thermal and electrical. The case study used in this research is an installation located at the University of A Coruña, in Ferrol: a photovoltaic array with five rows of 12 solar panels each, with a total peak power of 12.9 kW. The installation is correctly oriented to the south, with an inclination of 35° to achieve the theoretical performance of 99.82%. The model created in this research predicts the accumulated daily energy produced by the installation based on the solar hours predicted by the meteorological service. The other inputs of the model are the real solar hours and the energy produced on the day before the prediction. A hybrid model is created by dividing the dataset into groups with a clustering technique. Then, a regression algorithm is trained on each cluster to increase the global prediction performance. K-Means is used to create the clusters, and Artificial Neural Networks, Support Vector Machines for Regression, and Polynomial Regression are used to create the local models for the clusters.
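The cluster-then-regress idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the model from the paper: it uses a single input (predicted solar hours) instead of three, a tiny one-dimensional K-Means with deterministic initial centroids, and ordinary least-squares lines in place of the neural network, SVR, and polynomial local models. The data are synthetic and all function names are hypothetical.

```python
def nearest(value, centroids):
    """Index of the centroid closest to value."""
    return min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))

def kmeans_1d(values, k=2, iters=20):
    """Tiny 1-D K-Means (k >= 2); initial centroids are spread over the range."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            groups[nearest(v, centroids)].append(v)
        # keep the old centroid if a cluster happens to be empty
        centroids = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
    return centroids

def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx else 0.0
    return a, mean_y - a * mean_x

def fit_hybrid(xs, ys, k=2):
    """Cluster the inputs, then fit one local model per cluster."""
    centroids = kmeans_1d(xs, k)
    models = []
    for i in range(k):
        pts = [(x, y) for x, y in zip(xs, ys) if nearest(x, centroids) == i]
        models.append(fit_line(*zip(*pts)) if pts else (0.0, 0.0))
    return centroids, models

def predict(x, centroids, models):
    """Route the input to its cluster and apply that cluster's local model."""
    a, b = models[nearest(x, centroids)]
    return a * x + b

# Synthetic data: predicted solar hours -> daily energy (arbitrary units).
hours = [1, 2, 3, 9, 10, 11]
energy = [2, 4, 6, 55, 60, 65]   # two regimes with different slopes
centroids, models = fit_hybrid(hours, energy)
print(predict(2.5, centroids, models), predict(10.5, centroids, models))
```

A single global line would fit neither regime well; splitting the data first lets each local model stay simple, which is the motivation the abstract gives for the hybrid design.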
ISBN (print): 9783031401763; 9783031401770
Finding information online is hard, even more so once you get into the domain of argumentation. There have been developments around specialized argumentation machines that incorporate structural features of arguments, but all current approaches share one pitfall: they operate on corpora of limited size. Consequently, it may happen that a user searches for a rather general term like "cost increases", but the machine is only able to serve arguments concerned with "rent increases". We aim to bridge this gap by introducing approaches to generalize or specialize a found argument using a combination of WordNet and Large Language Models. The techniques are evaluated on a new benchmark dataset with diverse queries using our fully featured implementation. Both the dataset and the code are publicly available on GitHub.