ISBN (digital): 9783031434273
ISBN (print): 9783031434266; 9783031434273
Sequential recommendation (SR) aims to predict appropriate items a user will click according to the user's historical behavior sequence. Conventional SR models are trained under the next-item prediction task and thus must deal with two challenges: the data sparsity of user feedback, and the variability and irregularity of user behaviors. Unlike natural language sequences in NLP, user behavior sequences in recommendation are much more personalized, irregular, and unordered. Therefore, the current user preferences extracted from user historical behaviors may also correlate with the next-k (i.e., future clicked) items besides the classical next-1 (i.e., currently clicked) item to be predicted. Inspired by this phenomenon, we propose a novel Future Augmentation with Self-distillation in Recommendation (FASRec) framework. It treats future clicked items as augmented positive signals for the current click during training, which addresses both the data sparsity issue and the behavior irregularity and variability issue. To denoise these augmented future clicks, we further adopt a self-distillation module with an exponential moving average strategy, using the soft labels of self-distillation as confidence scores for more accurate augmentation. In experiments, FASRec achieves significant and consistent improvements in both offline and online evaluations with different base SR models, confirming its effectiveness and universality. FASRec has been deployed on a widely used recommendation feed in Tencent. The source code is available at https://***/FASRec/FASRec.
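The abstract names the exponential-moving-average self-distillation mechanism without implementation detail. As a minimal sketch of that idea only — the toy linear scorer, all variable names, shapes, and update constants below are our own assumptions, not FASRec's actual design — the teacher-as-EMA-of-student pattern with soft labels used as augmentation confidence might look like:

```python
import numpy as np

def ema_update(teacher, student, momentum=0.99):
    """Exponential moving average: the teacher slowly tracks the student."""
    return momentum * teacher + (1.0 - momentum) * student

def softmax(logits):
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

# Toy setup: a 4-item catalogue scored by a single weight vector.
rng = np.random.default_rng(0)
student = rng.normal(size=(4,))
teacher = student.copy()

for step in range(5):
    # Stand-in for a gradient step on the student (details are not in the abstract).
    student = student - 0.1 * rng.normal(size=(4,))
    teacher = ema_update(teacher, student)

# The teacher's soft labels act as confidence for augmented future clicks.
confidence = softmax(teacher)          # one probability per candidate item
future_clicks = [1, 3]                 # hypothetical next-k (future clicked) items
weights = {i: confidence[i] for i in future_clicks}
```

The design intuition is that the slow-moving teacher is less affected by noisy individual updates, so its soft labels give a steadier confidence estimate for weighting the augmented positives.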
ISBN (print): 9783031333767; 9783031333774
Counterfactual explanation is a form of interpretable machine learning that generates perturbations of a sample to achieve a desired outcome. The generated samples can act as instructions that guide end users on how to obtain the desired results by altering the samples. Although state-of-the-art counterfactual explanation methods use variational autoencoders (VAEs) to achieve promising improvements, they suffer from two major limitations: 1) counterfactual generation is prohibitively slow, which prevents the algorithms from being deployed in interactive environments; 2) the counterfactual explanation algorithms produce unstable results due to the randomness in the sampling procedure of the variational autoencoder. In this work, to address the above limitations, we design a robust and efficient counterfactual explanation framework, namely CeFlow, which utilizes normalizing flows for mixed-type continuous and categorical features. Numerical experiments demonstrate that our technique compares favorably to state-of-the-art methods. We release our source code (https://***/tridungduong16/***) for reproducing the results.
ISBN (print): 9783031333736; 9783031333743
In this paper, we propose a dictionary screening method for embedding compression in text classification. The key point is to evaluate the importance of each keyword in the dictionary. To this end, we first train a pre-specified recurrent neural network-based model using the full dictionary. This yields a benchmark model, which we use to obtain the predicted class probabilities for each sample in a dataset. Next, to evaluate how each keyword affects the predicted class probabilities, we develop a novel method for assessing the importance of each keyword in the dictionary. Consequently, each keyword can be screened, and only the most important keywords are retained. With these screened keywords, a new dictionary with a considerably reduced size can be constructed, and the original text sequences can be substantially compressed. The proposed method leads to significant reductions in the number of parameters, the average text sequence length, and the dictionary size, while the predictive power remains highly competitive with the benchmark model. Extensive numerical studies are presented to demonstrate the empirical performance of the proposed method.
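The abstract does not specify the importance measure. One hedged toy reading — scoring each keyword by the shift it induces in a benchmark model's predicted class probabilities when that keyword is masked out — can be sketched as follows; the bag-of-words "model", weight matrix, and all names here are illustrative stand-ins, not the paper's RNN-based benchmark:

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def predict(counts, W):
    """Toy benchmark model: class probabilities from bag-of-words counts."""
    return softmax(counts @ W)

def keyword_importance(docs, W, vocab_size):
    """Mean absolute probability shift when one keyword is zeroed out
    (a simple stand-in for the paper's importance measure)."""
    scores = np.zeros(vocab_size)
    for counts in docs:
        base = predict(counts, W)
        for k in range(vocab_size):
            masked = counts.copy()
            masked[k] = 0
            scores[k] += np.abs(predict(masked, W) - base).sum()
    return scores / len(docs)

rng = np.random.default_rng(1)
V, C = 6, 3                                   # vocabulary size, number of classes
W = rng.normal(size=(V, C))
docs = [rng.integers(0, 4, size=V).astype(float) for _ in range(8)]
scores = keyword_importance(docs, W, V)
keep = np.argsort(scores)[::-1][:3]           # reserve only the top-3 keywords
```

Screening then amounts to rebuilding the dictionary from `keep` and re-encoding texts against it, which is what shrinks both the embedding table and the sequences.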
ISBN (print): 9783031234910; 9783031234927
Cryptocurrencies have drawn the interest of both scholars and professionals due to their decentralised, unique payment system supported by blockchain technology and their autonomy from sovereign governments, centralised organisations, and banking systems. Numerous works have studied, on the one hand, the behaviour of cryptocurrencies and, on the other, multifractal models in financial markets. Nevertheless, existing models have limitations, and the literature calls for more research into multifractal analysis techniques applied to finance, as previous research has relied largely on regression models and machine learning methods. This study introduces a new model for predicting unexpected speculative attacks in the cryptocurrency market, applying the method of Multiscale Multifractal Detrended Fluctuation Analysis, which shows excellent precision results. Our approach has high impact potential for forecasting possible speculative actions against the value of cryptocurrencies and for mitigating the risks derived from the control of cryptocurrencies by private entities, so the question of its possible effect on the financial system is of great importance.
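The Multiscale Multifractal Detrended Fluctuation Analysis applied in the study builds on the standard (multifractal) DFA fluctuation function F_q(s). A minimal NumPy sketch of that core quantity follows; the synthetic white-noise series and the parameter choices are purely illustrative, not the paper's data or setup:

```python
import numpy as np

def fluctuation(series, scale, q):
    """F_q(s) for one scale: the core quantity of (multifractal) DFA,
    on which multiscale multifractal analysis builds."""
    profile = np.cumsum(series - series.mean())
    n_seg = len(profile) // scale
    t = np.arange(scale)
    f2 = []
    for v in range(n_seg):
        seg = profile[v * scale:(v + 1) * scale]
        coef = np.polyfit(t, seg, 1)          # linear detrending per segment
        f2.append(np.mean((seg - np.polyval(coef, t)) ** 2))
    f2 = np.asarray(f2)
    if q == 0:                                # logarithmic averaging for q = 0
        return np.exp(0.5 * np.mean(np.log(f2)))
    return np.mean(f2 ** (q / 2.0)) ** (1.0 / q)

rng = np.random.default_rng(2)
returns = rng.normal(size=1024)               # stand-in for crypto log-returns
F = {s: fluctuation(returns, s, q=2) for s in (16, 32, 64, 128)}
```

The scaling of F_q(s) with s (and its dependence on q) is what reveals multifractality; the multiscale variant additionally tracks how these exponents vary across scale ranges.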
ISBN (print): 9789819947515; 9789819947522
The goal of dialogue topic shift detection is to identify whether the current topic in a conversation has changed or needs to change. Previous work focused on detecting topic shifts by using pre-trained models to encode the utterances, failing to delve into the various levels of topic granularity in the dialogue or to understand the dialogue contents. To address these issues, we take a prompt-based approach to fully extract topic information from dialogues at multiple granularities, i.e., the label, turn, and topic levels. Experimental results on our annotated Chinese Natural Topic Dialogue dataset CNTD and the publicly available English TIAGE dataset show that the proposed model outperforms the baselines. Further experiments show that the information extracted at different levels of granularity effectively helps the model comprehend the conversation topics.
ISBN (print): 9783031361890; 9783031361906
Legal text differs significantly from the English text (e.g., Wikipedia, news) used for training most natural language processing (NLP) algorithms. As a result, state-of-the-art algorithms (e.g., GPT-3, BERT derivatives) need additional effort (e.g., fine-tuning and further pre-training) to achieve optimal performance on legal text. Hence there is a need to create separate NLP datasets and benchmarks for legal text that are challenging and focus on tasks specific to legal systems. This will spur innovation in applications of NLP to legal text and will benefit the AI community and the legal fraternity. This paper focuses on an empirical review of the existing work on the use of NLP for Indian legal text and proposes ideas for creating new benchmarks for Indian legal NLP.
ISBN (digital): 9783031409608
ISBN (print): 9783031409592; 9783031409608
Understanding texts written in natural language is a challenging task. Semantic Web technologies, in particular ontologies, can be used to represent knowledge from a specific domain and to reason like a human. Ontology population from texts aims to transform textual contents into ontological assertions. This paper presents an approach to automatic ontology population from French textual descriptions. The approach is designed to be domain-independent, as long as a domain ontology is provided. It relies on text-based and knowledge-based analyses, which are fully explained. Experiments performed on French classified advertisements are discussed and provide encouraging results.
ISBN (print): 9783031434143; 9783031434150
Generalized Few-Shot Learning (GFSL) applies the model trained with the base classes to predict samples from both base classes and novel classes, where each novel class is provided with only a few labeled samples during testing. Limited by the severe data imbalance between base and novel classes, GFSL easily suffers from the prediction-shift issue, in which most test samples tend to be classified into the base classes. Unlike the existing works that address this issue by either multi-stage training or complicated model design, we argue that extracting both discriminative and generalized feature representations is all GFSL needs, which can be achieved by simply scattering the intra-class distribution during training. Specifically, we introduce two self-supervised auxiliary tasks and a label permutation task to encourage the model to learn more image-level feature representations and to push the decision boundary from novel towards base classes during inference. Our method is one-stage and can perform online inference. Experiments on the miniImageNet and tieredImageNet datasets show that the proposed method achieves comparable performance with the state-of-the-art multi-stage competitors under both traditional FSL and GFSL tasks, empirically proving that feature representation is the key to GFSL.
ISBN (print): 9783031333798; 9783031333804
Although cross-domain recommender systems (CDRSs) are promising approaches to solving the cold-start problem, most CDRSs require overlapped users, which significantly limits their applications. To remove the overlap limitation, researchers have introduced domain adversarial learning and embedding attribution alignment to develop non-overlapped CDRSs. Existing non-overlapped CDRSs, however, have several drawbacks. They ignore the semantic relations between source and target items, leading to noisy knowledge transfer. Moreover, they learn knowledge from both domain-shared and domain-specific preferences and are hence easily misled by the source-domain-specific preferences. To overcome these drawbacks, we propose a novel semantic relation-based knowledge transfer framework (SRTrans). We semantically cluster the source and the target items and calculate their similarities to extract relational knowledge between domains. To transfer the relational knowledge, we develop a new two-tier graph transfer network. Lastly, we introduce task-oriented knowledge distillation supervision and combine it with a prediction loss to alleviate the negative impact of the source-domain-specific preferences. Our experimental results on real-world datasets demonstrate that SRTrans significantly outperforms state-of-the-art models.
ISBN (print): 9783031490071; 9783031490088
Learning to forecast spatiotemporal (ST) environmental processes from a sparse set of samples collected autonomously is a difficult task from both a sampling perspective (collecting the best sparse samples) and a learning perspective (predicting the next timestep). Recent work in spatiotemporal process learning focuses on using deep learning to forecast from dense samples, while collecting the best set of sparse samples remains understudied within robotics. An example is robotic sampling for information gathering, such as using UAVs/UGVs for weather monitoring. In this work, we propose a methodology that leverages Recurrent Neural Processes to learn spatiotemporal environmental dynamics and forecast from selective samples gathered by a team of robots using a mixture-of-Gaussian-Processes model in an online learning fashion. We thus combine two learning paradigms: an active learning approach to adaptively gather informative samples and a supervised learning approach to capture and predict complex spatiotemporal environmental phenomena.
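The abstract does not detail the sampling pipeline. As an illustrative sketch of the active-learning half only — using a single plain Gaussian Process rather than the paper's mixture model, on a hypothetical 1-D field with made-up parameters — variance-driven sample selection could look like:

```python
import numpy as np

def rbf(a, b, ls=0.3):
    """Squared-exponential kernel on 1-D inputs."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-4):
    """Standard GP regression posterior mean and variance."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_query, x_train)
    Kss = rbf(x_query, x_query)
    Kinv = np.linalg.inv(K)
    mean = Ks @ Kinv @ y_train
    var = np.diag(Kss - Ks @ Kinv @ Ks.T)
    return mean, var

# Active-learning loop: sample next wherever predictive variance is largest.
f = lambda x: np.sin(4 * x)                   # hypothetical environmental field
x_obs = np.array([0.1, 0.9])
y_obs = f(x_obs)
grid = np.linspace(0.0, 1.0, 101)
for _ in range(5):
    _, var = gp_posterior(x_obs, y_obs, grid)
    x_next = grid[np.argmax(var)]             # most informative location
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, f(x_next))
mean, var = gp_posterior(x_obs, y_obs, grid)
```

Each added sample collapses the posterior variance around its location, so successive queries spread out to the least-explored regions, which is the intuition behind informative sample gathering for the downstream forecaster.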