A key challenge in personalized product search is capturing users' preferences. Recent work has modeled sequences of historical user behavior, i.e., product purchase histories, to build user profiles and personalize results accordingly. Although these approaches have demonstrated promising retrieval performance, most of them focus solely on the intra-sequence interactions between items. Because the amount of historical behavior data is usually small, the user profiles learned by these approaches can be very sensitive to the noise it contains. To tackle this problem, we propose incorporating out-of-sequence external information to enhance user modeling. More specifically, we inject external item-item relations (e.g., belonging to the same brand) and query-query relations (e.g., the semantic similarities between them) into the intra-sequence interaction to learn better user profiles. In addition, we devise two auxiliary decoders, one for a historical item sequence reconstruction task and one for a global item similarity prediction task, to further improve the reliability of user modeling. Experimental results on two datasets, built from simulated and real user search logs respectively, show that the proposed personalized product search method outperforms existing approaches.
Knowledge Graphs (KGs) often suffer from incompleteness, which motivates the task of Knowledge Graph Completion (KGC). Traditional KGC models mainly concentrate on static KGs with a fixed set of entities and relations, or on dynamic KGs with temporal characteristics, and they generalize poorly to constantly evolving KGs with possibly irregular entity drift. In this paper, we therefore propose DCEL, a novel embedding-based link prediction model that handles the incompleteness of KGs with entity drift. Unlike traditional link prediction, DCEL can generate precise embeddings for a drifted entity without imposing any regular temporal characteristic. The drifted entity is added to the KG, and its links to existing entities are predicted incrementally, so the whole KG does not need to be retrained, which improves computational efficiency. DCEL takes full advantage of unstructured textual descriptions and is composed of four modules: MRC (Machine Reading Comprehension), RCAA (Relation Constraint Attentive Aggregator), RSA (Relation Specific Alignment) and RCEO (Relation Constraint Embedding Optimization). Specifically, the MRC module is first employed to extract short texts from long and redundant descriptions. Then, RCAA aggregates the embeddings of the drifted entity's textual description and the pre-trained word embeddings learned from a corpus into a single text-based entity embedding, while shielding it from noise and irrelevant information. After that, RSA aligns the text-based entity embedding to the graph-based space to obtain the corresponding graph-based entity embedding, and the learned embeddings are fed into a gate structure and optimized by RCEO to improve the accuracy of representation learning. Finally, the graph-based model TransE is used to perform link prediction for the drifted entity. Extensive experiments conducted on benchmark datasets in terms of evaluat
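The final step named in the abstract above, link prediction with TransE, can be illustrated with a minimal sketch. It assumes the standard TransE scoring rule, in which a triple (head, relation, tail) is plausible when the distance ||h + r - t|| is small. The function names and the toy 2-dimensional embeddings are hypothetical and chosen only for illustration; they are not taken from DCEL.

```python
import math


def transe_distance(head, relation, tail):
    """Standard TransE distance ||h + r - t||_2; lower means more plausible."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def rank_tails(head, relation, candidates):
    """Return candidate tail names sorted from most to least plausible."""
    return sorted(
        candidates,
        key=lambda name: transe_distance(head, relation, candidates[name]),
    )


# Toy embeddings, invented for illustration (not from the paper).
drifted_entity = [1.0, 0.0]   # assumed graph-space embedding of a drifted entity
relation = [0.0, 1.0]         # assumed relation embedding
candidates = {
    "a": [1.0, 1.0],          # exactly h + r, so distance 0
    "b": [0.0, 0.0],
    "c": [3.0, 3.0],
}

ranking = rank_tails(drifted_entity, relation, candidates)
print(ranking)
```

In this sketch the candidate whose embedding equals h + r is ranked first; a real system would rank all existing entities this way and evaluate with rank-based metrics.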
Iterative inference approaches have shown promising success in multi-view depth estimation. However, these methods put excessive emphasis on universal inter-view correspondences while neglecting the correspondence ambiguity in low-texture regions and at depth discontinuities. As a result, they are prone to producing inaccurate or even erroneous depth estimates, a problem that is further exacerbated by cumulative errors in the iterative pipeline and that yields unreliable information in many real-world scenarios. In this paper, we revisit this issue from the perspective of intra-view Contextual Hints and introduce a novel enhanced iterative approach named EnIter. Concretely, at the beginning of each iteration, we present a Depth Intercept (DI) modulator that provides more accurate depth by aggregating neighbor uncertainty, the correlation volume of the reference view, and normals. This plug-and-play modulator is effective at intercepting erroneous depth estimates with implicit guidance from the universal correlation contextual hints, especially in challenging regions. Furthermore, at the end of each iteration, we refine the depth map with another plug-and-play modulator, termed Depth Refine (DR). It mines the latent structural knowledge of the reference Contextual Hints and establishes a one-way dependency from reference features to depth using local attention, yielding depth maps with fine detail. Extensive experiments demonstrate that our method not only achieves state-of-the-art performance over existing models but also generalizes remarkably well across popular iterative pipelines, e.g., CasMVS, UCSNet, TransMVS, UniMVS.
Authors: Lisi Wei, Libo Zhao, Xiaoli Zhang
Affiliations: College of Computer Science and Technology, Jilin University, China; College of Artificial Intelligence and Big Data, Hulunbuir University, China and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, China; College of Computer Science and Technology, Jilin University, China and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, China
Due to the limitations of imaging sensors, obtaining a medical image that simultaneously captures both functional metabolic data and structural tissue details remains a significant challenge in clinical diagnosis. To address this, Multimodal Medical Image Fusion (MMIF) has emerged as an effective technique for integrating complementary information from multimodal source images, such as CT, PET, and SPECT, which is critical for a comprehensive understanding of both anatomical and functional aspects of the human body. One of the key challenges in MMIF is how to exchange and aggregate this multimodal information. This paper rethinks MMIF from the perspective of harmonizing modality gaps and proposes a novel Modality-Aware Interaction Network (MAINet), which leverages cross-modal feature interaction and progressively fuses multiple features in graph space. Specifically, we introduce two key modules: the Cascade Modality Interaction (CMI) module and the Dual-Graph Learning (DGL) module. The CMI module, integrated within a triple-branch multi-scale encoder, facilitates complementary multimodal feature learning and provides beneficial feedback to enhance discriminative feature learning across modalities. In the decoding process, the DGL module aggregates hierarchical features in two distinct graph spaces, enabling global feature interactions. Moreover, the DGL module incorporates a bottom-up guidance mechanism in which deeper semantic features guide the learning of shallower detail features, improving the fusion process by enhancing both scale diversity and modality awareness and producing results with high visual fidelity. Experimental results on medical image datasets demonstrate the superiority of the proposed method over existing fusion approaches in both subjective and objective evaluations. We also validate the proposed method on other applications, such as infrared-visible image fusion and medical image segmentation.
Smart agriculture, which integrates agriculture with the Internet of Things (IoT), has attracted attention because it helps increase crop productivity and quality, reduce energy consumption, and assist farmers. Wireless sensor networks (WSNs) and unmanned aerial vehicles (UAVs) are the two most commonly deployed types of devices for enabling smart agriculture. In this paper, we design a collaborative WSN-UAV system in which different clusters of sensor nodes form different sensor-based virtual antenna arrays (SVAAs) that transmit the collected data to different receiver UAVs via collaborative beamforming (CB); the receiver UAVs then carry the collected data back to the ground control station (GCS). We formulate a transmission rate and battery energy bi-objective optimization problem (TRBEBOP) to simultaneously maximize the total transmission rate of the sensor-based CB clusters and the total remaining battery energy of the selected sensor nodes. This is done by selecting appropriate sensor nodes in each cluster to form a predominant SVAA, determining suitable receiver UAVs, and optimizing the excitation current weights of the selected sensor nodes. To handle the formulated TRBEBOP, which is shown to be non-convex and NP-hard, we present an enhanced non-dominated sorting genetic algorithm II (ENSGA-II) with several specific designs. Simulation results validate the effectiveness of the proposed ENSGA-II for solving the formulated TRBEBOP and demonstrate its superiority over other benchmark algorithms. In addition, the impacts of several unexpected circumstances on the system are evaluated, and the results illustrate the robustness of the proposed scheme. Finally, we discuss several mechanisms for dealing with the interference induced by sidelobe levels, as well as the impact of UAV movement on the receiving rate.
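The bi-objective formulation above (maximize total transmission rate and total remaining battery energy) relies on Pareto dominance, the comparison at the core of non-dominated sorting in NSGA-II. The following is a minimal sketch of that dominance test and of extracting the first non-dominated front. It does not reproduce the ENSGA-II enhancements, which the abstract does not detail, and the candidate values are invented purely for illustration.

```python
def dominates(a, b):
    """True if solution a Pareto-dominates b when every objective is maximized."""
    at_least_as_good = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def first_front(points):
    """Return the solutions that no other solution dominates."""
    return [
        p for p in points
        if not any(dominates(q, p) for q in points if q is not p)
    ]


# Hypothetical (transmission rate, remaining energy) pairs, not from the paper.
candidates = [(10.0, 2.0), (8.0, 5.0), (7.0, 4.0), (3.0, 9.0), (2.0, 1.0)]

front = first_front(candidates)
print(front)
```

Here (7.0, 4.0) is dominated by (8.0, 5.0) and (2.0, 1.0) by (10.0, 2.0), so the first front contains the remaining three trade-off solutions. A full NSGA-II would repeat this to build successive fronts and add crowding-distance selection.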
Research on session-based recommender systems (SRSs) is currently one of the most active topics in the recommendation domain. Existing methods make recommendations based on the user's current intention (also called short-term preference) during a session, often overlooking the specific preferences associated with these intentions. In reality, users usually exhibit diverse preferences for different intentions, and even for the same intention, individual preferences can vary significantly between users. As users interact with items throughout a session, their intentions can shift accordingly. To enhance recommendation quality, it is crucial not only to consider the user's intentions but also to dynamically learn their varying preferences as these intentions change. In this paper, we propose a novel Intention-sensitive Preference Learning Network (IPLN) comprising three main modules: an intention recognizer, a preference detector, and a prediction layer. Specifically, the intention recognizer infers the user's underlying intention within the current session by analyzing complex relationships among items. Based on the inferred intention, the preference detector learns the intention-specific preference by selectively integrating latent features from items in the user's historical sessions. In addition, the user's general preference is used to refine the learned preference and reduce potential noise carried over from historical records. Finally, the refined preference and the intention jointly guide next-item recommendation in the prediction layer. To evaluate the effectiveness of the proposed IPLN, we perform extensive experiments on two real-world datasets. The experimental results demonstrate the superiority of IPLN over other state-of-the-art models.
Anchor-based Multi-view Subspace Clustering (AMSC) has become a popular tool for large-scale multi-view clustering. However, current AMSC approaches still have some limitations. First, they typically recover the anchor graph structure in the original linear space, which restricts their applicability to nonlinear scenarios. Second, they usually overlook the potential benefits of jointly capturing inter-view and intra-view information to enhance anchor representation learning. Third, these approaches mostly perform anchor-based subspace learning with a specific matrix norm, neglecting the latent high-order correlation across different views. To overcome these limitations, this paper presents an efficient and effective approach termed Large-scale Tensorized Multi-view Kernel Subspace Clustering (LTKMSC). Unlike existing AMSC approaches, LTKMSC exploits both inter-view and intra-view awareness for anchor-based representation building. Concretely, low-rank tensor learning is leveraged to capture the high-order correlation (i.e., the inter-view complementary information) among distinct views, upon which the \(l_{1,2}\) norm is imposed to explore the intra-view anchor graph structure in each view. Moreover, kernel learning is leveraged to explore the nonlinear anchor-sample relationships embedded in multiple views. With the unified objective function formulated, an efficient optimization algorithm with low computational complexity is further designed. Extensive experiments on a variety of multi-view datasets confirm the efficiency and effectiveness of our approach compared with other competitive approaches.