Most existing SLAM algorithms assume a static environment; this strong assumption limits the practical application of most SLAM systems. The main reason is that moving objects w...
In edge computing (EC), resource allocation assigns the computing, storage, and networking resources of edge nodes (ENs) to user-generated tasks efficiently and reasonably. Due to the resource-limitation of ...
Current inference scaling methods, such as Self-consistency and Best-of-N, have proven effective in improving the accuracy of LLMs on complex reasoning tasks. However, these methods rely heavily on the quality of cand...
ISBN (digital): 9798350317152
ISBN (print): 9798350317169
Text analytics directly on compression (TADOC) is a promising technology for big data analytics. However, it requires a substantial amount of DRAM for high performance, which limits its use in many important scenarios where DRAM capacity is limited, such as memory-constrained systems. Non-volatile memory (NVM) is a novel storage technology that combines the read performance and byte addressability of DRAM with the durability of traditional storage devices such as SSDs and HDDs. Unfortunately, no research demonstrates how to use NVM to reduce DRAM utilization in compressed data analytics. In this paper, we propose N-TADOC, which substitutes DRAM with NVM while maintaining TADOC's analytics performance and space savings. Utilizing an NVM block device to reduce DRAM utilization presents two challenges: poor data locality when traversing datasets and the reconstruction of auxiliary data structures on NVM. We develop novel designs to solve these challenges, including a pruning method with NVM pool management, bottom-up upper bound estimation, corresponding data structures, and persistence strategies at different cost levels. Experimental results show that on four real-world datasets, N-TADOC achieves a 2.04× speedup compared with processing the uncompressed data directly and a 70.7% DRAM space saving compared with the original TADOC.
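As a rough illustration of what analytics directly on compressed text can look like, the sketch below counts words over a small rule-based (grammar) compressed representation without expanding it. The grammar layout, the `RULES` example, and the function names are assumptions made for illustration; the abstract does not describe N-TADOC's data structures, and nothing here models NVM.

```python
from collections import Counter

# Hypothetical compressed text: rule 0 is the root. Each rule body mixes
# plain words (str) and references to other rules (int).
RULES = {
    0: [1, "d", 1, 2],
    1: ["a", "b", 2],
    2: ["c", "a"],
}


def word_counts(rules, root=0):
    """Count words bottom-up, visiting each rule body only once."""
    memo = {}

    def count(rule_id):
        if rule_id in memo:
            return memo[rule_id]
        total = Counter()
        for symbol in rules[rule_id]:
            if isinstance(symbol, int):
                # Reuse the counts of a sub-rule instead of expanding it.
                total.update(count(symbol))
            else:
                total[symbol] += 1
        memo[rule_id] = total
        return total

    return count(root)


def expand(rules, root=0):
    """Fully decompress the text; used only to check the counts."""
    words = []
    for symbol in rules[root]:
        if isinstance(symbol, int):
            words.extend(expand(rules, symbol))
        else:
            words.append(symbol)
    return words


if __name__ == "__main__":
    print(word_counts(RULES))
    print(" ".join(expand(RULES)))
```

Because each rule's counts are computed once and reused wherever the rule is referenced, repeated content is never reprocessed, which is the general appeal of working on the compressed form.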
Current large language models (LLMs) often struggle to produce accurate solutions on the first attempt for code generation. Prior research tackles this challenge by generating multiple candidate solutions and validati...
Current state-of-the-art image captioning models generate captions in a single language, requiring a combination of multiple language specific models to build a multilingual image captioning system. However, as the nu...
ISBN (digital): 9798350390155
ISBN (print): 9798350390162
Integrating multimodal data from diverse sources is crucial for enhancing various applications. Multimodal entity alignment (MMEA), which discovers equivalent entities across different sources and modalities, aims to eliminate data silos for comprehensive integration. A key challenge in MMEA is effectively fusing vector representations from different modalities of the same entity for optimal entity matching. Existing fusion methods involve individual fusion operators (e.g., concatenation and summation) or the manual design of complex network structures, incurring significant human resource costs. In this paper, for the first time, we introduce the research question of automatic fusion for MMEA and propose an efficient approach from the perspective of automated architecture search. Experimental comparisons with state-of-the-art methods on real-world datasets demonstrate the effectiveness of the proposed approach.
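The two individual fusion operators named in the abstract, concatenation and summation, can be sketched as follows. This is a minimal illustration over plain Python lists; the function names and the example vectors are hypothetical, and it does not represent the automated architecture search the paper proposes.

```python
def fuse_concat(embeddings):
    """Concatenate per-modality vectors into one longer vector."""
    fused = []
    for vec in embeddings:
        fused.extend(vec)
    return fused


def fuse_sum(embeddings):
    """Element-wise sum; all modality vectors must share one dimension."""
    dim = len(embeddings[0])
    if any(len(vec) != dim for vec in embeddings):
        raise ValueError("summation fusion needs equal-length vectors")
    return [sum(values) for values in zip(*embeddings)]


if __name__ == "__main__":
    visual = [0.5, 1.0]   # hypothetical image embedding of an entity
    textual = [1.5, -1.0]  # hypothetical text embedding of the same entity
    print(fuse_concat([visual, textual]))
    print(fuse_sum([visual, textual]))
```

Concatenation keeps every modality's features but grows the dimension, while summation keeps the dimension fixed but requires the modalities to live in a shared space.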
Relation prediction in knowledge graphs (KGs) aims to predict missing relations in incomplete triples, but the dominant paradigm based on KG embeddings is limited in predicting the relation between unseen entities...
The Kolmogorov-Arnold Network (KAN) is an emerging neural network architecture in machine learning. The research community has taken great interest in whether KAN can be a promising alternative to the commonly used M...
Identifying semantic types for attributes in relations, known as attribute semantic type (AST) identification, plays an important role in many data analysis tasks, such as data cleaning, schema matching, and keyword search in ***, due to a lack of unified naming standards across prevalent information systems (*** islands), AST identification still remains as an open *** tackle this problem, we propose a context-aware method to figure out the ASTs for relations in this *** transform the AST identification into a multi-class classification problem and propose a schema context aware (SCA) model to learn the representation from a collection of relations associated with attribute values and schema *** on the learned representation, we predict the AST for a given attribute from an underlying relation, wherein the predicted AST is mapped to one of the labeled *** improve the performance for AST identification, especially for the case that the predicted semantic types of attributes are not included in the labeled ASTs, we then introduce knowledge base embeddings (***) to enhance the above representation and construct a schema context aware model with knowledge base enhanced (SCA-KB) to get a stable and robust *** experiments based on real datasets demonstrate that our context-aware method outperforms the state-of-the-art approaches by a large margin, up to 6.14% and 25.17% in terms of macro average F1 score, and up to 0.28% and 9.56% in terms of weighted F1 score over high-quality and low-quality datasets respectively.
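The abstract reports both macro average F1 and weighted F1 for a multi-class classification task. The sketch below shows how the two averages differ, using the standard definitions; the label names and the tiny example are hypothetical and are not taken from the paper's datasets.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} over every label seen in either list."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by each class's true-label count."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[label] * support[label] for label in scores) / len(y_true)


if __name__ == "__main__":
    # Hypothetical semantic-type labels for four attributes.
    y_true = ["city", "city", "city", "name"]
    y_pred = ["city", "city", "name", "name"]
    print(per_class_f1(y_true, y_pred))
    print(macro_f1(y_true, y_pred), weighted_f1(y_true, y_pred))
```

Macro averaging is more sensitive to rare classes, while weighted averaging is dominated by frequent ones, which is one plausible reason the two reported margins differ so much.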