Search Results - Inner Mongolia University Library

39th IEEE International Conference on Data Engineering, ICDE 2023

Authors: Ge, Jiake; Shi, Boyu; Chai, Yanfeng; Luo, Yuanhui; Guo, Yunda; He, Yinxuan; Chai, Yunpeng. Affiliations: MOE Key Laboratory of Data Engineering and Knowledge Engineering, China; Renmin University of China, School of Information, China

ISBN: (print) 9798350322279

Numerous high-performance updatable learned indexes have recently been designed to support the write requirements of practical systems. Researchers have proposed various strategies to improve the availability of updatable learned indexes. However, it is unclear which strategy is more profitable. Therefore, we deconstruct the design of learned indexes into multiple dimensions and evaluate in depth the impact of each on overall performance. Through this exploration, we conclude that the approximation algorithm, rather than the index structure that popular works focus on, is the most crucial design dimension for improving the performance of learned indexes. Moreover, this paper makes a comprehensive end-to-end evaluation based on a high-performance key-value store to answer which learned index is better and whether learned indexes can outperform traditional ones. Finally, according to the end-to-end and in-depth evaluation results, we give some constructive suggestions on designing a better learned index in these dimensions, especially how to design an excellent approximation algorithm to improve the lookup and insertion performance of learned indexes. © 2023 IEEE.

Keywords: Approximation algorithms
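The record above concerns approximation algorithms in learned indexes. As a rough illustration only (not the method evaluated in the paper), the following sketch assumes a sorted list of distinct numeric keys, fits a single least-squares line from key to position, records the worst-case prediction error, and then searches only inside that error window. The names `build_linear_model` and `lookup` are hypothetical.

```python
import bisect


def build_linear_model(keys):
    """Fit position ~ slope * key + intercept by least squares over sorted keys.

    Returns (slope, intercept, err), where err is an integer bound that is
    strictly larger than the worst prediction error on the stored keys.
    """
    n = len(keys)
    mean_k = sum(keys) / n
    mean_p = (n - 1) / 2
    var = sum((k - mean_k) ** 2 for k in keys)
    cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(keys))
    slope = cov / var if var else 0.0
    intercept = mean_p - slope * mean_k
    max_err = max(abs(i - (slope * k + intercept)) for i, k in enumerate(keys))
    return slope, intercept, int(max_err) + 1


def lookup(keys, model, key):
    """Return the index of key, or None if it is not stored."""
    slope, intercept, err = model
    guess = int(slope * key + intercept)
    # Search only the window that the error bound guarantees for stored keys.
    lo = max(0, guess - err)
    hi = min(len(keys), guess + err + 1)
    i = bisect.bisect_left(keys, key, lo, hi)
    return i if i < len(keys) and keys[i] == key else None


keys = [2, 3, 5, 8, 13, 21, 34, 55, 89]
model = build_linear_model(keys)
found = [lookup(keys, model, k) for k in keys]
```

A tighter error bound means a smaller search window, which is one intuition for why the quality of the approximation matters for lookup cost.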

FOSS: A Self-Learned Doctor for Query Optimizer

40th IEEE International Conference on Data Engineering, ICDE 2024

Authors: Zhong, Kai; Sun, Luming; Ji, Tao; Li, Cuiping; Chen, Hong. Affiliations: Renmin University of China, China; Key Laboratory of Data Engineering and Knowledge Engineering, MOE, China; Shanghai Yunxi Technology Co. Ltd., China; Engineering Research Center of Database and Business Intelligence, MOE, China

ISBN: (print) 9798350317152

Various works have utilized deep learning to address the query optimization problem in database systems. They either learn to construct plans from scratch in a bottom-up manner or steer the plan generation behavior of a traditional optimizer using hints. While these methods have achieved some success, they suffer from either low training efficiency or a limited plan search space. To address these challenges, we introduce FOSS, a novel framework for query optimization based on deep reinforcement learning. FOSS initiates optimization from the original plan generated by a traditional optimizer and incrementally refines suboptimal nodes of the plan through a sequence of actions. Additionally, we devise an asymmetric advantage model to evaluate the advantage between two plans. We integrate it with a traditional optimizer to form a simulated environment. Leveraging this simulated environment, FOSS can bootstrap itself to rapidly generate a large amount of high-quality simulated experiences. FOSS then learns from these experiences to improve its optimization capability. We evaluate the performance of FOSS on the Join Order Benchmark, TPC-DS, and Stack Overflow. The experimental results demonstrate that FOSS outperforms state-of-the-art methods in terms of latency. Compared to PostgreSQL, FOSS achieves speedups ranging from 1.15x to 8.33x in total latency across different benchmarks. © 2024 IEEE.

Keywords: Reinforcement learning

Federated Incremental Named Entity Recognition

31st International Conference on Computational Linguistics, COLING 2025

Authors: Liu, Zesheng; Zhu, Qiannan; Li, Cuiping; Chen, Hong. Affiliations: School of Information, Renmin University of China, Beijing, China; Key Laboratory of Data Engineering and Knowledge Engineering, MOE, China; Engineering Research Center of Database and Business Intelligence, MOE, China; School of Artificial Intelligence, Beijing Normal University, Beijing, China; Engineering Research Center of Intelligent Technology and Educational Application, MOE, China

ISBN: (print) 9798891761964

Federated learning-based Named Entity Recognition (FNER) has attracted widespread attention through decentralized training on local clients. However, most FNER models assume that entity types are fixed in advance, whereas in practical applications local clients constantly receive new entity types without enough storage to access old entity types, resulting in severe forgetting of previously learned knowledge. In addition, new clients collecting only new entity types may join the global training of FNER irregularly, further exacerbating catastrophic forgetting. To overcome these challenges, we propose a Forgetting-Subdued Learning (FSL) model, which addresses the forgetting problem on old entity types from both the intra-client and inter-client aspects. Specifically, for the intra-client aspect, we propose a prototype-guided adaptive pseudo labeling and a prototypical relation distillation loss to surmount catastrophic forgetting of old entity types with semantic shift. Furthermore, for the inter-client aspect, we propose a task transfer detector. It can identify the arrival of new entity types that are protected by privacy and store the latest old global model for relation distillation. Qualitative experiments have shown that our model makes significant improvements over several baseline methods. © 2025 Association for Computational Linguistics.

Keywords: Federated learning

On-Line System of Garbage Image-Orientated Intelligent Classification, Submission and Examination

20th IEEE International Conference on e-Business Engineering, ICEBE 2024

Authors: Tian, Jiayin; Wang, Yaozhi; Liu, Jiaxin; Chen, Yan. Affiliations: School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi, China; Shaanxi Key Lab of Big Data Knowledge Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi, China

ISBN: (print) 9798350365856

In a world where new products appear continually, novel waste types are ubiquitous. Because garbage types follow a long-tailed distribution, current image-based garbage classification systems struggle to perform well, and environmental sustainability urgently requires efficient garbage classification that can detect new and rare wastes and learn classes incrementally. Therefore, we propose a framework, the Online System of Garbage Image-Oriented Intelligent Classification, Submission, and Examination, which facilitates incremental garbage classification. Within it, to identify novel garbage effectively, we introduce a few-shot object detection method with two key algorithms: a Two-Stage Object Detection Learning Algorithm and a Dynamic Query-based Incremental Few-shot Learning Algorithm. Our experimental results show that both outperform existing methods on the MS COCO dataset. Then, a strategy of Class-Incremental learning based Residual Network is proposed to meet the need for class-incremental learning of new waste types. The experimental results support our strategy. Finally, a prototype system that employs the above algorithms and strategy is described. © 2024 IEEE.

Keywords: Zero-shot learning

Towards annotation-free evaluation of cross-lingual image captioning

2nd ACM International Conference on Multimedia in Asia, MMAsia 2020

Authors: Chen, Aozhu; Huang, Xinyi; Lin, Hailan; Li, Xirong. Affiliation: MOE Key Lab of Data Engineering and Knowledge Engineering, Renmin University of China, Beijing, China

ISBN: (print) 9781450383080

Cross-lingual image captioning, with its ability to caption an unlabeled image in a target language other than English, is an emerging topic in the multimedia field. In order to save the precious human resource from re-writing reference sentences per target language, in this paper we make a brave attempt towards annotation-free evaluation of cross-lingual image captioning. Depending on whether we assume the availability of English references, two scenarios are investigated. For the first scenario with the references available, we propose two metrics, i.e., WMDRel and CLinRel. WMDRel measures the semantic relevance between a model-generated caption and machine translation of an English reference using their Word Mover's Distance. By projecting both captions into a deep visual feature space, CLinRel is a visual-oriented cross-lingual relevance measure. As for the second scenario, which has zero reference and is thus more challenging, we propose CMedRel to compute a cross-media relevance between the generated caption and the image content, in the same visual feature space as used by CLinRel. We have conducted a number of experiments to evaluate the effectiveness of the three proposed metrics. The combination of WMDRel, CLinRel and CMedRel has a Spearman's rank correlation of 0.952 with the sum of BLEU-4, METEOR, ROUGE-L and CIDEr, four standard metrics computed using references in the target language. CMedRel alone has a Spearman's rank correlation of 0.786 with the standard metrics. The promising results show high potential of the new metrics for evaluation with no need of references in the target language. © 2021 ACM.

Keywords: Semantics
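The abstract above reports agreement between metrics as Spearman's rank correlation. For readers unfamiliar with the statistic, here is a minimal sketch of how it can be computed, assuming no tied values (so the closed form 1 - 6Σd²/(n(n²-1)) applies). The function names and the score lists are hypothetical and are not data from the paper.

```python
def ranks(values):
    """Assign ranks 1..n by ascending value (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(x, y):
    """Spearman's rank correlation for two equal-length, tie-free sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical scores for four captioning systems under two metrics.
new_metric = [0.10, 0.40, 0.30, 0.90]
reference_metric = [10.0, 30.0, 20.0, 50.0]
rho = spearman(new_metric, reference_metric)
```

Because only ranks matter, the two metrics can live on entirely different scales; a value near 1 means they order the systems almost identically.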

A Query Optimization Method Utilizing Large Language Models

arXiv, 2025

Authors: Yao, Zhiming; Li, Haoyang; Zhang, Jing; Li, Cuiping; Chen, Hong. Affiliations: School of Information, Renmin University of China, Beijing, China; Key Laboratory of Data Engineering and Knowledge Engineering, MOE, China; Engineering Research Center of Database and Business Intelligence, MOE, China

Query optimization is a critical task in database systems, focused on determining the most efficient way to execute a query from an enormous set of possible strategies. Traditional approaches rely on heuristic search methods and cost predictions, but these often struggle with the complexity of the search space and inaccuracies in performance estimation, leading to suboptimal plan choices. This paper presents LLMOpt, a novel framework that leverages Large Language Models (LLMs) to address these challenges through two innovative components: (1) LLM for Plan Candidate Generation (LLMOpt(G)), which eliminates heuristic search by utilizing the reasoning abilities of LLMs to directly generate high-quality query plans, and (2) LLM for Plan Candidate Selection (LLMOpt(S)), a list-wise cost model that compares candidates globally to enhance selection accuracy. To adapt LLMs for query optimization, we propose fine-tuning pre-trained models using optimization data collected offline. Experimental results on the JOB, JOB-EXT, and Stack benchmarks show that LLMOpt(G) and LLMOpt(S) outperform state-of-the-art methods, including PostgreSQL, BAO, and HybridQO. Notably, LLMOpt(S) achieves the best practical performance, striking a balance between plan quality and inference efficiency. Copyright © 2025, The Authors. All rights reserved.

Keywords: Structured Query Language

RPR-Net: A Point Cloud-Based Rotation-Aware Large Scale Place Recognition Network

17th European Conference on Computer Vision, ECCV 2022

Authors: Fan, Zhaoxin; Song, Zhenbo; Zhang, Wenping; Liu, Hongyan; He, Jun; Du, Xiaoyong. Affiliations: Key Laboratory of Data Engineering and Knowledge Engineering of MOE, School of Information, Renmin University of China, Beijing 100872, China; Department of Management Science and Engineering, Tsinghua University, Beijing 100084, China; School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China

ISBN: (print) 9783031250552

Point cloud-based large scale place recognition is an important but challenging task for many applications such as Simultaneous Localization and Mapping (SLAM). Treating the task as a point cloud retrieval problem, previous methods have made notable progress. However, how to deal with the catastrophic collapse caused by rotation problems is still under-explored. In this paper, to tackle the issue, we propose a novel Point Cloud-based Rotation-aware Large Scale Place Recognition Network (RPR-Net). In particular, we propose to learn rotation-invariant features in three steps. First, we design three kinds of novel Rotation-Invariant Features (RIFs), which are low-level features that hold the rotation-invariant property. Second, using these RIFs, we design an attentive module to learn rotation-invariant kernels. Third, we apply these kernels to previous point cloud features to generate new features, which is the well-known SO(3) mapping process. By doing so, high-level scene-specific rotation-invariant features can be learned. We call the above process an Attentive Rotation-Invariant Convolution (ARIConv). To achieve the place recognition goal, we build RPR-Net, which takes ARIConv as a basic unit to construct a dense network architecture. Then, powerful global descriptors used for retrieval-based place recognition can be sufficiently extracted from RPR-Net. Experimental results on prevalent datasets show that our method achieves comparable results to existing state-of-the-art place recognition models and significantly outperforms other rotation-invariant baseline models when solving rotation problems. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Keywords: Network architecture

Semi-Supervised Learning via Weight-aware Distillation under Class Distribution Mismatch

International Conference on Computer Vision (ICCV)

Authors: Pan Du; Suyun Zhao; Zisen Sheng; Cuiping Li; Hong Chen. Affiliations: Key Lab of Data Engineering and Knowledge Engineering of MOE, Renmin University of China; Renmin University of China, Beijing, China

Semi-Supervised Learning (SSL) under class distribution mismatch aims to tackle a challenging problem wherein unlabeled data contain lots of unknown categories unseen in the labeled ones. In such mismatch scenarios, traditional SSL suffers severe performance damage due to the harmful invasion of the instances with unknown categories into the target classifier. In this study, by strict mathematical reasoning, we reveal that the SSL error under class distribution mismatch is composed of pseudo-labeling error and invasion error, both of which jointly bound the SSL population risk. To alleviate the SSL error, we propose a robust SSL framework called Weight-Aware Distillation (WAD) that, by weights, selectively transfers knowledge beneficial to the target task from unsupervised contrastive representation to the target classifier. Specifically, WAD captures adaptive weights and high-quality pseudo-labels to target instances by exploring point mutual information (PMI) in representation space to maximize the role of unlabeled data and filter unknown categories. Theoretically, we prove that WAD has a tight upper bound of population risk under class distribution mismatch. Experimentally, extensive results demonstrate that WAD outperforms five state-of-the-art SSL approaches and one standard baseline on two benchmark datasets, CIFAR10 and CIFAR100, and an artificial cross-dataset. The code is available at https://***/RUC-DWBI-ML/research/tree/main/WAD-master.
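The abstract above mentions exploring point mutual information (PMI). As background only, and assuming the term refers to the standard pointwise mutual information, log p(a, b) / (p(a) p(b)), the sketch below estimates it from co-occurrence counts of discrete pairs. This is not the WAD procedure, which works in a learned representation space; `pmi_table` and the sample pairs are hypothetical.

```python
import math
from collections import Counter


def pmi_table(pairs):
    """Estimate PMI(a, b) = log(p(a, b) / (p(a) * p(b))) from observed pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    return {
        (a, b): math.log((count / n) / ((left[a] / n) * (right[b] / n)))
        for (a, b), count in joint.items()
    }


# Perfectly associated pairs: each left item always co-occurs with one right item.
associated = pmi_table([("a", "x"), ("a", "x"), ("b", "y"), ("b", "y")])
# Independent pairs: every combination is equally likely.
independent = pmi_table([("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")])
```

Positive PMI indicates that two items co-occur more often than independence would predict, and a value of zero indicates no association.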

Enabling Efficient NVM-Based Text Analytics without Decompression

International Conference on Data Engineering

Authors: Xiaokun Fang; Feng Zhang; Junxiang Nong; Mingxing Zhang; Puyun Hu; Yunpeng Chai; Xiaoyong Du. Affiliations: Key Laboratory of Data Engineering and Knowledge Engineering (MOE) and School of Information, Renmin University of China; Department of Computer Science and Engineering, Tsinghua University

ISBN: (electronic) 9798350317152

ISBN: (print) 9798350317169

Text analytics directly on compression (TADOC) is a promising technology designed for handling big data analytics. However, a substantial amount of DRAM is required for high performance, which limits its usage in many important scenarios where the capacity of DRAM is limited, such as memory-constrained systems. Non-volatile memory (NVM) is a novel storage technology that combines the read performance and byte addressability of DRAM with the durability of traditional storage devices like SSD and HDD. Unfortunately, no research demonstrates how to use NVM to reduce DRAM utilization in compressed data analytics. In this paper, we propose N-TADOC, which substitutes DRAM with NVM while maintaining TADOC's analytics performance and space savings. Utilizing an NVM block device to reduce DRAM utilization presents two challenges: poor data locality in traversing datasets and auxiliary data structure reconstruction on NVM. We develop novel designs to solve these challenges, including a pruning method with NVM pool management, bottom-up upper bound estimation, corresponding data structures, and persistence strategies at different levels of cost. Experimental results show that on four real-world datasets, N-TADOC achieves a 2.04× performance speedup compared to processing directly on the uncompressed data and a 70.7% DRAM space saving compared to the original TADOC.

Keywords: Performance evaluation; Upper bound; Data analysis; Costs; Nonvolatile memory; Random access memory; Estimation

Semi-Supervised Learning via Weight-aware Distillation under Class Distribution Mismatch

arXiv, 2023

Authors: Du, Pan; Zhao, Suyun; Sheng, Zisen; Li, Cuiping; Chen, Hong. Affiliations: Key Lab of Data Engineering and Knowledge Engineering of MOE, Renmin University of China, China; Renmin University of China, Beijing, China

Keywords: Distillation
