Partial multi-label learning (PML) allows learning from rich-semantic objects with inaccurate annotations, where a set of candidate labels is assigned to each training example but only some of them are valid. Existing approaches rely on disambiguation to tackle the PML problem, aiming to correct noisy candidate labels by recovering the ground-truth labeling information before inducing the prediction model. However, this dominant strategy can be suboptimal, as it usually requires extra assumptions that cannot be fully satisfied in real-world scenarios. Instead of label correction, we investigate another strategy for tackling the PML problem, in which the potential ambiguity in PML data is eliminated by correcting instance features in a label-specific manner. Accordingly, a simple yet effective approach named PASE, i.e., partial multi-label learning via label-specific feature corrections, is proposed. Under a meta-learning framework, PASE learns to exert label-specific feature corrections so that the potential ambiguity specific to each class label can be eliminated and the desired prediction model can be induced on the corrected instance features with the provided candidate labels. Comprehensive experiments on a wide range of synthetic and real-world data sets validate the effectiveness of the proposed approach.
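The core idea above, one learnable feature correction per class label, can be sketched in a few lines. This is a toy illustration only, not the paper's meta-learning procedure; the array names, the additive form of the correction, and the sigmoid scoring are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy PML setup: n instances, d features, q candidate labels.
n, d, q = 8, 5, 3
X = rng.normal(size=(n, d))              # instance features
W = rng.normal(size=(q, d))              # one linear predictor per label
C = rng.normal(scale=0.1, size=(q, d))   # label-specific feature corrections

def predict(X, W, C):
    """Score label j on the corrected features X + C[j] (one correction per label)."""
    scores = np.empty((X.shape[0], W.shape[0]))
    for j in range(W.shape[0]):
        scores[:, j] = (X + C[j]) @ W[j]
    return 1.0 / (1.0 + np.exp(-scores))  # sigmoid, per instance-label pair

P = predict(X, W, C)
print(P.shape)  # (8, 3)
```

In PASE the corrections C would be optimized so that each label's ambiguity is removed before its predictor is fit; here they are random placeholders.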
1 Introduction In Natural Language Processing (NLP), topic modeling is a class of methods used to analyze and explore textual corpora, i.e., to discover the underlying topic structures in text and assign text pieces to different topics. In NLP, a topic means a set of relevant words appearing together in a particular pattern, representing some specific theme. Topic modeling is beneficial for tracking social media trends, constructing knowledge graphs, and analyzing writing styles, and it has always been an area of extensive research in NLP. Classical methods like Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA), based on the “bag of words” (BoW) model, often fail to grasp the semantic nuances of the text, making them less effective in contexts involving polysemy or data noise, especially when the amount of data is small.
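For the classical BoW-based line of work mentioned above, a minimal LSA sketch (term counting followed by truncated SVD of the document-term matrix) looks like this; the toy corpus and the choice of two latent topics are arbitrary.

```python
import numpy as np

# Toy BoW + LSA: count terms, then take a truncated SVD of the document-term matrix.
docs = [
    "the cat sat on the mat",
    "dogs and cats are pets",
    "stock markets fell sharply today",
    "investors sold shares as markets dropped",
]
vocab = sorted({w for doc in docs for w in doc.split()})
tf = np.array([[doc.split().count(w) for w in vocab] for doc in docs], dtype=float)

U, s, Vt = np.linalg.svd(tf, full_matrices=False)
k = 2                                # keep two latent "topics"
doc_topics = U[:, :k] * s[:k]        # document coordinates in the latent topic space
print(doc_topics.shape)              # (4, 2)
```

Because the representation is built purely from counts, two documents sharing no words land far apart even when they are semantically close, which is exactly the weakness the paragraph points out.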
Long-tailed multi-label text classification aims to identify a subset of relevant labels from a large candidate label set, where the training datasets usually follow long-tailed label distributions. Many previous studies have treated head and tail labels equally, resulting in unsatisfactory performance for identifying tail labels. To address this issue, this paper proposes a novel learning method that combines arbitrary models with two steps. The first step is the “diverse ensemble”, which encourages diverse predictions among multiple shallow classifiers, particularly on tail labels, and can improve generalization on tail labels. The second is the “error correction” step, which takes advantage of the base model's accurate predictions on head labels and approximates its residual errors for tail labels. Thus, it enables the “diverse ensemble” to focus on optimizing the tail label performance. This overall procedure is called residual diverse ensemble (RDE). RDE is implemented via a single-hidden-layer perceptron and can scale up to hundreds of thousands of labels. We empirically show that RDE consistently improves many existing models with considerable performance gains on benchmark datasets, especially with respect to propensity-scored evaluation metrics. Moreover, RDE converges in less than 30 training epochs without increasing the computational overhead.
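The two-step composition, a base model that is accurate on head labels plus an ensemble that approximates its residual on tail labels, can be illustrated with synthetic scores. This toy sketch substitutes noise-perturbed copies of the true residual for the paper's trained shallow classifiers, so it only shows the arithmetic of the correction, not the learning.

```python
import numpy as np

rng = np.random.default_rng(1)
n, q = 100, 6                        # instances and labels: first 3 "head", last 3 "tail"
Y = rng.random((n, q)) < 0.5         # toy ground-truth label matrix

# Stand-in base model: accurate on head labels, noisy on tail labels.
base = Y + rng.normal(scale=[0.1] * 3 + [0.6] * 3, size=(n, q))

# "Error correction": approximate the base model's residual on tail labels
# with a small, diverse ensemble (here: noisy copies of the true residual).
residual = Y.astype(float) - base
tail = slice(3, 6)
ensemble = [residual[:, tail] + rng.normal(scale=0.05, size=(n, 3)) for _ in range(5)]
correction = np.mean(ensemble, axis=0)

final = base.copy()
final[:, tail] += correction         # base predictions + approximated tail residual
print(np.abs(final[:, 3:] - Y[:, 3:]).mean() < np.abs(base[:, 3:] - Y[:, 3:]).mean())
```

The head-label scores are left untouched, which mirrors how RDE lets the ensemble concentrate its capacity on the tail.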
Instance segmentation has drawn mounting attention due to its significant applications. However, high computational costs have been widely acknowledged in this domain, as the instance mask is generally achieved by pixel-level classification. In this paper, we present a conceptually efficient contour regression network based on the you only look once (YOLO) architecture, named YOLO-CORE, for instance segmentation. The mask of the instance is efficiently acquired by explicit and direct contour regression using our designed multi-order constraint consisting of a polar distance loss and a sector loss. The proposed YOLO-CORE yields impressive segmentation performance in terms of both accuracy and speed. It achieves 57.9% AP@0.5 with 47 FPS (frames per second) on the semantic boundaries dataset (SBD) and 51.1% AP@0.5 with 46 FPS on the COCO dataset. The superior performance achieved by our method with explicit contour regression suggests a new technical line in YOLO-based image understanding. Moreover, our instance segmentation design can be flexibly integrated into existing deep detectors with negligible computational cost (65.86 BFLOPs (billion floating-point operations) to 66.15 BFLOPs with the YOLOv3 detector).
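Contour regression in polar form, as used above, represents a mask by radial distances from the instance centre at fixed angles. A plausible L1 form of a polar distance loss is sketched below; the exact loss in the paper is not given here, so the L1 choice, the number of rays, and the circle example are assumptions.

```python
import numpy as np

# A contour encoded as K radial distances from the instance centre,
# one per fixed polar angle -- the quantity a polar distance loss regresses.
K = 8
angles = np.linspace(0, 2 * np.pi, K, endpoint=False)

def circle_contour(radius):
    return np.full(K, float(radius))  # a circle: constant radius at every angle

def polar_distance_loss(pred, target):
    # One plausible form: mean L1 gap between predicted and target ray lengths.
    return np.abs(pred - target).mean()

target = circle_contour(2.0)
pred = circle_contour(1.5)
print(polar_distance_loss(pred, target))  # 0.5
```

Regressing K scalars per instance is far cheaper than classifying every pixel, which is the efficiency argument the abstract makes.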
Social media has significantly accelerated the rapid dissemination of information, but it also boosts the propagation of fake news, posing serious challenges to public awareness and social stability. In real-world contexts, the volume of trustworthy information far exceeds that of rumors, resulting in a class imbalance that leads models to prioritize the majority class during training. This focus diminishes the model’s ability to recognize minority-class samples. Moreover, models may experience overfitting when encountering these minority samples, further compromising their generalization ability. Unlike node-level classification tasks, fake news detection in social networks operates on graph-level samples, where traditional interpolation and oversampling methods struggle to effectively generate high-quality graph-level samples. This challenge complicates the identification of new instances of false information. To address this issue, this paper introduces the FHGraph (Fake News Hunting Graph) framework, which employs a generative data augmentation approach and a latent diffusion model to create graph structures that align with news communication patterns. Leveraging the few-sample learning capabilities of large language models (LLMs), the framework generates diverse texts for minority-class samples. FHGraph comprises a hierarchical multi-view graph contrastive learning module, in which two horizontal views and three vertical levels are utilized for self-supervised learning, resulting in more optimized representations. Experimental results show that FHGraph significantly outperforms state-of-the-art (SOTA) graph-level class-imbalance methods and SOTA graph-level contrastive learning methods. Specifically, FHGraph achieves a 2% increase in F1 Micro and a 2.5% increase in F1 Macro on the PHEME dataset, as well as a 3.5% improvement in F1 Micro and a 4.3% improvement in F1 Macro on the RumorEval dataset.
The goal of privacy-preserving social graph release is to protect individual privacy while preserving data utility. Community structure, which is an important global pattern of nodes, is a crucial data utility as it is fundamental to many graph analysis tasks. However, most existing methods with differential privacy (DP) commonly fall into edge-DP, sacrificing security in exchange for utility. Moreover, they reconstruct graphs from local feature extraction of nodes, resulting in poor community preservation. Motivated by this, we develop PrivCom, a strict node-DP graph release algorithm that maximizes utility on the community structure while maintaining a higher level of privacy. In this algorithm, to reduce the huge sensitivity, we devise a Katz index based private graph feature extraction method, which can capture global graph structure features while greatly reducing the global sensitivity via a sensitivity regulation strategy. However, under the condition that the sensitivity is fixed, the feature captured by the Katz index, which is presented in matrix form, requires privacy budget splitting. As a result, plenty of noise is injected, mitigating global structural utility. To bridge this gap, we design a private eigenvector estimation method, which yields noisy eigenvectors from the extracted low-dimensional features. Then, a dynamic privacy budget allocation method with provable utility guarantees is developed to preserve the inherent relationship between eigenvalues and eigenvectors, so that the utility of the generated noisy Katz matrix is well maintained. Finally, we reconstruct the synthetic graph by calculating its Laplacian with the noisy Katz matrix. Experimental results confirm our theoretical findings and the efficacy of PrivCom.
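The Katz index underlying the feature extraction step has a closed form, K = (I - beta*A)^-1 - I, which weights paths of every length l by beta^l and converges when beta is below the reciprocal of the largest eigenvalue of A. A toy computation (the graph and beta are arbitrary, and no DP noise is added here):

```python
import numpy as np

# Katz index on a toy undirected graph: K = (I - beta*A)^{-1} - I.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
beta = 0.1                               # well below 1/lambda_max for this graph
I = np.eye(4)
K = np.linalg.inv(I - beta * A) - I

# Direct neighbours score higher than two-hop pairs at small beta.
print(K[0, 1] > K[0, 3])  # True
```

Because K aggregates paths of all lengths, it captures global structure, which is why perturbing it (rather than individual edges) can preserve communities better.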
After the global pandemic, DaaS (desktop as a service) has become the first choice of many companies’ remote working solutions. Since the desktops are usually deployed in the public cloud when using DaaS, customers are more cost-sensitive, which boosts the requirement of proactive power management. Existing research in this area focuses on virtual desktop infrastructure (VDI) session logon behavior modeling, but for remote desktop service host (RDSH)-shared desktop pools, logoff optimization is also important. Existing systems place sessions round-robin or in a pre-defined order without considering their logoff time. Therefore, these approaches usually suffer from the situation that a few remaining sessions prevent RDSH servers from being powered off, which introduces cost waste. In this paper, we propose session placement via adaptive user logoff prediction (SODA), an innovative compound model for proactive RDSH session placement. Specifically, an ensemble machine learning model that predicts session logoff time is combined with a statistical session placement bucket model to place RDSH sessions with similar logoff times in a more centralized manner on RDSH servers. Hence, infrastructure cost savings can be improved by reducing the resource waste introduced by RDSH hosts left with very few hanging sessions for a long time. Experiments on real RDSH pool data demonstrate the effectiveness of the proposed proactive session placement approach against existing static placement techniques.
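The bucket placement idea, grouping sessions whose predicted logoff times fall in the same interval onto the same hosts so those hosts drain and power off together, can be sketched minimally. The one-hour bucket width, the session names, and the predicted hours are all assumptions; the paper's bucket model is statistical rather than a fixed grid.

```python
from collections import defaultdict

# Predicted logoff hour per session (stand-in for the ensemble model's output).
predicted_logoff = {"s1": 17, "s2": 18, "s3": 17, "s4": 22, "s5": 18}
BUCKET_HOURS = 1  # assumed bucket width

# Sessions in the same bucket would be placed on the same RDSH host,
# so the host can be powered off soon after the bucket's hour passes.
buckets = defaultdict(list)
for session, hour in predicted_logoff.items():
    buckets[hour // BUCKET_HOURS].append(session)

print(sorted(buckets[17]))  # ['s1', 's3']
```

Under round-robin placement, s4 (logging off at 22) could land beside s1 and s3 and keep that host powered for five extra hours; bucketing avoids exactly that waste.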
Conversational Query Reformulation (CQR) has significantly advanced in addressing the challenges of conversational search, particularly those stemming from the latent user intent and the need for historical context. R...
Large Language Models (LLMs) demonstrate remarkable emergent abilities across various tasks, yet fall short of complex reasoning and planning tasks. The tree-search-based reasoning methods address this by encouraging ...
Data hierarchy, as a hidden property of data structure, exists in a wide range of machine learning applications. A common practice to classify such hierarchical data is first to encode the data in the Euclidean space and then train a Euclidean classifier. However, such a paradigm leads to a performance drop due to distortion of the data embedding in the Euclidean space. To relieve this issue, hyperbolic geometry is investigated as an alternative space to encode the hierarchical data, for its higher ability to capture hierarchical structures. Existing methods cannot explore the full potential of hyperbolic geometry, in the sense that they define the hyperbolic operations in the tangent plane, causing distortion of the data embedding. In this paper, we develop two novel kernel formulations in the hyperbolic space, one positive definite (PD) and one indefinite, to solve classification tasks in hyperbolic space. The PD kernel is defined via mapping the hyperbolic data to the Drury-Arveson (DA) space, which is a special reproducing kernel Hilbert space (RKHS). To further increase the discrimination of the classifier, an indefinite kernel is further defined in the Krein space. Specifically, we design a 2-layer nested indefinite kernel which first maps hyperbolic data into the DA spaces, followed by a mapping from the DA spaces to the Krein space. Extensive experiments on real-world datasets demonstrate the superiority of the proposed kernels.
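The DA-space construction above relies on the Drury-Arveson kernel. On real points in the open unit ball, a real-valued analogue is k(x, y) = 1 / (1 - <x, y>), whose Gram matrices are positive definite; a quick numerical check illustrates this. The real-vector simplification is an assumption made for the sketch, as the DA space is defined over the complex unit ball, and the mapping from hyperbolic data into the ball is omitted.

```python
import numpy as np

# Real-valued analogue of the Drury-Arveson kernel on the open unit ball:
# k(x, y) = 1 / (1 - <x, y>), the geometric series sum of <x, y>^n.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
X = 0.5 * X / np.linalg.norm(X, axis=1, keepdims=True)  # scale points into the ball

G = 1.0 / (1.0 - X @ X.T)          # Gram matrix of the kernel
eigvals = np.linalg.eigvalsh(G)
print(bool(eigvals.min() > 0))     # positive definite on these distinct points
```

Positive definiteness follows from the Schur product theorem term by term in the geometric series, which is what makes the DA space a legitimate RKHS target for the PD kernel.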