Search Results - Inner Mongolia University Library

Progressive Residual Extraction Based Pre-Training for Speech Representation Learning

IEEE Transactions on Audio, Speech and Language Processing, 2025, vol. 33, pp. 1825-1837

Authors: Tianrui Wang, Jin Li, Ziyang Ma, Rui Cao, Xie Chen, Longbiao Wang, Meng Ge, Xiaobao Wang, Yuguang Wang, Jianwu Dang, Nyima Tashi. Affiliations: Tianjin Key Laboratory of Cognitive Computing and Application, College of Intelligence and Computing, Tianjin University, Tianjin, China; MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai, China; Huiyan Technology (Tianjin) Company Ltd., Tianjin, China; Saw Swee Hock School of Public Health, National University of Singapore, Singapore; Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; School of Information Science and Technology, Tibet University, Lhasa, China; State Key Laboratory of Tibetan Intelligence, Lhasa, China

Self-supervised learning (SSL) has garnered significant attention in speech processing, particularly excelling in linguistic tasks such as speech recognition. However, improving the performance of pre-trained models across various downstream tasks—each requiring distinct types of speech information—remains a significant challenge. To address this, we propose a progressive residual extraction based SSL method, named ProgRE. Specifically, we introduce two lightweight, specialized task modules into an encoder-style SSL backbone to enhance its ability to extract pitch variation and speaker information from speech. Furthermore, to mitigate the incompatibility between the reinforced pitch variation and speaker information and the learning of content information, we employ residual extraction, leveraging the extracted representations as references or conditioning signals to guide the subsequent modules in more effectively learning content-related information under the supervision of HuBERT-based speech masking prediction. In this manner, we can incrementally extract pitch variation, speaker, and content representations from the input speech. Finally, these multiple representations, each capturing diverse speech information, are combined using different layer weights to produce task-specific representations for various downstream tasks. Experimental results demonstrate that our ProgRE achieves significant performance improvements across several tasks, such as speaker identification, speech recognition, emotion recognition, speech enhancement, and voice conversion, outperforming excellent SSL methods like wav2vec2.0, HuBERT, and WavLM.
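The final step of the abstract, combining several layer representations with task-specific layer weights, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `softmax` and `combine_layers` are hypothetical, and normalizing the weights with a softmax is an assumption that the abstract does not confirm.

```python
import math


def softmax(logits):
    """Normalize raw layer weights so they are positive and sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combine_layers(layer_outputs, layer_logits):
    """Weighted sum of per-layer feature vectors (hypothetical helper).

    layer_outputs: one feature vector (list of floats) per layer, equal length.
    layer_logits:  one learnable scalar per layer; a downstream task would
                   learn its own set (assumption, not stated in the abstract).
    """
    if len(layer_outputs) != len(layer_logits):
        raise ValueError("need exactly one weight per layer")
    weights = softmax(layer_logits)
    dim = len(layer_outputs[0])
    return [
        sum(w * layer[d] for w, layer in zip(weights, layer_outputs))
        for d in range(dim)
    ]


# Two toy "layers" with equal weights average to the midpoint.
combined = combine_layers([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

With equal logits the result is the plain average of the layers; a task that relies mostly on speaker information would, under this assumption, learn larger weights for the layers that carry it.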

Keywords: Data mining; Feature extraction; Transformers; Speech recognition; Training; Speech enhancement; Acoustics; Vector quantization; Self-supervised learning; Germanium

A method to evaluate the spatial extensibility of a switching unit and network

Science China (Information Sciences), 2014, vol. 57, no. 2, pp. 140-156

Authors: ZHANG Bo, WU JunTing, WANG BinQiang, LI Hui. Affiliations: National Digital Switching System Engineering Technological R&D Center; Shenzhen Key Lab of Cloud Computing Technology and Application

Switching units and networks have been analyzed as extensible fabrics, mostly in terms of their scheduling *** traditional literature on switching extensibility has provided complexity theory only relating to the total numbers of inputs (or outputs) and exchange *** paper analyzes switching extensibility in terms of not only the scheduling algorithm but also the fabric *** is found that determining extensibility from soft complexity related to the number of inputs (or outputs) of the scheduling algorithm and the fabric extensibility in previous studies without quantization is a flawed conception. A method is thus proposed to express the spatial extensibility of a switching unit or network in terms of the connections of a switching resource and *** method calculates parameter ES (the efficiency of switching) of an m×n switching unit and obtains two functions of the switching unit to describe spatial extensibility along with the number of unilateral inputs or *** is found that the range of ES is (0,1] and three types of switching unit and two types of crosspoint networks have ES=*** is calculated for banyan, Clos, parallel packet, fully interconnected and recirculation switching *** ES value for the banyan switching network is larger than that for other networks, and switching networks are classified into three types that have absolute/linear/denied spatial extensibility according to the limES *** is demonstrated that a switching network has the largest ES value when it contains only the five types of switching unit for which ES=***, a group-switching-first self-routing banyan switching network with lower blocking probability and time delay is deduced, and the ES method is contrasted with two other methods of evaluating spatial extensibility in terms of their mathematical expressions and intuitive graphics, for the five types of switching network listed above.

Keywords: spatial extensibility; connection state; efficiency of switching; banyan; fully interconnected; recirculation; switching network

Adaptive Backdoor Attacks with Reasonable Constraints on Graph Neural Networks

IEEE Transactions on Dependable and Secure Computing, 2025

Authors: Dong, Xuewen; Li, Jiachen; Li, Shujun; You, Zhichao; Qu, Qiang; Kholodov, Yaroslav; Shen, Yulong. Affiliations: The School of Computer Science and Technology, Xidian University; The Engineering Research Center of Blockchain Technology Application and Evaluation, Ministry of Education, China; The Shaanxi Key Laboratory of Blockchain and Secure Computing, Xi’an 710071, China; The School of Computing, Kent Interdisciplinary Research Centre in Cyber Security, University of Kent, Canterbury CT2 7NF, United Kingdom; Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, China; The Intelligent Transportation Systems Lab, Innopolis University, Innopolis, Russia; The School of Computer Science and Technology, Xidian University, China

Recent studies show that graph neural networks (GNNs) are vulnerable to backdoor attacks. Existing backdoor attacks against GNNs use fixed-pattern triggers and lack reasonable trigger constraints, overlooking individual graph characteristics and rendering insufficient evasiveness. To tackle the above issues, we propose ABARC, the first Adaptive Backdoor Attack with Reasonable Constraints, applying to both graph-level and node-level tasks in GNNs. For graph-level tasks, we propose a subgraph backdoor attack independent of the graph’s topology. It dynamically selects trigger nodes for each target graph and modifies node features with constraints based on graph similarity, feature range, and feature type. For node-level tasks, our attack begins with an analysis of node features, followed by selecting and modifying trigger features, which are then constrained by node similarity, feature range, and feature type. Furthermore, an adaptive edge-pruning mechanism is designed to reduce the impact of neighbors on target nodes, ensuring a high attack success rate (ASR). Experimental results show that even with reasonable constraints for attack evasiveness, our attack achieves a high ASR while incurring a marginal clean accuracy drop (CAD). When combined with the state-of-the-art defense randomized smoothing (RS) method, our attack maintains an ASR over 94%, surpassing existing attacks by more than 7%. © 2004-2012 IEEE. All rights reserved.

Keywords: Graph neural networks

Characteristic-Specific Partial Fine-Tuning for Efficient Emotion and Speaker Adaptation in Codec Language Text-to-Speech Models

arXiv, 2025

Authors: Wang, Tianrui; Ge, Meng; Gong, Cheng; Qiang, Chunyu; Wang, Haoyu; Huang, Zikang; Jiang, Yu; Wang, Xiaobao; Chen, Xie; Wang, Longbiao; Dang, Jianwu. Affiliations: Tianjin Key Laboratory of Cognitive Computing and Application, College of Intelligence and Computing, Tianjin University, Tianjin, China; Guangdong Laboratory of Artificial Intelligence and Digital Economy, Guangdong, China; China Telecom, Beijing, China; MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai, China; Co. Ltd, Tianjin, China; Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Guangdong, China

Recently, emotional speech generation and speaker cloning have garnered significant interest in text-to-speech (TTS). With the open-sourcing of codec language TTS models trained on massive datasets with large-scale parameters, adapting these general pre-trained TTS models to generate speech with specific emotional expressions and target speaker characteristics has become a topic of great attention. Common approaches, such as full and adapter-based fine-tuning, often overlook the specific contributions of model parameters to emotion and speaker control. Treating all parameters uniformly during fine-tuning, especially when the target data has limited content diversity compared to the pre-training corpus, results in slow training speed and an increased risk of catastrophic forgetting. To address these challenges, we propose a characteristic-specific partial fine-tuning strategy, short as CSP-FT. First, we use a weighted-sum approach to analyze the contributions of different Transformer layers in a pre-trained codec language TTS model for emotion and speaker control in the generated speech. We then selectively fine-tune the layers with the highest and lowest characteristic-specific contributions to generate speech with target emotional expression and speaker identity. Experimental results demonstrate that our method achieves performance comparable to, or even surpassing, full fine-tuning in generating speech with specific emotional expressions and speaker identities. Additionally, CSP-FT delivers approximately 2× faster training speeds, fine-tunes only around 8% of parameters, and significantly reduces catastrophic forgetting. Furthermore, we show that codec language TTS models perform competitively with self-supervised models in speaker identification and emotion classification tasks, offering valuable insights for developing universal speech processing models. © 2025, CC BY.
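The layer-selection step described in the abstract (fine-tune only the layers with the highest and lowest contributions) can be sketched as follows. This is an illustration only: the function name `select_layers`, the parameter `k`, and the example contribution values are hypothetical, and the abstract does not say how many layers are chosen at each extreme.

```python
def select_layers(contributions, k=1):
    """Return sorted indices of the k highest- and k lowest-contribution layers.

    contributions: one score per Transformer layer, e.g. the learned weights
                   of a weighted-sum probe (assumption).
    k:             layers to take from each extreme (hypothetical parameter).
    """
    if not 1 <= k <= len(contributions):
        raise ValueError("k must be between 1 and the number of layers")
    # Layer indices ordered from lowest to highest contribution.
    order = sorted(range(len(contributions)), key=lambda i: contributions[i])
    chosen = set(order[:k]) | set(order[-k:])
    return sorted(chosen)


# Made-up scores for a 5-layer model; only the selected layers would be
# left trainable, and all other layers would stay frozen.
trainable = select_layers([0.10, 0.40, 0.05, 0.30, 0.15], k=1)
```

In a training loop, the returned indices would decide which layers keep gradients enabled, which is how a small fraction of the parameters ends up being updated.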

Keywords: C (programming language)

An event summarizing algorithm based on the timeline relevance model in Sina Weibo

Science China (Information Sciences), 2018, vol. 61, no. 12, pp. 184-186

Authors: Kai LEI, Lizhu ZHANG, Ying LIU, Ying SHEN, Chenwei LIU, Qian YU, Weitao WENG. Affiliations: Shenzhen Key Lab for Cloud Computing Technology & Applications (SPCCTA), School of Electronics and Computer Engineering, Peking University

Dear editor, Depicting superior punctuality and originality, Weibo has become increasingly critical and influential in China for online information acquisition and sharing. However, very little research has studied Weibo to investigate event summarizing even though most of the published Weibos are event-driven. Besides, we observe that the existing methods are unsuitable for process-

Keywords: IDF; TF; HAC

A synchronous algorithm of network coding with hardware logic

Symposium on ICT and Energy Efficiency and Workshop on Information Theory and Security, CIICT 2012

Authors: Li, Jiang; Li, Yining; Li, Hui; Zhu, Zhipu; Zhang, Huayu; Chen, Fuxing. Affiliations: Shenzhen Key Lab. of Cloud Computing Technology and Application, Shenzhen Graduate School, Peking University, Shenzhen 518055, China

ISBN: (print) 9781849195478

This paper presents an efficient hardware prototype for network coding (NC). First, a packet synchronization mechanism is introduced to settle the problem of packet arriving mismatch between different incoming channels. Then a high-speed lookup-table-based circuit is designed to perform dot product over Galois Field, which forms the basic calculation unit of NC operation. Taking the speed advantage of FPGA hardware, this prototype is able to perform network coding operations within several hundred nanoseconds. Thus further studies and emulations on NC are able to be carried out upon this platform in the real network scenario.
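The lookup-table dot product over a Galois field mentioned in the abstract can be illustrated in software. This sketch is not the paper's circuit: the abstract does not state the field size, so GF(2^8) with the primitive polynomial 0x11D is assumed here, and the names `build_tables`, `gf_mul`, and `gf_dot` are hypothetical.

```python
PRIM_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1 (assumed field polynomial)


def build_tables():
    """Build exponent and logarithm tables for GF(2^8) with generator 2."""
    exp = [0] * 510  # doubled so that LOG[a] + LOG[b] never needs a modulo
    log = [0] * 256
    x = 1
    for i in range(255):
        exp[i] = x
        log[x] = i
        x <<= 1            # multiply by the generator
        if x & 0x100:      # reduce when the degree reaches 8
            x ^= PRIM_POLY
    for i in range(255, 510):
        exp[i] = exp[i - 255]
    return exp, log


EXP, LOG = build_tables()


def gf_mul(a, b):
    """Multiply two field elements using only table lookups and one addition."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def gf_dot(coeffs, symbols):
    """Dot product over GF(2^8); field addition is bitwise XOR."""
    if len(coeffs) != len(symbols):
        raise ValueError("vectors must have the same length")
    acc = 0
    for c, s in zip(coeffs, symbols):
        acc ^= gf_mul(c, s)
    return acc


# One coded byte formed from two source bytes and two coding coefficients.
coded = gf_dot([2, 3], [4, 5])
```

Replacing multiplication with table lookups is what makes the operation cheap to implement in hardware: each product costs two reads, one addition, and one more read, and the accumulation is a plain XOR.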

Keywords: Network coding

Query-focused multi-document summarization based on query-sensitive feature space

21st ACM International Conference on Information and Knowledge Management, CIKM 2012

Authors: Yin, Wenpeng; Pei, Yulong; Zhang, Fan; Huang, Lian'en. Affiliations: Shenzhen Key Lab. for Cloud Computing Technology and Application, Peking University Shenzhen Graduate School, Shenzhen 518055, China

ISBN: (print) 9781450311564

Query-oriented relevance, information richness and novelty are important requirements in query-focused summarization, which, to a considerable extent, determine the summary quality. Previous work either rarely took into account all above demands simultaneously or dealt with part of them in the dynamic process of choosing sentences to generate a summary. In this paper, we propose a novel approach that integrates all these requirements skillfully by treating them as sentence features, making that the finally generated summary could fully reflect the combinational effect of these properties. Experimental results on the DUC2005 and DUC2006 datasets demonstrate the effectiveness of our approach. © 2012 ACM.

Keywords: Vector spaces

A novel biased diversity ranking model for Query-oriented multidocument summarization

2013 International Conference on Vehicle and Mechanical Engineering and Information technology, VMEIT 2013

Authors: Lei, Kai; Zeng, Yi Fan. Affiliations: The Shenzhen Key Lab for Cloud Computing Technology and Application, Peking University Shenzhen Graduate School, Shenzhen, Guangdong 518055, China

ISBN: (print) 9783037858202

Query-oriented multi-document summarization (QMDS) attempts to generate a concise piece of text by extracting sentences from a target document collection, with the aim of not only conveying the key content of that corpus but also satisfying the information needs expressed by that *** to its great applicable value, QMDS has been intensively studied in recent decades. Three properties are considered crucial for a good summary: relevance, prestige and low redundancy (or so-called diversity). Unfortunately, most existing work either disregarded the concern of diversity or handled it with non-optimized heuristics, usually based on greedy sentence selection. Inspired by the manifold-ranking process, which deals with query-biased prestige, and the DivRank algorithm, which captures query-independent diversity ranking, we propose in this paper a novel biased diversity ranking model, named ManifoldDivRank, for query-sensitive summarization tasks. The top-ranked sentences discovered by our algorithm not only enjoy high query-oriented prestige but, more importantly, are dissimilar to each other. Experimental results on the DUC2005 and DUC2006 benchmark data sets demonstrate the effectiveness of our proposal. © (2013) Trans Tech Publications, Switzerland.

Keywords: Conveying

Adaptive Backdoor Attacks with Reasonable Constraints on Graph Neural Networks

arXiv, 2025

Authors: Dong, Xuewen; Li, Jiachen; Li, Shujun; You, Zhichao; Qu, Qiang; Kholodov, Yaroslav; Shen, Yulong. Affiliations: School of Computer Science and Technology, Xidian University; The Engineering Research Center of Blockchain Technology Application and Evaluation, Ministry of Education, China; The Shaanxi Key Laboratory of Blockchain and Secure Computing, Xi’an 710071, China; University of Kent, Canterbury CT2 7NP, United Kingdom; Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, China; Intelligent Transportation Systems Lab, Innopolis University, Innopolis, Russia; School of Computer Science and Technology, Xidian University, China; The Shaanxi Key Laboratory of Network and System Security, Xi’an 710071, China

Keywords: Graph neural networks

Prologue: Unified polymorphic routing towards flexible architecture of reconfigurable infrastructure

9th International ICST Conference on Testbeds and Research Infrastructures for the Development of Networks and Communities, TridentCom 2014

Authors: Pan, Kai; Li, Hui; Liu, Weiyang; Zhu, Zhipu; Chen, Fuxing; Zhu, Bing. Affiliations: Shenzhen Engineering Lab of Converged Networks Technology; Shenzhen Key Lab of Cloud Computing Technology and Application, Peking University Shenzhen Graduate School, Shenzhen 518055, China

ISBN: (print) 9783319133256

Today’s Internet architecture was designed and proposed in the 60s and 70s with the intention to interconnect several computing resources across a geographically distributed user group. With the advent of substantially various Internet businesses, traditional Internet is increasingly powerless to satisfy the unprecedented demands. This paper probed the polymorphic routing prototype based on proposed Flexible Architecture of Reconfigurable Infrastructure (FARI) which attempts to emerge as a clean-slate revolution of future Internet and resorts to centralized control manner. Routers in FARI were reconfigurable to adapt to different businesses in terms of identifier type. Moreover, a preliminary framework of FARI is proposed in the end of the article. © Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2014.

Keywords: Reconfigurable architectures
