Search Results - Inner Mongolia University Library

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Junjie Li, Ruijie Tao, Zexu Pan, Meng Ge, Shuai Wang, Haizhou Li. Affiliations: Shenzhen Research Institute of Big Data, Shenzhen, China; Department of Electrical and Computer Engineering, National University of Singapore, Singapore; School of Data Science, The Chinese University of Hong Kong, Shenzhen, China

Target speaker extraction aims to extract the speech of a specific speaker from a multi-talker mixture as specified by an auxiliary reference. Most studies focus on the scenario where the target speech is highly overlapped with the interfering speech. However, this scenario only accounts for a small percentage of real-world conversations. In this paper, we aim at the sparsely overlapped scenarios in which the auxiliary reference needs to perform two tasks simultaneously: detect the activity of the target speaker and disentangle the active speech from any interfering speech. We propose an audio-visual speaker extraction model named ActiveExtract, which leverages speaking activity from audio-visual active speaker detection (ASD). The ASD directly provides the frame-level activity of the target speaker, while its intermediate feature representation is trained to discriminate speech-lip synchronization that could be used for speaker disentanglement. Experimental results show our model outperforms baselines across various overlapping ratios, achieving an average improvement of more than 4 dB in terms of SI-SNR.
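For reference, SI-SNR (scale-invariant signal-to-noise ratio), the metric quoted in the abstract above, is commonly computed by projecting the estimated signal onto the reference signal and comparing the energy of that projection with the energy of the residual. The following is a minimal sketch of that standard definition in plain Python. The function name `si_snr` and the `eps` stabilizer are illustrative assumptions and are not taken from the paper.

```python
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length sample sequences.

    Illustrative helper (hypothetical name); follows the common textbook
    definition rather than any specific paper's implementation.
    """
    if len(estimate) != len(target):
        raise ValueError("signals must have the same length")
    n = len(target)
    # Remove the mean of each signal so a DC offset does not affect the score.
    est_mean = sum(estimate) / n
    tgt_mean = sum(target) / n
    est = [x - est_mean for x in estimate]
    tgt = [x - tgt_mean for x in target]
    # Project the estimate onto the target: this is the "signal" component.
    dot = sum(e * t for e, t in zip(est, tgt))
    tgt_energy = sum(t * t for t in tgt) + eps
    scale = dot / tgt_energy
    s_target = [scale * t for t in tgt]
    # Whatever remains after the projection is treated as noise.
    e_noise = [e - s for e, s in zip(est, s_target)]
    signal_energy = sum(s * s for s in s_target)
    noise_energy = sum(x * x for x in e_noise) + eps
    return 10.0 * math.log10(signal_energy / noise_energy + eps)


# Usage: a clean tone versus the same tone with a small interfering component.
clean = [math.sin(0.1 * i) for i in range(200)]
noisy = [c + 0.1 * math.cos(0.37 * i) for i, c in enumerate(clean)]
print(round(si_snr(noisy, clean), 2))
```

Because the estimate is projected onto the reference before the ratio is taken, rescaling the estimate leaves the score essentially unchanged, which is why the metric is called scale-invariant.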

Multi-Modal Data Fusion Approach Using the Structured and Sparse Principal Component Analysis (MMSSCA)

2023 IEEE International Conference on Paradigm Shift in Information Technologies with Innovative Applications in Global Scenario, ICPSITIAGS 2023

Authors: Kandpal, Jyoti; Jonitha, A.; Sivakumar, Janaki; Ray, Rejon Kumar. Affiliations: School of Computing, Graphic Era Hill University, Dehradun, Uttarakhand, India; Universiti Malaysia, Faculty of Data Science & Computing, Kelantan, Malaysia; Global College of Engineering and Technology, Computer Science and Creative Technologies, Muscat, Oman; Gannon University, Department of Business Analytics, Erie, PA, United States

ISBN: (print) 9798350329773

Many kinds of data can be gathered and analyzed, including pictures, videos, text, speech, music, and other sounds. Video content, for example, generally includes at least some audible sound and visual imagery, and often also useful text such as titles or subtitles, which may be in a language different from that of the speech. Each distinct acquisition strategy defines a 'modality'. As natural phenomena tend to be quite complex, it is unusual for a single modality to provide an exhaustive understanding of the phenomenon of interest. Combining and interpreting useful data within one modality or across modalities, supporting human interaction with multi-modal data fusion, and interpreting the fused data accurately and in depth are central to the research challenge of multimodal data analysis. We contend that many of these challenges are shared by researchers in different fields. This paper addresses both the 'why' and the 'how' of data fusion. Examples from science and technology motivate the first question, which is then supported by mathematical model frameworks that demonstrate the utility of data fusion. To address the second question, we introduce 'diversity' as a central concept and discuss several data-driven solutions based on matrix and tensor decompositions, highlighting how they account for diversity in the underlying data sets. To address the aforementioned issues, this research suggests a novel approach for modeling multi-modal data fusion using sparse techniques and structural analysis. © 2023 IEEE.

Keywords: Modal analysis

Green Energy AI Cloud: Intelligent Community-Based Electricity Management System

8th IEEE International Conference on Energy Internet, ICEI 2024

Authors: Mehta, Yukta; Lo, Vincent; Mehta, Vijen; Agarwal, Kunal; Madabathula, Charan Teja; Gao, Jerry. Affiliations: San Jose State University, Department of Applied Data Science, San Jose, CA, United States; Evergreen Valley High School, Department of Analysis, San Jose, CA, United States; San Jose State University, Department of Computer Engineering, San Jose, CA, United States

ISBN: (print) 9798331523558

According to the U.S. Energy Information Administration, 60 percent of the world's electricity is generated from fossil fuels, 18 percent from nuclear power, and only 21 percent from green energy resources. These statistics highlight the pressing need to transition to renewable energy for a sustainable future. However, a major challenge that arises from this increased demand for green energy is the risk of electricity shortage. Overcoming this challenge requires a shift from conventional energy management to an AI-based electricity management system that can more efficiently balance supply and demand. In this paper, we have developed a comprehensive cloud-based Green AI service Management System to accommodate three important aspects of green energy services: electricity consumption, electricity generation, and shortage analysis. In this system, we propose using a community-based model that manages and analyzes multiple buildings' energy usage, allowing the model to be trained further via distributed and aggregated training. The models developed for these components use three approaches: deep learning, machine learning, and reinforcement algorithms, and achieved a shortage analysis accuracy of 98.2 percent. The model is deployed on the cloud and results are accessible from the user interface. Overall, we provide a unique Green AI solution to the market using a system that can handle various aspects of green energy and gives better accuracy than existing systems. © 2024 IEEE.

Keywords: Consumption prediction; Electricity Management System; Electricity shortage forecasting; Green AI; Usage prediction

SELF-TRANSCRIBER: FEW-SHOT LYRICS TRANSCRIPTION WITH SELF-TRAINING

arXiv, 2022

Authors: Gao, Xiaoxue; Yue, Xianghu; Li, Haizhou. Affiliations: Department of Electrical and Computer Engineering, National University of Singapore, Singapore; Shenzhen Research Institute of Big Data, Shenzhen, China; School of Data Science, The Chinese University of Hong Kong, Shenzhen, China

Current lyrics transcription approaches rely heavily on supervised learning with labeled data, but such data are scarce and manual labeling of singing is expensive. How to benefit from unlabeled data and alleviate the limited-data problem has not been explored for lyrics transcription. We propose the first semi-supervised lyrics transcription paradigm, Self-Transcriber, which leverages unlabeled data using self-training with noisy student augmentation. We attempt to demonstrate the possibility of lyrics transcription with a small amount of labeled data. Self-Transcriber generates pseudo labels for the unlabeled singing using a teacher model, and adds the pseudo-labeled data to the labeled data to update the student model with both self-training and supervised training losses. This work closes the gap between supervised and semi-supervised learning and opens the door to few-shot learning of lyrics transcription. Our experiments show that our approach, using only 12.7 hours of labeled data, achieves competitive performance compared with supervised approaches trained on 149.1 hours of labeled data for lyrics transcription. © 2022, CC BY-NC-ND.

Keywords: Music

PIAVE: A Pose-Invariant Audio-Visual Speaker Extraction Network

arXiv, 2023

Authors: Liu, Qinghua; Ge, Meng; Wu, Zhizheng; Li, Haizhou. Affiliations: Shenzhen Research Institute of Big Data, Shenzhen, China; School of Data Science, The Chinese University of Hong Kong, Shenzhen, China; Department of Electrical and Computer Engineering, National University of Singapore, Singapore

It is common in everyday spoken communication that we look at the turning head of a talker to listen to his/her voice. Humans see the talker to listen better, and so do machines. However, previous studies on audio-visual speaker extraction have not effectively handled the varying talking face. This paper studies how to take full advantage of the varying talking face. We propose a Pose-Invariant Audio-Visual Speaker Extraction Network (PIAVE) that incorporates an additional pose-invariant view to improve audio-visual speaker extraction. Specifically, we generate the pose-invariant view from each original pose orientation, which enables the model to receive a consistent frontal view of the talker regardless of his/her head pose, thereby forming a multi-view visual input for the speaker. Experiments on the multi-view MEAD and in-the-wild LRS3 datasets demonstrate that PIAVE outperforms the state-of-the-art and is more robust to pose variations. © 2023, CC BY.

Keywords: Extraction

An Algorithm for Describing the Convex and Concave Shape of Protein Surface

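2018 International Conference of Pioneering Computer Scientists, Engineers and Educators (formerly the International Conference of Young Computer Scientists, Engineers and Educators)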

Authors: Wei Wang, Keliang Li, Hehe Lv, Lin Sun, Hongjun Zhang, Jinling Shi, Shiguang Zhang, Yun Zhou, Yuan Zhao, Jingjing Xv. Affiliations: Department of Computer Science and Technology, College of Computer and Information Engineering, Henan Normal University; Laboratory of Computation Intelligence and Information Processing, Engineering Technology Research Center for Computing Intelligence and Data Mining; School of Aviation Engineering, Anyang University; School of International Education, Xuchang University

Protein surface plays a key role in many biological *** proteins participate in the life activities of cells by binding to other proteins or ligand molecules. Analyzing the shape of the protein surface is an important way to study protein structure and function. Based on the CX algorithm and the 2D fingerprint-based method, we propose the FCX method to identify the morphology of bulges and depressions on the protein surface. The experimental results show that the FCX algorithm gives better results than the CX algorithm. Measured by the Pearson correlation coefficient with solvent accessibility and B-factor, FCX values correlate more strongly with the convex and concave features than CX values do. This result shows that the FCX algorithm can describe the shape of the protein surface residues more accurately than the CX algorithm.

Person Re-Identification with Model-Contrastive Federated Learning in Edge-Cloud Environment

Intelligent Automation & Soft Computing, 2023, Vol. 38, No. 10, pp. 35-55

Authors: Baixuan Tang, Xiaolong Xu, Fei Dai, Song Wang. Affiliations: School of Computer and Software, Nanjing University of Information Science and Technology, Nanjing 210044, China; Jiangsu Collaborative Innovation Center of Atmospheric Environment and Equipment Technology (CICAEET), Nanjing University of Information Science and Technology, Nanjing 210044, China; School of Big Data and Intelligence Engineering, Southwest Forestry University, Kunming 650233, China; Nanjing Meteorological Service Center, Nanjing Meteorological Bureau, Nanjing 210019, China

Person re-identification (ReID) aims to recognize the same person in multiple images from different camera *** person ReID models are time-consuming and resource-intensive; thus, cloud computing is an appropriate model training ***, the required massive personal data for training contain private information with a significant risk of data leakage in cloud environments, leading to significant communication *** This paper proposes a federated person ReID method with model-contrastive learning (MOON) in an edge-cloud environment, named ***, based on federated partial averaging, MOON warmup is added to correct the local training of individual edge servers and improve the model's effectiveness by calculating and back-propagating a model-contrastive loss, which represents the similarity between local and global *** In addition, we propose a lightweight person ReID network, named multi-branch combined depth space network (MB-CDNet), to reduce the computing resource usage of the edge device when training and testing the person ReID *** MB-CDNet is a multi-branch version of combined depth space network (CDNet). We add a part branch and a global branch on the basis of CDNet and introduce an attention pyramid to improve the performance of the *** The experimental results on open-access person ReID datasets demonstrate that FRM achieves better performance than existing baselines.

Keywords: Person re-identification; federated learning; contrastive learning
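The model-contrastive loss named in the abstract above comes from the MOON family of federated learning methods. In its usual form it pulls the current local model's representation toward the global model's representation and pushes it away from the previous local model's representation. The following is a minimal sketch of that usual form in plain Python for a single sample. The function names, the three input vectors, and the default temperature are illustrative assumptions, and the warmup variant described in the abstract is not reproduced here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def model_contrastive_loss(z_local, z_global, z_prev, temperature=0.5):
    """Contrastive loss for one sample (hypothetical helper name).

    z_local  : representation from the local model being trained
    z_global : representation from the current global model (positive pair)
    z_prev   : representation from the previous local model (negative pair)
    """
    pos = math.exp(cosine(z_local, z_global) / temperature)
    neg = math.exp(cosine(z_local, z_prev) / temperature)
    # Low loss when z_local is closer to z_global than to z_prev.
    return -math.log(pos / (pos + neg))


# Usage: the loss is smaller when the local representation agrees with the global one.
aligned = model_contrastive_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
drifted = model_contrastive_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0])
print(aligned < drifted)
```

In training, a term of this kind is typically added to the ordinary supervised loss with a weighting coefficient, so that local updates on an edge server do not drift too far from the global model.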

Generative AI-enabled Blockchain Networks: Fundamentals, Applications, and Case Study

arXiv, 2024

Authors: Nguyen, Cong T.; Liu, Yinqiu; Du, Hongyang; Hoang, Dinh Thai; Niyato, Dusit; Nguyen, Diep N.; Mao, Shiwen. Affiliations: The Institute of Fundamental and Applied Sciences, Duy Tan University, Viet Nam; The School of Computer Science and Engineering, Nanyang Technological University, Singapore; The School of Electrical and Data Engineering, University of Technology Sydney, Australia; The Department of Electrical and Computer Engineering, Auburn University, Auburn, United States

Generative Artificial Intelligence (GAI) has recently emerged as a promising solution to address critical challenges of blockchain technology, including scalability, security, privacy, and interoperability. In this paper, we first introduce GAI techniques, outline their applications, and discuss existing solutions for integrating GAI into blockchains. Then, we discuss emerging solutions that demonstrate the effectiveness of GAI in addressing various challenges of blockchain, such as detecting unknown blockchain attacks and smart contract vulnerabilities, designing key secret sharing schemes, and enhancing privacy. Moreover, we present a case study to demonstrate that GAI, specifically the generative diffusion model, can be employed to optimize blockchain network performance metrics. Experimental results clearly show that, compared to a baseline traditional AI approach, the proposed generative diffusion model approach can converge faster, achieve higher rewards, and significantly improve the throughput and latency of the blockchain network. Additionally, we highlight future research directions for GAI in blockchain applications, including personalized GAI-enabled blockchains, GAI-blockchain synergy, and privacy and security considerations within blockchain ecosystems. © 2024, CC BY-NC-SA.

Keywords: Blockchain

Supervised hashing with recurrent scaling

3rd APWeb and WAIM Joint Conference on Web and Big data, APWeb-WAIM 2019

Authors: Fu, Xiyao; Bin, Yi; Wang, Zheng; Wei, Qin; Chen, Si. Affiliations: Guizhou Provincial Key Laboratory of Public Big Data, Guizhou University, Guiyang, Guizhou 550025, China; School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan 611731, China; Information Science Academy, China Electronics Technology Group Corporation, Beijing, China

ISBN: (print) 9783030260743

Learning to hash is a method that can deal with content-based information retrieval efficiently. Traditional learning-to-hash methods, however, lack the ability to map the generated hash codes to the high-level semantic space. Attributes, as a higher-level representation of visual data than features, have the potential to boost performance in deep learning. Utilizing attributes from visual data in deep learning to hash can link each bit of the hash code to a certain type of attribute, thereby giving the hash code an explicit explanation. This paper presents a novel framework, named Deep Recurrent Scaling Hashing (DRSH), to solve the traditional image retrieval problem. The hash codes generated by DRSH combine the outputs of each step of an enhanced LSTM with features generated by convolutional neural nets, and are learned through the images' attributes. This RNN is reformed to adjust the decorrelation of data flowing between each cell step, which not only lets the learning phase benefit from the ability of recurrent neural nets to learn with recurrent memory but also enables each hash bit to preserve distinct information. Experiments show that this framework achieves appreciable performance on major datasets and can also explain the meaning of hash codes based on attributes. © 2019, Springer Nature Switzerland AG.

Keywords: Long short-term memory

OptiFog: A Framework for Acquiring State Information and Predicting Resource Availability for Task Offloading in Cooperative Fog-Networks

IEEE Transactions on Services Computing, 2024, pp. 1-13

Authors: Alam, Mehbub; Ahmed, Nurzaman; Ghosh, Shyamal; Matam, Rakesh; Barbhuiya, Ferdous Ahmed. Affiliations: Department of Computer Science and Engineering, Indian Institute of Information Technology, Guwahati, India; Department of Computer Science, Dartmouth College, Hanover, USA; School of Data Science, Indian Institute of Science Education and Research, Thiruvananthapuram, India

The primary objective of fog computing is to minimize the reliance of IoT devices on the cloud by leveraging the resources of the fog network. Typically, IoT devices offload computation tasks to the fog to meet task requirements such as execution latency and computation cost. Selecting a fog node that meets these requirements is therefore a crucial challenge. To choose an optimal fog node, access to each node's resource availability information is essential. Existing approaches often assume that state information is available, or depend on a subset of it, to design mechanisms tailored to different task requirements. This paper proposes OptiFog, a cluster-based fog computing architecture for acquiring state information, followed by an optimal fog node selection and task offloading mechanism. Additionally, a continuous-time Markov chain based stochastic model for predicting resource availability on fog nodes is proposed. This model removes the need to frequently synchronize the resource availability status of fog nodes and allows up-to-date state information to be maintained. Extensive simulation results show that OptiFog lowers task execution latency considerably and schedules almost all tasks at the fog layer compared to the existing state-of-the-art. IEEE

Keywords: computer architecture
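To illustrate how a continuous-time Markov chain can predict resource availability without repeated synchronization, as the abstract above describes, the following minimal sketch uses the simplest possible case: a single resource that alternates between "available" and "busy". The two-state model, the function name `prob_available`, and the rate parameters are assumptions made for illustration only; the abstract does not specify OptiFog's actual state space or rates.

```python
import math


def prob_available(t, busy_rate, free_rate, available_now=True):
    """Probability that a resource is available t time units from now.

    Hypothetical two-state CTMC (not the paper's model):
      available -> busy      at rate busy_rate
      busy      -> available at rate free_rate
    Uses the closed-form transient solution of the two-state chain.
    """
    total = busy_rate + free_rate
    steady = free_rate / total       # long-run fraction of time spent available
    decay = math.exp(-total * t)     # how quickly the last observation is forgotten
    if available_now:
        return steady + (1.0 - steady) * decay
    return steady * (1.0 - decay)


# Usage: a node last observed as available; the prediction drifts toward
# the steady-state value as the observation ages.
for elapsed in (0.0, 0.5, 2.0):
    print(elapsed, round(prob_available(elapsed, busy_rate=2.0, free_rate=3.0), 4))
```

A scheduler could evaluate such a function from the last known state and the elapsed time instead of polling each node, which is the general idea behind replacing frequent status synchronization with a stochastic prediction.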
