Speaker extraction and diarization are two enabling techniques for real-world speech applications. Speaker extraction aims to extract a target speaker's voice from a speech mixture, while speaker diarization demarcates speech segments by speaker, annotating ‘who spoke when’. Previous studies have typically treated the two tasks independently. In practical applications, it is more meaningful to know ‘who spoke what and when’, which the two tasks capture together. The two tasks share a similar objective of disentangling speakers. Speaker extraction operates in the frequency domain, whereas diarization operates in the temporal domain. It is logical to believe that speaker activities obtained from speaker diarization can benefit speaker extraction, while the extracted speech offers more accurate speaker activity detection than the speech mixture. In this paper, we propose a unified model called Universal Speaker Extraction and Diarization (USED) to address output inconsistency and scenario mismatch issues. It is designed to manage speech mixtures with varying overlap ratios and a variable number of speakers. We show that the USED model significantly outperforms competitive baselines for speaker extraction and diarization tasks on the LibriMix and SparseLibriMix datasets. We further validate the diarization performance on CALLHOME, a dataset based on real recordings, and experimental results indicate that our model surpasses recently proposed approaches.
Chinese traditional opera (Xiqu) performers often experience skin problems due to the long-term use of heavy-metal-laden face paints. To explore the current skincare challenges encountered by Xiqu performers, we condu...
In Dhaka, the capital city of Bangladesh, various sources including vehicle emissions, industrial activities, brick kilns, building sites, and open rubbish burning contribute to the air pollution problem. Air quality is assessed with the Air Quality Index (AQI), which categorizes air quality based on pollutant concentration. In this study, we have built ARIMA, Auto-ARIMA, SARIMAX, and VAR models to predict the air quality of Dhaka. Unlike previous studies, we have utilized hourly air pollutant factors such as PM2.5, PM10, SO2, CO, NO2, and O3 to forecast air quality. Our novel approach enables us to predict the monthly and weekly air quality of Dhaka city. Our analysis reveals that the SARIMAX model, which takes into account seasonal patterns, trends, and external factors, is the most accurate in predicting Dhaka city’s air quality. The model’s prediction performance is assessed using statistical indicators such as mean absolute percentage error and root mean square error. The study highlights that the SARIMAX model could aid policymakers in evaluating the efficacy of air pollution control measures.
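The two error indicators named in this abstract have standard definitions, which can be sketched as below. This is a minimal illustration only: the function names and the sample values are hypothetical and are not taken from the study.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero, so that case is rejected.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical AQI readings and forecasts, for illustration only.
observed = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]
rmse_value = rmse(observed, forecast)
mape_value = mape(observed, forecast)
```

RMSE is expressed in the units of the index itself and penalizes large misses more heavily, while MAPE is scale-free, which is presumably why the two are reported together.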
Nowadays there is a great amount of data that can be used to train artificial intelligence systems for classification or prediction purposes. Although a large amount of data is publicly available, there is also very valuable data that is private and therefore cannot be shared without breaking data protection laws. For example, hospital data has great value, but it involves persons, so we must try to preserve their privacy rights. Furthermore, although it could be interesting to train a model with the data of only one entity (e.g., a hospital), it could have more value to train the model with the data of several entities. But since the data of each entity may not be shared, it is not possible to train a global model. In that sense, Federated Learning has emerged as a research field that deals with the training of complex models without the necessity to share data, thereby keeping the data private. In this contribution, we present a global conceptual analysis of the Federated Learning research field based on co-word networks. To do that, the field was delimited using an advanced query in Web of Science. The corpus contains a total of 2444 documents. As the main result, it should be highlighted that the Federated Learning research field is focused on six main global areas: telecommunications, privacy and security, computer architecture and data modeling, machine learning, and applications.
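The raw material of a co-word network is a count of how often two keywords appear in the same document. The sketch below shows only that counting step, under stated assumptions: the abstract does not describe its tooling or normalization, and the function name `coword_counts` and the sample keyword lists are hypothetical.

```python
from collections import Counter
from itertools import combinations


def coword_counts(documents):
    """Count keyword pairs that co-occur in the same document.

    `documents` is an iterable of keyword lists. Keywords are lowercased
    and de-duplicated per document; each pair is stored in sorted order
    so that (a, b) and (b, a) share one counter entry.
    """
    pair_counts = Counter()
    for keywords in documents:
        unique = sorted({k.lower() for k in keywords})
        for a, b in combinations(unique, 2):
            pair_counts[(a, b)] += 1
    return pair_counts


# Hypothetical keyword lists, for illustration only.
docs = [
    ["federated learning", "privacy", "edge computing"],
    ["federated learning", "privacy"],
    ["federated learning", "edge computing", "Privacy"],
]
counts = coword_counts(docs)
```

Each counted pair becomes a weighted edge between two keyword nodes. A full analysis would typically normalize these weights and cluster the resulting graph into thematic areas; those steps are outside this sketch.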
Event relation extraction is an important research direction in the field of information extraction. Compared with named entity recognition, entity relation extraction, and event extraction, event relation extraction ...
ISBN (digital): 9798350351118
ISBN (print): 9798350351125
The advent of millimeter wave (mm-Wave) technology in modern communication systems, including 5G networks, has brought about unprecedented data transmission speeds and bandwidths. However, environmental factors strongly affect mm-Wave signals, particularly in regions susceptible to dust and sand storms. Dust storms, characterized by high concentrations of suspended particles, cause significant signal attenuation and degradation as the incident wave is absorbed and scattered. This attenuation poses challenges to the reliability and performance of mm-Wave communication systems. Previous research used Mie theory to compute the specific attenuation due to dust storms because it provides a complete analytical solution to Maxwell’s equations compared to other analytical and numerical methods. However, the Mie scattering model lacks accuracy because it considers only the amplitude of the attenuation factor with respect to the dust and sand environment. This paper presents the development of predictive mathematical models designed to estimate mm-Wave signal degradation in dust and sand storm conditions. The models integrate key physical parameters such as dust particle size distribution, storm intensity, signal frequency, and atmospheric *** predictive model demonstrates significant accuracy in estimating signal attenuation by accounting for the signal phase shift through a complex attenuation factor. We demonstrate mathematically that dust and sand storms cause not only mm-Wave signal attenuation but also a significant signal phase shift. This complex attenuation factor provides valuable insights for network engineers to design and optimize mm-Wave communication systems in dust-prone environments. Comparative analysis with existing models underscores the proposed models’ enhanced predictive capability and flexibility in adapting to diverse dust storm *** research outcomes contribute to the ongoing efforts to improve mm-Wave communicat
The field of clinical natural language processing (NLP) can extract useful information from clinical text. Since 2017, the NLP field has shifted towards using pre-trained language models (PLMs), improving performance ...
ISBN (digital): 9798350317060
ISBN (print): 9798350317077
Biomedical Wireless Sensor Networks (BWSN) are a major technique in health care applications for providing quality of life at minimum cost. To ensure the quality of medical care provided to citizens, such networks should be integrated with existing network infrastructures. Quality-of-service requirements should be addressed during the development stages and monitored for the duration of network operation. The lifetime of the network is the most relevant feature for meeting quality-of-medical-care requirements. To increase the network lifetime, the remaining energy of each sensor node must be maintained, together with reliable communication links throughout the network. An energy-aware clustering scheme for biomedical wireless sensor networks (EACBWSN) is proposed to increase the network lifetime by minimizing energy consumption among sensor nodes while providing reliable communication links between them. The proposed algorithm efficiently prolongs the lifetime of the network when compared with existing algorithms.
The rapid evolution of artificial intelligence (AI) has shifted from static, data-driven models to dynamic systems capable of perceiving and interacting with real-world environments. Despite advancements in pattern re...
The supply chain is a thriving industry where numerous parties have different interests. Subsequently, the immense volume of data produced is difficult to audit. Some information can be lost or intentionally distorted...