ISBN (digital): 9798350368741
ISBN (print): 9798350368758
Despite the tremendous success of automatic speech recognition (ASR) since the introduction of deep learning, its performance is still unsatisfactory in many real-world multi-talker scenarios. Speaker separation excels at separating individual talkers but, as a frontend, it introduces processing artifacts that degrade an ASR backend trained on clean speech. As a result, mainstream robust ASR systems train on noisy speech to avoid processing artifacts. In this work, we propose to decouple the training of the multi-channel speaker separation frontend and the ASR backend, with the latter trained only on clean speech. On SMS-WSJ, the proposed approach achieves a word error rate (WER) of 5.74%, outperforming the previous best by 14.3%. Furthermore, on recorded LibriCSS, we achieve a speaker-attributed WER of 3.86%, outperforming the previous best system trained on the same data by 24.8%. These state-of-the-art results suggest that decoupling speech separation and recognition is a potentially effective approach to robust ASR.
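As a rough illustration of the metric reported above, the sketch below computes WER as the word-level edit distance divided by the reference length, plus a helper for a relative reduction. The function names are hypothetical and not taken from the paper's tooling, and reading the reported 14.3% and 24.8% gains as relative reductions is an assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, proposed: float) -> float:
    """Fraction by which `proposed` lowers `baseline` (assumed meaning of 'by X%')."""
    return (baseline - proposed) / baseline
```

For speaker-attributed WER, the same distance would be computed per speaker after assigning each hypothesis word to a talker; that assignment step is not sketched here.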
ISBN (digital): 9798331506940
ISBN (print): 9798331506957
The accumulation of large-scale data in recommendation systems can significantly increase resource consumption during model training. In particular, when redundant data exists, the model may overfit because it evaluates test data based on similar data already seen during training. To address resource consumption and overfitting, this study proposes a lightweight similarity-based algorithm for reducing interaction data. The proposed algorithm decomposes the interaction data using matrix factorization to derive a user matrix and then computes the cosine similarity between users. For each pair of users whose similarity exceeds a given threshold, the data of the user with fewer items is removed. Experiments were conducted to analyze the impact of the proposed algorithm on recommendation performance and training time after data reduction. Furthermore, the algorithm's effectiveness was evaluated through performance comparisons with existing data reduction methods. The primary contribution of this study is a lightweight similarity-based approach that focuses on eliminating redundant user data in recommendation systems, thereby preventing overfitting and minimizing resource consumption.
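The reduction step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the user latent vectors have already been obtained from some matrix factorization, and the names `users_to_drop`, `user_factors`, `interactions`, the default threshold, the tie-breaking rule, and the choice to skip users that were already dropped are all assumptions.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def users_to_drop(user_factors, interactions, threshold=0.95):
    """Return the users whose data would be removed.

    user_factors: dict user -> latent vector (assumed output of matrix factorization)
    interactions: dict user -> set of item ids the user interacted with
    """
    dropped = set()
    for a, b in combinations(sorted(user_factors), 2):
        if a in dropped or b in dropped:
            continue  # assumption: a removed user no longer takes part in comparisons
        if cosine(user_factors[a], user_factors[b]) > threshold:
            # Remove the user with fewer items; on a tie, remove b (assumption).
            fewer = a if len(interactions[a]) < len(interactions[b]) else b
            dropped.add(fewer)
    return dropped


user_factors = {"u1": [1.0, 0.0], "u2": [0.99, 0.05], "u3": [0.0, 1.0]}
interactions = {"u1": {"i1", "i2", "i3"}, "u2": {"i1"}, "u3": {"i4", "i5"}}
dropped = users_to_drop(user_factors, interactions)
reduced = {u: items for u, items in interactions.items() if u not in dropped}
```

The pairwise loop is quadratic in the number of users, so a practical version would likely need blocking or approximate nearest-neighbor search; the abstract does not say how the authors handle this.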
In metropolitan settings, efficient trash disposal is essential because of growing public health concerns and the potential for disease spread from improperly handled rubbish. By combining IoT and image recognition te...
ISBN (digital): 9798331529246
ISBN (print): 9798331529253
In recent years, assessing the quality of college students' Ideological and Political Education (IPE) has played a vital role in enabling educators to pinpoint areas for improvement and gauge the effectiveness of education, thereby enhancing student learning outcomes. Traditional approaches to evaluating IPE face several challenges, including an inability to handle complex data and a lack of real-time assessment. Therefore, this research proposes Artificial Neural Networks (ANN) for evaluating IPE. First, an evaluation system for IPE is developed by integrating multitask learning and softmax regression to ensure accurate assessments. Then, an ANN is designed to evaluate IPE by learning complex relationships between input variables and output predictions. The ANN produces a comprehensive evaluation based on criteria such as educational effectiveness, campus style, and social reputation, providing a robust assessment of IPE. The proposed ANN attained better results in terms of Mean Percentage Error Ratio (MPER) (90%) than the existing Radial Basis Function Neural Network (RBFNN).
ISBN (digital): 9798331529246
ISBN (print): 9798331529253
Accurate stock price forecasting is pivotal in financial markets and essential for informed investment decisions and effective economic strategies. Traditional models such as CAPM, Black-Scholes, ARIMA, and GARCH often struggle to capture the complex, non-linear dynamics inherent in market data. This study presents a comparative analysis of deep learning-based approaches to predicting trends in financial time series. We propose a novel methodology that transforms raw financial data into image representations using the Markov Transition Field (MTF) alongside a specialized image creation algorithm tailored for financial analysis. These images are then classified into Buy, Sell, and Hold categories. The approach is evaluated on a dataset comprising twenty stocks from the Indian financial sector. Our experiments yield an F1 score of 73%. This result suggests that the proposed deep learning-based technique achieves promising accuracy and highlights its potential advantages over traditional forecasting methods in capturing the behavior of financial markets.
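The MTF encoding mentioned above is usually defined as the Markov Transition Field: the series is quantized into bins, a first-order transition matrix between bins is estimated, and pixel (i, j) of the image holds the transition probability from the bin of sample i to the bin of sample j. The sketch below follows that standard definition with quantile bins; the function name, the bin count, and the rank-based binning are assumptions, and the authors' additional image creation algorithm is not reproduced.

```python
from bisect import bisect_left


def markov_transition_field(series, n_bins=4):
    """Return an n x n list of lists encoding `series` as a Markov Transition Field."""
    n = len(series)
    if n < 2:
        raise ValueError("series needs at least two samples")
    # Quantile binning by rank: equal values always land in the same bin.
    ordered = sorted(series)
    bins = [min(n_bins - 1, bisect_left(ordered, x) * n_bins // n) for x in series]
    # Count first-order transitions between consecutive samples.
    counts = [[0] * n_bins for _ in range(n_bins)]
    for src, dst in zip(bins, bins[1:]):
        counts[src][dst] += 1
    # Row-normalize counts into transition probabilities.
    transition = []
    for row in counts:
        total = sum(row)
        transition.append([c / total if total else 0.0 for c in row])
    # Pixel (i, j) is the probability of moving from bin(x_i) to bin(x_j).
    return [[transition[bi][bj] for bj in bins] for bi in bins]


field = markov_transition_field([1, 2, 3, 4], n_bins=2)
```

The resulting matrix can be rendered as a grayscale image and passed to an image classifier; for long series it is commonly downsampled by block averaging first.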
ISBN (digital): 9798331531935
ISBN (print): 9798331531942
With the development of new power systems, fast and accurate detection of False Data Injection Attacks (FDIA) is crucial for the secure operation of power grids. Existing FDIA detection models based on spatiotemporal correlations have poor feature extraction capabilities and suffer from feature shift, because the inherent spatiotemporal correlations in the measurement data are disrupted. To address this, we propose a detection model based on adaptive fusion of spatiotemporal features. First, a Graph Convolutional Network (GCN) extracts static spatial features from the power grid topology, while a Graph Attention Network (GAT) captures dynamic spatial features. Next, a Long Short-Term Memory (LSTM) network analyzes temporal variations and extracts temporal features. Finally, the extracted spatiotemporal information is projected into a common feature space for alignment, and the features are fused using a bilinear attention mechanism. The fused features are then used for FDIA detection. Experimental results show that the proposed model outperforms existing detection models in detection accuracy and robustness.
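To make the static spatial step above concrete, the sketch below shows one graph convolution in the widely used form ReLU(D^{-1/2}(A + I)D^{-1/2} H W), applied to a toy two-bus topology. It is only an illustration of a generic GCN layer: the abstract does not specify the layer variant, and the function names, the toy adjacency matrix, and the weights are assumptions. The GAT, LSTM, and bilinear attention stages are not sketched.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gcn_layer(adj, features, weight):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 H W).

    adj:      n x n adjacency matrix of the grid topology (zero diagonal assumed)
    features: n x f node feature matrix H (e.g. bus measurements)
    weight:   f x k weight matrix W (learned in practice, fixed here)
    """
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in a_hat]
    # Symmetric normalization by node degree.
    norm = [[a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]
    out = matmul(matmul(norm, features), weight)
    return [[max(0.0, v) for v in row] for row in out]


# Toy example: two connected buses with one measurement each.
hidden = gcn_layer([[0, 1], [1, 0]], [[1.0], [3.0]], [[1.0]])
```

In this toy case each bus ends up with the average of its own and its neighbor's measurement, which shows how the layer mixes information along the topology.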
ISBN (digital): 9798331518752
ISBN (print): 9798331518769
The paper deals with the reaction of a vehicle operator's body to external vibration and with the development of a system for monitoring the early stages of vibration effects. The choice of the wavelet transform as the main signal processing method is justified. Zones of the body's reaction to external overloads are identified, and the use of Gaussian mother wavelets, other than the Morlet wavelet, for electroencephalogram analysis under vibration exposure of the vehicle operator is evaluated. Experimental work was carried out on the EGV 10–100 stand of the vibromechanics laboratory of IMASH RAS.
ISBN (print): 9798400712203
Predicting the histological grading and genetic biomarkers of gliomas using medical imaging and deep learning is a highly challenging task that is important for personalized treatment and survival prediction. Previous studies have primarily focused on predicting the status of one or two genes. For the first time, this paper proposes a new end-to-end network framework, GIResNet, to simultaneously predict the status of five genotypes. Our model, based on the ResNet architecture, optimizes channel and spatial scales by concatenating T1C and T2F images. Compared to the original ResNet, our approach demonstrates higher lesion attention and improvements in various metrics, showing strong potential for practical applications.
Adaptive real-time applications that respond to users’ cognitive and emotional states have emerged as a critical research frontier in brain-computer interface (BCI) technology. However, existing solutions often fail ...
ISBN (digital): 9798331519582
ISBN (print): 9798331519599
Brain cancer remains a major health challenge, particularly in resource-constrained settings like India, where it is the 10th leading cause of cancer-related deaths. Early detection is vital to improving treatment outcomes, yet traditional methods relying on manual MRI interpretation can be time-consuming, expensive, and highly susceptible to error. This study presents a deep learning-based approach using Convolutional Neural Networks (CNNs). The proposed system uses advanced data preprocessing techniques, including Z-score normalization, skull stripping, and augmentation, to optimize MRI datasets. A robust CNN model was developed and trained on a diverse dataset comprising glioma, meningioma, pituitary, and non-tumor cases. This model achieved training, validation, and test accuracies of 99.95%, 99.8%, and 99.3%, respectively, along with high accuracy, recall, and F1 scores. Evaluation metrics such as ROC-AUC and confusion matrix analysis showed the model's accuracy in differentiating tumor types and stages.
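Of the preprocessing steps listed above, Z-score normalization is the simplest to illustrate: each intensity is shifted by the mean and divided by the standard deviation so that scans from different scanners share a comparable scale. The sketch below works on a flat list of voxel intensities; the function name and the small `eps` guard are assumptions, and whether the authors normalize per image or over the brain mask only is not stated.

```python
import math


def z_score_normalize(intensities, eps=1e-8):
    """Return intensities rescaled to zero mean and (approximately) unit variance."""
    n = len(intensities)
    if n == 0:
        raise ValueError("intensities must not be empty")
    mean = sum(intensities) / n
    variance = sum((x - mean) ** 2 for x in intensities) / n
    std = math.sqrt(variance)
    # eps avoids division by zero for constant images (assumption, not from the source).
    return [(x - mean) / (std + eps) for x in intensities]


normalized = z_score_normalize([0.0, 10.0, 20.0])
```

In an MRI pipeline the same computation would typically run after skull stripping so that background voxels do not dominate the mean and standard deviation.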