Background: In this research, a novel algorithm is formulated through the combination of gradient and adaptive thresholding. A set of 5 X 5 convolution kernels were generated to determine the gradients in the four mai...
Audio deepfakes are highly realistic fake audio recordings produced by AI tools that clone human voices. Advances in text-to-speech (TTS) and voice conversion (VC) technologies have made it easier to create realistic synthetic and imitative speech, making audio deepfakes a common and potentially dangerous form of deception. Well-known people, such as politicians and celebrities, are often targeted: fake recordings make them appear to say controversial things, causing trouble on social media. Even children's voices are cloned to scam parents into ransom payments. Developing effective algorithms to distinguish deepfake audio from real audio is therefore critical to preventing such fraud. Various machine learning (ML) and deep learning (DL) techniques have been developed to identify audio deepfakes. However, most of these solutions are trained on datasets in English, Portuguese, French, and Spanish, which raises concerns about their accuracy for other languages. The main goal of the research presented in this paper is to evaluate the effectiveness of deep neural networks in detecting audio deepfakes in the Urdu language. Since no suitable Urdu audio dataset was available for this purpose, we created our own dataset (URFV) containing both genuine and fake audio recordings. The genuine Urdu recordings were gathered from randomly selected YouTube podcasts, and their deepfake counterparts were generated using the RVC model. The dataset has three versions, with clips of 5, 10, and 15 seconds. We built several deep neural networks (RNN+LSTM, CNN+attention, TCN, CNN+RNN) to detect deepfake audio made through imitation or synthetic techniques. The proposed approach extracts Mel-frequency cepstral coefficient (MFCC) features from the audio clips in the dataset. When tested and evaluated, our models achieved noteworthy accuracy across the dataset versions: 97.78% (5 s), 98.89% (10 s), and 98.33% (15 s) for the RNN+LSTM
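The MFCC features named in the abstract above are built on the mel scale, which spaces frequency bands more densely at low frequencies. The following is a minimal sketch of that one ingredient, assuming the widely used conversion mel = 2595 * log10(1 + f / 700). The abstract does not say which mel variant or library the authors used, so the function names, the 8 kHz upper limit, and the choice of 13 filters are illustrative assumptions only.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Convert a frequency in Hz to mels (common 2595 * log10 variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters: int, f_min: float, f_max: float) -> list:
    """Center frequencies (Hz) of n_filters bands spaced evenly in mels.

    The bands are equally spaced on the mel axis, so they end up
    progressively wider apart on the Hz axis.
    """
    mel_min, mel_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (mel_max - mel_min) / (n_filters + 1)
    return [mel_to_hz(mel_min + step * (i + 1)) for i in range(n_filters)]


if __name__ == "__main__":
    # Hypothetical settings: 13 filters between 0 Hz and 8 kHz.
    centers = mel_center_frequencies(n_filters=13, f_min=0.0, f_max=8000.0)
    print([round(c, 1) for c in centers])
```

A full MFCC pipeline would additionally frame the signal, take a power spectrum, apply triangular filters at these centers, take logarithms, and apply a discrete cosine transform; those steps are omitted here.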
Regular exercise is a crucial aspect of daily life, as it enables individuals to stay physically active, lowers the likelihood of developing illnesses, and enhances life expectancy. The recognition of workout actions in video streams holds significant importance in computer vision research, as it aims to enhance exercise adherence, enable instant recognition, advance fitness tracking technologies, and optimize fitness routines. However, existing action datasets often lack diversity and specificity for workout actions, hindering the development of accurate recognition models. To address this gap, the Workout Action Video dataset (WAVd) has been introduced as a significant contribution. WAVd comprises a diverse collection of labeled workout action videos, meticulously curated to encompass various exercises performed by numerous individuals in different settings. This research proposes an innovative framework based on the Attention-driven Residual Deep Convolutional-Gated Recurrent Unit (ResDC-GRU) network for workout action recognition in video streams. Unlike image-based action recognition, videos contain spatio-temporal information, making the task more complex and challenging. While substantial progress has been made in this area, challenges persist in detecting subtle and complex actions, handling occlusions, and managing the computational demands of deep learning approaches. The proposed ResDC-GRU Attention model demonstrated exceptional classification performance with 95.81% accuracy in classifying workout action videos and also outperformed various state-of-the-art models. The method also yielded 81.6%, 97.2%, 95.6%, and 93.2% accuracy on established benchmark datasets, namely HMDB51, Youtube Actions, UCF50, and UCF101, respectively, showcasing its superiority and robustness in action recognition. The findings suggest practical implications in real-world scenarios where precise video action recognition is paramount, addressing the persisting challenges in the field. The WAVd datas
The architecture of integrating Software Defined Networking (SDN) with Network Function Virtualization (NFV) is excellent because the former virtualizes the control plane, and the latter virtualizes the data plane. As...
In differentiable architecture search methods, a more efficient search space design can significantly improve the performance of the searched architecture, so the search space must be carefully defined with different complexity according to various [...] rationalizing the search strategies to explore the well-defined search space will further improve the speed and efficiency of architecture [...] this in mind, we propose a faster and more efficient differentiable architecture search method, [...], we introduce a more efficient search space enriched by two redefined convolution [...], we utilize a more efficient architectural parameter regularization method, mitigating the overfitting problem during the search process and reducing the error brought about by gradient [...], we introduce a natural exponential cosine annealing method to make the learning rate of the neural network training process more suitable for the search [...], group convolution and data augmentation are employed to reduce the computational [...], through extensive experiments on several public datasets, we demonstrate that our method can more swiftly search for better-performing neural network architectures in a more efficient search space, thus validating the effectiveness of our approach.
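The abstract above names a "natural exponential cosine annealing" learning-rate schedule but does not give its formula. For orientation only, the sketch below shows standard cosine annealing, which is the usual baseline that such variants modify; it is not the authors' method. The function name and the example values (50 steps, rates 0.025 and 0.001) are hypothetical.

```python
import math


def cosine_annealing_lr(step: int, total_steps: int,
                        lr_max: float, lr_min: float = 0.0) -> float:
    """Standard cosine annealing: decay from lr_max to lr_min over total_steps.

    lr(t) = lr_min + 0.5 * (lr_max - lr_min) * (1 + cos(pi * t / T))
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp so that steps outside [0, total_steps] stay at the endpoints.
    t = min(max(step, 0), total_steps)
    cosine = math.cos(math.pi * t / total_steps)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cosine)


if __name__ == "__main__":
    # Hypothetical values chosen only to illustrate the shape of the curve.
    for step in (0, 10, 25, 40, 50):
        print(step, round(cosine_annealing_lr(step, 50, 0.025, 0.001), 5))
```

The schedule starts at `lr_max`, passes through the midpoint of the two rates halfway through, and reaches `lr_min` at the final step.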
Uncertainty is an important factor that needs to be considered while analyzing the performance of any engineering [...]. In order to quantify uncertainty, fuzzy set theory is frequently used by most researchers, including energy system [...]. According to the classical reliability theory, component lifetimes have crisp parameters, but due to uncertainty and inaccuracy in data, it is sometimes very difficult to determine the exact values of these parameters in real-world [...]. To overcome this difficulty in the current research, failure and repair rates were taken as triangular fuzzy numbers to determine the fuzzy availability of a system undergoing calendar-based periodic inspection subject to multiple failure modes (FMs). It was assumed that each component in the system had an exponential failure rate and repair rate with fuzzy [...] FMs were explicitly taken into account when a functional state of the system was [...] FM had a random failure [...] the occurrence of any failure, a random time was selected for the relevant corrective repair [...]. The proposed research was studied for one of the major sources of green energy, namely a wind turbine system, on which all the derived propositions were implemented.
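To make the idea of fuzzy availability concrete, the sketch below propagates triangular fuzzy failure and repair rates through the textbook steady-state availability of a single repairable component, A = mu / (lambda + mu), using alpha-cuts. This is a deliberately simplified stand-in: the paper's model with periodic inspection and multiple failure modes is not reproduced here, and the class name, function name, and numeric rates are hypothetical.

```python
from typing import NamedTuple, Tuple


class TriangularFuzzyNumber(NamedTuple):
    """Triangular fuzzy number with support [low, high] and peak at mode."""
    low: float
    mode: float
    high: float

    def alpha_cut(self, alpha: float) -> Tuple[float, float]:
        """Interval of values whose membership is at least alpha."""
        if not 0.0 <= alpha <= 1.0:
            raise ValueError("alpha must lie in [0, 1]")
        return (self.low + alpha * (self.mode - self.low),
                self.high - alpha * (self.high - self.mode))


def fuzzy_availability(failure_rate: TriangularFuzzyNumber,
                       repair_rate: TriangularFuzzyNumber,
                       alpha: float) -> Tuple[float, float]:
    """Alpha-cut of A = mu / (lambda + mu) for positive fuzzy rates.

    A increases with mu and decreases with lambda, so the interval
    bounds are attained at opposite corners of the two rate intervals.
    """
    if failure_rate.low <= 0.0 or repair_rate.low <= 0.0:
        raise ValueError("rates must be strictly positive")
    lam_lo, lam_hi = failure_rate.alpha_cut(alpha)
    mu_lo, mu_hi = repair_rate.alpha_cut(alpha)
    return (mu_lo / (lam_hi + mu_lo), mu_hi / (lam_lo + mu_hi))


if __name__ == "__main__":
    # Hypothetical rates per hour, each with +/- 20% spread around the mode.
    lam = TriangularFuzzyNumber(0.008, 0.010, 0.012)
    mu = TriangularFuzzyNumber(0.08, 0.10, 0.12)
    for alpha in (0.0, 0.5, 1.0):
        print(alpha, fuzzy_availability(lam, mu, alpha))
```

At alpha = 1 the interval collapses to the crisp availability computed from the two modes, and it widens as alpha decreases toward 0.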
Digital speech processing applications including automatic speech recognition (ASR), speaker recognition, speech translation, and others, essentially require large volumes of speech data for training and testing purpo...
In order to maintain sustainable agriculture, it is vital to monitor plant health. Since all species of plants are prone to characteristic diseases, it necessitates regular surveillance to search for any symptoms, whi...
Aiming to enhance the management stage of Mobile English Interactive Educating in the intelligent flipped classroom mode, a design method of Mobile English Interactive Teaching Based on deep learning is proposed. Extr...
Wheat is the most important cereal crop, and its low production incurs import pressure on the [...] fulfills a significant portion of the daily energy requirements of the human [...] wheat disease is one of the major factors that result in low production and negatively affect the national [...], timely detection of wheat diseases is necessary for improving [...] CNN-based architectures showed tremendous achievement in the image-based classification and prediction of crop [...], these models are computationally expensive and need a large amount of training [...]. In this research, a lightweight modified CNN architecture is proposed that uses eight layers, specifically three convolutional layers, three SoftMax layers, and two flattened layers, to detect wheat diseases [...] high-resolution images were collected from the fields in Azad Kashmir (Pakistan) and manually annotated by three human [...] convolutional layers use 16, 32, and 64 [...] filter uses a 3×3 kernel [...] strides for all convolutional layers are set to [...]. In this research, three different variants of datasets are [...] variants S1-70%:15%:15%, S2-75%:15%:10%, and S3-80%:10%:10% (train:validation:test) are used to evaluate the performance of the proposed [...] extensive experiments revealed that the S3 performed better than S1 and S2 datasets with 93% [...] experiment also concludes that a more extensive training set with high-resolution images can detect wheat diseases more accurately.
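The three train:validation:test variants listed in the abstract above can be illustrated with a small splitting routine. This is a sketch under stated assumptions: the abstract does not say whether the authors shuffled, stratified by class, or used a fixed seed, and the image count is not legible in this listing, so the random shuffle, the seed, the function name, and the figure of 1000 items are all hypothetical.

```python
import random
from typing import Dict, List, Tuple

# Ratios (train, validation, test) in percent, as listed in the abstract.
SPLITS: Dict[str, Tuple[int, int, int]] = {
    "S1": (70, 15, 15),
    "S2": (75, 15, 10),
    "S3": (80, 10, 10),
}


def split_indices(n_items: int, ratios: Tuple[int, int, int],
                  seed: int = 0) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle item indices and cut them into train/validation/test parts.

    Any remainder left by integer rounding is assigned to the test part.
    """
    if sum(ratios) != 100:
        raise ValueError("ratios must sum to 100")
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seeded for repeatability
    n_train = n_items * ratios[0] // 100
    n_val = n_items * ratios[1] // 100
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


if __name__ == "__main__":
    # Hypothetical dataset size used only to show the resulting part sizes.
    for name, ratios in SPLITS.items():
        train, val, test = split_indices(1000, ratios)
        print(name, len(train), len(val), len(test))
```

For image classification with uneven class sizes, a stratified split that preserves class proportions in each part is often preferable to the plain shuffle used here.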