Breast cancer in women is becoming a serious cause of morbidity and mortality worldwide. This paper aims to design a hybrid model using various machine learning classification algorithms like k-Near...
The aim of this research is to analyze and enhance the response time of digital platforms, including both mobile apps and web applications, dedicated to student club activities. Materials and Methods: A sample size of...
One of the fundamental differences in the perception of electric (e-) vehicles is how their radiated noise is perceived with respect to classic internal combustion engines. Even though e-vehicles are usually quieter, ...
License plate recognition is a challenging task within intelligent transportation systems (ITS): complex backgrounds, variation in illumination, and occlusion all make recognition difficult. These challenges enabled a comp...
Precise object detection allows military personnel to clearly understand their surroundings, which supports the planning of effective military strategies. Particularly, satellites and drones allow real-time surveillance over lar...
Sign language enhances the communication capabilities of the deaf-mute community, allowing for a deeper understanding of their needs and emotions. These languages are highly structured and visual, using gestures and v...
An ultra-wideband (UWB) slotted compact Vivaldi antenna with a microstrip line feed was evaluated for microwave imaging (MI) applications. The recommended FR4 substrate-based Vivaldi antenna is 50×50×1.5 mm³...
Background: Chronic renal disease, often known as Chronic Kidney Disease (CKD), is an illness that causes a steady decline in kidney function. As per the World Health Organization survey, the incidence of CKD may incr...
Effective and efficient management of e-waste is considered one of the most vital challenges of modern times. The massive scale of e-waste generated and dumped in open landfills or oceans without proper treatment pose...
Feature extraction is crucial in bioinformatics, as it converts genomic sequences into numerical feature vectors essential for machine learning algorithms, particularly in clustering, to identify the families of newly sequenced genomes. Traditional methods have relied on alignment-based techniques for clustering genomic sequences. However, these methods are computationally intensive. In contrast, alignment-free methods are now more commonly used due to their reduced computational demands. Despite this, many alignment-free approaches may generate identical feature vectors for dissimilar sequences, as they focus solely on single nucleotide counts (1-gram) and their arrangement during feature extraction, often neglecting dinucleotide counts and their arrangement, which can degrade clustering performance. Furthermore, certain approaches include trinucleotide or higher-order compositions, which introduce high-dimensionality issues and lead to inaccurate results. Additionally, some existing methods are not scalable and take substantial time to extract features from large genomic sequences. To address these issues, we propose a novel 33-dimensional Scalable Alignment-Free Feature Vector (33d-SAFFV) approach that extracts the most significant features, namely the length of the sequence, the counts of dinucleotides, and the positional sums of dinucleotides, to produce a 33-dimensional feature vector. This approach leverages Apache Spark for scalability and efficient in-memory computations, making it suitable for large datasets. We evaluated the performance of the proposed method by applying the extracted 33-dimensional feature vectors to the K-Means and Fuzzy C-Means (FCM) clustering algorithms. Performance is measured using the Silhouette Index (SI) and the Calinski-Harabasz (CH) index. Experimental results on gene-sequence datasets of four rice varieties and two soybean varieties show the effectiveness of the proposed 33d-SAFFV approach. In K-Means clustering with t
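The three feature groups named in the abstract above (sequence length, dinucleotide counts, and positional sums of dinucleotides) can be illustrated with a minimal single-machine sketch. This is not the authors' implementation: the abstract does not define the positional sum or the feature order, so the sketch assumes overlapping dinucleotide windows, 1-based start positions, and the layout 1 length + 16 counts + 16 positional sums = 33 values. The names `DINUCLEOTIDES` and `extract_features` are hypothetical, and Apache Spark is deliberately left out.

```python
from itertools import product
from typing import List

# All 16 dinucleotides over the DNA alphabet, in a fixed order (AA, AC, ..., TT).
DINUCLEOTIDES = ["".join(pair) for pair in product("ACGT", repeat=2)]


def extract_features(sequence: str) -> List[int]:
    """Return [length, 16 dinucleotide counts, 16 positional sums] (33 values).

    Assumptions (not stated in the abstract): windows overlap, and the
    position of a dinucleotide is the 1-based index of its first base.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    position_sums = dict.fromkeys(DINUCLEOTIDES, 0)

    # Slide a window of width 2 across the sequence.
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs with ambiguous bases such as N
            counts[pair] += 1
            position_sums[pair] += i + 1

    return (
        [len(seq)]
        + [counts[d] for d in DINUCLEOTIDES]
        + [position_sums[d] for d in DINUCLEOTIDES]
    )


if __name__ == "__main__":
    # Same length and same single-nucleotide counts, different arrangement.
    print(extract_features("ACGAC") == extract_features("CAGCA"))
```

Because the function is pure and works on one sequence at a time, it could in principle be applied to a distributed collection of sequences, for example by mapping it over a PySpark RDD; how the paper actually distributes the computation is not stated in the abstract.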