The advancement and application of information technology have provided significant benefits to society. Complete and quality information is needed to achieve efficiency and make it easier for people to find the right...
ISBN (digital): 9798331520311
ISBN (print): 9798331520328
Enzymes are biocatalysts with vital roles in biological functions and many industrial applications. Diverse enzymes are classified using the Enzyme Commission (EC) nomenclature, which makes differentiation challenging. Gene Ontology (GO), another source of biological information, describes the biological aspects of enzymes, covering related biological processes (BP), molecular functions (MF), and their locations within cells (cellular component, CC). This study proposes a novel EC class and subclass classification of enzymes within each ontology subclass based on their GO semantics, using Bidirectional Encoder Representations from Transformers (BERT). The BERT model is first fine-tuned on the preprocessed GO term names and definitions. The enzymes in each ontology class (BP, MF, or CC) are further divided by how the GO terms were assigned: through manual annotation (NONIEA) or by electronic inference (IEA). During fine-tuning, BERT obtained F1 scores of 0.93, 0.60, 0.99, 0.90, 0.40, and 0.35 for BP IEA, BP NONIEA, MF IEA, MF NONIEA, CC IEA, and CC NONIEA, respectively. On the test set, the fine-tuned BERT significantly outperformed GOntoSim, a framework that calculates semantic similarity based on classical information theory, in EC class classification across all metrics, with less inference time in all ontology subclasses. Extended to the EC subclass level, BERT can classify enzymes in the BP IEA and MF IEA ontology subclasses, although fine-tuning requires more epochs. These results show that the names and definitions of GO terms are distinguishing features for classifying enzymes, offering an alternative to the information content approach.
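The abstract above does not spell out how GO term names and definitions are preprocessed before fine-tuning. As a minimal sketch, assuming the preprocessing amounts to lowercasing, stripping punctuation, and concatenating each term's name with its definition, the text input for one enzyme could be assembled as follows. The function names, the dictionary layout, and the example annotations are all hypothetical and are not taken from the study.

```python
import re


def preprocess(text: str) -> str:
    """Lowercase, replace non-alphanumeric characters with spaces, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def build_enzyme_text(go_terms):
    """Join the cleaned name and definition of every GO term annotated to one enzyme."""
    parts = [preprocess(f"{term['name']} {term['definition']}") for term in go_terms]
    return " ".join(parts)


# Hypothetical annotations for a single enzyme (illustrative wording only).
example_terms = [
    {"name": "ATP binding", "definition": "Binding to ATP, adenosine 5'-triphosphate."},
    {"name": "kinase activity", "definition": "Catalysis of the transfer of a phosphate group."},
]

enzyme_text = build_enzyme_text(example_terms)
print(enzyme_text)
```

A string built this way would then be passed to a tokenizer and a sequence classifier. That step is omitted here because the abstract does not name the tooling used.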
The success of machine learning models relies heavily on effectively representing high-dimensional data. However, ensuring data representations capture human-understandable concepts remains difficult, often requiring ...
Inspired by the dynamic coupling of moto-neurons and physical elasticity in animals, this work explores the possibility of generating locomotion gaits by utilizing physical oscillations in a soft snake by means of a l...
Ancestry estimation is one crucial stage in genomic research. It generates scores that represent the admixed genetics profile as the result of human evolution. In the previous research, we implemented multiple unsuper...
This article focuses on the research, design and implementation of a prediction tool for air quality to estimate pollutant concentrations, contributing to environmental engineering. It addresses prediction of fine par...
In 2006, during a meeting of a working group of scientists in La Jolla, California at The Neurosciences Institute (NSI), Gerald Edelman described a roadmap towards the creation of a Conscious Artifact. As far as I kno...
Malaria remains a severe problem worldwide, particularly in regions with limited access to diagnostic tools, so a system for detecting malaria in blood cells is crucial. This paper presents a hybrid framework that uses a Convolutional Neural Network (CNN) as a feature extractor and a Support Vector Machine (SVM) as a classifier for the automated detection of malaria parasites in blood cell images. The proposed system leverages the strengths of CNNs in feature extraction and representation learning from images, combined with the discriminative power of SVMs for classification. First, the CNN extracts intricate features from blood cell images, capturing essential patterns indicative of malaria infection. The extracted features are then used to train an SVM classifier, enabling accurate discrimination between parasitized and uninfected blood cells. The dataset used for experimental evaluation was obtained from the Lister Hill National Center for Biomedical Communications website of the National Library of Medicine. The proposed model achieves a higher F1-score than either individual model, by around 0.015 over the CNN alone and 0.27 over the SVM alone. This hybrid CNN-SVM methodology offers a promising solution for accurately and efficiently detecting malaria parasites in blood cell images.
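The comparison above is stated in terms of the F1-score, which combines precision and recall for the positive class. The following is a minimal sketch of that metric for a binary parasitized/uninfected task. The label encoding (1 = parasitized, 0 = uninfected) and the toy predictions are assumptions made for illustration and are not results from the paper.

```python
def f1_score(y_true, y_pred, positive=1):
    """Compute the F1-score of the positive class from paired label sequences."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (assumed encoding): 1 = parasitized, 0 = uninfected.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

score = f1_score(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

With these toy labels there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75.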
Hate speech is one of the most challenging problems the internet faces today. As the number of online users grows, hate speech also rises, and classifying it manually takes time, particularly in languages other than...
Speed bumps are raised sections of road pavement used to force drivers to slow down and thereby improve traffic safety. However, these obstacles have drawbacks in efficiency and safety: they can increase travel time and fuel consumption, cause traffic jams, delay emergency vehicles, and cause vehicle damage or accidents when not properly signaled. For these reasons, geolocation information for these obstacles can benefit several applications in Intelligent Transportation Systems (ITS), such as Advanced Driver Assistance Systems (ADAS) and autonomous vehicles, making it possible to plan more efficient routes or alert the driver to an obstacle ahead. Speed bump detection applications described in the literature employ cameras or inertial sensors, namely accelerometers and gyroscopes. While camera-based solutions are mature and have been evaluated under different contextual conditions, those based on inertial sensors do not offer multi-contextual analyses; most are simple proof-of-concept applications that are not applicable in real-world scenarios. For this reason, in this work, we propose a speed bump detection model based on inertial sensors that operates reliably under contextual variations: different vehicles, driving styles, and environments in which vehicles can travel. For model development and validation, we collect nine datasets with contextual variations, using three different vehicles, with three different drivers, in three different environments, in which there are three different surface types, in addition to variations in conservation state and the presence of obstacles and anomalies. The speed bumps are present on two different pavement types, asphalt and cobblestone. We use the collected data in experiments to evaluate aspects such as the influence of sensor placement for vehicle data collection and the data window size. Afterwar
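The data window size mentioned above refers to how a continuous inertial signal is cut into fixed-length segments before classification. The following is a minimal sketch of such segmentation. The window size, the step, the two summary features, and the sample values are hypothetical choices for illustration and are not the settings evaluated in the paper.

```python
def sliding_windows(samples, window_size, step):
    """Split a sequence into fixed-size windows, advancing `step` samples each time."""
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    last_start = len(samples) - window_size
    return [samples[start:start + window_size] for start in range(0, last_start + 1, step)]


def window_features(window):
    """Summarize one window by its mean and peak-to-peak range."""
    mean = sum(window) / len(window)
    return {"mean": mean, "range": max(window) - min(window)}


# Hypothetical vertical acceleration samples in m/s^2, with a jolt in the middle.
accel_z = [9.8, 9.8, 9.9, 12.5, 7.1, 9.7, 9.8, 9.8]

windows = sliding_windows(accel_z, window_size=4, step=2)
features = [window_features(w) for w in windows]

for w, f in zip(windows, features):
    print(w, f)
```

A larger window captures the whole jolt in one segment but delays detection, while a smaller one reacts faster at the risk of splitting the event across segments.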