Search results - Inner Mongolia University Library

LDA Topic Modeling for Bioinformatics Terms in arXiv Documents

Procedia Computer Science, 2024, vol. 245, pp. 229-238

Authors: Andrea Stevens Karnyoto; Matthew Martianus Henry; Bens Pardamean. Affiliations: Bioinformatics and Data Science Research Center, Bina Nusantara University, Jakarta, Indonesia 11480; Computer Science Department, BINUS Graduate Program - Master of Computer Science, Bina Nusantara University, Jakarta, Indonesia 11480.

A wide variety of disciplines contribute to bioinformatics research, including computer science, biology, chemistry, mathematics, and physics. This study uses topic modeling with Latent Dirichlet Allocation (LDA) to determine how many research articles published on arXiv fall under bioinformatics topics and which bioinformatics terms are used most frequently. LDA is a statistical algorithm that discovers topics hidden within large collections of documents. Our research examined 226453 articles posted on arXiv between January 2023 and January 2024, of which more than 10521 were categorized into bioinformatics topics. The largest share, 6352 documents, is in the "Mathematical Physics" category; the second most popular category is "computer science," with 2950 documents. The terms 'RNA,' 'sequence,' 'tree,' and 'homology' are the most commonly used terms in bioinformatics. RNA plays a vital role in molecular biology, so its study is prevalent in bioinformatics. Sequence data refer to the order in which nucleotides or amino acids occur in a DNA molecule or a protein.

Keywords: arXiv; bioinformatics; Latent Dirichlet Allocation
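
The abstract above describes LDA only at a high level. As a rough illustration, the following is a minimal collapsed Gibbs sampler for LDA in plain Python. It is a sketch under stated assumptions, not the authors' pipeline: the record does not say which inference method, library, or hyperparameters were used, and the four-document corpus, `alpha`, `beta`, and the function names are all hypothetical.

```python
import random

def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized documents (toy scale)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]            # tokens of doc d in topic k
    topic_word = [[0] * n_words for _ in range(n_topics)]  # count of word v in topic k
    topic_total = [0] * n_topics                           # tokens assigned to topic k
    assignments = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the current token from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Full conditional p(z = t | everything else), up to a constant.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta) / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    return vocab, doc_topic, topic_word

def top_terms(vocab, topic_word, n=3):
    """Most frequent terms per topic, by assignment count."""
    return [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -row[v])[:n]]
        for row in topic_word
    ]

# Hypothetical toy corpus; real input would be tokenized arXiv abstracts.
docs = [
    "rna sequence homology tree rna sequence".split(),
    "sequence alignment rna homology tree".split(),
    "quantum field operator spectrum quantum".split(),
    "operator spectrum field quantum lattice".split(),
]
vocab, doc_topic, topic_word = lda_gibbs(docs, n_topics=2)
for k, terms in enumerate(top_terms(vocab, topic_word)):
    print(f"topic {k}: {terms}")
```

On a corpus of the size reported in the abstract, an optimized library implementation would be the practical choice; the sampler here only makes the counting logic visible.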

Semi-automated meningioma segmentation with bounding boxes

Procedia Computer Science, 2024, vol. 245, pp. 583-590

Authors: Nur Adhianti Heryanto; Mahmud Isnan; Matthew Martianus Henry; Bens Pardamean. Affiliations: Bioinformatics and Data Science Research Center, Bina Nusantara University, Jakarta, Indonesia 11480; Computer Science Department, BINUS Graduate Program - Master of Computer Science, Bina Nusantara University, Jakarta, Indonesia 11480.

Segmentation is manually performed by physicians, which takes considerable time and may be subject to inter-observer variability. Automating this task can increase efficiency and consistency. Existing studies on meningioma segmentation used data from a limited number of study centers, indicating the need for research on multi-center data to assess generalizability. In this work, two semi-automated methods with bounding box priors, LiteMedSAM and BBU-Net, are evaluated on the brain tumor segmentation (BraTS) 2023 meningioma dataset collected from five study centers. Preprocessing included exclusion of small tumors, z-score normalization, and extraction of slices that contain tumors, generating 25,602 2D axial magnetic resonance imaging (MRI) scans. A fine-tuning strategy is adopted for LiteMedSAM, while BBU-Net is trained from scratch. The models are evaluated using five-fold cross-validation, with data split at the case level. Results show that while U-Net models can achieve performance close to LiteMedSAM, the foundation model has better overall performance, with more than 90% in all evaluation scores.

Keywords: Bounding boxes; Meningioma; Semi-automated segmentation
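
The preprocessing steps named in the abstract above (z-score normalization, keeping only slices that contain tumor, and a bounding box prior) can be sketched in a few lines. This is an illustrative sketch only: the record does not say whether normalization was applied per slice or per volume, nor how the boxes were derived, so normalizing per slice and taking the box from the ground-truth mask are assumptions, as are the function names and the tiny example slice.

```python
from statistics import mean, pstdev

def zscore(slice_2d):
    """Z-score normalize one 2D slice (assumed per-slice; may differ from the paper)."""
    values = [v for row in slice_2d for v in row]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [[0.0 for _ in row] for row in slice_2d]
    return [[(v - mu) / sigma for v in row] for row in slice_2d]

def has_tumor(mask_2d):
    """True if any pixel of the mask is labeled as tumor."""
    return any(v for row in mask_2d for v in row)

def bounding_box(mask_2d):
    """Tight box (row_min, col_min, row_max, col_max) around the mask, or None."""
    coords = [(r, c) for r, row in enumerate(mask_2d) for c, v in enumerate(row) if v]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))

# Hypothetical 3x4 slice and its tumor mask.
image = [[0, 0, 0, 0], [0, 10, 12, 0], [0, 0, 11, 0]]
mask = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 1, 0]]

if has_tumor(mask):
    normalized = zscore(image)
    box = bounding_box(mask)
    print(box)
```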

Automatic question generation for Bahasa Indonesia examination using CopyNet

Procedia Computer Science, 2024, vol. 245, pp. 953-962

Authors: Matthew Martianus Henry; Gregorius Natanael Elwirehardja; Bens Pardamean. Affiliations: Bioinformatics and Data Science Research Center, Bina Nusantara University, Jakarta, Indonesia 11480; Computer Science Department, School of Computer Science, Bina Nusantara University, Jakarta, Indonesia 11480; Computer Science Department, BINUS Graduate Program - Master of Computer Science, Bina Nusantara University, Jakarta, Indonesia 11480.

In educational institutions, educators are responsible for assessing students' grasp of knowledge through examinations. Creating exam questions, even low-level factoid questions, is time-consuming, especially for inexperienced educators. Therefore, this study aims to create a sequence-to-sequence model using CopyNet, exploiting its copying mechanism to automatically generate Bahasa Indonesia factoid questions and ease the educator's burden. Indonesian records in the TyDi QA dataset are used as the model input. GRU and Bi-GRU are employed as the CopyNet encoder, while LSTM is used as the CopyNet decoder. The model that uses GRU as the encoder achieves BLEU1, BLEU2, BLEU3, BLEU4, and ROUGE-L scores of 0.28, 0.19, 0.14, 0.1, and 0.32, respectively. The model that uses Bi-GRU as the encoder achieves BLEU1, BLEU2, BLEU3, BLEU4, and ROUGE-L scores of 0.26, 0.17, 0.12, 0.09, and 0.30, respectively. Models using either encoder still achieve low scores. However, the BLEU scores are on par with previous work. Further examination found that the generated questions lack semantic and syntactic correctness. Adding more records to the dataset and using a more advanced architecture such as CopyBERT are encouraged to improve model performance in future work. Despite these results, this study has shown that CopyNet, primarily designed for text summarization or single-turn dialogue, can be tailored for factoid question generation.

Keywords: Automatic question generation; Bahasa Indonesia; Copying mechanism; Natural language generation; Sequence-to-sequence
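
For readers unfamiliar with the BLEU1 to BLEU4 scores reported above, the sketch below computes sentence-level BLEU with clipped n-gram precision, uniform weights, a single reference, and the standard brevity penalty. It is a simplified illustration: the record does not say which BLEU implementation, smoothing, or aggregation (corpus versus sentence level) the authors used, and the example sentences are invented.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n):
    """Sentence-level BLEU-max_n: one reference, uniform weights, no smoothing."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        total = sum(cand.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / total))
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)

# Invented example pair, for illustration only.
reference = "the cat sat on the mat".split()
candidate = "the cat sat on mat".split()
for n in (1, 2):
    print(f"BLEU{n}: {bleu(candidate, reference, n):.2f}")
```

Without smoothing, a single missing n-gram order drives the score to zero, which is one reason higher-order BLEU values on short generated questions tend to be low.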

Multivariate Time-Series Deep Learning for Joint Prediction of Temperature and Relative Humidity in a Closed Space

8th International Conference on Computer Science and Computational Intelligence, ICCSCI 2023

Authors: Gunawan, Fergianto E.; Budiman, Arief S.; Pardamean, Bens; Juana, Endang; Romeli, Sugiarto; Cenggoro, Tjeng W.; Purwandari, Kartika; Hidayat, Alam A.; Redi, Anak A.N.P.; Asrol, Muhammad. Affiliations: Industrial Engineering Department, BINUS Graduate Program - Master of Industrial Engineering, Bina Nusantara University, Jakarta 11480, Indonesia; Computer Science Department, BINUS Graduate Program - Master of Computer Science Program, Bina Nusantara University, Jakarta 11480, Indonesia; Bioinformatics and Data Science Research Center, Bina Nusantara University, Jakarta 11480, Indonesia; Electrical Engineering Department, Faculty of Industrial Technology, Universitas Trisakti, Jakarta 11440, Indonesia; PT Impack Pratama Industri, Jakarta 14350, Indonesia; Computer Science Department, School of Computer Science, Bina Nusantara University, Jakarta 11480, Indonesia; Mathematics Department, School of Computer Science, Bina Nusantara University, Jakarta 11480, Indonesia.

An accurate predictive model of temperature and humidity plays a vital role in many industrial processes that utilize a closed space such as in agriculture and building management. With the exceptional performance of deep learning on time-series data, developing a predictive temperature and humidity model with deep learning is propitious. In this study, we demonstrated that deep learning models with multivariate time-series data produce remarkable performance for temperature and relative humidity prediction in a closed space. In detail, all deep learning models that we developed in this study achieve almost perfect performance with an R value over 0.99. © 2023 The Authors. Published by Elsevier B.V.

Keywords: closed space; deep learning; humidity prediction; indoor temperature prediction

Trends, Opportunities, and Challenges in Detecting Depressive Disorders Through Mobile Devices: A Review

2nd International Conference on Computer System, Information Technology, and Electrical Engineering, COSITE 2023

Authors: Elwirehardja, Gregorius Natanael; Isnan, Mahmud; Perbangsa, Anzaludin Samsinga; Muchtar, Kahlil; Pardamean, Bens. Affiliations: Computer Science Department, School of Computer Science, Bina Nusantara University, Jakarta 11480, Indonesia; Bioinformatics and Data Science Research Center, Bina Nusantara University, Jakarta 11480, Indonesia; Information System Department, School of Information Systems, Bina Nusantara University, Jakarta 11480, Indonesia; Department of Electrical and Computer Engineering, Universitas Syiah Kuala, Banda Aceh 23111, Indonesia; Computer Science Department, BINUS Graduate Program-Master of Computer Science, Bina Nusantara University, Jakarta 11480, Indonesia.

ISBN (print): 9798350343069

Depressive Disorders (DD) are among the most prevalent mental disorders in the world and may lead to suicide. To prevent the latter, ubiquitous early detection systems may be effective. Recent studies have investigated the development of such systems by exploiting several forms of data, including video, audio, Ecological Momentary Assessments (EMA), and passive sensing data from sensors embedded in mobile devices. To summarize the trends, opportunities, and existing challenges in this field, this study reviewed 15 papers to answer four research questions. EMA was the most popular type of data used in this task, but other approaches, such as using video, audio, and typing behaviors, may be considered due to the subjectivity of EMA. These data were typically recorded using smartphones and analyzed using Machine Learning (ML). However, most of the developed systems had yet to be implemented. Overall, it was concluded that further studies may need to explore the use of more objective data in multimodal approaches and consider using Mobile Cloud Computing (MCC) to deploy these systems and provide more effective and efficient diagnoses. Future studies must also take into account the existing challenges of the data and infrastructure, such as the weaknesses of several data types, the limitations of mobile devices, and the challenges of diagnosis approaches. © 2023 IEEE.

Keywords: Machine learning

Database Design for Indonesian Scholarship Recommender Systems

7th International Conference on Information Management and Technology, ICIMTech 2022

Authors: Elwirehardja, Gregorius Natanael Jason Dominic, Nicholas Pardamean, Bens. Affiliations: BINUS Graduate Program - Master of Computer Science, Bina Nusantara University, Computer Science Department, Jakarta 11480, Indonesia; Bioinformatics and Data Science Research Center, Bina Nusantara University, Jakarta 11480, Indonesia.

ISBN (digital): 9781665450904

ISBN (print): 9781665450904

Following the evolution of technology, researchers have conducted several studies to propose reliable scholarship recommender systems. However, few have explored the data storage systems, which are highly beneficial for collecting the large volumes of real data needed to train the recommender systems. In this paper, a database design for collecting such data is proposed, equipped with integer encodings for categorical variables and a method to normalize variables with varying ranges. Using Connolly and Begg's database design method, the final design consists of a normalized Entity Relationship Diagram (ERD) and data dictionaries. The design was also constructed to support the use of various data for determining scholarship recipients, including education history, achievements, and organizational experiences. The proposed design can be implemented in information systems to allow easier access to information on scholarships in Indonesia. © 2022 IEEE.

Keywords: Recommender systems
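
The abstract above mentions integer encodings for categorical variables and a method to normalize variables with varying ranges, without naming the method. A minimal sketch of the general idea follows, assuming first-seen-order integer codes and min-max scaling; both choices, the function names, and the example values are hypothetical and may differ from the paper's design.

```python
def integer_encode(values):
    """Map each distinct category to an integer code, in first-seen order."""
    codes = {}
    for v in values:
        codes.setdefault(v, len(codes))
    return [codes[v] for v in values], codes

def min_max(values):
    """Rescale numeric values to [0, 1] (min-max scaling, an assumed method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical applicant attributes.
degree_levels = ["bachelor", "master", "bachelor", "doctoral"]
gpas = [2.0, 3.0, 4.0]

encoded, code_table = integer_encode(degree_levels)
print(encoded, code_table)
print(min_max(gpas))
```

Storing the code table alongside the encoded column, for example in a lookup table, keeps the integer codes interpretable when the data are later read back for training.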

Clustering Analysis of a Spatiotemporal Dataset with a Novel Kernel Density Estimator

2023 International Conference on Machine Learning and Cybernetics, ICMLC 2023

Authors: Yu, Jen-Chien; Yang, Chun-Chieh; Gilbert, John Reuben; Liu, Rou-Jun; Oyang, Yen-Jen; Yang, Meng-Han. Affiliations: National Kaohsiung University of Science and Technology, Department of Computer Science and Information Engineering, Kaohsiung City 807618, Taiwan; National Taiwan University, Master Program in Statistics, Taipei 106216, Taiwan; National Taiwan University, Department of Computer Science & Information Engineering, Taiwan; Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taiwan.

ISBN (print): 9798350303780

A vast number of spatiotemporal datasets collected from a wide range of sources has motivated scientists to develop effective approaches to identify interesting patterns hidden in these datasets. In this respect, kernel density estimators, which belong to a class of non-parametric estimators in statistics, have been widely exploited in recent years. With this background, we have developed a novel kernel density estimator aiming to provide accurate analysis results. According to the evaluation with a real spatiotemporal dataset, which collected emergency medical service records in a county in the United States, the proposed kernel density estimator can approximate the probability density function significantly more accurately than a conventional kernel density estimator. Furthermore, we have exploited the proposed kernel density estimator to identify interesting patterns hidden in the real spatiotemporal dataset. © 2023 IEEE.

Keywords: Probability density function
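
The record does not describe the proposed estimator itself, so it cannot be reproduced here. For context, the sketch below shows the conventional baseline it is compared against in general terms: a one-dimensional Gaussian kernel density estimator with a fixed bandwidth. The sample values, the bandwidth, and the function name are hypothetical.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a density function: the average of Gaussian kernels centered on samples."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

# Hypothetical 1D observations (e.g. event times in hours).
samples = [1.0, 1.5, 2.0, 6.0, 6.5]
density = gaussian_kde(samples, bandwidth=0.5)
print(density(1.5), density(4.0))
```

A spatiotemporal dataset would need a multivariate kernel over location and time; the 1D case is used here only to keep the estimator's form easy to read.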

Web Information System Design for Fast Protein Post-Translational Modification Site Prediction

7th International Conference on Information Management and Technology, ICIMTech 2022

Authors: Elwirehardja, Gregorius Natanael; Dominic, Nicholas; Pardamean, Bens. Affiliations: BINUS Graduate Program - Master of Computer Science, Bina Nusantara University, Computer Science Department, Jakarta 11480, Indonesia; Bioinformatics and Data Science Research Center, Bina Nusantara University, Jakarta 11480, Indonesia.

ISBN (digital): 9781665450904

ISBN (print): 9781665450904

In the field of bioinformatics, protein Post-Translational Modification (PTM) site prediction has been widely studied, and researchers have deployed Web Information Systems (WIS) for this task. Through a literature review and benchmarking process, we identified requirements that include quick predictions, efficient memory usage, and input validation. However, no detailed designs have been proposed so far, which may have contributed to some requirements not being implemented in some of the websites. Therefore, we propose a detailed WIS conceptual design that can be used to predict the sites of multiple PTM types and is equipped with a validation algorithm, and we compare the usage of various string searching algorithms as well as file storage formats. Experimental results showed that the linear search algorithm is the fastest for this task and that storing the protein data in npz format when performing multi-PTM site prediction can help reduce memory usage. The proposed design can be implemented in future studies as user-friendly web tools that are efficient in both speed and memory usage. © 2022 IEEE.

Keywords: Forecasting
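
To make the two ingredients named in the abstract above concrete, the sketch below pairs a simple input validation step with a naive linear string search. It is an assumption-laden illustration: the record does not specify what the paper's validation algorithm checks or what exactly is searched, so validating against the 20 standard amino acid letters and scanning a sequence for a residue pattern are hypothetical choices, as are the function names.

```python
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids

def validate_sequence(sequence):
    """Accept only non-empty sequences made of standard amino acid letters."""
    seq = sequence.strip().upper()
    return bool(seq) and all(ch in VALID_RESIDUES for ch in seq)

def linear_search(sequence, pattern):
    """Naive left-to-right scan; returns every 0-based start index of pattern."""
    if not pattern:
        return []
    hits = []
    last_start = len(sequence) - len(pattern)
    for start in range(last_start + 1):
        if sequence[start:start + len(pattern)] == pattern:
            hits.append(start)
    return hits

# Hypothetical short protein fragment; serine (S) is a common phosphorylation target.
protein = "MKSTSK"
if validate_sequence(protein):
    print(linear_search(protein, "S"))
```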

Data Mining for the Global Multiplex Weekly Average Income Analysis

Procedia Computer Science, 2023, vol. 219, pp. 52-59

Authors: Nicholas Dominic; Gregorius Natanael Elwirehardja; Bens Pardamean. Affiliations: Bioinformatics and Data Science Research Center, Bina Nusantara University, Jakarta, Indonesia 11480; Computer Science Department, BINUS Graduate Program Master of Computer Science Program, Bina Nusantara University, Jakarta, Indonesia 11480.

Box office (BO) income declined by up to 80% in 2020 as the COVID-19 pandemic emerged. To minimize further financial risks, multiplex (multiple cinema complex) owners need to analyze their potential income for each movie, each week. Therefore, we developed a data mining strategy that allows multiplex owners to analyze and discover insights into how successful produced movies could be. The methodology comprises (1) data loading and exploration, (2) data cleaning, (3) data selection, integration, and transformation using Pentaho, (4) data mining, in which the results were stored in a MySQL database, and (5) pattern evaluation and presentation using Qlik Sense as the Business Intelligence (BI) dashboard. Based on our data mining methodology, we revealed that drama, comedy, action, and thriller are the favorite genres. We also found that DreamWorks Animation and Pixar Animation Studios are the most popular production houses, even though Apatow Productions and Escape Artists still have the highest average revenue.

Keywords: movie multiplex income data mining Business Intelligence dashboard

Decision Theory and Risk Simulation Analysis for Optimizing Profit in PayLater Services

Procedia Computer Science, 2023, vol. 219, pp. 60-67

Authors: Nicholas Dominic; Bens Pardamean. Affiliations: Bioinformatics and Data Science Research Center, Bina Nusantara University, Jakarta, Indonesia 11480; Computer Science Department, BINUS Graduate Program Master of Computer Science Program, Bina Nusantara University, Jakarta, Indonesia 11480.

The projected increase in PayLater utilization reaches up to five million people by 2025. To optimize the yearly profit from their PayLater service, fintech companies must examine all possible risks before a unanimous decision is taken. Therefore, we proposed a unified decision framework derived from decision theory and the Monte Carlo simulation technique. Two schemes were coined: (1) a decision-making scheme, and (2) a risk simulation scheme. Throughout experiments, the framework was able to estimate several alternative decisions and their impacts, analyze the causes of failure and delays in the development of the PayLater service, and execute Monte Carlo simulations in up to 10,000 trials. Outputs of this study will benefit decision-makers in the fintech initiative before launching their PayLater products.

Keywords: PayLater; decision-making; Bayesian analysis; fault tree analysis; critical path method; Monte Carlo simulation
