Generative AI tools exemplified by ChatGPT are becoming a new reality. This study is motivated by the premise that "AI generated content may exhibit a distinctive behavior that can be separated from scientific articles". In this study, we show how articles can be generated by means of prompt engineering for various diseases and conditions. We then show how we tested this premise in two phases and prove its validity. Subsequently, we introduce xFakeSci, a novel learning algorithm that is capable of distinguishing ChatGPT-generated articles from publications produced by scientists. The algorithm is trained using network models derived from both sources. To mitigate overfitting, we incorporated a calibration step built upon data-driven heuristics, including proximity and ratios. Specifically, from a total of 3,952 fake articles covering three medical conditions, the algorithm was trained on only 100 articles and calibrated on folds of 100 articles. The classification step was performed using 300 articles per condition. The actual labeling step was carried out on an equal mix of 50 generated articles and 50 authentic PubMed abstracts. The testing also spanned publication periods from 2010 to 2024 and encompassed research on three distinct diseases: cancer, depression, and Alzheimer’s. Further, we evaluated the accuracy of the xFakeSci algorithm against some of the classical data mining algorithms (e.g., Support Vector Machines, Regression, and Naive Bayes). The xFakeSci algorithm achieved F1 scores ranging from 80% to 94%, outperforming common data mining algorithms, which scored F1 values between 38% and 52%. We attribute this noticeable difference to the introduction of calibration and a proximity distance heuristic, which account for this promising performance. Indeed, the prediction of fake science generated by ChatGPT presents a considerable challenge. Nonetheless, the introduction of the xFakeSci algorithm is a significant st
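The F1 scores quoted in this abstract combine precision and recall into a single number. The following is a minimal sketch of that computation on an equal mix of 50 generated and 50 authentic abstracts. The label encoding (1 for ChatGPT-generated, 0 for PubMed), the function name `f1_score`, and the prediction counts are all illustrative assumptions, not data or code from the study.

```python
def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical test set: 50 generated articles (1), then 50 PubMed abstracts (0).
y_true = [1] * 50 + [0] * 50

# Hypothetical classifier output: 45 generated articles caught, 5 missed,
# and 10 authentic abstracts wrongly flagged as generated.
y_pred = [1] * 45 + [0] * 5 + [1] * 10 + [0] * 40

score = f1_score(y_true, y_pred)
print(f"F1 = {score:.3f}")
```

With these made-up counts, precision is 45/55 and recall is 45/50, so a classifier can miss few generated articles and still lose F1 through false alarms on authentic abstracts.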
In this study, we introduce Modular State-based Stackelberg Games (Mod-SbSG), a novel game structure developed for distributed self-learning in modular manufacturing systems. Mod-SbSG enhances cooperative decision-mak...
Measurement is a fundamental building block of numerous scientific models and their creation. This is particularly true for data-driven science. Due to the high complexity and size of modern data sets, the necessity ...
Dimension reduction of data sets is a standard problem in the realm of machine learning and knowledge reasoning. They affect patterns in and dependencies on data dimensions and ultimately influence any decision-making...
Formal Concept Analysis (FCA) allows one to analyze binary data by deriving concepts and ordering them in lattices. One of the main goals of FCA is to enable humans to comprehend the information that is encapsulated in th...
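To illustrate what "deriving concepts" from binary data means, here is a minimal brute-force sketch. It assumes a formal context stored as a dictionary from objects to their attribute sets. The toy context and the function names (`extent`, `intent`, `concepts`) are hypothetical and chosen for illustration only. Real FCA tools use far more efficient algorithms than enumerating all attribute subsets.

```python
from itertools import combinations


def extent(context, attrs):
    """All objects that have every attribute in `attrs`."""
    return frozenset(g for g, has in context.items() if attrs <= has)


def intent(context, objs, all_attrs):
    """All attributes shared by every object in `objs`."""
    shared = set(all_attrs)
    for g in objs:
        shared &= context[g]
    return frozenset(shared)


def concepts(context):
    """Enumerate formal concepts (extent, intent) by closing every attribute subset."""
    all_attrs = frozenset().union(*context.values())
    found = set()
    for k in range(len(all_attrs) + 1):
        for subset in combinations(sorted(all_attrs), k):
            ext = extent(context, frozenset(subset))
            found.add((ext, intent(context, ext, all_attrs)))
    return found


# Hypothetical toy context: objects mapped to the attributes they have.
toy_context = {
    "duck": {"flies", "swims"},
    "eagle": {"flies"},
    "trout": {"swims"},
}

lattice = concepts(toy_context)
# List concepts from the largest extent (top) to the smallest (bottom).
for ext, itt in sorted(lattice, key=lambda c: -len(c[0])):
    print(sorted(ext), sorted(itt))
```

Each pair consists of a set of objects and exactly the attributes they share. Ordering the pairs by inclusion of extents gives the concept lattice.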
It is known that a (concept) lattice contains an n-dimensional Boolean suborder if and only if the context contains an n-dimensional contra-nominal scale as subcontext. In this work, we investigate more closely the in...
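An n-dimensional contranominal scale is a context with n objects and n attributes in which object i has attribute j exactly when i differs from j. As a rough sketch of the condition on the context side, the following brute-force check searches for such a subcontext. The dictionary representation, the toy data, and the function name `has_contranominal_scale` are assumptions made for illustration. The search is exponential and only suitable for tiny examples.

```python
from itertools import combinations, permutations


def has_contranominal_scale(context, n):
    """Return True if some n objects and n attributes form the pattern
    'object i has attribute j exactly when i != j'."""
    attrs = sorted(set().union(*context.values()))
    for objs in combinations(sorted(context), n):
        # Try every way of pairing the chosen objects with n distinct attributes.
        for ms in permutations(attrs, n):
            if all(
                (ms[j] in context[objs[i]]) == (i != j)
                for i in range(n)
                for j in range(n)
            ):
                return True
    return False


# Hypothetical toy context: objects mapped to the attributes they have.
toy_context = {
    "duck": {"flies", "swims"},
    "eagle": {"flies"},
    "trout": {"swims"},
}

print(has_contranominal_scale(toy_context, 2))
print(has_contranominal_scale(toy_context, 3))
```

In the toy context, "eagle" and "trout" together with "swims" and "flies" form a 2-dimensional contranominal scale, while no 3-dimensional one can exist because there are only two attributes.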
Formal Concept Analysis (FCA) provides a method called attribute exploration which helps a domain expert discover structural dependencies in knowledge domains that can be represented by a formal context (a cross table...
Order diagrams allow human analysts to understand and analyze structural properties of ordered data. While an experienced expert can create easily readable order diagrams, the automatic generation of those remains a h...
There is a plenitude of software programs to analyze data sets using notions from formal concept analysis (FCA). For example, there are 64 FCA-related projects listed on GitHub. Those are developed in ten different pr...
Concept lattice drawings are an important tool for visualizing complex relations in data in a simple manner for human readers. Many attempts were made to transfer classical graph drawing approaches to order diagrams. Alth...