This paper investigates the integration of Artificial Intelligence (AI) into systematic literature reviews (SLRs), aiming to address the challenges associated with the manual review process. SLRs are a crucial part of scholarly research, but they are often time-consuming and error-prone. In response, this work explores the application of AI techniques, including Natural Language Processing (NLP), machine learning, data mining, and text analytics, to automate various stages of the SLR process. Specifically, we focus on paper identification, information extraction, and data synthesis. The study examines the roles of NLP and machine learning algorithms in automating the identification of relevant papers based on defined criteria. Researchers now have access to a diverse set of AI-based tools and platforms designed to streamline SLRs, offering automated search, retrieval, text mining, and analysis of relevant publications. The field of AI-driven SLR automation continues to evolve, with ongoing exploration of new techniques and enhancements to existing algorithms. This shift from manual effort to automation not only improves the efficiency and effectiveness of SLRs but also marks a significant advance in the broader research process.
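As a rough illustration of what "identification of relevant papers based on defined criteria" can mean at its simplest, the sketch below screens abstracts against inclusion and exclusion terms. It is a plain keyword filter standing in for the NLP and machine learning methods the abstract refers to, not the paper's method; the function name `screen` and the example terms are hypothetical.

```python
import re


def screen(abstract, include_terms, exclude_terms=()):
    """Return True if the abstract meets simple keyword criteria.

    Hypothetical helper: terms must be lowercase single words.
    A paper passes when every inclusion term occurs and no
    exclusion term occurs.
    """
    # Lowercase and split into alphanumeric word tokens.
    tokens = set(re.findall(r"[a-z0-9]+", abstract.lower()))
    has_all = all(term in tokens for term in include_terms)
    has_excluded = any(term in tokens for term in exclude_terms)
    return has_all and not has_excluded


if __name__ == "__main__":
    candidates = [
        "We apply machine learning to systematic reviews.",
        "A survey of machine learning methods.",
    ]
    for text in candidates:
        print(screen(text, ["machine", "learning"], ["survey"]), text)
```

A real screening pipeline would typically replace the keyword test with a trained classifier, keeping the same pass/fail interface.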
ISBN: (print) 9783031657931; 9783031657948
This paper addresses the challenge posed by the increasing integration of software tools in research across disciplines by investigating the application of Falcon-7b to the detection and classification of software mentions in scholarly texts. Specifically, the study focuses on Subtask I of Software Mention Detection in Scholarly Publications (SOMD), which entails identifying and categorizing software mentions in academic literature. Through comprehensive experimentation, the paper explores different training strategies, including a dual-classifier approach, adaptive sampling, and weighted loss scaling, to improve detection accuracy while coping with class imbalance and the nuanced syntax of scholarly writing. The findings highlight the benefits of selective labelling and adaptive sampling in improving the model's performance. However, they also indicate that combining multiple strategies does not necessarily yield cumulative improvements. This research offers insights into the effective application of large language models to specific tasks such as SOMD, underlining the importance of tailored approaches to the unique challenges of academic text analysis.
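To make "weighted loss scaling" concrete, here is a minimal sketch of one common way to counter class imbalance: weight each class inversely to its frequency and use those weights when averaging per-example losses. The abstract does not specify the paper's weighting scheme, so the formula, the function names, and the label names ("O", "Application") are assumptions for illustration only.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by total / (n_classes * class_count).

    Assumed scheme: a perfectly balanced label set gives every
    class a weight of 1.0; rarer classes get larger weights.
    """
    counts = Counter(labels)
    total = len(labels)
    n_classes = len(counts)
    return {label: total / (n_classes * count) for label, count in counts.items()}


def weighted_mean_loss(losses, labels, weights):
    """Average per-example losses, scaling each by its class weight."""
    weighted_sum = sum(weights[label] * loss for loss, label in zip(losses, labels))
    weight_total = sum(weights[label] for label in labels)
    return weighted_sum / weight_total


if __name__ == "__main__":
    # Hypothetical token labels: mostly "O" (no mention), few mentions.
    train_labels = ["O"] * 8 + ["Application"] * 2
    weights = inverse_frequency_weights(train_labels)
    print(weights)
    print(weighted_mean_loss([2.0, 4.0], ["O", "Application"], weights))
```

With these weights, an error on a rare mention class contributes more to the average than an error on the dominant class, which is the general intent of loss weighting under imbalance.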
Information extraction (IE) aims at extracting structured data from unstructured or semi-structured data. The thesis starts by identifying social media data and scholarly communication data as a special case of digital social trace data (DSTD). This identification allows us to use the graph structure of the data (e.g., a user connected to a tweet, an author connected to a paper, an author connected to other authors) to develop new information extraction tasks. The thesis focuses on information extraction from DSTD, first using only the text of tweets and scholarly paper abstracts, and then using the full graph structure of Twitter and scholarly communication datasets. This thesis makes three major contributions. First, new IE tasks based on the DSTD representation of the data are introduced. For scholarly communication data, methods are developed to identify article-level and author-level novelty and expertise. Furthermore, interfaces for examining the extracted information are introduced. A social communication temporal graph (SCTG) is introduced for comparing different kinds of communication data, such as tweets tagged with sentiment, tweets about a search query, and Facebook group posts. For social media, new text classification categories are introduced, with the aim of identifying enthusiastic and supportive users via their tweets. Additionally, the correlation between sentiment classes and Twitter metadata in public corpora is analyzed, leading to the development of a better model for sentiment classification. Second, methods are introduced for extracting information from social media and scholarly data. For scholarly data, a semi-automatic method is introduced for constructing a large-scale taxonomy of computer science concepts. The method relies on the Wikipedia category tree. The constructed taxonomy is used to identify key computer science phrases in scholarly papers and to track their evolution over time. Similarly, for social media data, machine lear
In the context of big scholarly data, various metrics and indicators, such as publication counts, citations, the h-index, and their variants, have been widely applied to evaluate the impact of scholars from different perspectives. However, these indicators have limited capacity to characterize the prospective impact or achievements of scholars. To solve this problem, we propose the Academic Potential Index (API) to quantify a scholar's academic potential, and we devise an algorithm to calculate its value. Note that API is a dynamic index that changes throughout a scholar's academic career. By applying API to rank scholars, we can identify those who show academic potential early in their careers. Extensive experiments on the Microsoft Academic Graph dataset show that the proposed index evaluates scholars' academic potential effectively and captures the trend of variation in their academic impact. We also apply the index to identify rising stars in academia. Experimental results show that API outperforms three baseline methods in identifying scholars with potential.
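For reference, the h-index named above as one of the existing indicators can be computed as follows. This sketch shows only the standard h-index definition (the largest h such that h papers have at least h citations each); it is not the API algorithm, which the abstract does not describe.

```python
def h_index(citations):
    """Return the h-index for a list of per-paper citation counts."""
    # Rank papers from most to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers each have >= rank citations
        else:
            break
    return h


if __name__ == "__main__":
    print(h_index([10, 8, 5, 4, 3]))
```

Because the h-index depends only on accumulated citations, it cannot decrease over a career, which is one reason a retrospective indicator of this kind says little about an early-career scholar's future trajectory.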
Interdisciplinary collaborations, i.e., scholarly cross-domain collaborations, have had a huge impact on society and have previously been shown to exhibit domain skewness [20]. To illustrate, scholarly cross-domain ...
Scholarly information usually comprises millions of raw data items, such as authors, papers, and citations, as well as scholarly networks. With the rapid growth of digital publishing and harvesting, presenting the data visually and efficiently has become challenging. Various visualization techniques can now be readily applied to scholarly data visualization and visual analysis, giving scientists a better way to represent the structure of scholarly data sets and reveal hidden patterns in the data. In this paper, we first introduce the basic concepts and the collection of scholarly data. Then, we provide a comprehensive overview of related data visualization tools, existing techniques, and systems for analyzing large volumes of diverse scholarly data. Finally, open issues are discussed to pursue new solutions for abundant and complicated scholarly data visualization, as well as techniques that support a multitude of facets.