ISBN: 9798400712456 (print)
Applying large language models (LLMs) to academic API usage shows promise in reducing researchers' efforts to seek academic information. However, current LLM methods for using APIs struggle with the complex API coupling commonly encountered in academic queries. To address this, we introduce SoAy, a solution-based LLM methodology for academic information seeking. SoAy enables LLMs to generate code for invoking APIs, guided by a pre-constructed API calling sequence referred to as a solution. This solution simplifies the model's understanding of complex API relationships, while the generated code enhances reasoning efficiency. LLMs are aligned with this solution-oriented, code-based reasoning method by automatically enumerating valid API coupling sequences and transforming them into queries and executable code. To evaluate SoAy, we introduce SoAyBench, an evaluation benchmark accompanied by SoAyEval, built upon a cloned environment of APIs from AMiner. Experimental results demonstrate a 34.58-75.99% performance improvement compared to state-of-the-art LLM API-based baselines. All datasets, code, tuned models, and deployed online services are publicly accessible at https://***/RUCKBReasoning/SoAy.
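The idea of enumerating valid API coupling sequences can be illustrated with a minimal sketch. It is not SoAy's implementation: the API names, their input and output types, and the rule that two APIs couple when one's output type matches the next one's input type are all assumptions made for illustration.

```python
# Hypothetical API signatures (not AMiner's real API): name -> (input type, output type).
APIS = {
    "search_person": ("name", "person_id"),
    "get_person_pubs": ("person_id", "pub_id"),
    "get_pub_venue": ("pub_id", "venue"),
    "get_coauthors": ("person_id", "person_id"),
}


def enumerate_solutions(apis, start_type, max_len):
    """Yield every API chain of length 1..max_len whose types line up.

    Assumed coupling rule: an API may follow another only if its input
    type equals the previous API's output type.
    """
    def extend(chain, current_type):
        if chain:
            yield tuple(chain)
        if len(chain) == max_len:
            return
        for name, (input_type, output_type) in apis.items():
            if input_type == current_type:
                yield from extend(chain + [name], output_type)

    yield from extend([], start_type)


solutions = list(enumerate_solutions(APIS, "name", max_len=3))
for chain in solutions:
    print(" -> ".join(chain))
```

Each enumerated chain could then be paired with a natural-language query and a code template, which is the kind of training data the abstract describes.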
The low-altitude economy (LAE), driven by unmanned aerial vehicles (UAVs) and other aircraft, has revolutionized fields such as transportation, agriculture, and environmental monitoring. In the upcoming six-generation...
ISBN: 9798350356236 (digital)
ISBN: 9798350356243 (print)
Social media platforms have become a hub for influencer marketing, where understanding influencer categories and post content is crucial. This study presents an Enhanced Influencer Profiler model that uses both text and image data to classify influencers and posts. The model uses an enhanced multimodal encoder that combines EfficientNet for image feature extraction with BERT, RoBERTa, and GPT-3 for textual features, and applies a cross-attention mechanism to fuse these features into a unified representation. To enrich the influencer representation, the study adopts a multi-level attention framework that analyzes internal and external interactions between influencers as well as their temporal and engagement patterns. For influencer classification, the authors report an accuracy of 99.62%, and the model maintains high F1 scores in every category. For post classification, the low-level model attains 98.45%, well above baseline methods such as SVC (71.60%) and Random Forest (76.25%). These findings demonstrate the effectiveness and flexibility of the model in multimodal classification tasks, where it outperforms traditional models. The Enhanced Influencer Profiler was implemented in Python for data processing and training, which shows the feasibility of the method. The findings suggest that the model can make influencer marketing and content classification more accurate and efficient for marketers and analysts. More generally, the Enhanced Influencer Profiler contributes to social media analysis practice, especially for multiple and heterogeneous data. Possible research directions for future work may involve imposing other modalities and evalua
The classical algorithm for finding the association rules generated by a frequent itemset must generate all non-empty subsets of the itemset as the candidate set of consequents. Xiongfei Li addressed this with an improved algorithm that finds all consequents layer by layer, that is, breadth-first. In this paper, we propose a new algorithm, Generate Rules by using Set-Enumeration Tree (GRSET), which uses the structure of a set-enumeration tree and a depth-first method to find the consequents of the association rules one by one and obtain all association rules corresponding to those consequents. Experiments show the GRSET algorithm to be practicable and efficient.
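A depth-first walk over a set-enumeration tree of consequents might look like the sketch below. It is not the GRSET algorithm itself, whose details the abstract does not give. The function names, the toy transactions, and the confidence-based pruning rule are assumptions chosen for illustration.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    items = set(itemset)
    return sum(items <= t for t in transactions) / len(transactions)


def rules_depth_first(frequent, transactions, min_conf):
    """List rules (antecedent, consequent, confidence) from one frequent itemset.

    Consequents are visited depth-first in a set-enumeration tree: a node is
    extended only with items that come later in a fixed order, so each
    candidate consequent is generated at most once.
    """
    items = sorted(frequent)
    full_support = support(items, transactions)
    rules = []

    def visit(consequent, start):
        for i in range(start, len(items)):
            candidate = consequent + (items[i],)
            if len(candidate) == len(items):
                continue  # the antecedent would be empty
            antecedent = tuple(x for x in items if x not in candidate)
            conf = full_support / support(antecedent, transactions)
            if conf >= min_conf:
                rules.append((antecedent, candidate, conf))
                # Descend only from passing nodes: enlarging a failing
                # consequent can never raise the confidence.
                visit(candidate, i + 1)

    visit((), 0)
    return rules


# Toy data, made up for this example.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
strict = rules_depth_first({"a", "b", "c"}, transactions, min_conf=0.6)
loose = rules_depth_first({"a", "b", "c"}, transactions, min_conf=0.45)
```

The pruning is safe because moving an item from the antecedent to the consequent shrinks the antecedent, whose support can only grow, so the confidence can only fall.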
Authors: 王珊, 杜小勇, 孟小峰, 陈红
School of Information, Renmin University of China; MOE Key Lab of Data Engineering and Knowledge Engineering, Beijing 100872, P.R. China
The database system is the infrastructure of the modern information system, and R&D in database systems and their technologies is one of the important research topics in the field. Database R&D in China started later but has advanced in giant strides. This report presents the achievements Renmin University of China (RUC) has made in the past 25 years and describes some of the research projects RUC is currently working on. The National Natural Science Foundation of China supports and initiates most of our research projects, and these successfully conducted projects have produced fruitful results.
Monitoring data streams is an efficient way to acquire their characteristics. However, the resources available for each data stream are limited, so how to process an infinite data stream with limited resources remains an open and challenging problem. In this paper, we adopt wavelet and sliding-window methods to design a multi-resolution summarization data structure, the Multi-Resolution Summarization Tree (MRST), which can be updated incrementally as data arrive and supports point queries, range queries, and multi-point queries while preserving query precision. We evaluate our algorithm on both synthetic and real-world data. The experimental results indicate that MRST exceeds the current algorithm in query efficiency and adaptability, and that it is simpler to implement.
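To make the wavelet idea concrete, the sketch below summarizes one window with an unnormalized Haar transform and keeps only the largest detail coefficients. It is not MRST: the abstract does not specify the tree layout or the incremental update, so this version simply recomputes the summary for a fixed window whose length is assumed to be a power of two. All function names are hypothetical.

```python
def haar_decompose(values):
    """Return [overall average, detail coefficients from coarse to fine]."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("window length must be a power of two")
    details = []
    current = list(values)
    while len(current) > 1:
        pairs = [(current[i], current[i + 1]) for i in range(0, len(current), 2)]
        # Finer levels are computed first, so coarser ones are prepended.
        details = [(a - b) / 2 for a, b in pairs] + details
        current = [(a + b) / 2 for a, b in pairs]
    return current + details


def haar_reconstruct(coeffs):
    """Invert haar_decompose, level by level from coarse to fine."""
    current = list(coeffs[:1])
    pos = 1
    while pos < len(coeffs):
        level = coeffs[pos:pos + len(current)]
        pos += len(current)
        current = [v for avg, d in zip(current, level) for v in (avg + d, avg - d)]
    return current


def summarize(values, keep):
    """Zero all but the `keep` largest-magnitude detail coefficients."""
    coeffs = haar_decompose(values)
    ranked = sorted(range(1, len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    kept = set(ranked[:keep])
    return [c if i == 0 or i in kept else 0.0 for i, c in enumerate(coeffs)]


window = [2, 4, 6, 8]
summary = summarize(window, keep=1)
approx = haar_reconstruct(summary)  # approximate answers to point queries
```

Because the overall average is always kept, a sum over the whole window stays exact, while point queries become approximate as more detail coefficients are dropped.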
ISBN: 9798350368741 (digital)
ISBN: 9798350368758 (print)
Dual-view gaze target estimation in classroom environments has not been thoroughly explored. Existing methods lack consideration of depth information, primarily focusing on 2D image information and neglecting the latent 3D spatial context, which could lead to suboptimal transformation and cause the gaze cone to intersect with an incorrect object. This paper introduces a novel dual-view gaze target estimation method tailored for classroom settings, leveraging depth-enhanced spatial transformations. By formulating a depth-enhanced 2D space, our method uses depth-enhanced spatial transformation to accurately project students’ gaze cones to the teacher-oriented image. Additionally, we collected a dataset named DVSGE, specifically for student gaze target estimation in dual-view classroom images. Experimental results demonstrate significant performance improvements of 9.8% in AUC and 19.9% in L2-Distance for our method, surpassing existing methods.
Dear editor, Frequent itemset mining (FIM) is important in many data mining applications [1], such as web log mining and trend analysis. However, if the data are sensitive (e.g., web browsing history), directly releasing frequent itemsets and their support may breach user privacy. The protection of user privacy while obtaining statistical information is im-
Using data mining technology to analyze the huge amounts of meteorological data plays an important role in improving the accuracy of weather forecasts. After analyzed the features of meteorological data, a distributed...