Search Results - Inner Mongolia University Library

An improved Bert learning model for E-commerce text entity extraction

JOURNAL OF SUPERCOMPUTING, 2025, Vol. 81, No. 4, pp. 1-28

Authors: Fan, Huiqiong; Wan, Changxuan. Affiliations: Jiangxi Univ Finance & Econ Sch Informat Management Nanchang 330032 Jiangxi Peoples R China Jiangxi Univ Finance & Econ Jiangxi Key Lab Data & Knowledge Engn Nanchang 330013 Jiangxi Peoples R China

In response to the intricate and non-standardized nature of e-commerce product descriptions, we propose an enhanced BERT-BiLSTM-CRF entity extraction model to address the limitations of existing models in accurately extracting necessary entities. BERT serves as the encoder, BiLSTM constructs the feature extractor, and CRF refines the model for improved accuracy. The e-commerce product text data are preprocessed and annotated according to real-world task requirements, and the model's performance is validated on both self-generated and MSRA public datasets. Experimental results show that the BERT-BiLSTM-CRF model achieves an F1 score of 98% and precision of 98% on the self-generated dataset, effectively catering to real-world e-commerce product information mining tasks. Furthermore, it achieves a relative balance between precision and recall, enabling more precise extraction of essential entities. The model's performance improvements are attributed to the integration of BERT's contextual understanding, BiLSTM's ability to capture long-range dependencies, and CRF's label sequence optimization, which collectively enhance the model's generalization and accuracy across diverse datasets. Compared to traditional models, the proposed BERT-BiLSTM-CRF model shows a 6.5% improvement in F1 score on the MSRA dataset and a 4% improvement in precision on the self-generated dataset, highlighting its superior capability in handling complex and non-standardized e-commerce text data.
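The precision, recall, and F1 figures quoted in the abstract are typically computed at the entity level. The following is a minimal sketch of that computation, assuming BIO-style tags; the tag names `BRAND` and `COLOR` and the helper names are hypothetical illustrations, not taken from the paper.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, entity_type)


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Convert a BIO tag sequence into a set of typed entity spans."""
    spans: Set[Span] = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any entity still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        inside = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not inside:
            spans.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def prf1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Entity-level precision, recall, and F1 (exact span and type match)."""
    g, p = bio_to_spans(gold), bio_to_spans(pred)
    tp = len(g & p)
    precision = tp / len(p) if p else 0.0
    recall = tp / len(g) if g else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: one entity matches, one is shifted by a token.
gold = ["B-BRAND", "I-BRAND", "O", "B-COLOR", "O"]
pred = ["B-BRAND", "I-BRAND", "O", "O", "B-COLOR"]
print(prf1(gold, pred))
```

Exact-match scoring is only one convention; partial-overlap scoring would give different numbers, and the abstract does not state which one the authors used.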

Keywords: Entity extraction E-commerce text big data Deep learning Bert learning model

OEE-CFC: A dataset for Open Event Extraction from Chinese Financial Commentary

2024 Conference on Empirical Methods in Natural Language Processing, EMNLP 2024

Authors: Wan, Qizhi; Wan, Changxuan; Hu, Rong; Liu, Dexi; Xu, Wenwu; Xu, Kang; Zou, Meihua; Liu, Tao; Yang, Jie; Xiong, Zhenwei. Affiliations: School of Computer and Artificial Intelligence Jiangxi University of Finance and Economics Jiangxi Key Laboratory of Data and Knowledge Engineering China

ISBN: (print) 9798891761681

To meet application needs, event extraction has shifted from simple entities to unconventional entities serving as event arguments. However, current corpora with unconventional entities as event arguments are limited in event types and lack rich multi-events and shared arguments. Financial commentary not only describes the basic elements of an event but also states the background, scope, manner, condition, result, and tool used for the event, as well as the tense, intensity, and emotions of actions or state changes. Therefore, it is not suitable to develop event types that include only a few specific roles, as these cannot comprehensively capture the event's semantics. Also, there are abundant complex entities serving as event arguments, multiple events, and shared event arguments. To advance the practicality of event extraction technology, this paper first develops a general open event template from the perspective of understanding the meaning of events, aiming to comprehensively reveal useful information about events. This template includes 21 event argument roles, divided into three categories: core event roles, situational event roles, and adverbial roles. Then, based on the constructed event template, Chinese financial commentaries are collected and manually annotated to create a corpus, OEE-CFC, supporting open event extraction. This corpus includes 17,469 events, 44,221 arguments, 3,644 complex arguments, and 5,898 shared arguments. Finally, based on the characteristics of OEE-CFC, we design four types of prompts, and two models for event argument extraction are developed, with experiments conducted on the prompts. © 2024 Association for Computational Linguistics.

Keywords: Semantics

A Multifocal Graph-Based Neural Network Scheme for Topic Event Extraction

ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2025, Vol. 43, No. 1, pp. 1-36

Authors: Wan, Qizhi; Wan, Changxuan; Xiao, Keli; Hu, Rong; Liu, Dexi; Liao, Guoqiong; Liu, Xiping; Shuai, Yuxin. Affiliations: Jiangxi Univ Finance & Econ Sch Comp & Artificial Intelligence Nanchang Peoples R China Jiangxi Key Lab Data & Knowledge Engn Nanchang Peoples R China SUNY Stony Brook Coll Business Stony Brook NY 11794 USA

Event extraction is a long-standing and challenging task in natural language processing, and existing studies mainly focus on extracting events within sentences. However, a significant problem that has not been carefully investigated is whether an "event topic" can be identified to represent the main aspects of extracted events. This article formulates the "topic event" extraction problem, aiming to identify a representative event from extracted ones. Specifically, after defining the topic event, we develop a multifocal graph-based framework to handle the extraction task. To enrich the associations of events and their tokens, we construct four event graphs, including the event subgraph and three event-associated graphs (i.e., event dependency parsing graph, event organization graph, and event share token graph), that reflect the internal and external structures of events, respectively. Subsequently, we design a multi-attention event-graph neural network to capture these event graph structures and improve event subgraph embedding. Finally, the output embeddings in the last layer of each channel are concatenated and fed into a fully connected network for topic event recognition. Extensive experiments validate the effectiveness of our method, and the results confirm its superiority over state-of-the-art baselines. In-depth analyses explore the essential factors (e.g., graph structures, attentions, feature generation method, etc.) determining the extraction performance.

Keywords: Event Topic topic event extraction event graphs subgraph graph neural network

rHDP: An Aspect Sharing-Enhanced Hierarchical Topic Model for Multi-Domain Corpus

ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2024, Vol. 42, No. 3, pp. 1-31

Authors: Zhang, Yitao; Wan, Changxuan; Xiao, Keli; Wan, Qizhi; Liu, Dexi; Liu, Xiping. Affiliations: Jiangxi Univ Finance & Econ Nanchang Peoples R China East China Jiaotong Univ Nanchang Peoples R China Jiangxi Key Lab Data & Knowledge Engn Nanchang Peoples R China SUNY Stony Brook Coll Business New York NY USA

Learning topic hierarchies from a multi-domain corpus is crucial in topic modeling as it reveals valuable structural information embedded within documents. Despite the extensive literature on hierarchical topic models, effectively discovering inter-topic correlations and differences among subtopics at the same level in the topic hierarchy, obtained from multiple domains, remains an unresolved challenge. This article proposes an enhanced nested Chinese restaurant process (nCRP), nCRP+, by introducing an additional mechanism based on Chinese restaurant franchise (CRF) for aspect-sharing pattern extraction in the original nCRP. Subsequently, by employing the distribution extracted from nCRP+ as the prior distribution for topic hierarchy in the hierarchical Dirichlet processes (HDP), we develop a hierarchical topic model for multi-domain corpus, named rHDP. We describe the model with the analogy of Chinese restaurant franchise based on the central kitchen and propose a hierarchical Gibbs sampling scheme to infer the model. Our method effectively constructs well-established topic hierarchies, accurately reflecting diverse parent-child topic relationships, explicit topic aspect sharing correlations for inter-topics, and differences between these shared topics. To validate the efficacy of our approach, we conduct experiments using a renowned public dataset and an online collection of Chinese financial documents. The experimental results confirm the superiority of our method over the state-of-the-art techniques in identifying multi-domain topic hierarchies, according to multiple evaluation metrics.

Keywords: Hierarchical topic model aspect sharing pattern Chinese restaurant franchise hierarchical Dirichlet processes

Token-Event-Role Structure-Based Multi-Channel Document-Level Event Extraction

ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2024, Vol. 42, No. 4, pp. 1-27

Authors: Wan, Qizhi; Wan, Changxuan; Xiao, Keli; Xiong, Hui; Liu, Dexi; Liu, Xiping; Hu, Rong. Affiliations: Jiangxi Univ Finance & Econ Sch Informat Management Nanchang Jiangxi Peoples R China Jiangxi Key Lab Data & Knowledge Engn Nanchang Jiangxi Peoples R China SUNY Stony Brook Coll Business Stony Brook NY USA Hong Kong Univ Sci & Technol Guangzhou Thrust Artificial Intelligence Guangzhou Guangdong Peoples R China

Document-level event extraction is a long-standing challenging information retrieval problem involving a sequence of sub-tasks: entity extraction, event type judgment, and event type-specific multi-event extraction. However, addressing the problem as multiple learning tasks leads to increased model complexity. Also, existing methods insufficiently utilize the correlation of entities crossing different events, resulting in limited event extraction performance. This article introduces a novel framework for document-level event extraction, incorporating a new data structure called token-event-role and a multi-channel argument role prediction module. The proposed data structure enables our model to uncover the primary role of tokens in multiple events, facilitating a more comprehensive understanding of event relationships. By leveraging the multi-channel prediction module, we transform entity and multi-event extraction into a single task of predicting token-event pairs, thereby reducing the overall parameter size and enhancing model efficiency. The results demonstrate that our approach outperforms the state-of-the-art method by 9.5 percentage points in terms of the F1 score, highlighting its superior performance in event extraction. Furthermore, an ablation study confirms the significant value of the proposed data structure in improving event extraction tasks, further validating its importance in enhancing the overall performance of the framework.

Keywords: Document-level event extraction token-event-role data structure joint learning multi-channel neural network

CFERE: Multi-type Chinese financial event relation extraction

INFORMATION SCIENCES, 2023, Vol. 630, No. 1, pp. 119-134

Authors: Wan, Qizhi; Wan, Changxuan; Xiao, Keli; Hu, Rong; Liu, Dexi; Liu, Xiping. Affiliations: Jiangxi Univ Finance & Econ Jiangxi Key Lab Data & Knowledge Engn Nanchang Peoples R China Jiangxi Univ Finance & Econ Sch Informat Technol Nanchang Peoples R China SUNY Stony Brook Coll Business Stony Brook NY 11794 USA

Extracting various types of event relations in financial texts can benefit many downstream applications supporting financial analysis. This paper addresses the multi-type event relation extraction problem in the finance domain, focusing on several issues in existing studies: (1) the limited event relation types involved, (2) insufficient feasibility when handling non-annotated data, (3) ineffectiveness in recognizing multi-type event relations, and (4) the asynchronous event and event relation extraction process. To tackle these limitations, we carefully define six types of event relations based on the characteristics of financial texts (e.g., abundant numerical words) and further devise an integral framework for Chinese financial event relation extraction. The framework is capable of handling unsupervised event extraction and event relation recognition jointly. Specifically, according to linguistic characteristics, a Core Verb Chain is employed for event identification. Then, by constructing a Syntactic Semantic Dependency Parsing graph, scattered events are combined into pairs, and event ellipsis elements can be completed to prevent event information loss. Also, to capture more sentence semantics, we formulate an Event Restore module that converts the structured event pairs into event restore sentence pairs and feeds the pairs into the BERT model for relation type identification. Finally, to enhance the embeddings for event core elements, an Event Core Embeddings layer is added to BERT, and we fine-tune the model on our annotated financial corpus. Extensive experiments are conducted to validate the effectiveness of our method, and the results confirm its superiority over the state-of-the-art baselines.

Keywords: Chinese financial event relation Core verb chain Syntactic semantic dependency parsing graph Event restore sentence Event core embeddings BERT

A Multi-channel Hierarchical Graph Attention Network for Open Event Extraction

ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2023, Vol. 41, No. 1, pp. 1-27

Authors: Wan, Qizhi; Wan, Changxuan; Xiao, Keli; Hu, Rong; Liu, Dexi. Affiliations: Jiangxi Key Lab Data & Knowledge Engn Nanchang 33013 Jiangxi Peoples R China Jiangxi Univ Finance & Econ Sch Informat Management Nanchang 330032 Jiangxi Peoples R China SUNY Stony Brook Coll Business Stony Brook NY 11794 USA Jiangxi Univ Finance & Econ Sch Software & Internet Things Engn Nanchang 330032 Jiangxi Peoples R China

Event extraction is an essential task in natural language processing. Although extensively studied, existing work shares issues in three aspects, including (1) the limitations of using original syntactic dependency structure, (2) insufficient consideration of the node level and type information in Graph Attention Network (GAT), and (3) insufficient joint exploitation of the node dependency type and part-of-speech (POS) encoding on the graph structure. To address these issues, we propose a novel framework for open event extraction in documents. Specifically, to obtain an enhanced dependency structure with powerful encoding ability, our model is capable of handling an enriched parallel structure with connected ellipsis nodes. Moreover, through a bidirectional dependency parsing graph, it considers the sequence of order structure and associates the ancestor and descendant nodes. Subsequently, we further exploit node information, such as the node level and type, to strengthen the aggregation of node features in our GAT. Finally, based on the coordination of triple-channel features (i.e., semantic, syntactic dependency and POS), the performance of event extraction is significantly improved. Extensive experiments are conducted to validate the effectiveness of our method, and the results confirm its superiority over the state-of-the-art baselines. Furthermore, in-depth analyses are provided to explore the essential factors determining the extraction performance.

Keywords: Open event extraction bidirectional dependency parsing graph Hierarchical Graph Attention Network multiple channels

Financial causal sentence recognition based on BERT-CNN text classification

JOURNAL OF SUPERCOMPUTING, 2022, Vol. 78, No. 5, pp. 6503-6527

Authors: Wan, Chang-Xuan; Li, Bo. Affiliations: Jiangxi Univ Finance & Econ Sch Informat & Technol Nanchang 330013 Jiangxi Peoples R China Jiangxi Univ Sci & Technol Ganzhou 340000 Peoples R China Jiangxi Univ Finance & Econ Jiangxi Key Lab Data & Knowledge Engn Nanchang 330013 Jiangxi Peoples R China

By studying the causality contained in financial texts, we can further reveal more potential laws of economic activities, such as "factors promoting stable and healthy economic development," "The central bank's use of the loan window to issue money will increase the probability of inflation," "The consequence of overcapacity is a decline in product prices," and so on. Causal sentence recognition usually includes two sub-tasks: one is to design rules or templates to find candidate causal sentences; the other is to design a classifier to sort candidate causal sentences to finally identify the causal sentence. This article first focuses on the characteristics of complex sentence patterns of multiple causes and one effect, multiple effects and one cause, and multiple causes and multiple effects in financial review texts, and provides a relatively complete set of candidate causal sentence identification rules, which can identify both simple causal sentences and complex causal sentences. A BERT-CNN (Bidirectional Encoder Representations from Transformers-Convolutional Neural Networks) combination model is proposed for the classification of candidate causal sentences. On the one hand, a CNN (Convolutional Neural Networks) structure is added to the specific task layer of the BERT (Bidirectional Encoder Representations from Transformers) model to capture important local information in the text. On the other hand, in order to make better use of the self-attention mechanism, the local text representation and the output of the BERT are input together in the multi-layer transformer encoder. A complete representation of the text is finally obtained through a single-layer transformer encoder. Experimental results show that our model is significantly better than the most advanced baseline model, with a 5.31 pts improvement in F1 over previous analyzers.

Keywords: Text classification Recognition of causality BERT model

A Part-of-Speech Tagging Model Employing Word Clustering and Syntactic Parsing

Chinese Journal of Electronics, 2025, Vol. 23, No. 1, pp. 109-114

Author: Lichi Yuan. Affiliations: School of Information Technology Jiangxi University of Finance and Economics Nanchang China Jiangxi Key Laboratory of Data and Knowledge Engineering Jiangxi University of Finance and Economics Nanchang China

Part-of-speech (POS) tagging is a basic task in the field of natural language processing. This paper builds a POS tagger based on an improved hidden Markov model (HMM) by employing word clustering and a syntactic parsing model. First, to overcome the defects of the classical HMM, the Markov family model (MFM), a new statistical model, is introduced. Second, to solve the problem of data sparseness, we propose a bottom-up hierarchical word clustering algorithm. Then we combine syntactic parsing with part-of-speech tagging. The experiments show that the improved POS tagging model performs better than HMMs under the same testing conditions: precision rises from 94.642% to 97.235%.
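For context, the classical HMM baseline that the abstract compares against is usually decoded with the Viterbi algorithm. The sketch below illustrates only that baseline, not the Markov family model or the clustering step from the paper; the tag set, probabilities, and function name are hypothetical toy values, and the input is assumed to be non-empty.

```python
import math
from typing import Dict, List


def viterbi(words: List[str], tags: List[str],
            start_p: Dict[str, float],
            trans_p: Dict[str, Dict[str, float]],
            emit_p: Dict[str, Dict[str, float]]) -> List[str]:
    """Return the most probable tag sequence under a first-order HMM."""
    def lg(p: float) -> float:
        # Log-probabilities avoid underflow; zero probability maps to -inf.
        return math.log(p) if p > 0 else float("-inf")

    # best[i][t]: best log-probability of any path ending in tag t at word i.
    best = [{t: lg(start_p.get(t, 0.0)) + lg(emit_p[t].get(words[0], 0.0))
             for t in tags}]
    back: List[Dict[str, str]] = [{}]
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            prev = max(tags, key=lambda s: best[i - 1][s] + lg(trans_p[s].get(t, 0.0)))
            best[i][t] = (best[i - 1][prev] + lg(trans_p[prev].get(t, 0.0))
                          + lg(emit_p[t].get(words[i], 0.0)))
            back[i][t] = prev
    # Follow back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]


# Hypothetical toy model with two tags: noun (N) and verb (V).
tags = ["N", "V"]
start_p = {"N": 0.8, "V": 0.2}
trans_p = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.6, "V": 0.4}}
emit_p = {"N": {"dogs": 0.6, "bark": 0.1}, "V": {"dogs": 0.05, "bark": 0.7}}
print(viterbi(["dogs", "bark"], tags, start_p, trans_p, emit_p))
```

Words missing from the emission table receive zero probability here, which is the data-sparseness problem the abstract says word clustering is meant to reduce.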

On-Line System of Garbage Image-Orientated Intelligent Classification, Submission and Examination

20th IEEE International Conference on e-Business engineering, ICEBE 2024

Authors: Tian, Jiayin; Wang, Yaozhi; Liu, Jiaxin; Chen, Yan. Affiliations: School of Computer Science and Technology Xi'an Jiaotong University Shaanxi Xi'an China Xi'an Jiaotong University Shaanxi Key Lab of Big Data Knowledge Engineering Shaanxi Xi'an China School of Computer Science and Technology Xi'an Jiaotong University Xi'an Jiaotong University Shaanxi Key Lab of Big Data Knowledge Engineering Shaanxi Xi'an China

ISBN: (print) 9798350365856

In a world where new products appear continually, novel waste types are ubiquitous. The long-tailed distribution of garbage types makes it difficult for current image-based garbage classification systems to perform well, and calls for efficient garbage classification that can detect new and rare wastes and support class-incremental learning for environmental sustainability. Therefore, we propose a framework, the Online System of Garbage Image-Oriented Intelligent Classification, Submission, and Examination, to facilitate incremental garbage classification. Within it, to identify novel garbage effectively, we introduce a few-shot object detection method with two key algorithms: the Two-Stage Object Detection Learning Algorithm and the Dynamic Query-based Incremental Few-shot Learning Algorithm. Our experimental results show that both outperform existing methods on the MS COCO dataset. Then, a strategy of Class-Incremental learning based Residual Network is proposed to meet the need for class-incremental learning of new waste types. The experimental results support our strategy. Finally, a prototype system that employs the above algorithms and strategy is described. © 2024 IEEE.

Keywords: Zero-shot learning
