Construction site accident analysis using text mining and natural language processing techniques

Authors: Zhang, Fan; Fleyeh, Hasan; Wang, Xinru; Lu, Minghui

Affiliations: Dalarna Univ, Dept Comp Engn, S-79188 Falun, Sweden; Univ Nottingham Ningbo, Res Ctr Fluids & Thermal Engn, Ningbo 315100, Zhejiang, Peoples R China; Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200030, Peoples R China

Publication: AUTOMATION IN CONSTRUCTION

Year/Volume: 2019, Vol. 99

Pages: 238-248

Subject classification: 08 [Engineering]; 0813 [Engineering - Architecture]; 0814 [Engineering - Civil Engineering]

Funding: The dataset used in this study was processed and published by Yang Miang Goh and C.U. Ubeynarayana. Download link: https://github.com/safetyhub/OSHA_Acc.git. The original data can be downloaded from https://www.osha.gov/pls/imis/accidentsearch.html (Occupational Safety and Health Administration, 2016).

Keywords: Construction site accident analysis; Machine learning; Text mining; Natural language processing; Sequential quadratic programming; Optimization

Abstract: Workplace safety is a major concern in many countries. Among industries, the construction sector is identified as the most hazardous workplace. Construction accidents not only cause human suffering but also result in large financial losses. To prevent the recurrence of similar accidents and to make scientific risk control plans, analysis of accidents is essential. In the construction industry, fatality and catastrophe investigation summary reports are available for past accidents. In this study, text mining and natural language processing (NLP) techniques are applied to analyze construction accident reports. Specifically, five baseline models, namely support vector machine (SVM), linear regression (LR), K-nearest neighbor (KNN), decision tree (DT), and Naive Bayes (NB), together with an ensemble model, are proposed to classify the causes of the accidents. In addition, the Sequential Quadratic Programming (SQP) algorithm is used to optimize the weight of each classifier in the ensemble model. Experimental results show that the optimized ensemble model outperforms the other models considered in this study in terms of average weighted F1 score. The results also show that the proposed approach is more robust in cases of low support. Moreover, an unsupervised chunking approach is proposed to extract the common objects that cause accidents, based on grammar rules identified in the reports. As harmful objects are one of the major factors leading to construction accidents, identifying such objects is extremely helpful for mitigating potential risks. Certain limitations of the proposed methods are discussed, and suggestions for future improvements are provided.
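
The weight optimization step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes SciPy's SLSQP solver as the SQP routine, assumes log loss on held-out data as a smooth stand-in objective (the abstract reports weighted F1 but does not state the objective that is optimized), and uses synthetic class probabilities in place of real classifier outputs. The names `blend_log_loss`, `optimize_ensemble_weights`, and `make_probas` are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def blend_log_loss(w, probas, y_true):
    """Log loss of the weighted average of per-model class probabilities.

    probas has shape (n_models, n_samples, n_classes); y_true holds int labels.
    """
    blended = np.tensordot(w, probas, axes=1)  # -> (n_samples, n_classes)
    p_true = np.clip(blended[np.arange(len(y_true)), y_true], 1e-12, 1.0)
    return -np.mean(np.log(p_true))


def optimize_ensemble_weights(probas, y_true):
    """Find non-negative weights summing to 1 that minimize the blended log loss."""
    n_models = probas.shape[0]
    w0 = np.full(n_models, 1.0 / n_models)  # start from uniform weights
    result = minimize(
        blend_log_loss,
        w0,
        args=(probas, y_true),
        method="SLSQP",  # SciPy's sequential least-squares QP solver
        bounds=[(0.0, 1.0)] * n_models,
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1.0}],
    )
    return result.x


def make_probas(y_true, n_classes, strength, rng):
    """Synthetic (hypothetical) classifier output: softmax of noisy scores."""
    scores = rng.normal(size=(len(y_true), n_classes))
    scores[np.arange(len(y_true)), y_true] += strength
    exp = np.exp(scores - scores.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
n_classes = 3
y_val = rng.integers(0, n_classes, size=200)
# Three stand-in "classifiers" of decreasing quality.
probas = np.stack([make_probas(y_val, n_classes, s, rng) for s in (3.0, 1.0, 0.0)])

weights = optimize_ensemble_weights(probas, y_val)
uniform = np.full(probas.shape[0], 1.0 / probas.shape[0])
y_pred = np.argmax(np.tensordot(weights, probas, axes=1), axis=1)
print("weights:", np.round(weights, 3))
print("log loss (uniform vs optimized):",
      blend_log_loss(uniform, probas, y_val),
      blend_log_loss(weights, probas, y_val))
```

In practice the probability arrays would come from the fitted baseline classifiers evaluated on a validation split, and the objective could be replaced by whichever criterion the study actually optimizes.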
