ISBN: (print) 9783030923099; 9783030923105
ICD coding is usually framed as a multi-label prediction task in which multiple ICD codes are assigned to a clinical text. In this paper, we present a novel way of injecting prior knowledge of hierarchical structures into BERT (HieBERT) to predict ICD codes automatically. The hierarchical structures consist of code tree positions and code tree sequence LSTM embeddings, which we generate as hierarchical representations of ICD codes. In addition, we train a clinical BERT model on millions of clinical texts to capture contextual and co-occurrence information. We then propose an alignment method to map the hierarchical representations into the BERT vector space. We implement HieBERT on a widely used dataset, MIMIC-III, and the experimental results indicate that our proposed model achieves strong performance compared with previous work.
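To make the multi-label framing and the idea of a code's position in the hierarchy concrete, here is a minimal sketch. It is not the HieBERT implementation: the helper names `icd9_ancestors` and `multi_hot` are hypothetical, and the sketch assumes dotted ICD-9 codes (the code system used in MIMIC-III) where each extra digit after the dot is one level deeper in the tree.

```python
from typing import List


def icd9_ancestors(code: str) -> List[str]:
    """Return the path from the 3-character category down to the full code.

    Assumption: codes are written with a dot (e.g. "250.01"), and every
    additional digit after the dot corresponds to one more tree level.
    """
    code = code.strip()
    if "." not in code:
        return [code]
    category, rest = code.split(".", 1)
    path = [category]
    for i in range(1, len(rest) + 1):
        path.append(f"{category}.{rest[:i]}")
    return path


def multi_hot(assigned: List[str], label_space: List[str]) -> List[int]:
    """Encode the set of codes assigned to one text as a 0/1 target vector."""
    index = {c: i for i, c in enumerate(label_space)}
    vec = [0] * len(label_space)
    for c in assigned:
        vec[index[c]] = 1
    return vec


# Hypothetical label space and one text's assigned codes.
labels = ["250.01", "401.9", "428.0"]
target = multi_hot(["250.01", "428.0"], labels)
path = icd9_ancestors("250.01")
```

A path such as the one returned by `icd9_ancestors` is the kind of sequence that could be fed to a sequence encoder, while the multi-hot vector is the usual training target for a per-label sigmoid output.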