Oracle bone inscriptions (OBI) constitute the earliest developed writing system in China, bearing invaluable written evidence of early Shang history and paleography. However, the task of deciphering OBI, in the current...
Oracle characters are among the earliest hieroglyphs, dating back to the Shang Dynasty in China. Oracle character recognition is important for modern archaeology, ancient text understanding, and historical chronology. To cope with the limited amount and class imbalance of training data in oracle character recognition, we propose a classification method based on deep metric learning. We use a convolutional neural network (CNN) to map character images to a Euclidean space in which the distance between samples measures their similarity, so that classification can be performed by the nearest neighbor (NN) rule. Because new categories are still being discovered, our model supports rejecting unseen categories and adding new categories. To accelerate NN classification, we also propose a prototype pruning method that incurs little loss of accuracy. The proposed method exceeds the state of the art on the public dataset Oracle-20K and outperforms a CNN with a softmax layer on a new dataset, Oracle-AYNU.
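The nearest neighbor rule with rejection described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy 2-D embeddings, the labels, and the `reject_threshold` value are all assumptions made for illustration, and in practice the embeddings would come from the trained CNN.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_nn(embedding, prototypes, reject_threshold):
    """Return (label, distance) of the nearest prototype.

    The label is None when even the nearest prototype is farther than
    reject_threshold, i.e. the sample is treated as an unseen category.
    """
    best_label, best_dist = None, math.inf
    for label, proto in prototypes:
        d = euclidean(embedding, proto)
        if d < best_dist:
            best_label, best_dist = label, d
    if best_dist > reject_threshold:
        return None, best_dist
    return best_label, best_dist


# Hypothetical prototype set: (label, embedding) pairs in a toy 2-D space.
prototypes = [
    ("sun", (0.0, 0.0)),
    ("sun", (0.2, 0.1)),
    ("moon", (3.0, 3.0)),
]

label, dist = classify_nn((0.1, 0.0), prototypes, reject_threshold=1.0)
print(label, dist)
```

Under these assumptions, a new category is configured without retraining by appending its embeddings to `prototypes`, and a sample far from every prototype is rejected by returning `None` as its label.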
The oracle bone character (OBC) script from ancient China is one of the most famous ancient writing systems in the world. Identifying and deciphering OBCs is one of the most important topics in oracle bone studies. One challenge in this research is that reviewing the literature costs a great deal of time and manpower. The digitization of OBC literature through automatic recognition is therefore an inevitable trend. However, the OBCs in the literature are usually handwritten characters, and no database of handwritten OBCs has yet been presented. In this paper, we establish a handwritten oracle bone character database called HWOBC, containing 83,245 character-level samples grouped into 3,881 character categories. We also report the performance of several baseline DCNN-based methods, among which Melnyk-Net achieves the best accuracy of 97.64%. We anticipate that the publication of this database will facilitate the development of OBC research.