A growing number of applications use gestures to control the mouse for interaction, but there is no convenient and effective application system for inputting Chinese characters into the computer s...
Oracle characters are among the earliest hieroglyphic scripts and date back to the Shang Dynasty in China. Oracle character recognition is important for modern archaeology, ancient text understanding, and historical chronology. To overcome the limited size and class imbalance of training data in oracle character recognition, we propose a classification method based on deep metric learning. We use a convolutional neural network (CNN) to map character images to a Euclidean space in which the distance between samples measures their similarity, so that classification can be performed by the nearest neighbor (NN) rule. Because new categories are still being discovered, our model supports rejecting unseen categories and adding new ones. To accelerate NN classification, we also propose a prototype pruning method that incurs little loss of accuracy. The proposed method exceeds the state of the art on the public dataset oracle-20K and outperforms a CNN with a softmax layer on a new dataset, oracle-AYNU.
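The nearest neighbor rule with rejection described in the abstract can be illustrated with a minimal sketch. It assumes the CNN has already produced the embeddings; the toy 2-D vectors, the labels, the `nn_classify` helper, and the fixed `reject_threshold` are all hypothetical and are not taken from the paper, which does not state here how rejection is decided.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nn_classify(query, prototypes, reject_threshold):
    """Return (label, distance) of the nearest prototype.

    The label is None when even the nearest prototype is farther away
    than reject_threshold, i.e. the query is treated as an unseen category.
    The threshold rule is an assumption made for this sketch.
    """
    best_label, best_dist = None, math.inf
    for label, vector in prototypes:
        dist = euclidean(query, vector)
        if dist < best_dist:
            best_label, best_dist = label, dist
    if best_dist > reject_threshold:
        return None, best_dist
    return best_label, best_dist


# Hypothetical embeddings standing in for CNN outputs.
prototypes = [
    ("sun", (0.0, 0.0)),
    ("sun", (0.2, 0.1)),
    ("moon", (3.0, 3.0)),
]

known = nn_classify((0.1, 0.0), prototypes, reject_threshold=1.0)
unseen = nn_classify((10.0, 10.0), prototypes, reject_threshold=1.0)

# Configuring a new category only requires adding prototypes; no retraining
# of a softmax layer is involved in this sketch.
prototypes.append(("rain", (10.0, 10.0)))
after_update = nn_classify((10.0, 10.0), prototypes, reject_threshold=1.0)
```

In this sketch, pruning prototypes would amount to shortening the `prototypes` list, which reduces the number of distance computations per query; the pruning criterion used in the paper is not given in the abstract and is not reproduced here.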
Video question answering (Video QA), which answers questions according to the visual content of a video clip, has received much attention in recent years. The task can be solved from the video data alone, but when the clip comes with relevant text information, it can also be solved using fused video and text data. Two problems then need to be solved: how to select the useful region features from the video frames and the useful text features from the text information, and how to fuse the video and text features. We propose a forget memory network to address these problems. The forget memory network with the video framework solves the Video QA task from the video data alone: it selects the region features that are useful for the question and forgets the irrelevant region features from the video frames. The forget memory network with the video and text framework extracts the useful text features, forgets the text features that are irrelevant to the question, and fuses the video and text data to solve the Video QA task. The fused video and text features help improve experimental performance.
Deep convolutional neural networks (DCNNs) trained with strong pixel-level supervision have recently boosted performance in semantic image segmentation significantly. The receptive field is crucial in such visual tasks, as the output must capture enough information about large objects to make a good decision. In DCNNs, the theoretical receptive field can be very large, but the effective receptive field may be quite small, and the latter is an important factor in performance. In this work, we define a method for measuring the effective receptive field. We observe that stacking layers with large receptive fields increases both the size and the density of the receptive field. Based on this observation, we design a Dense Global Context Module, which enlarges the coverage of the effective receptive field and raises its density. With the Dense Global Context Module, the segmentation model uses substantially fewer parameters while its performance improves substantially. Extensive experiments show that the Dense Global Context Module performs very well on the PASCAL VOC2012 and PASCAL CONTEXT datasets.
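To make the notion of a theoretical receptive field concrete, the following sketch computes it for a stack of convolution layers using the standard recurrence over kernel size, stride, and dilation. The function name and the example layer stacks are illustrative assumptions. The sketch covers only the theoretical size; the abstract does not specify how the effective receptive field is measured, so that method is not shown.

```python
def theoretical_receptive_field(layers):
    """Theoretical receptive field (in input pixels) of stacked conv layers.

    layers: list of (kernel_size, stride, dilation) tuples, input to output.
    """
    rf = 1    # receptive field of a single input pixel
    jump = 1  # distance in input pixels between adjacent output positions
    for kernel, stride, dilation in layers:
        effective_kernel = dilation * (kernel - 1) + 1
        rf += (effective_kernel - 1) * jump
        jump *= stride
    return rf


# Three 3x3 convolutions with stride 1 cover a 7x7 input window.
rf_plain = theoretical_receptive_field([(3, 1, 1)] * 3)

# Two 3x3 convolutions with stride 2 reach the same size with fewer layers.
rf_strided = theoretical_receptive_field([(3, 2, 1)] * 2)

# A single 3x3 convolution with dilation 2 covers a 5x5 window.
rf_dilated = theoretical_receptive_field([(3, 1, 2)])
```

The recurrence shows why stacking layers with large kernels, strides, or dilations grows the theoretical size quickly, even though the region that actually influences the output may be much smaller.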
Oracle bone inscriptions (OBI) are incised ancient Chinese characters found on oracle bones, which are animal bones or turtle shells used in divination in Bronze Age China. The vast majority record the pyromanti...