ISBN: (print) 9781450372633
Eye semantic segmentation is a fundamental task in many areas, such as identification and medical applications. In this study, three encoder-decoder architectures based on convolutional neural networks are applied to segment the eyes. A simple encoder-decoder architecture is capable of generating only coarse segmentation results. The U-net and SegNet architectures, on the other hand, can capture fine details such as eyelashes. However, they sometimes produce worse overall results than the simple architecture. To resolve this problem, we introduce a deep convolutional neural network-based ensemble technique for eye segmentation. The results from these architectures are combined to yield good results in both coarse-level and fine-level segmentation. In the proposed technique, a trainable mask function is applied to achieve an optimal ensemble of coarse-level and fine-level results. Our dataset comprises 64 eye images covering different environments, camera settings, people, and eye conditions. Experimental results show that our ensemble technique improves on the results of the conventional architectures. The proposed ensemble method reaches an average accuracy of 96.33% for three-class segmentation.
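The abstract does not spell out the form of the trainable mask function. As a rough illustration only, one common way to combine a coarse and a fine prediction is a per-pixel convex combination weighted by a mask in [0, 1]. The sketch below assumes that form; the names `blend_with_mask`, `argmax_labels`, and `mask_logits` are hypothetical, and the mask values are fixed by hand here rather than learned.

```python
import math


def sigmoid(x):
    """Squash a real-valued logit into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def blend_with_mask(coarse, fine, mask_logits):
    """Per-pixel convex combination of two class-probability maps.

    coarse, fine: H x W x C nested lists of class probabilities.
    mask_logits:  H x W nested list of logits (assumed to be learned in a
                  real system; supplied by hand in this sketch).
    A mask value near 1 trusts the coarse model, near 0 the fine model.
    """
    blended = []
    for row_c, row_f, row_m in zip(coarse, fine, mask_logits):
        out_row = []
        for p_c, p_f, logit in zip(row_c, row_f, row_m):
            m = sigmoid(logit)
            out_row.append([m * c + (1.0 - m) * f for c, f in zip(p_c, p_f)])
        blended.append(out_row)
    return blended


def argmax_labels(prob_map):
    """Turn an H x W x C probability map into an H x W label map."""
    return [[max(range(len(p)), key=p.__getitem__) for p in row]
            for row in prob_map]


# Toy 1 x 2 image with three classes, matching the three-class setting.
coarse = [[[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]]
fine = [[[0.2, 0.7, 0.1], [0.1, 0.1, 0.8]]]
mask_logits = [[10.0, -10.0]]  # trust coarse at pixel 0, fine at pixel 1

blended = blend_with_mask(coarse, fine, mask_logits)
labels = argmax_labels(blended)
```

Because the weights at each pixel sum to one, the blended values remain valid class probabilities, so the usual argmax can still be applied to obtain the final label map.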
ISBN: (print) 9781450372633
Many computer vision applications rely on segmentation. To achieve good results in handwritten text recognition (HTR), character segmentation is essential for extracting each individual character. In this study, we propose a novel algorithm for offline handwritten character segmentation, particularly for the Thai language. We describe the characteristics of the Thai language and define the problems that arise when performing Thai character segmentation. Segmentation consists of two parts: horizontal link segmentation and vertical link segmentation. The chosen type of algorithm is a convolutional encoder-decoder network. Our models are based on the well-known encoder-decoder models U-net and SegNet. The best horizontal link segmentation model achieves an F1-score of up to 0.929 on the real-world test set. For vertical link segmentation, the best models for topmost, upper, base, and lower characters attain F1-scores of 0.799, 0.873, 0.932, and 0.820, respectively.
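For readers unfamiliar with the metric, the F1-score is the harmonic mean of precision and recall, F1 = 2TP / (2TP + FP + FN). The abstract does not state at which level the score is computed, so the sketch below assumes a pixel-level comparison of binary masks; the function name `f1_score` and the toy masks are illustrative only.

```python
def f1_score(pred, target):
    """F1-score between two binary masks given as nested lists of 0/1.

    Assumption: the score is computed per pixel. When both masks are
    empty, this sketch returns 1.0 by convention.
    """
    tp = fp = fn = 0
    for row_p, row_t in zip(pred, target):
        for p, t in zip(row_p, row_t):
            if p and t:
                tp += 1      # predicted link, and it is a link
            elif p and not t:
                fp += 1      # predicted link, but it is background
            elif t and not p:
                fn += 1      # missed a true link pixel
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


# Toy 2 x 3 masks: 2 true positives, 1 false positive, 1 false negative.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
score = f1_score(pred, target)
```

With these toy masks the formula gives 2·2 / (2·2 + 1 + 1) = 4/6, which is about 0.667.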
ISBN: (print) 9781538626344; 9781538626337
In this paper, we propose a non-task-oriented dialogue system that controls the utterance length. Dialogue systems can be classified as either task-oriented or non-task-oriented. Recently, demand for non-task-oriented dialogue systems has been increasing. The utterance length is an important piece of information in a dialogue system. In general, our utterances tend to be long when we are speakers and short when we are listeners. In addition, the utterance length differs from person to person, so we adjust our own utterance length for friendly communication. The effect of the utterance length has never been considered in dialogue systems that use an encoder-decoder model. Therefore, we propose an utterance length estimator (ULE) and an index of the utterance length. The ULE is a neural network that learns the utterance length from dialogue training data. The index of the utterance length is a parameter that takes the user's personality into account and is calculated during the dialogue. Our dialogue system decides the length of the system's utterance using the ULE and the index of the utterance length, and generates output sequences using a neural encoder-decoder that controls the output length. Experimental results show that our system can decide an appropriate utterance length and leaves users more satisfied than the conventional method.