Image thinning is one of the fundamental pre-processing steps prior to shape representation in image analysis for printed and handwritten alphabet recognition applications. In this paper, an approach is introdu...
Image annotation plays a vital role in the effective organization and retrieval of large numbers of digital images. Multi-instance multi-label (MIML) learning can deal with complicated objects by solving the...
Color image demosaicking is key in developing low-cost digital cameras using a color filter array (CFA). Similarly, multispectral image demosaicking can be used to develop low-cost and portable multispectral cameras us...
In this work, we have designed a local descriptor based on the reassigned Stankovic time-frequency distribution. The Stankovic distribution is one of the improved extensions of the well-known Wigner-Ville distribution. ...
Haze poses challenges in many vision-related applications, so image dehazing has become popular among vision researchers. Available methods use various priors, deep learning models, or a combination of both to get ...
In this paper, the problem of estimating the depth of an object from its monocular image is addressed. An algorithm is developed that performs shape matching and, as a result, achieves accurate depth...
ISBN: 9781728190808 (print)
Image-to-image translation is an emerging method of computer vision dataset augmentation that transfers the style of real-life images onto synthetic ones, making them more realistic. In our work, we propose an incremental improvement over the adversarial learning generator architectures used by image-to-image translation models. First, we use a single network instead of two, creating a more memory-efficient model that allows end-to-end training at high resolutions. Second, inspired by recent work on semantic segmentation architectures, we enhance our model by employing a multi-scale encoding and stylization phase, allowing better control over contextual and spatial features. Given a synthetic image, our framework allows for its multimodal translation into the real domain. Our model shows promising results at narrowing the semantic gap between synthetic and real data.
ISBN: 9781509020331 (print)
This paper presents a bus route number recognition system using image processing and machine learning techniques. The prototype is designed to extract bus route numbers from scene images and synthesize audio of the bus number for people with low vision. The proposed system consists of two parts: text detection and digit recognition. Text detection based on MSERs is used to find and segment text areas in input images. A digit recognizer based on a Convolutional Neural Network is used to recognize digits in the detected text areas. Finally, the recognized digits are converted to audio, which can help visually impaired (low vision) people take the correct bus. Experimental results for text detection are reported in terms of precision, recall, and F-measure, and the system achieves 73.47% accuracy on digit segments in a self-collected Thai bus dataset.
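As a minimal sketch of how the text-detection metrics named in the abstract are conventionally computed, the following derives precision, recall, and F-measure from raw detection counts. The function name and the counts in the usage note are hypothetical illustrations, not figures or code from the paper, and the paper's exact matching criterion for a correct detection is not stated here.

```python
def detection_scores(true_positives, false_positives, false_negatives):
    """Return (precision, recall, f_measure) from raw detection counts.

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    detected = true_positives + false_positives   # regions the detector reported
    actual = true_positives + false_negatives     # regions present in the ground truth
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    denom = precision + recall
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure
```

For example, with made-up counts of 8 correct detections, 2 spurious detections, and 2 missed text regions, `detection_scores(8, 2, 2)` gives precision, recall, and F-measure that are all approximately 0.8.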
ISBN: 9781467389105 (print)
In this paper we propose a publicly available static hand pose database called OUHANDS and protocols for training and evaluating hand pose classification and hand detection methods. A comparison between the OUHANDS database and existing databases is given. Baseline results for both of the protocols are presented.
ISBN: 9789898533524 (print)
This paper presents the results of a program that was written to convert a Web page to adhere to a consistent layout. Images and adverts are removed from the consistent layout. A usability test was conducted and eye tracking data was gathered. The results of the experiment showed that users had different eye movements on Web pages depending on layout and design.