Real-time applications of handwriting analysis have increased drastically in forensics and information security because of the accurate cues that handwriting provides. One such application is human age estimation from handwriting for the purpose of immigrant checks. In this paper, we propose a new method for age estimation through handwriting analysis based on Hu invariant moments and disconnectedness features. To make the method robust to both ruled and unruled documents, we explore intersection point detection in the Canny edge image of each input document, which yields text components. For each pair of text components, we use Hu invariant moments to extract disconnectedness features, which measure multi-shape components through distance, shape and mutual position analysis of the components. Furthermore, iterative k-means clustering is proposed for classifying the different age groups. Experimental results on our dataset and on the standard IAM and KHATT datasets show that the proposed method is effective and outperforms state-of-the-art methods.
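As a rough illustration of the Hu invariant moments mentioned in this abstract, the sketch below computes only the first two Hu invariants of a single binary component given as a list of foreground pixel coordinates. It is not the paper's disconnectedness feature, which is defined on component pairs and is not specified in the abstract; the function name `hu_first_two` and the toy L-shaped component are assumptions made for illustration.

```python
def hu_first_two(pixels):
    """Return Hu's first two invariant moments for a list of (x, y) foreground pixels."""
    n = len(pixels)  # zeroth moment m00 of a binary shape
    cx = sum(x for x, _ in pixels) / n  # centroid x
    cy = sum(y for _, y in pixels) / n  # centroid y

    def eta(p, q):
        # Central moment mu_pq, normalised by m00^(1 + (p + q) / 2) for scale invariance.
        mu = sum((x - cx) ** p * (y - cy) ** q for x, y in pixels)
        return mu / n ** (1 + (p + q) / 2)

    eta20, eta02, eta11 = eta(2, 0), eta(0, 2), eta(1, 1)
    phi1 = eta20 + eta02
    phi2 = (eta20 - eta02) ** 2 + 4 * eta11 ** 2
    return phi1, phi2


# A hypothetical L-shaped component, the same shape translated, and rotated by 90 degrees.
shape = [(0, 0), (0, 1), (0, 2), (0, 3), (1, 0), (2, 0)]
shifted = [(x + 7, y - 3) for x, y in shape]
rotated = [(-y, x) for x, y in shape]

for name, component in [("shape", shape), ("shifted", shifted), ("rotated", rotated)]:
    print(name, hu_first_two(component))
```

The three printed pairs agree up to floating-point error, which is the translation and rotation invariance that makes such moments attractive for describing handwritten components.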
Detecting text located on the torsos of marathon runners and sports players in video is a challenging issue due to poor quality and adverse effects caused by flexible/colorful clothing, and different structures of hum...
Multi-view clustering has attracted growing attention recently because many real-world data comprise different representations or views. Recent multi-view clustering works mainly exploit instance consistency to obtain shared representations across views, and then apply a single-view clustering method to partition the data. However, these methods often ignore the inconsistency of instance associations within the views, which may enlarge the intra-class diversity among the views and therefore degrade clustering performance. To address this issue, this paper proposes an efficient mutual contrastive teacher-student learning (MC-TSL) model to enhance multi-view clustering, which is the first attempt to study inconsistency distillation for consistency learning. First, the proposed MC-TSL approach uses a view-specific encoder with two heads, an instance encoding head and a semantic distillation head, to capture consistent and discriminative feature representations. Specifically, the former head uses a cross-view contrastive learning method to obtain a redundancy-free consistent representation at the instance level, while the latter head uses a mutual teacher-student learning module to capture intra-view information at the semantic level. By training these two heads in an end-to-end manner, the discriminative multi-view embeddings are efficiently obtained and refined by minimizing the weighted sum of the reconstruction loss, contrastive loss and contrast distillation loss. Extensive experiments verify the superiority of the proposed MC-TSL framework and show its competitive clustering performance.
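The abstract does not give the loss formulas, so the following is only a minimal sketch of two ideas it names: a cross-view contrastive term and a weighted sum of three loss terms. The contrastive term here is a generic InfoNCE-style loss, swapped in as a stand-in and not taken from the paper; the function names, the temperature of 0.5, the default weights and the toy embeddings are all assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def cross_view_contrastive(view_a, view_b, temperature=0.5):
    """InfoNCE-style loss: instance i in view_a should match instance i in view_b."""
    total = 0.0
    for i, za in enumerate(view_a):
        logits = [cosine(za, zb) / temperature for zb in view_b]
        log_denominator = math.log(sum(math.exp(value) for value in logits))
        total += log_denominator - logits[i]  # negative log-softmax of the positive pair
    return total / len(view_a)


def total_loss(reconstruction, contrastive, distillation, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three loss terms; the weights are hypothetical hyperparameters."""
    w_rec, w_con, w_dis = weights
    return w_rec * reconstruction + w_con * contrastive + w_dis * distillation


# Toy embeddings of three instances seen from two views (hypothetical values).
view_a = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
view_b_aligned = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]
view_b_shuffled = [view_b_aligned[1], view_b_aligned[2], view_b_aligned[0]]

print(cross_view_contrastive(view_a, view_b_aligned))
print(cross_view_contrastive(view_a, view_b_shuffled))
```

The aligned pairing yields the smaller loss, which is the behaviour a cross-view consistency objective is meant to reward.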
Recent studies have witnessed the effectiveness of 3D convolutions on segmenting volumetric medical images. Compared with the 2D counterparts, 3D convolutions can capture the spatial context in three dimensions. Never...
Identifying crime is challenging for forensic investigation teams when the crimes involve people of different nationalities. This paper proposes a new method for ethnicity (nationality) identification based on Cloud of Line Distribution (COLD) features of handwriting components. The proposed method first uses the tangent angle of the contour pixels in each row and the mean intensity value of each row to segment text lines. For the segmented text lines, we use the tangent angle and the direction of the baselines to remove rule lines from the image. We use polygonal approximation to find dominant points on the contours of edge components. The proposed method then connects every dominant point to its nearest dominant points, which results in line segments between dominant point pairs. For each line segment, the method estimates the angle and length, which gives a point in the polar domain. Over all the line segments, the method generates dense points in the polar domain, which results in the COLD distribution. As character component shapes change with nationality, the shape of the distribution changes. This observation is captured using the distance from the pixels of the distribution to the principal axis of the distribution. The features are then passed to an SVM classifier to identify nationality. Experiments conducted on a complex dataset show that the proposed method is effective and outperforms the existing method.
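The step that maps line segments between dominant points to points in the polar domain can be sketched as follows. This is an illustrative reading of the abstract, not the authors' implementation: it assumes each dominant point is joined to its single nearest neighbour, and the function name `cold_points` and the three sample points are hypothetical.

```python
import math


def cold_points(dominant_points):
    """Join each dominant point to its nearest neighbour; return (angle, length) pairs."""
    polar_points = []
    for i, (x1, y1) in enumerate(dominant_points):
        others = [p for j, p in enumerate(dominant_points) if j != i]
        # Nearest other dominant point by Euclidean distance.
        x2, y2 = min(others, key=lambda p: math.hypot(p[0] - x1, p[1] - y1))
        length = math.hypot(x2 - x1, y2 - y1)
        angle = math.atan2(y2 - y1, x2 - x1)  # radians, in (-pi, pi]
        polar_points.append((angle, length))
    return polar_points


# Hypothetical dominant points from a polygonal approximation of one contour.
corners = [(0, 0), (3, 4), (10, 0)]
polar = cold_points(corners)
print(polar)
```

Accumulating such (angle, length) points over all components of a document gives the dense distribution whose shape the abstract describes.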
Radiation therapy (RT) is widely employed in the clinic for the treatment of head and neck (HaN) cancers. An essential step of RT planning is the accurate segmentation of various organs-at-risks (OARs) in HaN CT image...
Methods developed for normal 2D text detection do not work well for text rendered with decorative or 3D effects. This paper proposes a new method for classifying 2D and 3D natural scene text images so that an appropriate recognition method can be chosen based on the classification result for better performance. The proposed method explores local gradient differences to obtain candidate pixels, which represent a stroke. To study the spatial distribution of candidate pixels, we propose a measure, called COLD, which is denser for pixels toward the center of strokes and scattered for non-stroke pixels. This observation leads us to introduce mass features for extracting the regular spatial pattern of COLD, which indicates a 2D text image. The extracted features are fed into a Neural Network (NN) for classification. The proposed method is tested on (i) a new dataset introduced in this work, (ii) a second dataset assembled from standard natural scene datasets, and (iii) non-text image datasets, which contain objects rather than text. Experimental results on images with and without text show that the proposed method is independent of text. The proposed approach significantly improves text detection and recognition performance after classification.
The primary challenge in tracing participants in sports and marathon videos or images is to detect and localize the jersey/bib number, which may be present in different regions of their outfit and is captured in cluttered environments. In this work, we propose a new framework based on detecting human body parts so that both the jersey/bib number and text are localized reliably. To achieve this, the proposed method first detects and localizes the human in a given image using a Single Shot Multibox Detector (SSD). In the next step, the human body parts that generally contain a bib number or text region, namely the torso, left thigh and right thigh, are automatically extracted. The detected parts are processed individually to detect the jersey/bib number or text using a deep CNN with a 2-channel architecture and a novel adaptive weighting loss function. Finally, the detected text is cropped out and fed to a CNN-RNN based deep model, abbreviated as CRNN, for recognizing the jersey/bib number or text. Extensive experiments are carried out on four different datasets, including both benchmark datasets and a new dataset. Comparison with state-of-the-art methods indicates that the proposed method achieves improved performance on all four datasets.
Authors: Duo Chen, Jun Cheng, Dacheng Tao
College of Communication Engineering, Chongqing University, Chongqing 400044, China. He is also with the Shenzhen Key Laboratory of Computer Vision and Pattern Recognition, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences.
Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China. He is also with the Chinese University of Hong Kong and Guangdong Provincial Key Laboratory of Robotics and Intelligent System.
Center for Quantum Computation and Intelligent System, Faculty of Engineering and Information Technology, University of Technology Sydney, New South Wales 2007, Australia.
ISBN (digital): 9781467317368
ISBN (print): 9781467317375
Human gender information is very important for facilitating human-robot interaction. Motivated by the success of manifold learning in visual recognition, we present a novel clustering-based discriminative locality alignment (CDLA) algorithm that discovers the low-dimensional intrinsic submanifold embedded in the high-dimensional ambient space in order to improve face gender recognition performance. In particular, CDLA exploits the global geometry through k-means clustering, extracts the discriminative information through margin maximization, and explores the local geometry through intra-cluster sample concentration. These three properties uniquely characterize CDLA for face gender recognition. Experimental results on the FERET data sets suggest that the proposed method is superior to several representative methods in terms of recognition speed and accuracy.
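Of the three ingredients of CDLA named above, only the k-means clustering step is standard enough to sketch from the abstract alone. The example below is plain Lloyd's k-means on 2-D points with fixed starting centroids; it does not cover margin maximization or intra-cluster sample concentration, and the function name, iteration count and sample points are assumptions.

```python
import math


def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's k-means on 2-D points, starting from the given centroids."""
    for _ in range(iterations):
        # Assignment step: put every point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members (unchanged if empty).
        centroids = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else centroids[k]
            for k, members in enumerate(clusters)
        ]
    return centroids, clusters


# Two well-separated hypothetical groups of samples.
samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (9.0, 9.0), (9.0, 10.0), (10.0, 9.0)]
centres, groups = kmeans(samples, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centres)
print(groups)
```

In a CDLA-style pipeline the resulting cluster memberships would supply the global structure on which the later discriminative and local steps operate; how exactly they are combined is not stated in the abstract.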