ISBN: (digital) 9789811022609
ISBN: (print) 9789811022593; 9789811022609
This book constitutes the refereed proceedings of the 11th Chinese Conference on Image and Graphics Technologies and Applications, IGTA 2016, held in Beijing, China, in July 2016. The 27 papers presented were carefully reviewed and selected from 69 submissions. They provide a forum for sharing progress in the areas of image processing technology; image analysis and understanding; computer vision and pattern recognition; big data mining, computer graphics and VR; and image technology applications.
ISBN: (print) 9789811030055; 9789811030048
Convolutional neural networks (CNNs) have achieved tremendous success in handwritten Chinese character recognition (HCCR). However, most CNN-based HCCR research focuses on complicated, deep CNN modules and rarely analyzes the whole feature extraction process, which has a crucial impact on the final recognition rate. This paper addresses two questions: (1) information loss is inevitable in the training stage of complex learning problems, but at which layer does it mainly occur? (2) Different layers affect a CNN differently, so where should multi-stage feature extraction be placed to influence the CNN most? We apply the proposed module to a typical CNN and analyze classification results on CASIA-HWDB1.1. The results show that (1) multi-stage feature extraction achieves better performance on HCCR than single-stage feature extraction; (2) multi-stage feature extraction should be designed at the convolution layer rather than the pooling layer; and (3) multi-stage feature extraction designed at shallow layers outperforms that designed at deeper layers. By analyzing the structure of multi-stage feature extraction, we propose an appropriate CNN approach to HCCR, which achieves a new state-of-the-art recognition accuracy of 91.89%.
ISBN: (print) 9789811030024; 9789811030017
Semantic segmentation is of great importance to various vision applications. Depth information plays an important role in the human visual system, helping people obtain meaningful segmentation results, but most existing segmentation methods do not take it into account. In this paper, we address the problem of semantic segmentation by incorporating depth information via a deep neural Markov Random Field. In our method, the color image and its corresponding depth map are first fed to a convolutional neural network. Then, a deconvolution approach is applied to the network output to obtain the pixelwise prediction, expressed as the probability of each label being assigned to each pixel. Finally, the dense prediction is used to design the unary term and the pairwise term, which are determined by pixel coordinates, color and depth. Experiments are conducted on several public datasets to illustrate the effectiveness of the proposed method. On the PASCAL VOC 2011 test dataset, experimental results show that our method produces accurate results when compared with the ground truth. On the PASCAL VOC 2012 and NYUDv2 datasets, the proposed method obtains competitive results.
ISBN: (print) 9789811030055; 9789811030048
In this paper, we propose to adapt the recurrent neural network (RNN) based language model to improve the performance of multi-accent Mandarin speech recognition. N-gram based language models can be easily applied to speech recognition systems, but they struggle to describe long-span information in a sentence and suffer from serious data sparsity. RNN based language models can overcome these two shortcomings, but decoding with them directly takes a long time. Taking these points into consideration, this paper proposes a method that combines the two types of language model (LM) and adapts the RNN based language model to rescore lattices for different accents of Mandarin speech. The architecture of the adapted RNN LM consists of accent-specific top layers and a shared hidden layer. The accent-specific top layers adapt to different accents, and the shared hidden layer stores history information and can be seen as a memory layer. Experiments on the RASC863 corpus show that the proposed method improves the performance of accented Mandarin speech recognition over the baseline system.
ISBN: (print) 9789811030024; 9789811030017
A new method for selecting landmarks on 2D shapes represented by centripetal Catmull-Rom splines is proposed in this paper. First, a mean shape is generated from the training set, and landmarks on the mean shape are extracted based on curvature and arc-length information. Then the corresponding landmarks on each shape are obtained by projecting the mean shape back onto each sample using the non-rigid registration method Coherent Point Drift. Experiments showed that the automatically generated landmarks are more accurate than manually annotated landmarks when used in segmentation.
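For readers unfamiliar with the shape representation named in this abstract, the following is a minimal sketch of evaluating one centripetal Catmull-Rom segment with the standard Barry-Goldman recursion. It is not the authors' code: the function name `centripetal_catmull_rom` and its parameters are illustrative, and the landmark selection and Coherent Point Drift steps are not shown.

```python
import math


def centripetal_catmull_rom(p0, p1, p2, p3, u, alpha=0.5):
    """Evaluate the spline segment between p1 and p2 at u in [0, 1].

    Points are coordinate tuples; consecutive points must be distinct,
    otherwise two knots coincide and the interpolation divides by zero.
    """
    def next_knot(t, a, b):
        # Knot spacing is distance**alpha; alpha = 0.5 is the centripetal variant.
        return t + math.dist(a, b) ** alpha

    def lerp(a, b, ta, tb, t):
        # Linear interpolation (or extrapolation) between a at ta and b at tb.
        w = (t - ta) / (tb - ta)
        return tuple((1 - w) * x + w * y for x, y in zip(a, b))

    t0 = 0.0
    t1 = next_knot(t0, p0, p1)
    t2 = next_knot(t1, p1, p2)
    t3 = next_knot(t2, p2, p3)
    t = t1 + u * (t2 - t1)

    # Barry-Goldman pyramid: three levels of linear interpolation.
    a1 = lerp(p0, p1, t0, t1, t)
    a2 = lerp(p1, p2, t1, t2, t)
    a3 = lerp(p2, p3, t2, t3, t)
    b1 = lerp(a1, a2, t0, t2, t)
    b2 = lerp(a2, a3, t1, t3, t)
    return lerp(b1, b2, t1, t2, t)
```

The segment interpolates its two inner control points, so sampling `u` from 0 to 1 over consecutive point quadruples traces a closed or open contour through every control point.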
ISBN: (print) 9789811030024; 9789811030017
Measuring crowd collectiveness has attracted a great deal of attention in recent years. We adopt the path integral descriptor idea to measure the collectiveness of a crowd system. A new path integral descriptor is proposed using an exponential generating function to avoid parameter setting. Several good properties of the proposed path integral descriptor are demonstrated in this paper. The proposed path integral descriptor of a set is regarded as the collectiveness measure of that set, which can be a moving system such as a human crowd or a sheep herd. The self-driven particle (SDP) model and the crowd motion database are used to test the ability of the proposed method to measure collectiveness.
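The abstract does not give the descriptor's formula. As a rough sketch only, one plausible reading of a generating function of exponential type is to weight paths of length l by 1/l!, so that the path integral over a weighted adjacency matrix W becomes a truncated exp(W) - I with no tunable decay parameter. That form, the function names, and the truncation length `max_len` are all assumptions made for illustration, not the paper's definition.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def exponential_path_integral(w, max_len=20):
    """Sum of W**l / l! for l = 1..max_len (assumed form: truncated exp(W) - I)."""
    n = len(w)
    term = [row[:] for row in w]    # W**1 / 1!
    total = [row[:] for row in w]
    for length in range(2, max_len + 1):
        # W**l / l! is obtained from W**(l-1) / (l-1)! by one product and one division.
        term = [[v / length for v in row] for row in mat_mul(term, w)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total


def collectiveness(w, max_len=20):
    """Return (per-individual scores, crowd score) from row sums of the path integral."""
    z = exponential_path_integral(w, max_len)
    individual = [sum(row) for row in z]
    return individual, sum(individual) / len(individual)
```

Under this assumed form, two individuals connected with weight 1 each receive a score of e - 1, and the crowd score is the mean of the individual scores.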
ISBN: (print) 9789811030024; 9789811030017
This paper presents a fused feature using dual cameras for face spoofing detection. The feature takes full advantage of input image pairs in terms of texture and depth. It consists of two parts: a 2D component and a 3D component. For the former, we propose an algorithm based on image similarity that combines every pair of input images into one gray-level image, from which the 2D feature is extracted. For the latter, we describe the point cloud obtained by stereo reconstruction algorithms using the point feature histograms (PFH) method. The concatenation of the 2D and 3D features is used to represent the input image pair. Experiments on a self-collected dataset demonstrate the competitive performance and potential of the proposed feature.
ISBN: (print) 9781538604915
Surveillance is essential for the safety of power substations. Detecting whether perambulatory workers are wearing safety helmets is a key component of an intelligent surveillance system for a power substation. In this paper, a novel and practical safety helmet detection framework based on computer vision, machine learning and image processing is proposed. To detect moving objects in the power substation, the ViBe background modelling algorithm is employed. Then, based on the result of moving object segmentation, the real-time human classification framework C4 is applied to locate pedestrians in the power substation accurately and quickly. Finally, based on the result of pedestrian detection, safety helmet wearing detection is implemented using head location, color space transformation and color feature discrimination. Extensive experimental results in power substations illustrate the efficiency and effectiveness of the proposed framework.
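The final stage described above (color space transformation followed by color feature discrimination) can be illustrated with a small sketch. It assumes the head region has already been located and is given as a list of RGB pixels. The HSV color space, the yellow hue range, and every threshold are hypothetical choices for illustration, since the abstract does not state which color space or values the authors use.

```python
import colorsys


def helmet_color_ratio(pixels, hue_range=(0.10, 0.20), min_sat=0.5, min_val=0.4):
    """Fraction of RGB pixels (0-255 tuples) whose HSV values match the helmet color.

    The default hue range roughly covers yellow; it is an assumed example value.
    """
    if not pixels:
        return 0.0
    hits = 0
    for r, g, b in pixels:
        # colorsys expects channels in [0, 1] and returns h, s, v in [0, 1].
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        if hue_range[0] <= h <= hue_range[1] and s >= min_sat and v >= min_val:
            hits += 1
    return hits / len(pixels)


def wears_helmet(head_pixels, min_ratio=0.3):
    """Decide helmet presence from the share of helmet-colored pixels (assumed rule)."""
    return helmet_color_ratio(head_pixels) >= min_ratio
```

Working in a hue-based space rather than raw RGB keeps the decision less sensitive to brightness changes, which is a common reason for adding a color space transformation before thresholding.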
ISBN: (print) 9789811030055; 9789811030048
License plate detection is a crucial part of license plate recognition systems and is often considered a solved problem. However, there are still plenty of complex scenes where current methods fail. To improve performance in these scenes, we propose a novel character-based method to detect multiple license plates in complex images. First, a preprocessing step is performed. Then we use a modified maximally stable extremal region (MSER) based detector called MSER-+ to detect possible character regions. Some of the regions are removed according to their geographical information. Hierarchical morphology helps to connect candidate MSERs of various sizes. The regions satisfying certain geographical limits are fed into a convolutional neural network (CNN) model for further verification. Extensive experimental results validate that our method works well in a large variety of complex scenes.
ISBN: (print) 9783319336176
The proceedings contain 40 papers. The special focus in this conference is on Feature Extraction, Computer Vision and Pattern Recognition. The topics include: on the benefit of state separation for tracking in image space with an interacting multiple model filter; feature asymmetry of the conformal monogenic signal; edge detection based on Riesz transform; otolith recognition system using a normal angles contour; a hybrid combination of multiple SVM classifiers for automatic recognition of the damages and symptoms on plant leaves; leaf classification using convexity measure of polygons; privacy preserving dynamic room layout mapping; defect detection on patterned fabrics using entropy cues; curve extraction by geodesics fusion; a chaotic cryptosystem for color image with dynamic look-up table; nonlinear estimation of chromophore concentrations and shading from hyperspectral images; a color image database for haze model and dehazing methods evaluation; collaborative unmixing hyperspectral imagery via nonnegative matrix factorization; a new method for Arabic text detection in natural scene image based on the color homogeneity; measuring spectral reflectance and 3D shape using multi-primary image projector; computer vision color constancy from maximal projections mean assumption; demosaicking method for multispectral images based on spatial gradient and inter-channel correlation; single image super-resolution using sparse representation on a K-NN dictionary; super-resolved enhancement of a single image and its application in cardiac MRI; speaker classification via supervised hierarchical clustering using ICA mixture model; and speaker discrimination using several classifiers and a relativistic speaker characterization.