Search Results - Inner Mongolia University Library

International Workshop on Frontiers in Handwriting Recognition

Authors: Ali Mirza, Momina Moetesum, Imran Siddiqi, Chawki Djeddi. Affiliations: Center of Computer Vision and Pattern Recognition, Bahria University, Islamabad, Pakistan; LAMIS Laboratory, Larbi Tebessi University, Tebessa, Algeria

ISBN: (print) 9781509009824

Prediction of gender and other demographic attributes of individuals from handwriting samples offers an interesting basic, as well as applied, research problem. The correlation between gender and the visual appearance of handwriting has been validated by a number of studies, and the present study is based on the same idea. We exploit textural measurements as the discriminating attribute between male and female writings. The textural information in a writing is captured by applying a bank of Gabor filters to the image of handwriting. The mean and standard deviation values of the filter responses are collected in a matrix, and the Fourier transform of the matrix is used as a feature. Classification is carried out using a feed-forward neural network. The proposed technique, evaluated on a subset of the QUWI database, realized promising results under different experimental settings.

Keywords: Writing, Databases, Feature extraction, Visualization, Training, Correlation, Standards
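
The feature pipeline described in the abstract above (Gabor filter bank, then a matrix of response means and standard deviations, then a Fourier transform of that matrix) can be sketched roughly as follows. This is only an illustration under stated assumptions: the kernel size, wavelengths, number of orientations, the sigma rule, the matrix layout, and the use of a 2-D DFT magnitude are all hypothetical choices, not values taken from the paper.

```python
import cmath
import math


def gabor_kernel(size, theta, wavelength, sigma):
    """Real (cosine) Gabor kernel of odd side length `size`."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + yr * yr) / (2 * sigma * sigma))
            row.append(envelope * math.cos(2 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel


def filter_valid(image, kernel):
    """Correlate image with kernel at every position the kernel fully covers."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    return [sum(kernel[a][b] * image[i + a][j + b]
                for a in range(k) for b in range(k))
            for i in range(h - k + 1) for j in range(w - k + 1)]


def mean_std(values):
    """Mean and population standard deviation of a list of numbers."""
    m = sum(values) / len(values)
    return m, math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def dft2_magnitudes(matrix):
    """Magnitudes of the 2-D discrete Fourier transform, flattened row by row."""
    rows, cols = len(matrix), len(matrix[0])
    return [abs(sum(matrix[r][c]
                    * cmath.exp(-2j * math.pi * (u * r / rows + v * c / cols))
                    for r in range(rows) for c in range(cols)))
            for u in range(rows) for v in range(cols)]


def texture_features(image, n_orient=4, wavelengths=(4.0, 8.0)):
    """Assumed layout: one matrix row per wavelength, (mean, std) per orientation."""
    stats = []
    for lam in wavelengths:
        row = []
        for k in range(n_orient):
            kern = gabor_kernel(7, math.pi * k / n_orient, lam, sigma=0.5 * lam)
            row.extend(mean_std(filter_valid(image, kern)))
        stats.append(row)
    return dft2_magnitudes(stats)
```

With the defaults above, a grayscale image given as a list of rows yields a vector of 2 x 8 = 16 values, which would then be passed to a classifier such as the feed-forward network the abstract mentions.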

Communication via eye blinks and eyebrow raises: Video-based human-computer interfaces

Universal Access in the Information Society, 2003, vol. 2, no. 4, pp. 359-373

Authors: Grauman, K.; Betke, M.; Lombardi, J.; Gips, J.; Bradski, G.R. Affiliations: Vision Interface Group, AI Laboratory, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, United States; Computer Science Department, Boston University, 111 Cummington St, Boston, MA 02215, United States; EagleEyes, Computer Science Department, Boston College, Fulton Hall, Chestnut Hill, MA 02467, United States; Vision, Graphics and Pattern Recognition, Microcomputer Research Laboratory, Intel Corporation, SC12-303, 2200 Mission College Blvd, Santa Clara, CA 95054-1537, United States

Two video-based human-computer interaction tools are introduced that can activate a binary switch and issue a selection command. "BlinkLink," as the first tool is called, automatically detects a user's eye blinks and accurately measures their durations. The system is intended to provide an alternate input modality to allow people with severe disabilities to access a computer. Voluntary long blinks trigger mouse clicks, while involuntary short blinks are ignored. The system enables communication using "blink patterns:" sequences of long and short blinks which are interpreted as semiotic messages. The second tool, "EyebrowClicker," automatically detects when a user raises his or her eyebrows and then triggers a mouse click. Both systems can initialize themselves, track the eyes at frame rate, and recover in the event of errors. No special lighting is required. The systems have been tested with interactive games and a spelling program. Results demonstrate overall detection accuracy of 95.6% for BlinkLink and 89.0% for EyebrowClicker. © Springer-Verlag 2003.

Keywords: Computer vision
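
The blink logic described in the BlinkLink abstract above (measure each blink's duration, treat voluntary long blinks as clicks, ignore involuntary short ones) might look like the sketch below. The per-frame closed/open flags, the function names, and the 0.5 second threshold are hypothetical; the abstract does not state the threshold the system uses.

```python
def blink_durations(eye_closed, fps):
    """Durations (seconds) of each run of consecutive closed-eye frames."""
    durations, run = [], 0
    for closed in eye_closed:
        if closed:
            run += 1
        elif run:
            durations.append(run / fps)
            run = 0
    if run:  # the sequence ended while the eye was still closed
        durations.append(run / fps)
    return durations


def classify_blinks(durations, long_threshold_s=0.5):
    """Label each blink 'long' or 'short'; the threshold is an assumed value."""
    return ["long" if d >= long_threshold_s else "short" for d in durations]


def count_clicks(labels):
    """Only long (voluntary) blinks trigger a click; short blinks are ignored."""
    return sum(1 for label in labels if label == "long")
```

The same list of "long"/"short" labels could also serve as the input to a blink-pattern decoder, since the abstract describes messages encoded as sequences of long and short blinks.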

Morpheme-based feature-rich language models using Deep Neural Networks for LVCSR of Egyptian Arabic

2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013

Authors: El-Desoky Mousa, Amr; Kuo, Hong-Kwang Jeff; Mangu, Lidia; Soltau, Hagen. Affiliations: Human Language Technology and Pattern Recognition, Computer Science Department, RWTH Aachen University, 52056 Aachen, Germany; IBM T. J. Watson Research Center, Yorktown Heights, NY 10598, United States

ISBN: (print) 9781479903566

Egyptian Arabic (EA) is a colloquial version of Arabic. It is a low-resource, morphologically rich language that causes problems in Large Vocabulary Continuous Speech Recognition (LVCSR). Building language models (LMs) at the morpheme level is considered a better choice to achieve higher lexical coverage and better LM probabilities. Another approach is to utilize information from additional features such as morphological tags. On the other hand, LMs based on Neural Networks (NNs) with a single hidden layer have shown superiority over conventional n-gram LMs. Recently, Deep Neural Networks (DNNs) with multiple hidden layers have achieved better performance in various tasks. In this paper, we explore the use of feature-rich DNN-LMs, where the inputs to the network are a mixture of words and morphemes along with their features. Significant Word Error Rate (WER) reductions are achieved compared to traditional word-based LMs. © 2013 IEEE.

Keywords: Deep neural networks

RBF-Softmax: Learning Deep Representative Prototypes with Radial Basis Function Softmax

16th European Conference on Computer Vision, ECCV 2020

Authors: Zhang, Xiao; Zhao, Rui; Qiao, Yu; Li, Hongsheng. Affiliations: CUHK-SenseTime Joint Lab, The Chinese University of Hong Kong, Hong Kong; SenseTime Research, Hong Kong; ShenZhen Key Lab of Computer Vision and Pattern Recognition, SIAT-SenseTime Joint Lab, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China

ISBN: (digital) 9783030585747

ISBN: (print) 9783030585730

Deep neural networks have achieved remarkable successes in learning feature representations for visual classification. However, deep features learned by the softmax cross-entropy loss generally show excessive intra-class variations. We argue that, because the traditional softmax losses aim to optimize only the relative differences between intra-class and inter-class distances (logits), they cannot obtain representative class prototypes (class weights/centers) to regularize intra-class distances, even when the training has converged. Previous efforts mitigate this problem by introducing auxiliary regularization losses, but these modified losses mainly focus on optimizing intra-class compactness while neglecting to keep reasonable relations between different class prototypes. This leads to weak models and eventually limits their performance. To address this problem, this paper introduces novel Radial Basis Function (RBF) distances to replace the commonly used inner products in the softmax loss function, so that it can adaptively assign losses to regularize the intra-class and inter-class distances by reshaping the relative differences, thus creating more representative class prototypes to improve optimization. The proposed RBF-Softmax loss function not only effectively reduces intra-class distances, stabilizes the training behavior, and preserves ideal relations between prototypes, but also significantly improves the testing performance. Experiments on visual recognition benchmarks including MNIST, CIFAR-10/100, and ImageNet demonstrate that the proposed RBF-Softmax achieves better results than cross-entropy and other state-of-the-art classification losses. The code is at https://***/2han9x1a0release/RBF-Softmax. © 2020, Springer Nature Switzerland AG.

Keywords: Deep neural networks
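
The central idea in the RBF-Softmax abstract above, replacing inner-product logits with an RBF function of the distance between a feature and each class prototype, can be illustrated with a small sketch. The abstract does not give the formula, so the specific form used here (logit = scale * exp(-squared Euclidean distance / gamma)) and the `gamma` and `scale` values are assumptions for illustration only.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rbf_logits(feature, prototypes, gamma=1.0, scale=2.0):
    """Assumed RBF logits: scale * exp(-d / gamma), so each lies in [0, scale]."""
    return [scale * math.exp(-squared_distance(feature, p) / gamma)
            for p in prototypes]


def rbf_softmax_loss(feature, prototypes, label, gamma=1.0, scale=2.0):
    """Cross-entropy over the RBF logits for one sample (stable log-sum-exp)."""
    logits = rbf_logits(feature, prototypes, gamma, scale)
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_norm - logits[label]
```

Under this assumed form the logits are bounded, unlike inner products, so a sample that already sits on its prototype contributes only a small, fixed loss instead of being pushed ever further from the other classes.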

An efficient algorithm for detection of road-like structures in satellite images

International Conference on Pattern Recognition

Authors: A. Mukherjee, S.K. Parui, D. Chaudhuri, B.B. Chaudhuri, R. Krishnan. Affiliations: Indian Statistical Institute, Computer Vision and Pattern Recognition Unit, Calcutta, India; Advanced Data Processing Research Institute, Secunderabad, India

Road networks are important features of satellite imagery. The main contribution of the present road detection method consists of an effective enhancement technique and an efficient segmentation technique that removes non-road pixels from the image step by step, where the parameters involved in each step are determined by the sensor characteristics (such as spatial resolution and spectral range) of the satellite. Also, the segmentation process depends not only on the road contrast but also on the road length. Thus, a low-contrast but long road segment does not get removed. We have tested the algorithm on a number of images from IRS and SPOT satellites, and the results are satisfactory.

Keywords: Satellites, Roads, Image segmentation, Pixel, Sensor phenomena and characterization, Spatial resolution, Joining processes, Intelligent networks, Computer vision, Pattern recognition

Multiple Classifier Systems

Series: Lecture Notes in Computer Science

Year: 1000

Authors: Nikunj C. Oza, Robi Polikar, Josef Kittler, Fabio Roli

Handbook of Face Recognition

Year: 1000

Authors: Stan Z. Li, Anil K. Jain

An Hybrid Attention-Based System for the Prediction of Facial Attributes

4th International Workshop on Brain-Inspired Computing, BrainComp 2019

Authors: Khellat-Kihel, Souad; Sun, Zhenan; Tistarelli, Massimo. Affiliations: Computer Vision Laboratory, University of Sassari, Viale Italia 39, Sassari 07100, Italy; Center for Research on Intelligent Perception and Computing, National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Room 1605, Intelligence Building, 95 Zhongguancun East Road, Beijing 100190, China; Computer Vision Laboratory, Department of Biomedical Sciences and Information Technology, University of Sassari, Viale S. Pietro 43/b, Sassari 07100, Italy

ISBN: (print) 9783030824266

Recent research on face analysis has demonstrated the richness of information embedded in feature vectors extracted from a deep convolutional neural network. Even though deep learning has achieved very high performance on several challenging visual tasks, such as determining identity, age, gender and race, it still lacks a well-grounded theory that would allow the processes taking place inside the network layers to be properly understood. Therefore, most of the underlying processes are unknown and not easy to control. On the other hand, the human visual system follows a well-understood process in analyzing a scene or an object, such as a face. The eye gaze is repeatedly directed, through purposively planned saccadic movements, towards salient regions to capture several details. In this paper we propose to capitalize on the knowledge of the saccadic human visual processes to design a system that predicts facial attributes by embedding a biologically inspired network architecture, the HMAX. The architecture is tailored to predict attributes with different textural information and conveying different semantic meaning, such as attributes related and unrelated to the subject's identity. Salient points on the face are extracted from the outputs of the S2 layer of the HMAX architecture and fed to a local texture characterization module based on LBP (Local Binary Pattern). The resulting feature vector is used to perform a binary classification on a set of pre-defined visual attributes. The devised system distills a very informative, yet robust, representation of the imaged faces, achieving high performance with a much simpler architecture than a deep convolutional neural network. Several experiments performed on publicly available, challenging, large datasets demonstrate the validity of the proposed approach. © 2021, The Author(s).

Keywords: Network architecture
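
The local texture step mentioned in the abstract above, LBP computed around salient points, can be illustrated with the basic 8-neighbor Local Binary Pattern. This is a generic sketch, not the authors' module: the neighbor ordering, the `>=` comparison, the patch radius, and the function names are assumptions, and the salient point is simply passed in rather than taken from an HMAX S2 layer.

```python
# Clockwise 8-neighborhood starting at the top-left pixel (ordering is a convention).
NEIGHBOR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                    (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """8-bit LBP code: bit i is set when neighbor i is >= the center pixel."""
    center = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOR_OFFSETS):
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image, row, col, radius=2):
    """256-bin histogram of LBP codes in a square patch around (row, col).

    The point must lie at least radius + 1 pixels away from the image border.
    """
    hist = [0] * 256
    for r in range(row - radius, row + radius + 1):
        for c in range(col - radius, col + radius + 1):
            hist[lbp_code(image, r, c)] += 1
    return hist
```

Concatenating such histograms over all salient points would give one possible feature vector for the binary attribute classifiers the abstract describes.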

Automatic Detection of Handwritten Texts from Video Frames of Lectures

International Workshop on Frontiers in Handwriting Recognition

Authors: Purnendu Banerjee, Ujjwal Bhattacharya, Bidyut B. Chaudhuri. Affiliations: Society for Natural Language Technology Research, Kolkata, India; Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, India

Automatic recognition of handwritten texts in video lectures has important applications. In video lectures, the presenter usually writes on a white or colored board. The video camera often captures the writing board along with certain other objects, possibly including the presenter. Recognition of handwritten texts from such a video frame requires prior detection of the text regions in the frame. In this article, we present our recent study of text localization in such video lecture frames. Here, we use Scale Invariant Feature Transform (SIFT) descriptors densely over the entire region of the frame. The descriptors are located on a regular grid with 5-pixel spacing, following the usual practice, and each uses a uniform patch of 60 × 60 pixels as its support, chosen on the basis of an empirical study. The SIFT descriptor at each location (grid point) is fed as a 128-dimensional input feature vector to a Multilayer Perceptron (MLP) network, which gives a response for each grid point as either text or non-text. Depending on an aggregate response at each pixel, we localize text regions in the input video frame. Next, we employ K-means clustering to detect the text components present in the localized region of the video frame. Finally, two simple rules are applied to decide whether certain detected text components are noise. We obtained encouraging simulation results with this approach on a variety of video lecture frames.

Keywords: Handwriting recognition, Text recognition, Training, Databases, Image color analysis, Cameras, Noise
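
The dense sampling scheme in the abstract above (grid points every 5 pixels, each supported by a 60 × 60 patch, with per-point text or non-text decisions aggregated per pixel) can be sketched as follows. The abstract does not say how border points are handled or how responses are aggregated, so keeping only points whose patch fits inside the frame and averaging the votes of all covering patches are both assumptions; the function names are hypothetical, and the SIFT and MLP stages are replaced by a ready-made list of decisions.

```python
def dense_grid(width, height, stride=5, patch=60):
    """Centers (x, y) of patches on a regular grid that fit fully in the frame."""
    half = patch // 2
    return [(x, y)
            for y in range(half, height - half + 1, stride)
            for x in range(half, width - half + 1, stride)]


def text_score_map(width, height, points, is_text, patch=60):
    """Per-pixel fraction of covering patches labeled text (assumed aggregation)."""
    half = patch // 2
    votes = [[0] * width for _ in range(height)]
    counts = [[0] * width for _ in range(height)]
    for (x, y), flag in zip(points, is_text):
        for yy in range(y - half, y + half):
            for xx in range(x - half, x + half):
                counts[yy][xx] += 1
                if flag:
                    votes[yy][xx] += 1
    return [[votes[r][c] / counts[r][c] if counts[r][c] else 0.0
             for c in range(width)]
            for r in range(height)]
```

Thresholding the resulting score map would give candidate text regions, on which the clustering and noise-removal steps from the abstract could then operate.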

A High Semantic Representation for Abnormal Events Detection in Crowded Scenes

2016 IEEE 7th International Conference on Software Engineering and Service Science

Authors: Ye Tao, Ye Jin, Peng Liu. Affiliation: Pattern Recognition and Intelligent System Research Center, School of Computer Science and Technology, Harbin Institute of Technology

Many crowd abnormal motion detection methods in video surveillance have been proposed in recent ***, most of them are based on low semantic features, such as gray value, velocity and ***, low semantic features contain weak discriminative information of the *** addition, these methods often ignore important information in time and space *** this work, a high semantic representation is *** feature analysis (SFA) is adopted to provide high semantic ***, a random walk model, which takes into account the spatio-temporal information, is used to detect the abnormal motions in *** conduct extensive experiments on two datasets to demonstrate the effectiveness of proposed *** results suggest that our method outperforms the state-of-the-art methods.

Keywords: Video surveillance, Crowd analysis, Abnormal events, Slow feature analysis
