Search Results - Inner Mongolia University Library

27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Gokhale, Vinayak; Jin, Jonghoon; Dundar, Aysegul; Martini, Berin; Culurciello, Eugenio

Affiliations: Purdue Univ, W Lafayette, IN 47907, USA; Purdue Univ, Weldon Sch Biomed Engn, W Lafayette, IN 47907, USA

ISBN: (print) 9781479943098

Deep networks are state-of-the-art models used for understanding the content of images, videos, audio and raw input data. Current computing systems are not able to run deep network models in real time with low power consumption. In this paper we present nn-X: a scalable, low-power coprocessor for enabling real-time execution of deep neural networks. nn-X is implemented on programmable logic devices and comprises an array of configurable processing elements called collections. These collections perform the most common operations in deep networks: convolution, subsampling and non-linear functions. The nn-X system includes 4 high-speed direct memory access interfaces to DDR3 memory and two ARM Cortex-A9 processors. Each port is capable of a sustained throughput of 950 MB/s in full duplex. nn-X is able to achieve a peak performance of 227 G-ops/s and a measured performance in deep learning applications of up to 200 G-ops/s while consuming less than 4 watts of power. This translates to a performance-per-power improvement of 10 to 100 times over conventional mobile and desktop processors.
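The efficiency figures quoted in this abstract can be checked with simple arithmetic. The sketch below is illustrative only: it uses the 227 G-ops/s peak, the 200 G-ops/s measured figure and the 4 W bound from the abstract, and it assumes that the four DMA interfaces are the "ports" that each sustain 950 MB/s per direction. Because the abstract gives power only as an upper bound, the efficiency values are lower bounds. The function names are hypothetical.

```python
# Back-of-the-envelope figures derived from the numbers quoted in the abstract.
# Assumption: the 4 DMA interfaces are the "ports" rated at 950 MB/s each.

PEAK_GOPS = 227.0        # peak performance, G-ops/s (from the abstract)
MEASURED_GOPS = 200.0    # best measured performance, G-ops/s (from the abstract)
POWER_BOUND_W = 4.0      # upper bound on power draw, watts (from the abstract)
NUM_PORTS = 4            # DMA interfaces (from the abstract)
PORT_MBPS = 950.0        # sustained throughput per port and direction, MB/s


def min_gops_per_watt(gops: float, power_bound_w: float) -> float:
    """Lower bound on efficiency when power is known only as an upper bound."""
    return gops / power_bound_w


def aggregate_bandwidth_gbps(num_ports: int, port_mbps: float) -> float:
    """Total sustained bandwidth in one direction, in GB/s (1 GB = 1000 MB)."""
    return num_ports * port_mbps / 1000.0


if __name__ == "__main__":
    print(min_gops_per_watt(PEAK_GOPS, POWER_BOUND_W))      # 56.75
    print(min_gops_per_watt(MEASURED_GOPS, POWER_BOUND_W))  # 50.0
    print(aggregate_bandwidth_gbps(NUM_PORTS, PORT_MBPS))   # 3.8
```

Under these assumptions the coprocessor delivers at least about 50 G-ops/s per watt on measured workloads.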

Keywords: computer vision; convolutional neural networks; embedded vision system; hardware acceleration; machine learning

Driver Cell Phone Usage Detection From HOV/HOT NIR Images

27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Artan, Yusuf; Bulan, Orhan; Loce, Robert P.; Paul, Peter

Affiliations: Xerox Res Ctr Webster, Webster, NY 14580, USA

ISBN: (print) 9781479943098

Distracted driving due to cell phone usage is an increasingly costly problem in terms of lost lives and damaged property. Motivated by its impact on public safety and property, several state and federal governments have enacted regulations that prohibit driver mobile phone usage while driving. These regulations have created a need for cell phone usage detection for law enforcement. In this paper, we propose a computer vision based method for determining driver cell phone usage using a near infrared (NIR) camera system directed at the vehicle's front windshield. The developed method consists of two stages. First, we localize the driver's face region within the front windshield image using the deformable part model (DPM). Next, we utilize a local aggregation based image classification technique to classify a region of interest (ROI) around the driver's face to detect cell phone usage. We propose two classification architectures using full-face and half-face images and compare their performance in terms of accuracy, specificity, and sensitivity. We also present a comparison of various local aggregation based image classification methods using bag-of-visual-words (BOW), vector of locally aggregated descriptors (VLAD) and Fisher vectors (FV). A data set of 1500 images was collected on a public roadway and is used to perform the experiments.
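The abstract compares classifiers by accuracy, specificity, and sensitivity. As a reminder of how these three quantities relate to a binary confusion matrix, here is a minimal sketch. It is not the authors' evaluation code: the function name `binary_metrics`, the label convention (1 = phone in use) and the toy labels are assumptions made for illustration.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity for binary labels.

    Assumed convention: 1 = phone in use (positive), 0 = no phone use (negative).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: fraction of actual phone users that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: fraction of non-users correctly left unflagged.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


if __name__ == "__main__":
    truth = [1, 1, 1, 0, 0, 0, 0, 0]       # toy ground truth, not real data
    predicted = [1, 1, 0, 0, 0, 0, 1, 0]   # toy classifier output
    print(binary_metrics(truth, predicted))
```

For an enforcement application, specificity matters because a false positive corresponds to flagging a driver who was not using a phone.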

Keywords: cell phone usage detection image classification Image classification cell phone use Drivers images driver circuits full face computer vision Windscreen Public safety Law Enforcement

Multi-Source Multi-Modal Activity Recognition in Aerial Video Surveillance

27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Hammoud, Riad I.; Sahin, Cem S.; Blasch, Erik P.; Rhodes, Bradley J.

Affiliations: BAE Syst, Burlington, MA 01803, USA; Air Force Res Lab, Rome, NY, USA

ISBN: (print) 9781479943098

Recognizing activities in wide aerial/overhead imagery remains a challenging problem due in part to low-resolution video and cluttered scenes with a large number of moving objects. In the context of this research, we deal with two unsynchronized data sources collected in real-world operating scenarios: full-motion videos (FMV) and analyst call-outs (ACO) in the form of chat messages (voice-to-text) made by a human watching the streamed FMV from an aerial platform. We present a multi-source multi-modal activity/event recognition system for surveillance applications, consisting of: (1) detecting and tracking multiple dynamic targets from a moving platform, (2) representing FMV target tracks and chat messages as graphs of attributes, (3) associating FMV tracks and chat messages using a probabilistic graph-based matching approach, and (4) detecting spatial-temporal activity boundaries. We also present an activity pattern learning framework which uses the multi-source associated data as training data to index a large archive of FMV videos. Finally, we describe a multi-intelligence user interface for querying an index of activities of interest (AOIs) by movement type and geo-location, and for playing back a summary of associated text (ACO) and activity video segments of targets-of-interest (TOIs) (in both pixel and geo-coordinates). Such tools help the end user to quickly search, browse, and prepare mission reports from multi-source data.

Keywords: FMV exploitation; MINER; activity recognition; chat and video fusion; event recognition; fusion; graph matching; graph representation; surveillance

Detecting Social Groups in Crowded Surveillance Videos Using Visual Attention

27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Leach, Michael J. V.; Baxter, Rolf; Robertson, Neil M.; Sparks, Ed P.

Affiliations: Roke Manor Res, Romsey, Hants, England; Heriot Watt Univ, Sch Engn & Phys Sci, Edinburgh, Midlothian, Scotland

ISBN: (print) 9781479943098

In this paper we demonstrate that the current state-of-the-art social grouping methodology can be enhanced with the use of visual attention estimation. In a surveillance environment it is possible to extract the gazing direction of pedestrians, a feature which can be used to improve social grouping estimation. We implement a state-of-the-art motion-based social grouping technique to obtain a baseline success rate at social grouping, and implement the same grouping with the addition of the visual attention feature. By comparing the success of the two techniques at finding social groups, we evaluate the effectiveness of including the visual attention feature. We test both methods on two datasets containing busy surveillance scenes. We find that the inclusion of visual interest improves the motion-based social grouping capability. For the Oxford data, we see a 5.6% improvement in true positives and a 28.5% reduction in false positives. We see up to a 50% reduction in false positives in other datasets. The strength of the visual feature is demonstrated by the association of social connections that are otherwise missed by the motion-only social grouping technique.

Keywords: Video surveillance; computer aided analysis; Machine vision

Generalized Autoencoder: A Neural Network Framework for Dimensionality Reduction

27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Wang, Wei; Huang, Yan; Wang, Yizhou; Wang, Liang

Affiliations: Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, CRIPAC, Beijing, Peoples R China; Peking Univ, Sch EECS, Natl Eng Lab Video Technol, Key Lab Machine Percep, Beijing, Peoples R China

ISBN: (print) 9781479943098

The autoencoder algorithm and its deep version, as traditional dimensionality reduction methods, have achieved great success via the powerful representability of neural networks. However, they use each instance only to reconstruct itself and do not explicitly model the data relation so as to discover the underlying effective manifold structure. In this paper, we propose a dimensionality reduction method by manifold learning, which iteratively explores the data relation and uses the relation to pursue the manifold structure. The method is realized by a so-called "generalized autoencoder" (GAE), which extends the traditional autoencoder in two aspects: (1) each instance $x_i$ is used to reconstruct a set of instances $\{x_j\}$ rather than itself; (2) the reconstruction error of each instance ($\|x_j - x_i'\|^2$) is weighted by a relational function of $x_i$ and $x_j$ defined on the learned manifold. Hence, the GAE captures the structure of the data space by minimizing the weighted distances between reconstructed instances and the original ones. The generalized autoencoder provides a general neural network framework for dimensionality reduction. In addition, we propose a multilayer architecture of the generalized autoencoder, called the deep generalized autoencoder, to handle highly complex datasets. Finally, to evaluate the proposed methods, we perform extensive experiments on three datasets. The experiments demonstrate that the proposed methods achieve promising performance.
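The two extensions listed in this abstract suggest an objective of roughly the following form. This is a sketch reconstructed from the abstract alone, not a formula copied from the paper: $x_i'$ denotes the reconstruction produced from instance $x_i$, and the index set $\Omega_i$ (the instances that $x_i$ is asked to reconstruct) and the weight $s_{ij}$ (the relational function of $x_i$ and $x_j$) are notation assumed here.

```latex
E \;=\; \sum_{i=1}^{n} \; \sum_{j \in \Omega_i} s_{ij} \, \lVert x_j - x_i' \rVert^{2}
```

Choosing $\Omega_i = \{i\}$ and $s_{ii} = 1$ reduces this to $\sum_i \lVert x_i - x_i' \rVert^2$, the reconstruction error of a traditional autoencoder, which is consistent with the abstract's description of the GAE as an extension of it.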

Keywords: Autoencoder; Deep learning; Dimensionality reduction

Analysis of Widely-used Descriptors for Finger-vein Recognition

9th International Conference on Computer Vision Theory and Applications (VISAPP)

Authors: Yousefi, Fariba; Sivri, Erdal; Kaya, Ozgur; Suloglu, Selma; Kalkan, Sinan

Affiliations: Middle East Tech Univ, Ankara, Turkey; SoSoft Ltd Sti, Ankara, Turkey

ISBN: (print) 9789897581335

For finger-vein recognition, many successful methods, such as Line Tracking (LT), Maximum Curvature (MC) and Wide Line Detector (WL), have been proposed. Among these, LT has a very slow matching and feature-extraction phase, and LT, MC and WL are translation and rotation dependent. Moreover, as we show in the paper, they are affected by noise. To overcome these drawbacks, we propose using popular feature descriptors widely used for several computer vision or pattern recognition (CVPR) problems in the literature. The CVPR descriptors we test include Histogram of Oriented Gradients (HOG), Fourier Descriptors (FD), Zernike Moments (ZM), Local Binary Patterns (LBP) and Global Binary Patterns (GBP), which have not been applied to the finger-vein recognition problem before. We compare these descriptors against LT, MC, and WL and evaluate their running times, performance and resilience against noise, rotation and translation. We report that the accuracies of the LT and WL methods are comparable to each other, that WL gives the best accuracy, and that the LT method is the slowest. Our results indicate that WL can be used together with ZM and GBP in case of rotation and noise, respectively.

Keywords: Finger-vein recognition; Histogram of Oriented Gradients; Fourier Descriptors; Zernike Moments; Global Binary Patterns

Landmark Based Facial Component Reconstruction for Recognition Across Pose

27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Hsu, Gee-Sern; Peng, Hsiao-Chia; Chang, Kai-Hsiang

Affiliations: Natl Taiwan Univ Sci & Technol, Dept Mech Engn, Taipei, Taiwan

ISBN: (print) 9781479943098

Unlike previous 3D face modeling approaches that consider the whole facial area, the proposed method reconstructs 3D facial components for handling cross-pose recognition. It has two phases: component reconstruction and component-based recognition. In the reconstruction phase, we first extract four component regions, namely the two eyes, nose and mouth, from each gallery face using the pose-invariant landmarks obtained by a modified version of a landmark detection algorithm. A 3D model of each component region is reconstructed using a constrained minimization scheme with a gender- and ethnicity-oriented 3D model as the reference. In the recognition phase, the pose of a given probe is determined by a set of landmarks, which guides the rotation of the reconstructed components so that they can be aligned to the probe components. The match is determined by the components instead of the whole faces, so that different components can be considered at different poses. Experiments on the PIE and Multi-PIE databases show that the proposed component-based approach not only outperforms its holistic counterpart but is also competitive with many contemporary methods.

Keywords: face recognition; face reconstruction; 3D facial component

Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

27th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2014

ISBN: (print) 9781479951178

The proceedings contain 539 papers. The topics discussed include: fast and accurate image matching with cascade hashing for 3D reconstruction; minimal solvers for relative pose with a single unknown radial distortion; spectral graph reduction for efficient image and streaming video segmentation; video motion segmentation using new adaptive manifold denoising model; event detection using multi-level relevance labels and multiple features; full-angle quaternions for robustly matching vectors of 3D rotations; semi-supervised spectral clustering for image set classification; learning mid-level filters for person re-identification; DeepReID: deep filter pairing neural network for person re-identification; NMF-KNN: image annotation using weighted multi-view non-negative matrix factorization; beyond comparing image pairs: setwise active learning for relative attributes; and histograms of pattern sets for image classification and object recognition.

Chinese chess recognition algorithm based on computer vision

26th Chinese Control and Decision Conference, CCDC 2014

Authors: Wu, Gui; Tao, Jun

Affiliations: Educational Administration Office, Jianghan University, Wuhan 430056, China; School of Mathematics and Computer Science, Jianghan University, Wuhan 430056, China

ISBN: (print) 9781479937066

This paper introduces a Chinese chess recognition algorithm based on computer vision and image processing. To simplify processing and improve efficiency, the images of the chessboard and chessmen are preprocessed in advance. The preprocessing steps are conversion from color images to gray images, filtering with a mean filter or median filter, and binarization of the gray images. The edges of the chessboard and chessmen can then be extracted from the binarized images by image segmentation. Next, the center location and circular edge of each chessman are calculated with an advanced Hough transform, which determines the location of each chessman on the chessboard and its size. Based on the features of chess images, the main recognition method is to analyze the radial pixel statistics of each chessman with mathematical morphology. Because these pixel values remain the same and stable at any angle of the chessman, the recognition algorithm achieves a good recognition rate in the experimental results. The advanced and modified recognition algorithm is shown to be practical and applicable by the experiments with a computer vision system in Chinese chess games presented in this paper. © 2014 IEEE.
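The three preprocessing steps named in this abstract (color to gray, mean filtering, binarization) can be sketched with NumPy alone. This is an illustrative sketch, not the authors' implementation: the BT.601 luma weights, the 3 x 3 window, the edge-replicating padding and the fixed threshold of 128 are all assumptions, since the abstract does not specify them, and the Hough transform and recognition stages are not covered.

```python
import numpy as np


def to_gray(rgb: np.ndarray) -> np.ndarray:
    """Convert an H x W x 3 RGB image to grayscale (assumed BT.601 luma weights)."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb[..., :3].astype(np.float64) @ weights


def mean_filter(gray: np.ndarray, k: int = 3) -> np.ndarray:
    """Apply a k x k mean filter (k odd), replicating edge pixels at the border."""
    if k < 1 or k % 2 == 0:
        raise ValueError("k must be a positive odd integer")
    pad = k // 2
    padded = np.pad(gray.astype(np.float64), pad, mode="edge")
    h, w = gray.shape
    out = np.zeros((h, w), dtype=np.float64)
    # Sum the k*k shifted copies of the image, then divide by the window size.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)


def binarize(gray: np.ndarray, threshold: float = 128.0) -> np.ndarray:
    """Return 1 where the pixel is at or above the threshold, else 0."""
    return (gray >= threshold).astype(np.uint8)


def preprocess(rgb: np.ndarray, k: int = 3, threshold: float = 128.0) -> np.ndarray:
    """Gray conversion, mean filtering, then binarization, in that order."""
    return binarize(mean_filter(to_gray(rgb), k), threshold)


if __name__ == "__main__":
    # Synthetic 6 x 6 test image: left half white, right half black.
    image = np.zeros((6, 6, 3), dtype=np.uint8)
    image[:, :3, :] = 255
    print(preprocess(image))
```

A median filter, which the abstract mentions as an alternative, would replace `mean_filter`; a fixed threshold is the simplest choice and may need tuning for real board images.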

Keywords: computer vision

13th International Conference on Document Analysis and Recognition (in English)

CAAI Transactions on Intelligent Systems (智能系统学报), 2015, Vol. 10, No. 1, p. 67

Welcome to the 13th International Conference on Document Analysis and Recognition (ICDAR 2015), hosted by the *** the Association of Sustainable Innovation in Tunisia (Tunisian Chapter of IAPR), will be held in Tunis (Tunisia) from August 23-26th, *** 2015 is sponsored by the International Association for Pattern Recognition (IAPR) and technically co-sponsored by TC-10 (Graphics Recognition), TC-11 (Reading Systems), IEEE Computer Society (pending approval).

Keywords: document analysis; document recognition; technological innovation; current state of development
