Search results - Inner Mongolia University Library

IEEE Southwest Symposium on Image Analysis and Interpretation

Authors: Viacheslav Voronin Svetlana Tokareva Evgenii Semenishchev Sos Agaian Lab. "Mathematical methods of image processing and intelligent computer vision systems" Don State Technical University Rostov-on-Don Russian Federation Dept. of Computer Science CUNY/The College of Staten Island Staten Island New York United States

This paper presents a new thermal image enhancement algorithm based on combined local and global image processing in the frequency domain. The approach relies on the fact that the relationship between stimulus and perception is logarithmic. The basic idea is to apply logarithmic transform histogram matching with a spatial equalization approach to different image blocks. The resulting image is a weighted mean of all processed blocks. The weights for every local and global enhanced image are driven by optimization of the measure of enhancement (EME). Experimental results illustrate the performance of the proposed algorithm on real thermal images in comparison with traditional methods.

Keywords: Histograms; Transforms; Image enhancement; Adaptive equalizers; Frequency-domain analysis; Computer vision
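
As a rough illustration of the EME-driven weighting mentioned in this abstract, the sketch below computes a block-based measure of enhancement and picks a blending weight between a locally enhanced and a globally enhanced image. It assumes the commonly used definition of EME as the mean over blocks of 20·log10(Imax/Imin), and it assumes that "optimization" means maximizing EME over a coarse grid of weights. The names `eme`, `blend`, and `best_blend_weight`, the `eps` guard, and the grid search are illustrative choices, not the authors' implementation.

```python
import math


def eme(image, block_rows, block_cols, eps=1e-6):
    """Block-based measure of enhancement (assumed definition).

    image is a list of rows of non-negative intensities. The image is split
    into block_rows x block_cols blocks, and the result is the mean of
    20*log10(max/min) over the blocks. eps avoids division by zero.
    """
    height, width = len(image), len(image[0])
    bh, bw = height // block_rows, width // block_cols
    total = 0.0
    for r in range(block_rows):
        for c in range(block_cols):
            block = [image[y][x]
                     for y in range(r * bh, (r + 1) * bh)
                     for x in range(c * bw, (c + 1) * bw)]
            total += 20.0 * math.log10((max(block) + eps) / (min(block) + eps))
    return total / (block_rows * block_cols)


def blend(img_a, img_b, weight):
    """Pixel-wise weighted mean: weight * img_a + (1 - weight) * img_b."""
    return [[weight * a + (1.0 - weight) * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(img_a, img_b)]


def best_blend_weight(local_img, global_img, block_rows, block_cols, steps=11):
    """Grid-search the weight in [0, 1] that maximizes EME of the blend."""
    candidates = [i / (steps - 1) for i in range(steps)]
    return max(candidates,
               key=lambda w: eme(blend(local_img, global_img, w),
                                 block_rows, block_cols))
```

A finer optimizer could replace the grid search; the grid is used here only to keep the sketch short.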

PILAE: A non-gradient descent learning scheme for deep feedforward neural networks

arXiv, 2018

Authors: Guo, Ping Wang, Ke Zhou, XiuLing The Image Processing & Pattern Recognition Lab. School of Systems Science Beijing Normal University Beijing 100875 China The School of Information Engineering Zhengzhou University Zhengzhou 450001 China The Department of Technology and Industry Development Beijing City University Beijing 100083 China

In this work, a non-gradient descent learning (NGDL) scheme is proposed for deep feedforward neural networks (DNN). Since an autoencoder can be used as the building block of a multi-layer perceptron (MLP) DNN, the MLP is taken as an example to illustrate the proposed pseudoinverse learning algorithm for autoencoders (PILAE). The PILAE with low rank approximation is an NGDL algorithm: the encoder weight matrix is set to the low rank approximation of the pseudoinverse of the input matrix, while the decoder weight matrix is calculated by the pseudoinverse learning algorithm. It is worth noting that only very few network structure hyper-parameters need to be tuned compared with classical gradient descent learning algorithms. Hence, the proposed algorithm can be regarded as a quasi-automated training algorithm that could be used in the automated machine learning field. The experimental results show that the proposed learning scheme for DNN achieves a better tradeoff between training efficiency and classification accuracy. Copyright © 2018, The Authors. All rights reserved.

Keywords: Feedforward neural networks
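
The encoder/decoder construction described in this abstract can be sketched for a single autoencoder layer as follows. This is only one plausible reading, under stated assumptions: the encoder is taken to be the rank-p factor diag(1/s_p) U_p^T of the truncated-SVD pseudoinverse of the input matrix, the activation is `tanh`, and the decoder is obtained in closed form by ridge-regularized least squares. The function name `pilae_layer` and the parameters `hidden_dim` and `ridge` are hypothetical and do not come from the paper.

```python
import numpy as np


def pilae_layer(X, hidden_dim, ridge=1e-3):
    """Train one autoencoder layer without gradient descent (sketch).

    X has shape (features, samples). Returns (W_e, W_d, H) where
    W_e is the encoder (hidden_dim x features), W_d the decoder
    (features x hidden_dim) and H the hidden activations.
    Assumes hidden_dim does not exceed the rank of X.
    """
    # Truncated SVD: pinv(X) is approximated by V_p diag(1/s_p) U_p^T.
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    # Assumption: keep the factor diag(1/s_p) U_p^T as the encoder weights.
    W_e = (U[:, :hidden_dim] / s[:hidden_dim]).T
    H = np.tanh(W_e @ X)
    # Decoder by ridge-regularized least squares: W_d = X H^T (H H^T + kI)^-1.
    A = H @ H.T + ridge * np.eye(hidden_dim)
    W_d = np.linalg.solve(A, H @ X.T).T
    return W_e, W_d, H


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((8, 50))
    W_e, W_d, H = pilae_layer(X, hidden_dim=4)
    print("relative reconstruction error:",
          np.linalg.norm(X - W_d @ H) / np.linalg.norm(X))
```

The only values left to choose are the hidden size and the regularization constant, which is consistent with the abstract's remark that few hyper-parameters need tuning.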

Eye Movement Pattern Modeling and Visual Comfort Viewing S3D images

IEEE Visual Communications and Image Processing (VCIP)

Authors: Chi Zhang Jun Zhou Xiao Gu Shouchen Zhu Alan C. Bovik Institute of Image Communication & Network Engineering Shanghai Jiao Tong University Shanghai China Shanghai Key Lab of Digital Media Processing & Transmissions Shanghai Jiao Tong University China Shanghai Yanan High School Shanghai China Department of Electrical and Computer Engineering The University of Texas at Austin USA

ISBN: (print) 9781538644591; 9781538644584

Stereoscopic-3D (S3D) displays are widely used but can cause visual discomfort for viewers. One aspect of this issue is the movement of the gaze point within different depth fields. Here we aim to analyze the relationship between eye movement patterns and the visual comfort experienced when viewing S3D images. Rather than simply labeling eye movement data according to categories such as gaze, saccade and so on, we deploy a nonparametric Bayesian method to analyze and cluster several eye movement patterns and to relate them to visual comfort. The results are relevant to the prediction of visual comfort in S3D images by automatic algorithms.

Keywords: Visualization; Hidden Markov models; Bayes methods; Brain modeling; Graphical models; Three-dimensional displays

Multi-View Images Fusion Model

IOP Conference Series: Materials Science and Engineering, 2019, Vol. 680, No. 1

Authors: M M Zhdanova V V Voronin R A Sizyakin M S Minkin A A Zelensky Lab. 'Mathematical methods of image processing and intelligent computer vision systems' Don State Technical University Gagarin sq. 1 Rostov-on-Don 344000 Russia Moscow State University of Technology 'STANKIN' Vadkovsky line 1 Moscow 127055 Russia

The tasks of action recognition and object classification are fundamental in computer vision systems. Even subtasks, such as recognition of atomic motions and single objects, form the basis for understanding the situation in the work area and the scene in general. This is especially important in video surveillance systems designed to ensure security. Thus, improving the effectiveness of recognition and classification methods is one of the primary tasks of computer vision. However, the visual methods implemented in such video surveillance systems encounter difficulties such as inhomogeneous backgrounds, uncontrolled operating environments, and irregular illumination. To address these drawbacks, the paper presents a model for combining visible range images and depth images. This model improves the quality of the recognized images and allows the construction of a more informative descriptor, which also positively affects recognition efficiency. The results show that the model performs well in fusing visible images and depth maps.

Modelling of random extra pulses during quasi-closed glottal cycle phases

10th International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2017

Authors: Aichinger, P. Roesner, I. Schoentgen, J. Pernkopf, F. Department of Otorhinolaryngology Division of Phoniatrics-Logopedics Medical University of Vienna Austria F.N.R.S. Université Libre de Bruxelles Laboratories of Image Signal Processing and Acoustics Faculty of Applied Sciences Brussels Belgium Signal Processing and Speech Communication Lab Graz University of Technology Austria

ISBN: (print) 9788864536064

The presence of random extra pulses during quasi-closed glottal cycle phases may constitute a distinct voice quality type relevant to the clinical care of disordered voices. In this paper, we propose for this voice type a glottal area waveform model that includes automatic parameter estimation. The model involves (1) extraction of the fundamental frequency, (2) estimation of the cyclic pulse times, heights and shapes, (3) Fourier synthesis of a cyclic pulse train model, (4) closed phase estimation via fitting an inverted parabola to the averaged pulse shape, (5) estimation of the random extra pulses' positions and shapes, and (6) pulse shape filtering based synthesis of the random extra pulses. For a typical voice sample, the root mean square error energy level of the purely cyclic model is -13.2 dB, which improves by 1.5 dB when extra pulses are added to the model. © Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA *** right reserved.

Keywords: Mean square error

Smart Cloud System for Forensic Thermal Image Enhancement Using Local and Global Logarithmic Transform Histogram Matching

IEEE International Conference on Smart Cloud (SmartCloud)

Authors: Viacheslav Voronin Evgenii Semenishchev Vladimir Frants Sos Agaian Lab. “Mathematical methods of image processing and intelligent computer vision systems” Don State Technical University Rostov-on-Don Russian Federation “STANKIN” Moscow State University of Technology Moscow Russian Federation Dept. of Computer Science CUNY/The College of Staten Island Staten Island New York United States

Digital images used in the investigation of a crime often undergo several concurrent enhancement operations for improved automated analysis. The challenges are related to the large size of the data and the complexity of forensic image processing. Our purpose is to provide a smart cloud system for image processing on PCs and smartphones with limited computational complexity. This paper presents a new thermal image enhancement algorithm based on combined local and global image processing in the frequency domain. The approach relies on the fact that the relationship between stimulus and perception is logarithmic. The basic idea is to apply logarithmic transform histogram matching with a spatial equalization approach to different image blocks. The resulting image is a weighted mean of all processed blocks. The weights for every local and global enhanced image are driven by optimization of the measure of enhancement (EME). Experimental results illustrate the performance of the proposed cloud system on real thermal images in comparison with traditional methods.

Keywords: Histograms; Transforms; Image enhancement; Forensics; Adaptive equalizers; Clouds

Relaxed spatio-temporal deep feature aggregation for real-fake expression prediction

arXiv, 2017

Authors: Ozkan, Savas Akar, Gozde Bozdagi TUBITAK UZAY Image Processing Group Ankara Turkey Middle East Technical University Multimedia Lab. Ankara Turkey

Frame-level visual features are generally aggregated in time with techniques such as LSTM, Fisher Vectors, NetVLAD, etc. to produce a robust video-level representation. Here we introduce a learnable aggregation technique whose primary objective is to retain the short-time temporal structure between frame-level features and their spatial interdependencies in the representation. It can also be easily adapted to cases where training samples are very scarce. We evaluate the method on a real-fake expression prediction dataset to demonstrate its superiority. Our method obtains a 65% score on the test dataset in the official MAP evaluation, only one misclassified decision short of the best reported result in the Chalearn Challenge (i.e. 66.7%). Lastly, we believe that this method can be extended to other problems such as action/event recognition in the future. Copyright © 2017, The Authors. All rights reserved.

Keywords: Long short-term memory

Relaxed Spatio-Temporal Deep Feature Aggregation for Real-Fake Expression Prediction

International Conference on Computer Vision Workshops (ICCV Workshops)

Authors: Savas Ozkan Gozde Bozdagi Akar Middle East Technical University Multimedia Lab. Ankara Turkey TUBITAK UZAY Image Processing Group Ankara Turkey

Frame-level visual features are generally aggregated in time with techniques such as LSTM, Fisher Vectors, NetVLAD, etc. to produce a robust video-level representation. Here we introduce a learnable aggregation technique whose primary objective is to retain the short-time temporal structure between frame-level features and their spatial interdependencies in the representation. It can also be easily adapted to cases where training samples are very scarce. We evaluate the method on a real-fake expression prediction dataset to demonstrate its superiority. Our method obtains a 65% score on the test dataset in the official MAP evaluation, only one misclassified decision short of the best reported result in the Chalearn Challenge (i.e. 66.7%). Lastly, we believe that this method can be extended to other problems such as action/event recognition in the future.

Keywords: Feature extraction; Visualization; Computer architecture; Computational modeling; Data models; Robustness; Face

Derivatives and Inverse of Cascaded Linear+Nonlinear Neural Models

arXiv, 2017

Authors: Martinez-Garcia, M. Cyriac, P. Batard, T. Bertalmío, M. Malo, J. Image Processing Lab. Univ. València Spain Instituto de Neurociencias CSIC Alicante Spain Information and Communication Technologies Dept. Univ. Pompeu Fabra Barcelona Spain

In vision science, cascades of Linear+Nonlinear transforms are very successful in modeling a number of perceptual experiences [1]. However, the conventional literature is usually too focused on only describing the forward input-output transform. Instead, in this work we present the mathematics of such cascades beyond the forward transform, namely the Jacobian matrices and the inverse. The fundamental reason for this analytical treatment is that it offers useful analytical insight into the psychophysics, the physiology, and the function of the visual system. For instance, we show how the trends of the sensitivity (volume of the discrimination regions) and the adaptation of the receptive fields can be identified in the expression of the Jacobian w.r.t. the stimulus. This matrix also tells us which regions of the stimulus space are encoded more efficiently in multi-information terms. The Jacobian w.r.t. the parameters shows which aspects of the model have bigger impact in the response, and hence their relative relevance. The analytic inverse implies conditions for the response and model parameters to ensure appropriate decoding. From the experimental and applied perspective, (a) the Jacobian w.r.t. the stimulus is necessary in new experimental methods based on the synthesis of visual stimuli with interesting geometrical properties, (b) the Jacobian matrices w.r.t. the parameters are convenient to learn the model from classical experiments or alternative goal optimization, and (c) the inverse is a promising model-based alternative to blind machine-learning methods for neural decoding that do not include meaningful biological information. The theory is checked by building and testing a vision model that actually follows the modular program suggested in [1]. Our illustrative derivable and invertible model consists of a cascade of modules that account for brightness, contrast, energy masking, and wavelet masking. To stress the generality of this modular setting we show exa

Keywords: Mathematical transformations
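
For readers unfamiliar with the setting, the Jacobian and inverse of a Linear+Nonlinear cascade mentioned in this abstract follow from the chain rule. The block below is a generic sketch, not the paper's own derivation. The notation is assumed for illustration: stage i applies a linear map L^{(i)} followed by a nonlinearity N^{(i)}, x^{(0)} is the stimulus, x^{(n)} is the final response, and every L^{(i)} and N^{(i)} is taken to be invertible for the last line.

```latex
% Assumed notation: stage i = linear map L^{(i)} followed by nonlinearity N^{(i)}.
\begin{align}
  y^{(i)} &= L^{(i)} x^{(i-1)}, \qquad
  x^{(i)} = N^{(i)}\!\big(y^{(i)}\big), \qquad i = 1, \dots, n \\
  % Jacobian of the response w.r.t. the stimulus: product of stage Jacobians.
  \nabla_{x^{(0)}} x^{(n)} &=
    \big(\nabla_{y^{(n)}} N^{(n)}\big) L^{(n)} \,
    \big(\nabla_{y^{(n-1)}} N^{(n-1)}\big) L^{(n-1)} \cdots
    \big(\nabla_{y^{(1)}} N^{(1)}\big) L^{(1)} \\
  % Inverse: undo the stages in reverse order, for i = n, ..., 1.
  x^{(i-1)} &= \big(L^{(i)}\big)^{-1} \big(N^{(i)}\big)^{-1}\!\big(x^{(i)}\big)
\end{align}
```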

Blind Visual Quality Assessment for Smart Cloud-Based Video Storage

IEEE International Conference on Smart Cloud (SmartCloud)

Authors: Vladimir Frants Viacheslav Voronin Alexander Zelenskiy Sos Agaian Metrological lab. “Small GIC” Moscow State University of Technology “STANKIN” Moscow Russian Federation Pro-rector for Research Work and R&D Politics Moscow State University of Technology “STANKIN” Moscow Russian Federation Lab. “Mathematical methods of image processing and intelligent computer vision systems.” Don State Technical University Rostov-on-Don Russian Federation Dept. of Computer Science CUNY The College of Staten Island Staten Island New York United States

In this paper, we present a new video quality metric targeted for use within cloud-based video storage systems. Because of the limited capacity of storage solutions currently in use, it is common for stored videos to have low perceived quality. The ability to predict the quality of a coded video sequence in a fully automatic fashion is crucial for optimal selection of coding parameters. We have developed a general quality estimation approach applicable to different kinds of video content and useful for online correction of the perceived quality of stored video sequences. The proposed objective video quality metric is designed to have a high level of correlation with human-based quality assessment results. A high level of conformity between predicted and perceived video quality is achieved by using a convolutional neural network trained on a large volume of video data in the framework of generative adversarial learning, combined with carefully selected regularization techniques. Evaluation of the developed VQA metric on commonly used datasets shows equal or better correlation with MOS than current state-of-the-art approaches.

Keywords: Quality assessment; Measurement; Video recording; Video sequences; Streaming media; Training; Visualization
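
The "correlation with MOS" cited in this abstract is usually reported as a linear (Pearson) and a rank-order (Spearman) correlation between predicted scores and mean opinion scores. The abstract does not say which coefficient the authors used, so the sketch below simply shows both, in plain Python. The function names and the sample numbers in the usage section are made up for illustration and are not results from the paper.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient between two sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    mos = [1.0, 2.0, 3.0, 4.0, 5.0]
    predicted = [1.0, 4.0, 9.0, 16.0, 25.0]
    print("Pearson:", pearson(predicted, mos))
    print("Spearman:", spearman(predicted, mos))
```

Spearman correlation depends only on ordering, so a metric that ranks videos correctly but is nonlinearly related to MOS can score perfectly on it while scoring lower on Pearson correlation.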
