The presence of random extra pulses during quasi-closed glottal cycle phases may constitute a distinct voice quality type relevant to the clinical care of disordered voices. In this paper, we propose for this voice ty...
Efficient video transmission over unreliable channels can face serious challenges from unavoidable bit errors or packet loss. Error concealment (EC) techniques at the decoder side have been developed to recover the...
Alpha-stable distributions have recently been recognized in the signal processing community as simple, yet very accurate, two-parameter statistical models for signals and noises that contain an impulsive component of ...
This paper presents a decolorization method that maintains gradient and saliency as features during the conversion, in order to preserve local and global visual perception. First, we construct a linear parametric mapping ...
ISBN: 9789811081071 (print)
More and more families are buying 3D TVs to improve their viewing experience. The stereo perception produced by 3D images and videos gives users a strongly immersive experience. However, visual fatigue accumulates and troubles users after they watch 3D TV for a long time. When a viewer watches 3D images, past recognition experience and the visual attention mechanism drive the gaze point of the two eyes to move among objects at different depths of field. The eye movement in this process is called vergence. Vergence can be defined as the movement of the eyes in opposite directions to place the area of interest on the fovea, and accommodation as the alteration of the lens to bring and keep the area of interest in focus on the fovea. The more frequently vergence occurs, the more uncomfortable the viewer feels. We aim to obtain several eye movement patterns, which can be regarded as typical visual attention patterns, by building a top-down recognition and visual attention model and then applying clustering methods to find them. To this end, we use an eye tracker to record eye movement data and model the data as a Bayesian network. The generative model is based on the beta process, and we build an autoregressive HMM (AR-HMM) to describe the relationship between latent eye movement patterns and the eye movement data. To uncover the parameters that represent the different eye movement patterns in this model, we estimate them iteratively with an MCMC method. This work reveals distinct latent patterns in the sequential eye movement data. By analyzing these patterns, we can identify similarities and differences in visual attention between different people viewing the same image and between different images viewed by the same person. These conclusions can help improve the quality of 3D images and thus lessen users’ visual fatigue when watching 3D TV. This work also will con
Authors: Caridakis, George; Diamanti, Olga; Karpouzis, Kostas; Maragos, Petros
Image, Video and Multimedia Systems Lab., National Technical University of Athens, Iroon Polytexneiou 9, 15780 Athens, Greece
Computer Vision, Speech Communication and Signal Processing Group, National Technical University of Athens, Iroon Polytexneiou 9, 15780 Athens, Greece
ISBN: 9781605580678 (print)
This work focuses on two of the research problems in automatic sign language recognition: robust computer vision techniques for consistent hand detection and tracking that preserve the hand shape contour, which is useful for extracting handshape-related features; and a novel classification scheme incorporating Self-Organizing Maps (SOMs), Markov chains and Hidden Markov Models. Geodesic Active Contours enhanced with skin color and motion information are employed for hand detection and extraction of the hand silhouette, while the extracted features describe hand trajectory, region and shape. The extracted features are used as input to separate classifiers, forming a robust and adaptive architecture whose main contribution is the optimal utilization of the neighboring characteristic of the SOM during the decoding stage of the Markov chain representing the sign class. Copyright 2008 ACM.
Non-invasive gaze estimation from eye images alone, captured by a camera, is a challenging problem because of varying eye shapes, eye structures and image quality. Recently, CNNs have been applied to directly regress e...
ISBN: 9798350353006 (digital)
ISBN: 9798350353013 (print)
Finding correspondences between images is essential for many computer vision tasks, and sparse matching pipelines have been popular for decades. However, matching noise within and between images, along with inconsistent key-point detection, frequently degrades matching performance. We review these problems and propose: 1) a novel and unified Filtering and Calibrating (FC) approach that jointly rejects outliers and optimizes inliers, and 2) leveraging both the matching context and the underlying image texture to remove matching uncertainties. Guided by these ideas, we construct the Filtering and Calibrating Graph Neural Network (FC-GNN), which follows the FC approach to recover reliable and accurate correspondences from various interferences. FC-GNN combines contextual and local information through careful embedding and multiple information aggregations, predicting confidence scores and calibration offsets for the input correspondences to jointly filter out outliers and improve pixel-level matching accuracy. Moreover, we exploit the local coherence of matches to perform inference on local graphs, thereby reducing computational complexity. Overall, FC-GNN operates at lightning speed and can greatly boost the performance of diverse matching pipelines across various tasks, showcasing the potential of such approaches to become standard components of image matching. Code is available at https://***/xuy123456/fcgnn.
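The final step described in this abstract, using predicted confidence scores and calibration offsets to filter and refine correspondences, might look roughly like the sketch below. It is not the FC-GNN code: the function name `filter_and_calibrate`, the 0.5 threshold, and the choice to apply each offset to the keypoint in the second image are all assumptions made for illustration, and the network that would produce the scores and offsets is omitted.

```python
from typing import List, Tuple

Match = Tuple[float, float, float, float]  # (x1, y1, x2, y2): one keypoint pair
Offset = Tuple[float, float]               # (dx, dy) correction in pixels


def filter_and_calibrate(
    matches: List[Match],
    confidences: List[float],
    offsets: List[Offset],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> List[Match]:
    """Drop low-confidence matches and shift the second keypoint of the rest."""
    refined = []
    for (x1, y1, x2, y2), conf, (dx, dy) in zip(matches, confidences, offsets):
        if conf < threshold:
            continue  # treated as an outlier and rejected
        # Assumption: the offset corrects the keypoint in the second image.
        refined.append((x1, y1, x2 + dx, y2 + dy))
    return refined


# Toy inputs standing in for the network's predictions.
matches = [(10.0, 20.0, 110.0, 25.0), (50.0, 60.0, 300.0, 5.0)]
confidences = [0.9, 0.1]
offsets = [(0.5, -0.25), (0.0, 0.0)]
refined = filter_and_calibrate(matches, confidences, offsets)
# refined == [(10.0, 20.0, 110.5, 24.75)]
```

Doing both operations in one pass mirrors the joint filtering and calibration the abstract describes, although the real method predicts both quantities from graph features rather than receiving them as inputs.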
Anemia is a common medical condition affecting millions worldwide, particularly in developing countries. Early detection of anemia is crucial for prompt treatment and prevention of its potential complications. In recent years, deep learning (DL) has shown great potential in various medical applications, including medical image classification, anomaly detection, and segmentation. This study proposes a transfer learning-based approach that uses a pre-trained DL model to detect anemia from palpebral conjunctiva images. The proposed method takes a pre-trained DenseNet-201 model and fine-tunes it on a target dataset of palpebral conjunctiva images. Deep features computed from the fine-tuned DenseNet-201 are fed to a multilayer perceptron (MLP) to identify anemia. The method is evaluated on a publicly available anemia dataset, where it achieves an accuracy of 93.7% in detecting anemia from palpebral conjunctiva images. In addition to anemia classification, we estimate the hemoglobin level from palpebral conjunctiva images based on the statistical properties of the gray-level co-occurrence matrix (GLCM). These GLCM properties are given to a support vector regressor and a polynomial regressor, and the mean of the two predicted scores is used as the hemoglobin estimate. Experimental results show that the proposed model achieves an average root mean square error of 0.72 for conjunctiva images.
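To make the hemoglobin-estimation step more concrete, here is a minimal sketch of computing GLCM statistics and averaging two regressors' outputs. It rests on several assumptions: the abstract does not say which GLCM properties or pixel offsets were used, so contrast and homogeneity at a one-pixel horizontal offset are illustrative choices. The function names are hypothetical, and the two regressors are represented only by placeholder prediction values rather than trained models.

```python
from typing import Dict, List


def glcm(image: List[List[int]], levels: int, dx: int = 1, dy: int = 0) -> List[List[float]]:
    """Normalized (non-symmetric) co-occurrence matrix for one pixel offset."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    if total == 0:
        raise ValueError("offset leaves no pixel pairs inside the image")
    return [[v / total for v in row] for row in counts]


def glcm_properties(p: List[List[float]]) -> Dict[str, float]:
    """Two common GLCM statistics; the paper's exact feature set is not stated."""
    n = len(p)
    contrast = sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))
    homogeneity = sum(p[i][j] / (1 + (i - j) ** 2) for i in range(n) for j in range(n))
    return {"contrast": contrast, "homogeneity": homogeneity}


def ensemble_estimate(pred_a: float, pred_b: float) -> float:
    """Mean of two regressors' predictions, as described in the abstract."""
    return (pred_a + pred_b) / 2.0


# Toy 2-level image standing in for a quantized conjunctiva patch.
toy = [[0, 0, 1],
       [1, 1, 0]]
features = glcm_properties(glcm(toy, levels=2))
# features == {"contrast": 0.5, "homogeneity": 0.75}

# Placeholder outputs of the two regressors (made-up numbers, not results).
estimate = ensemble_estimate(11.0, 12.0)
# estimate == 11.5
```

In practice the feature dictionary would be turned into a vector and passed to the trained support vector and polynomial regressors; only the averaging of their outputs is taken directly from the abstract.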
Fundus imaging is a valuable diagnostic tool in ophthalmology, providing clinicians with detailed visualizations of the retina and aiding in the detection and monitoring of various eye diseases, including age-related macular degeneration (AMD), glaucoma, diabetic retinopathy (DR), and cataract. However, the quality of fundus images can be significantly affected by noise, mainly additive white Gaussian noise (AWGN), which is inherent in many imaging systems. The presence of noise in real-world data poses significant challenges for computer vision tasks, and in medical image classification a wrong diagnosis has heavy consequences. Understanding the impact of AWGN on fundus images is crucial for developing practical denoising algorithms and improving diagnostic accuracy. This work presents an analysis of AWGN in fundus images that aims to characterize its effects on image quality and assess its impact on diagnostic tasks. The work also analyzes the performance, in the presence of AWGN, of six models, three from each of two popular deep learning architectures: Convolutional Neural Networks (CNN) and Vision Transformers (ViT). To conduct the analysis, AWGN is first added to the clean image datasets. The CNN and ViT models are then trained on the noisy datasets to evaluate performance on the image classification task. The work also involves six denoising algorithms and a popular image enhancement algorithm, Contrast Limited Adaptive Histogram Equalization (CLAHE).
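The noise-injection step mentioned in this abstract can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the function name `add_awgn`, the noise standard deviation `sigma`, the 8-bit grayscale pixel range, and the fixed seed are all assumptions, since the abstract does not give the noise levels or image format used.

```python
import random
from typing import List, Optional


def add_awgn(image: List[List[int]], sigma: float, seed: Optional[int] = None) -> List[List[int]]:
    """Add zero-mean Gaussian noise to each pixel and clip to the 8-bit range."""
    rng = random.Random(seed)  # local generator so results are reproducible
    noisy = []
    for row in image:
        noisy_row = []
        for pixel in row:
            value = round(pixel + rng.gauss(0.0, sigma))
            noisy_row.append(min(255, max(0, value)))  # clip to [0, 255]
        noisy.append(noisy_row)
    return noisy


# Toy 2x3 grayscale patch standing in for a fundus image.
clean = [[0, 128, 255],
         [64, 192, 32]]
noisy = add_awgn(clean, sigma=25.0, seed=0)  # sigma=25 is a hypothetical noise level
```

The noise is independent per pixel and has the same variance everywhere, which is what "additive white Gaussian" means. Clipping after the addition keeps the result a valid 8-bit image, at the cost of slightly distorting the noise distribution near 0 and 255.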