ISBN: 9789612480295 (print)
This paper presents a system that transforms images acquired with a camera into sounds. The system is designed for visually impaired people and converts real-time images into sounds according to a defined algorithm that preserves the visual information while respecting the limitations of the human auditory system. The image resolution to be used for conversion will be determined by future tests and statistics. The hardware implementation will have to rely on the capabilities of a portable device, such as a PDA, a mobile phone, or a purpose-built embedded system with a microcontroller, and the images will be received through a small webcam. The results obtained with this system configuration are to be evaluated in testing.
ISBN: 9789612480363 (print)
The performance of a minutiae-based palmprint recognition system relies heavily on the quality of the captured palmprint images. This paper introduces a block-based approach to evaluating the quality of a palmprint, in which two inter-block indices and four inner-block indices are developed. The inter-block quality indices are block-level orientation continuity and ridge thickness uniformity, while the inner-block quality indices are smudginess and dryness judgment, the certainty of orientation in a block, ridge-valley frequency value, and special-region information. The final block quality value and image quality value are calculated from these six indices. In the experiments, we evaluate the proposed measurement in three ways: by comparing it with human visual inspection, by observing the rise in accuracy rate after masking, and by examining the relationship between the GI value and the final image quality value Q. The results show that our measurement can predict the quality of palmprint images accurately.
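As an illustration of one inner-block index, the certainty of orientation in a block is often measured as the coherence of the gradient covariance. The sketch below assumes that common formulation; the abstract does not state the formula the authors use, and the function name `orientation_certainty` is hypothetical.

```python
import math

def orientation_certainty(block):
    """Coherence of the gradient covariance of a gray-level block.

    Assumed formulation (not taken from the paper): returns a value in
    [0, 1], where 1 means a single dominant ridge orientation and 0
    means no preferred orientation (or a flat block).
    """
    h, w = len(block), len(block[0])
    gxx = gyy = gxy = 0.0
    # Central-difference gradients over the interior pixels of the block.
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (block[y][x + 1] - block[y][x - 1]) / 2.0
            gy = (block[y + 1][x] - block[y - 1][x]) / 2.0
            gxx += gx * gx
            gyy += gy * gy
            gxy += gx * gy
    total = gxx + gyy
    if total == 0.0:
        return 0.0  # flat block: no gradient energy, no orientation
    return math.sqrt((gxx - gyy) ** 2 + 4.0 * gxy ** 2) / total
```

A block whose gray levels vary along one axis only scores close to 1, while a constant block scores 0.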
ISBN: 9781479902880 (print)
In this paper we propose an efficient multi-phase segmentation method for color images based on the piecewise constant multi-phase Vese-Chan model and the split Bregman method. The proposed model is first presented in a four-phase level set formulation and then extended to a multi-phase formulation. The four-phase and multi-phase energy functionals are defined, and the corresponding minimization problems of the proposed active contour model are presented. The split Bregman method is applied to minimize the multi-phase energy functional efficiently. The proposed model has been applied to synthetic and real color images with promising results, and numerical results demonstrate its advantages.
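For orientation, the standard four-phase piecewise constant Vese-Chan energy for a scalar image $u_0$ with two level set functions $\phi_1, \phi_2$ can be written as below. This is the textbook form, given here as background; the abstract does not give the authors' color-image functional, which presumably sums the data terms over the color channels (an assumption).

```latex
\begin{aligned}
E(c, \phi_1, \phi_2) ={}& \int_\Omega (u_0 - c_{11})^2 \, H(\phi_1) H(\phi_2) \, dx
 + \int_\Omega (u_0 - c_{10})^2 \, H(\phi_1) \bigl(1 - H(\phi_2)\bigr) \, dx \\
&+ \int_\Omega (u_0 - c_{01})^2 \, \bigl(1 - H(\phi_1)\bigr) H(\phi_2) \, dx
 + \int_\Omega (u_0 - c_{00})^2 \, \bigl(1 - H(\phi_1)\bigr) \bigl(1 - H(\phi_2)\bigr) \, dx \\
&+ \nu \int_\Omega \lvert \nabla H(\phi_1) \rvert \, dx
 + \nu \int_\Omega \lvert \nabla H(\phi_2) \rvert \, dx
\end{aligned}
```

Here $H$ is the Heaviside function, $c_{ij}$ are the constant intensities of the four phases, and $\nu > 0$ weights the contour length terms.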
ISBN: 9781665475921 (print)
Proliferative Diabetic Retinopathy (PDR) is a serious retinal disease threatening diabetic patients. Intense retinal neovascularization in the retinal image is the most important clinical symptom of PDR and leads to visual distortion if not controlled. Accurate and timely detection of neovascularization from retinal images allows patients to receive adequate treatment and avoid further vision loss. In this work, we propose an automatic retinal neovascularization segmentation model based on an improved Pyramid Scene Parsing Network (PSP-Net). To improve the accuracy of the model, we introduce the proposed channel attention module into it. The network is evaluated with color fundus images from clinical practice. The evaluation results show that the network is superior to FCN, SegNet, U-Net and PSP-Net in accuracy and sensitivity. The model achieves accuracy, sensitivity, specificity, precision and Jaccard similarity scores of 0.9832, 0.9265, 0.9897, 0.9116 and 0.8501, respectively. Extensive experimental results show that the network model improves segmentation accuracy, can relieve the workload of doctors, and is worthy of further clinical promotion.
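The five reported scores follow the usual pixel-wise definitions for binary segmentation. The sketch below computes them from a predicted mask and a ground-truth mask; the function name and the flat 0/1 input format are assumptions for illustration, not the authors' evaluation code.

```python
def segmentation_metrics(pred, truth):
    """Pixel-wise metrics for binary masks given as flat sequences of 0/1."""
    pairs = list(zip(pred, truth))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)
    tn = sum(1 for p, t in pairs if p == 0 and t == 0)
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)

    def ratio(num, den):
        # Return 0.0 instead of raising when a class is absent.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),   # recall on lesion pixels
        "specificity": ratio(tn, tn + fp),   # recall on background pixels
        "precision": ratio(tp, tp + fp),
        "jaccard": ratio(tp, tp + fp + fn),  # intersection over union
    }
```

Because neovascular pixels are typically a small fraction of a fundus image, accuracy and specificity tend to be dominated by the background, which is why sensitivity and the Jaccard score are reported alongside them.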
ISBN: 9781665475921 (print)
In this paper, we propose a high-frequency guided CNN for video compression artifact reduction. In the proposed method, the high-frequency component of the Y channel is extracted and used to guide the quality enhancement of the Y, U and V channels, because the high-frequency component contains the edge and contour information of the objects in the image, which is vital to both subjective and objective quality. The proposed method consists of two modules: the high-frequency guidance module and the quality enhancement module. The high-frequency guidance module uses multiple octave convolutions to extract the high-frequency component of the Y channel and then fuses it into the features of the Y, U and V channels. In the quality enhancement module, multiple CNN residual blocks are used to enhance the quality of the Y, U and V channels. The proposed method was integrated into both HM-16.22 and VTM-16.0. The results on the JVET test sequences under the All Intra configuration show the effectiveness of the proposed method. Compared with HEVC, the proposed method achieves average BD-rate changes of -12.3%, -22.7% and -23.5% for the Y, U and V channels, respectively. Compared with VVC, the corresponding average BD-rate changes are -6.7%, -12.3% and -13.2%.
ISBN: 9781424448562 (print)
Disparity estimation is an important technique in stereo video coding. This paper presents a disparity estimation algorithm based on edge detection. The algorithm exploits a characteristic of human vision, namely that the human eye is more sensitive to distortion in edge regions. Therefore, joint estimation is used for edge detection. Large coding block sizes are used for the background and flat areas, while small block sizes are used for edge regions. Compared to the disparity estimation algorithm proposed in [8], the proposed algorithm can greatly improve the encoding speed of stereo video without affecting subjective image quality.
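The idea of varying the block size with edge content can be sketched with plain SAD block matching along a scanline. Everything below is an illustrative assumption: the abstract does not specify the matching cost, the block sizes, or the edge detector, and the names `choose_block_size` and `block_disparity` are hypothetical.

```python
def choose_block_size(edge_map, y, x, large=16, small=4):
    """Pick a small block if the large block at (y, x) contains an edge pixel."""
    rows = edge_map[y:y + large]
    has_edge = any(any(row[x:x + large]) for row in rows)
    return small if has_edge else large

def block_disparity(left, right, y, x, size, max_disp):
    """Disparity d in [0, max_disp] minimizing the sum of absolute differences
    between the left block at (y, x) and the right block at (y, x - d)."""
    best_d, best_cost = 0, float("inf")
    for d in range(max_disp + 1):
        if x - d < 0:
            break  # candidate block would fall outside the right image
        cost = sum(
            abs(left[y + i][x + j] - right[y + i][x - d + j])
            for i in range(size)
            for j in range(size)
        )
        if cost < best_cost:  # strict comparison keeps the smallest d on ties
            best_cost, best_d = cost, d
    return best_d
```

Large blocks mean fewer searches in flat regions, which is where the speed-up described in the abstract would come from; small blocks keep the disparity field precise near edges.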
ISBN: 0819444111 (print)
This paper outlines a generalized image reconstruction approach to improve the resolution of an Electro-Optic (EO) imaging sensor using multiple frames of an image sequence. The method assumes only that the constituent video contains some ambient motion between the sensor and the stationary background and that the optical image is physically captured by a staring focal plane array.
ISBN: 9781728180687 (print)
Intra prediction is an essential component of image coding. This paper presents an intra prediction framework based entirely on neural network modes (NM). Each NM can be regarded as a regression from the neighboring reference blocks to the current coding block. (1) For variable block sizes, we use different network structures: fully connected networks for small 4x4 and 8x8 blocks, and convolutional neural networks for large 16x16 and 32x32 blocks. (2) For each prediction mode, we develop a specific pre-trained network to boost the regression accuracy. When integrated into the HEVC test model, the framework saves 3.55%, 3.03% and 3.27% BD-rate for the Y, U and V components compared with the anchor. As far as we know, this is the first work to explore a fully NM-based framework for intra prediction, and we reach a better coding gain with lower complexity than previous work.
ISBN: 9781424414369 (print)
Automatically discovering common visual patterns in images is very challenging due to the uncertainties in the visual appearances of such spatial patterns and the enormous computational cost involved in exploring the huge solution space. Instead of performing exhaustive search on all possible candidates of such spatial patterns at various locations and scales, this paper presents a novel and very efficient algorithm for discovering common visual patterns by designing a provably correct and computationally efficient pruning procedure that has a quadratic complexity. This new approach is able to efficiently search a set of images for unknown visual patterns that exhibit large appearance variations because of rotation, scale changes, slight view changes, color variations and partial occlusions.
ISBN: 9781424414369 (print)
This paper introduces an efficient method to substantially increase the recognition performance of a vocabulary tree based recognition system. We propose to enhance the hypothesis obtained by a standard inverse object voting algorithm with reliable descriptor co-occurrences. The algorithm operates on different layers of a standard k-means tree benefiting from the advantages of different levels of information abstraction. The visual vocabulary tree shows good results when a large number of distinctive descriptors form a large visual vocabulary. Co-occurrences perform well even on a coarse object representation with a small number of visual words. An arbitration strategy with minimal computational effort combines the specific strengths of the particular representations. We demonstrate the achieved performance boost and robustness to occlusions in a challenging object recognition task.