Violent interaction detection is of vital importance in video surveillance scenarios such as railway stations, prisons or psychiatric centres. Existing vision-based methods rely mainly on hand-crafted features, such as statistical features between motion regions, leading to poor adaptability to other datasets. Inspired by the development of convolutional networks for common activity recognition, we construct a FightNet to represent complicated visual violent interactions. In this paper, a new input modality, the image acceleration field, is proposed to better extract motion attributes. Firstly, each video is decomposed into RGB frames. Secondly, the optical flow field is computed from consecutive frames, and the acceleration field is obtained from the optical flow field. Thirdly, FightNet is trained with three kinds of input modalities: RGB images for the spatial network, and optical flow images and acceleration images for the temporal networks. By fusing the results from the different inputs, we conclude whether a video contains a violent event or not. To provide researchers a common ground for comparison, we have collected a violent interaction dataset (VID) containing 2314 videos, of which 1077 show fights and 1237 do not. Experimental comparisons with other algorithms demonstrate that the proposed model achieves higher accuracy and better robustness for violent interaction detection.
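The acceleration-field construction described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes OpenCV's Farneback dense optical flow and approximates the acceleration field as the difference between consecutive flow fields.

```python
# Minimal sketch (not the authors' code): dense optical flow with OpenCV's
# Farneback method, and an "acceleration field" approximated as the
# difference of consecutive flow fields.
import cv2

def acceleration_fields(video_path):
    cap = cv2.VideoCapture(video_path)
    ok, prev = cap.read()
    if not ok:
        return []
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    prev_flow = None
    accels = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        if prev_flow is not None:
            accels.append(flow - prev_flow)  # acceleration ~ temporal derivative of flow
        prev_gray, prev_flow = gray, flow
    cap.release()
    return accels
```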
It is challenging to capture a high-dynamic-range (HDR) scene using a low-dynamic-range (LDR) camera. This paper presents an approach for improving the dynamic range of cameras by using multiple exposure images of the same scene taken under different exposure times. First, the camera response function (CRF) is recovered by solving a high-order polynomial in which only the ratios of the exposures are used. Then, the HDR radiance image is reconstructed by a weighted summation of the individual radiance maps. After that, a novel local tone mapping (TM) operator is proposed for the display of the HDR radiance image. By solving the high-order polynomial, the CRF can be recovered quickly and easily. Taking the local image features and the characteristics of the histogram statistics into consideration, the proposed TM operator preserves local details efficiently. Experimental results demonstrate the effectiveness of our method, which outperforms other methods in terms of imaging quality.
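The radiance-map merging step can be illustrated by the following sketch. It is written under stated assumptions (a hat-shaped per-pixel weight and an already-recovered inverse CRF `inv_crf`), is not the paper's exact formulation, and omits both the polynomial CRF recovery and the tone mapping operator.

```python
# Sketch only: merge multiple exposures into an HDR radiance map by weighted
# averaging of per-exposure radiance estimates. `inv_crf` maps pixel values to
# relative radiance and would come from the recovered camera response function.
import numpy as np

def merge_hdr(images, exposure_times, inv_crf):
    """images: list of float arrays in [0, 1]; exposure_times: list of floats."""
    num = np.zeros_like(images[0], dtype=np.float64)
    den = np.zeros_like(images[0], dtype=np.float64)
    for img, t in zip(images, exposure_times):
        w = 1.0 - np.abs(2.0 * img - 1.0)    # hat weight: trust mid-tones most
        radiance = inv_crf(img) / t           # radiance estimate from this exposure
        num += w * radiance
        den += w
    return num / np.maximum(den, 1e-8)
```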
Gliomas are the most common primary brain malignancies, with different degrees of aggressiveness, variable prognosis and various heterogeneous histologic sub-regions, i.e., peritumoral edematous/invaded tissue, necrot...
Arabic character recognition is a challenging problem in several artificial intelligence applications, especially when recognizing connected cursive letters. Another dimension of complexity is that Arabic characters may take various shapes depending on their positions in the word. As a result, unconstrained handwritten Arabic character recognition has not been well explored. In this study, we propose an efficient algorithm for Arabic character recognition. The new algorithm combines features extracted from the curvelet and spatial domains. The curvelet domain is multiscale and multidirectional, and is therefore efficient at representing edges and curves, while the spatial domain preserves the original aspects of the characters. The combined feature vector is then used to train a back-propagation neural network for the recognition task. The proposed algorithm is evaluated using a database containing 5,600 handwritten characters from 50 different writers, achieving a promising average success rate of 90.3%. The proposed algorithm is therefore suitable for unconstrained handwritten Arabic character recognition applications.
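The feature-combination and training stage might look like the following sketch. It is illustrative only: `curvelet_features` is a hypothetical helper standing in for a real curvelet transform, and scikit-learn's MLP is used as a stand-in for the back-propagation network described in the abstract.

```python
# Illustrative sketch: concatenate curvelet-domain and spatial-domain features,
# then train a network with back-propagation. `curvelet_features` is a
# hypothetical helper, not a real library call.
import numpy as np
from sklearn.neural_network import MLPClassifier

def spatial_features(char_img):
    # Simple spatial descriptor: the normalized character image, flattened.
    return (char_img.astype(np.float32) / 255.0).ravel()

def combined_features(char_img, curvelet_features):
    return np.concatenate([curvelet_features(char_img), spatial_features(char_img)])

def train(X, y):
    # One hidden layer trained by back-propagation (scikit-learn's MLP).
    clf = MLPClassifier(hidden_layer_sizes=(128,), max_iter=500)
    clf.fit(X, y)
    return clf
```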
Authors:
Feri Candra, Syed Abd. Rahman Abu-Bakar
Computer Vision, Video and Image Processing Research Lab, Electronics and Computer Engineering Department, Faculty of Electrical Engineering, Universiti Teknologi Malaysia, 81310 Johor Bahru, Malaysia
ISBN (Print): 9781479989973
Spectral imaging techniques such as hyperspectral and multispectral imaging combine imaging and spectroscopy. This powerful technique can provide spectral images of samples, which can be used to analyze a number of fruit properties. The aim of this study is to develop a calibration (predictive) model for determining the soluble solid content (SSC) of starfruit samples from their spectral images. Partial least squares regression (PLSR) and support vector regression (SVR) were applied to model the relationship between the mean spectral data and the reference values. The mean spectral data were extracted from the spectral images of each starfruit sample. The simple template for region-of-interest (ROI) selection and the five optimal wavelengths (565.2, 677.2, 736, 873.2 and 943.2 nm) proposed in a previous study were used to extract the mean spectral data. The results show that the calibration models built with PLSR and SVR performed better than the previous study, and that the SVR model gave the best performance for predicting the SSC value of starfruit.
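The calibration step can be sketched as follows, assuming `X` holds the mean spectra at the five selected wavelengths and `y` the reference SSC values; the hyperparameters (number of PLS components, SVR settings, split ratio) are assumptions, not values from the study.

```python
# Sketch of the calibration models under stated assumptions.
from sklearn.cross_decomposition import PLSRegression
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

def calibrate(X, y):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    pls = PLSRegression(n_components=3).fit(X_tr, y_tr)           # n_components is an assumption
    svr = SVR(kernel='rbf', C=10.0, epsilon=0.1).fit(X_tr, y_tr)  # illustrative settings
    return {
        'PLSR R2': r2_score(y_te, pls.predict(X_te)),
        'SVR R2': r2_score(y_te, svr.predict(X_te)),
    }
```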
This paper presents a new image abstraction approach aimed at improving typical image-related pattern recognition tasks such as segmentation, tracking, and classification. The proposed image abstraction framework performs image denoising and homogeneous region simplification, along with border and region enhancement. The framework consists of a novel generalized approach that combines common weighted-averaging denoising algorithms with Unsharp Masking (USM) border enhancement techniques so as to avoid typical USM artifacts such as ringing. Results of different configurations of the image abstraction framework in a cell tracking application are presented.
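The two ingredients the framework generalizes can be illustrated by the sketch below: a weighted-averaging (here Gaussian) denoising pass followed by classic unsharp masking. Parameter values are illustrative, and the framework's artifact-avoidance logic is not reproduced.

```python
# Minimal sketch: Gaussian weighted-averaging denoising followed by
# classic unsharp masking (USM) for border enhancement.
import cv2
import numpy as np

def abstract_image(img, sigma=2.0, amount=1.5):
    smoothed = cv2.GaussianBlur(img, (0, 0), sigma)             # homogeneous-region simplification
    detail = img.astype(np.float32) - smoothed.astype(np.float32)
    sharpened = img.astype(np.float32) + amount * detail        # USM border enhancement
    return np.clip(sharpened, 0, 255).astype(np.uint8)
```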
Authors:
S. Daliman, S. A. R. Abu-Bakar, S. H. Md Nor Azam
Computer Vision, Video and Image Processing (CvviP) Research Lab, Department of Electronics and Computer Engineering, Faculty of Electrical Engineering, Universiti Teknologi Malaysia, 81310 Skudai, Johor, Malaysia; Sime Darby Research Sdn. Bhd., Jalan Pulau Carey, 42960 Pulau Carey, Selangor, Malaysia
This paper presents the development of Haar-based rectangular windows for the recognition of young oil palm trees in WorldView-2 imagery data. Haar-based rectangular windows, also known as Haar-like rectangular features, are popular in face recognition, as used in the Viola-Jones object detection framework. As in face recognition, oil palm tree recognition requires Haar-based rectangular windows that suit the characteristics of the oil palm tree. A set of seven Haar-based rectangular windows has been designed specifically to match young oil palm trees, whose crowns are much smaller than those of mature trees. Determining suitable features for the oil palm tree is essential to ensure a high rate of correct detections. Furthermore, features that characterize an oil palm tree should distinguish it from other objects in the image such as buildings, roads and drainage. These features are trained using a support vector machine (SVM) to model the oil palm tree for classifying the testing set and subimages of the WorldView-2 imagery data. The resulting classification of young oil palm trees, with a sensitivity of 98.58% and an accuracy of 92.73%, is a promising result towards the development of automatic young oil palm tree counting.
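A single Haar-like rectangle response and the SVM training step can be sketched as follows. The two-rectangle layout is a placeholder, not one of the seven windows designed in the paper, and the single-feature training is purely illustrative.

```python
# Illustrative sketch: a two-rectangle Haar-like response computed from an
# integral image, and an SVM trained on such features.
import numpy as np
from sklearn.svm import SVC

def rect_sum(ii, x, y, w, h):
    # Sum of pixels in the rectangle (x, y, w, h) using an integral image ii.
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

def haar_two_rect(patch):
    # Integral image with a zero top row and left column for easy indexing.
    ii = np.pad(patch.astype(np.float64), ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    h, w = patch.shape
    left = rect_sum(ii, 0, 0, w // 2, h)
    right = rect_sum(ii, w // 2, 0, w - w // 2, h)
    return float(left - right)

def train_svm(patches, labels):
    X = np.array([[haar_two_rect(p)] for p in patches])
    return SVC(kernel='linear').fit(X, labels)
```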
Large scale digitization campaigns are simplifying the accessibility of a rapidly increasing number of images from cultural heritage. However, digitization alone is not sufficient to effectively open up these valuable...
Human object classification is an important problem for smart video surveillance applications. In this paper we have proposed a method for human object classification, which classify the objects into two classes: huma...
In this contribution, we present a segmentation algorithm based on thresholding to subdivide an intensity image into object and background regions. The optimal threshold is found by maximizing a likelihood function derived from a novel intensity probability density function model, which consists of the sum of two weighted four-parameter gamma distributions, as a more flexible alternative to the currently used models consisting of the sum of two weighted two-parameter Gaussian distributions. According to our experiments with 132 images, the proposed algorithm is on average slightly better than the best found in the scientific literature, performing particularly well on low-contrast images. The additional parameters and complexity of its likelihood function increase the processing time by a factor of 3, from 0.003 sec/image to 0.009 sec/image.
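The likelihood-maximization idea can be sketched as follows, with SciPy's three-parameter gamma standing in for the paper's four-parameter model; the candidate threshold grid and minimum class sizes are arbitrary choices for the sketch.

```python
# Sketch under simplifying assumptions: sweep candidate thresholds, fit a gamma
# distribution to each side, and keep the threshold that maximizes the
# class-weighted log-likelihood.
import numpy as np
from scipy import stats

def optimal_threshold(gray, candidates=range(10, 246, 5)):
    pixels = gray.ravel().astype(np.float64)
    n = len(pixels)
    best_t, best_ll = None, -np.inf
    for t in candidates:
        bg, fg = pixels[pixels <= t], pixels[pixels > t]
        if len(bg) < 50 or len(fg) < 50:
            continue
        ll = 0.0
        for cls in (bg, fg):
            w = len(cls) / n                                   # mixture weight of this class
            a, loc, scale = stats.gamma.fit(cls, floc=cls.min() - 0.5)
            ll += len(cls) * np.log(w) + stats.gamma.logpdf(cls, a, loc, scale).sum()
        if ll > best_ll:
            best_t, best_ll = t, ll
    return best_t
```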