ISBN: 0780373006 (print)
MPEG-4 simple profile video is used as the video compression standard in mobile video communications. Because of its high complexity, MPEG-4 video demands considerable computational power. ARM cores are currently widely used in mobile applications because of their low power consumption. Designing a fully standard-compliant MPEG-4 video encoder that runs in real time on a RISC processor such as ARM for embedded applications requires optimization at all levels. This paper describes in detail an efficient implementation of an MPEG-4 simple profile video encoder on the ARM9TDMI core that requires 50 mega cycles to encode QCIF-resolution video at 15 frames per second, with minimal processing power and memory requirements.
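To put the quoted figure in perspective, the short calculation below converts it into a rough per-frame and per-macroblock cycle budget. It is only a back-of-the-envelope sketch: it assumes the 50 mega cycles are spent per second of video, and the function name `cycle_budget` is illustrative rather than taken from the paper.

```python
# Assumption: "50 mega cycles" is read as cycles per second of encoded video.
QCIF_WIDTH, QCIF_HEIGHT = 176, 144   # QCIF luma resolution in pixels
MACROBLOCK = 16                      # a macroblock covers 16x16 luma pixels


def cycle_budget(cycles_per_second, fps, width=QCIF_WIDTH, height=QCIF_HEIGHT):
    """Return (macroblocks per frame, cycles per frame, cycles per macroblock)."""
    mbs_per_frame = (width // MACROBLOCK) * (height // MACROBLOCK)
    cycles_per_frame = cycles_per_second / fps
    cycles_per_mb = cycles_per_frame / mbs_per_frame
    return mbs_per_frame, cycles_per_frame, cycles_per_mb


mbs, per_frame, per_mb = cycle_budget(50_000_000, 15)
print(mbs, round(per_frame), round(per_mb))  # 99 3333333 33670
```

Under that reading, the encoder has roughly 33,700 cycles for all processing of each macroblock, including motion estimation, transform, quantization, and entropy coding.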
ISBN: 3540002626 (print)
In this paper, we present a robust algorithm for capturing rapid human motion with self-occlusion. Instead of predicting the position of each human feature, the algorithm estimates the interest region of the full body. Candidate features are then extracted by searching the whole interest region. To establish the correspondence between candidate features and actual features, an adaptive Bayes classifier is constructed from time-varying models of the feature attributes. Finally, a hierarchical human feature model is adopted to verify and complete the feature correspondence. To improve efficiency, we propose a multi-resolution search strategy: the initial candidate feature set is estimated on the low-resolution image and successively refined at higher resolution levels. Experiments demonstrate the effectiveness of the algorithm.
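The multi-resolution search strategy mentioned in this abstract can be illustrated with a generic coarse-to-fine search. The sketch below is not the authors' algorithm: it assumes, purely for illustration, that the "feature" is the brightest pixel, that the image is a list of rows whose dimensions are divisible by 2^(levels-1), and that all function names are hypothetical.

```python
def downsample(img):
    """Halve the resolution by averaging each 2x2 block."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w)] for y in range(h)]


def build_pyramid(img, levels):
    """pyramid[0] is the full-resolution image, pyramid[-1] the coarsest."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


def best_in_window(img, y0, y1, x0, x1):
    """Return (y, x) of the largest value inside the window, clamped to the image."""
    h, w = len(img), len(img[0])
    candidates = [(img[y][x], y, x)
                  for y in range(max(0, y0), min(h, y1))
                  for x in range(max(0, x0), min(w, x1))]
    _, y, x = max(candidates)
    return y, x


def coarse_to_fine(img, levels=3, radius=1):
    """Search the coarsest level fully, then refine in a small window per level."""
    pyramid = build_pyramid(img, levels)
    coarse = pyramid[-1]
    y, x = best_in_window(coarse, 0, len(coarse), 0, len(coarse[0]))
    for level in reversed(pyramid[:-1]):
        cy, cx = 2 * y, 2 * x  # project the estimate onto the finer grid
        y, x = best_in_window(level, cy - radius, cy + radius + 2,
                              cx - radius, cx + radius + 2)
    return y, x


# Usage: a 16x16 image with a single bright pixel at row 11, column 6.
image = [[0.0] * 16 for _ in range(16)]
image[11][6] = 100.0
print(coarse_to_fine(image))  # (11, 6)
```

The full search touches only the 4x4 coarsest level here; each finer level examines a small window around the projected estimate instead of the whole image, which is where the efficiency gain of such strategies comes from.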
ISBN: 076951695X (print)
This paper describes a portable Pattern Recognition System (PRS) based on embedded technology for intelligent volatile detection (Electronic Nose). The instrument is designed to provide advanced signal processing and digital communications services in a compact size. A summary of the hardware is presented, followed by an application to the identification of extra virgin olive oils. In this example, the instrument classifies eleven different classes of Spanish olive oil with 79% accuracy using relatively simple pattern recognition techniques.
ISBN: 0819444022 (print)
In this paper, we consider applications of perception-based video quality metrics to improve the performance of global lighting computations for dynamic environments. For this purpose, we extend the Visible Difference Predictor (VDP) developed by Daly to handle computer animations. We incorporate into the VDP the spatio-velocity CSF model developed by Kelly. The CSF model requires data on the velocity of moving patterns across the image plane. We use 3D image warping to compensate for camera motion, and we conservatively assume that the motion of animated objects (usually strong attractors of visual attention) is fully compensated by smooth-pursuit eye motion. Our global illumination solution is based on stochastic photon tracing and takes advantage of the temporal coherence of the lighting distribution by processing photons in both the spatial and temporal domains. The VDP is used to keep the noise inherent in stochastic methods below the sensitivity level of the human observer. As a result, perceptually consistent quality is obtained across all animation frames.
Applications in the creation of virtual auditory spaces (VAS) and sonification require individualized head-related transfer functions (HRTFs) for perceptual fidelity. HRTFs exhibit significant variation from person to...
ISBN: 0769517846; 0769517854 (print)
In a walk-through over the Internet, we move a mouse to imitate walking on foot [3]. Here, we consider the case of mapping real motion to an immersive virtual reality environment. For example, we wander around a computer room as if we were in a museum looking at works of art. Realizing this kind of immersive virtual reality requires knowing the user's head motion in a relatively large-scale environment. This research proposes a vision-based method for estimating head motion so that it can be mapped to an immersive virtual reality environment. We place fiducial markers around the room that is to be changed virtually and use an omnidirectional image sensor to observe these markers. With the visual sensor mounted on a helmet, the head motion of the user wearing the helmet can be estimated by processing images captured by the omnidirectional image sensor. Because the sensor has a 360-degree view, it can cope with larger head rotations than a normal camera. Because the head motion at each instant is estimated directly from the observed markers, our method has no accumulated error, unlike an inertial sensor.
We describe audio-to-visual conversion techniques for efficient multimedia communications. The audio signals are automatically converted to visual images of mouth shape. The visual speech can be represented as a seque...
Author: Cooper, T
Sony Elect Inc, Media Proc Div Network & Software Technol Ctr Amer, San Jose, CA 95134 USA
ISBN: 0819444022 (print)
This paper addresses the Frankle-McCann Retinex algorithm by altering its properties to provide a distance-weighting function to every ratio-product path from a source to a destination pixel. The algorithm is further modified to permit the hard RESET function to be replaced by a piece-wise smoother function that allows ratio-product propagation to slightly exceed the maximum brightness in each visual channel. Investigations of how segmentation can aid in reducing the computational complexity and provide a more realistic white balance are presented.
Automatic recognition of compressed speech in such applications as voice mail or call centers has significantly degraded performance compared to non-compressed data when background noise is present. Recognition of transmitted speech, such as in cellular, voice over IP, or networked PDA input, may also face the problem of frame erasures. There have been various attempts to compensate for these two distortions using receiver-based techniques, but room for improvement may be limited. Since the demand for recognition of coded and transmitted speech is expected to increase significantly in the near future, it is of interest to determine what modifications can be made on the encoder/transmitter side. In this paper we explore issues in designing a speech coder aimed at improving recognition performance over a packet-lossy channel with minimal degradation in perceptual quality. We propose a multiple description version of a speech coder to alleviate distortions caused by frame erasures. We also propose a coder variation that uses mel-cepstral coefficients instead of linear prediction parameters as spectral specifier, allowing better recognition in noisy environments when access to the raw coder parameters is available at the receiver.
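The idea behind multiple description coding for frame erasures can be sketched generically as follows. This is not the coder proposed in the paper: it assumes a simple even/odd split of per-frame parameters (represented here as single floats), at least two frames, and interpolation from neighbouring frames when one description is lost. All names are hypothetical.

```python
def split_descriptions(frames):
    """Split a frame sequence into two descriptions sent in separate packets."""
    return frames[0::2], frames[1::2]


def merge_descriptions(even, odd, n):
    """Rebuild n frames; pass None for a description lost in transit."""
    if even is None and odd is None:
        raise ValueError("both descriptions lost")
    out = [None] * n
    if even is not None:
        out[0::2] = even
    if odd is not None:
        out[1::2] = odd
    for i in range(n):
        if out[i] is None:
            # Conceal the missing frame by averaging its received neighbours.
            neighbours = [out[j] for j in (i - 1, i + 1)
                          if 0 <= j < n and out[j] is not None]
            out[i] = sum(neighbours) / len(neighbours)
    return out


# Usage: five frames of a slowly varying parameter.
frames = [1.0, 2.0, 3.0, 4.0, 5.0]
even, odd = split_descriptions(frames)
print(merge_descriptions(even, odd, len(frames)))   # [1.0, 2.0, 3.0, 4.0, 5.0]
print(merge_descriptions(even, None, len(frames)))  # [1.0, 2.0, 3.0, 4.0, 5.0]
print(merge_descriptions(None, odd, len(frames)))   # [2.0, 2.0, 3.0, 4.0, 4.0]
```

With both descriptions the signal is recovered exactly. With only one, the receiver still obtains an approximation, which degrades recognition more gracefully than dropping the frames entirely.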
ISBN: 076951695X (print)
This paper presents an empirical evaluation of the color descriptors in the visual part of the MPEG-7 experimentation model (XM) on the challenging problem of content-based retrieval of semantic image categories. The performance of the four color descriptors provided in the current XM reference implementation (Color Layout, Color Structure, Dominant Color, and Scalable Color) is compared with that of the HSV autocorrelogram, which has performed well in recent empirical studies. Experimental results show that Color Structure provides the best retrieval accuracy, whereas Dominant Color, the most computationally expensive descriptor, performs worst on this problem.