ISBN: 0780376226 (print)
This paper proposes a novel method for automatically summarizing MPEG-coded audio-video content in the compressed domain and describing the resulting summaries in the MPEG-7 standard format. Semantically important features for summary generation are carefully analyzed, and two types of summaries, digest and highlight, are extracted. Digest extraction is based on audio level estimation and visual feature analysis, where digest shots are adaptively determined for a user-specified duration by utilizing MPEG-7 based motion features. Highlight-based summarization of broadcast TV sports programs is achieved by analyzing audio class and audio level. The extracted summaries and related features are described with the MPEG-7 description tools. The experimental results show that digests and highlights were successfully extracted from various kinds of content without a priori knowledge of content characteristics.
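The idea of fitting digest shots into a user-specified duration can be illustrated with a small sketch. The abstract does not give the selection rule, so the greedy strategy, the `Shot` type, and the `score` field (standing in for whatever importance measure is derived from audio level and motion features) are all assumptions made here for illustration, not the authors' method.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    start: float     # start time in seconds
    duration: float  # shot length in seconds
    score: float     # hypothetical importance score (assumed, not from the paper)


def select_digest(shots, target_duration):
    """Greedily keep the highest-scoring shots that still fit the time budget."""
    chosen = []
    remaining = target_duration
    for shot in sorted(shots, key=lambda s: s.score, reverse=True):
        if shot.duration <= remaining:
            chosen.append(shot)
            remaining -= shot.duration
    # Play the digest back in original temporal order.
    return sorted(chosen, key=lambda s: s.start)


shots = [
    Shot(start=0, duration=10, score=0.2),
    Shot(start=10, duration=5, score=0.9),
    Shot(start=15, duration=8, score=0.5),
    Shot(start=23, duration=4, score=0.7),
]
digest = select_digest(shots, target_duration=12)
```

A greedy pass is only one possible choice; it is simple and respects the duration limit, but it does not guarantee the best total score for a given budget.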
ISBN: 0780376226 (print)
This paper studies a progressive image transmission technique over waveform channels. The Channel Optimized Vector Quantization (COVQ) codec [1] is applied to the image wavelet coefficients, creating a robust progressive image transmission technique that mitigates the effects of a noisy channel on the reconstructed image. To evaluate the performance of our proposal, Gaussian and slow-fading Rayleigh channel models with several different values of the channel signal-to-noise ratio (CSNR) were simulated in our experiments. Examples show a significant visual improvement of our approach compared to other progressive image transmission techniques.
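As a rough illustration of the two channel models named above, the following sketch passes real-valued symbols through an additive Gaussian channel and through a slow-fading Rayleigh channel at a chosen CSNR. It assumes CSNR is defined as average transmitted signal power over noise variance and that "slow fading" means one gain held constant over a whole block; both are assumptions, and the function names are hypothetical rather than taken from the paper.

```python
import math
import random


def noise_std(signal_power, csnr_db):
    """Noise standard deviation giving the requested CSNR (in dB)."""
    return math.sqrt(signal_power / (10 ** (csnr_db / 10)))


def awgn_channel(symbols, csnr_db, rng):
    """Add white Gaussian noise scaled to the requested CSNR."""
    power = sum(s * s for s in symbols) / len(symbols)
    sigma = noise_std(power, csnr_db)
    return [s + rng.gauss(0.0, sigma) for s in symbols]


def slow_rayleigh_channel(symbols, csnr_db, rng):
    """Apply one Rayleigh gain to the whole block, then add Gaussian noise."""
    # Magnitude of a complex Gaussian with unit total variance, so E[gain^2] = 1.
    x = rng.gauss(0.0, math.sqrt(0.5))
    y = rng.gauss(0.0, math.sqrt(0.5))
    gain = math.hypot(x, y)
    power = sum(s * s for s in symbols) / len(symbols)
    sigma = noise_std(power, csnr_db)
    received = [gain * s + rng.gauss(0.0, sigma) for s in symbols]
    return received, gain


rng = random.Random(0)
tx = [1.0, -1.0, 1.0, 1.0]
rx_awgn = awgn_channel(tx, csnr_db=10.0, rng=rng)
rx_fading, gain = slow_rayleigh_channel(tx, csnr_db=10.0, rng=rng)
```

Sweeping `csnr_db` over several values and feeding the received symbols to a decoder would reproduce the kind of experiment the abstract describes, under the stated assumptions.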
ISBN: 0780376226 (print)
Content-based image retrieval (CBIR) methods in medical databases have been designed to support specific tasks, such as retrieval of digital mammograms or 3D MRI images. These methods cannot be transferred to other medical applications, since different imaging modalities require different types of processing. To enable content-based queries in diverse collections of medical images, the retrieval system must know the image class before the query is processed. We describe a novel approach for the automatic categorization of medical images according to their modalities. We propose a semantically based set of visual features, together with their relevance and organization, for capturing the semantics of different imaging modalities. The features are used in conjunction with a new categorization metric, enabling "intelligent" annotation, browsing, and searching of medical databases. Our algorithm provides basic semantic knowledge about the image and may serve as a front end to domain-specific medical image analysis methods. To demonstrate the effectiveness of our approach, we have designed and implemented an Internet portal for browsing and querying online medical databases and applied it to a large number of images. Our results demonstrate that accurate categorization can be achieved by exploiting the important visual properties of each modality.
ISBN: 0780374029 (print)
Standard DCT-based video coding techniques achieve good results in terms of data compaction, making the use of digital video feasible in several application frameworks. The price to pay is the introduction of annoying visual distortions (artefacts) in the reconstructed video: the lower the encoding bit-rate, the larger the number of artefacts. Post-processing is a practical solution that achieves a visual enhancement of the compressed images after decoding. Some of the artefacts, such as blocking (a tiled appearance) and ringing (a ghost effect), have already been widely studied. Therefore, this paper focuses on a novel post-processing technique that reduces a temporal phenomenon known as "mosquito noise" (which resembles a flying bunch of mosquitoes). This temporal "busyness" is removed by applying, in the frequency domain, an adaptive temporal filter that preserves the sharpness and naturalness of the reconstructed video signal. The proposed algorithm is straightforward, painless to implement, and effective: experimental results show that the presented temporal post-filtering process significantly improves the visual quality of video.
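The general idea of an adaptive temporal filter can be sketched as follows. Note that this is a deliberately simplified pixel-domain stand-in, a threshold-adaptive recursive filter, and not the frequency-domain method the abstract describes; the `threshold` and `alpha` parameters and the function name are assumptions chosen for illustration.

```python
def temporal_filter(frames, threshold=8.0, alpha=0.5):
    """Blend each pixel with the previous output only when the change is small.

    frames: list of frames, each a list of rows of pixel intensities.
    Small frame-to-frame changes are treated as noise and smoothed;
    large changes are treated as real motion and passed through unchanged.
    """
    if not frames:
        return []
    output = [[row[:] for row in frames[0]]]  # first frame is copied as-is
    for frame in frames[1:]:
        previous = output[-1]
        filtered = []
        for row, prev_row in zip(frame, previous):
            new_row = []
            for cur, ref in zip(row, prev_row):
                if abs(cur - ref) < threshold:
                    new_row.append(alpha * cur + (1 - alpha) * ref)
                else:
                    new_row.append(cur)
            filtered.append(new_row)
        output.append(filtered)
    return output


frames = [
    [[100, 100]],
    [[104, 200]],  # left pixel flickers slightly, right pixel changes for real
    [[100, 200]],
]
smoothed = temporal_filter(frames)
```

The threshold test is what keeps moving edges sharp: only low-amplitude temporal fluctuations are averaged, which is the same intent the abstract attributes to its adaptive filter.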
ISBN: 0780372182 (print)
In this paper, we describe a visual feedback system using a stereoscopic microscope that controls a micromanipulator so that a needle head pierces a target to a desired depth. First, we developed an image-processing algorithm for bringing the tip of the needle head into contact with the target. Second, we developed an algorithm for predicting the tip position of the needle head within the target. In a preliminary operation, the shape of the needle head is stored as a reference pattern. When the needle head pierces the target, the shape of the needle head within the target is predicted by pattern matching. Thus, we developed a microinjection system that pierces the target axially. Experimental results show that the proposed system may be useful in micromanipulation tasks such as microinjection into brain areas in neuroanatomy.
ISBN: 0780374029 (print)
This paper begins by introducing biometrics and their underlying performance factors. Biometrics are sometimes classed as either behavioural or physiological. Difficulties with these classes are discussed in terms of the importance of dynamics, highlighting the key point that definitions are clarified if the biometric information-bearing signal itself is considered. Emphasis is then given to visual speech in the form of lip profiles. The case is made that these are a special case in that they provide a vehicle for a twin biometric: both behavioural and physiological. It is argued that lips might well be unique in providing a practical twin biometric. Illustration is presented in the form of practical experiments based around visual speech and lip profiles. Experimental results using short test and training segments from video recordings give recognition error rates of 2% for physiological lip profiles and 15% for behavioural lip profiles.
ISBN: 0780376226 (print)
Many applications, such as construction, manufacturing, ground robotic vehicles, and rescue operations, require the capability to transmit digital video wirelessly and in an ad-hoc manner. Recently, we proposed an ad-hoc, cluster-based, multihop network architecture for video communications. For implementation, the IEEE 802.11 FHSS wireless LAN system using 2GFSK modulation has been deployed. To enhance the overall throughput rate for higher quality video communications, we present a performance evaluation of IEEE 802.11 FHSS when the 4GFSK modulation option is selected. Unfortunately, the 2 Mb/s system utilizing 4GFSK modulation is not very efficient in terms of RF range. Therefore, to improve its performance for multihop applications, a combination of diversity and a non-coherent Viterbi-based receiver is considered. For the video transmission part, we have considered a bitstream splitting technique together with a packet-based error protection strategy to combat packet drops under multipath fading conditions. Finally, the paper presents the simulation results, including the effects of the receiver design and diversity on the quality of the received video signals.
ISBN: 0892082380 (print)
The rationale, formulae, and examples for assessing the color inconstancy of consumer digital cameras are presented. The assessment is appropriate for evaluating the color/neutral balance precision of consumer digital camera systems with respect to illumination quality, specifically for a given rendering intent such as sRGB or ROMM. Adopting the CIE94 color difference measure (ΔE*94), an extended color inconstancy index (CII) is used to track the stability of color balancing algorithms across multiple illuminant types by effectively performing a variance analysis with respect to the illuminant variable for multiple color stimuli. Five differently branded 2-megapixel cameras were tested. The CII numerical results correlated well with visual engineering judgments. Results for both chromatic and nonchromatic samples are presented.
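For readers unfamiliar with the color difference measure mentioned above, the sketch below computes the standard CIE94 difference between two CIELAB colors, using the graphic-arts weights (K1 = 0.045, K2 = 0.015, kL = kC = kH = 1). The second function is a hypothetical per-illuminant summary, a plain mean over patches, included only to show how such differences could be aggregated; it is an assumption and not the extended CII defined in the paper.

```python
import math


def delta_e94(lab_ref, lab_sample, k1=0.045, k2=0.015):
    """CIE94 color difference between two (L*, a*, b*) triples."""
    l1, a1, b1 = lab_ref
    l2, a2, b2 = lab_sample
    c1 = math.hypot(a1, b1)  # chroma of the reference color
    c2 = math.hypot(a2, b2)
    d_l = l1 - l2
    d_c = c1 - c2
    # Hue difference squared; clamp tiny negative values from rounding.
    d_h_sq = max((a1 - a2) ** 2 + (b1 - b2) ** 2 - d_c ** 2, 0.0)
    s_c = 1 + k1 * c1
    s_h = 1 + k2 * c1
    return math.sqrt(d_l ** 2 + (d_c / s_c) ** 2 + d_h_sq / s_h ** 2)


def mean_difference_per_illuminant(reference, renderings):
    """Hypothetical summary: mean CIE94 difference over patches, per illuminant.

    reference: list of Lab triples for the patches under the reference illuminant.
    renderings: dict mapping illuminant name to a list of Lab triples,
                in the same patch order as `reference`.
    """
    return {
        name: sum(delta_e94(r, s) for r, s in zip(reference, patches)) / len(reference)
        for name, patches in renderings.items()
    }


reference = [(50.0, 0.0, 0.0)]
renderings = {"D65": [(50.0, 0.0, 0.0)], "A": [(53.0, 0.0, 4.0)]}
summary = mean_difference_per_illuminant(reference, renderings)
```

A camera whose per-illuminant values stay close to one another would, under this simplified summary, be the more color-constant one; the paper's actual index performs a variance analysis that this sketch does not reproduce.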
ISBN: 9537044017 (print)
In the assessment of image quality, as well as in image coding, the end users are human beings. They need some type of image for work or entertainment. So it is logical to pursue the goal of achieving better image quality by investigating what looks better to the human eye. This paper is a small contribution in that direction. Just-noticeable distortion and some of its aspects are discussed, and experimental results on the visibility threshold of low gray tones are presented.
ISBN: 0780376226 (print)
Most wavelet-based image coders fail to model the joint coherent behavior of wavelet coefficients near edges. Wedgelets offer a convenient parameterization for the edges in an image, but they have yet to yield a viable compression algorithm. In this paper, we propose an extension of the zerotree-based Space-Frequency Quantization (SFQ) algorithm by adding a wedgelet symbol to its tree-pruning optimization. This incorporates wedgelets into a rate-distortion compression framework and allows simple, coherent descriptions of the wavelet coefficients near edges. The resulting method yields improved visual quality and increased compression efficiency over the standard SFQ technique.