ISBN: (print) 9781479910205
General-purpose graphics processing units (GPGPUs) are at their best in accelerating computation by exploiting the abundant thread-level parallelism (TLP) offered by many classes of HPC applications. To facilitate such high TLP, emerging programming models like CUDA and OpenCL allow programmers to create work abstractions in terms of smaller work units, called cooperative thread arrays (CTAs). CTAs are groups of threads and can be executed in any order, thereby providing ample opportunities for TLP. State-of-the-art GPGPU schedulers allocate the maximum possible number of CTAs per core (limited by available on-chip resources) to enhance performance by exploiting TLP. However, we demonstrate in this paper that executing the maximum possible number of CTAs on a core is not always the optimal choice from the performance perspective. A high number of concurrently executing threads might cause more memory requests to be issued and create contention in the caches, network, and memory, leading to long stalls at the cores. To reduce resource contention, we propose a dynamic CTA scheduling mechanism, called DYNCTA, which modulates the TLP by allocating an optimal number of CTAs based on application characteristics. DYNCTA allocates fewer CTAs to applications suffering from high contention in the memory subsystem than to applications demonstrating high throughput. Simulation results on a 30-core GPGPU platform with 31 applications show that the proposed CTA scheduler provides a 28% average improvement in performance compared to the existing CTA scheduler.
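The idea of modulating TLP by changing the per-core CTA count can be illustrated with a small feedback rule. This is only a sketch under stated assumptions, not the DYNCTA algorithm itself: the abstract does not specify which metrics or thresholds the scheduler uses, so the `mem_stall_fraction` input, both thresholds, and the one-step adjustment are hypothetical.

```python
def adjust_cta_limit(current_limit, mem_stall_fraction, max_limit,
                     high_threshold=0.6, low_threshold=0.3):
    """Return the next per-core CTA limit.

    mem_stall_fraction is a hypothetical metric: the share of recent
    cycles in which the core was stalled waiting on memory.
    The thresholds are illustrative values, not taken from the paper.
    """
    if mem_stall_fraction > high_threshold:
        # Heavy memory contention: run fewer CTAs, but keep at least one.
        return max(1, current_limit - 1)
    if mem_stall_fraction < low_threshold:
        # Little contention: allow more CTAs, up to the resource limit.
        return min(max_limit, current_limit + 1)
    # In between: leave the limit unchanged.
    return current_limit


def run_trace(stall_trace, start_limit, max_limit):
    """Apply the rule once per sampling window and record each limit."""
    limit = start_limit
    history = []
    for stall in stall_trace:
        limit = adjust_cta_limit(limit, stall, max_limit)
        history.append(limit)
    return history


if __name__ == "__main__":
    # A made-up trace: contention rises, then falls again.
    print(run_trace([0.2, 0.7, 0.8, 0.5, 0.1], start_limit=4, max_limit=8))
```

A one-step change per sampling window keeps the rule simple and avoids large swings; a real scheduler would choose its metrics and step size from measurements.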
With the advent of the MPEG-4 and MPEG-7 standards [2, 3], the representation and description of multimedia information has received a further impulse. In particular, object-based coding and description are nowadays possible (or about to become so). In both object-based coding and description environments, estimating a video object's relevance can be very useful, since it plays a major role in segmentation quality evaluation, which is responsible for selecting the appropriate segmentation algorithm for identifying the objects to work with [1]. Object relevance information is also very valuable for the rate control module of an object-based coder. In description creation, relevance can be used directly as an object descriptor, or indirectly to ensure that more relevant objects receive more detailed and complete descriptions. Recognizing the importance of object relevance estimation, this paper proposes an objective, automatically calculated metric for video object relevance evaluation, both when objects are considered individually and within a given context.
The impact of using different lossless compression algorithms when compressing biometric iris sample data from several public iris databases is investigated. In particular, we relate the application of dedicated lossless image codecs like lossless JPEG, JPEG-LS, PNG, and GIF, lossless variants of lossy codecs like JPEG2000, JPEG XR, and SPIHT, and a few general purpose compression schemes to rectilinear iris imagery. The results are discussed in the light of the recent ISO/IEC FDIS 19794-6 and ANSI/NIST-ITL 1-2011 standards and the IREX recommendations.
It has long been realized that the current JPEG standard does not provide state-of-the-art performance in its lossless mode. In view of this, the International Standards Organization (ISO) recently solicited proposals for a new lossless/nearly lossless compression standard for continuous-tone still images. A total of nine proposals were submitted in the summer of 1995. Seven of these used a prediction step for 'decorrelating' the image prior to modelling and encoding. In this paper we investigate the efficacy of the different prediction schemes that were proposed. We also discuss their computational complexity and the price/performance trade-offs that emerge from our study.
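The role of a prediction step in decorrelating an image can be shown with a short sketch. The abstract does not name the predictors in the nine proposals, so the two used here are assumptions chosen for illustration: a plain left-neighbour predictor and the median edge detector (MED) predictor known from JPEG-LS. Treating pixels outside the image as 0 is also an assumption.

```python
def left_predict(a, b, c):
    """Predict the pixel from its left neighbour only."""
    return a


def med_predict(a, b, c):
    """Median edge detector: a = left, b = above, c = upper-left."""
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c


def neighbours(rows, y, x):
    """Return (left, above, upper-left); out-of-image pixels count as 0."""
    a = rows[y][x - 1] if x > 0 else 0
    b = rows[y - 1][x] if y > 0 else 0
    c = rows[y - 1][x - 1] if x > 0 and y > 0 else 0
    return a, b, c


def residuals(image, predictor):
    """Replace each pixel by its prediction error, in raster order."""
    return [[image[y][x] - predictor(*neighbours(image, y, x))
             for x in range(len(image[y]))]
            for y in range(len(image))]


def reconstruct(res, predictor):
    """Invert residuals(): the decoder repeats the same predictions."""
    image = []
    for y, res_row in enumerate(res):
        image.append([])
        for x, r in enumerate(res_row):
            # Only already-decoded pixels are read, so this is causal.
            a = image[y][x - 1] if x > 0 else 0
            b = image[y - 1][x] if y > 0 else 0
            c = image[y - 1][x - 1] if x > 0 and y > 0 else 0
            image[y].append(r + predictor(a, b, c))
    return image


def total_abs(res):
    """Sum of absolute residuals, a rough proxy for coding cost."""
    return sum(abs(v) for row in res for v in row)


if __name__ == "__main__":
    stripes = [[10, 10, 10], [50, 50, 50], [50, 50, 50]]
    for name, pred in (("left", left_predict), ("MED", med_predict)):
        res = residuals(stripes, pred)
        assert reconstruct(res, pred) == stripes  # prediction is lossless
        print(name, total_abs(res))
```

Smaller residuals generally cost fewer bits after modelling and entropy coding, which is why the choice of predictor affects both compression and computational cost.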
In JPEG2000, the wavelet-transformed image is subdivided into equally sized data blocks, which are independently entropy coded. It is therefore often strongly suggested that these blocks can be decoded at the instant of their creation at the encoder, making temporary memory nearly obsolete. Unfortunately, data dependencies in the forward and inverse wavelet transforms prevent their complete, instantaneous creation (consumption) at the encoder (decoder), resulting in their partial, temporary storage and thus in increased memory needs. This paper shows that the main burden is the single, large data block size, which is used over all levels of the wavelet image. Relaxing this constraint substantially diminishes the memory needs over the full JPEG2000 codec chain.
This paper deals with the efficient transmission of JPEG compressed images over Multiple Input Multiple Output (MIMO) systems using Spatial Multiplexing (SM). By exploiting the advantages of multiple antenna systems, data rate, reliability, and throughput can be improved. The images are compressed using a progressive Discrete Cosine Transform (DCT) based JPEG coder in spectral selection mode, and an antenna selection scheme is applied (assuming perfect channel knowledge at the transmitter and receiver). The Signal to Interference Noise Ratio (SINR) of each antenna path is calculated, and the antennas are arranged in decreasing order of SINR to transmit the different layers with a novel power allocation scheme. A Linear Least Squares Estimate (LLSE) algorithm is used to estimate the transmitted vector, since it has the desirable property of not enhancing noise as much as the ZF estimator. The proposed scheme provides significant image quality improvement and less distortion compared to the existing EPA without SM.
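The claim that the LLSE receiver enhances noise less than the ZF estimator can be checked numerically. The sketch below rests on several assumptions that are not in the abstract: a real-valued 2x2 channel matrix `H` (practical MIMO channels are complex-valued), unit-power transmitted symbols, white noise of variance `noise_var`, and the common regularised form of the linear estimator. All function names are hypothetical.

```python
def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def inv2(m):
    """Invert a 2x2 matrix; raises ZeroDivisionError if it is singular."""
    (p, q), (r, s) = m
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]


def zf_filter(h):
    """Zero-forcing receive filter for a square channel: W = H^-1."""
    return inv2(h)


def llse_filter(h, noise_var):
    """Linear least-squares filter: W = (H^T H + noise_var * I)^-1 H^T."""
    ht = transpose(h)
    gram = matmul(ht, h)
    for i in range(2):
        gram[i][i] += noise_var  # regularisation by the noise variance
    return matmul(inv2(gram), ht)


def noise_gain(w):
    """Squared Frobenius norm of W, i.e. E||W n||^2 for unit-variance white n."""
    return sum(v * v for row in w for v in row)


if __name__ == "__main__":
    # A nearly rank-deficient channel, where ZF amplifies noise strongly.
    h = [[1.0, 0.9], [0.9, 1.0]]
    print("ZF noise gain:  ", noise_gain(zf_filter(h)))
    print("LLSE noise gain:", noise_gain(llse_filter(h, noise_var=0.1)))
```

With `noise_var` set to zero the two filters coincide; the regularisation term is what limits noise amplification, at the price of some residual interference between the transmitted streams.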
The design of a speech transform coder which is robust against channel errors is presented. The bit rate is 8 kb/s, including 1.3 kb/s of redundancy bits for bit-selective error correction. The coder is based on transform coding with weighted vector quantization and two-channel conjugate vector quantization. Each scheme improves the robustness against channel errors without sacrificing performance in the error-free case. Real-time operation can be achieved with a 10-MIPS DSP and 8 kwords of memory, with an 80-ms coding delay. The mean opinion score of the coded speech is comparable to that of 5-bit log PCM even at a 1% burst error rate.
This paper describes an innovative compression method for panoramic images based on MPEG-7 descriptors. The proposed solution detects overlaps between a series of individual video frames in order to produce concatenated panoramic images. The presented method is easy to implement even in simple devices, such as the low-power chipsets installed in remote cameras with limited power supplies. Subjective tests showed that the concatenation method achieves lower transmission rates while sustaining picture quality.
As modern teaching methods receive increasing attention, the demand for multimedia teaching equipment is growing rapidly. However, most multimedia teaching equipment is still at a low level. To improve the situation, research on an embedded video terminal for teaching systems has practical significance. In this paper, we introduce an embedded video terminal that supports MPEG-4 and is suited to multimedia teaching systems. The terminal, based on the Z228, is able to decode an MPEG-4 bit stream at up to 30 frames per second at VGA (640 x 480) resolution. Additionally, as a highlight of the design, the implemented terminal is a low-cost solution in which only one chip is used to realize a complex MPEG-4 decoder.
The need for indexing to support efficient access to visual data has increased with the widespread use of multimedia applications. Meeting this need depends mostly on the automatic extraction of objects from the visual data. In this study, a component-based object extraction method is compared with extraction of the object in its entirety. The applied method, the implemented system, and the conducted tests are presented in this paper. Test results show that, even when a good segmentation is achieved for the whole object, object components are classified more successfully than the whole object.