The proceedings contain 65 papers. The topics discussed include: a 13.5 MHz single-chip multiformat discrete cosine transform; a 16x16 discrete cosine transform chip; block list transform (BLT) coding of images; applications of vector quantization to progressive compression and transmission of images; efficient transmission of ‘most important’ successive image approximations; orthogonal pyramid transforms for image coding; image coding for Chinese character patterns; pattern recognition with a spiral sampling technique; improved edge-based image segmentation; parallel fitting of quadric patches for structural analysis of range imagery; feature extraction from intensity images; and experiments on image coding with distortion below visual threshold.
ISBN: (print) 9783031804373; 9783031804380
Text-to-image generation is a cutting-edge technology that enables computers to generate images from textual descriptions. While this technology has been extensively researched and applied to English text, its application to Arabic text is still in its early stages. Additionally, Arabic is challenging because of its right-to-left writing system and extensive vocabulary of 1.3 million words. In this paper, we explore text-to-image generation from Arabic text descriptions. First, we fine-tune a transformer-based model pre-trained on Arabic text to transform the text information into the affine transformation within the DF-GAN generator. Second, we present a text transformer that incorporates LSTM layers to address the limitation of unrecognized words. Third, a mask predictor is trained within the generator using a weakly supervised method and incorporated into the affine transformation for a more effective integration of image and text features. In addition, we add the DAMSM loss as a regularization term in the loss function to improve convergence and stability during training. Experiments on two challenging datasets, CUB and Oxford-flower, show that our architectures can generate high-quality images that faithfully represent the Arabic textual descriptions. We believe that scaling this task could have important applications in fields such as Arabic visual learning, e-commerce, advertising, and entertainment.
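As a rough illustration of the text-conditioned affine transformation mentioned in the abstract above, the sketch below scales and shifts each feature-map channel using parameters predicted from a sentence vector. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `linear` and `affine_modulate` and all weights are hypothetical, a single linear layer stands in for whatever predictor network the model actually uses, and the mask predictor is omitted.

```python
def linear(weights, bias, vec):
    """Plain fully connected layer: weights @ vec + bias (hypothetical helper)."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def affine_modulate(feature_maps, text_vec, w_gamma, b_gamma, w_beta, b_beta):
    """Channel-wise affine modulation conditioned on a text vector.

    feature_maps: list of channels, each a 2-D list of floats.
    text_vec:     sentence embedding as a flat list of floats.
    One scale (gamma) and one shift (beta) is predicted per channel.
    Assumption: one linear layer per parameter, purely for illustration.
    """
    gamma = linear(w_gamma, b_gamma, text_vec)
    beta = linear(w_beta, b_beta, text_vec)
    return [[[g * px + b for px in row] for row in channel]
            for channel, g, b in zip(feature_maps, gamma, beta)]


# Toy usage: two 1x2 channels and a 2-dimensional text vector.
features = [[[1.0, 2.0]], [[3.0, 4.0]]]
text = [1.0, 2.0]
modulated = affine_modulate(
    features, text,
    w_gamma=[[1.0, 0.0], [0.0, 1.0]], b_gamma=[0.0, 0.0],
    w_beta=[[0.0, 0.0], [0.0, 0.0]], b_beta=[0.5, -1.0],
)
```

The point of the design is that the text never enters the image pathway as extra channels; it only changes how existing channels are scaled and shifted.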
ISBN: (print) 9783031821523; 9783031821530
Identifying and locating objects in images and videos, including elements like traffic signs, vehicles, buildings, and people, constitutes a fundamental and demanding task in computer vision, known as object detection. Because of the high computational complexity of this task and the large amount of data carried by a video signal, it is nearly impossible for ordinary general-purpose processors (GPPs) or CPUs to run these techniques in real time, especially in embedded systems applications. Therefore, special hardware that can acquire, control, or execute in parallel is required. Such specialized hardware includes Digital Signal Processors (DSPs), Field Programmable Gate Arrays (FPGAs), Visual Processing Units (VPUs), Tensor Processing Units (TPUs), Neural Processing Units (NPUs), and Graphics Processing Units (GPUs). This work presents the benefits of accelerating traditional object detection methods on a high-end embedded system, the Jetson Nano Developer Kit. This single-board computer is equipped with the Tegra K1 System on Chip (SoC), which is composed of a quad-core ARM A15 and an embedded Kepler GPU with 192 cores. Acceleration was achieved with the CUDA OpenCV library for both the Histogram of Oriented Gradients (HOG) and the Haar Cascade Classifier. For VGA resolution, results show that the GPU implementation on this embedded system is 1.4x faster than the CPU for the HOG method and 2x faster for the Haar Cascade Classifier method.
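To make the HOG method named in the abstract above more concrete, the sketch below computes the orientation histogram of a single cell, which is the per-pixel work that parallel hardware can spread across many cores. It is a simplified illustration, not the CUDA OpenCV implementation used in the paper: the function name `cell_histogram` is hypothetical, votes go to a single bin without interpolation, and block normalization and the classifier stage are left out.

```python
import math


def cell_histogram(cell, bins=9):
    """Unsigned-gradient orientation histogram for one grayscale cell.

    cell: 2-D list of pixel intensities. Border pixels are skipped so that
    central differences stay inside the cell (a simplification).
    """
    height, width = len(cell), len(cell[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]   # horizontal central difference
            gy = cell[y + 1][x] - cell[y - 1][x]   # vertical central difference
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0  # fold to [0, 180)
            hist[int(angle // bin_width) % bins] += magnitude
    return hist


# Toy usage: an 8x8 cell with a vertical edge, and its transpose.
vertical_edge = [[0] * 4 + [255] * 4 for _ in range(8)]
horizontal_edge = [list(row) for row in zip(*vertical_edge)]
hist_vertical = cell_histogram(vertical_edge)
hist_horizontal = cell_histogram(horizontal_edge)
```

Each pixel's contribution depends only on its four neighbours, so the inner loop body maps naturally onto one GPU thread per pixel.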
As a continuation of Part I, the spectral correlation function is presented for a variety of types of digitally modulated signals. These include digital pulse-amplitude, pulse-width, and pulse-position modulation, and various types of phase-shift keying and frequency-shift keying. The magnitudes of the spectral correlation functions are graphed as the heights of surfaces above a bifrequency plane, and these graphs are used as visual aids for comparison and contrast of the spectral correlation properties of different modulation types.
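For readers unfamiliar with the quantity being graphed, the sketch below estimates the cyclic autocorrelation of a rectangular-pulse BPSK signal; the spectral correlation function is the Fourier transform of this quantity over the lag variable. This is a minimal illustration under stated assumptions, not a reproduction of the paper's results: the function names, the asymmetric lag convention, the signal length, and the chosen cycle frequencies are all illustrative choices.

```python
import cmath
import random


def bpsk(num_symbols, samples_per_symbol, seed=0):
    """Baseband BPSK with rectangular pulses and random +/-1 symbols."""
    rng = random.Random(seed)
    signal = []
    for _ in range(num_symbols):
        signal.extend([rng.choice((-1.0, 1.0))] * samples_per_symbol)
    return signal


def cyclic_autocorrelation(x, alpha, lag):
    """Time average of x[n + lag] * conj(x[n]) * exp(-j 2 pi alpha n).

    alpha is the cycle frequency in cycles per sample. The asymmetric lag
    convention is an assumption; symmetric definitions differ by a phase.
    """
    n_terms = len(x) - lag
    total = 0j
    for n in range(n_terms):
        total += x[n + lag] * x[n].conjugate() * cmath.exp(-2j * cmath.pi * alpha * n)
    return total / n_terms


# Toy usage: 8 samples per symbol, so the symbol rate is 1/8 cycles per sample.
signal = bpsk(num_symbols=512, samples_per_symbol=8)
on_cycle = abs(cyclic_autocorrelation(signal, alpha=1 / 8, lag=4))
off_cycle = abs(cyclic_autocorrelation(signal, alpha=0.37 / 8, lag=4))
```

The estimate is large at the symbol rate and close to zero at a cycle frequency unrelated to it, which is the kind of modulation-dependent structure the bifrequency surfaces make visible.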
The Random Scan Computer Vision System acquires two-dimensional data along task-dependent scanning patterns matched to image structure. A scanning trajectory is generated based on a-priori information concerning image...
The primary goal of an image algebra is the development of a mathematical environment in which to express the various algorithms employed in image processing. From a practical standpoint, this means that the algorithms should appear as strings in an operational calculus, where each operator can ultimately be expressed as a string composed of some collection of elemental, or "basis," operators and where the action of the string upon a collection of input images is determined by function composition. For instance, rather than defining operations such as convolution and dilation in a pointwise manner, we desire closed-form expressions of these operators in terms of low-level operations that are close to the algebraic structure of the underlying mathematical entities upon which images are modeled. It is precisely such an approach that will yield a natural symbolic language for the expression of image processing algorithms.
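The idea of building convolution and dilation from a few elemental operators can be sketched as follows. The choice of basis here (translation, pointwise combination, and scalar multiplication) and every function name are assumptions made for illustration; the paper's own operator set may differ.

```python
from functools import reduce
from operator import add


def translate(image, dy, dx, fill=0):
    """Shift an image by (dy, dx), filling uncovered pixels with `fill`."""
    h, w = len(image), len(image[0])
    return [[image[y - dy][x - dx] if 0 <= y - dy < h and 0 <= x - dx < w else fill
             for x in range(w)] for y in range(h)]


def pointwise(op, a, b):
    """Combine two equally sized images pixel by pixel with a binary operator."""
    return [[op(p, q) for p, q in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def scale(image, factor):
    """Multiply every pixel by a scalar."""
    return [[factor * p for p in row] for row in image]


def dilate(image, offsets):
    """Dilation as the pointwise maximum of translates (non-negative images)."""
    return reduce(lambda acc, off: pointwise(max, acc, translate(image, *off)),
                  offsets, scale(image, 0))


def convolve(image, kernel):
    """Convolution as the pointwise sum of scaled translates.

    kernel: dict mapping (dy, dx) offsets to weights.
    """
    return reduce(lambda acc, item: pointwise(add, acc, scale(translate(image, *item[0]), item[1])),
                  kernel.items(), scale(image, 0))


# Toy usage: a single bright pixel in a 3x3 image.
dot = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
cross = dilate(dot, [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)])
smeared = convolve(dot, {(0, 0): 2, (0, 1): 1})
```

Neither `dilate` nor `convolve` touches individual pixels directly; each is a composition of the three elemental operators, which is the style of expression the abstract argues for.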
This paper describes image processing equipment developed for a data entry system aimed at creating a plant-record database. The equipment employs a multi-processor architecture allowing parallel processing of very la...
Because the difficulty of inputting Chinese characters is a barrier to the integration of computers, Chinese, and communication (C & C & C), we studied and developed a powerful Chinese multifont recognition system. In ...