This paper develops a remote video monitoring system based on ARM. The hardware of the system is based on the ARM S3C2440 embedded chip, and the circuit structure of the main modules such as power supply, Ethernet inte...
Video frame interpolation is an important low-level task in the field of image processing, which is widely applied to video image restoration and enhancement, media players and display devices. In this work, we design an ...
ISBN: 9781510677661 (print)
At the missile range, infrared image inversion of ground-based and space-based test targets currently adopts a sequential execution method of "image reading - target tracking extraction - inversion calculation - result storage", which is inefficient and requires post-processing. With new experimental tasks requiring real-time processing of infrared images, the current processing mode cannot meet the requirements. This article proposes a multi-threaded real-time improvement scheme based on the producer/consumer pattern. By establishing data buffers between producers and consumers, the two can execute independently, which decouples them and improves execution efficiency. First, the core processing flow of current inversion algorithms is analyzed and a flowchart is provided, covering four key steps: image reading, target tracking extraction, inversion calculation, and result storage. Using program instrumentation, the execution time of each step is obtained: image reading takes about 5.4 ms, target tracking extraction about 22.5 ms, inversion calculation about 1.0 ms, and result storage about 16.2 ms. Second, we propose a "producer/consumer-producer/consumer" pattern of infrared image inversion. Each part can be executed concurrently by different threads or thread groups. Based on the measured execution time of each step, we divide the steps into three weakly coupled modules: producer, consumer-producer, and consumer. The producer corresponds to image reading; the consumer corresponds to result storage; since target tracking extraction takes much longer than inversion calculation and the two processes are closely related, we combine them into the consumer-producer, which is both a consumer (relative to the upstream producer) and a producer (relative to the downstream consumer). Third, we determine the number of threads for producer, consumer-producer,
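The three-module structure described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the placeholder computation, the buffer size, and the choice of one thread per module are all assumptions, since the abstract is cut off before it states the thread counts.

```python
import queue
import threading

SENTINEL = None  # hypothetical end-of-stream marker passed down the pipeline


def read_images(n_frames, out_q):
    """Producer: stands in for image reading (a frame is just its index here)."""
    for i in range(n_frames):
        out_q.put(i)
    out_q.put(SENTINEL)


def track_and_invert(in_q, out_q):
    """Consumer-producer: stands in for target tracking extraction plus inversion."""
    while True:
        frame = in_q.get()
        if frame is SENTINEL:
            out_q.put(SENTINEL)  # forward the end-of-stream marker downstream
            break
        out_q.put((frame, frame * 2))  # placeholder for the real computation


def store_results(in_q, results):
    """Consumer: stands in for result storage."""
    while True:
        item = in_q.get()
        if item is SENTINEL:
            break
        results.append(item)


def run_pipeline(n_frames, buffer_size=8):
    # Bounded queues are the data buffers that decouple neighbouring stages.
    frames_q = queue.Queue(maxsize=buffer_size)
    results_q = queue.Queue(maxsize=buffer_size)
    results = []
    threads = [
        threading.Thread(target=read_images, args=(n_frames, frames_q)),
        threading.Thread(target=track_and_invert, args=(frames_q, results_q)),
        threading.Thread(target=store_results, args=(results_q, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_pipeline(5))
```

With one thread per stage, frames leave the pipeline in the order they entered. Note that in CPython, CPU-bound stages do not run in parallel because of the global interpreter lock, so the sketch shows only the decoupling structure, not the speed-up reported for the real system.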
ISBN: 9798350343557 (print)
Object detection and tracking in videos obtained from unmanned aerial vehicles (UAVs) with deep convolutional neural networks (DCNNs) requires extensive ground truth optical flow, occlusion and segmentation datasets, covering various objects or vehicles, for training and testing. Such ground truth information is not widely available in the literature because it is difficult to label or extract from real-life recorded UAV video images. In this study, ground truth optical flow, occlusion and segmentation datasets were produced synthetically from the UAV point of view, for the first time and in a novel way, to fill this gap in the literature. The ground truth datasets were created for each vehicle by subjecting the triangles (mesh) automatically generated by the Unity engine to the homography method. With this method, synthetic datasets of size 1920x1080 and 250x250, consisting of 100 scenarios, were obtained.
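The abstract does not say how the homography yields the ground truth, so the following is only a sketch of one plausible reading: if a 3x3 homography H maps a mesh vertex's pixel position in one frame to its position in the next, the optical flow at that vertex is the difference between the two positions. The function names and the example matrix are hypothetical.

```python
def apply_homography(H, x, y):
    """Map pixel (x, y) through a 3x3 homography H given as nested lists."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]  # projective scale factor
    xp = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    yp = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return xp, yp


def flow_at_vertices(H, vertices):
    """Return the displacement (dx, dy) of each vertex under H."""
    flows = []
    for x, y in vertices:
        xp, yp = apply_homography(H, x, y)
        flows.append((xp - x, yp - y))
    return flows


if __name__ == "__main__":
    # Assumed example: a pure translation by (3, -2) pixels.
    H_shift = [[1, 0, 3], [0, 1, -2], [0, 0, 1]]
    triangle = [(10, 10), (20, 10), (15, 25)]
    print(flow_at_vertices(H_shift, triangle))
```

A pure translation gives the same flow vector at every vertex; a general homography gives a different vector per vertex, which is why the mapping is applied vertex by vertex.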
Capturing information on moving vehicles from satellite videos is crucial for real-time traffic monitoring and emergency response. However, vehicles in satellite videos are small in size, lack detailed textural features a...
The Karhunen-Loeve transform (KLT) is often used for data decorrelation and dimensionality reduction. The KLT optimally retains the signal energy in only a few transform components, which makes it mathematically suitable for image and video compression. However, in practice, its high computational cost and dependence on the input signal preclude its application in real-time scenarios. This work proposes low-computational-cost approximations for the KLT. We focus on the blocklengths N ∈ {4, 8, 16, 32} because they are widely employed in image and video coding standards such as JPEG and High Efficiency Video Coding (HEVC). Extensive computational experiments demonstrate the suitability of the proposed low-complexity transforms for image and video compression.
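For reference, the standard definition of the exact KLT, which the abstract takes as its starting point, can be written as below. The notation (a zero-mean input vector x of length N, its covariance matrix R_x, eigenvector matrix Φ) is assumed here and is not taken from the paper; the proposed approximations themselves are not described in the abstract and are not reproduced.

```latex
\begin{align}
  \mathbf{R}_x &= \mathrm{E}\!\left[\mathbf{x}\,\mathbf{x}^{\top}\right]
     = \boldsymbol{\Phi}\,\boldsymbol{\Lambda}\,\boldsymbol{\Phi}^{\top},
  \qquad
  \boldsymbol{\Lambda} = \operatorname{diag}(\lambda_1, \ldots, \lambda_N),
  \quad \lambda_1 \ge \cdots \ge \lambda_N \ge 0, \\
  \mathbf{y} &= \boldsymbol{\Phi}^{\top}\mathbf{x},
  \qquad
  \mathrm{E}\!\left[\mathbf{y}\,\mathbf{y}^{\top}\right] = \boldsymbol{\Lambda}, \\
  \eta_K &= \frac{\sum_{k=1}^{K} \lambda_k}{\sum_{k=1}^{N} \lambda_k},
  \qquad 1 \le K \le N .
\end{align}
```

The second line shows the decorrelation property, and the ratio η_K is the fraction of signal energy retained by the first K components. Because Φ is built from the covariance of the input itself, the transform changes whenever the signal statistics change, which is the input dependence the abstract refers to.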
ISBN: 9798350344868; 9798350344851 (print)
Image aesthetics assessment (IAA) aims to estimate the aesthetics of images. Depending on the content of an image, diverse criteria need to be selected to assess its aesthetics. Existing works utilize pre-trained vision backbones based on content knowledge to learn image aesthetics. However, training those backbones is time-consuming and suffers from attention dispersion. Inspired by learnable queries in vision-language alignment, we propose the Image Aesthetics Assessment via Learnable Queries (IAA-LQ) approach. It adapts learnable queries to extract aesthetic features from pre-trained image features obtained from a frozen image encoder. Extensive experiments on real-world data demonstrate the advantages of IAA-LQ, which outperforms the best state-of-the-art method by 2.2% and 2.1% in terms of SRCC and PLCC, respectively.
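The two metrics quoted in this abstract are the Spearman rank-order correlation coefficient (SRCC) and the Pearson linear correlation coefficient (PLCC) between predicted and ground truth scores. A minimal pure-Python sketch of both is given below; the function names and the example scores are hypothetical, and the paper's exact evaluation protocol is not stated in the abstract.

```python
import math


def pearson(x, y):
    """PLCC: Pearson linear correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the group of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """SRCC: Pearson correlation computed on the ranks of x and y."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    predicted = [1, 2, 3, 4]        # hypothetical model scores
    ground_truth = [1, 4, 9, 16]    # hypothetical mean opinion scores
    print(spearman(predicted, ground_truth), pearson(predicted, ground_truth))
```

In this example the predictions order the images perfectly, so SRCC is 1, while PLCC is below 1 because the relationship is monotonic but not linear.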
An image processing algorithm for real-time examination of LED light strips is proposed, which enables quick detection of blind LED beads in strips. It is successfully used on a production line to replace manual inspect...
With the growth of digital data, protecting it has become a requirement for disseminating it through telecommunication networks. Nowadays, people can easily generate, edit, and share images with their own electronic d...
ISBN: 9781510673878; 9781510673861 (print)
In the context of the advancing digital landscape, there is a discernible demand for robust and defensible methodologies in addressing the challenges in multi-class image classification. The evolution of intelligent systems mandates swift evaluations of environmental variables to facilitate decision-making within an authorized workflow. Recognizing the imperative role of ensemble models, this paper undertakes an exploration into the efficacy of layered Convolutional Neural Network (CNN) architectures for the nuanced task of multi-class image classification, specifically applied to traffic signage recognition in the dynamic context of a moving vehicle. The research methodology employs a YOLO (You Only Look Once) model to establish a comprehensive training and testing dataset. Subsequently, a stratified approach is adopted, leveraging layered CNN architectures to categorize clusters of objects and, ultimately, extrapolate the pertinent speed limit values. Our endeavor aims to elucidate the procedural framework for integrating CNN models, providing insights into their accuracy within the application domain.