ISBN (print): 9780819494290
More and more governments and authorities around the world are promoting the use of bicycles in cities, as this is healthy for the bicyclist and improves the quality of life in general. The safety and efficiency of bicyclists have become a major focus. To achieve this, there is a need for a smarter approach towards the control of signalized intersections. Various traditional detection technologies, such as video, microwave radar and electromagnetic loops, can be used to detect vehicles at signalized intersections, but none of these can consistently separate bikes from other traffic, day and night and in various weather conditions. As bikes should get a higher priority and also require a longer green time to safely cross the signalized intersection, traffic managers are looking for alternative detection systems that can distinguish between bicycles and other vehicles near the stop bar. In this paper, the drawbacks of a video-based approach are presented alongside the benefits of a thermal-video-based approach for vehicle presence detection with separation of bicycles. The specific technical challenges are also highlighted in developing a system that combines thermal image capturing, image processing and output triggering to the traffic light controller in near real time and in a single housing.
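The abstract does not disclose the detection algorithm itself; as a rough illustration of how thermal-frame presence detection with bicycle separation might be prototyped, the sketch below applies background subtraction to a thermal frame and classifies foreground blobs by area. All thresholds and the blob-size rule are hypothetical placeholders, not the system described above.

```python
# Illustrative sketch only: a minimal thermal-frame presence detector that
# separates bicycles from larger vehicles by blob size. The paper does not
# publish its algorithm; thresholds and the classification rule are hypothetical.
import cv2
import numpy as np

bg_subtractor = cv2.createBackgroundSubtractorMOG2(history=500, detectShadows=False)

def detect_presence(thermal_frame_8bit, bike_area_range=(400, 3000)):
    """Return (vehicle_present, bicycle_present) for one 8-bit thermal frame."""
    fg = bg_subtractor.apply(thermal_frame_8bit)
    fg = cv2.morphologyEx(fg, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    contours, _ = cv2.findContours(fg, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    vehicle_present, bicycle_present = False, False
    for c in contours:
        area = cv2.contourArea(c)
        if area < bike_area_range[0]:
            continue  # too small: likely noise
        if area <= bike_area_range[1]:
            bicycle_present = True   # small warm blob near the stop bar
        else:
            vehicle_present = True   # larger blob: car, truck, bus
    return vehicle_present, bicycle_present
```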
In the pursuit of effective real-time video transmission for First-Person View (FPV) drone systems, optimizing the encoding process is paramount. Traditional encoding methods, reliant on pre-encoding demosaicking, often fall short in balancing the trade-off between video quality and latency, essential for seamless real-time feedback. This work proposes a novel approach by deferring the demosaicking process to the decoder side, thereby encoding the rearranged Bayer pattern (RGGB) data directly. This deferment significantly reduces the input data size, amounting to a threefold reduction, thereby achieving a more expeditious encoding process. The tailored encoder and decoder architecture ensures the accurate reconstruction of the full-color image on the decoder side. Through a comprehensive evaluation, leveraging a specialized video quality assessment framework designed for FPV drone footage, our findings illuminate the substantial benefits of our proposed method. Specifically, it achieves faster encoding times and reduced computational overhead, pivotal for low-latency applications. Furthermore, this study opens avenues for integrating advanced encoding techniques into commercial FPV drone systems, potentially enriching user experiences across various applications. Our research not only addresses a critical gap in real-time video transmission but also sets the stage for future exploration into optimizing encoding methodologies for the next generation of FPV drone technologies.
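A minimal sketch of the core idea, assuming the common plane-splitting representation for raw Bayer data: the single-channel RGGB mosaic is rearranged into four quarter-resolution planes before encoding, and the inverse step restores the mosaic on the decoder side ahead of demosaicking. The exact rearrangement used by the paper's tailored encoder is not specified here.

```python
# Sketch: rearrange an RGGB mosaic into 4 quarter-resolution planes so a
# conventional encoder can process the raw data; the decoder reassembles the
# mosaic and only then demosaics to full color.
import numpy as np

def bayer_to_planes(raw):
    """Split an RGGB mosaic (H x W, even dims) into 4 planes of H/2 x W/2."""
    r  = raw[0::2, 0::2]
    g1 = raw[0::2, 1::2]
    g2 = raw[1::2, 0::2]
    b  = raw[1::2, 1::2]
    return np.stack([r, g1, g2, b], axis=0)

def planes_to_bayer(planes):
    """Inverse of bayer_to_planes: reassemble the RGGB mosaic."""
    r, g1, g2, b = planes
    h, w = r.shape
    raw = np.empty((2 * h, 2 * w), dtype=r.dtype)
    raw[0::2, 0::2] = r
    raw[0::2, 1::2] = g1
    raw[1::2, 0::2] = g2
    raw[1::2, 1::2] = b
    return raw

# The decoder would then demosaic with a standard routine, e.g. OpenCV's
# cv2.cvtColor(raw, cv2.COLOR_BayerRG2RGB); the correct Bayer code depends
# on the sensor's pattern alignment.
```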
In this study, we implemented a real-time video avatar generation method using image processing and machine learning to develop a new advanced communication tool that enables information sharing and decision-making. Using the proposed method, users can easily send their real-time video avatars to the metaverse. We conducted a questionnaire survey of 12 subjects to evaluate the effectiveness of the proposed real-time video avatar generation method. The evaluation covered the sense of being in the same room, degree of concentration, degree of communication of non-verbal information, ease of timing utterances, and naturalness of conversation. The 12 subjects held discussions using a video conferencing system and our proposed metaverse conferencing system with real-time video avatars, and then answered the evaluation questionnaires. The results revealed that our proposed metaverse conferencing system using real-time video avatars is superior to the video conferencing system on all evaluation items.
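A minimal sketch of one plausible realization of the video-avatar step, assuming MediaPipe's selfie-segmentation model as a stand-in for the unspecified image-processing and machine-learning components: each webcam frame is segmented so only the user remains, yielding an RGBA texture that could then be streamed to a metaverse client.

```python
# Sketch only: MediaPipe selfie segmentation is an assumed stand-in, not the
# model used in the study.
import cv2
import numpy as np
import mediapipe as mp

segmenter = mp.solutions.selfie_segmentation.SelfieSegmentation(model_selection=1)

def frame_to_avatar(bgr_frame, threshold=0.5):
    """Return an RGBA image with the background made fully transparent."""
    rgb = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2RGB)
    mask = segmenter.process(rgb).segmentation_mask   # float mask in [0, 1]
    alpha = (mask > threshold).astype(np.uint8) * 255
    rgba = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2BGRA)
    rgba[:, :, 3] = alpha
    return rgba

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
if ok:
    avatar = frame_to_avatar(frame)  # would be encoded and sent to the metaverse client
cap.release()
```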
An intelligent retrievable object-tracking system assists users in quickly and accurately locating lost objects. However, challenges such as real-time processing on edge devices, low image resolution, and small-object detection significantly impact the accuracy and efficiency of video-stream-based systems, especially in indoor home environments. To overcome these limitations, a novel real-time intelligent retrievable object-tracking system is designed. The system incorporates a retrievable object-tracking algorithm that combines DeepSORT and sliding window techniques to enhance tracking capabilities. Additionally, the YOLOv7-small-scale model is proposed for small-object detection, integrating a specialized detection layer and the convolutional batch normalization LeakyReLU spatial-depth convolution module to enhance feature capture for small objects. TensorRT and INT8 quantization are used for inference acceleration on edge devices, doubling the frames per second. Experiments on a Jetson Nano (4 GB) using YOLOv7-small-scale show an 8.9% improvement in recognition accuracy over YOLOv7-tiny in video stream processing. This advancement significantly boosts the system's performance in efficiently and accurately locating lost objects in indoor home settings.
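The abstract names DeepSORT plus a sliding window as the retrievability mechanism; the sketch below illustrates one way such a sliding window of sightings could be kept, assuming detections arrive from the upstream detector and tracker as (track_id, class_name, bbox) tuples. The class and window length are hypothetical.

```python
# Sketch: a per-class sliding window of recent sightings so the system can
# answer "where was item X last seen". Upstream detector/DeepSORT is assumed.
from collections import defaultdict, deque

class RetrievableObjectLog:
    def __init__(self, window_size=300):
        # one sliding window of sightings per object class
        self.history = defaultdict(lambda: deque(maxlen=window_size))

    def update(self, frame_idx, tracks):
        """tracks: iterable of (track_id, class_name, bbox) for one frame."""
        for track_id, class_name, bbox in tracks:
            self.history[class_name].append((frame_idx, track_id, bbox))

    def last_seen(self, class_name):
        """Return (frame_idx, track_id, bbox) of the most recent sighting, or None."""
        window = self.history.get(class_name)
        return window[-1] if window else None

log = RetrievableObjectLog()
log.update(frame_idx=120, tracks=[(7, "wallet", (340, 210, 400, 260))])
print(log.last_seen("wallet"))  # -> (120, 7, (340, 210, 400, 260))
```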
This study offers a fresh technique for translating subtitles in sports events, addressing the issues of real-time translation with improved accuracy and efficiency. Unlike standard methods, which often result in delayed or inaccurate subtitles, the proposed method integrates advanced annotation techniques and machine learning algorithms to improve subtitle recognition and extraction. The annotation techniques in this study include systematically labeling spoken elements such as commentary and dialogue, enabling accurate subtitle recognition and real-time adjustments in live sports broadcasts to ensure both accuracy and contextual relevance. These ideas allow for seamless adaptation to multiple language types, including the voices of commentators, off-site hosts, and athletes, while keeping critical information within strict word count limits. Key improvements include faster processing times and increased translation precision, which are crucial for the dynamic environment of live sports broadcasts. The study builds on prior work in audiovisual translation, specifically tailoring its strategy to the unique demands of sports media. By emphasizing the importance of clear and contextually appropriate real-time subtitles, this research presents significant advancements over existing methods, providing valuable insights for future translation projects in sports and similar contexts. The results contribute to a more effective subtitle translation framework, enhancing the accessibility and viewing experience for audiences during live sports events.
In this paper, we present an end-to-end holographic video conferencing system that enables real-time high-quality free-viewpoint rendering of participants in different spatial regions, placing them in a unified virtual space for a more immersive display. Our system offers a cost-effective, complete holographic conferencing process, including multiview 3D data capture, RGB-D stream compression and transmission, high-quality rendering, and immersive display. It employs a sparse set of commodity RGB-D cameras that capture 3D geometric and textural information. We then remotely transmit color and depth maps via standard video encoding and transmission protocols. We propose a GPU-parallelized rendering pipeline based on an image-based virtual view synthesis algorithm to achieve real-time and high-quality scene rendering. This algorithm uses an on-the-fly Truncated Signed Distance Function (TSDF) approach, which marches along virtual rays within a computed precise search interval to determine surface intersections. We then design a multiweight projective texture mapping method to fuse color information from multiple views. Furthermore, we introduce a method that uses a depth confidence map to weight the rendering results from different views, which mitigates the impact of sensor noise and inaccurate measurements on the rendering results. Finally, our system places conference participants from different spaces into a virtual conference environment with a global coordinate system through coordinate transformation, which simulates a real conference scene in physical space, providing an immersive remote conferencing experience. Experimental evaluations confirm our system's real-time, low-latency, high-quality, and immersive capabilities.
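A minimal sketch of the depth-confidence weighting idea described above: colors reprojected from each RGB-D view are blended with weights that combine a view-dependent geometric term and a per-pixel depth confidence. The exact weight formulation and the TSDF ray-marching stage are not reproduced here.

```python
# Sketch: fuse candidate colors from N views using geometric weights modulated
# by per-pixel depth confidence.
import numpy as np

def fuse_views(colors, view_weights, depth_confidences, eps=1e-6):
    """
    colors:            (N, H, W, 3) candidate colors reprojected from N views
    view_weights:      (N, H, W) geometric weights (e.g. angle to the virtual ray)
    depth_confidences: (N, H, W) per-pixel depth confidence in [0, 1]
    Returns the fused (H, W, 3) image.
    """
    w = view_weights * depth_confidences   # combined per-view, per-pixel weight
    w = w[..., None]                       # broadcast over the RGB channels
    fused = (w * colors).sum(axis=0) / (w.sum(axis=0) + eps)
    return fused
```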
Conventional methods that merge multiple images with different exposure levels often suffer from blur and ghosting due to object movement. Existing ghosting removal algorithms are usually complex and slow, making them unsuitable for real-time video applications. To address this challenge, a dual-gain HDR fusion method is implemented on an FPGA. The IMX662 image sensor is employed, which simultaneously captures both high-conversion-gain (HCG) and low-conversion-gain (LCG) images with the same exposure time, enabling efficient HDR image synthesis. The proposed method directly addresses the source of the problem, eliminating the need for post-processing steps and thereby preserving algorithmic simplicity. Experimental results reveal that the proposed method not only removes ghosting entirely but also processes data on an FPGA 98.79% faster than traditional software-based HDR fusion techniques, enabling real-time video stream processing. This dual-gain, ghosting-free fusion algorithm demonstrates promising potential for use in high-speed photography and surveillance.
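A CPU-side sketch of what dual-gain fusion can look like, under the assumption of a simple saturation-based selection rule: because the HCG and LCG frames share one exposure, motion cannot differ between them, and fusion reduces to taking the HCG value where it is unsaturated and the gain-scaled LCG value where it is clipped. The gain ratio and threshold below are illustrative; the actual design runs on an FPGA, not in NumPy.

```python
# Sketch: fuse same-exposure HCG/LCG frames into one linear HDR frame.
import numpy as np

def fuse_dual_gain(hcg, lcg, gain_ratio=4.0, sat_level=4000):
    """Fuse 12-bit HCG/LCG frames of identical exposure; values are illustrative."""
    hcg = hcg.astype(np.float32)
    lcg = lcg.astype(np.float32)
    saturated = hcg >= sat_level                  # HCG clipped: trust the LCG pixel
    hdr = np.where(saturated, lcg * gain_ratio, hcg)
    return hdr                                    # higher dynamic range, no ghosting possible
```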
Healthcare monitoring depends on the accuracy of physiological parameters measured in real time, given the ongoing increase in the number of patients relative to the limited number of medical physicians. Imaging photoplethysmography (IPPG) is one of the emerging non-invasive techniques for the measurement of vital signs, including oxygen saturation (SpO2), heart rate (HR), and respiratory rate (RR). This work presents a comprehensive sensitivity analysis to evaluate the impact of critical acquisition parameters: (1) image resizing, from 100% down to 2%, (2) the region of interest (ROI) within the images, and (3) acquisition duration, from 5 s to 30 s, using image sequences obtained at 30 frames per second. To evaluate and validate the performance of the system, the study consists of several mouse examinations to enhance both precision and consistency in real-time monitoring. The analysis reveals how image resizing influences signal integrity, image resolution, and processing efficiency, which is crucial for resource-limited applications. The ROI selection analysis identifies the key regions for optimizing the accuracy of the measured vital signs, while the evaluation of acquisition duration provides insight into the minimum duration required for reliable vital-sign estimates. This comprehensive analysis advances the current state of the art and addresses previously overlooked but important factors, offering a robust framework for effective real-time monitoring in research and medical applications.
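A minimal sketch of the basic IPPG measurement whose acquisition parameters the study sweeps: the chosen color channel is averaged over the ROI in every frame, and heart rate is read off the dominant spectral peak of that signal. The frequency band limits below are illustrative assumptions.

```python
# Sketch: estimate heart rate from an ROI-averaged channel across a frame sequence.
import numpy as np

def heart_rate_from_roi(frames, roi, fps=30.0, band_hz=(0.7, 4.0)):
    """
    frames: sequence of (H, W, 3) RGB frames
    roi:    (y0, y1, x0, x1) region of interest
    Returns the estimated heart rate in beats per minute.
    """
    y0, y1, x0, x1 = roi
    signal = np.array([f[y0:y1, x0:x1, 1].mean() for f in frames])  # green channel
    signal = signal - signal.mean()
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    in_band = (freqs >= band_hz[0]) & (freqs <= band_hz[1])
    peak_hz = freqs[in_band][np.argmax(spectrum[in_band])]
    return peak_hz * 60.0
```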
Simultaneous Localization and Mapping (SLAM) is intended for robotic and autonomous vehicle applications. These targets require an optimal embedded implementation that respects real-time constraints, limited hardware resources, and energy consumption. SLAM algorithms are computationally intensive to run on embedded targets, and the algorithms are often deployed on CPUs or CPU-GPGPU architectures. With the growth of embedded heterogeneous computing systems, research is increasingly interested in the algorithm-architecture mapping of existing SLAM algorithms. The latest trend is to push processing closer to the sensor. FPGAs are well suited to designing smart sensors: they provide the low latency required by real-time applications such as video streaming, since sensor data can be fed directly into the FPGA without passing through a CPU. In this work, we propose the implementation of the HOOFR-SLAM front end on a CPU-FPGA architecture, including both the feature extraction and matching processing blocks. A high-level synthesis (HLS) approach based on the OpenCL paradigm has been used to design a new system architecture. The performance of the FPGA-based architecture was compared to a high-performance CPU. This architecture delivers superior performance compared to existing state-of-the-art systems.
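As a CPU reference for the two front-end blocks that the paper offloads to the FPGA, the sketch below runs feature extraction and matching with ORB, used purely as a stand-in; the HOOFR extractor itself (a Hessian-based detector with a modified FREAK descriptor) is not reimplemented here.

```python
# Sketch: SLAM front-end blocks (feature extraction + matching), ORB as a stand-in.
import cv2

orb = cv2.ORB_create(nfeatures=1000)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

def front_end(prev_gray, curr_gray):
    """Extract keypoints/descriptors in both frames and match them."""
    kp1, des1 = orb.detectAndCompute(prev_gray, None)
    kp2, des2 = orb.detectAndCompute(curr_gray, None)
    if des1 is None or des2 is None:
        return [], kp1, kp2
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
    return matches, kp1, kp2
```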
Aiming at the problem of the low defect detection rate of PCB images captured by cameras in industrial scenarios under low-light environments, an MGIE (Mean-Gamma Image Enhancement) image brightness enhancement algorithm and a corresponding FPGA design scheme are proposed. Firstly, the RGB image is converted into the YCrCb color space, and the luminance component Y is separated. The Y component is then enhanced by an MSR (Multi-Scale Retinex) algorithm based on multi-scale mean filtering, and a Gamma correction algorithm is used to adjust the brightness. Subsequently, the processed Y channel is fused with the Cr and Cb channels to obtain the final output. Secondly, the paper elaborates on the FPGA-based design and deployment scheme for the algorithm. The MGIE IP core is designed in the HLS (High-Level Synthesis) environment and is optimized and accelerated by means of look-up tables and PIPELINE directives. Significantly, the design is capable of real-time processing of video frames: images are captured in real time by an OV5640 camera, and the processed images are immediately displayed on an LCD screen. The experimental results show that the MGIE algorithm is remarkably effective on low-light PCB images, with a PSNR (Peak Signal-to-Noise Ratio) of 17.34 and an SSIM (Structural Similarity Index Measure) of 0.79. After end-to-end deployment, the processing speed for 1280 x 720 and 640 x 640 pixel images reaches 30 fps and 70 fps, respectively, meeting the needs of real-time processing.
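A software reference sketch of the MGIE pipeline as described above: convert to YCrCb, enhance the Y channel with a multi-scale Retinex built on mean (box) filtering, apply gamma correction, and merge back with the original Cr and Cb channels. The scale sizes and gamma value are illustrative assumptions, not the paper's parameters.

```python
# Sketch: CPU reference of the described MGIE pipeline (the paper targets an FPGA).
import cv2
import numpy as np

def mgie_enhance(bgr, scales=(15, 81, 201), gamma=0.6):
    ycrcb = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb).astype(np.float32)
    y = ycrcb[:, :, 0] + 1.0                        # avoid log(0)
    # Multi-scale Retinex with box (mean) filters instead of Gaussians
    msr = np.zeros_like(y)
    for k in scales:
        illumination = cv2.blur(y, (k, k)) + 1.0
        msr += np.log(y) - np.log(illumination)
    msr /= len(scales)
    # Normalize the reflectance estimate back to [0, 255]
    msr = cv2.normalize(msr, None, 0, 255, cv2.NORM_MINMAX)
    # Gamma correction to lift overall brightness
    y_out = 255.0 * np.power(msr / 255.0, gamma)
    ycrcb[:, :, 0] = np.clip(y_out, 0, 255)
    return cv2.cvtColor(ycrcb.astype(np.uint8), cv2.COLOR_YCrCb2BGR)
```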