ISBN:
(Print) 9781510673878; 9781510673861
Recent advancements in volumetric displays have opened doors to immersive, glass-free holographic experiences in our everyday environments. This paper introduces Holoportal, a real-time, low-latency system that captures, processes, and displays 3D video of two physically separated individuals as if they are conversing face-to-face in the same location. The evolution of work in multi-view immersive video communication from a Space-time-Flow (STF) media technology to real-time Holoportal communication is also discussed. Multiple cameras at each location capture subjects from various angles, with wireless synchronization for precise video-frame alignment. Through this technology we envision a future where any living space can transform into a Holoportal with a wireless network of cameras placed on various objects, including TVs, speakers, and refrigerators.
In response to the challenge of monitoring the quality of ink droplet injection in the field of digital inkjet printing, this study designs and implements a visual measurement system for ink droplets based on high-definition video image processing technology. The aim is to provide a convenient and accurate method to alert users in a timely manner to the quality of ink droplet injection in inkjets. The system can capture and analyze the image of an ink droplet sprayed by an inkjet in real time, effectively monitoring and evaluating the quality of ink droplet injection. This study uses high-definition camera equipment to capture real-time images of ink droplets sprayed by an inkjet head. By using image processing algorithms, the system can accurately extract key parameters such as the number, position, volume, and flight speed of ink droplets. Through detailed experimental verification, the algorithm and system developed by our research institute have demonstrated excellent performance in detecting ink droplet spray anomalies, achieving precise detection and evaluation of ink droplets. The ink droplet visual detection system can not only capture high-definition images of ink droplets in real time but also extract crucial information for quality evaluation, providing users with an accurate and reliable tool for evaluating the quality of ink droplets. Experimental results demonstrate that the proposed droplet visual inspection system significantly outperforms other systems, validating its effectiveness in droplet detection applications. The results of this study not only provide strong technical support for quality control of inkjet printing technology but also significantly improve traditional ink droplet detection methods through real-time monitoring and automated processing. This improves the efficiency and accuracy of inkjet printing and also greatly promotes the application of inkjet printing technology in various fields through innovative system applications, especially in hi…
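The parameter-extraction step described above can be sketched in software: binarize a frame, label connected blobs, and derive droplet count, centroid position, area (a proxy for volume), and speed from centroid displacement between consecutive frames. This is a minimal illustration, not the paper's actual pipeline; the naive blob matching (pairing blobs by label order) and the names `px_to_um` and `dt` are assumptions for the sketch.

```python
# Sketch: extract droplet count, position, area, and speed from two
# binarized high-speed frames. Illustrative only; real droplet masks
# would come from thresholded camera frames.
from collections import deque

def label_blobs(mask):
    """4-connected component labeling; returns a list of pixel lists."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    blobs = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                q, pix = deque([(y, x)]), []
                seen[y][x] = True
                while q:
                    cy, cx = q.popleft()
                    pix.append((cy, cx))
                    for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            q.append((ny, nx))
                blobs.append(pix)
    return blobs

def droplet_params(mask_t0, mask_t1, dt, px_to_um):
    """Count, centroid, area (volume proxy), and speed between two frames."""
    b0, b1 = label_blobs(mask_t0), label_blobs(mask_t1)
    def centroid(pix):
        return (sum(p[0] for p in pix) / len(pix), sum(p[1] for p in pix) / len(pix))
    out = []
    for p0, p1 in zip(b0, b1):  # naive matching: assumes stable blob order
        c0, c1 = centroid(p0), centroid(p1)
        dist = ((c1[0] - c0[0]) ** 2 + (c1[1] - c0[1]) ** 2) ** 0.5 * px_to_um
        out.append({"area_px": len(p0), "centroid": c0, "speed_um_s": dist / dt})
    return len(b0), out
```

In practice the matching step would pair blobs by nearest centroid rather than label order, and volume would be calibrated from area against known droplet sizes.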
ISBN:
(Print) 9781510673199; 9781510673182
Multiply-Accumulate (MAC) operation is widely used in various real-time image processing tasks, ranging from Convolutional Neural Networks to digital filtering, significantly impacting overall system performance. In this work, the Self-Adapting Reconfigurable Multiply-Accumulate (SR-MAC) is proposed as a new instrument to find the optimal trade-off between operation throughput, power consumption, and physical resource utilization in real-time image processing applications. Operation of the proposed system relies on the dynamic reconfiguration of hardware resources on the basis of the current computational requirements. This is achieved by monitoring overflow and over-representation occurrences at each accumulation cycle, and properly considering the relevant portion of the accumulation result. A custom architecture of the proposed algorithm has been designed and implemented on an AMD Xilinx Artix-7 FPGA through a Verilog description and compared to the AMD Xilinx fixed-point macro (floating-point fused multiply-accumulate). The SR-MAC achieves reductions of 83% (82%), 79% (93%), and 87.2% (94%) in the number of LUTs, FFs, and the power dissipation P-dynN, respectively. The SR-MAC has also been used to replace arithmetic units in typical real-time image processing applications. In these cases, its employment has allowed reductions of up to 6% and 14% in FFs and P-dynN, respectively, while increasing f(Max) by up to 14%. These results highlight the significant performance enhancement achieved with respect to both single operators and entire systems, making the SR-MAC an excellent design choice in real-time image processing applications.
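The reconfiguration policy described above can be illustrated with a behavioral model: after each accumulation, widen the accumulator when the result overflows the current width, and narrow it when the result is over-represented (fits comfortably in fewer bits). The step size, width bounds, and narrowing margin below are illustrative assumptions; the abstract does not specify the SR-MAC's actual parameters.

```python
# Behavioral sketch of a self-adapting MAC accumulator. Widths and the
# 4-bit adjustment step are assumed for illustration, not taken from the paper.
class SRMac:
    def __init__(self, width=8, min_width=8, max_width=32):
        self.width = width          # current accumulator bit-width
        self.min_width = min_width
        self.max_width = max_width
        self.acc = 0

    def _fits(self, w):
        """True if the accumulator fits in a signed w-bit word."""
        lim = 1 << (w - 1)
        return -lim <= self.acc < lim

    def mac(self, a, b):
        self.acc += a * b
        # overflow: widen the datapath until the result fits
        while self.width < self.max_width and not self._fits(self.width):
            self.width += 4
        # over-representation: result fits with a wide margin, so narrow
        if self.width > self.min_width and self._fits(self.width - 8):
            self.width -= 4
        return self.acc
```

A hardware version would perform the same checks combinationally on the overflow and guard bits each cycle rather than in a loop.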
In image processing and machine vision, corner detection is pivotal in diverse applications, including computer vision, 3D reconstruction, face detection, object tracking, and video technologies. Despite the wide usage, the real-time and energy-efficient hardware implementation of corner detection algorithms remains a critical challenge because of computational resource limitations. On the other hand, owing to the complicated nature of corner detection algorithms, their hardware implementation has been limited to graphics processing unit (GPU) and field-programmable gate array (FPGA) platforms. In this regard, this work aims to propose a novel and ultra-efficient carbon nanotube field-effect transistor (CNTFET)-based hardware for image corner detection. Thanks to the proposed corner detection algorithm, the designed hardware has been realized using 2742 transistors with competitive accuracy. The proposed corner detection hardware exhibits remarkable salt-and-pepper noise immunity without using any noise reduction circuit. Our comprehensive simulations demonstrate 78%, 87%, and 94.5% total average improvements in delay, power, and energy compared to other related corner detection hardware. Moreover, the proposed CNTFET-based corner detection hardware shows a 43 ps propagation delay, demonstrating its real-time operation. The proposed corner detection algorithm at the system level shows suitable accuracy metrics such as Recall, Precision, and error of detection (EoD) compared to other well-known corner detectors. Our method has established a new pathway for real-time circuit-level hardware design for image processing and machine vision applications.
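For orientation, a standard software corner detector (Harris-style) works on the principle of measuring gradient energy in a local window: corners are points where the image gradient has strong components in two directions. This is a generic reference implementation, not the paper's CNTFET algorithm, and the 3x3 window and sensitivity constant `k` are conventional defaults.

```python
# Generic Harris-style corner response, given only as a software reference
# point for the hardware detector discussed above.
import numpy as np

def harris_response(img, k=0.05):
    iy, ix = np.gradient(img.astype(np.float64))   # per-axis image gradients
    ixx, iyy, ixy = ix * ix, iy * iy, ix * iy
    def box(a):
        # sum gradient products over a 3x3 window around each pixel
        out = np.zeros_like(a)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                out += np.roll(np.roll(a, dy, 0), dx, 1)
        return out
    sxx, syy, sxy = box(ixx), box(iyy), box(ixy)
    det = sxx * syy - sxy * sxy
    # large positive response = corner; negative = edge; near zero = flat
    return det - k * (sxx + syy) ** 2
```

Thresholding the response and taking local maxima yields the final corner list.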
ISBN:
(Print) 9798350391893; 9798350391886
In the dynamic landscape of modern communication, the demand for innovative virtual video conferencing solutions is ever-increasing. Our work presents an innovative approach to building a virtual video conferencing system that remote users can access through a web page. Our system allows remote participants, joining via a web browser, to freely navigate and view the virtual environment from any angle, enhancing spatial awareness and engagement. Additionally, our system grants participants the freedom to view the environment independently, even if the host restricts certain views, addressing one of the main drawbacks of current video conferencing systems. Unlike current platforms, our solution also allows users to choose their appearance location within the virtual space, a feature missing from existing systems. Furthermore, our system is highly customizable, enabling the integration of features such as recording specific portions of the screen, which are not available in existing video conferencing tools. This flexibility ensures a more immersive, interactive, and personalized meeting experience, significantly advancing the capabilities of remote collaboration technologies. Our work highlights the results of research carried out to create a virtual conference setting in the Unity environment and to establish successful real-time communication between the webpage and the Unity environment. In this virtual setting, monitors act as participants, and participants can choose on which monitor they want to appear. Participants join this virtual meeting setup from a webpage, which consists of two windows: the first window shows the participants themselves, and the second window displays the virtual meeting setup. Participants can observe this environment from any perspective they want, navigating with a keyboard and a mouse. Since we implement everything from scratch, we have full control over every feature and functionality with the Agora video SDK.
ISBN:
(Print) 9781728198354
Video Object Detection (VOD) is one of the fundamental problems in video understanding, with applications ranging from surveillance to autonomous driving. But many such real-world applications are unable to leverage existing VOD models owing to their higher computational complexity, which reduces inference speed. Single-stage still-image object detection models are naively used without any use of video information. In this paper, we present a YOLOX-based VOD model, YOLO-MaxVOD, which provides a better trade-off between accuracy and inference time than current real-time VOD solutions. Specifically, we propose a temporal fusion module that integrates into the YOLOX architecture to take advantage of the high speed that the YOLOX model offers. In our experimentation on the ImageNet VID dataset, we show that YOLO-MaxVOD achieves a 4.4-5.6% AP50 improvement over the baseline YOLOX, across different versions, with just a 1-2 ms increase in latency on an NVIDIA 1080Ti GPU.
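A temporal fusion module in this spirit blends the current frame's features with an aggregate of past frames, so a still-image detector can exploit video information. The exponential moving average below is only a stand-in for the paper's learned fusion module, whose architecture the abstract does not detail; the `momentum` parameter is an assumption.

```python
# Stand-in temporal fusion over per-frame feature maps: an exponential
# moving average. A learned module would replace the fixed momentum
# with predicted per-location blending weights.
import numpy as np

def fuse_features(frames, momentum=0.6):
    """frames: iterable of equally shaped feature maps; returns fused maps."""
    memory, fused = None, []
    for f in frames:
        # blend aggregated memory with the current frame's features
        memory = f if memory is None else momentum * memory + (1 - momentum) * f
        fused.append(memory)
    return fused
```

The fused map would then feed the detection head in place of the raw per-frame features, which is what keeps the added latency small.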
Road damage detection is a crucial task of road inspection systems. Although traditional object detection models achieve promising performance, the presence of shadows exacerbates the difficulty of road damage detection in practical scenarios. To tackle these challenges, we introduce a novel shadow-image enhancement network, the global-local enhancement network, and combine it with a YOLOv7-tiny detection network augmented with our components to craft an end-to-end detection framework. We integrate deep neural networks with conventional methods and propose the global statistical texture enhancement module to enhance global statistical texture information. We propose the local enhancement module to enhance road damage edge information in shadow regions. Furthermore, we craft a shadow region loss to optimize the enhancement models and employ dynamic snake convolution to replace certain traditional convolutions in the detection network. We evaluate our method on shadow linear road damage datasets, SRoad and DRoad, which comprise road images from different perspectives in Beijing, China. The results demonstrate that our approach surpasses the performance of low-light enhancement models and low-light detection models. The method achieves an mAP of 71.2% at 98.8 FPS on the SRoad dataset and an mAP of 79.7% at 103.2 FPS on the DRoad dataset. The proposed model optimizes performance and model size, meeting the requirements for real-time processing in industrial applications.
Human monitoring of surveillance cameras for anomaly detection may be a monotonous task, as it requires constant attention to judge whether the captured activities are anomalous or suspicious. This paper exploits background subtraction (BS), a convolutional autoencoder, and object detection for a fully automated surveillance system. BS was performed by modelling each pixel as a mixture of Gaussians (MoG) to concatenate only the higher-order learning in the foreground. Next, the foreground objects are fed to the convolutional autoencoders to filter out abnormal events from normal ones and automatically identify signs of threat and violence in real time. Then, object detection is applied to the entire scene and the region of interest is highlighted with a bounding box to minimize human intervention in video stream processing. At recognition time, the network generates an alarm for the presence of an anomaly to notify of the identification of potentially suspicious actions. Finally, the complete system is validated on several benchmark datasets and proved to be robust for complex video anomaly detection. The average area under the curve (AUC) for the frame-level evaluation across all benchmarks is 94.94%. The best AUC improvement of the proposed system over state-of-the-art methods is 7.7%.
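The background-subtraction stage can be sketched with a per-pixel statistical model. For brevity, the sketch below keeps a single running Gaussian per pixel rather than a full mixture, so it is a simplification of the paper's MoG approach; the learning rate `alpha` and threshold multiplier `k` are illustrative values.

```python
# Simplified per-pixel background model in the spirit of MoG background
# subtraction: one running Gaussian per pixel instead of a mixture.
import numpy as np

class RunningGaussianBS:
    def __init__(self, init_frame, alpha=0.05, k=2.5):
        self.mean = init_frame.astype(np.float64)          # background mean
        self.var = np.full(init_frame.shape, 15.0 ** 2)    # initial variance
        self.alpha, self.k = alpha, k

    def apply(self, frame):
        frame = frame.astype(np.float64)
        d2 = (frame - self.mean) ** 2
        fg = d2 > (self.k ** 2) * self.var    # pixel far from model = foreground
        a = self.alpha * (~fg)                # update only background pixels
        self.mean += a * (frame - self.mean)
        self.var += a * (d2 - self.var)
        return fg
```

A full MoG keeps several (mean, variance, weight) triples per pixel and matches each incoming pixel against all of them, which handles multimodal backgrounds such as waving foliage.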
ISBN:
(Print) 9781510673199; 9781510673182
Recent algorithmic developments, specifically in deep learning, have propelled computer vision forward for practical applications. However, the high computational complexity and the resulting power consumption are often overlooked issues. This is a problem not only when systems need to be installed in the wild, where often only a limited electricity supply is available, but also in the broader context of high energy consumption. To address both aspects, we explore the intersection of green artificial intelligence and real-time computer vision, focusing on the use of single-board computers. To this end, we take into account the limitations of single-board computers, including limited processing power and storage capacity, and demonstrate how algorithm and data optimization ensure high-quality results at drastically reduced computational effort. Energy efficiency can thus be increased, aligning with the goals of Green AI and making such systems less dependent on a permanent electrical power supply.
ISBN:
(Print) 9781510673199; 9781510673182
Optical image processing, which capitalizes on the distinctive characteristics of light, facilitates the manipulation of visual data in real time and at high speed. This technology is instrumental in performing tasks such as enhancing edges, recognizing patterns, and extracting features, all of which are crucial in fields like medical imaging, surveillance, and industrial automation. In this study, we present the successful demonstration of a photonic integrated circuit (PIC) made of lithium niobate on insulator, enabling matrix-vector multiplications for image classification. By surpassing an electrical bandwidth of 15 GHz, our experiment showcases the PIC's ability to perform live edge detection and video streaming. Remarkably, its energy efficiency surpasses the per-operation limit imposed by electronic systems, consuming < 10 fJ/bit.
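The matrix-vector multiplication at the heart of such a PIC can be illustrated in software: a 1-D Laplacian edge-detection kernel unrolled into a banded matrix, so the whole filter reduces to a single matrix-vector product of the kind the photonic circuit computes. The kernel and signal sizes are illustrative, not taken from the paper.

```python
# Edge detection expressed as one matrix-vector multiplication: the
# kernel [-1, 2, -1] unrolled into a banded (tridiagonal) matrix.
import numpy as np

def laplacian_matrix(n):
    """Banded matrix M such that M @ signal applies the kernel [-1, 2, -1]."""
    m = np.zeros((n, n))
    for i in range(n):
        m[i, i] = 2.0
        if i > 0:
            m[i, i - 1] = -1.0
        if i < n - 1:
            m[i, i + 1] = -1.0
    return m

signal = np.array([0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0])  # flat-topped pulse
edges = laplacian_matrix(len(signal)) @ signal           # one MVM -> edge map
```

The response is nonzero only around the two transitions of the pulse, which is exactly the edge-enhancement behavior the photonic matrix-vector multiplier performs at optical speeds.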