In the field of autonomous driving, 3D target detection is an important technology. In view of the shortcomings of existing monocular 3D detection algorithms in terms of accuracy and real-time performance, we propose ...
ISBN: (print) 9798350399806
For the future cyber-physical system (CPS) society, it is necessary to construct digital twins (DTs) of the real world in real time using many cameras and sensors. Hence, the energy efficiency of both networks and computers for large-scale distributed video analysis is a major challenge for the full-scale spread of CPSs and DTs. Toward this goal, we first propose a model to arbitrarily split and distribute the video analysis task to terminals, edge servers, and cloud servers and dynamically assign appropriate CNN models to them. System-wide optimization of such distributed processing can reduce overall system power consumption by reducing network bandwidth and efficiently utilizing distributed CPU/GPU resources. To realize this optimization in a real system, we also propose a model to estimate the GPU load, processing time, and power consumption of these devices based on massive experimental measurements. Since such a large-scale optimization is difficult because of the dynamic and multi-objective nature of the problem, we propose a new optimization algorithm composed of a Genetic Algorithm and the Bayesian Attractor Model. Finally, simulation evaluations are performed to demonstrate that the proposed method can minimize system power consumption and satisfy the latency and recognition accuracy requirements of each video analysis, even under changing environmental conditions.
This paper proposes a smart parking system, based on image processing, that helps drivers find available parking slots. With the increased number of vehicles, which leads to parking congestion, finding an e...
ISBN: (print) 9798400704123
Video streaming stands as the cornerstone of telecommunication networks, constituting over 60% of mobile data traffic as of June 2023. The paramount challenge faced by video streaming service providers is ensuring high Quality of Experience (QoE) for users. In HTTP Adaptive Streaming (HAS), including DASH and HLS, video content is encoded at multiple quality versions, with an Adaptive Bitrate (ABR) algorithm dynamically selecting versions based on network conditions. Concurrently, Artificial Intelligence (AI) is revolutionizing the industry, particularly in content recommendation and personalization. Leveraging user data and advanced algorithms, AI enhances user engagement, satisfaction, and video quality through super-resolution and denoising techniques. However, challenges persist, such as real-time processing on resource-constrained devices, the need for diverse training datasets, privacy concerns, and model interpretability. Despite these hurdles, Generative Artificial Intelligence is emerging as a transformative force. Generative AI, capable of synthesizing new data based on learned patterns, holds vast potential in the video streaming landscape: it can create realistic and immersive content, adapt in real time to individual preferences, and optimize video compression for seamless streaming in low-bandwidth conditions. This research proposal outlines a comprehensive exploration at the intersection of advanced AI algorithms and digital entertainment, focusing on the potential of generative AI to elevate video quality, user interactivity, and the overall streaming experience. The objective is to integrate generative models into video streaming pipelines, opening novel avenues toward dynamic, personalized, and visually captivating streaming experiences for viewers.
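The ABR selection step mentioned in the abstract above can be illustrated with a minimal throughput-based sketch. This is not the method of the proposal: the function name `select_bitrate`, the bitrate ladder, and the `safety` factor are all hypothetical, and real DASH/HLS players typically combine throughput estimates with buffer occupancy.

```python
from typing import Sequence


def select_bitrate(ladder_kbps: Sequence[int],
                   throughput_kbps: float,
                   safety: float = 0.8) -> int:
    """Pick the highest encoded bitrate that fits within a safety
    fraction of the measured throughput (hypothetical rule).

    Falls back to the lowest rung when nothing fits, so playback
    can continue at reduced quality instead of stalling.
    """
    if not ladder_kbps:
        raise ValueError("bitrate ladder must not be empty")
    budget = safety * throughput_kbps
    candidates = [b for b in ladder_kbps if b <= budget]
    return max(candidates) if candidates else min(ladder_kbps)


if __name__ == "__main__":
    ladder = [300, 750, 1500, 3000]  # assumed quality versions, in kbps
    print(select_bitrate(ladder, throughput_kbps=2000))
    print(select_bitrate(ladder, throughput_kbps=100))
```

The safety factor is one simple way to keep a margin against throughput fluctuations; its value here is an arbitrary assumption.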
Imaging through a continuously fluctuating water-air interface (WAI) is challenging. The image obtained in this way suffers from complex refraction distortions that hinder the observer's accurate identification of the object. Reversing these distortions is an ill-posed problem, and current restoration methods using high-resolution video streams are difficult to adapt to real-time observation scenarios. This paper proposes a method for restoring instantaneous distorted images based on structured light and local approximate registration. The scheme first uses structured light measurement technology to obtain the fluctuation information of the water surface. Then, the displacement of the feature points between the distorted structured light image and the standard structured light image is obtained through a feature extraction algorithm and is used to estimate the distortion vector field at the corresponding sampling points in the distorted scene image. On this basis, the local approximate algorithm is used to reconstruct the distortion-free scene image. Experimental results show that the proposed algorithm not only reduces image distortion and improves image visualization, but is also significantly more computationally efficient than other methods, achieving an "end-to-end" processing effect.
Diabetic retinopathy (DR) is a complication of diabetes in which high blood sugar levels damage the retina; it can cause blindness if untreated. To accurately diagnose and grade DR, it is important to identify retinal lacerations or biomarkers. Optical coherence tomography (OCT) imaging is commonly used by ophthalmologists because its detailed visualisation of retinal lacerations aids the precise treatment of retinal abnormalities. However, the number of scans obtained daily exceeds the ophthalmologist's capacity to meaningfully analyse them, given the wide range of severe OCT applications and the prevalence of visual disorders. Several previous studies have attempted to address this issue using OCT scans, but none has tried to perform retinal laceration segmentation and DR grading simultaneously. To address this problem, we propose a new architecture: a cutting-edge decoupled convolutional network consisting of three distinct modules that work together to achieve accurate DR grading based on clinical standards, aided by retinal laceration segmentation. This paper introduces a deep learning framework that leverages dual guidance to improve performance on two related tasks. It was extensively tested using 26,841 multi-vendor scans, four publicly available datasets, and a real-time dataset containing 307 OCT scans from various patients. The results confirmed the effectiveness of our design, with a mean Dice score of 0.88 (4.76% improvement) in retinal laceration segmentation and 98.93% accuracy in DR grading, with a true positive rate of about 98.46% and a true negative rate of 99.37%.
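For readers unfamiliar with the metrics reported in the abstract above, the following sketch shows the standard definitions of the Dice score, true positive rate, and true negative rate on flat binary labels. The function names and the toy inputs are hypothetical; this is not the authors' evaluation code, and the abstract does not say how its mean Dice score was averaged.

```python
from typing import Sequence, Tuple


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for flat binary masks (0/1)."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(pred) + sum(target)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def tpr_tnr(pred: Sequence[int], truth: Sequence[int]) -> Tuple[float, float]:
    """True positive rate TP/(TP+FN) and true negative rate TN/(TN+FP)."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    tn = sum(1 for p, t in zip(pred, truth) if not p and not t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return tpr, tnr


if __name__ == "__main__":
    # Toy data, for illustration only.
    print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))
    print(tpr_tnr([1, 0, 1, 0], [1, 1, 0, 0]))
```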
Road environmental perception is one of the essential prerequisites for the autonomous driving of intelligent vehicles, and road lane detection plays a crucial role in it. However, road lane detection in complex road scenes is challenging due to poor illumination conditions, occlusion by other objects, and the influence of unrelated road markings, which also hinders the commercial application of autonomous driving technology in various road scenes. In order to minimize the impact of illumination factors on road lane detection tasks, researchers use deep learning (DL) technology to enhance low-light images. In this study, road lane detection is regarded as an image segmentation problem and is studied with a DL approach to meet the challenge of rapid environmental changes during driving. First, the Zero-DCE++ approach is used to enhance video frames of the road scene under low-light conditions. Then, building on the bilateral segmentation network (BiSeNet), the associate self-attention with BiSeNet (ASA-BiSeNet) approach, which integrates two attention mechanisms, is designed to improve road lane detection ability. Finally, ASA-BiSeNet is trained on a self-made road lane dataset for the road lane detection task and compared with the BiSeNet baseline. The experimental results show that ASA-BiSeNet runs at about 152.5 frames per second (FPS) with a mean intersection over union of 71.39%, which can meet the requirements of real-time autonomous driving. © 2023 Optica Publishing Group
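The mean intersection over union (mIoU) figure quoted in the abstract above is a standard segmentation metric. The sketch below shows one common way to compute it over flattened label maps; the function name `mean_iou`, the choice to skip classes absent from both prediction and ground truth, and the toy labels are assumptions, not details taken from the paper.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int],
             num_classes: int) -> float:
    """Average per-class IoU = |pred ∩ target| / |pred ∪ target|.

    Classes that appear in neither the prediction nor the ground
    truth are skipped (an assumed convention; others exist).
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Toy labels: 0 = background, 1 = lane (hypothetical classes).
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```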
The recent widespread increase of the Mpox (formerly monkeypox) virus infections in South Asian and African countries has raised concerns among medical professionals regarding the potential emergence of another pandemic in those regions. According to the World Health Organization (WHO) "emergency meeting" on May 20, 2022, there were 82,809 confirmed cases reported in 110 countries. With the number of available test kits surpassing the count of positive/probable cases, there is a pressing need to develop a robust and lightweight classifier model that can alleviate the burden of physical testing kits and expedite the detection process. The existing state-of-the-art primarily focuses on achieving high accuracy in modeling Mpox without considering factors such as modeling suitability, real-time inferencing, and adaptability to resource-constrained CPU-only mobile devices. In this research, we propose a novel lightweight binarized DarkNet53 model, referred to as BinaryDNet53, which is approximately 20× more computationally efficient and approximately 2× more power-efficient than the current state-of-the-art. This model demonstrates smooth detection capabilities when deployed on small hand-held or embedded devices. Firstly, we binarize the weights and biases of the DarkNet53 model to prevent high computational costs and memory usage. Next, our work introduces large-margin feature learning and weighted loss calculation to enhance results, particularly on c
In today's Flight Test Instrumentation (FTI) video telemetry applications, parallel video channels of the same video signal are acquired with the on-board data recorder. One is typically a high-quality video chann...
Combining cloud computing, computer vision, and the Internet of Things (IoT) makes it possible to draw on the strengths of each. Because the IoT is mostly composed of connected, contained gadgets, it can store and proces...