With the rapid development of artificial intelligence technology, deep learning has become one of the key technologies in the field of image recognition. PyTorch has become the preferred framework for researchers due ...
ISBN:
(Print) 9781510673199; 9781510673182
Background: The evolution of AI applications in dental imaging, covering caries detection, anatomical structure segmentation, and pathology identification, highlights the importance of high-quality datasets for effective detection models. This paper focuses on optimizing dataset quality for real-time AI-based dental bitewing radiograph detection. Methods: We systematically analyze preprocessing methods suitable for dental bitewing radiographs, covering image enhancement, noise reduction, and contrast adjustment. These techniques are strategically chosen to address common challenges in dental radiograph images, including variations in lighting, contrast disparities, and noise fluctuations. We employ optimized algorithms to meet real-time constraints, ensuring efficient model training and inference. Results: Our study assesses the impact of each preprocessing step on dataset quality and its influence on AI model performance. Practical recommendations are provided to empower researchers and practitioners in creating datasets optimized for dental bitewing radiograph detection tasks, aiming to improve AI model accuracy while adhering to real-time requirements. In addition, a comparative analysis is conducted, evaluating datasets enhanced using conventional methods against the ResNet18 model for the segmentation of bitewing dental images. Conclusion: This paper serves as a valuable guide for the dental imaging community, offering insights into preprocessing steps that elevate dataset quality for AI-driven dental bitewing radiograph detection. By emphasizing the relevance of real-time performance and providing a comparison with conventional enhancements on the ResNet18 model, we contribute to advancing early diagnosis and enhancing oral healthcare outcomes.
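The preprocessing pipeline this abstract describes (noise reduction followed by contrast adjustment) can be sketched in a few lines. The functions below are an illustrative minimal sketch, not the authors' implementation: a box-filter denoiser stands in for heavier methods, and percentile stretching stands in for more elaborate contrast enhancement.

```python
import numpy as np

def mean_denoise(img, k=3):
    """Simple k x k box-filter noise reduction (placeholder for stronger denoisers)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="reflect")
    out = np.zeros_like(img, dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def stretch_contrast(img, low_pct=2, high_pct=98):
    """Percentile-based contrast stretching for low-contrast radiographs."""
    lo, hi = np.percentile(img, [low_pct, high_pct])
    return np.clip((img - lo) / max(hi - lo, 1e-8), 0.0, 1.0)

def preprocess(img):
    """Noise reduction, then contrast adjustment, mirroring the pipeline order above."""
    return stretch_contrast(mean_denoise(img.astype(np.float64)))
```

Keeping each step a cheap, vectorizable operation is what makes such a pipeline compatible with the real-time constraint the paper emphasizes.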
Fuzzy image target classification detection plays an important role in image processing. Traditional classification detection methods are easily affected by environmental and equipment factors, and there are certain l...
ISBN:
(Print) 9781510673878; 9781510673861
Lower resolutions and a lack of distinguishing features in large satellite imagery datasets make identification tasks challenging for traditional image classification models. Vision Transformers (ViT) address these issues by creating deeper spatial relationships between image features. Self-attention mechanisms are applied to better understand not only which features correspond to which classification profile, but how the features correspond to each other within each separate category. These models, integral to computer vision machine learning systems, depend on extensive datasets and rigorous training to develop highly accurate yet computationally demanding systems. Deploying such models in the field can present significant challenges on resource-constrained devices. This paper introduces a novel approach to address these constraints by optimizing an efficient Vision Transformer (TinEVit) for real-time satellite image classification that is compatible with the STMicroelectronics AI integration tool, X-Cube-AI.
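The self-attention operation the abstract refers to can be shown in a single-head numpy sketch. This is a generic ViT-style attention, not TinEVit's specific (and presumably more hardware-friendly) variant; the projection matrices `Wq`, `Wk`, `Wv` are assumed inputs.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(patches, Wq, Wk, Wv):
    """Single-head self-attention over a sequence of patch embeddings.

    patches: (n_patches, d) array; Wq/Wk/Wv: (d, d_head) projections.
    Every patch attends to every other patch, which is what lets a ViT
    build spatial relationships between image features."""
    q, k, v = patches @ Wq, patches @ Wk, patches @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])   # pairwise patch affinities
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ v, weights
```

The quadratic cost of the `n_patches x n_patches` score matrix is exactly why efficient ViT variants matter on resource-constrained targets.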
ISBN:
(Print) 9781728198354
Deep convolutional neural networks have achieved great progress in image denoising tasks. However, their complicated architectures and heavy computational cost hinder their deployment on mobile devices. Some recent efforts in designing lightweight denoising networks focus on reducing either FLOPs (floating-point operations) or the number of parameters. However, these metrics are not directly correlated with on-device latency. In this paper, we identify the real bottlenecks that affect CNN-based models' run-time performance on mobile devices, namely memory access cost and NPU-incompatible operations, and design our model accordingly. To further improve denoising performance, we propose a mobile-friendly attention module (MFA) and a model reparameterization module (RepConv), which enjoy both low latency and excellent denoising performance. Combining these, we propose a mobile-friendly denoising network, namely MFDNet. Experiments show that MFDNet achieves state-of-the-art performance on the real-world denoising benchmarks SIDD and DND under real-time latency on mobile devices. The code and pre-trained models will be released.
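Structural reparameterization of the kind RepConv belongs to folds parallel training-time branches (e.g. a 3x3 conv, a 1x1 conv, and an identity skip) into one 3x3 kernel at inference, cutting memory access with no change in output. The single-channel numpy sketch below illustrates the general idea; it is not the paper's RepConv module.

```python
import numpy as np

def conv2d(x, kernel):
    """'Same' single-channel 2-D convolution with zero padding."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=np.float64)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * xp[i:i + x.shape[0], j:j + x.shape[1]]
    return out

def merge_branches(k3, k1):
    """Fold a parallel 1x1 conv and an identity branch into the 3x3 kernel."""
    merged = k3.astype(np.float64).copy()
    merged[1, 1] += k1[0, 0]   # a 1x1 conv is a 3x3 kernel with only the centre set
    merged[1, 1] += 1.0        # the identity branch is a centred delta kernel
    return merged
```

Because convolution is linear in the kernel, `conv(x, k3) + conv(x, k1) + x` equals `conv(x, merge_branches(k3, k1))`, so the multi-branch training graph collapses to one memory-friendly op at inference.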
The image quality is degraded in bad weather situations such as haze or fog. This problem can affect image processing applications such as computer vision, security, and some other real-time image processing systems. ...
ISBN:
(Print) 9798350394283; 9798350394276
This paper presents a real-time semantic segmentation framework for camera-based environment perception of objects and infrastructure elements in autonomous scale cars. It is specifically targeted towards student competitions such as the Carolo Cup or the Bosch Future Mobility Challenge. To reduce pixel-wise manual annotation effort, our framework involves a mixture of both synthetic and real image data, carefully tuned towards the unique requirements of the given scenario. Real images are acquired from a 1:10 scale vehicle equipped with a single monocular camera and are manually annotated. Synthetic image data with automatic pixel-wise annotation is obtained via a custom Unity-based simulation pipeline. We evaluate various mixed real-synthetic data strategies to train different state-of-the-art deep neural networks, with a focus on both segmentation performance and real-time capability, using an NVIDIA Jetson AGX Xavier platform as the in-vehicle test bed. Our experimental results show a significant improvement in semantic segmentation performance for the mixed real-synthetic data approach at real-time speeds of approximately 60 FPS on the target platform.
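A mixed real-synthetic training strategy as described above usually comes down to a tunable sampling ratio between the two pools. The helper below is a hypothetical sketch of such a sampler (the paper does not publish its exact mixing scheme); `real_fraction` is the assumed knob being swept in the evaluation.

```python
import random

def mixed_batch(real_samples, synthetic_samples,
                real_fraction=0.3, batch_size=8, rng=None):
    """Draw a training batch mixing manually annotated real images with
    auto-annotated synthetic ones at a tunable ratio."""
    rng = rng or random.Random(0)
    n_real = round(batch_size * real_fraction)
    batch = [rng.choice(real_samples) for _ in range(n_real)]
    batch += [rng.choice(synthetic_samples) for _ in range(batch_size - n_real)]
    rng.shuffle(batch)   # avoid a fixed real/synthetic ordering within the batch
    return batch
```

Sweeping `real_fraction` from 0 to 1 reproduces the kind of strategy comparison the paper reports: pure synthetic, pure real, and the mixed regimes in between.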
ISBN:
(Print) 9798331529543; 9798331529550
The increasing demand for high-quality, real-time visual communication and the growing user expectations, coupled with limited network resources, necessitate novel approaches to semantic image communication. This paper presents a method to enhance semantic image communication that combines a novel lossy semantic encoding approach with spatially adaptive semantic image synthesis models. By developing a model-agnostic training augmentation strategy, our approach substantially reduces susceptibility to distortion introduced during encoding, effectively eliminating the need for lossless semantic encoding. Comprehensive evaluation across two spatially adaptive conditioning methods and three popular datasets indicates that this approach enhances semantic image communication at very low bit rate regimes.
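A training augmentation that reduces susceptibility to encoding distortion can be sketched by corrupting the semantic map during training the way a lossy encoder would. The function below is an illustrative stand-in (nearest-neighbour down/up-sampling of the label map), not the paper's actual encoding model.

```python
import numpy as np

def lossy_label_augment(label_map, factor=4):
    """Simulate lossy semantic encoding during training: coarsen the
    segmentation map so the synthesis model learns to tolerate the
    distortion a lossy semantic encoder would introduce."""
    h, w = label_map.shape
    small = label_map[::factor, ::factor]   # crude nearest-neighbour downsample
    rows = np.repeat(np.arange(small.shape[0]), factor)[:h]
    cols = np.repeat(np.arange(small.shape[1]), factor)[:w]
    return small[np.ix_(rows, cols)]        # upsample back to the original size
```

Training the spatially adaptive synthesis model on such corrupted maps is what lets the system drop the requirement of lossless semantic encoding at very low bit rates.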
ISBN:
(Print) 9798350365474
In this paper, we present EdgeRelight360, an approach for real-time video portrait relighting on mobile devices, utilizing text-conditioned generation of 360-degree high dynamic range image (HDRI) maps. Our method proposes diffusion-based text-to-360-degree image generation in the HDR domain, taking advantage of the HDR10 standard. This technique facilitates the generation of high-quality, realistic lighting conditions from textual descriptions, offering flexibility and control in the portrait video relighting task. Unlike previous relighting frameworks, our proposed system performs video relighting directly on-device, enabling real-time inference with real 360-degree HDRI maps. This on-device processing ensures privacy and guarantees low runtime, providing an immediate response to changes in lighting conditions or user inputs. Our approach paves the way for new possibilities in real-time video applications, including video conferencing, gaming, and augmented reality, by allowing dynamic, text-based control of lighting conditions.
With the rapid development of science and technology, virtual reality (VR) technology based on image processing shows great potential and innovation in the field of interior design. The purpose of this paper is to exp...