ISBN: (print) 9798350372977; 9798350372984
The precise detection of plant centres is important for growth monitoring, as it enables continuous tracking of plant development to discern the influence of diverse factors. It also matters for automated systems such as robotic harvesting, where machines must locate and engage with plants. In this paper, we explore the YOLOv4 (You Only Look Once) real-time neural network detector for plant centre detection. Our dataset, comprising over 12,000 images from 151 Arabidopsis thaliana accessions, is used to fine-tune the model. Evaluation on this dataset shows that the model detects centres reliably across accessions, with a mean average precision (mAP) of 99.79% at a 50% IoU threshold. The model runs in real time at approximately 50 FPS, which makes it practical for time-sensitive analysis of video or image data.
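The 50% IoU threshold mentioned above can be illustrated with a minimal sketch. It assumes axis-aligned boxes given as (x_min, y_min, x_max, y_max); the names `iou` and `is_true_positive` are hypothetical helpers for illustration and are not taken from the paper or from the YOLOv4 code base.

```python
from typing import Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlap rectangle (may be empty).
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """A detection counts as correct when its IoU with the label meets the threshold."""
    return iou(pred, truth) >= threshold
```

A full mAP computation would additionally rank detections by confidence and integrate the precision-recall curve; that part is omitted here.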
The goal of image enhancement is to improve specific features or details of an image and enhance its overall visual quality. We introduce a novel image enhancement algorithm based on block-rooting processing combined ...
ISBN: (print) 9783031234798; 9783031234804
Attention-only Transformers [34] have been applied to both Natural Language Processing (NLP) and Computer Vision (CV) tasks. One Transformer architecture developed specifically for CV is the Vision Transformer (ViT) [15], which has since been used for numerous CV tasks. One interesting task is the pose estimation of a human subject. We present our modified ViT model, Un-TraPEs (UNsupervised TRAnsformer for Pose Estimation), which can reconstruct a subject's pose from a monocular image and its estimated depth. We compare the results of this model against a ResNet [17] trained from scratch and a ViT fine-tuned to the task, and show promising results.
ISBN: (print) 9798350373301; 9798350373295
Diabetic retinopathy (DR), a severe complication of diabetes, poses a significant threat to vision due to the deterioration of retinal blood vessels. This work proposes a comprehensive methodology for the automated detection, grading, and segmentation of DR, leveraging advanced image processing, machine learning, and deep learning techniques. The study uses the Indian Diabetic Retinopathy Image Dataset (IDRiD), comprising 81 fundus images and labels, to rigorously evaluate the proposed methodology. Key steps include detailed image preprocessing, VGG16-based feature extraction, Random Forest classifier-based grading, and innovative segmentation techniques for lesion localization. The evaluation demonstrates exceptional performance, with both VGG16 and ResNet50 architectures achieving over 99% accuracy. Semantic segmentation enhances interpretability, supporting clinical decision-making in retinopathy diagnosis. While the results are promising, future validation on diverse datasets and careful consideration of ethical implications are essential for responsible deployment in clinical settings. The proposed methodology represents a significant step toward precise diagnostics and improved patient outcomes in diabetic retinopathy and holds potential for broader applications in retinal disease diagnosis.
The verification of IP cores that implement image processing algorithms is important for SoC and FPGA applications in the field of machine vision. This paper proposes a general-purpose verification framework with real-time perfor...
Facial Attribute Manipulation (FAM) aims to aesthetically modify a given face image to render desired attributes, which has received significant attention due to its broad practical applications ranging from digital entertainment to biometric forensics. In the last decade, with the remarkable success of Generative Adversarial Networks (GANs) in synthesizing realistic images, numerous GAN-based models have been proposed to solve FAM with various problem formulation approaches and guiding information representations. This paper presents a comprehensive survey of GAN-based FAM methods with a focus on summarizing their principal motivations and technical details. The main contents of this survey include: (i) an introduction to the research background and basic concepts related to FAM, (ii) a systematic review of GAN-based FAM methods in three main categories, and (iii) an in-depth discussion of important properties of FAM methods, open issues, and future research directions. This survey not only builds a good starting point for researchers new to this field but also serves as a reference for the vision community.
Our skin is a hefty organ that envelops and shields the body. It protects us from numerous fatal and non-fatal diseases. It is observed that, due to bacteria or other causes of infection, the skin faces certain minor or life...
In the era of digitization and big data, the world is inundated with an ever-growing volume of visual content, be it images or videos. As organizations strive to harness the potential of these multimedia data sources,...
In recent years there has been increased interest in edge computing, i.e., computing performed on distributed devices as opposed to centralized high-power hubs. Examples of edge computing include the local image processing performed on Unmanned Autonomous Vehicles (UAVs) or the specialized machine vision systems on drones. These applications require schemes that are efficient in power and memory and that typically must operate in real time. Many state-of-the-art image processing solutions that employ advanced optimization and deep neural networks (NNs) achieve impressive benchmark results but are computationally demanding and thus often impractical. An additional requirement for a range of applications is noise robustness, or the ability to work in (extreme) low-light conditions; reasonable image quality or accurate object classification may be critical when there is low light flux or when the environment is over-saturated with other signals. Here, we approach edge computing with a combination of optical preprocessing and a shallow NN, and we show that this hybrid approach greatly reduces the computational requirements. For low-SNR imaging, we develop a technique that reconstructs objects and scenes from their Fourier-plane images. The optical preprocessing is performed via encoded diffraction with optical vortex singularities. The optical vortex encoder achieves differentiation of the already-compressed Fourier-plane patterns and enables facile inverse inference of the original object scene. We demonstrate that our method is robust to noise and that, for a simple NN architecture (one or two layers), it generalizes, i.e., it reconstructs objects from classes that differ greatly from those the NN was trained on. Our research identifies strong potential for swift hybrid imaging systems in edge computing applications and highlights the valuable function of the vortex encoder for spectral differentiation.
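The vortex encoding described above can be sketched numerically. This is a rough illustration only: it assumes the encoder acts as a phase-only mask exp(i·m·φ) multiplying the object field before a lens-like Fourier transform, and the function names, grid size, and topological charge are hypothetical choices rather than the authors' setup.

```python
import numpy as np


def vortex_mask(n: int, charge: int = 1) -> np.ndarray:
    """Phase-only mask exp(i * charge * phi) on an n x n grid centred on the origin."""
    # Half-pixel offset keeps the grid symmetric about zero for even n,
    # so no sample sits exactly on the singularity.
    coords = np.arange(n) - (n - 1) / 2.0
    x, y = np.meshgrid(coords, coords)
    phi = np.arctan2(y, x)  # azimuthal angle of each pixel
    return np.exp(1j * charge * phi)


def fourier_plane_intensity(obj: np.ndarray, charge: int = 1) -> np.ndarray:
    """Intensity recorded in the Fourier plane after vortex encoding (assumed model)."""
    field = obj * vortex_mask(obj.shape[0], charge)
    spectrum = np.fft.fftshift(np.fft.fft2(field))  # zero frequency moved to the centre
    return np.abs(spectrum) ** 2


# Usage: a uniform 16 x 16 object with and without the vortex (charge 0 = no mask).
plain = fourier_plane_intensity(np.ones((16, 16)), charge=0)
encoded = fourier_plane_intensity(np.ones((16, 16)), charge=1)
```

In this toy model the charge-1 mask cancels the on-axis (zero-frequency) component of a centrosymmetric object and moves its energy to higher spatial frequencies, which is one way to picture the differentiation effect; the shallow NN that inverts such patterns is not sketched here.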
The fashion industry’s traditional price-setting methods, based on historical sales and Fashion Week trends, are inadequate in the digital era. Rapid changes in collections and consumer preferences necessitate advanc...