ISBN:
(Print) 9781450398220
Recovery of true color from underwater images is an ill-posed problem. This is because the wide-band attenuation coefficients for the RGB color channels depend on object range, reflectance, etc., which are difficult to model. Also, there is backscattering due to suspended particles in water. Thus, most existing deep-learning-based color restoration methods, which are trained on synthetic underwater datasets, do not perform well on real underwater data. This can be attributed to the fact that synthetic data cannot accurately represent real conditions. To address this issue, we use an image-to-image translation network to bridge the gap between the synthetic and real domains by translating images from the synthetic underwater domain to the real underwater domain. Using this multimodal domain adaptation technique, we create a dataset that captures a diverse array of underwater conditions. We then train a simple but effective CNN-based network on our domain-adapted dataset to perform color restoration. Code and pre-trained models can be accessed at https://***/nehamjain10/TRUDGCR
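As a rough illustration of the two-stage pipeline this abstract describes, the sketch below first passes synthetic underwater images through a hypothetical pretrained synthetic-to-real translator and then supervises a small restoration CNN on the adapted images. The module names, architecture, and L1 loss are illustrative assumptions; this is not the authors' code (see the linked repository for that).

```python
# Minimal sketch, assuming a pretrained synthetic-to-real translator exists.
import torch
import torch.nn as nn

class ColorRestorationCNN(nn.Module):
    """A small convolutional stand-in for the restoration network."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

def train_step(translator, restorer, optimizer, synthetic_uw, ground_truth):
    # Stage 1: adapt synthetic underwater images to the real domain.
    with torch.no_grad():
        adapted = translator(synthetic_uw)
    # Stage 2: supervise the restoration CNN on the domain-adapted inputs.
    restored = restorer(adapted)
    loss = nn.functional.l1_loss(restored, ground_truth)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

restorer = ColorRestorationCNN()
opt = torch.optim.Adam(restorer.parameters(), lr=1e-4)
x = torch.rand(2, 3, 64, 64)  # stand-in synthetic underwater batch
y = torch.rand(2, 3, 64, 64)  # stand-in clean ground truth
print(train_step(nn.Identity(), restorer, opt, x, y))  # Identity as dummy translator
```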
Hand gestures are one of the nonverbal communication modalities used in sign language. They are most often used by individuals with hearing or speech impairments to communicate with each other or with hearing people. Many manufacturers across the world have created various sign language systems; however, they are neither adaptable nor cost-effective for end users. To address this, a model has been developed that provides a system prototype capable of detecting sign language automatically, allowing deaf and mute individuals to convey messages more successfully to hearing people. The static visuals were processed with a convolutional neural network and a feature extraction approach, and each sign was trained with ten samples. Image-processing methods are used to determine the fingertip location in static pictures and convert them to text. The suggested technique can recognize the signer's pictures taken in real time during the testing phase. The results demonstrate that the suggested Sign Language Recognition System is capable of accurately recognizing *** (c) 2022 Elsevier Ltd. All rights reserved. Selection and peer-review under responsibility of the scientific committee of the International Conference on Artificial Intelligence & Energy Systems.
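To make the processing pipeline concrete, here is a minimal sketch of a static-gesture classifier of the kind described (a small CNN over static images, one class per sign). The architecture, input size, and class count of 26 are assumptions for illustration, not the paper's specification.

```python
# Illustrative sketch only; architecture and class count are assumed.
import torch
import torch.nn as nn

NUM_SIGNS = 26  # assumption: one class per alphabet sign

class GestureCNN(nn.Module):
    def __init__(self, num_classes=NUM_SIGNS):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # 64x64 input -> two 2x poolings -> 16x16 spatial grid, 32 channels
        self.classifier = nn.Linear(32 * 16 * 16, num_classes)

    def forward(self, x):  # x: (N, 1, 64, 64) grayscale gesture images
        h = self.features(x)
        return self.classifier(h.flatten(1))

model = GestureCNN()
logits = model(torch.rand(4, 1, 64, 64))
print(logits.shape)  # torch.Size([4, 26])
```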
Digital imaging aims to replicate realistic scenes, but Low Dynamic Range (LDR) cameras cannot represent the wide dynamic range of real scenes, resulting in under-/over-exposed images. This paper presents a deep learning-based approach for recovering intricate details from shadows and highlights while reconstructing High Dynamic Range (HDR) images. We formulate the problem as an image-to-image (I2I) translation task and propose a conditional Denoising Diffusion Probabilistic Model (DDPM) based framework using classifier-free guidance. We incorporate a deep CNN-based autoencoder in our proposed framework to enhance the quality of the latent representation of the input LDR image used for conditioning. Moreover, we introduce a new loss function for LDR-to-HDR translation tasks, termed Exposure Loss. This loss directs gradients away from saturation, further improving the quality of the results. Through comprehensive quantitative and qualitative experiments, we demonstrate the proficiency of our proposed method. The results indicate that a simple conditional diffusion-based method can replace complex camera-pipeline-based architectures.
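The exact Exposure Loss is not given in the abstract; the following is a hedged sketch of one plausible formulation consistent with the description (pushing predictions away from saturation in under- and over-exposed regions). The function name and the `low`/`high` thresholds are assumptions.

```python
# One plausible reading of an "exposure loss"; the paper's formulation may differ.
import torch

def exposure_loss(pred, ldr, low=0.05, high=0.95):
    """Penalize predictions that stay saturated where the LDR input is
    under- or over-exposed, nudging them back toward recoverable values."""
    under = (ldr <= low).float()   # shadow mask from the LDR input
    over = (ldr >= high).float()   # highlight mask
    # In shadows, minimizing (1 - pred) encourages brighter predictions;
    # in highlights, minimizing pred encourages darker ones.
    return (under * (1.0 - pred) + over * pred).mean()

pred = torch.rand(1, 3, 32, 32, requires_grad=True)
ldr = torch.rand(1, 3, 32, 32)
exposure_loss(pred, ldr).backward()  # gradients flow only in saturated regions
```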
ISBN:
(Print) 9781665406451
Distinguishing computer-generated (CG) images from natural images is an effortless job for the human eye but not for a machine. Artificial or CG image-processing services are growing rapidly due to popular smartphone applications and filters in social media applications. Traditional image quality assessment (IQA) metrics are mostly defined for real-world images in terms of attributes describing noise and distortion. In contrast with traditional natural image content, artificial or CG image content has special characteristics that differentiate it from natural images. This difference opens new research opportunities toward designing metrics that define 'naturalness' in terms of image attributes. In this paper, we investigate how the curvelet features of a natural image can represent its naturalness. By training various classifiers on differential curvelet features of artwork-like images, we present a novel approach to estimating image naturalness and report its performance for various levels of naturalness. The results are promising and show potential for improvement through more detailed feature engineering.
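A sketch of the feature-and-classifier stage only: given curvelet coefficients per subband (assumed to come from an external curvelet transform, which is not shown here), one can form differential statistics against a reference and train a classifier. The feature definitions below are illustrative, not the paper's exact ones.

```python
# Sketch under the assumption that curvelet subband coefficients are given.
import numpy as np
from sklearn.svm import SVC

def subband_stats(coeffs):
    """Mean/std of coefficient magnitudes per subband, as one flat vector."""
    return np.array([f(np.abs(c)) for c in coeffs for f in (np.mean, np.std)])

def differential_features(coeffs_img, coeffs_ref):
    """Difference of subband statistics between an image and a reference."""
    return subband_stats(coeffs_img) - subband_stats(coeffs_ref)

# Toy data: four subbands of random "coefficients" per image.
rng = np.random.default_rng(0)
ref = [rng.normal(size=(16, 16)) for _ in range(4)]
X = np.stack([
    differential_features([rng.normal(scale=s, size=(16, 16)) for _ in range(4)], ref)
    for s in (0.5, 0.8, 1.2, 1.5) * 5
])
y = np.array([0, 0, 1, 1] * 5)  # toy labels: 0 = artificial-looking, 1 = natural
clf = SVC().fit(X, y)
print(clf.predict(X[:4]))
```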
In agro-processing industries, cereal grains occupy a significant place, but the consistency of grain quality is often compromised during the supply chain owing to improper and/or inefficient supervision. In ...
Augmented reality (AR) is emerging rapidly in the field of heritage interpretation due to its ready availability and accessibility. Heritage tourism is particularly concerned with the visitor experience at historical site...
ISBN:
(Print) 9789819940394
The proceedings contain 45 papers. The special focus in this conference is on . The topics include: Skin Cancer Recognition Using CNN, VGG16 and VGG19; Diagnosis of Cardiovascular Disease Using Machine Learning Algorithms and Feature Selection Method for Class Imbalance Problem; Similarity Based Answer Evaluation in Academic Questions Using Natural Language Processing Techniques; Fake News Detection Using Machine Learning and Deep Learning Classifiers; Survey on Pre-Owned Car Price Prediction Using Random Forest Algorithm; Sentiment Analysis of YouTube Comment Section in Indian News Channels; Deep Learning Framework for Speaker Verification Under Multi-Sensor, Multi-Lingual and Multi-Session Conditions; DLLACC: Design of an Efficient Deep Learning Model for Identification of Lung Air Capacity in COPD Affected Patients; Content Based Document Image Retrieval Using Computer Vision and AI Techniques; Drowsiness Detection System; Monitor the Effectiveness of Cardiovascular Disease Illness Diagnostics Utilizing AI and Supervised Machine Learning Classifiers; Architecture Based Classification for Intrusion Detection System Using Artificial Intelligence and Machine Learning Classifiers; A Novel Privacy-Centric Training Routine for Maintaining Accuracy in Traditional Machine Learning Systems; Outside the Closed World: On Using Machine Learning for Network Intrusion Detection; Data Collection for a Machine Learning Model to Suggest Gujarati Recipes to Cardiac Patients Using Gujarati Food and Fruit with Nutritive Values; Plant and Weed Seedlings Classification Using Deep Learning Techniques; A Comprehensive Review on Various Artificial Intelligence Based Techniques and Approaches for Cyber Security; Applicability of Machine Learning for Personalized Medicine; I-LAA: An Education Chatbot; A Comparison of Machine Learning Approaches for Forecasting Heart Disease with PCA Dimensionality Reduction.
ISBN:
(Print) 9781450398220
In excision biopsy, a tumor mass is surgically removed from the body. Subsequently, it is sliced at an appropriate location and investigated microscopically through a process called histopathology. Any bias in tumor slicing severely influences histopathology outcomes: if malignant foci do not appear at the sliced location, the tumor may be accidentally reported as non-malignant. Histopathologists standardly overcome this bias by slicing at multiple locations for their investigation. Until now, this process has been manual, time-consuming, and error-prone. We aim to design a system and develop a data-driven deep learning approach to assist histopathologists by providing them with a representative slice location for reporting, increasing their efficiency and accuracy. We have developed a low-cost linear gantry scanner that acquires images and is integrated with a deep learning model to predict the optimal slice representative of the pathology in a tumor mass. We achieve an F1 score of 0.97 and an accuracy of 97.5% in predicting an optimal slice using this approach.
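One simple way to frame the slice-selection task, sketched below under assumed details: score every candidate slice image with a small CNN and take the argmax as the representative slice. The architecture and scoring head are hypothetical; the paper's model and gantry-acquisition pipeline are not reproduced here.

```python
# Hypothetical slice scorer; not the authors' model.
import torch
import torch.nn as nn

class SliceScorer(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, 1),
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)  # one relevance score per slice image

scorer = SliceScorer()
slices = torch.rand(12, 3, 128, 128)   # 12 candidate slice images from the scanner
best = scorer(slices).argmax().item()  # index of the predicted representative slice
print("representative slice:", best)
```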
ISBN:
(Print) 9781665428125
We present a general learning-based solution for restoring images suffering from spatially-varying degradations. Prior approaches are typically degradation-specific and employ the same processing across different images and across different pixels within an image. However, we hypothesize that such spatially rigid processing is suboptimal for simultaneously restoring the degraded pixels and reconstructing the clean regions of the image. To overcome this limitation, we propose SPAIR, a network design that harnesses distortion-localization information and dynamically adjusts computation to difficult regions in the image. SPAIR comprises two components: (1) a localization network that identifies degraded pixels, and (2) a restoration network that exploits knowledge from the localization network in the filter and feature domains to selectively and adaptively restore degraded pixels. Our key idea is to exploit the non-uniformity of heavy degradations in the spatial domain and suitably embed this knowledge within distortion-guided modules performing sparse normalization, feature extraction, and attention. Our architecture is agnostic to the physical formation model and generalizes across several types of spatially-varying degradations. We demonstrate the efficacy of SPAIR individually on four restoration tasks: removal of rain streaks, raindrops, shadows, and motion blur. Extensive qualitative and quantitative comparisons with prior art on 11 benchmark datasets demonstrate that our degradation-agnostic network design offers significant performance gains over state-of-the-art degradation-specific architectures. Code available at https://***/humananalysis/spatially-adaptive-image-restoration.
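A rough sketch of the two-component idea follows: a localization network predicts a per-pixel degradation mask, and the restoration branch modulates its features with that mask. This illustrates the concept only; SPAIR's actual sparse normalization, distortion-guided attention, and filter-domain modules are more involved than this toy gating.

```python
# Conceptual sketch of mask-guided restoration; not SPAIR's actual modules.
import torch
import torch.nn as nn

class Localizer(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x)  # (N, 1, H, W) per-pixel degradation probability

class MaskGuidedRestorer(nn.Module):
    def __init__(self):
        super().__init__()
        self.feat = nn.Conv2d(3, 32, 3, padding=1)
        self.out = nn.Conv2d(32, 3, 3, padding=1)

    def forward(self, x, mask):
        h = torch.relu(self.feat(x))
        h = h * (0.1 + 0.9 * mask)  # emphasize features at pixels flagged as degraded
        return self.out(h)

x = torch.rand(1, 3, 64, 64)
mask = Localizer()(x)
restored = MaskGuidedRestorer()(x, mask)
print(restored.shape)  # torch.Size([1, 3, 64, 64])
```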
ISBN:
(Print) 9781450398220
Dense crowd counting is one of the challenging problems where creating large labeled datasets turns out to be difficult. Typical crowd images have thousands of people positioned close to each other, and annotating the location of every person is tedious. Added to this is the growing need to include crowds from as many diverse scenarios as possible for better generalization. In this context, labeling every head for the various settings under consideration is not scalable and directly limits the performance of deep models on account of limited data. We mitigate this issue with a new binary labeling scheme. Every image is simply labeled as either a dense or a sparse crowd, instead of annotating every single person in the scene. This leads to a dramatic reduction in the amount of annotation required, which becomes proportional to the number of images rather than the crowd count. For training counting models, we create noisy density maps directly from the edge density of the images, which are then improved through rectifier networks. There are separate rectifier networks for the dense and sparse categories, trained in an unsupervised fashion. The proposed counting model is composed of a self-supervised backbone feature network and a regressor head. The ground-truth density maps for training the regressor are generated using the binary labels and the rectifier networks. Experiments show that the proposed architecture achieves performance competitive with existing models at an extremely low annotation cost.
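A small sketch of how a noisy density prior could be built from edge density, as the abstract describes: compute an edge-magnitude map, smooth it, and normalize it to unit mass so it behaves like a density map. The Sobel/Gaussian choices and normalization are assumptions; the paper's exact construction and rectifier networks are not shown.

```python
# Sketch of an edge-density prior, assuming Sobel + Gaussian smoothing.
import numpy as np
from scipy import ndimage

def edge_density_map(gray, sigma=4.0):
    """Edge magnitude smoothed into a coarse density-like map."""
    gx = ndimage.sobel(gray, axis=1)     # horizontal gradient
    gy = ndimage.sobel(gray, axis=0)     # vertical gradient
    edges = np.hypot(gx, gy)             # edge magnitude
    density = ndimage.gaussian_filter(edges, sigma=sigma)
    return density / (density.sum() + 1e-8)  # normalize to unit mass

img = np.random.rand(240, 320)  # stand-in for a grayscale crowd image
dmap = edge_density_map(img)
print(dmap.shape, dmap.sum())   # (240, 320), ~1.0
```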