ISBN: (Print) 9798400716256
With the advent of smartphones and social media, images have become a popular medium for sharing information. As a result, understanding the perceived quality of images has gained importance. In recent years, No-Reference Image Quality Assessment (NR-IQA) has gained prominence due to the abundance of User Generated Content (UGC) being consumed. Applications such as photo capture assistance, suggestion of the best images for sharing, creation of stories from a user's media gallery, and enhancement of images for the best user experience all require NR-IQA. This paper presents a novel Degradation Aware Multi-Scale NR-IQA technique that leverages the multi-scale nature of feature extraction in the Human Visual System (HVS) and aims to understand the perceived quality of images across various distortions. This is achieved by learning two auxiliary tasks: (i) identifying the distortions present in images; (ii) learning to rank images across distortions. Additionally, a unique data generation strategy is introduced to generate realistic distortions for learning cross-distortion ranking of image quality. The proposed approach achieves state-of-the-art (SOTA) performance on multiple IQA datasets and demonstrates generalization in cross-dataset testing. The results showcase the efficacy of the proposed method in identifying and ranking the quality of images across various distortions, making it a promising approach for NR-IQA in real-world applications.
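The abstract does not say which loss drives the "learning to rank" auxiliary task. As a minimal sketch, assuming a standard pairwise hinge (margin) ranking loss, the idea of ranking two images by predicted quality can be illustrated as follows. The function names and the `margin` parameter are hypothetical and are not taken from the paper.

```python
def pairwise_ranking_loss(score_better, score_worse, margin=1.0):
    """Hinge loss on one image pair (an assumed loss, not the paper's).

    The loss is zero once the higher-quality image outscores the
    lower-quality one by at least `margin`.
    """
    return max(0.0, margin - (score_better - score_worse))


def batch_ranking_loss(pairs, margin=1.0):
    """Mean hinge loss over (score_better, score_worse) pairs."""
    if not pairs:
        raise ValueError("pairs must be non-empty")
    total = sum(pairwise_ranking_loss(b, w, margin) for b, w in pairs)
    return total / len(pairs)
```

A correctly ordered pair with a large enough gap contributes nothing to the loss, while a mis-ordered pair is penalized in proportion to how wrong the ordering is.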
Reconstructing images using brain signals of imagined visuals may provide an augmented vision to the disabled, leading to the advancement of Brain-computer Interface (BCI) technology. The recent progress in deep learn...
Sign Language is a medium of communication in the Deaf and Hard of Hearing community (DHH community). According to WHO, there are approximately 63 million people in India, including 3.3 million from the state of Karna...
Text extraction from scene images has started gaining a lot of traction in recent years in the computer vision field as its applications are manifold. One of its sub-categories is scene text detection. Factors like com...
ISBN: (Digital) 9781510650459
ISBN: (Print) 9781510650459; 9781510650442
Lately, the use of firearms has become one of the most common illegal activities. In such dangerous situations, there is a dire need for preventive measures that can automatically detect such weapons. This paper presents the use of computer vision and deep learning to detect weapons like guns, revolvers and pistols. Convolutional Neural Networks can be efficiently used for object detection. Specifically, two Convolutional Neural Network (CNN) architectures, Faster R-CNN with VGG16 and YOLOv3, are used to detect such weapons. The pre-trained neural networks were fed with images of guns from the Internet Movie Firearms Database (IMFDB), which is a benchmark gun database. For negative case images, the MS COCO dataset was used. The goal of this paper is to present and compare the performance of the two models for gun detection in any given scenario. The results show that YOLOv3 outperforms Faster R-CNN with VGG16. The ultimate aim of this paper is to detect guns in an image accurately, which in turn can aid crime investigation.
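The abstract does not state how the two detectors were scored. Detector comparisons of this kind commonly rely on Intersection over Union (IoU) between a predicted and a ground-truth box, so the following is a minimal sketch under that assumption. The `(x1, y1, x2, y2)` box layout and the function name are illustrative choices, not details from the paper.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2
    (an assumed layout for this sketch).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

A detection is typically counted as correct when its IoU with a ground-truth box exceeds a chosen threshold such as 0.5.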
ISBN: (Print) 9798400716256
Car insurance claims are rising in tandem with the growing number of car users. Every insurance claim requires an engineer’s manual assessment and a surveyor’s physical examination. This procedure can last anywhere from a few days to several weeks. Current deep-learning techniques have paved the way for automating it. Both the business and the client would benefit from a comprehensive system. To assess the damage cost for the insurance claim process, the make and model of the car, the damaged parts, the damage type, and the severity of the damage are important parameters. We introduce two datasets. The first is a car make and model (CMM) dataset containing images of the 23 most popular car makes and 148 vehicle models available in the Indian automotive market. The second dataset consists of 11,380 annotated images of different types of car damage, collected from insurance offices and web resources. In addition, it provides an assessment of the damage to each part, the severity of the damage (dents, scratches, bent, broken, cracks, smashed, punched, and pushed), and the location of the damage (front, back, side). The assessment helps estimate the damage cost when combined with structured data. The proposed CDA_YOLOv5 car damage assessment framework outperformed existing state-of-the-art one-stage models with an average per-class accuracy of 87.36% and an overall accuracy of 90.45%.
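The abstract reports both an average per-class accuracy and an overall accuracy without defining them. As a small sketch, assuming the usual definitions (overall accuracy is the fraction of all samples classified correctly, and average per-class accuracy is the unweighted mean of each class's accuracy), the two can differ when classes are imbalanced. The function names and the toy labels in the usage note are illustrative only.

```python
from collections import defaultdict


def overall_accuracy(y_true, y_pred):
    """Fraction of all samples whose predicted label is correct."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def mean_per_class_accuracy(y_true, y_pred):
    """Unweighted mean of per-class accuracy (assumed definition)."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)
```

For example, with eight correctly classified "dent" samples and two "crack" samples of which one is misclassified, overall accuracy is 9/10 while average per-class accuracy is (1.0 + 0.5) / 2, which shows why a rare class pulls the second metric down more strongly.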
ISBN: (Print) 9798400716256
Lung cancer is a prevalent and life-threatening disease characterized by abnormal cell growth in lung tissue. Early detection and accurate classification of lung cancer are crucial for timely treatment and improved patient outcomes. Our work proposes a novel framework for accurate multi-class categorization of lung cancer using deep learning. The proposed framework uses a customized DenseNet-201 model, leveraging transfer learning, as the parent architecture, enhanced by a residual structure as the child architecture. The performance of our model was examined by conducting experiments on the LCS25000 dataset. The proposed framework achieved an accuracy of 95% on the test set, which indicates the model’s ability to classify different types of lung cancer from histopathology images. Our model also achieved strong results on the TCGA lung cancer dataset, which supports its generalization. These findings have important implications for improving diagnostic capabilities in pulmonary pathology and hold promise for enhancing patient care in the field of lung cancer diagnosis.
Image-to-image translation is a recent trend for transforming images from one domain to another domain using a generative adversarial network (GAN). The existing GAN models perform the training by only utilizing the inp...
ISBN: (Print) 9798400716256
Retinal blood vessels are the arteries and veins that supply blood to the human eye. Fundus images obtained through a fundus camera capture retinal information such as the macula, optic disc, cup, fovea, retinal blood vessels, and abnormalities. The retinal blood vessels are tiny and are usually measured in micrometers. It is difficult and time-consuming to study retinal vessels from a fundus image. Generally, a detailed vessel study requires costly equipment such as Optical Coherence Tomography (OCT), which is less commonly available in hospitals than a fundus camera. Blood vessels are connected throughout the body. Studying retinal blood vessel health can help ophthalmologists and doctors understand the overall health of blood vessels across the body non-invasively. Narrowing of blood vessels can make a person prone to many diseases, such as hypertension, cardiovascular disease, and stroke. A few methods aim to automate the task using Computer-Aided Diagnosis (CAD). This work proposes a novel and relatively simple algorithm for calculating the diameter of a given retinal blood vessel at a given distance from the optic disc center in a fundus image. The approach aims to compute the artery-to-vein diameter ratio (AVR) of retinal blood vessels at a given distance from the center of the optic disc. The proposed method has been tested on the LES-AV 2020 dataset and achieved an approximate AVR of 2:3, which is in line with the average healthy AVR in the literature. Therefore, the proposed method is accurate and can be extended to find the AVR in various clinical scenarios to detect and study any abnormalities.
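The abstract does not describe how individual vessel diameters are combined into the AVR. As a deliberately simplified sketch, assuming the ratio is taken between the mean artery diameter and the mean vein diameter measured at the same distance from the optic disc center, the final step might look like the following. This is not the paper's algorithm; the function name and the sample diameters in the usage note are hypothetical.

```python
def artery_to_vein_ratio(artery_diameters, vein_diameters):
    """Ratio of mean artery diameter to mean vein diameter.

    Both inputs are diameters in the same unit (for example
    micrometers), measured at the same distance from the optic
    disc center. Averaging is an assumption of this sketch.
    """
    if not artery_diameters or not vein_diameters:
        raise ValueError("need at least one artery and one vein diameter")
    mean_artery = sum(artery_diameters) / len(artery_diameters)
    mean_vein = sum(vein_diameters) / len(vein_diameters)
    if mean_vein <= 0:
        raise ValueError("vein diameters must be positive")
    return mean_artery / mean_vein
```

With hypothetical artery diameters of 80 and 84 and vein diameters of 120 and 126, the means are 82 and 123, giving a ratio of 2/3, which matches the 2:3 value quoted in the abstract.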