In medical image analysis, the cost of acquiring high-quality data and annotation by experts is a barrier in many medical applications. Most of the techniques used are based on a supervised learning framework and requ...
Blind image inpainting is a crucial restoration task that does not require additional mask information to restore the corrupted regions. Yet, it remains a little-explored research area due to the difficulty in discrimina...
Multimodal sentiment analysis on images with textual content is a research area aiming to understand the sentiment conveyed by visual and textual elements in the images. While multimodal sentiment analysis on images a...
ISBN: 9781450398220 (print)
Image binarization usually plays a crucial role in the automatic analysis of degraded documents from their captured images. However, the task is often difficult for a number of reasons, including the high similarity between noisy background pixels and faded foreground pixels. The study presented here focuses on binarizing images of low-resource, degraded documents, using a recently collected set of image samples of several rare, ancient and severely degraded printed documents in Bangla, the second most popular script in India and the fifth most popular in the world. This new collection of degraded document images is referred to as 'ISIDDI2' and consists of 139 images of old Bangla document pages. Samples from 'ISIDDI', an existing database of degraded Bangla document images, have also been used in the present study. A novel deep architecture based on attention UNet++ with dilated convolutions is proposed for this binarization task. The model is optimized using a distance reciprocal distortion (DRD) loss, which reflects distortion as perceived by human vision. Since binarization ground truth is not available for the samples of either 'ISIDDI2' or 'ISIDDI', the proposed network has been trained on samples from the DIBCO and H-DIBCO datasets, and an unsupervised domain adaptation (DA) module is employed to adapt the architecture to the degradation patterns of the 'ISIDDI2' or 'ISIDDI' samples. The proposed binarization strategy includes a post-processing operation, based on a modified k-neighbourhood approach, for the recovery of broken characters. Results of extensive experimentation show that the proposed strategy improves on the binarization output of state-of-the-art methods on both the ISIDDI2 and ISIDDI datasets. Its performance on the well-known DIBCO samples is also satisfactory.
Learning-based methods have attracted a lot of research attention and led to significant improvements in low-light image enhancement. However, most of them still suffer from two main problems: expensive computational ...
Text-line segmentation is still considered challenging for complex background scene images. The success of text detection and recognition depends on the success of the text segmentation. This study presents a new meth...
Script identification of text in natural scene images is challenging due to complex backgrounds, arbitrary orientations, different-sized characters, varying fonts, and multiple styles. Most existing methods are not ef...
ISBN: 9798350359312 (digital)
ISBN: 9798350359329 (print)
In the past few decades, rapid growth in industrialization has brought a steady decline in air quality along with an increase in the concentration of PM2.5. It is well known that a high PM2.5 concentration adversely affects the environment and has a hazardous impact on public health. It is therefore important to monitor the PM2.5 concentration at geographic locations where air quality monitoring stations are presently unavailable, especially in remote areas. Unfortunately, installing such monitoring stations requires expensive instruments and constant maintenance. This paper presents a novel, low-cost and portable alternative to such measurement apparatus, in which the PM2.5 concentration is estimated from image input obtained from a camera. The novelty of the work lies in its attempt to capture information about PM2.5 content from the visibility degradation caused by the pollutant, supplemented by knowledge of its seasonal and diurnal variation. The latter plays a crucial role in preventing confounding effects that arise from other weather and atmospheric elements. Another important highlight is the use of a full-reference image metric as a feature, for which a powerful dehazing algorithm has been employed. The results obtained are extremely promising, providing a close-to-accurate estimate of PM2.5 concentration with R² values far higher than those reported in the literature. To summarize, the construction of a unique feature set, together with an appropriate machine learning algorithm, leads to an extremely reliable, stand-alone approach that can be deployed on a hand-held device such as a mobile phone, which is a significant contribution of the proposed approach.
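For readers unfamiliar with the R² figure quoted above, the following is a minimal sketch of how the coefficient of determination is commonly computed for a regression estimate. The function name `r_squared` and the sample values are hypothetical illustrations, not taken from the paper, and the abstract does not say which implementation the authors used.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after prediction.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: variance of the measurements around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical measured vs. estimated PM2.5 values, for illustration only.
measured = [35.0, 60.0, 80.0, 120.0]
estimated = [38.0, 57.0, 84.0, 115.0]
score = r_squared(measured, estimated)
```

A value of 1 means the estimates match the measurements exactly, while a value of 0 means they do no better than always predicting the mean.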
ISBN: 9798350359312 (digital)
ISBN: 9798350359329 (print)
Water degrades the quality of optical images captured underwater through its physical properties of absorption and scattering. This degradation is further aggravated by increasing water depth and the presence of contaminated water. Transformers in the vision domain have made a quantum leap in many vision tasks, such as detection and segmentation, but have yet to make comparable progress in enhancing degraded underwater images. We propose a transformer-based model named "Aquaformer", which makes four major contributions: an adaptive layer normalization, replacement of the masked cyclic shift with symmetric padding in window partitioning, a novel aggregation mechanism, and an adjustable fusion approach. Together these make the model very powerful, producing significantly better performance than the latest state-of-the-art methods. Testing on multiple benchmark datasets, with both quantitative and qualitative evaluation, establishes its superiority.
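The abstract gives no implementation details, so the following is only a rough sketch of what window partitioning with symmetric padding might look like, assuming a feature map of shape (height, width, channels) and non-overlapping square windows. The function name `partition_with_symmetric_padding` and its parameters are hypothetical and are not the Aquaformer code. It uses NumPy's `np.pad` with `mode="symmetric"`, which mirrors the border values.

```python
import numpy as np


def partition_with_symmetric_padding(x, window):
    """Split an (H, W, C) array into non-overlapping window x window tiles.

    H and W are first padded up to multiples of `window` by mirroring
    the border (symmetric padding), an assumed stand-in for the scheme
    the abstract mentions.
    """
    h, w, c = x.shape
    pad_h = (-h) % window  # rows needed to reach a multiple of `window`
    pad_w = (-w) % window  # columns needed to reach a multiple of `window`
    padded = np.pad(x, ((0, pad_h), (0, pad_w), (0, 0)), mode="symmetric")
    ph, pw = padded.shape[:2]
    # Reshape into a grid of tiles, then flatten the grid dimension.
    tiles = padded.reshape(ph // window, window, pw // window, window, c)
    return tiles.transpose(0, 2, 1, 3, 4).reshape(-1, window, window, c)


# A 5 x 6 single-channel map padded to 8 x 8 and split into 4 x 4 windows.
feature_map = np.arange(5 * 6, dtype=np.float32).reshape(5, 6, 1)
windows = partition_with_symmetric_padding(feature_map, window=4)
```

Mirroring the border keeps the padded pixels statistically similar to their neighbours, whereas a cyclic shift places pixels from opposite image edges in the same window and therefore needs an attention mask.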