In medical image analysis, the cost of acquiring high-quality data and annotation by experts is a barrier in many medical applications. Most of the techniques used are based on a supervised learning framework and requ...
Multimodal sentiment analysis on images with textual content is a research area aiming to understand the sentiment conveyed by visual and textual elements in the images. While multimodal sentiment analysis on images a...
ISBN (digital): 9798350359312
ISBN (print): 9798350359329
Over the past few decades, rapid industrialization has brought a steady decline in air quality and an increase in PM2.5 concentration. A high PM2.5 concentration is well known to harm the environment and public health. It is therefore important to monitor PM2.5 concentration at locations that currently lack air quality monitoring stations, especially in remote areas. Unfortunately, installing such stations requires expensive instruments and constant maintenance. This paper presents a novel, low-cost, portable alternative in which PM2.5 concentration is estimated from camera images. The novelty of the work lies in capturing information about PM2.5 content from the visibility degradation caused by the pollutant, supplemented by knowledge of its seasonal and diurnal variation. The latter plays a crucial role in preventing confounding effects from other weather and atmospheric elements. Another highlight is the use of a full-reference image metric as a feature, for which a powerful dehazing algorithm is employed. The results are promising: the estimates of PM2.5 concentration are close to accurate, with R² values far higher than those reported in the literature. In summary, a unique feature set combined with an appropriate machine learning algorithm leads to a reliable, stand-alone approach that can be deployed on a hand-held device such as a mobile phone, which is a significant contribution of the proposed approach.
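The abstract reports R² values but does not define the metric. Assuming it refers to the standard coefficient of determination, R² = 1 - SS_res / SS_tot, a minimal sketch of the computation is given here. The function name `r_squared` and the sample readings are hypothetical and are not taken from the paper.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the estimates.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the reference values around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all reference values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical PM2.5 readings in micrograms per cubic metre,
# invented purely for illustration (not data from the paper).
measured = [35.0, 50.0, 80.0, 120.0]
estimated = [38.0, 47.0, 85.0, 115.0]
score = r_squared(measured, estimated)
```

A value of 1 means the estimates match the station readings exactly, while a value of 0 means they do no better than always predicting the mean reading.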
ISBN (digital): 9798350359312
ISBN (print): 9798350359329
Water degrades the quality of optical images captured underwater through absorption and scattering. The degradation worsens with increasing water depth and in contaminated water. Vision transformers have made a quantum leap in tasks such as detection and segmentation but have yet to make progress in enhancing degraded underwater images. We propose a transformer-based model named “Aquaformer”, which makes four major contributions: an adaptive layer normalization, replacement of the masked cyclic shift with symmetric padding in window partitioning, a novel aggregation mechanism, and an adjustable fusion approach. Together these make the model powerful, producing significantly better performance than the latest state-of-the-art methods. Tests on multiple benchmark datasets, using both quantitative and qualitative metrics, establish its superiority.
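The abstract does not say how symmetric padding is applied in window partitioning. As one plausible reading, and not the authors' implementation, the sketch below mirrors the border of a small feature map (repeating the edge sample) and then splits the padded map into non-overlapping windows. All function names, the `pad` amount, and the toy 4×4 map are assumptions made for illustration.

```python
def symmetric_pad_1d(seq, pad):
    """Mirror `pad` items at each end, repeating the edge item."""
    if pad < 0 or pad > len(seq):
        raise ValueError("pad must be between 0 and len(seq)")
    left = seq[:pad][::-1]
    right = seq[len(seq) - pad:][::-1]
    return left + seq + right


def symmetric_pad_2d(grid, pad):
    """Apply symmetric padding along columns, then along rows."""
    rows = [symmetric_pad_1d(row, pad) for row in grid]
    return symmetric_pad_1d(rows, pad)


def partition_windows(grid, window):
    """Split a 2D grid into non-overlapping window x window blocks (row-major order)."""
    height, width = len(grid), len(grid[0])
    if height % window or width % window:
        raise ValueError("grid size must be divisible by the window size")
    return [
        [row[left:left + window] for row in grid[top:top + window]]
        for top in range(0, height, window)
        for left in range(0, width, window)
    ]


# Toy 4x4 feature map with values 0..15 (hypothetical).
feature_map = [[4 * r + c for c in range(4)] for r in range(4)]

# Pad by half a window (1), then partition with window size 2.
padded = symmetric_pad_2d(feature_map, pad=1)
windows = partition_windows(padded, window=2)
```

With a pad of half the window size, the windows of the padded map straddle the window boundaries of the unpadded map, so neighbouring regions can share a window without a cyclic shift or an attention mask. Border windows contain mirrored copies of real values instead of wrapped-around content.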
Estimating the degree of multiple personality traits in a single image is challenging due to the presence of multiple people, occlusion, poor quality etc. Unlike existing methods which focus on the classification of a...
Shaky and non-shaky videos are quite common in real-time applications such as surveillance and monitoring vehicles and human movements in protected areas. As a result, text detection in such videos is a formidable cha...
Air-writing refers to virtually writing linguistic characters through hand gestures in three-dimensional space with six degrees of freedom. This paper proposes a generic video camera-aided convolutional neural network...
Recognizing text extracted from multiple domains is complex and challenging because complexities vary from one domain to another. Most existing methods focus either on natural scene text or specific text type but not ...
Personality traits identification is challenging due to unpredictable changes in the foreground and background of images. In this work, we propose a new deep learning model for personality traits image classification ...
Early diagnosis of retinal diseases is crucial for preventing blindness. However, due to background variations and degradation in the images, retinal vessel segmentation has become challenging. As a result, accurate s...