ISBN (digital): 9798350368741
ISBN (print): 9798350368758
In computer vision, handling multiple tasks simultaneously remains a challenge that calls for new solutions. To better address multiple vision problems, we introduce SeTano, an integrated Graph Neural Network (GNN)-based framework. The framework comprises a Dynamic Edge-Sensing GNN (DES-GNN) backbone, which dynamically adjusts edges to extract more pivotal features, and a downstream design that includes node reduction and a separate query selection strategy. To validate our approach, we perform multi-task experiments on the ImageNet and MS COCO datasets. The results indicate that the integrated design of SeTano improves performance across multiple vision tasks.
As an active research topic in computer vision, blind image quality assessment (BIQA) can provide high-quality images for end-users and promote the development of other fields of computer vision. Although existing BIQA methods based on convolutional neural networks have made significant progress in evaluating synthetic distortion, they still do not extend well to authentic distortion and algorithm-related distortion. Therefore, this paper proposes a BIQA adversarial network with local feature enhancement to address this challenge. First, a ResNeSt50 network with local feature enhancement is used to extract image features, effectively combining the overall semantic information and the local features of the images. Then, the mapping from distorted images to their quality scores is learned by the adversarial network. Extensive experiments demonstrate that the proposed method performs best across nine databases covering three categories of distortion scenarios, compared with state-of-the-art BIQA methods.
Saliency methods are critical tools for estimating which features of an input image contribute most to a network’s prediction. These tools are pivotal in high-stakes applications such as medical diagnosis or autonomous driving. Additionally, they can help identify model biases, such as a strong prior on object placement, easily distinguishable background features, or frequent object co-occurrence. We introduce RankPix, a novel saliency method for visual bias identification in image classification tasks. RankPix is a derivative-free approach that identifies a minimum subset of pixels/features at a given network layer that changes the output of a classifier. Surprisingly, this approach provides performance equivalent to gradient-based approaches on the standard pointing game benchmark. More interestingly, RankPix outperforms traditional approaches for systematic bias identification.
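For readers unfamiliar with the pointing game benchmark mentioned above, the following is a minimal sketch of one common simplified variant: a saliency map scores a hit when its maximum falls inside the ground-truth bounding box, and accuracy is the fraction of hits. The function names, the inclusive box format, and the absence of a pixel tolerance are assumptions made for illustration; this is not RankPix and not necessarily the exact protocol the authors used.

```python
def pointing_game_hit(saliency, box):
    """Return True if the saliency maximum lies inside the box.

    saliency: 2D list of floats, indexed as saliency[y][x].
    box: (x_min, y_min, x_max, y_max), inclusive (assumed format).
    """
    best_value = None
    best_x, best_y = 0, 0
    for y, row in enumerate(saliency):
        for x, value in enumerate(row):
            if best_value is None or value > best_value:
                best_value, best_x, best_y = value, x, y
    x_min, y_min, x_max, y_max = box
    return x_min <= best_x <= x_max and y_min <= best_y <= y_max


def pointing_game_accuracy(samples):
    """samples: list of (saliency, box) pairs; returns hits / total."""
    if not samples:
        raise ValueError("at least one sample is required")
    hits = sum(1 for saliency, box in samples if pointing_game_hit(saliency, box))
    return hits / len(samples)
```

Because the metric only needs the location of the maximum, it applies equally to gradient-based and derivative-free saliency maps.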
Medical image segmentation is a critical component in a variety of clinical applications, facilitating accurate diagnosis and treatment planning. The Segment Anything Model (SAM), a deep learning architecture, has emerged as a promising solution to the challenges inherent in medical image segmentation. SAM’s superior zero-shot capability allows it to generalize effectively, even in the absence of task-specific segmentation samples. This unique characteristic broadens its application potential across various medical image modalities. This paper provides an in-depth review of SAM, focusing on its application in medical image segmentation. The review discusses the advantages of deep learning image segmentation over traditional methods, emphasizing the superior accuracy, efficiency, and automation that deep learning models offer. The paper also highlights the applications of SAM across various medical imaging modalities, demonstrating its versatility and adaptability. A taxonomy of SAM approaches in medical image segmentation is presented, categorizing them based on modality, dimension, organ, dataset, prompt, and performance. Despite the promising results of SAM, challenges remain in the field of medical image segmentation. The paper identifies these challenges and suggests potential directions for future research. In conclusion, this review aims to provide a comprehensive understanding of SAM and its potential to revolutionize medical image analysis and contribute to advancements in healthcare.
Canny edge detection based on optimization algorithm has the advantages of high accuracy and high signal-to-noise ratio. However, because the Canny algorithm was based on gray image, some edge information will be lost...
ISBN (digital): 9798331518592
ISBN (print): 9798331518608
Drowsy driving is one of the major causes of car accidents, leading to a large number of injuries and deaths every year. This paper presents a comprehensive study on developing an AI-based drowsiness detection system consisting of three sub-systems: eye state detection, yawn detection, and head tilt detection. The first approach uses a CNN and an augmented image dataset to detect the eye state and yawning, achieving an accuracy of around 91% for eye detection and around 75% for yawn detection. The second approach uses a face mesh, which extracts facial landmarks and identifies 468 points on the face. The mouth aspect ratio (MAR) and eye aspect ratio (EAR) can be calculated from these landmarks, as can the head tilt angle. Finally, all the sub-systems are integrated into a single system with a GUI for easier use, and run multi-threaded in parallel to avoid a single point of failure.
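The landmark-based measures can be sketched as follows. The abstract does not give the formulas, so this assumes the widely used six-point EAR definition, a simple height-over-width MAR, and a head tilt (roll) angle taken from the line joining the two eye centers. The landmark ordering and all function names are illustrative assumptions, not the authors' implementation.

```python
import math


def _dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def eye_aspect_ratio(eye):
    """EAR from six (x, y) landmarks ordered p1..p6 (assumed ordering):
    p1/p4 are the horizontal eye corners, p2/p6 and p3/p5 are the
    upper/lower eyelid pairs."""
    p1, p2, p3, p4, p5, p6 = eye
    return (_dist(p2, p6) + _dist(p3, p5)) / (2.0 * _dist(p1, p4))


def mouth_aspect_ratio(left, right, top, bottom):
    """MAR as vertical mouth opening over mouth width (assumed definition)."""
    return _dist(top, bottom) / _dist(left, right)


def head_tilt_degrees(first_eye_center, second_eye_center):
    """Roll angle of the line between the eye centers, in degrees."""
    dx = second_eye_center[0] - first_eye_center[0]
    dy = second_eye_center[1] - first_eye_center[1]
    return math.degrees(math.atan2(dy, dx))
```

Under these definitions, EAR drops toward zero as the eyelids close and MAR rises as the mouth opens, so a drowsiness detector would typically compare each against a threshold held over several frames; the thresholds themselves would need to be tuned on data.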
Road safety for automated vehicles requires accurate and early detection of stationary objects in the vehicle’s path. Radar can use Doppler to effectively identify stationary objects, and can do so at long range, in severe weather, and in poor light conditions. In this paper, we propose a radar-based stationary object detection system that combines signal processing techniques with machine learning to detect stationary in-path objects from the low-level spectra of front-looking radars. The proposed system consists of novel signal and image processing methods that extract key features from the raw data, which are fed into a long short-term memory (LSTM) network to determine the probability of a stationary in-lane object at each range. Experiments with data collected in controlled and uncontrolled scenarios demonstrate the effectiveness of our approach.
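The Doppler cue for stationary objects follows from simple geometry: for a vehicle moving straight ahead at speed v, a stationary target at azimuth angle θ from the direction of travel appears to close at v·cos θ. The sketch below illustrates only that textbook relation. The sign convention (negative radial velocity means approaching), the tolerance value, and the function names are assumptions for illustration; the paper's actual pipeline works on low-level spectra with an LSTM and is not reproduced here.

```python
import math


def expected_stationary_radial_velocity(ego_speed_mps, azimuth_rad):
    """Radial velocity a stationary target should show, in m/s.

    Assumes straight-line ego motion and a convention where
    approaching targets have negative radial velocity.
    """
    return -ego_speed_mps * math.cos(azimuth_rad)


def looks_stationary(measured_radial_mps, ego_speed_mps, azimuth_rad,
                     tolerance_mps=0.5):
    """True if the measured Doppler velocity matches a stationary target
    within the (assumed) tolerance."""
    expected = expected_stationary_radial_velocity(ego_speed_mps, azimuth_rad)
    return abs(measured_radial_mps - expected) <= tolerance_mps
```

A target whose measured radial velocity departs from this expected value is moving relative to the ground, which is why Doppler separates stationary from moving objects without needing an image.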
Recently, our group has been developing novel semiconductor photodetectors and image sensors based on silicon-on-insulator (SOI) substrate. The interface coupled photodetector (ICPD) utilizes the interface coupling ef...
ISBN (print): 9781665426053
In this paper, a new 3D implementation of the Saint-Marc-Chen-Medioni (SMCM) filter is proposed. It uses adaptive selection of the threshold k, found from a 2D projection of the input image. The algorithm has been tested on Computed Tomography (CT) images with Additive White Gaussian Noise (AWGN) at variance levels of 0.01 and 0.001. It yields better filtered-image quality in terms of Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index Measure (SSIM) than 3D Gaussian and 3D averaging filters. The proposed 3D SMCM filter is considered applicable to filtering other types of 3D images, such as Magnetic Resonance Imaging (MRI) scans.
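As a rough illustration of the evaluation setup described above, the following sketch adds AWGN of a given variance to a set of voxel intensities and scores the result with PSNR. It assumes intensities normalized to [0, 1] and flattened into a plain list; the function names and the fixed seed are illustrative choices. The SMCM filter itself and SSIM are not implemented here.

```python
import math
import random


def add_awgn(voxels, variance, seed=0):
    """Return a noisy copy of voxels with zero-mean Gaussian noise added.

    voxels: flat sequence of intensities, assumed normalized to [0, 1].
    No clipping is applied, so values may leave [0, 1].
    """
    rng = random.Random(seed)
    sigma = math.sqrt(variance)
    return [v + rng.gauss(0.0, sigma) for v in voxels]


def psnr(reference, test, peak=1.0):
    """Peak Signal-to-Noise Ratio in dB: 10 * log10(peak^2 / MSE)."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)
```

Since the mean squared error of unclipped AWGN is approximately its variance, the unfiltered noisy volumes at variances 0.01 and 0.001 sit near 20 dB and 30 dB respectively; a denoising filter is judged by how far it raises PSNR above those baselines.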
ISBN (digital): 9798350366556
ISBN (print): 9798350366563
Photothermal power generation is a promising technique for converting solar radiation into electricity with high efficiency and stability. However, the performance and maintenance of photothermal power plants depend on the cleanliness and reflectivity of the heliostats. This paper introduces an innovative approach to addressing the challenges of dirt detection and segmentation on heliostats. Leveraging the capabilities of deep learning, we propose the Multi-Scale Heliostat Dirt Segmentation and Classification (MSHDSC) framework, integrating a novel multi-scale feature fusion module (MSFFM) with an enhanced DeepLabV3+ network. This framework effectively segments small dirt areas on heliostat images, facilitating precise cleaning strategies. A unique aspect of our work is the introduction of an unsupervised clustering algorithm post-segmentation, which categorizes dirt based on color and texture, assigning a severity score to each category. This categorization assists in determining the cleaning complexity and prioritizing maintenance efforts. Experimental results show that our method outperforms several state-of-the-art image segmentation models in terms of accuracy and efficiency and provides useful information for targeted and prioritized cleaning of heliostats by robots or drones.