Diabetic retinopathy is a degenerative ocular condition resulting from elevated levels of glucose in the bloodstream, which, if not promptly identified, may lead to vision loss. Figure 1 displays the primary component...
In modern medical imaging, analyses must be carried out rapidly because early detection of brain tumours is very important. The tumour could be plainly visible in the neurological magnetic resonance imaging stud...
Localization and detection are vital tasks in emergency rescue operations. Devastating natural disasters can create environments that are inaccessible or dangerous for human rescuers. Contaminated areas or buildings i...
ISBN (print): 9781665405409
Highly realistic human faces generated by generative adversarial networks (GANs) are visually difficult to distinguish from real ones. They have been used as profile images for fake social media accounts, which has a serious negative social impact. In this work, we show that GAN-generated faces can be exposed via irregular pupil shapes. This phenomenon is caused by the lack of physiological constraints in the GAN models. We demonstrate that such artifacts exist widely in high-quality GAN-generated faces. We design an automatic method to segment the pupils from the eyes and analyze their shapes to distinguish GAN-generated faces from real ones. Qualitative and quantitative evaluations of our method on the Flickr-Faces-HQ dataset and a StyleGAN2-generated face dataset demonstrate the effectiveness and simplicity of our method.
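The abstract above does not say how pupil shapes are scored, so the following is only a minimal sketch of one plausible regularity measure, assuming the pupil has already been segmented into a binary mask. It compares the mask with the ellipse that has the same centroid and second moments and reports their intersection over union (IoU). The function name `ellipse_iou` and the mask layout (a list of rows of 0/1 values) are hypothetical and are not taken from the paper.

```python
def ellipse_iou(mask):
    """IoU between a binary mask and the ellipse with matching second moments.

    Illustrative regularity score only (an assumption, not the paper's method):
    a near-elliptical region scores close to 1, an irregular one scores lower.
    """
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    n = len(pts)
    if n == 0:
        return 0.0

    # Centroid and covariance of the foreground pixels.
    cx = sum(x for x, _ in pts) / n
    cy = sum(y for _, y in pts) / n
    sxx = sum((x - cx) ** 2 for x, _ in pts) / n
    syy = sum((y - cy) ** 2 for _, y in pts) / n
    sxy = sum((x - cx) * (y - cy) for x, y in pts) / n
    det = sxx * syy - sxy * sxy
    if det <= 0:
        return 0.0  # degenerate region (for example a single row of pixels)

    inter = 0
    union = 0
    for y, row in enumerate(mask):
        for x, v in enumerate(row):
            dx, dy = x - cx, y - cy
            # Squared Mahalanobis distance under the mask covariance.
            d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
            # A uniformly filled ellipse has variance a^2/4 along each axis,
            # so its boundary lies at squared Mahalanobis distance 4.
            inside = d2 <= 4.0
            if v and inside:
                inter += 1
            if v or inside:
                union += 1
    return inter / union


# Usage: a filled disc should score higher than an L-shaped region.
disc = [[1 if (x - 20) ** 2 + (y - 20) ** 2 <= 100 else 0 for x in range(41)]
        for y in range(41)]
print(ellipse_iou(disc))
```

A decision rule built on such a score would need a threshold chosen on validation data; no threshold is given here because the abstract does not report one.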
ISBN (digital): 9781665496209
ISBN (print): 9781665496209
Existing end-to-end image compression mainly concentrates on the coding of natural scene images, and few works have been dedicated to the end-to-end compression of screen content. In this paper, we propose an end-to-end compression scheme for screen content images inspired by the idea of transform skip, with the goal of improving the compression performance for screen content images. In particular, the proposed model takes full account of the characteristics of screen content and incorporates transform skip branches into the analysis and synthesis processes. The transform skip branch is equipped with coarse feature extraction and reconstruction. As such, the visual signals can be interpreted more compactly at the encoder side and recovered at the decoder side. Experimental results show that the proposed method outperforms the existing hyperprior-based model for screen content compression, achieving 10.16% BD-Rate savings in the high bit-rate coding scenario and 5.38% BD-Rate savings in the low bit-rate coding scenario.
This article provides a comprehensive examination of the roles played by Central Processing Units (CPUs) and Field-Programmable Gate Arrays (FPGAs) in the realms of Artificial Intelligence (AI) and Machine Learning ...
ISBN (print): 9781665450850
Retinal vessel segmentation can improve the judgment ability of intelligent disease diagnosis systems. Although a large number of retinal vessel segmentation models have been proposed with the development of deep learning, it is still a challenging task. In this work, we propose a new retinal vessel segmentation network based on spatial-temporal and self-attention encoding modules, called STSANet, which can significantly improve the performance and robustness of segmentation. The spatial-temporal information of fundus images is extracted by a Spatial-Temporal encoding module in STSANet. In addition, the internal correlation of features is captured by the Self-Attention module. By fusing spatial-temporal and self-attention features, the final result contains both spatial-temporal information and internal feature information of fundus images. The experimental results indicate that STSANet outperforms other state-of-the-art retinal segmentation models on the published standard datasets.
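The abstract does not describe the internals of the Self-Attention module, so the sketch below shows only generic scaled dot-product self-attention over a small set of feature vectors, as one common way of capturing correlations between features. It assumes identity query, key and value projections for brevity, whereas a trained module would normally learn those projections. The names `softmax` and `self_attention` are illustrative and are not taken from STSANet.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Scaled dot-product self-attention with identity projections.

    features: list of n vectors, each a list of d floats.
    Returns n vectors of length d; each output is a weighted average of all
    input vectors, weighted by similarity to the corresponding query vector.
    """
    d = len(features[0])
    scale = 1.0 / math.sqrt(d)
    outputs = []
    for query in features:
        # Similarity of this query to every feature vector (the keys).
        scores = [scale * sum(q * k for q, k in zip(query, key))
                  for key in features]
        weights = softmax(scores)
        # Weighted sum of the feature vectors (the values).
        outputs.append([sum(w * value[i] for w, value in zip(weights, features))
                        for i in range(d)])
    return outputs


# Usage: three 2-dimensional feature vectors.
print(self_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]))
```

Because the attention weights are non-negative and sum to one, every output coordinate stays within the range spanned by the inputs in that coordinate.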
ISBN (print): 9781665405409
Benchmark datasets for visual recognition assume that data is uniformly distributed, while real-world datasets follow a long-tailed distribution. Current approaches handle the long-tailed problem by transforming the long-tailed dataset to a uniform distribution through re-sampling or re-weighting strategies. These approaches emphasize the tail classes but ignore the hard examples in head classes, which results in performance degradation. In this paper, we propose a novel gradient harmonized mechanism with category-wise adaptive precision to decouple the difficulty and sample size imbalance in the long-tailed problem, which are correspondingly solved via intra- and inter-category balance strategies. Specifically, intra-category balance focuses on the hard examples in each category to optimize the decision boundary, while inter-category balance aims to correct the shift of the decision boundary by taking each category as a unit. Extensive experiments demonstrate that the proposed method consistently outperforms other approaches on all the datasets.
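For context, the re-weighting baseline that the abstract contrasts itself with can be sketched as inverse-frequency class weights. This is not the proposed gradient harmonized mechanism, whose details the abstract does not give. The function name `inverse_frequency_weights` and the normalisation (weights scaled so that the average per-sample weight is 1) are assumptions made for illustration.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Per-class loss weights proportional to 1 / class frequency.

    Uses w_c = N / (C * n_c), where N is the number of samples, C the number
    of classes present and n_c the number of samples in class c. With this
    scaling the weights summed over all samples equal N.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * n_c) for c, n_c in counts.items()}


# Usage: a head class with 8 samples and a tail class with 2 samples.
labels = [0] * 8 + [1] * 2
weights = inverse_frequency_weights(labels)
print(weights)
```

Such weights treat every sample in a class identically, which is the limitation the abstract points to: hard examples inside a head class receive the same small weight as easy ones.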
ISBN (print): 9783031530357
The proceedings contain 26 papers. The special focus in this conference is on Optimization, Learning Algorithms and Applications. The topics include: Pest Detection in Olive Groves Using YOLOv7 and YOLOv8 Models; Using LiDAR Data as Image for AI to Recognize Objects in the Mobile Robot Operational Environment; An Evaluation of Image Preprocessing in Skin Lesions Detection; An Artificial Intelligence-Based Method to Identify the Stage of Maturation in Olive Oil Mills; Vehicle Industry Big Data Analysis Using Clustering Approaches; Enhancing Forest Fire Detection and Monitoring Through Satellite Image Recognition: A Comparative Analysis of Classification Algorithms Using Sentinel-2 Data; An Efficient GPU Parallelization of the Jaya Optimization Algorithm and Its Application for Solving Large Systems of Nonlinear Equations; Multi-objective Optimal Sizing of an AC/DC Grid Connected Microgrid System; Sub-system Integration and Health Dashboard for Autonomous Mobile Robots; Movement Pattern Recognition in Boxing Using Raw Inertial Measurements; Optimization Models for Hydrokinetic Energy Generated Downstream of Hydropower Plants; Deep Conditional Measure Quantization; Fault Classification of Wind Turbine: A Comparison of Hyperparameter Optimization Methods; Assessing the Reliability of AI-Based Angle Detection for Shoulder and Elbow Rehabilitation; Deep Learning and Machine Learning Techniques Applied to Speaker Identification on Small Datasets; Performance of Heuristics for Classifying Leftovers from Cutting Stock Problem; Deep Learning-Based Classification and Quantification of Emulsion Droplets: A YOLOv7 Approach; Identification of Late Blight in Potato Leaves Using Image Processing and Machine Learning; Adaptive Convolutional Neural Network for Predicting Steering Angle and Acceleration on Autonomous Driving Scenario; Impact of EMG Signal Filters on Machine Learning Model Training: A Comparison with Clustering on Raw Signal; Assessing the 3D Position of a Car with a Single 2D Camera Usin
Author: Agudo, Antonio
Affiliation: UPC, Inst Robot & Informat Ind, CSIC, Barcelona, Spain
ISBN (print): 9781665405409
In this paper we propose a convex approach for recovering a detailed 3D volumetric geometry of several objects from visual signals. To this end, we first present a minimal detailed surface energy that is optimized together with a volume constraint by considering some geometrical priors, without requiring either additional training data or templates to constrain the solution. Our problem can be efficiently solved by means of gradient descent, and can be applied to single RGB images or monocular videos even with very small rigid motions. Temporally aware solutions driven by point correspondences are incorporated without assuming any 2D tracking data over time. Thanks to this formulation, both rigid and non-rigid objects can be considered. We have extensively validated our approach in a wide variety of scenarios in the wild, recovering challenging types of shapes that have not been previously attempted without assuming any training data.