ISBN: (print) 9781665448994
Image deblurring and super-resolution (SR) are computer vision tasks that aim to restore image detail and spatial scale, respectively. Only a few recent works address the joint task, as conventional methods deal with SR and deblurring separately. To address this gap, we design a novel Pixel-Guided dual-branch attention network (PDAN) that handles both tasks jointly. We also propose a novel loss function that focuses better on large- and medium-range errors. Extensive experiments demonstrate that the proposed PDAN with the novel loss function not only generates remarkably clear HR images but also achieves compelling results on joint image deblurring and SR. In addition, our method achieved second place in Track 1 of the Image Deblurring Challenge at the NTIRE 2021 Challenge.
ISBN: (print) 9781728193601
Few-shot learning is an important research topic in image classification. It aims to train robust classifiers that categorize images from new classes for which only a few labeled samples are available. Recently, metric-learning-based methods have achieved promising performance; in these methods, a distance metric is learned to compare query images directly against training samples. In this work, we consider finer-grained information from image feature maps and propose a new approach. Specifically, we develop a Relative Position Network (RPN), based on the attention mechanism, that compares pairs of activation cells from each query image and training image and captures their intrinsic correspondences. Moreover, we introduce a Relative Map Network (RMN) that learns a distance metric from the attention maps produced by the RPN, which better measures the similarity between query and training images. Extensive experiments demonstrate the effectiveness of the proposed method. Our code will be released at https://***/chrisyxue/RMN-RPN-for-FSL.
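The idea of comparing every pair of activation cells from a query feature map and a training (support) feature map can be illustrated with generic dot-product attention. This is a minimal sketch and not the RPN described in the abstract: the function name `cross_attention_map`, the flattened list-of-vectors input format, and the use of an unscaled dot product followed by a softmax are all assumptions made for illustration.

```python
import math


def cross_attention_map(query_cells, support_cells):
    """Compare every query cell with every support cell.

    Each argument is a list of equal-length feature vectors, one per
    activation cell of a flattened feature map (hypothetical layout).
    Row i of the result holds softmax-normalised dot-product similarities
    between query cell i and all support cells.
    """
    attention = []
    for q in query_cells:
        # Raw similarity of this query cell to each support cell.
        scores = [sum(a * b for a, b in zip(q, s)) for s in support_cells]
        # Subtract the maximum before exponentiating for numerical stability.
        top = max(scores)
        exps = [math.exp(x - top) for x in scores]
        total = sum(exps)
        attention.append([e / total for e in exps])
    return attention
```

Each row of the returned map sums to 1, so it can be read as how strongly one query cell corresponds to each support cell; a learned metric such as the RMN would then take maps of this kind as input.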
ISBN: (digital) 9781665487399
ISBN: (print) 9781665487399
This paper describes the third Affective Behavior Analysis in-the-wild (ABAW) Competition, held in conjunction with the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), 2022. The 3rd ABAW Competition is a continuation of the Competitions held at the ICCV 2021, IEEE FG 2020 and IEEE CVPR 2017 conferences, and aims at automatically analyzing affect. This year the Competition encompasses four Challenges: i) uni-task Valence-Arousal Estimation, ii) uni-task Expression Classification, iii) uni-task Action Unit Detection, and iv) Multi-Task Learning. All the Challenges are based on a common benchmark database, Aff-Wild2, which is a large-scale in-the-wild database and the first to be annotated in terms of valence-arousal, expressions and action units. In this paper, we present the four Challenges and the Competition corpora used, outline the evaluation metrics, and present both the baseline systems and the top-performing teams per Challenge. Finally, we report the results obtained by the baseline systems and by all participating teams.
ISBN: (print) 9780769549903
The use of 3D technologies to represent elements and interact with them is an open and interesting research area. In this article we discuss a novel human-computer interaction method that integrates mobile computing and 3D visualization techniques, with applications to free-viewpoint visualization and 3D rendering for interactive and realistic environments. The approach focuses on augmented reality and home entertainment, and it was developed and tested on mobile devices, particularly tablet computers. Finally, we present a mechanism for evaluating the accuracy of this interaction system.
ISBN: (digital) 9781538661000
ISBN: (print) 9781538661000
With the recent advances of Convolutional Neural Networks (CNNs) in computer vision, there has been rapid progress in extracting roads and other features from satellite imagery for mapping and other purposes. In this paper, we propose a new method for road extraction using stacked U-Nets with multiple outputs. A hybrid loss function is used to address the problem of unbalanced classes in the training data. Post-processing methods, including road map vectorization and shortest-path search with hierarchical thresholds, help improve recall. The overall improvement in mean IoU compared with the vanilla VGG network is more than 20%.
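For readers unfamiliar with the metric, intersection over union (IoU) for a binary road mask can be computed as below. This is a minimal sketch under stated assumptions: the abstract does not say how the mean is taken, so averaging per-image IoU of the road class is an assumption here, as are the function names `iou` and `mean_iou`, the nested-list mask format, and the convention that two empty masks score 1.0.

```python
def iou(pred, target):
    """IoU of two binary masks given as nested lists of 0/1 values."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


def mean_iou(preds, targets):
    """Average the per-image IoU over a non-empty list of mask pairs."""
    scores = [iou(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)
```

Because road pixels are a small fraction of most satellite tiles, IoU on the road class is far more sensitive to missed or spurious road segments than plain pixel accuracy, which is why class imbalance matters for both the loss and the metric.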
ISBN: (print) 9781665448994
When creating a new labeled dataset, human analysts or data reductionists must review and annotate large numbers of images. This process is time-consuming and a barrier to the deployment of new computer vision solutions, particularly for rarely occurring objects. To reduce the number of images requiring human attention, we evaluate the utility of images created from 3D models and refined with a generative adversarial network for selecting confidence thresholds that significantly reduce false alarm rates. The resulting approach has been demonstrated to cut the number of images that need to be reviewed by 50% while preserving a 95% recall rate, with only 6 labeled examples of the target.
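One simple way to pick a confidence threshold for a target recall is to rank the detector scores of known positives (for instance, the synthetic images) and keep the highest threshold that still retains the required fraction of them. The sketch below illustrates that idea only; it is not the authors' procedure, and the names `threshold_for_recall` and `review_fraction` and the score lists are hypothetical.

```python
import math


def threshold_for_recall(positive_scores, target_recall=0.95):
    """Highest threshold t such that at least target_recall of the
    positive scores satisfy score >= t."""
    if not positive_scores:
        raise ValueError("need at least one positive score")
    ranked = sorted(positive_scores, reverse=True)
    # Number of positives that must stay above the threshold; the small
    # epsilon guards against floating-point round-up in the product.
    keep = max(1, math.ceil(target_recall * len(ranked) - 1e-9))
    return ranked[keep - 1]


def review_fraction(all_scores, threshold):
    """Fraction of images whose score passes the threshold and would
    therefore still be sent for human review."""
    return sum(score >= threshold for score in all_scores) / len(all_scores)
```

Raising the threshold lowers `review_fraction` (fewer images to inspect) at the cost of recall, so the chosen value is the trade-off point between reviewer workload and missed targets.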
ISBN: (print) 9780769549903
In this paper we present and begin to analyze the iCubWorld data-set, an object recognition data-set that we acquired using a Human-Robot Interaction (HRI) scheme and the iCub humanoid robot platform. Our setup allows for rapid acquisition and annotation of data with corresponding ground truth. Although the data-set is more constrained in scope (the iCub world is essentially a robotics research lab), we demonstrate how it poses challenges to current recognition systems. The iCubWorld data-set is publicly available (1).
ISBN: (print) 9780769549903
In this paper we present a Flash game that aims to make it easy to generate ground truth for testing object detection algorithms. Flash the Fish is an online game in which the user is shown videos from underwater environments and has to take photos of fish by clicking on them. The initial ground truth is provided by object detection algorithms; subsequently, cluster analysis and user evaluation techniques allow ground truth to be generated from a weighted combination of these "photos". Evaluation of the platform and comparison of the obtained results against hand-drawn ground truth confirmed that reliable ground truth generation is not necessarily a cumbersome task in terms of either effort or time.
ISBN: (digital) 9781538661000
ISBN: (print) 9781538661000
In this paper, we study deep transfer learning as a way of overcoming object recognition challenges encountered in the field of digital pathology. Through several experiments, we investigate various uses of pre-trained neural network architectures and different schemes for combining them with random forests for feature selection. Our experiments on eight classification datasets show that densely connected and residual networks consistently yield the best performance across strategies. Network fine-tuning and the use of inner-layer features also appear to be the best-performing strategies, with the former yielding slightly superior results.
ISBN: (print) 9780769549903
We have been researching three-dimensional (3D) ground-truth systems for the performance evaluation of vision and perception systems in the fields of smart manufacturing and robot safety. In this paper we first present an overview of the different systems that have been used to provide ground-truth (GT) measurements, and we then discuss the advantages of physically sensed ground-truth systems for our applications. Next, we discuss in detail the three ground-truth systems that we have used in our experiments: ultra-wideband, indoor GPS, and a camera-based motion capture system. Finally, we discuss three different perception-evaluation experiments in which we have used these GT systems.