ISBN: 9781665448994 (print)
Traditional empirical risk minimization (ERM) for semantic segmentation can disproportionately advantage or disadvantage certain target classes in favor of an (unfair but) improved overall performance. Inspired by the recently introduced tilted ERM (TERM), we propose tilted cross-entropy (TCE) loss and adapt it to the semantic segmentation setting to minimize performance disparity among target classes and promote fairness. Through quantitative and qualitative performance analyses, we demonstrate that the proposed Stochastic TCE for semantic segmentation can offer improved overall fairness by efficiently minimizing the performance disparity among the target classes of Cityscapes.
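As a rough illustration of the tilting idea, the sketch below applies a TERM-style log-mean-exp to per-class average cross-entropy losses. This is only one plausible reading of "tilted cross-entropy"; the function name, the choice to tilt at the class level, and the tilt parameter t are assumptions for illustration, not the paper's exact formulation. Larger t pushes the objective toward the worst-performing class, while t close to zero recovers the plain average over classes.

```python
import numpy as np

def tilted_cross_entropy(probs, labels, num_classes, t=1.0):
    """Hypothetical class-level tilted loss (illustrative only).

    probs:  (N, C) array of per-pixel softmax probabilities.
    labels: (N,) array of integer class ids.
    t:      tilt parameter (> 0); an assumed hyperparameter.
    """
    eps = 1e-12
    # Standard per-pixel cross-entropy.
    pixel_loss = -np.log(probs[np.arange(labels.size), labels] + eps)
    # Average loss per class, skipping classes absent from this batch.
    class_losses = np.array([
        pixel_loss[labels == c].mean()
        for c in range(num_classes)
        if np.any(labels == c)
    ])
    # TERM-style tilt: (1/t) * log(mean(exp(t * loss_c))),
    # computed with the max subtracted for numerical stability.
    scaled = t * class_losses
    m = scaled.max()
    return (m + np.log(np.mean(np.exp(scaled - m)))) / t
```

With two classes whose average losses differ, the tilted value lies between the mean and the maximum of the class losses, moving toward the maximum as t grows.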
ISBN: 9780769549903 (print)
Translation symmetry is one of the most important pattern characteristics in natural and man-made environments. Detecting translation symmetry is a grand challenge in computer vision, with a broad spectrum of real-world applications from industrial settings to design, arts, entertainment and education. This paper describes the algorithm we submitted to the Symmetry Detection Competition 2013. We introduce two new concepts in our symmetric repetitive pattern detection algorithm. The first is a bottom-up detection-inference approach, which extends the versatility of current detection methods to higher-level segmentation. The second is a framework for a new theoretical analysis of invariant repetitive patterns. Such analysis is crucial for symmetry/non-symmetry structure extraction but has received less coverage in the previous literature on pattern detection and classification.
ISBN: 9798350302493 (print)
Image anonymization is widely adopted in practice to comply with privacy regulations in many regions. However, anonymization often degrades the quality of the data, reducing its utility for computer vision development. In this paper, we investigate the impact of image anonymization on training computer vision models for key computer vision tasks (detection, instance segmentation, and pose estimation). Specifically, we benchmark the recognition drop on common detection datasets, where we evaluate both traditional and realistic anonymization for faces and full bodies. Our comprehensive experiments show that traditional image anonymization substantially impacts final model performance, particularly when anonymizing the full body. Furthermore, we find that realistic anonymization can mitigate this decrease in performance, with our experiments showing a minimal performance drop for face anonymization. Our study demonstrates that realistic anonymization can enable privacy-preserving computer vision development with minimal performance degradation across a range of important computer vision benchmarks.
ISBN: 9781665448994 (print)
We present a new state of the art on the text-to-video retrieval task on the MSRVTT and LSMDC benchmarks, where our model outperforms all previous solutions by a large margin. Moreover, state-of-the-art results are achieved using a single model and without finetuning. This multidomain generalisation is achieved by a proper combination of different video caption datasets. We show that training on several datasets with our practical approach can improve the test results on each of them. Additionally, we check the intersection between many popular datasets and show that MSRVTT, as well as ActivityNet, contains a significant overlap between its test and training parts. More details are available at https://***/papermsucode/mdmmt.
ISBN: 9781665448994 (print)
Climate change is a pressing issue that is currently affecting, and will continue to affect, every part of our lives. It is becoming vital that we, as a society, address the climate crisis as a universal effort, including those of us in the computer vision (CV) community. In this work, we analyze the total cost of CO2 emissions by breaking it into (1) the architecture creation cost and (2) the life-time evaluation cost. We show that, over time, these costs are non-negligible and are having a direct impact on our future. Importantly, we conduct an ethical analysis of how the CV community is unintentionally overlooking its own ethical AI principles by emitting this level of CO2. To address these concerns, we propose adding "enforcement" as a pillar of ethical AI and provide some recommendations for how architecture designers and the broader CV community can curb the climate crisis.
ISBN: 9781424439942 (print)
This paper addresses large-displacement-diffeomorphic mapping registration from an optimal control perspective. This viewpoint leads to two complementary formulations. One approach requires the explicit computation of coordinate maps, whereas the other is formulated strictly in the image domain (thus making it also applicable to manifolds which require multiple coordinate charts). We discuss their intrinsic relation as well as the advantages and disadvantages of the two approaches. Further, we propose a novel formulation for unbiased image registration, which naturally extends to the case of time series of images. We discuss numerical implementation details and carefully evaluate the properties of the alternative algorithms.
ISBN: 9781665487399 (digital)
ISBN: 9781665487399 (print)
We present a novel approach for accelerating convolutions during inference for CPU-based architectures. The most common method of computation involves packing the image into the columns of a matrix (im2col) and performing general matrix multiplication (GEMM) with a matrix of weights. This has two main drawbacks: (a) im2col requires a large memory buffer and can suffer from inefficient memory access, and (b) while GEMM is highly optimized for scientific matrix multiplication, it is not well suited to convolutions. We propose an approach that takes advantage of scalar-matrix multiplication and reduces memory overhead. Our experiments with commonly used network architectures demonstrate a significant speedup compared to existing indirect methods.
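To make the im2col + GEMM baseline concrete, here is a minimal NumPy sketch for a single-channel image, one kernel, stride 1 and no padding (all simplifying assumptions). The second function shows one generic way to write the same convolution as a sum of scalar-matrix products without the im2col buffer; it is included only for contrast and is not claimed to be the paper's algorithm. All function names are hypothetical.

```python
import numpy as np

def im2col(image, kh, kw):
    """Pack every kh x kw patch of a 2-D image into one column."""
    h, w = image.shape
    out_h, out_w = h - kh + 1, w - kw + 1
    # The buffer is kh*kw times larger than the output: drawback (a).
    cols = np.empty((kh * kw, out_h * out_w), dtype=image.dtype)
    for i in range(kh):
        for j in range(kw):
            # Row (i, j) holds image[y + i, x + j] for every output (y, x).
            cols[i * kw + j] = image[i:i + out_h, j:j + out_w].reshape(-1)
    return cols

def conv2d_im2col(image, kernel):
    """Convolution (CNN-style cross-correlation) as one matrix product."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    cols = im2col(image, kh, kw)
    return (kernel.reshape(1, -1) @ cols).reshape(out_h, out_w)

def conv2d_scalar_matrix(image, kernel):
    """Same result, accumulated as scalar * shifted-image products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=image.dtype)
    for i in range(kh):
        for j in range(kw):
            # One kernel weight times a shifted view; no im2col buffer.
            out += kernel[i, j] * image[i:i + out_h, j:j + out_w]
    return out
```

Both functions return the same output; the difference is that the first materializes a (kh*kw) x (out_h*out_w) matrix before multiplying, while the second only reads shifted views of the input.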
ISBN: 9781538661000 (digital)
ISBN: 9781538661000 (print)
Object recognition in satellite images is one of the most relevant and popular topics in pattern recognition. This has been facilitated by many factors, such as the large number of satellites with high-resolution imagery, the significant development of computer vision (especially the major breakthrough in convolutional neural networks), a wide range of industry verticals for its use, and a market that is still fairly empty. Roads are among the most popular objects for recognition. In this article, we present a combination of a neural network and a postprocessing algorithm that yields not only the coverage mask but also the vectors of all the individual roads present in the image, which can be used to address higher-level tasks in the future. This approach was used to solve the DeepGlobe Road Extraction Challenge.
ISBN: 9780769549903 (print)
In this paper we present a flash game that aims at easily generating ground truth for testing object detection algorithms. Flash the Fish is an online game where the user is shown videos from underwater environments and has to take photos of fish by clicking on them. The initial ground truth is provided by object detection algorithms; subsequently, cluster analysis and user evaluation techniques allow for the generation of ground truth based on the weighted combination of these "photos". Evaluation of the platform and comparison of the obtained results against a hand-drawn ground truth confirmed that reliable ground truth generation is not necessarily a cumbersome task, in terms of either effort or time needed.
ISBN: 9781665487399 (digital)
ISBN: 9781665487399 (print)
This paper describes the third Affective Behavior Analysis in-the-wild (ABAW) Competition, held in conjunction with the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), 2022. The 3rd ABAW Competition is a continuation of the Competitions held at the ICCV 2021, IEEE FG 2020 and IEEE CVPR 2017 conferences, and aims at automatically analyzing affect. This year the Competition encompasses four Challenges: i) uni-task Valence-Arousal Estimation, ii) uni-task Expression Classification, iii) uni-task Action Unit Detection, and iv) Multi-Task Learning. All the Challenges are based on a common benchmark database, Aff-Wild2, which is a large-scale in-the-wild database and the first one to be annotated in terms of valence-arousal, expressions and action units. In this paper, we present the four Challenges and the Competition corpora used, outline the evaluation metrics, and present both the baseline systems and the top-performing teams per Challenge. Finally, we illustrate the obtained results of the baseline systems and of all participating teams.