Search results - Inner Mongolia University Library

Learning Deep Sensorimotor Policies for Vision-based Autonomous Drone Racing

arXiv, 2022

Authors: Fu, Jiawei; Song, Yunlong; Wu, Yan; Yu, Fisher; Scaramuzza, Davide. Affiliations: Robotics and Perception Group, Department of Informatics, University of Zurich; Department of Neuroinformatics, University of Zurich, ETH Zurich, Switzerland; Visual Intelligence and Systems Group in the Computer Vision Lab at ETH Zurich, Switzerland

unstructured environments, enabling various real-world applications. However, the lack of effective vision-based algorithms has been a stumbling block to achieving this goal. Existing systems often require hand-engineered components for state estimation, planning, and control. Such a sequential design involves laborious tuning, human heuristics, and compounding delays and errors. This paper tackles the vision-based autonomous-drone-racing problem by learning deep sensorimotor policies. We use contrastive learning to extract robust feature representations from the input images and leverage a two-stage learning-by-cheating framework for training a neural network policy. The resulting policy directly infers control commands with feature representations learned from raw images, forgoing the need for globally-consistent state estimation, trajectory planning, and handcrafted control design. Our experimental results indicate that our vision-based policy can achieve the same level of racing performance as the state-based policy while being robust against different visual disturbances and distractors. We believe this work serves as a stepping-stone toward developing intelligent vision-based autonomous systems that control the drone purely from image inputs, like human pilots. Copyright © 2022, The Authors. All rights reserved.

Keywords: Drones

SGNet: Salient Geometric Network for Point Cloud Registration

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Authors: Qianliang Wu, Yaqing Ding, Lei Luo, Haobo Jiang, Shuo Gu, Chuanwei Zhou, Jin Xie, Jian Yang. Affiliations: PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, Jiangsu Key Lab of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology; Visual Recognition Group, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic; State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China; School of Intelligence Science and Technology, Nanjing University, Suzhou, China

ISBN (digital): 9798350377705

ISBN (print): 9798350377712

Point Cloud Registration (PCR) is a critical and challenging task in computer vision and robotics. One of the primary difficulties in PCR is identifying salient and meaningful points that exhibit consistent semantic and geometric properties across different scans. Previous methods have encountered challenges with ambiguous matching due to the similarity among patch blocks throughout the entire point cloud and the lack of consideration for efficient global geometric consistency. To address these issues, we propose a new framework that includes several novel techniques. Firstly, we introduce a semantic-aware geometric encoder that combines object-level and patch-level semantic information. This encoder significantly improves registration recall by reducing ambiguity in patch-level superpoint matching. Additionally, we incorporate a prior knowledge approach that utilizes an intrinsic shape signature to identify salient points. This enables us to extract the most salient super points and meaningful dense points in the scene. Secondly, we introduce an innovative transformer that encodes High-Order (HO) geometric features. These features are crucial for identifying salient points within initial overlap regions while considering global high-order geometric consistency. We introduce an anchor node selection strategy to optimize this high-order transformer further. By encoding inter-frame triangle or polyhedron consistency features based on these anchor nodes, we can effectively learn high-order geometric features of salient super points. These high-order features are then propagated to dense points and utilized by a Sinkhorn matching module to identify critical correspondences for successful registration. The experiments conducted on the 3DMatch/3DLoMatch and KITTI datasets demonstrate the effectiveness of our method.

Keywords: Point cloud compression; Computer vision; Accuracy; Shape; Semantics; Transformers; Feature extraction; Encoding; Intelligent robots

Diff-Reg v2: Diffusion-Based Matching Matrix Estimation for Image Matching and 3D Registration

arXiv, 2025

Authors: Wu, Qianliang; Jiang, Haobo; Ding, Yaqing; Luo, Lei; Xie, Jin; Yang, Jian. Affiliations: PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, China; School of Computer Science and Engineering, Nanjing University of Science and Technology, China; Nanyang Technological University, Singapore; Visual Recognition Group, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic; State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China; School of Intelligence Science and Technology, Nanjing University, Suzhou, China

Establishing reliable correspondences is crucial for all registration tasks, including 2D image registration, 3D point cloud registration, and 2D-3D image-to-point cloud registration. However, these tasks are often complicated by challenges such as scale inconsistencies, symmetry, and large deformations, which can lead to ambiguous matches. Previous feature-based and correspondence-based methods typically rely on geometric or semantic features to generate or polish initial potential correspondences. Some methods typically leverage specific geometric priors, such as topological preservation, to devise diverse and innovative strategies tailored to a given enhancement goal, which cannot be exhaustively enumerated. Additionally, many previous approaches rely on a single-step prediction head, which can struggle with local minima in complex matching scenarios. To address these challenges, we introduce an innovative paradigm that leverages a diffusion model in matrix space for robust matching matrix estimation. Our model treats correspondence estimation as a denoising diffusion process in the matching matrix space, gradually refining the intermediate matching matrix to the optimal one. Specifically, we apply the diffusion model in the doubly stochastic matrix space for 3D-3D and 2D-3D registration tasks. In the 2D image registration task, we deploy the diffusion model in a matrix sub-space, where dual-softmax projection regularization is applied. For all three registration tasks, we provide adaptive matching matrix embedding implementations tailored to the specific characteristics of each task while maintaining a consistent "match-to-warp" encoding pattern. Furthermore, we adopt a lightweight design for the denoising module. In inference, once points or image features are extracted and fixed, this module performs multi-step denoising predictions through reverse sampling. Evaluations on both 2D and 3D registration tasks demonstrate the effectiveness of our approach.

Keywords: Stochastic systems
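
The Diff-Reg v2 abstract above mentions a dual-softmax projection for the 2D matching matrix. As a rough illustration only, and not the authors' code, dual-softmax is commonly computed as the element-wise product of a row-wise softmax and a column-wise softmax over a score matrix. The function names and the example scores below are hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dual_softmax(scores):
    """Element-wise product of row-wise and column-wise softmax.

    scores[i][j] is an (assumed) similarity between source item i
    and target item j. The result is high only where a pair is a
    strong match in both directions.
    """
    row_sm = [softmax(row) for row in scores]
    # zip(*scores) transposes, so col_sm[j][i] is the softmax down column j.
    col_sm = [softmax(list(col)) for col in zip(*scores)]
    n_rows, n_cols = len(scores), len(scores[0])
    return [
        [row_sm[i][j] * col_sm[j][i] for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Hypothetical 2x3 score matrix with two clear matches on the diagonal.
example = dual_softmax([[4.0, 0.0, 0.0], [0.0, 4.0, 0.0]])
```

Each entry lies between 0 and 1, and mutual best matches keep a value close to 1 while one-sided matches are suppressed.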

Diff-Reg: Diffusion Model in Doubly Stochastic Matrix Space for Registration Problem

arXiv, 2024

Authors: Wu, Qianliang; Jiang, Haobo; Luo, Lei; Li, Jun; Ding, Yaqing; Xie, Jin; Yang, Jian. Affiliations: PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information, Ministry of Education, Jiangsu Key Lab of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China; State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China; School of Intelligence Science and Technology, Nanjing University, Suzhou, China; Visual Recognition Group, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic; National University of Singapore, Singapore

Establishing reliable correspondences is essential for 3D and 2D-3D registration tasks. Existing methods commonly leverage geometric or semantic point features to generate potential correspondences. However, these features may face challenges such as large deformation, scale inconsistency, and ambiguous matching problems (e.g., symmetry). Additionally, many previous methods, which rely on single-pass prediction, may struggle with local minima in complex scenarios. To mitigate these challenges, we introduce a diffusion matching model for robust correspondence construction. Our approach treats correspondence estimation as a denoising diffusion process within the doubly stochastic matrix space, which gradually denoises (refines) a doubly stochastic matching matrix to the ground-truth one for high-quality correspondence estimation. It involves a forward diffusion process that gradually introduces Gaussian noise into the ground truth matching matrix and a reverse denoising process that iteratively refines the noisy one. In particular, we deploy a lightweight denoising strategy during the inference phase. Specifically, once points/image features are extracted and fixed, we utilize them to conduct multiple-pass denoising predictions in the reverse sampling process. Evaluation of our method on both 3D and 2D-3D registration tasks confirms its effectiveness. The code is available at https://***/wuqianliang/Diff-Reg. Copyright © 2024, The Authors. All rights reserved.

Keywords: Stochastic systems
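
The Diff-Reg abstract above refines a matching matrix inside the space of doubly stochastic matrices (non-negative, with every row and column summing to 1). The abstract does not say how the authors keep the matrix in that space. A minimal sketch, assuming a Sinkhorn-style alternating normalization is used as the projection, could look like the following; `sinkhorn` and its arguments are hypothetical names chosen for illustration.

```python
import math


def sinkhorn(scores, iterations=100):
    """Push a square score matrix toward a doubly stochastic matrix.

    Assumption for illustration: exponentiate the scores to make
    them positive, then alternately normalize rows and columns.
    """
    matrix = [[math.exp(v) for v in row] for row in scores]
    n_rows, n_cols = len(matrix), len(matrix[0])
    for _ in range(iterations):
        # Make every row sum to 1.
        for r in range(n_rows):
            row_sum = sum(matrix[r])
            matrix[r] = [v / row_sum for v in matrix[r]]
        # Make every column sum to 1.
        for c in range(n_cols):
            col_sum = sum(matrix[r][c] for r in range(n_rows))
            for r in range(n_rows):
                matrix[r][c] /= col_sum
    return matrix


# Hypothetical 3x3 similarity scores between two small point sets.
matching = sinkhorn([[2.0, 0.1, 0.3],
                     [0.2, 1.5, 0.4],
                     [0.1, 0.3, 1.8]])
```

For a square matrix with strictly positive entries the alternating normalization converges, so after enough iterations both the row sums and the column sums of `matching` are close to 1.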

NTIRE 2023 Image Shadow Removal Challenge Report

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2023

Authors: Vasluianu, Florin-Alexandru; Seizinger, Tim; Timofte, Radu; Cui, Shuhao; Huang, Junshi; Tian, Shuman; Fan, Mingyuan; Zhang, Jiaqi; Zhu, Li; Wei, Xiaoming; Wei, Xiaolin; Luo, Ziwei; Gustafsson, Fredrik K.; Zhao, Zheng; Sjölund, Jens; Schön, Thomas B.; Dong, Xiaoyi; Zhang, Xi Sheryl; Li, Chenghua; Leng, Cong; Yeo, Woon-Ha; Oh, Wang-Taek; Lee, Yeo-Reum; Ryu, Han-Cheol; Luo, Jinting; Jiang, Chengzhi; Han, Mingyan; Wu, Qi; Lin, Wenjie; Yu, Lei; Li, Xinpeng; Jiang, Ting; Fan, Haoqiang; Liu, Shuaicheng; Xu, Shuning; Song, Binbin; Chen, Xiangyu; Zhang, Shile; Zhou, Jiantao; Zhang, Zhao; Zhao, Suiyi; Zheng, Huan; Gao, Yangcheng; Wei, Yanyan; Wang, Bo; Ren, Jiahuan; Luo, Yan; Kondo, Yuki; Miyata, Riku; Yasue, Fuma; Naruki, Taito; Ukita, Norimichi; Chang, Hua-En; Yang, Hao-Hsiang; Chen, Yi-Chung; Chiang, Yuan-Chun; Huang, Zhi-Kai; Chen, Wei-Ting; Chen, I-Hsiang; Hsieh, Chia-Hsuan; Kuo, Sy-Yen; Xianwei, Li; Fu, Huiyuan; Liu, Chunlin; Ma, Huadong; Fu, Binglan; He, Huiming; Wang, Mengjia; She, Wenxuan; Liu, Yu; Nathan, Sabari; Kansal, Priya; Zhang, Zhongjian; Yang, Huabin; Wang, Yan; Zhang, Yanru; Phutke, Shruti S.; Kulkarni, Ashutosh; Khan, Md Raqib; Murala, Subrahmanyam; Vipparthi, Santosh Kumar; Ye, Heng; Liu, Zixi; Yang, Xingyi; Liu, Songhua; Wu, Yinwei; Jing, Yongcheng; Yu, Qianhao; Zheng, Naishan; Huang, Jie; Long, Yuhang; Yao, Mingde; Zhao, Feng; Zhao, Bowen; Ye, Nan; Shen, Ning; Cao, Yanpeng; Xiong, Tong; Xia, Weiran; Li, Dingwen; Xia, Shuchen. Affiliations: Computer Vision Lab, IFI, CAIDAS, University of Würzburg, Germany; Computer Vision Lab, ETH Zürich, Switzerland; Meituan Group, China; Department of Information Technology, Uppsala University, Sweden; Institute of Automation, Chinese Academy of Sciences, Beijing, China; Nanjing, China; Maicro, Nanjing, China; Department of Artificial Intelligence Convergence, Sahmyook University, Seoul, Republic of Korea; Megvii Technology, China; University of Electronic Science and Technology of China, China; University of Macau, China; China; Toyota Technological Institute, Japan; Graduate Institute of Electronics Engineering, National Taiwan University, Taiwan; Department of Electrical Engineering, National Taiwan University, Taiwan; Graduate Institute of Communication Engineering, National Taiwan University, Taiwan; ServiceNow, United States; Beijing University of Posts and Telecommunications, Beijing, China; Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, China; Couger Inc.; Computer Vision and Pattern Recognition Lab, Indian Institute of Technology Ropar, Punjab, Rupnagar, India; Research Institute, Singapore; National University of Singapore, Singapore; Research Institute, Singapore; University of Sydney, Australia; Brain-Inspired Vision Laboratory, Information Science and Technology Institution, University of Science and Technology of China, China; State Key Laboratory of Fluid Power and Mechatronic Systems, School of Mechanical Engineering, Zhejiang University, Hangzhou 310027, China; Key Laboratory of Advanced Manufacturing Technology of Zhejiang Province, School of Mechanical Engineering, Zhejiang University, Hangzhou 310027, China; South China University of Technology, China

ISBN (print): 9798350302493

This work reviews the results of the NTIRE 2023 Challenge on Image Shadow Removal. The described set of solutions were proposed for a novel dataset, which captures a wide range of object-light interactions. It consists of 1200 roughly pixel aligned pairs of real shadow free and shadow affected images, captured in a controlled environment. The data was captured in a white-box setup, using professional equipment for lights and data acquisition sensors. The challenge had a number of 144 participants registered, out of which 19 teams were compared in the final ranking. The proposed solutions extend the work on shadow removal, improving over the performance level describing state-of-the-art methods. © 2023 IEEE.

Keywords: Data acquisition

Self-supervised linear motion deblurring

arXiv, 2020

Authors: Liu, Peidong; Janai, Joel; Pollefeys, Marc; Sattler, Torsten; Geiger, Andreas. Affiliations: Computer Vision and Geometry Group, Department of Computer Science, ETH Zürich, Switzerland; Autonomous Vision Group, Max Planck Institute for Intelligent Systems, University of Tübingen, Tübingen, Germany; Microsoft Mixed Reality and Artificial Intelligence Lab, Zürich, Switzerland; Computer Vision and Medical Image Analysis Group, Chalmers University of Technology, Sweden

Motion blurry images challenge many computer vision algorithms, e.g., feature detection, motion estimation, or object recognition. Deep convolutional neural networks are state-of-the-art for image deblurring. However, obtaining training data with corresponding sharp and blurry image pairs can be difficult. In this paper, we present a differentiable reblur model for self-supervised motion deblurring, which enables the network to learn from real-world blurry image sequences without relying on sharp images for supervision. Our key insight is that motion cues obtained from consecutive images yield sufficient information to inform the deblurring task. We therefore formulate deblurring as an inverse rendering problem, taking into account the physical image formation process: we first predict two deblurred images from which we estimate the corresponding optical flow. Using these predictions, we re-render the blurred images and minimize the difference with respect to the original blurry inputs. We use both synthetic and real dataset for experimental evaluations. Our experiments demonstrate that self-supervised single image deblurring is really feasible and leads to visually compelling results. Both the code and datasets are available at https://***/ethliup/SelfDeblur. Copyright © 2020, The Authors. All rights reserved.

Keywords: Image enhancement

REFUGE2 CHALLENGE: A TREASURE TROVE FOR MULTI-DIMENSION ANALYSIS AND EVALUATION IN GLAUCOMA SCREENING

arXiv, 2022

Authors: Fang, Huihui; Li, Fei; Wu, Junde; Fu, Huazhu; Sun, Xu; Son, Jaemin; Yu, Shuang; Zhang, Menglu; Yuan, Chenglang; Bian, Cheng; Lei, Baiying; Zhao, Benjian; Xu, Xinxing; Li, Shaohua; Fumero, Francisco; Sigut, José; Almubarak, Haidar; Bazi, Yakoub; Guo, Yuanhao; Zhou, Yating; Baid, Ujjwal; Innani, Shubham; Guo, Tianjiao; Yang, Jie; Orlando, José Ignacio; Bogunović, Hrvoje; Zhang, Xiulan; Xu, Yanwu. Affiliations: The REFUGE2 Challenge, Australia; State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-Sen University, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangzhou, China; Intelligent Healthcare Unit, Baidu Inc., Beijing, China; The Institute of High Performance Computing, Agency for Science, Technology and Research, Singapore; Yatiris Group, PLADEMA Institute, CONICET, UNICEN, Tandil, Argentina; Christian Doppler Lab for Artificial Intelligence in Retina, Department of Ophthalmology and Optometry, Medical University of Vienna, Vienna, Austria; VUNO Inc, Seoul, Republic of Korea; Tencent HealthCare, Tencent, Shenzhen, China; Computer Vision Institute, College of Computer Science and Software Engineering of Shenzhen University, Shenzhen, China; School of Biomedical Engineering, Health Science Center, Shenzhen University, China; Xiaohe Healthcare, ByteDance, Guangdong, Guangzhou 510000, China; School of Biomedical Engineering, Shenzhen University, China; College of Computer Science & Software Engineering, Shenzhen University, China; Department of Computer Science and Systems Engineering, Universidad de La Laguna, Spain; Saudi Electronic University, Saudi Arabia; King Saud University, Saudi Arabia; Institute of Automation, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China; SGGS Institute of Engineering and Technology, India; Institute of Medical Robotics, Shanghai Jiao Tong University, China; Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, China

With the rapid development of artificial intelligence (AI) in medical image processing, deep learning in color fundus photography (CFP) analysis is also evolving. Although there are some open-source, labeled datasets of CFPs in the ophthalmology community, large-scale datasets for screening only have labels of disease categories, and datasets with annotations of fundus structures are usually small in size. In addition, labeling standards are not uniform across datasets, and there is no clear information on the acquisition device. Here we release a multi-annotation, multi-quality, and multi-device color fundus image dataset for glaucoma analysis on an original challenge-Retinal Fundus Glaucoma Challenge 2nd Edition (REFUGE2). The REFUGE2 dataset contains 2000 color fundus images with annotations of glaucoma classification, optic disc/cup segmentation, as well as fovea localization. Meanwhile, the REFUGE2 challenge sets three sub-tasks of automatic glaucoma diagnosis and fundus structure analysis and provides an online evaluation framework. Based on the characteristics of multi-device and multi-quality data, some methods with strong generalizations are provided in the challenge to make the predictions more robust. This shows that REFUGE2 brings attention to the characteristics of real-world multi-domain data, bridging the gap between scientific research and clinical application. © 2022, CC BY-NC-ND.

Keywords: Color

Assessing Trustworthy AI in Times of COVID-19: Deep Learning for Predicting a Multiregional Score Conveying the Degree of Lung Compromise in COVID-19 Patients

IEEE Transactions on Technology and Society, 2022, vol. 3, no. 4, pp. 272-289

Authors: Allahabadi, Himanshi; Amann, Julia; Balot, Isabelle; Beretta, Andrea; Binkley, Charles; Bozenhard, Jonas; Bruneault, Frederick; Brusseau, James; Candemir, Sema; Cappellini, Luca Alessandro; Chakraborty, Subrata; Cherciu, Nicoleta; Cociancig, Christina; Coffee, Megan; Ek, Irene; Espinosa-Leal, Leonardo; Farina, Davide; Fieux-Castagnet, Genevieve; Frauenfelder, Thomas; Gallucci, Alessio; Giuliani, Guya; Golda, Adam; Van Halem, Irmhild; Hildt, Elisabeth; Holm, Sune; Kararigas, Georgios; Krier, Sebastien A.; Kuhne, Ulrich; Lizzi, Francesca; Madai, Vince I.; Markus, Aniek F.; Masis, Serg; Mathez, Emilie Wiinblad; Mureddu, Francesco; Neri, Emanuele; Osika, Walter; Ozols, Matiss; Panigutti, Cecilia; Parent, Brendan; Pratesi, Francesca; Moreno-Sanchez, Pedro A.; Sartor, Giovanni; Savardi, Mattia; Signoroni, Alberto; Sormunen, Hanna-Maria; Spezzatti, Andy; Srivastava, Adarsh; Stephansen, Annette F.; Theng, Lau Bee; Tithi, Jesmin Jahan; Tuominen, Jarno; Umbrello, Steven; Vaccher, Filippo; Vetter, Dennis; Westerlund, Magnus; Wurth, Renee; Zicari, Roberto V. Affiliations: EY Netherlands, Enterprise Intelligence Department, Amsterdam 1083 HP, Netherlands; ETH Zurich, Health Ethics and Policy Lab, Department of Health Sciences and Technology, Zürich 8092, Switzerland; Center for Diplomatic and Strategic Studies, Postgraduate Studies in Diplomacy and International Relations, Paris 75015, France; Pisa 56124, Italy; Hackensack Meridian Health, Bioethics Center, Edison, NJ 08820, United States; University of Oxford, Faculty of Philosophy, Oxford OX2 6GG, United Kingdom; Collège André-Laurendeau, Philosophie Department, Montreal, QC H8N 2J4, Canada; Université du Québec à Montréal, École des Médias, Montreal, QC H2L 2C4, Canada; Pace University, Philosophy Department, New York, NY 10038, United States; The Ohio State University Wexner Medical Center, Department of Radiology, Columbus, OH 43210, United States; Humanitas Research Hospital, Department of Radiology, Milan 20089, Italy; Humanitas University, Department of Biomedical Sciences, Milan 20089, Italy; University of New England, Faculty of Science, Agriculture, Business and Law, Armidale, NSW 2351, Australia; University of Technology Sydney, Faculty of Engineering and Information Technology, Sydney, NSW 2007, Australia; Scuola Superiore Sant'Anna, European Centre of Excellence on the Regulation of Robotics and AI, Pisa 56127, Italy; University of Bremen, Group of Computer Architecture, Bremen 28359, Germany; New York University Grossman School of Medicine, Division of Infectious Diseases and Immunology, Department of Medicine, New York, NY 10016, United States; Digital Institute, AI Research Section, Stockholm 16731, Sweden; Arcada University of Applied Sciences, Department of Business Management and Analytics, Helsinki 00550, Finland; University of Brescia, Radiological Sciences and Public Health, Department of Medical and Surgical Specialties, Brescia 25121, Italy; SNCF Reseau SA, Ethique Groupe, La Plaine 93418, France; Institute of Diagnostic and Interventional Radiology, University Hospital Zurich, Zürich 8091, Switzerland; Eindhoven University of Tech

This article's main contributions are twofold: 1) to demonstrate how to apply the general European Union's High-Level Expert group's (EU HLEG) guidelines for trustworthy AI in practice for the domain of healthcare and 2) to investigate the research question of what does 'trustworthy AI' mean at the time of the COVID-19 pandemic. To this end, we present the results of a post-hoc self-assessment to evaluate the trustworthiness of an AI system for predicting a multiregional score conveying the degree of lung compromise in COVID-19 patients, developed and verified by an interdisciplinary team with members from academia, public hospitals, and industry in time of pandemic. The AI system aims to help radiologists to estimate and communicate the severity of damage in a patient's lung from Chest X-rays. It has been experimentally deployed in the radiology department of the ASST Spedali Civili clinic in Brescia, Italy, since December 2020 during pandemic time. The methodology we have applied for our post-hoc assessment, called Z-Inspection®, uses sociotechnical scenarios to identify ethical, technical, and domain-specific issues in the use of the AI system in the context of the pandemic. © 2020 IEEE.

Keywords: COVID-19

Why is the Winner the Best?

Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: M. Eisenmann, A. Reinke, V. Weru, M. D. Tizabi, F. Isensee, T. J. Adler, S. Ali, V. Andrearczyk, M. Aubreville, U. Baid, S. Bakas, N. Balu, S. Bano, J. Bernal, S. Bodenstedt, A. Casella, V. Cheplygina, M. Daum, M. De Bruijne, A. Depeursinge, R. Dorent, J. Egger, D. G. Ellis, S. Engelhardt, M. Ganz, N. Ghatwary, G. Girard, P. Godau, A. Gupta, L. Hansen, K. Harada, M. Heinrich, N. Heller, A. Hering, A. Huaulmé, P. Jannin, A. E. Kavur, O. Kodym, M. Kozubek, J. Li, H. Li, J. Ma, C. Martín-Isla, B. Menze, A. Noble, V. Oreiller, N. Padoy, S. Pati, K. Payette, T. Rädsch, J. Rafael-Patiño, V. Singh Bawa, S. Speidel, C. H. Sudre, K. Van Wijnen, M. Wagner, D. Wei, A. Yamlahi, M. H. Yap, C. Yuan, M. Zenk, A. Zia, D. Zimmerer, D. Aydogan, B. Bhattarai, L. Bloch, R. Brüngel, J. Cho, C. Choi, Q. Dou, I. Ezhov, C. M. Friedrich, C. Fuller, R. R. Gaire, A. Galdran, Á. García Faura, M. Grammatikopoulou, S. Hong, M. Jahanifar, I. Jang, A. Kadkhodamohammadi, I. Kang, F. Kofler, S. Kondo, H. Kuijf, M. Li, M. Luu, T. Martinčič, P. Morais, M. A. Naser, B. Oliveira, D. Owen, S. Pang, J. Park, S. Park, S. Płotka, E. Puybareau, N. Rajpoot, K. Ryu, N. Saeed, A. Shephard, P. Shi, D. Štepec, R. Subedi, G. Tochon, H. R. Torres, H. Urien, J. L. Vilaça, K. A. Wahid, H. Wang, J. Wang, L. Wang, X. Wang, B. Wiestler, M. Wodzinski, F. Xia, J. Xie, Z. Xiong, S. Yang, Y. Yang, Z. Zhao, K. Maier-Hein, P. F. Jäger, A. Kopp-Schneider, L. Maier-Hein. Affiliations: Division of Intelligent Medical Systems, German Cancer Research Center (DKFZ), Heidelberg, Germany; Helmholtz Imaging, German Cancer Research Center (DKFZ), Heidelberg, Germany; Faculty of Mathematics and Computer Science, Heidelberg University, Heidelberg, Germany; Division of Biostatistics, German Cancer Research Center (DKFZ), Heidelberg, Germany; Division of Medical Image Computing, German Cancer Research Center (DKFZ), Heidelberg, Germany; Faculty of Engineering and Physical Sciences, School of Computing, University of Leeds, Leeds, UK; Institute of Informatics, School of Management, HES-SO Valais-Wallis University of Applied Sciences and Arts Western Switzerland, Sierre, Switzerland; Department of Nuclear Medicine and Molecular Imaging, Lausanne University Hospital, Lausanne, Switzerland; Technische Hochschule Ingolstadt, Ingolstadt, Germany; Center for Artificial Intelligence and Data Science for Integrated Diagnostics (AI2D) and Center for Biomedical Image Computing and Analytics (CBICA), University of Pennsylvania, Philadelphia, PA, USA; Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Department of Radiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Department of Radiology, University of Washington, Seattle, WA, USA; Department of Computer Science, Wellcome/EPSRC Centre for Interventional and Surgical Sciences (WEISS), University College London, London, UK; Universitat Autònoma de Barcelona & Computer Vision Center, Barcelona, Spain; Division of Translational Surgical Oncology, National Center for Tumor Diseases (NCT/UCC) Dresden, Dresden, Germany; Department of Advanced Robotics, Istituto Italiano di Tecnologia, Italy; Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milan, Italy; IT University of Copenhagen, Copenhagen, Denmark; Department of General, Visceral and Transplantation Surgery, Heidelberg University Hospital, Heidelberg, Germany; Department of Radiology and Nuc

International benchmarking competitions have become fundamental for the comparative performance assessment of image analysis methods. However, little attention has been given to investigating what can be learnt from these competitions. Do they really generate scientific progress? What are common and successful participation strategies? What makes a solution superior to a competing method? To address this gap in the literature, we performed a multicenter study with all 80 competitions that were conducted in the scope of IEEE ISBI 2021 and MICCAI 2021. Statistical analyses performed based on comprehensive descriptions of the submitted algorithms linked to their rank as well as the underlying participation strategies revealed common characteristics of winning solutions. These typically include the use of multi-task learning (63%) and/or multi-stage pipelines (61%), and a focus on augmentation (100%), image preprocessing (97%), data curation (79%), and post-processing (66%). The “typical” lead of a winning team is a computer scientist with a doctoral degree, five years of experience in biomedical image analysis, and four years of experience in deep learning. Two core general development strategies stood out for highly-ranked teams: the reflection of the metrics in the method design and the focus on analyzing and handling failure cases. According to the organizers, 43% of the winning algorithms exceeded the state of the art but only 11% completely solved the respective domain problem. The insights of our study could help researchers (1) improve algorithm development strategies when approaching new problems, and (2) focus on open research questions revealed by this work.
