The aim of the work presented in this paper is to develop and evaluate an integrated lecture style evaluation methodology that provides teachers with instant feedback related to the quality of their lecturing style. The p...
The elderly need to communicate with their loved ones but they also need to get engaged in activities that require mental awareness as a means of preventing negative side-effects related to brain inactivity. This area...
ISBN (digital): 9798350353006
ISBN (print): 9798350353013
Fairness is a critical concern in deep learning, especially in healthcare, where these models influence diagnoses and treatment decisions. Although fairness has been investigated in the vision-only domain, the fairness of medical vision-language (VL) models remains unexplored due to the scarcity of medical VL datasets for studying fairness. To bridge this research gap, we introduce the first fair vision-language medical dataset (Harvard-FairVLMed), which provides detailed demographic attributes, ground-truth labels, and clinical notes to facilitate an in-depth examination of fairness within VL foundation models. Using Harvard-FairVLMed, we conduct a comprehensive fairness analysis of two widely used VL models (CLIP and BLIP2), pre-trained on both natural and medical domains, across four different protected attributes. Our results highlight significant biases in all VL models, with Asian, Male, Non-Hispanic, and Spanish being the preferred subgroups across the protected attributes of race, gender, ethnicity, and language, respectively. To alleviate these biases, we propose FairCLIP, an optimal-transport-based approach that achieves a favorable trade-off between performance and fairness by reducing the Sinkhorn distance between the overall sample distribution and the distributions corresponding to each demographic group. As the first VL dataset of its kind, Harvard-FairVLMed holds the potential to catalyze advancements in the development of machine learning models that are both ethically aware and clinically effective. Our dataset and code are available at https://***/datasets/harvard-fairvlmed10k.
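The abstract above describes reducing the Sinkhorn distance between the overall sample distribution and each demographic group's distribution. The following is a minimal sketch of that quantity, not the FairCLIP implementation. It assumes one scalar score per sample in [0, 1], uniform sample weights, and a squared-difference cost; the names `sinkhorn_distance` and `group_sinkhorn_gaps` and the defaults `eps` and `n_iters` are hypothetical.

```python
import math


def sinkhorn_distance(xs, ys, eps=0.1, n_iters=200):
    """Entropy-regularized transport cost between two 1-D samples.

    Assumptions (not from the source): uniform weights, squared-difference
    cost, and values in [0, 1] so that exp(-cost / eps) does not underflow.
    """
    n, m = len(xs), len(ys)
    a = [1.0 / n] * n  # uniform weights on xs
    b = [1.0 / m] * m  # uniform weights on ys
    cost = [[(x - y) ** 2 for y in ys] for x in xs]
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    u = [1.0] * n
    v = [1.0] * m
    # Sinkhorn iterations: alternately rescale rows and columns so the
    # plan u_i * K_ij * v_j matches the marginals a and b.
    for _ in range(n_iters):
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport cost of the resulting plan.
    return sum(
        u[i] * kernel[i][j] * v[j] * cost[i][j]
        for i in range(n)
        for j in range(m)
    )


def group_sinkhorn_gaps(scores, groups):
    """Sinkhorn distance from the overall score sample to each group's sample."""
    gaps = {}
    for g in sorted(set(groups)):
        group_scores = [s for s, gi in zip(scores, groups) if gi == g]
        gaps[g] = sinkhorn_distance(scores, group_scores)
    return gaps


if __name__ == "__main__":
    # Hypothetical scores for two hypothetical groups "a" and "b".
    scores = [0.2, 0.5, 0.8, 0.7, 0.9, 1.0]
    groups = ["a", "a", "a", "b", "b", "b"]
    print(group_sinkhorn_gaps(scores, groups))
```

In a training setting, a sum of such per-group gaps could serve as a regularizer added to the task loss. How FairCLIP weights or computes that term is not stated in the abstract.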
The aim of the work presented in this paper is to develop and evaluate an integrated system that provides automated lecture style evaluation, allowing teachers to get instant feedback related to the goodness of their ...
People with blindness and low vision (pBLV) experience significant challenges when locating final destinations or targeting specific objects in unfamiliar environments. Furthermore, besides initially locating and orie...
We present a novel method for 3D shape representation learning using multi-scale wavelet decomposition. Previous works often decompose 3D shapes into complementary components in the spatial domain at a single scale. In th...
The ability to identify the artworks that a museum visitor is looking at, using first-person images seamlessly captured by wearable cameras, can be used as a means for invoking applications that provide information abo...
This paper addresses point cloud registration: finding the rigid transformation that optimally aligns the source point set with the target one. Learning robust point cloud registration models with deep neural networks has emerged as a powerful paradigm, offering promising performance in predicting the global geometric transformation for a pair of point sets. Existing methods first leverage an encoder to regress the global shape descriptor, which is then decoded into a shape-conditioned transformation via concatenation-based conditioning. However, different regions of a 3D shape vary in their geometric structures, so a region-conditioned transformation makes more sense than a shape-conditioned one. In this paper, we define our 3D registration function through a new 3D region partition module that divides the input shape into different regions using a self-supervised 3D shape reconstruction loss, without the need for ground-truth labels. We further propose a 3D shape transformer module to efficiently and effectively capture short- and long-range geometric dependencies for regions on the 3D shape. Consequently, a region-aware decoder module is proposed to predict the transformations for the different regions. The global geometric transformation from the source point set to the target one is then formed by the weighted fusion of the region-aware transformations. Compared to state-of-the-art approaches, our experiments show that our 3D-URRT achieves superior registration performance on various benchmark datasets (e.g., ModelNet40).
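The abstract does not specify how the region-aware transformations are fused. As a simplified illustration only, the sketch below fuses 2-D rigid transforms by a weighted circular mean of the rotation angles and a weighted mean of the translations. The 2-D setting, the `(theta, tx, ty)` representation, and the names `fuse_rigid_2d` and `apply_rigid_2d` are assumptions made for this example, not the 3D-URRT method.

```python
import math


def fuse_rigid_2d(transforms, weights):
    """Fuse 2-D rigid transforms given as (theta, tx, ty) tuples.

    Assumed fusion rule (not from the source): weighted circular mean for
    the rotation angle, weighted arithmetic mean for the translation.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    w = [wi / total for wi in weights]  # normalize weights to sum to 1
    # Average angles on the unit circle to avoid wrap-around artifacts.
    s = sum(wi * math.sin(t[0]) for wi, t in zip(w, transforms))
    c = sum(wi * math.cos(t[0]) for wi, t in zip(w, transforms))
    theta = math.atan2(s, c)
    tx = sum(wi * t[1] for wi, t in zip(w, transforms))
    ty = sum(wi * t[2] for wi, t in zip(w, transforms))
    return theta, tx, ty


def apply_rigid_2d(transform, point):
    """Rotate a 2-D point by theta, then translate it by (tx, ty)."""
    theta, tx, ty = transform
    x, y = point
    return (
        math.cos(theta) * x - math.sin(theta) * y + tx,
        math.sin(theta) * x + math.cos(theta) * y + ty,
    )


if __name__ == "__main__":
    # Two hypothetical per-region predictions with equal weights.
    regions = [(0.0, 0.0, 0.0), (math.pi / 2, 2.0, 0.0)]
    fused = fuse_rigid_2d(regions, [1.0, 1.0])
    print(fused, apply_rigid_2d(fused, (1.0, 0.0)))
```

In 3-D, the same idea would need a proper rotation average, for example over unit quaternions, because rotation matrices cannot simply be averaged entrywise.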
Encouraging people to walk rather than using other means of transportation is an important factor towards personal health and environmental sustainability. However, given the large number of pedestrian accidents recor...