Search Results - Inner Mongolia University Library

An integrated framework for developing and evaluating a lecture style assessment methodology

Multimedia Tools and Applications, 2024, pp. 1-28

Authors: Dimitriadou, Eleni; Lanitis, Andreas. Affiliations: Visual Media Computing Lab, Department of Multimedia and Graphic Arts, Cyprus University of Technology, Limassol, Cyprus; CYENS Centre of Excellence, Nicosia, Cyprus

The aim of the work presented in this paper is to develop and evaluate an integrated lecture style evaluation methodology that provides teachers with instant feedback on the quality of their lecturing style. The proposed method aims to promote improvement of lecture quality, which could upgrade the overall student learning experience. The proposed methodology utilizes specific measurable visual and audio biometric characteristics extracted from a video showing the lecturer from the audience’s point of view. Measurable biometric features extracted during a lecture are combined to provide teachers with a score reflecting lecture style quality, both at frame rate and as lecture quality metrics for the whole lecture. The results of a comprehensive quantitative evaluation indicate that the proposed methodology can be used for obtaining metrics that reflect lecture style quality. Furthermore, the performance of the proposed methodology was compared with the performance of humans in the task of lecture style evaluation. Results indicate that the proposed method not only achieves similar performance to human observers, but in some cases outperforms them. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.

Keywords: Biometrics

Development and Evaluation of a Prototype VR Application for the Elderly, that can Help to Prevent Effects Related to Social Isolation

2nd International Conference on Interactive Media, Smart Systems and Emerging Technologies, IMET 2022

Authors: Anastasiadou, Zoe; Lanitis, Andreas. Affiliations: Cyprus University of Technology, Visual Media Computing Lab, Department of Multimedia and Graphic Arts, Limassol, Cyprus; CYENS Centre of Excellence, Cyprus

ISBN (digital): 9781665470162

ISBN (print): 9781665470162

The elderly need to communicate with their loved ones, but they also need to engage in activities that require mental awareness as a means of preventing negative side-effects related to brain inactivity. This area of research is becoming increasingly important during periods of social isolation caused either by external factors, such as a pandemic, or by factors associated with reduced mobility in the elderly. In this paper, a prototype Virtual Reality application that will allow elderly users to deal with the problems of social isolation, while providing an entertaining brain-triggering activity, is presented. While this is work in progress, the results of an initial user evaluation provide insights into the strengths and limitations of the prototype application, allowing the derivation of conclusions that can guide further development of the final application. © 2022 IEEE.

Keywords: Brain

FairCLIP: Harnessing Fairness in Vision-Language Learning

Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Yan Luo, Min Shi, Muhammad Osama Khan, Muhammad Muneeb Afzal, Hao Huang, Shuaihang Yuan, Yu Tian, Luo Song, Ava Kouhana, Tobias Elze, Yi Fang, Mengyu Wang. Affiliations: Harvard Ophthalmology AI Lab, Harvard University; Tandon School of Engineering, New York University; Multimedia and Visual Computing Lab, New York University Abu Dhabi

ISBN (digital): 9798350353006

ISBN (print): 9798350353013

Fairness is a critical concern in deep learning, especially in healthcare, where these models influence diagnoses and treatment decisions. Although fairness has been investigated in the vision-only domain, the fairness of medical vision-language (VL) models remains unexplored due to the scarcity of medical VL datasets for studying fairness. To bridge this research gap, we introduce the first fair vision-language medical dataset (Harvard-FairVLMed) that provides detailed demographic attributes, ground-truth labels, and clinical notes to facilitate an in-depth examination of fairness within VL foundation models. Using Harvard-FairVLMed, we conduct a comprehensive fairness analysis of two widely-used VL models (CLIP and BLIP2), pre-trained on both natural and medical domains, across four different protected attributes. Our results highlight significant biases in all VL models, with Asian, Male, Non-Hispanic, and Spanish being the preferred subgroups across the protected attributes of race, gender, ethnicity, and language, respectively. In order to alleviate these biases, we propose FairCLIP, an optimal-transport-based approach that achieves a favorable trade-off between performance and fairness by reducing the Sinkhorn distance between the overall sample distribution and the distributions corresponding to each demographic group. As the first VL dataset of its kind, Harvard-FairVLMed holds the potential to catalyze advancements in the development of machine learning models that are both ethically aware and clinically effective. Our dataset and code are available at https://***/datasets/harvard-fairvlmed10k.

Keywords: Deep learning; Bridges; Analytical models; Ethics; Computer vision; Codes; Computational modeling

AN INTEGRATED FRAMEWORK FOR DEVELOPING AND EVALUATING AN AUTOMATED LECTURE STYLE ASSESSMENT SYSTEM

arXiv, 2023

The aim of the work presented in this paper is to develop and evaluate an integrated system that provides automated lecture style evaluation, allowing teachers to get instant feedback on the quality of their lecturing style. The proposed system aims to promote improvement of lecture quality, which could upgrade the overall student learning experience. The proposed application utilizes specific measurable biometric characteristics, such as facial expressions, body activity, speech rate and intonation, hand movement, and facial pose, extracted from a video showing the lecturer from the audience's point of view. Measurable biometric features extracted during a lecture are combined to provide teachers with a score reflecting lecture style quality, both at frame rate and as lecture quality metrics for the whole lecture. The acceptance of the proposed lecture style evaluation system was evaluated by chief education officers, teachers and students regarding the functionality and usefulness of the application, and possible improvements. The results indicate that participants found the application novel and useful in providing automated feedback regarding lecture quality. Furthermore, the performance of the proposed system was compared with the performance of humans in the task of lecture style evaluation. Results indicate that the proposed system not only achieves similar performance to human observers, but in some cases outperforms them. Copyright © 2023, The Authors. All rights reserved.

Keywords: Biometrics

Detect and Approach: Close-Range Navigation Support for People with Blindness and Low Vision

17th European Conference on Computer Vision, ECCV 2022

Authors: Hao, Yu; Feng, Junchi; Rizzo, John-Ross; Wang, Yao; Fang, Yi. Affiliations: NYU Multimedia and Visual Computing Lab, New York, United States; NYU Tandon School of Engineering, New York University, New York, United States; New York University Abu Dhabi, Abu Dhabi, United Arab Emirates; NYU Langone Health, New York, United States

ISBN (print): 9783031250743

People with blindness and low vision (pBLV) experience significant challenges when locating final destinations or targeting specific objects in unfamiliar environments. Furthermore, besides initially locating and orienting oneself to a target object, approaching the final target from one’s present position is often frustrating and challenging, especially when one drifts away from the initial planned path to avoid obstacles. In this paper, we develop a novel wearable navigation solution to provide real-time guidance for a user to approach a target object of interest efficiently and effectively in unfamiliar environments. Our system contains two key visual computing functions: initial target object localization in 3D and continuous estimation of the user’s trajectory, both based on the 2D video captured by a low-cost monocular camera mounted in front of the chest of the user. These functions enable the system to suggest an initial navigation path, continuously update the path as the user moves, and offer timely recommendations for correcting the user’s path. Our experiments demonstrate that our system is able to operate with an error of less than 0.5 m both outdoors and indoors. The system is entirely vision-based and does not need other sensors for navigation, and the computation can be run on the Jetson processor in the wearable system to facilitate real-time navigation assistance. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Keywords: Eye protection

ADAPTIVE WAVELET TRANSFORMER NETWORK FOR 3D SHAPE REPRESENTATION LEARNING

10th International Conference on Learning Representations, ICLR 2022

Authors: Huang, Hao; Fang, Yi. Affiliations: NYU Multimedia and Visual Computing Lab, United States, Abu Dhabi, United Arab Emirates; NYU Tandon School of Engineering, New York University, United States; New York University Abu Dhabi, United Arab Emirates

We present a novel method for 3D shape representation learning using multi-scale wavelet decomposition. Previous works often decompose 3D shapes into complementary components in the spatial domain at a single scale. In this work, we study the decomposition of 3D shapes into sub-band components in the frequency domain at multiple scales, resulting in a hierarchical decomposition tree in a principled manner rooted in multi-resolution wavelet analysis. Specifically, we propose the Adaptive Wavelet Transformer Network (AWT-Net), which first generates approximation or detail wavelet coefficients per point, classifying each point into high or low sub-band components, using a lifting scheme at multiple scales recursively and hierarchically. Then, AWT-Net exploits a Transformer to enhance the original shape features by querying and fusing features from different but integrated sub-bands. The wavelet coefficients can be learned without direct supervision on coefficients, and AWT-Net is fully differentiable and can be learned in an end-to-end fashion. Extensive experiments demonstrate that AWT-Net achieves competitive performance on 3D shape classification and segmentation benchmarks. © 2022 ICLR 2022 - 10th International Conference on Learning Representations. All rights reserved.

Keywords: Wavelet decomposition

FairCLIP: Harnessing Fairness in Vision-Language Learning

arXiv, 2024

Authors: Luo, Yan; Shi, Min; Khan, Muhammad Osama; Afzal, Muhammad Muneeb; Huang, Hao; Yuan, Shuaihang; Tian, Yu; Song, Luo; Kouhana, Ava; Elze, Tobias; Fang, Yi; Wang, Mengyu. Affiliations: Harvard Ophthalmology AI Lab, Harvard University, United States; Tandon School of Engineering, New York University, United States; Multimedia and Visual Computing Lab, New York University Abu Dhabi, United States

Fairness is a critical concern in deep learning, especially in healthcare, where these models influence diagnoses and treatment decisions. Although fairness has been investigated in the vision-only domain, the fairness of medical vision-language (VL) models remains unexplored due to the scarcity of medical VL datasets for studying fairness. To bridge this research gap, we introduce the first fair vision-language medical dataset (Harvard-FairVLMed) that provides detailed demographic attributes, ground-truth labels, and clinical notes to facilitate an in-depth examination of fairness within VL foundation models. Using Harvard-FairVLMed, we conduct a comprehensive fairness analysis of two widely-used VL models (CLIP and BLIP2), pre-trained on both natural and medical domains, across four different protected attributes. Our results highlight significant biases in all VL models, with Asian, Male, Non-Hispanic, and Spanish being the preferred subgroups across the protected attributes of race, gender, ethnicity, and language, respectively. In order to alleviate these biases, we propose FairCLIP, an optimal-transport-based approach that achieves a favorable trade-off between performance and fairness by reducing the Sinkhorn distance between the overall sample distribution and the distributions corresponding to each demographic group. As the first VL dataset of its kind, Harvard-FairVLMed holds the potential to catalyze advancements in the development of machine learning models that are both ethically aware and clinically effective. Our dataset and code are available at https://***/datasets/harvard-fairvlmed10k. © 2024, CC BY-NC-ND.

Keywords: Population statistics

Artwork Identification in a Museum Environment: A Quantitative Evaluation of Factors Affecting Identification Accuracy

8th European-Mediterranean Conference, EuroMed 2020

Authors: Lanitis, A.; Theodosiou, Z.; Partaourides, H. Affiliations: Visual Media Computing Lab, Department of Multimedia and Graphic Arts, Cyprus University of Technology, Limassol, Cyprus; CYENS Centre of Excellence, Nicosia, Cyprus

ISBN (print): 9783030730420

The ability to identify the artworks that a museum visitor is looking at, using first-person images seamlessly captured by wearable cameras, can be used as a means for invoking applications that provide information about the exhibits and about visitors’ activities. As part of our efforts to optimize the recognition accuracy of an artwork identification system under development, an investigation aiming to determine the effect of different conditions on artwork recognition accuracy in a gallery/exhibition environment is presented. Through the controlled introduction of different distractors in a virtual museum environment, it is feasible to assess the effect of different conditions on recognition performance. The results of the experiment are important for improving the robustness of artwork recognition systems, and at the same time the conclusions of this work can provide specific guidelines to curators, museum professionals and visitors that will enable the efficient identification of artworks using images captured with wearable cameras in a museum environment. © 2021, Springer Nature Switzerland AG.

Keywords: Computer vision

3D Unsupervised Region-Aware Registration Transformer

IEEE International Conference on Image Processing

Authors: Yu Hao, Yi Fang. Affiliations: NYU Multimedia and Visual Computing Lab, New York University Abu Dhabi, Abu Dhabi, UAE; NYUAD Center for Artificial Intelligence and Robotics, New York University Abu Dhabi, Abu Dhabi, UAE

This paper concerns the research problem of point cloud registration: finding the rigid transformation that optimally aligns the source point set with the target one. Learning robust point cloud registration models with deep neural networks has emerged as a powerful paradigm, offering promising performance in predicting the global geometric transformation for a pair of point sets. Existing methods first leverage an encoder to regress the global shape descriptor, which is then decoded into a shape-conditioned transformation via concatenation-based conditioning. However, different regions of a 3D shape vary in their geometric structures, which makes a region-conditioned transformation more sensible than a shape-conditioned one. In this paper, we define our 3D registration function through the introduction of a new 3D region partition module that is able to divide the input shape into different regions with a self-supervised 3D shape reconstruction loss, without the need for ground truth labels. We further propose the 3D shape transformer module to efficiently and effectively capture short- and long-range geometric dependencies for regions on the 3D shape. Consequently, the region-aware decoder module is proposed to predict the transformations for the different regions respectively. The global geometric transformation from the source point set to the target one is then formed by the weighted fusion of the region-aware transformations. Compared to state-of-the-art approaches, our experiments show that our 3D-URRT achieves superior registration performance on various benchmark datasets (e.g. ModelNet40).

A Smartphone Application Designed to Detect Obstacles for Pedestrians’ Safety

6th EAI International Conference on Science and Technologies for Smart Cities, SmartCity 2020

Authors: Thoma, Marios; Theodosiou, Zenonas; Partaourides, Harris; Tylliros, Charalambos; Antoniades, Demetris; Lanitis, Andreas. Affiliations: Research Centre on Interactive Media, Smart Systems and Emerging Technologies - RISE, Nicosia, Cyprus; Visual Media Computing Research Lab, Department of Multimedia and Graphic Arts, Cyprus University of Technology, Limassol, Cyprus

ISBN (digital): 9783030760632

ISBN (print): 9783030760625

Encouraging people to walk rather than using other means of transportation is an important factor towards personal health and environmental sustainability. However, given the large number of pedestrian accidents recorded every year, the need for safe urban environments is increasing. Taking advantage of the potential of citizen-science for crowdsourcing data and creating awareness, we developed a smartphone application for enhancing the safety of pedestrians while walking in cities. Using the application, citizens will monitor the urban sidewalks and update a crowdsourcing platform with the detected barriers and damages that hinder safe walking, along with their location on a city map. To help users assign the correct type of obstacle, and authorities to assess the urgency, a Convolutional Neural Network (CNN) model for barrier and damage recognition is embedded in the application. The results of a user evaluation, based on a group of volunteers who used the application in real conditions, demonstrate the potential of using the application in conjunction with a smart city framework. © 2021, ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering.

Keywords: Crowdsourcing
