ISBN (digital): 9798350365474
ISBN (print): 9798350365481
The quest for personalized sports therapy has long been a concern for practitioners and patients alike, aiming for recovery protocols that transcend the one-size-fits-all approach. In this study, we introduce a novel framework for personalized sports therapy through automated joint movement analysis. By synthesizing the analytical capabilities of a Random Forest Classifier (RFC) with a Vector Quantized Variational AutoEncoder (VQ-VAE), we systematically discern the nuanced kinematic differences between healthy and pathological exercise movements. The RFC prioritizes the joints by their discriminative influence on movement healthiness, which informs the VQ-VAE’s derivation of a distilled list of pivotal joints. This dual-model approach not only identifies a hierarchy of joint importance but also ascertains the minimal subset of joints critical for distinguishing between healthy and unhealthy movement patterns. The resultant data-driven insight into joint-specific dynamics underpins the development of targeted, individualized rehabilitation programs. Our results exhibit promising directions in sports therapy, showcasing the potential of machine learning in developing personalized therapeutic interventions.
ISBN (digital): 9798350365474
ISBN (print): 9798350365481
Foundation models have emerged as pivotal tools, tackling many complex tasks through pre-training on vast datasets and subsequent fine-tuning for specific applications. The Segment Anything Model is one of the first and most well-known foundation models for computer vision segmentation tasks. This work presents a multi-faceted red-teaming analysis that tests the Segment Anything Model against challenging tasks: (1) We analyze the impact of style transfer on segmentation masks, demonstrating that applying adverse weather conditions and raindrops to dashboard images of city roads significantly distorts generated masks. (2) We assess whether the model can be used for attacks on privacy, such as recognizing celebrities’ faces, and show that the model possesses some undesired knowledge in this task. (3) Finally, we check how robust the model is to adversarial attacks on segmentation masks under text prompts. We not only show the effectiveness of popular white-box attacks and the model's resistance to black-box attacks but also introduce a novel approach, Focused Iterative Gradient Attack (FIGA), that combines white-box approaches to construct an efficient attack that modifies fewer pixels. All of our testing methods and analyses indicate a need for enhanced safety measures in foundation models for image segmentation.
ISBN (digital): 9798350365474
ISBN (print): 9798350365481
Continual Learning (CL) focuses on maximizing the predictive performance of a model across a non-stationary stream of data. Unfortunately, CL models tend to forget previous knowledge, thus often underperforming when compared with an offline model trained jointly on the entire data stream. Given that any CL model will eventually make mistakes, it is of crucial importance to build calibrated CL models: models that can reliably tell their confidence when making a prediction. Model calibration is an active research topic in machine learning, yet to be properly investigated in CL. We provide the first empirical study of the behavior of calibration approaches in CL, showing that CL strategies do not inherently learn calibrated models. To mitigate this issue, we design a continual calibration approach that improves the performance of post-processing calibration methods over a wide range of different benchmarks and CL strategies. CL does not necessarily need perfect predictive models, but rather it can benefit from reliable predictive models. We believe our study on continual calibration represents a first step towards this direction.
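To make "calibrated" concrete, the sketch below computes the Expected Calibration Error (ECE), a commonly used measure of the gap between a model's stated confidence and its actual accuracy. The abstract does not say which calibration metric the study reports, so the choice of ECE, the function name, and the equal-width binning are assumptions made for illustration only.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: weighted mean of |accuracy - confidence| per bin.

    Illustrative only; not taken from the paper.
    confidences: predicted probability of the chosen class, each in [0, 1].
    correct: booleans, True where the prediction was right.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Each bin contributes in proportion to the samples it holds.
        ece += (len(members) / total) * abs(accuracy - avg_conf)
    return ece


# Overconfident model: 90% confidence, but only 3 of 4 predictions correct.
overconfident = expected_calibration_error([0.9, 0.9, 0.9, 0.9],
                                           [True, True, True, False])
# Calibrated model: 50% confidence and 50% accuracy.
calibrated = expected_calibration_error([0.5, 0.5], [True, False])
```

In a continual setting, one plausible use is to recompute such a score after each new experience in the stream to see whether calibration drifts as earlier knowledge is forgotten.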
ISBN (digital): 9798350365474
ISBN (print): 9798350365481
In this paper, we introduce a novel unsupervised network to denoise microscopy videos, that is, image sequences captured by a fixed-location microscopy camera. Specifically, we propose a DeepTemporal Interpolation method, leveraging a temporal signal filter integrated into the bottom CNN layers, to restore microscopy videos corrupted by unknown noise types. Our unsupervised denoising architecture is distinguished by its ability to adapt to multiple noise conditions without prior knowledge of the noise distribution, addressing a significant challenge in real-world medical applications. Furthermore, we evaluate our denoising framework on both real microscopy recordings and simulated data across a broad spectrum of noise scenarios. Extensive experiments demonstrate that our unsupervised model consistently outperforms state-of-the-art supervised and unsupervised video denoising techniques, proving especially effective for microscopy videos. The project page is available at https://***/UMVD/
ISBN (digital): 9798350365474
ISBN (print): 9798350365481
In this work, we identify continual learning (CL) methods’ inherent differences in sequential decision attribution. In the sequential learning process, inconsistent decision attribution may undermine the interpretability of a continual learner. However, existing CL evaluation metrics, as well as current interpretability methods, cannot measure the decision attribution stability of a continual learner. To bridge the gap, we introduce Shapley value, a well-known decision attribution theory, and define SHAP value consistency (SHAPC) to measure the consistency of a continual learner’s decision attribution. Furthermore, we define the mean and the variance of SHAPC values, namely SHAPC-Mean and SHAPC-Var, to jointly evaluate the decision attribution stability of continual learners over sequential tasks. On Split CIFAR-10, Split CIFAR-100, and Split TinyImageNet, we compare the decision attribution stability of different CL methods using the proposed metrics, providing a new perspective for evaluating their reliability.
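The idea of attribution consistency can be illustrated with a small sketch. The abstract does not give the SHAPC formula, so the code below is only a stand-in: it computes exact Shapley values for a toy value function, scores consistency between two attribution vectors with cosine similarity (an assumption, not the paper's definition), and then takes the mean and variance of those scores across tasks, in the spirit of SHAPC-Mean and SHAPC-Var. All function names are hypothetical.

```python
from itertools import permutations
from math import sqrt
from statistics import mean, pvariance


def shapley_values(n_features, value_fn):
    """Exact Shapley values: average marginal contribution over all orderings.

    value_fn maps a frozenset of feature indices to a real-valued payoff.
    Exponential cost, so only suitable for a handful of features.
    """
    totals = [0.0] * n_features
    orderings = list(permutations(range(n_features)))
    for order in orderings:
        coalition = frozenset()
        for feature in order:
            with_feature = coalition | {feature}
            totals[feature] += value_fn(with_feature) - value_fn(coalition)
            coalition = with_feature
    return [t / len(orderings) for t in totals]


def cosine_similarity(a, b):
    """Stand-in consistency score between two attribution vectors (assumption)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy additive model: the payoff of a coalition is the sum of its weights,
# so each feature's Shapley value equals its own weight.
weights = [1.0, 2.0, 3.0]
attribution = shapley_values(3, lambda s: sum(weights[i] for i in s))

# Hypothetical attributions for the same input after three successive tasks.
after_tasks = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
scores = [cosine_similarity(after_tasks[t], after_tasks[t + 1])
          for t in range(len(after_tasks) - 1)]
consistency_mean = mean(scores)
consistency_var = pvariance(scores)
```

A high mean with low variance would indicate that the learner keeps explaining the same input in the same way as new tasks arrive; the third vector above simulates an attribution shift that lowers the mean and raises the variance.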
ISBN (print): 9798350357974
Manual visual assessment of mangoes has been problematic for the agriculture sector because it is time-consuming and leads to inconsistent evaluation and sorting. Automated flaw identification using computer vision and machine learning offers a notable improvement to the visual inspection process. A common issue with mangoes is the presence of dark patches, indicative of disease or rot, which negatively affect the appearance and quality of the fruit. This paper introduces a computer vision framework that uses image analysis and machine learning methods to identify these dark spots, taking into account the mangoes' texture. The proposed framework has a simplified configuration and tuning process, which eases its deployment in real-world applications. This work aligns with ongoing efforts to integrate cutting-edge technologies that improve efficiency and consistency in agricultural practices, thereby contributing to the evolution of smart agriculture and addressing the challenges and opportunities presented by the next wave of industrial revolution.
ISBN (digital): 9798350365474
ISBN (print): 9798350365481
Top-down instance segmentation architectures excel with predefined closed-world taxonomies but exhibit biases and performance degradation in open-world scenarios. In this work, we introduce bottom-Up and top-Down Open-world Segmentation (UDOS), a novel approach that combines classical bottom-up segmentation methods within a top-down learning framework. UDOS leverages a top-down network trained with weak supervision derived from class-agnostic bottom-up segmentation to predict object parts. These part-masks undergo affinity-based grouping and refinement to generate precise instance-level segmentations. UDOS balances the efficiency of top-down architectures with the capacity to handle unseen categories through bottom-up supervision. We validate UDOS on challenging datasets (MS-COCO, LVIS, ADE20k, UVO, and OpenImages), achieving superior performance over state-of-the-art methods in cross-category and cross-dataset transfer tasks. Our code and models will be publicly available.
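The grouping step can be pictured with a minimal sketch. The abstract does not specify how affinities between part-masks are computed, so the code below assumes a simple bounding-box IoU as the affinity and merges parts whose affinity exceeds a threshold using union-find. The function names, the affinity choice, and the threshold are all hypothetical and stand in for whatever UDOS actually uses.

```python
def bbox(mask):
    """Inclusive bounding box (row0, col0, row1, col1) of a set of (row, col) pixels."""
    rows = [r for r, _ in mask]
    cols = [c for _, c in mask]
    return min(rows), min(cols), max(rows), max(cols)


def box_iou(a, b):
    """Intersection over union of two inclusive pixel boxes."""
    r0, c0 = max(a[0], b[0]), max(a[1], b[1])
    r1, c1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, r1 - r0 + 1) * max(0, c1 - c0 + 1)
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / (area_a + area_b - inter)


def group_parts(masks, threshold=0.2):
    """Merge part-masks whose pairwise affinity is at least `threshold`."""
    parent = list(range(len(masks)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    boxes = [bbox(m) for m in masks]
    for i in range(len(masks)):
        for j in range(i + 1, len(masks)):
            if box_iou(boxes[i], boxes[j]) >= threshold:
                parent[find(i)] = find(j)

    # Union the pixels of every part that ended up in the same group.
    groups = {}
    for i, m in enumerate(masks):
        groups.setdefault(find(i), set()).update(m)
    return list(groups.values())


# Two overlapping parts of one object and one distant part.
part_a = {(r, c) for r in range(0, 2) for c in range(0, 4)}
part_b = {(r, c) for r in range(1, 3) for c in range(0, 4)}
part_c = {(r, c) for r in range(10, 12) for c in range(10, 12)}
instances = group_parts([part_a, part_b, part_c])
```

Here `part_a` and `part_b` are merged into one instance while `part_c` stays separate; a refinement stage, as the abstract describes, would then clean up the merged masks.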
ISBN (digital): 9798350365474
ISBN (print): 9798350365481
Broiler localization is crucial for welfare monitoring, particularly in identifying issues such as wet litter. We focus on multi-camera detection systems since multiple viewpoints not only ensure comprehensive pen coverage but also reduce occlusions caused by lighting, feeder and drinking equipment. Previous multi-view detection studies localize subjects either by aggregating ground plane projections of single-view predictions or by developing end-to-end multi-view detectors capable of directly generating predictions. However, single-view detections may suffer from reduced accuracy due to occlusions, and obtaining ground plane labels for training end-to-end multi-view detectors is challenging. In this paper, we combine the strengths of both approaches by using the readily available aggregated single-view detections as labels for training a multi-view detector. Our approach alleviates the need for hard-to-acquire ground-plane labels. Through experiments on a real-world broiler dataset, we demonstrate the effectiveness of our approach.
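A rough sketch of how aggregated single-view detections could be turned into ground-plane pseudo-labels follows. It assumes each camera comes with a known 3x3 image-to-ground homography and that nearby projected points are merged by a greedy radius rule; the abstract states neither detail, so the function names, the merge radius, and the clustering scheme are illustrative assumptions.

```python
def project_to_ground(point, H):
    """Map an image point (x, y) to ground-plane coordinates via homography H."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)


def aggregate_detections(per_camera_points, homographies, merge_radius=0.3):
    """Project every camera's detections and merge points that land close together.

    Greedy clustering: a point joins the first cluster whose running mean lies
    within `merge_radius`, otherwise it starts a new cluster.
    """
    projected = [project_to_ground(p, H)
                 for points, H in zip(per_camera_points, homographies)
                 for p in points]

    clusters = []  # each entry: [sum_x, sum_y, count]
    for gx, gy in projected:
        for cluster in clusters:
            cx, cy = cluster[0] / cluster[2], cluster[1] / cluster[2]
            if (gx - cx) ** 2 + (gy - cy) ** 2 <= merge_radius ** 2:
                cluster[0] += gx
                cluster[1] += gy
                cluster[2] += 1
                break
        else:
            clusters.append([gx, gy, 1])
    return [(sx / n, sy / n) for sx, sy, n in clusters]


# Camera 1 already reports ground coordinates (identity homography);
# camera 2 reports points at twice the scale.
H1 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
H2 = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 1.0]]
labels = aggregate_detections([[(1.0, 1.0), (4.0, 4.0)], [(2.2, 2.0)]], [H1, H2])
```

The resulting `labels` list plays the role of the readily available pseudo-labels: the bird seen by both cameras yields one merged location, and the bird seen by only one camera is kept as is.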
ISBN (digital): 9798350365474
ISBN (print): 9798350365481
Face image synthesis has shown remarkable progress in recent years. However, the effect that the demographics of the data used to train synthesizers has on the generation of new face images remains an open question. This paper investigates the effects of the training set demographics in the face image synthesis task. To this end, we propose a strategy that allows synthesizing face images for specific groups of people with a high visual quality. The strategy uses an unsupervised learning approach to discover groups of people in the training set based on Bayesian inference via a probabilistic mixture model. If labels are available to define the groups, our strategy can also exploit such information in lieu of unsupervised learning. Once the groups are defined, our strategy trains a Generative Adversarial Network on each group to generate new face images with specific characteristics. Our results show remarkable performance in terms of image quality compared to several state-of-the-art baselines. More importantly, our strategy allows synthesizing face images with reduced demographic biases.
ISBN (digital): 9798350365474
ISBN (print): 9798350365481
Visual odometry is an ill-posed problem, yet it is used in many robotics applications, especially in automated driving for mapless navigation. Recent applications have shown that deep models outperform traditional approaches, especially in localization accuracy, and significantly reduce catastrophic failures. The disadvantage of most of these models is a strong dependence on large amounts of high-quality ground truth data. However, accurate and dense depth ground truth data for real-world datasets is difficult to obtain. As a result, deep models are often trained on synthetic data, which introduces a domain gap. We present a weakly supervised approach to overcome this limitation. Our approach trains on estimated optical flow, which can be generated without high-quality dense depth ground truth. Instead, it only requires ground truth poses and raw camera images for training. In the experiments, we show that our approach enables deep visual odometry to be efficiently trained on the target domain (real data) while achieving state-of-the-art performance on the KITTI dataset.