ISBN: 9781450347532 (print)
Saliency computation is widely studied in computer vision but not in medical imaging. Existing computational saliency models have been developed for general (natural) images and hence may not be suitable for medical images. This is due to the variety of imaging modalities and the requirement that models capture not only normal anatomy but also deviations from it. We present a biologically inspired model for colour fundus images and illustrate it for the case of diabetic retinopathy. The proposed model uses spatially varying morphological operations to enhance lesions locally and combines an ensemble of the results of such operations to generate the saliency map. The model is validated against an average human gaze map of 15 experts and found to have 10% higher recall (at 100% precision) than four leading saliency models proposed for natural images. The F-score for the match with manual lesion markings by 5 experts was 0.4 for our model (as opposed to 0.532 for the gaze map) and very poor for existing models. The model's utility is shown via a novel enhancement method that employs saliency to selectively enhance the abnormal regions; this was found to boost their contrast-to-noise ratio by approximately 30%.
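The F-score quoted in this abstract combines precision and recall into one number. The sketch below shows one way to compute it, assuming the saliency map and the lesion markings have both been thresholded into binary masks and are compared pixel by pixel. The abstract does not say how matches were counted, so that comparison scheme, the helper name `f_score` and the toy masks are all assumptions made for illustration.

```python
def f_score(predicted, reference):
    """Pixel-wise F-score between two binary masks given as flat 0/1 sequences."""
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of pixels")
    # Count true positives, false positives and false negatives.
    tp = sum(1 for p, r in zip(predicted, reference) if p and r)
    fp = sum(1 for p, r in zip(predicted, reference) if p and not r)
    fn = sum(1 for p, r in zip(predicted, reference) if not p and r)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy masks (illustrative values only, not data from the paper).
saliency_mask = [1, 1, 0, 0, 1, 0]
lesion_mask = [1, 0, 0, 1, 1, 0]
score = f_score(saliency_mask, lesion_mask)
```

Here precision and recall are both 2/3, so the F-score is also 2/3.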
ISBN: 9781450366151 (print)
In this work, we estimate ball possession statistics from the video of a soccer match. The ball possession statistics are calculated from the valid pass counts of the two playing teams. We propose a player-ball interaction energy function to detect ball pass events. A model for the interaction energy is defined based on the position and velocity of the ball and the players. The energy increases when the ball is closer to, and about to collide with, a player. Lower energy denotes that the ball is moving freely and is not near any player. The interaction energy generates a binary state sequence that determines a valid pass or a miss-pass. We assess the performance of our model on publicly available soccer videos and achieve close to 83% accuracy.
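The abstract does not give the form of the interaction energy, so the following is only a sketch of one function with the described behaviour: higher when the ball is close to a player and closing in, lower when it moves freely. The formula, the names `interaction_energy` and `binary_states`, and the threshold value are assumptions, not the authors' definitions.

```python
import math


def interaction_energy(ball_pos, ball_vel, player_pos, player_vel, eps=1e-6):
    """Hypothetical player-ball energy: grows as the ball gets closer and approaches faster."""
    dx = player_pos[0] - ball_pos[0]
    dy = player_pos[1] - ball_pos[1]
    distance = math.hypot(dx, dy)
    # Velocity of the ball relative to the player.
    rel_vx = ball_vel[0] - player_vel[0]
    rel_vy = ball_vel[1] - player_vel[1]
    # Closing speed: relative velocity projected on the ball-to-player direction.
    closing_speed = (rel_vx * dx + rel_vy * dy) / (distance + eps)
    # Only an approaching ball adds energy; a receding ball contributes nothing extra.
    return (1.0 + max(closing_speed, 0.0)) / (distance + eps)


def binary_states(energies, threshold):
    """Threshold an energy sequence into a 0/1 state sequence (threshold is an assumed parameter)."""
    return [1 if e > threshold else 0 for e in energies]


player = (10.0, 0.0)
still = (0.0, 0.0)
e_far = interaction_energy((0.0, 0.0), (5.0, 0.0), player, still)    # far, approaching
e_near = interaction_energy((8.0, 0.0), (5.0, 0.0), player, still)   # near, approaching
e_away = interaction_energy((8.0, 0.0), (-5.0, 0.0), player, still)  # near, moving away
states = binary_states([e_far, e_near, e_away], threshold=1.0)
```

Only the near, approaching case exceeds the assumed threshold. How the resulting state sequence is mapped to valid passes and miss-passes is not specified in the abstract and is left out here.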
ISBN: 9781450347532 (print)
Classification of brain fiber tracts is an important problem in brain tractography analysis. We propose a supervised algorithm that learns features for anatomically meaningful fiber clusters from labeled DTI white matter data. The classification is performed at two levels: a) grey vs. white matter (macro level) and b) white matter clusters (micro level). Our approach focuses on high-curvature points in the fiber tracts, which embody the unique characteristics of the respective classes. Any test fiber is classified into one of the learned classes by comparing proximity using the learned curvature-point model (at the micro level) and with a neural network classifier (at the macro level). The proposed algorithm has been validated with brain DTI data for three subjects containing about 250,000 fibers per subject, and is shown to yield high classification accuracy (> 93%) at both macro and micro levels.
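One simple way to locate high-curvature points on a fiber stored as a 3D polyline is to measure the turning angle between consecutive segments. The abstract does not state how curvature is computed, so the turning-angle criterion, the helper names and the 45-degree threshold below are illustrative assumptions.

```python
import math


def turning_angles(points):
    """Angle (radians) between consecutive segments at each interior point of a 3D polyline."""
    angles = []
    for a, b, c in zip(points, points[1:], points[2:]):
        u = [b[i] - a[i] for i in range(3)]
        v = [c[i] - b[i] for i in range(3)]
        norm_u = math.sqrt(sum(x * x for x in u))
        norm_v = math.sqrt(sum(x * x for x in v))
        if norm_u == 0 or norm_v == 0:
            # Repeated point: treat as no turn.
            angles.append(0.0)
            continue
        cos_t = sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)
        # Clamp to guard against rounding just outside [-1, 1].
        cos_t = max(-1.0, min(1.0, cos_t))
        angles.append(math.acos(cos_t))
    return angles


def high_curvature_indices(points, min_angle):
    """Indices of interior points whose turning angle is at least min_angle."""
    return [i + 1 for i, ang in enumerate(turning_angles(points)) if ang >= min_angle]


# Toy fiber with a single right-angle bend (illustrative data only).
fiber = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1, 0), (2, 2, 0)]
sharp = high_curvature_indices(fiber, math.radians(45))
```

For this toy fiber only the bend at index 2 is selected. Points found this way could then feed a per-class model, but that learning step is not sketched here.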
ISBN: 9798400710759 (print)
One of the major challenges in no-reference (NR) image quality assessment (IQA) is the ability to generalize to diverse quality assessment applications. Recently, multi-modal vision-language models have been found to be very promising in this direction. They are beginning to form a part of several state-of-the-art NR IQA methods. On the other hand, multi-modal large language models (LLMs) are increasingly being studied for various computer vision applications including IQA. In this work, we perform a thorough study of the ability of multi-modal LLMs for NR IQA by training some of their components and testing their generalizability. In particular, we keep the LLM frozen and learn the parameters corresponding to the querying transformer, the LLM prompt and some layers that process the embedding output by the LLM. We observe that some of these components offer a generalization performance far superior to any existing NR IQA algorithm.
ISBN: 9781450347532 (print)
In this paper, we propose a novel binary descriptor for 3D point clouds. The proposed descriptor, termed 3D Binary Signature (3DBS), is motivated by the matching efficiency of binary descriptors for 2D images. 3DBS describes keypoints from point clouds with a binary vector, resulting in extremely fast matching. The method uses keypoints from standard keypoint detectors. The descriptor is built by constructing a Local Reference Frame and aligning a local surface patch accordingly. The local surface patch is formed by identifying nearest neighbours based upon an angular constraint among them. The points are ordered with respect to the distance from the keypoints. The normals of the ordered pairs of these keypoints are projected on the axes and the relative magnitude is used to assign a binary digit. The vector thus constituted is used as a signature for representing the keypoints. Matching is done using the Hamming distance. We show that 3DBS outperforms state-of-the-art descriptors on various evaluation metrics.
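Matching binary descriptors by Hamming distance reduces to an XOR followed by a bit count, which is why it is fast. The sketch below stores each descriptor as a Python integer and does a brute-force nearest-neighbour search. The 8-bit descriptors, the helper names and the brute-force strategy are illustrative assumptions; the abstract does not specify the descriptor length or the search scheme.

```python
def hamming_distance(a, b):
    """Number of differing bits between two descriptors stored as integers."""
    return bin(a ^ b).count("1")


def match(query, candidates):
    """Return (index, distance) of the candidate closest to the query in Hamming distance."""
    distances = [hamming_distance(query, c) for c in candidates]
    best = min(range(len(candidates)), key=distances.__getitem__)
    return best, distances[best]


# Toy 8-bit descriptors (illustrative values only).
scene = [0b10110010, 0b01101100, 0b10110110]
query = 0b10110011
best_index, best_distance = match(query, scene)
```

The query differs from the first scene descriptor in a single bit, so that descriptor is returned as the match.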
The proceedings contain 30 papers from the 6th International Workshop on Digital Image Processing and Computer Graphics (DIP-97). Topics discussed include: local adaptive filtering in transform domain for image restoration, enhancement, and target location; analysis of running discrete orthogonal transforms; history of stochastic growth model; image restoration involving connectedness; fingerprint ridge structure generation models; dot pattern clustering using a cellular neural network; two-dimensional variation and image decomposition; and digital restoration, enhancement, and archiving of photodocuments.
ISBN: 9781450347532 (print)
Ultrasound (US) guided intervention is a surgical procedure in which the clinician uses imaging in real time to track the position of the needle and correct its trajectory, so as to steer it accurately to the lesion of interest. However, the needle is visible in the US image only when it is aligned in-plane with the scanning plane of the US probe. In practice, clinicians often use a mechanical needle guide, which restricts the available degrees of freedom in the US probe movement. Alternatively, during a free-hand procedure, they use multiple needle punctures to achieve this in-plane positioning. Our present work details an augmented reality (AR) system that aids needle intervention with a focus on patient comfort, through an overlaid visualization of the needle trajectory on the US frame prior to insertion. This is implemented by continuous visual tracking of the US probe and the needle in a 3D world coordinate system using fiducial markers. The tracked marker positions are used to draw the needle trajectory and tip, visualized in real time, as an augmentation of the US feed. Subsequently, the continuously tracked US probe and needle, and the navigation assistance information, would be overlaid with the visual feed from a head mounted display (HMD) to generate a fully immersive AR experience for the clinician.
ISBN: 9783642386282 (digital)
ISBN: 9783642386275 (print)
This book constitutes the refereed proceedings of the 6th Iberian Conference on Pattern Recognition and Image Analysis, IbPRIA 2013, held in Funchal, Madeira, Portugal, in June 2013. The 105 papers (37 oral and 68 poster ones) presented were carefully reviewed and selected from 181 submissions. The papers are organized in topical sections on computer vision, pattern recognition, image and signal, and applications.
ISBN: 9781450366151 (print)
Person re-identification (ReID) is an important problem in computer vision, especially for video surveillance applications. The problem focuses on identifying people across different cameras or across different frames of the same camera. The main challenge lies in recognising the same person despite large appearance and structure variations, while differentiating between individuals. Recently, deep learning networks with triplet loss have become a common framework for person ReID. However, triplet loss focuses on obtaining correct orders on the training set. We demonstrate that it performs worse in a clustering task. In this paper, we design a cluster loss, which leads to model output with a larger inter-class variation and a smaller intra-class variation compared to the triplet loss. As a result, our model has a better generalisation ability and can achieve a higher accuracy on the test set, especially for a clustering task. We also introduce a batch hard training mechanism to improve the results and speed up training convergence.
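As background for the comparison in this abstract, the sketch below computes the standard triplet loss with batch hard mining: for each anchor, the farthest same-identity sample and the closest different-identity sample in the batch form the triplet. This is the commonly used formulation, not the cluster loss proposed in the paper, whose definition the abstract does not give. The margin value, function names and toy embeddings are assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def batch_hard_triplet_loss(embeddings, labels, margin=0.3):
    """Mean over anchors of max(0, margin + hardest_positive - hardest_negative)."""
    losses = []
    for i, (anchor, label) in enumerate(zip(embeddings, labels)):
        positives = [euclidean(anchor, e)
                     for j, (e, l) in enumerate(zip(embeddings, labels))
                     if l == label and j != i]
        negatives = [euclidean(anchor, e)
                     for e, l in zip(embeddings, labels) if l != label]
        if not positives or not negatives:
            continue  # Anchor has no valid triplet in this batch.
        # Hardest positive is the farthest; hardest negative is the closest.
        losses.append(max(0.0, margin + max(positives) - min(negatives)))
    return sum(losses) / len(losses) if losses else 0.0


labels = [0, 0, 1, 1]
# Well-separated identities: every triplet already satisfies the margin.
separated = batch_hard_triplet_loss([(0, 0), (0, 1), (3, 0), (3, 1)], labels)
# Interleaved identities: every anchor violates the margin.
interleaved = batch_hard_triplet_loss([(0, 0), (2, 0), (1, 0), (3, 0)], labels)
```

The separated batch yields zero loss, while the interleaved batch yields 1.3 per anchor, since each hardest positive lies at distance 2 and each hardest negative at distance 1.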
ISBN: 9798400710759 (print)
High-intensity activities in sports like basketball can result in fatigue without proper recovery. This study introduces a collaborative framework that leverages computer vision (CV) and machine learning for evaluating jump landings and predicting athletic readiness by modelling the biomechanical aspects of Countermovement Jumps (CMJs). Seventeen female collegiate basketball athletes of Sacred Heart University (SHU), CT, USA, participated in weekly CMJs over a 26-week season. Through CV-driven semantic analysis of videos, the framework identifies the crucial initial contact and maximum flexion point during jump landings and extracts kinetic and kinematic features of the lower extremities. Next, an inferential analysis is conducted to understand the relationship between these features and the CMJ-driven reactive strength index-modified (RSImod) score, which measures fatigue and athletic readiness. An XGBoost regressor, trained on the past week's data, then predicted the RSImod score for the following week, which resulted in an MSE of 0.020 and an R² of 0.892. Using SHapley Additive exPlanations (SHAP), the framework offers interpretable feedback, aiding coaches in creating personalised training programs and optimising athletic performance while minimising injury risks.
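The two metrics reported for the regressor, MSE and R², can be computed directly from true and predicted scores. The sketch below uses their standard definitions. The function names and the toy values are assumptions for illustration and are not data from the study.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy scores (illustrative values only, not RSImod data from the study).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 3.0, 3.5]
mse = mean_squared_error(y_true, y_pred)
r2 = r_squared(y_true, y_pred)
```

For these toy values the MSE is 0.125 and R² is 0.9. An R² near 1 means the predictions explain most of the variance in the target.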