Anatomical key-point recognition is essential in many medical image analyses and clinical healthcare applications. Successfully identifying these anatomical key points provides multiple advantages, such as assisting medical experts in making treatment adjustments and offering information that helps position surgical instruments at the appropriate locations. However, manual anatomical key-point recognition is subjective, slow, and time-consuming, especially when processing many medical images in clinical institutions. To overcome these limitations, this study aims to establish the correlation between human anatomical key points detected by the OpenPose and Baidu AI key-point detection techniques and the ground-truth anatomical key points marked by therapists in human medical images. This relationship will help to optimize detection performance, reduce cost, decrease human error, and accelerate the process. The Sichuan Cancer Hospital provided five whole-body scan images obtained from a clinical CT scanner. A medical expert subsequently identified 14 anatomical key points from each scan. Finally, the datasets were reconstructed into 3-dimensional volume models to visualize the whole-body skin models and skeletons. The 14 human-annotated key points were then used as ground truth for comparison against the computer-vision techniques OpenPose and Baidu AI. Both OpenPose and Baidu AI were found to have systematic offsets from the ground-truth reference points. These findings are reported in this work and can be used as a correction method.
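The abstract does not specify how the reported systematic offsets would be applied as a correction, so the following minimal Python sketch only illustrates the general idea: estimate a mean offset between detected and therapist-annotated key points and add it back to new detections. All function names, array shapes, and coordinate values are illustrative, not taken from the study.

```python
import numpy as np

def estimate_offset(detected, ground_truth):
    """Estimate the mean systematic (x, y) offset between detected key points
    (e.g. from OpenPose or Baidu AI) and therapist-annotated ground truth.
    Both arrays have shape (n_keypoints, 2)."""
    return np.mean(ground_truth - detected, axis=0)

def correct_keypoints(detected, offset):
    """Apply the estimated offset as a simple additive correction."""
    return detected + offset

# Illustrative usage with made-up coordinates (not values from the study).
detected = np.array([[120.0, 85.0], [130.0, 150.0]])
ground_truth = np.array([[123.0, 88.0], [133.0, 152.0]])

offset = estimate_offset(detected, ground_truth)
print("Estimated systematic offset (dx, dy):", offset)
print("Corrected key points:\n", correct_keypoints(detected, offset))
```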
ISBN (print): 9781450396899
Traditional convolution and pooling operations in previous semantic segmentation methods cause a loss of feature information due to their limited receptive field size and are insufficient to support accurate image prediction. To solve this problem, we first design a Second-Order Encoder to enlarge the feature receptive field and capture more semantic context information. Second, we design a Restore Detail Decoder that focuses on processing spatial detail information and refining object edges. Experiments verify the effectiveness of the proposed approach. The results show that our method achieves competitive performance on two datasets, PASCAL VOC2012 and Cityscapes, with mIoU of 80.13% and 76.31%, respectively.
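The abstract does not detail the Second-Order Encoder's internals, so the PyTorch sketch below only illustrates the underlying idea of enlarging the receptive field without extra pooling, here via stacked dilated convolutions, which is one common way to capture wider semantic context. The module name, channel sizes, and dilation rates are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class DilatedContextBlock(nn.Module):
    """Illustrative encoder block: stacked dilated convolutions widen the
    receptive field while keeping spatial resolution unchanged."""

    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1, dilation=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=2, dilation=2),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=4, dilation=4),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.layers(x)

# Quick shape check on a dummy feature map.
block = DilatedContextBlock(in_channels=256, out_channels=256)
features = torch.randn(1, 256, 64, 64)
print(block(features).shape)  # torch.Size([1, 256, 64, 64])
```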
The textual information present in natural scene images is a valuable clue for many vision-based applications. Locating text areas becomes particularly challenging due to issues like varying lighting conditions, complex backgrounds, non-uniform text sizes, and more. Detecting curved text, in particular, is a demanding task in the fields of computer vision and pattern recognition. This research proposes a robust approach for the detection of text in natural scene images with different orientations. The proposed method is divided into three key stages: (i) Pre-processing: The original image is converted to grayscale, and a new Norm-Contrast-Limited Adaptive Histogram Equalization (Norm-CLAHE) is proposed to enhance the contrast of the grayscale image while limiting noise amplification and preventing saturation in dark and bright text regions. (ii) Text Region Extraction: Otsu's thresholding, Maximally Stable Extremal Regions (MSER), and a Sobel edge detector are combined to extract text regions and enhance edges in the image. (iii) Non-Text Region Removal: In this stage, Connected Components (CC), morphological operations, MSER region filters, and basic text geometric properties are combined into a framework for removing non-text regions and drawing rectangular bounding boxes around the detected text regions. The performance of the proposed method was assessed on the ICDAR 2017 text dataset, yielding an overall accuracy of 93.47% and demonstrating highly satisfactory results.
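A minimal OpenCV sketch of the three stages is shown below. The paper's Norm-CLAHE variant and exact filter thresholds are not specified in the abstract, so standard CLAHE and illustrative parameter values (clip limit, kernel size, area and aspect-ratio bounds, file paths) are used as assumptions.

```python
import cv2

# Stage (i): pre-processing -- grayscale conversion and contrast enhancement.
# Standard CLAHE stands in for the paper's Norm-CLAHE variant (not specified here).
image = cv2.imread("scene.jpg")  # input path is illustrative
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = clahe.apply(gray)

# Stage (ii): candidate text regions via Otsu thresholding, MSER, and Sobel edges.
_, otsu = cv2.threshold(enhanced, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
mser = cv2.MSER_create()
mser_regions, _ = mser.detectRegions(enhanced)
sobel_edges = cv2.Sobel(enhanced, cv2.CV_8U, 1, 0, ksize=3)

# Stage (iii): remove non-text regions with morphology, connected components,
# and basic geometric filters, then draw bounding boxes (thresholds illustrative).
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 3))
closed = cv2.morphologyEx(otsu, cv2.MORPH_CLOSE, kernel)
n_labels, _, stats, _ = cv2.connectedComponentsWithStats(closed)
for i in range(1, n_labels):
    x, y, w, h, area = stats[i]
    aspect_ratio = w / float(h)
    if area > 100 and 0.1 < aspect_ratio < 10:  # simple text-geometry filter
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)

cv2.imwrite("detected_text.jpg", image)
```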
ISBN (print): 9798350301298
International benchmarking competitions have become fundamental for the comparative performance assessment of image analysis methods. However, little attention has been given to investigating what can be learnt from these competitions. Do they really generate scientific progress? What are common and successful participation strategies? What makes a solution superior to a competing method? To address this gap in the literature, we performed a multi-center study with all 80 competitions that were conducted in the scope of IEEE ISBI 2021 and MICCAI 2021. Statistical analyses performed based on comprehensive descriptions of the submitted algorithms linked to their rank as well as the underlying participation strategies revealed common characteristics of winning solutions. These typically include the use of multi-task learning (63%) and/or multi-stage pipelines (61%), and a focus on augmentation (100%), image preprocessing (97%), data curation (79%), and post-processing (66%). The "typical" lead of a winning team is a computer scientist with a doctoral degree, five years of experience in biomedical image analysis, and four years of experience in deep learning. Two core general development strategies stood out for highly-ranked teams: the reflection of the metrics in the method design and the focus on analyzing and handling failure cases. According to the organizers, 43% of the winning algorithms exceeded the state of the art but only 11% completely solved the respective domain problem. The insights of our study could help researchers (1) improve algorithm development strategies when approaching new problems, and (2) focus on open research questions revealed by this work.
In order to recognize patterns in images, this study tests the performance of several machine learning algorithms and feature extraction methods. Here, synthetic photographs of handwritten digits are used to compare the performance of four machine learning methods (deep learning, support vector machines, decision trees, and random forests) and two feature extraction strategies (raw pixel values and Histogram of Oriented Gradients). The efficacy of each algorithm is measured in terms of its accuracy, precision, recall, and F1 score, among other metrics. Our findings also demonstrate that the Histogram of Oriented Gradients feature extraction method is effective at capturing local gradient information in images and that deep learning and support vector machines obtain the best accuracy overall. The results of our research have significant ramifications for the future of machine learning techniques used in computer vision and handwriting recognition. Future research may test these methods on other datasets and image types, or investigate alternative feature extraction strategies and machine learning algorithms.
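A minimal sketch of the HOG-plus-SVM combination reported to perform well is given below, using scikit-image and scikit-learn on the bundled 8x8 digits dataset. The study's own synthetic photographs and exact hyperparameters are not available here, so the dataset, HOG cell sizes, and SVM settings are stand-in assumptions.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, f1_score
from skimage.feature import hog

# Small handwritten-digit dataset (8x8 images) as a stand-in for the study's data.
digits = load_digits()
images, labels = digits.images, digits.target

# Feature extraction: Histogram of Oriented Gradients per image.
hog_features = np.array([
    hog(img, orientations=9, pixels_per_cell=(4, 4), cells_per_block=(2, 2))
    for img in images
])

X_train, X_test, y_train, y_test = train_test_split(
    hog_features, labels, test_size=0.2, random_state=0
)

# Support vector machine classifier (hyperparameters are illustrative).
clf = SVC(kernel="rbf", C=10.0, gamma="scale")
clf.fit(X_train, y_train)
pred = clf.predict(X_test)

print("Accuracy:", accuracy_score(y_test, pred))
print("Macro F1:", f1_score(y_test, pred, average="macro"))
```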
Face recognition is one of the premier disciplines in the vast field of computer vision and image analysis. A popular method is Gaussian scale space analysis, which limits the performance by smoothing both the nois...
Multiple users can jointly train a universal model using federated learning, a revolutionary AI approach, without having to reveal their personal data. With this strategy, anonymity is maintained while a secure learni...
Deep learning has become an effective approach over the past few years to addressing intricate computer vision problems, and Convolutional Neural Networks (CNNs) have been the primary driving force behind this progres...
The conflict between computational overhead and detection accuracy affects nearly every Automated Accident Detection (AAD) system. Although the accuracy of detection and classification approaches has recently improved...
Diabetic Retinopathy (DR) is the main cause of blindness and harms the retina due to the accumulation of glucose in the blood. Therefore, early DR detection, diagnosis, segmentation, and classification prevent patient...