The theme of this study is to provide a detailed description of its recent improvements in image segmentation and lesion classification in disease prognosis. Previous studies have shown that gray-white matter hyperint...
In computer-vision-based displacement measurement, an optical target is usually required to provide the reference. If the optical target cannot be attached to the object being measured, edge detection and template matching are the most common approaches in target-less photogrammetry. However, their performance relies heavily on parameter settings. This becomes problematic in dynamic scenes where complicated background texture exists and varies over time. To tackle this issue, we propose virtual point tracking for real-time target-less dynamic displacement measurement, incorporating deep learning techniques and domain knowledge. Our approach consists of three steps: 1) automatic calibration to detect the region of interest; 2) virtual point detection in each video frame using a deep convolutional neural network; 3) a domain-knowledge-based rule engine for point tracking in adjacent frames. The proposed approach can be executed on an edge computer in real time (i.e., at over 30 frames per second). We demonstrate our approach in a railway application, where the lateral displacement of the wheel on the rail is measured during operation. Numerical experiments were performed to evaluate the approach's performance and latency in a harsh railway environment with dynamic, complex backgrounds. We make our code and data available at https://github.com/quickhdsdc/Point-Tracking-for-Displacement-Measurement-in-Railway-Applications.
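The abstract does not state which rules the tracking engine applies, so the following is only a hypothetical sketch of what a rule for adjacent frames might look like. It assumes one detected point per frame (or `None` when detection fails) and a single plausibility rule: a detection that jumps more than `max_jump` pixels from the last accepted point is rejected and the previous point is reused. The function name, the threshold, and the rule itself are assumptions for illustration, not the authors' method.

```python
def track_points(detections, max_jump=5.0):
    """Filter per-frame point detections with a simple plausibility rule.

    detections: list of (x, y) tuples, or None where detection failed.
    max_jump:   hypothetical limit on frame-to-frame movement, in pixels.
    Returns one point per frame (None until a first point is accepted).
    """
    tracked = []
    last = None  # last accepted point
    for det in detections:
        if det is None:
            # Missing detection: hold the last accepted point.
            tracked.append(last)
            continue
        if last is not None:
            dx = det[0] - last[0]
            dy = det[1] - last[1]
            if (dx * dx + dy * dy) ** 0.5 > max_jump:
                # Implausible jump between adjacent frames: reject it.
                tracked.append(last)
                continue
        last = det
        tracked.append(det)
    return tracked


frames = [(10, 10), (11, 10), (50, 40), None, (12, 11)]
print(track_points(frames))
```

In this toy sequence the outlier at `(50, 40)` and the missing detection are both replaced by the last accepted point.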
Lip reading has gained popularity due to the proliferation of emerging real-world applications. This article provides a comprehensive review of benchmark datasets available for lip-reading applications and of pioneering works that analyze lower facial cues for lip reading. Lip-reading applications are broadly classified into five categories: Lip Reading Biometrics (LRB), Audio Visual Speech Recognition (AVSR), Silent Speech Recognition (SSR), Voice from Lips, and Lip HCI (human-computer interaction). LRB covers extensive research in authentication and liveness detection. AVSR covers key findings that have contributed significantly to applications such as voice assistants, video-to-text transcription, hearing aids, and pronunciation-correcting systems. SSR analyzes the efforts made toward silent-video-to-text transcription and surveillance camera applications. The Voice from Lips section discusses applications such as voice for the voiceless and vision-infused speech inpainting. The Lip HCI section reviews in detail LR-HCI for smartphones, smart TVs, computers, robots, and musical instruments. Comprehensive coverage is given to cutting-edge techniques in computer vision, signal processing, machine learning, and deep learning. The review covers the advancements that help systems learn to lip-read and authenticate lip gestures, generate text transcriptions, synthesize voice from lip movements, and control systems via lip movements (Lip HCI). The work concludes by highlighting the limitations of existing frameworks, the road map of each application illustrating how the techniques employed have evolved over time, and future research avenues in lip-reading applications.
Scalp electroencephalogram (EEG) signals inherently have a low signal-to-noise ratio due to the way the signal is electrically transduced. Temporal and spatial information must be exploited to achieve accurate detecti...
ISBN: (digital) 9781510645691
ISBN: (print) 9781510645691; 9781510645684
In the last decade, limiting the spread of wildfires has become increasingly necessary. In particular, it is important to optimize the resources deployed to verify zones flagged as probable fires. Hence, sophisticated yet low-cost techniques for sensing these zones are in high demand. Models with high computational requirements are not suitable for real-time applications; simpler models are needed to complete the desired tasks within an acceptable response time. SqueezeSegV2 is a model initially applied to LiDAR (Light Detection And Ranging) point cloud segmentation, where it gives a high IoU value compared with other state-of-the-art architectures. The model, which is robust against dropout noise, is evaluated in this paper. Experiments were run on the public Corsican French dataset of 1135 RGB images. Highly unbalanced datasets, such as this one, commonly lead to high precision but low sensitivity. Therefore, several validation criteria were adopted to assess performance. Specifically, the model was tested with four different metrics: accuracy, mean Intersection over Union (IoU), mean Boundary F1 (BF) score, and mean Dice coefficient. The experimental results demonstrate that the model, trained on the Corsican French dataset with a five-fold cross-validation procedure, can accurately detect fire flames. Results were collected for different loss functions: Focal loss, Dice loss, and Tversky loss. Overall, the results are very encouraging for further study using deep learning approaches.
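As a rough illustration of the overlap measures named above, the sketch below computes IoU, the Dice coefficient, and the Tversky index (the quantity behind the Tversky loss, usually taken as one minus the index) for flat binary masks. It uses the standard textbook definitions; the function names, the toy masks, and the `alpha`/`beta` values are assumptions and are not taken from the paper.

```python
def confusion_counts(pred, target):
    """Count true positives, false positives, false negatives for flat 0/1 masks."""
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    return tp, fp, fn


def iou(pred, target):
    """Intersection over Union: TP / (TP + FP + FN)."""
    tp, fp, fn = confusion_counts(pred, target)
    denom = tp + fp + fn
    return tp / denom if denom else 1.0  # two empty masks count as a perfect match


def dice(pred, target):
    """Dice coefficient: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = confusion_counts(pred, target)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


def tversky_index(pred, target, alpha=0.5, beta=0.5):
    """Tversky index: TP / (TP + alpha*FP + beta*FN).

    With alpha = beta = 0.5 it equals Dice; a larger beta penalises missed
    fire pixels (false negatives) more, which can help on unbalanced data.
    """
    tp, fp, fn = confusion_counts(pred, target)
    denom = tp + alpha * fp + beta * fn
    return tp / denom if denom else 1.0


pred = [1, 1, 0, 0, 1, 0]    # toy predicted mask (1 = fire pixel)
target = [1, 0, 0, 0, 1, 1]  # toy ground-truth mask
print(iou(pred, target), dice(pred, target), tversky_index(pred, target, 0.3, 0.7))
```

For the same masks Dice is never lower than IoU, which is one reason reporting both gives a fuller picture on unbalanced data.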
This paper considers accident images and develops a deep learning method for feature extraction together with a mixture of experts for classification. For the first task, the outputs of the last max-pooling layer of a Convolutional Neural Network (CNN) are used to extract the hidden features automatically. For the second task, a mixture of advanced variations of the Extreme Learning Machine (ELM), including basic ELM, constrained ELM (CELM), Online Sequential ELM (OSELM), and Kernel ELM (KELM), is developed. This ensemble classifier combines the advantages of the different ELMs using a gating network; its accuracy is very high, while its processing time is close to real time. To show its efficiency, different combinations of traditional feature extraction and feature selection methods with various classifiers are examined on two kinds of benchmarks: an accident image data set and some general data sets. The proposed system is shown to detect accidents with 99.31% precision, recall, and F-measure. In addition, the precisions of accident-severity classification and involved-vehicle classification are 90.27% and 92.73%, respectively. The system is suitable for online processing of accident images captured by Unmanned Aerial Vehicles (UAVs) or other surveillance systems.
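The abstract does not specify how the gating network combines the ELM experts, so the sketch below only illustrates the general mixture-of-experts idea: each expert outputs class probabilities, and a gate turns per-expert scores into softmax weights for a weighted average. The function names and all numbers are hypothetical, and the experts here are stand-in probability vectors, not trained ELMs.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_predict(expert_probs, gate_scores):
    """Combine expert outputs using gate weights.

    expert_probs: one class-probability list per expert (same length each).
    gate_scores:  one raw score per expert, as a gating network might emit.
    Returns (predicted_class_index, combined_probabilities).
    """
    weights = softmax(gate_scores)
    n_classes = len(expert_probs[0])
    combined = [
        sum(w * probs[c] for w, probs in zip(weights, expert_probs))
        for c in range(n_classes)
    ]
    return combined.index(max(combined)), combined


# Three hypothetical experts voting on two classes (0 = no accident, 1 = accident).
experts = [[0.9, 0.1], [0.2, 0.8], [0.4, 0.6]]
label, probs = mixture_predict(experts, gate_scores=[0.0, 3.0, 0.0])
print(label, probs)
```

Because the gate weights sum to one, the combined output remains a valid probability distribution whichever expert the gate favours.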
Hashing-based medical image retrieval, which aims to provide effective diagnostic aid for medical personnel, has drawn extensive attention recently. In this paper, a novel deep hashing framework is proposed for medical image retrieval, in which deep feature extraction, binary code learning, and deep hash function learning are carried out jointly in a supervised fashion. In particular, the discrete constrained objective function in hash code learning is optimized iteratively, so that the binary code can be solved directly with no need for relaxation. Meanwhile, semantic similarity is maintained by fully exploiting supervision information during the discrete optimization, and the neighborhood structure of the training data is preserved by applying a graph regularization term. Additionally, to obtain a fine-grained ranking of returned medical images that share the same Hamming distance, a novel image re-ranking scheme is proposed that refines the similarity measurement by jointly considering the Euclidean distance between the real-valued feature descriptors of those images and their category information. Extensive experiments on a pulmonary nodule image dataset demonstrate that the proposed method achieves better retrieval performance than state-of-the-art methods.
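The re-ranking idea can be sketched in a simplified form: rank database items by Hamming distance between binary codes, then break ties among items at the same Hamming distance using the Euclidean distance between real-valued features. This is an assumption-laden illustration only; it omits the category information that the described scheme also uses, and the function names and toy data are hypothetical.

```python
import math


def hamming(a, b):
    """Number of differing bits between two equal-length binary codes."""
    return sum(x != y for x, y in zip(a, b))


def euclidean(u, v):
    """Euclidean distance between two real-valued feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def retrieve(query_code, query_feat, database):
    """Rank items by Hamming distance, then by Euclidean distance within ties.

    database: list of (item_id, binary_code, feature_vector) tuples.
    Returns item ids from best to worst match.
    """
    scored = [
        (hamming(query_code, code), euclidean(query_feat, feat), item_id)
        for item_id, code, feat in database
    ]
    scored.sort()  # tuples sort by Hamming first, Euclidean second
    return [item_id for _, _, item_id in scored]


db = [
    ("a", [1, 0, 1, 1], [0.9, 0.1]),
    ("b", [1, 0, 1, 0], [0.2, 0.2]),
    ("c", [1, 0, 0, 0], [0.0, 0.0]),
    ("d", [1, 0, 1, 0], [0.1, 0.1]),
]
print(retrieve([1, 0, 1, 0], [0.0, 0.0], db))
```

Items `b` and `d` share Hamming distance 0 to the query, so their order is decided by feature distance, which plain Hamming ranking cannot resolve.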
Face detection and tracking have become popular in recent years. They are of critical importance in security, defense, and robotic uses encountered in daily life. For this purpose, many decision support systems or expert s...
ISBN: (print) 9783031442360; 9783031442377
Adversarial patch attacks have become a primary concern in recent years, as they pose a significant threat to the security and reliability of deep neural networks. Modifying benign images by introducing adversarial patches, which consist of localized adversarial pixels, alters the salient features of the image and results in misclassification. The novelty of our approach lies in using an image inpainting technique as an adversarial defence to rectify the patch region. The adversarial patch is automatically localized using Fast Score Class Activation Map and then replaced by inpainting with the Fast Marching Method, which efficiently propagates pixel information from the surrounding areas into the patch region. This approach preserves the structural integrity of the original image while inpainting the adversarial pixels. Moreover, the defender cannot be expected to have prior knowledge of the patch at the time of the attack. We therefore propose our adversarial defence technique in a black-box setting that assumes no knowledge of the patch location, shape, or size. Furthermore, we do not rely on re-training the victim model on adversarial examples, which indicates the approach's potential usefulness for real-world applications. Our experimental results show that the proposed approach achieves accuracy of up to 76.37% on ImageNet100 under adversarial patch attack, a considerable improvement of 76.28 percentage points. On benign images, our approach achieves an accuracy of 81.11%, suggesting that the defence pipeline is applicable regardless of whether the input image is adversarial or clean.
Augmented reality applications overlay our physical world with digital components in an interactive 3D space. These applications generally capture information about the physical world around the user through cameras a...