ISBN (digital): 9788362065486
ISBN (print): 9798350373806
This work addresses the problem of extracting sounds that are unexpected in an audio stream and stand out because of their spectrotemporal characteristics. In human auditory scene analysis, such sounds are referred to as (sensory) salient. Previous research is mostly limited to detecting the presence of salient sounds and localizing them in time within the signal. Other approaches aim at developing classifiers that detect fixed, predetermined categories of salient sounds. In contrast, this work aims to develop a solution that suppresses all background (non-salient) sounds in an audio stream while preserving the salient sounds with as little distortion as possible. An additional assumption is that the algorithm should not be limited to any particular category of salient sound events. This challenging task is realized in two steps, both novel contributions of this work. In the first step, a large-scale dataset of clean background samples and clean salient sound samples is created by automatically processing a publicly available collection of field recordings. In the second step, a deep neural network (U-Net) trained to predict the complex ideal ratio mask, a method typically used for speech enhancement, is adopted and evaluated in the context of salient sound extraction. The experimental results indicate that the proposed solution is potentially highly effective and point to directions for future research.
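The complex ideal ratio mask mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the usual definition, in which the mask is the per-bin complex ratio of the clean target spectrum to the mixture spectrum; the function names, the `eps` stabilizer, and the toy values are hypothetical and are not taken from the paper, which trains a U-Net to predict such a mask rather than computing it from known targets.

```python
def complex_ideal_ratio_mask(clean, mixture, eps=1e-8):
    """Per-bin complex mask M with M * mixture approximately equal to clean.

    clean and mixture are lists of complex time-frequency bins (for example,
    flattened STFT coefficients). eps is an assumed stabilizer for empty bins.
    """
    mask = []
    for s, y in zip(clean, mixture):
        denom = y.real ** 2 + y.imag ** 2 + eps
        # Real and imaginary parts of s * conj(y) / |y|^2
        m_real = (y.real * s.real + y.imag * s.imag) / denom
        m_imag = (y.real * s.imag - y.imag * s.real) / denom
        mask.append(complex(m_real, m_imag))
    return mask


def apply_mask(mask, mixture):
    """Complex multiplication of the mask with the mixture, bin by bin."""
    return [m * y for m, y in zip(mask, mixture)]


# Toy example: three bins of a "salient" target plus background.
salient = [1 + 2j, 0.5 - 1j, 0j]
background = [0.3 - 0.1j, -0.2 + 0.4j, 1 + 1j]
mixture = [s + b for s, b in zip(salient, background)]

mask = complex_ideal_ratio_mask(salient, mixture)
recovered = apply_mask(mask, mixture)
```

Because the mask is complex, it corrects both magnitude and phase, which is why applying the ideal mask recovers the target bins almost exactly in this toy case.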
ISBN (digital): 9798350387230
ISBN (print): 9798350387247
Image registration has become a major medical image computing technology over the past ten years, with applications ranging from computer-assisted therapy and surgery to computer-assisted diagnosis. This paper presents a medical image registration model based on a Cross-Entropy-based Deep Neural Network (CEL-DNN). First, the fixed and moving images are pre-processed to remove noise and enhance contrast. Next, features are extracted from both images and matched using the Mahalanobis Distance-based Brute-Force Matcher (MD-BFM) approach. FIS is then used to estimate the transformation parameters from the matched features. The final aligned image is obtained by applying these parameters to the moving image using the CEL-DNN model. Finally, the proposed model is benchmarked against existing models to demonstrate its performance.
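The matching step described above can be sketched as a generic brute-force matcher that scores every descriptor pair with the Mahalanobis distance. This is an illustration under stated assumptions, not the paper's MD-BFM: the function names, the 2-D toy descriptors, and the precomputed inverse covariance matrix `inv_cov` are all hypothetical.

```python
import math


def mahalanobis(u, v, inv_cov):
    """Mahalanobis distance sqrt((u - v)^T * inv_cov * (u - v))."""
    d = [a - b for a, b in zip(u, v)]
    total = 0.0
    for i, di in enumerate(d):
        for j, dj in enumerate(d):
            total += di * inv_cov[i][j] * dj
    return math.sqrt(total)


def brute_force_match(fixed_desc, moving_desc, inv_cov):
    """For each fixed-image descriptor, find the closest moving-image descriptor.

    Returns a list of (fixed_index, moving_index, distance) tuples.
    """
    matches = []
    for i, f in enumerate(fixed_desc):
        distances = [mahalanobis(f, m, inv_cov) for m in moving_desc]
        j = min(range(len(distances)), key=distances.__getitem__)
        matches.append((i, j, distances[j]))
    return matches


# Toy 2-D descriptors. The assumed covariance is diag(4, 1),
# so its inverse is diag(0.25, 1.0).
inv_cov = [[0.25, 0.0], [0.0, 1.0]]
fixed_desc = [[0.0, 0.0], [4.0, 1.0]]
moving_desc = [[4.2, 1.1], [0.1, -0.1], [2.0, 3.0]]

matches = brute_force_match(fixed_desc, moving_desc, inv_cov)
```

Compared with plain Euclidean distance, the inverse covariance down-weights descriptor dimensions with high variance, so a difference along a noisy dimension counts for less when choosing the best match.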
A brain-computer interface (BCI) facilitates direct interaction between the brain and external devices. To concurrently achieve high decoding accuracy and low energy consumption in invasive BCIs, we propose a novel sp...
Image identification with feature extraction in medical applications has proven to be a significant obstacle in recent years. For medical doctors, diagnosing illnesses using image recognition of X-ray or scan picture...
Deep Learning as a Service (DLaaS) stands as a promising solution for cloud-based inference applications. In this setting, the cloud has a pre-learned model whereas the user has samples on which she wants to run the model. The biggest concern with DLaaS is user privacy when the input samples are sensitive data. We provide here an efficient privacy-preserving system that employs high-end technologies such as Fully Homomorphic Encryption (FHE), Convolutional Neural Networks (CNNs), and Graphics Processing Units (GPUs). FHE, with its widely known ability to compute on encrypted data, enables a wide range of privacy-sensitive applications. This comes at a high cost, as it requires enormous computing power. In this article, we show how to accelerate CNN inference on encrypted data with GPUs. We evaluated two CNNs that homomorphically classify the MNIST and CIFAR-10 datasets. Our solution achieved a sufficient security level (> 80 bit) and reasonable classification accuracy (99 percent and 77.55 percent) for MNIST and CIFAR-10, respectively. In terms of latency, we could classify an image in 5.16 seconds and 304.43 seconds for MNIST and CIFAR-10, respectively. Our system can also classify a batch of images (> 8,000) without extra overhead.
As a prevailing research area in artificial intelligence, computer vision is widely applied in many fields closely related to people's livelihood, such as industrial automation, new retail, smart transportation, and security monitoring. Face recognition is a branch of computer vision that integrates neural networks, biology, image signal processing, machine learning, and other fields, promoting research and cross-development among different disciplines. This paper therefore focuses on a face recognition method based on a convolutional neural network (CNN). The CNN's "weight sharing" property, which has made it widely popular in image recognition, greatly simplifies the training of large-scale networks. The experiments demonstrate that the proposed face recognition method is successful, with an accuracy as high as 98%.
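The effect of weight sharing on model size can be shown with a short parameter count. The layer sizes below (a 32x32 RGB input, 16 filters of size 3x3) are assumed purely for illustration and do not come from the paper; the helper names are hypothetical.

```python
def conv_params(kernel, in_channels, out_channels):
    """Weights plus biases of a conv layer: each filter is shared across all positions."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(height, width, in_channels, out_channels):
    """Weights plus biases of a fully connected layer producing a same-sized output map."""
    inputs = height * width * in_channels
    outputs = height * width * out_channels
    return inputs * outputs + outputs


# Assumed sizes: 32x32 RGB input, 16 output channels, 3x3 kernels.
conv_count = conv_params(3, 3, 16)
dense_count = dense_params(32, 32, 3, 16)
print(conv_count, dense_count)
```

Under these assumptions the convolutional layer needs a few hundred parameters while the fully connected equivalent needs tens of millions, which is the sense in which weight sharing simplifies training.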
The widespread availability of image forgery software necessitates integrity verification of digital images in industrial and medical applications. Because of image manipulation, detecting small tampering and duplicated forgery in digital radiography (gamma and x-ray) images has become a research challenge. Two approaches are proposed for forgery detection in digital radiography images. The first is a precise forgery detection approach based on pretrained deep convolutional neural networks (CNNs). Three pretrained networks, Alexnet, Resnet-18, and VGG-19, are used for feature extraction. Artificial neural network (ANN) and multiclass support vector machine (MSVM) classifiers are applied to classify the extracted features as authentic or forged. The second approach relies on Haralick and Zoning feature extractors, whose features are used to train and test a K-nearest neighbors (KNN) classifier. The approaches are investigated on several manipulated industrial (gamma welding images) and medical (spine images) datasets, and are also tested on several color benchmark datasets. The results are verified using a variety of evaluation metrics. The approaches are validated through comparison with published work, and high agreement is demonstrated. For digital radiography images, the Alexnet pretrained network with MSVM, the Resnet-18 pretrained network with ANN, and the Haralick extractor with KNN achieve the highest accuracy and assessment metrics. The pretrained CNNs outperform the conventional classification algorithms in terms of accuracy and computational time. The developed approaches allow for the precise detection of forged regions in x-ray and gamma radiographic images as well as in digital images. © 2023 Elsevier B.V. All rights reserved.
ISBN (print): 9781665416696
The main objective of this paper was to effectively interface object detection based on Convolutional Neural Networks (CNNs) with selective lossy image compression techniques, in order to improve the efficiency of subsequent image operations and reduce the memory required to store images in autonomous self-driving vehicle applications. Object detection and localization were performed using two state-of-the-art CNN-based models from the Tensorflow 2.0 Object Detection API: Faster R-CNN ResNet152 V1 1024x1024 and CenterNet HourGlass104 1024x1024. Lossy image compression centred around the most prominent detected object (which is preserved) was performed with three techniques: K-Means Clustering (KM), Genetic Algorithm (GA), and Discrete Cosine Transform (DCT). The compressed and preserved parts were recombined to produce the final image. The results obtained from the different models and compression techniques were analysed. DCT was found to produce the best results with both models.
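The idea of compressing everything except the detected object can be sketched with a block DCT that discards high-frequency coefficients outside a bounding box. This is a minimal pure-Python illustration, not the paper's pipeline: the block size, the number of retained coefficients `keep`, the `(top, left, bottom, right)` box format, and all function names are assumptions, and the image dimensions are assumed to be multiples of the block size.

```python
import math


def _scale(k, n):
    """Orthonormal DCT-II scaling factor for coefficient k of an n-point transform."""
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct_1d(x):
    n = len(x)
    return [_scale(k, n) * sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n)
                               for t in range(n)) for k in range(n)]


def idct_1d(c):
    n = len(c)
    return [sum(_scale(k, n) * c[k] * math.cos(math.pi * (t + 0.5) * k / n)
                for k in range(n)) for t in range(n)]


def _transpose(m):
    return [list(row) for row in zip(*m)]


def _apply_2d(transform, block):
    """Apply a 1-D transform along rows, then along columns."""
    rows = [transform(row) for row in block]
    return _transpose([transform(col) for col in _transpose(rows)])


def compress_block(block, keep):
    """Keep only the keep x keep low-frequency DCT coefficients of a square block."""
    coeffs = _apply_2d(dct_1d, block)
    for u in range(len(coeffs)):
        for v in range(len(coeffs[u])):
            if u >= keep or v >= keep:
                coeffs[u][v] = 0.0
    return _apply_2d(idct_1d, coeffs)


def selective_compress(image, box, block=4, keep=2):
    """Lossy-compress pixels outside box; pixels inside box are copied unchanged.

    box is (top, left, bottom, right) with exclusive bottom and right.
    """
    top, left, bottom, right = box
    out = [row[:] for row in image]
    for r0 in range(0, len(image), block):
        for c0 in range(0, len(image[0]), block):
            tile = [image[r][c0:c0 + block] for r in range(r0, r0 + block)]
            approx = compress_block(tile, keep)
            for dr in range(block):
                for dc in range(block):
                    r, c = r0 + dr, c0 + dc
                    if not (top <= r < bottom and left <= c < right):
                        out[r][c] = approx[dr][dc]
    return out


# Toy 8x8 grayscale image with a preserved 3x3 region.
image = [[float((r * 8 + c) % 17) for c in range(8)] for r in range(8)]
box = (2, 2, 5, 5)
result = selective_compress(image, box, block=4, keep=2)
lossless = selective_compress(image, box, block=4, keep=4)
```

A real codec would also quantize and entropy-code the retained coefficients; the sketch only shows how the preserved region and the compressed background are recombined into one image.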
Reducing environmental pollution from household waste and from the emissions of computing clusters is an urgent technological problem. In our work, we explore both of these aspects: the application of deep learning to improve the efficiency of waste recognition on a recycling plant's conveyor, and the carbon dioxide emissions of the computing devices used in this process. To conduct the research, we developed a unique open WaRP dataset that demonstrates the best diversity among similar industrial datasets and contains more than 10,000 images with 28 different types of recyclable goods (bottles, glasses, cardboard, cans, detergents, and canisters). Objects can overlap, be poorly lit, or be significantly distorted. On the WaRP dataset, we study the training and evaluation of cutting-edge deep neural networks for detection, classification, and segmentation tasks. Additionally, we developed a hierarchical neural network approach called H-YC with weakly supervised waste segmentation. It provided a notable increase in detection quality and made it possible to segment images while learning from class labels only, without masks. Both the suggested hierarchical approach and the WaRP dataset have shown great potential for industrial application.
Adversarial robustness of neural networks is an increasingly important area of research, combining studies on computer vision models, large language models (LLMs), and others. With the release of JPEG AI, the first ...