ISBN (print): 9783031477645; 9783031477652
Pattern recognition has been evolving to include problems posed by new scenarios containing a high number of pattern components. Processing this volume of information allows a more exact classification in a wider range of applications; however, among the difficulties of this scheme are maintaining numerical precision and, mainly, reducing execution time. During the last 15 years, several machine learning solutions, such as artificial neural networks, have been implemented to reduce the number of pattern components to be analyzed. Deep learning is an appropriate tool to accomplish this task. In this paper, a convolutional neural network is implemented for recognition and classification of human activity signals and digital images. This is achieved by automatically adjusting the parameters of the neural network through genetic algorithms on a multiprocessor and GPU platform. The results show a reduction in computational cost and the possibility of a better understanding of the solutions provided by deep learning.
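The idea of adjusting network parameters with a genetic algorithm can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two tuned parameters (a learning rate and a filter count), their ranges, and the toy `fitness` function, which stands in for the validation accuracy a real run would obtain by training the network. It is not the authors' implementation.

```python
import random

def fitness(params):
    """Toy stand-in for validation accuracy (hypothetical); best at lr=0.01, filters=32."""
    lr, filters = params
    return -((lr - 0.01) * 100) ** 2 - ((filters - 32) / 32) ** 2

def random_individual(rng):
    # One candidate configuration: (learning rate, number of filters).
    return (rng.uniform(0.0001, 0.1), rng.randint(4, 128))

def mutate(ind, rng):
    # Perturb both genes and clamp them back into their allowed ranges.
    lr, filters = ind
    lr = min(0.1, max(0.0001, lr * rng.uniform(0.5, 1.5)))
    filters = min(128, max(4, filters + rng.randint(-8, 8)))
    return (lr, filters)

def crossover(a, b, rng):
    # Take one gene from each parent.
    return (a[0], b[1]) if rng.random() < 0.5 else (b[0], a[1])

def evolve(generations=30, pop_size=20, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        elite = population[: pop_size // 4]  # elitism: the best quarter survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = elite + children
    population.sort(key=fitness, reverse=True)
    history.append(fitness(population[0]))
    return population[0], history

if __name__ == "__main__":
    best, history = evolve()
    print("best configuration:", best)
```

Because the elite individuals are carried over unchanged, the best fitness never decreases from one generation to the next. In a real setting each fitness evaluation is a full training run, which is where a multiprocessor or GPU platform would matter.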
ISBN (digital): 9798350359312
ISBN (print): 9798350359329; 9798350359312
Vision Transformer (ViT) is becoming widely popular in automating accurate disease diagnosis in medical imaging owing to its robust self-attention mechanism. However, ViTs remain vulnerable to adversarial attacks that may thwart the diagnosis process by inducing deliberate misclassification of critical diseases. In this paper, we propose a novel image classification pipeline, namely, S-E Pipeline, that performs multiple pre-processing steps that allow ViT to be trained on critical features so as to reduce the impact of input perturbations by adversaries. Our method uses a combination of segmentation and image enhancement techniques such as Contrast Limited Adaptive Histogram Equalization (CLAHE), Unsharp Masking (UM), and High-Frequency Emphasis filtering (HFE) as preprocessing steps to identify critical features that remain intact even after adversarial perturbations. The experimental study demonstrates that our pipeline reduces the effect of adversarial attacks by 72.22% for the ViT-b32 model and 86.58% for the ViT-l32 model. Furthermore, we show an end-to-end deployment of our proposed method on the NVIDIA Jetson Orin Nano board to demonstrate its practical use in modern hand-held devices, which are usually resource-constrained.
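Of the enhancement steps named above, unsharp masking is the simplest to show in code: the image is blurred, and the difference between the original and the blurred copy is added back to sharpen edges. The sketch below is illustrative only and is not taken from the S-E Pipeline. It assumes a grayscale image stored as a 2D list of floats in the range 0 to 255, and it uses a 3x3 box blur in place of the Gaussian blur that is more common in practice.

```python
def box_blur(image):
    """3x3 mean filter with edge replication on a 2D list of floats."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(h - 1, max(0, y + dy))  # clamp row index at the border
                    xx = min(w - 1, max(0, x + dx))  # clamp column index at the border
                    total += image[yy][xx]
            out[y][x] = total / 9.0
    return out

def unsharp_mask(image, amount=1.0):
    """sharpened = original + amount * (original - blurred), clipped to [0, 255]."""
    blurred = box_blur(image)
    return [
        [min(255.0, max(0.0, p + amount * (p - b))) for p, b in zip(row, blurred_row)]
        for row, blurred_row in zip(image, blurred)
    ]

if __name__ == "__main__":
    step_edge = [[50.0, 50.0, 200.0, 200.0] for _ in range(4)]
    for row in unsharp_mask(step_edge):
        print(row)
```

Flat regions pass through unchanged because the original and blurred values coincide there, while the contrast across an edge is amplified.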
In recent years, convolutional neural networks (CNNs) have become the core of many artificial intelligence applications, especially in fields such as image recognition and speech recognition. Deploying convolutional n...
ISBN (digital): 9798350359312
ISBN (print): 9798350359329; 9798350359312
Autonomous driving systems rely mainly on accurate detection of lane markings for navigation and safety. This paper explores an enhanced lane detection methodology employing deformable linear convolution, which dynamically adjusts to geometric variations of road markings. Our method aims to improve detection fidelity under a range of challenging conditions such as variable illumination, road wear, and diverse weather scenarios, as evidenced by our experiments on the BDD100K dataset. The results demonstrate an improvement over traditional lane detection techniques, suggesting that deformable linear convolution offers a viable path forward for complex environmental adaptation in real-time image processing. Nevertheless, the computational demands of the proposed method highlight an area for further optimization. This study contributes to the field by providing an adaptable framework for lane detection and sets the stage for future research focused on operational efficiency.
ISBN (print): 9798350350920
Edge detection is a fundamental task in machine vision that facilitates feature extraction and representation across various visual domains, such as panoptic segmentation, autonomous driving, and image recognition. Despite the superior performance of current neural network-based edge detectors, their large parameter size renders edge detection models unsuitable for direct application in complex scenarios. Consequently, designing a compact edge detection network remains an imperative challenge. In this paper, we introduce the Efficient Stage Features Edge Detector (ESFED), a low-parameter, high-performance edge detector. ESFED is primarily composed of an efficient stage feature extractor, an upsampling network for edge features, and a feature fusion network for prediction, totaling only 51K parameters. It achieves 0.829 Optimal Dataset Scale (ODS) and 0.846 Optimal Image Scale (OIS) on the Unified Dataset for Edge Detection (UDED), demonstrating notable performance in comparison to other state-of-the-art models.
In drug discovery, chemical language models (CLMs) originating from natural language processing offer new opportunities for molecular design. CLMs have been developed using recurrent neural network (RNN) or transformer architectures. For the predictive performance of RNN-based encoder-decoder frameworks and transformers, attention mechanisms play a central role. Among others, emerging application areas for CLMs include constrained generative modeling and the prediction of chemical reactions or drug-target interactions. Since CLMs are applicable to any compound or target data that can be presented in a sequential format and tokenized, mappings of different types of sequences can be learned. For example, active compounds can be predicted from protein sequence motifs. Novel off-the-beaten-path applications can also be considered. For example, analogue series from medicinal chemistry can be perceived and represented as chemical sequences and extended with new compounds using CLMs. Herein, methodological features of CLMs and different applications are discussed.
With the advancement of deep learning techniques, the classification of remote sensing data using artificial neural networks has emerged as a prominent research area. Despite this progress, the emulation of brain stru...
ISBN (print): 1577358872
Deep learning and Convolutional Neural Networks (CNNs) have driven major transformations in diverse research areas. However, their limitations in handling low-frequency information present obstacles in certain tasks, such as interpreting global structures or handling images with smooth transitions. Despite the promising performance of transformer structures in numerous tasks, their intricate optimization complexities highlight the persistent need for refined CNN enhancements using limited resources. Responding to these complexities, we introduce a novel framework, the Multiscale Low-Frequency Memory (MLFM) Network, with the goal of harnessing the full potential of CNNs while keeping their complexity unchanged. The MLFM efficiently preserves low-frequency information, enhancing performance in targeted computer vision tasks. Central to our MLFM is the Low-Frequency Memory Unit (LFMU), which stores various low-frequency data and forms a parallel channel to the core network. A key advantage of MLFM is its seamless compatibility with various prevalent networks, requiring no alterations to their original core structure. Testing on ImageNet demonstrated substantial accuracy improvements in multiple 2D CNNs, including ResNet, MobileNet, EfficientNet, and ConvNeXt. Furthermore, we showcase MLFM's versatility beyond traditional image classification by successfully integrating it into image-to-image translation tasks, specifically in semantic segmentation networks like FCN and U-Net. In conclusion, our work signifies a pivotal stride in the journey of optimizing the efficacy and efficiency of CNNs with limited resources. This research builds upon the existing CNN foundations and paves the way for future advancements in computer vision. Our code is available at https://***/AlphaWuSeu/MLFM.
By implementing neuromorphic paradigms in processing visual information, machine learning has become crucial in an ever-increasing number of everyday applications, with ever higher performance but also ever greater computational demands. While passively pre-processing the information in the optical domain, before optical-electronic conversion, can reduce the computational requirements of a machine learning task, a comprehensive analysis of computational requirements for hybrid optical-digital neural networks is thus far missing. In this work we critically compare and analyze the performance of different optical, digital and hybrid neural network architectures with respect to their classification accuracy and computational requirements for analog classification tasks of different complexity. We show that certain hybrid architectures reduce computational requirements by a factor of more than 10 while maintaining their performance. This may inspire a new generation of co-designed optical-digital neural network architectures, aimed at applications that require low power consumption, such as remote sensing devices.
ISBN (print): 1577358872
Adversarial transferability enables black-box attacks on unknown victim deep neural networks (DNNs), rendering attacks viable in real-world scenarios. Current transferable attacks create adversarial perturbation over the entire image, resulting in excessive noise that overfits the source model. Concentrating perturbation on dominant image regions that are model-agnostic is crucial to improving adversarial efficacy. However, limiting perturbation to local regions in the spatial domain proves inadequate for augmenting transferability. To this end, we propose a transferable adversarial attack with fine-grained perturbation optimization in the frequency domain, creating centralized perturbation. We devise a systematic pipeline to dynamically constrain perturbation optimization to dominant frequency coefficients. The constraint is optimized in parallel at each iteration, ensuring the directional alignment of perturbation optimization with model prediction. Our approach allows us to centralize perturbation towards sample-specific important frequency features, which are shared by DNNs, effectively mitigating source model overfitting. Experiments demonstrate that by dynamically centralizing perturbation on dominant frequency coefficients, crafted adversarial examples exhibit stronger transferability, allowing them to bypass various defenses.
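The core operation of constraining a perturbation to a few frequency coefficients can be sketched on a 1D signal. This is a simplified illustration, not the paper's pipeline: it uses a hand-written orthonormal DCT-II, and it selects coefficients by magnitude, which is an assumption standing in for the model-guided, per-iteration constraint optimization the abstract describes. The function name `centralize` and the `keep` parameter are hypothetical.

```python
import math

def dct(x):
    """Orthonormal 1D DCT-II."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct(c):
    """Inverse of the orthonormal DCT-II above."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out

def centralize(perturbation, keep):
    """Zero all but the `keep` largest-magnitude DCT coefficients of the perturbation."""
    coeffs = dct(perturbation)
    order = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)
    kept = set(order[:keep])
    masked = [c if k in kept else 0.0 for k, c in enumerate(coeffs)]
    return idct(masked), masked

if __name__ == "__main__":
    delta = [0.3, -0.1, 0.2, 0.05, -0.4, 0.1, 0.0, 0.25]
    centralized, masked = centralize(delta, keep=2)
    print(centralized)
```

Because the transform is orthonormal, zeroing coefficients can only reduce the energy of the perturbation, so the masked version carries less total noise than the original.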