ISBN (digital): 9783031064272
ISBN (print): 9783031064272; 9783031064265
With the wide application of machine learning algorithms, machine learning security has become a significant issue. Most machine learning algorithms, including cutting-edge deep neural networks, are vulnerable to adversarial perturbations. Standard defence techniques based on adversarial training need to generate adversarial examples during the training process, which incurs high computational cost. This paper proposes a novel defence method using self-adaptive logit balancing and Gaussian noise boost training. The method improves the robustness of deep neural networks without high computational cost and achieves competitive results compared with adversarial training methods. It also gives deep learning systems proactive and reactive defence during operation: a sub-classifier is trained to determine whether the system is under attack and to detect the attack algorithm via the patterns of the Log-Softmax values. The sub-classifier achieves high accuracy in detecting clean inputs and adversarial examples created by seven attack methods.
ISBN (print): 9783031189098; 9783031189104
Successful applications of deep learning often depend on large amounts of training data. However, in practical image recognition tasks, the available training data are often limited or imbalanced across classes, which causes over-fitting or prediction bias during model training. In this paper, word embedding models from natural language processing supply prior knowledge about the relationships between image classes, and this knowledge is used to train more generalizable classifiers when training data are limited or class-imbalanced. The inter-class relational knowledge is captured in the word embedding vectors of the textual names of the image classes. Using these word embedding vectors as soft labels for the corresponding image classes, the feature extractor of a deep learning model can be guided to extract visual features that contain both class-specific and class-shared information. Experiments on multiple image classification datasets confirm that the proposed learning framework improves model performance when training data are limited or class-imbalanced.
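The idea of using class-name word embeddings as soft labels can be illustrated with a minimal sketch. The abstract does not state the actual loss function or embedding model, so the cosine-distance loss, the nearest-embedding prediction rule, and the three-dimensional toy vectors below are all assumptions made for illustration only.

```python
import math

# Toy "word embeddings" for class names (hypothetical values, not from a
# real embedding model). Related classes ("cat", "dog") point in similar
# directions; an unrelated class ("truck") does not.
CLASS_EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.3, 0.0],
    "truck": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def embedding_loss(feature, class_embedding):
    """Assumed training loss: 1 - cosine similarity to the soft label."""
    return 1.0 - cosine(feature, class_embedding)


def predict(feature, class_embeddings):
    """Predict the class whose name embedding is closest to the feature."""
    return max(class_embeddings, key=lambda name: cosine(feature, class_embeddings[name]))


# A feature vector as a feature extractor might produce for a cat image.
feature = [0.85, 0.15, 0.05]
print(predict(feature, CLASS_EMBEDDINGS))
print(embedding_loss(feature, CLASS_EMBEDDINGS["cat"]))
```

Because "cat" and "dog" have nearby embeddings, a feature pulled toward one is only mildly penalized with respect to the other, which is one way class-shared information could enter the learned features.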
ISBN (print): 9783031288159
The proceedings contain 14 papers. The special focus of this conference is on Context-Aware Systems and Applications. The topics include: Prediction of Chaotic Time Series Based on LSTM, Autoencoder and Chaos Theory; An Approach to Selecting Students Taking Provincial and National Excellent Student Exams; Safe Interaction Between Human and Robot Using Vision Technique; Application of the Image Processing Technique for Powerline Robot; Collaborative Recommendation with Energy Distance Correlation; Blockchain Model in Industrial Pangasius Farming; Multiple-Criteria Rating Recommendation with Ordered Weighted Averaging Aggregation Operators; A Survey of On-Chip Hybrid Interconnect for Multicore Architectures; A Framework for Brain-Computer Interfaces Closed-Loop Communication Systems; Identification of Abnormal Cucumber Leaves Image Based on Recurrent Residual U-Net and Support Vector Machine Techniques; Lung Lesion Images Classification Based on Deep Learning Model and Adaboost Techniques; Balltree Similarity: A Novel Space Partition Approach for Collaborative Recommender Systems.
In recent years, there has been a remarkable increase in interest and challenges in image processing and pattern recognition, specifically in the context of air writing. This exciting research area has significant pot...
ISBN (print): 9783031243660; 9783031243677
In the realm of computer vision, the term "autonomous driving" has become a buzzword. The main goal of autonomous driving is to reduce human effort while driving. However, measuring distance raises numerous obstacles, both in terms of equipment and approach. Using cameras to measure the distance to an object is practical and popular for obstacle avoidance and navigation. This work focuses on measuring the distance from a vehicle to traffic signs and cars, which is a critical task in the image processing domain. The suggested system employs two cameras installed at the front of the host vehicle to obtain the data and estimate distance. The proposed pipeline starts with the YOLOv3 and YOLOv2 algorithms for detecting traffic signs and cars in the video frames. The distances of the detected objects are then measured using the triangle similarity approach. In the final phase, lane segmentation and grid marking are added to these results. The system can thereby assist drivers in making decisions before reaching signs, potentially resulting in improved safety.
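The triangle similarity approach mentioned above is commonly written as D = (W × F) / P, where W is the real width of the object, P its apparent width in pixels, and F the focal length in pixels obtained from a reference measurement. The following is a minimal single-camera sketch of that relation; the function names, the reference values, and the sign width are hypothetical, and the abstract does not describe how the paper's two-camera setup uses the formula.

```python
def focal_length_px(known_distance_m, known_width_m, width_px):
    """Calibrate the focal length (in pixels) from one reference image.

    By similar triangles: F = (P * D) / W.
    """
    if known_width_m <= 0:
        raise ValueError("known_width_m must be positive")
    return width_px * known_distance_m / known_width_m


def distance_m(known_width_m, focal_px, width_px):
    """Estimate the distance to an object of known real width.

    By similar triangles: D = (W * F) / P.
    """
    if width_px <= 0:
        raise ValueError("width_px must be positive")
    return known_width_m * focal_px / width_px


# Hypothetical calibration: a 0.6 m wide sign at 2.0 m appears 150 px wide.
focal = focal_length_px(known_distance_m=2.0, known_width_m=0.6, width_px=150)

# A detector later reports the same kind of sign with a 75 px wide box.
print(distance_m(known_width_m=0.6, focal_px=focal, width_px=75))
```

Halving the apparent width doubles the estimated distance, so the estimate is only as good as the detector's bounding box and the assumed real-world width of the object class.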
In this research paper, an overview of computer methods for segmenting continuous-tone images into meaningful parts and characterizing these parts with 'features' is presented. Image segmentation is an essenti...
ISBN (print): 9783031064302; 9783031064296
Vision Transformers (ViT) and other Transformer-based architectures for image classification have achieved promising performance in the last two years. However, ViT-based models require large datasets, memory, and computational power to obtain state-of-the-art results compared to more traditional architectures. Indeed, the generic ViT model maintains a full-length patch sequence during inference, which is redundant and lacks hierarchical representation. With the goal of increasing the efficiency of Transformer-based models, we explore the application of a 2D max-pooling operator on the outputs of Transformer encoders. We conduct extensive experiments on the CIFAR-100 dataset and the large ImageNet dataset and consider both accuracy and efficiency metrics, with the final goal of reducing the token sequence length without affecting the classification performance. Experimental results show that bidimensional downsampling can outperform previous classification approaches while requiring relatively limited computational resources.
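To make the downsampling step concrete, here is a minimal sketch of 2D max-pooling applied to a token sequence. It assumes the encoder output is a list of patch tokens laid out row by row on a `grid_h` × `grid_w` grid, with no class token, and that the pooling window equals the stride; the function name and these layout choices are assumptions, not details taken from the paper.

```python
def max_pool_tokens(tokens, grid_h, grid_w, k=2):
    """Apply k x k max-pooling (stride k) to a row-major patch-token grid.

    tokens: list of grid_h * grid_w token vectors (lists of floats).
    Returns (grid_h // k) * (grid_w // k) pooled tokens, also row-major.
    """
    if len(tokens) != grid_h * grid_w:
        raise ValueError("token count does not match the grid size")
    if grid_h % k or grid_w % k:
        raise ValueError("grid dimensions must be divisible by k")
    dim = len(tokens[0])
    pooled = []
    for r in range(0, grid_h, k):
        for c in range(0, grid_w, k):
            # Collect the k*k neighbouring tokens of this window.
            window = [
                tokens[(r + dr) * grid_w + (c + dc)]
                for dr in range(k)
                for dc in range(k)
            ]
            # Take the maximum independently in each embedding dimension.
            pooled.append([max(t[d] for t in window) for d in range(dim)])
    return pooled


# 16 toy tokens of dimension 2 on a 4 x 4 grid: token i is [i, -i].
tokens = [[float(i), float(-i)] for i in range(16)]
pooled = max_pool_tokens(tokens, grid_h=4, grid_w=4)
print(len(pooled))
```

With a 2 × 2 window the sequence length drops by a factor of four, which matters because self-attention cost in later encoder layers grows quadratically with the number of tokens.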
The rise of cultural tourism in India and massive digitization over the last decade has necessitated preserving Indian art forms. Recent advances in artificial intelligence (AI) have provided the tools and techniques ...
The rapid expansion of autonomous driving technologies necessitates the development of robust systems for accurate road surface identification and classification to ensure safe and reliable driving. This review articl...
ISBN (digital): 9781510645974
ISBN (print): 9781510645974; 9781510645967
Machine vision systems used in modern industrial complexes are based on the analysis of multi- and hyperspectral imaging. The transition to the "Industry 4.0" program is not possible when using only one type of data. The first control systems used only visible-range images. They made it possible to analyze the trajectories of moving objects, control product quality, carry out security functions (control of perimeter crossing), etc. The development of new industrial robotic cells and processing complexes with cognitive functions implies receiving, analyzing, and processing heterogeneous data. The construction of a unified information field, which allows multidimensional operations on data, increases the speed of decision-making and enables automated robot-human systems in which the robot acts as an assistant working in a unified workspace. Machine vision systems analyze information received in the following ranges: visible (shape, trajectory of movement, position of objects, etc.); near-infrared (data similar to visible, allows operation in dusty, foggy, and low-light conditions); far-infrared, or thermal (plotting temperature gradients, identifying areas of overheating); ultraviolet (analysis of ionization sources, corona discharges, static charges, tags); X-ray and microwave (analysis of the surface and internal structure of objects, allowing the identification of defects); range and 3D sensors (construction of volumetric figures, analysis of the relative position of objects and their interaction), etc. Data analysis is often performed not by a single camera but by a group of sensors that are not located in a single housing. Primary data integration reduces the number of information channels while maintaining the functionality and accuracy of the analysis. The article discusses fusing images obtained by industrial sensors into a combined image containing joint data. Combining multi and hyperspectral imaging makes i