The goal of functional error correction is to preserve neural network performance when stored network weights are corrupted by noise. To achieve this goal, a selective protection (SP) scheme was proposed to optimally ...
One of the most consequential public health issues in the world and a major factor in women's mortality is breast cancer. Early detection and diagnosis can significantly improve the likelihood of survival. Therefore, this study proposes a deep end-to-end heterogeneous ensemble approach using deep convolutional neural network models for breast histological image classification, evaluated on the BreakHis dataset. The proposed approach showed a significant increase in performance compared to its base learners. Seven deep learning architectures (VGG16, VGG19, ResNet50, Inception V3, Inception ResNet V2, Xception, and MobileNet V2) were trained using fivefold cross-validation. Thereafter, deep end-to-end heterogeneous ensembles of two up to seven models were constructed based on three selection criteria (by accuracy, by diversity, and by both accuracy and diversity) and combined with two voting methods: majority voting, taking the mode of the distribution of the predicted labels, and weighted voting, taking the average of the predicted probabilities. Results showed the effectiveness of deep end-to-end ensemble learning for histopathological breast cancer image classification, since the ensembles designed using weighted voting with selection by accuracy exceeded those designed using selection by diversity or by both accuracy and diversity. The accuracy of the proposed approach improved significantly over the least performing base learner, ResNet50, used as a baseline: from 78.14%, 78.57%, 82.80%, and 79.43% to 93.8%, 93.4%, 93.3%, and 91.8% across the BreakHis dataset's four magnification factors (40X, 100X, 200X, and 400X, respectively).
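The two voting rules described in the abstract can be sketched in a few lines. The functions below are an illustrative reconstruction, not the authors' code; the array shapes and function names are assumptions.

```python
import numpy as np

def majority_vote(label_preds):
    """Majority voting: take the mode of the predicted labels across
    ensemble members. label_preds has shape (n_models, n_samples)."""
    label_preds = np.asarray(label_preds)
    n_classes = label_preds.max() + 1
    # Count votes per class for each sample, then pick the most-voted class.
    votes = np.apply_along_axis(
        lambda col: np.bincount(col, minlength=n_classes), 0, label_preds)
    return votes.argmax(axis=0)

def weighted_vote(prob_preds):
    """Weighted (soft) voting: average the predicted class probabilities
    across models. prob_preds has shape (n_models, n_samples, n_classes)."""
    return np.asarray(prob_preds).mean(axis=0).argmax(axis=1)
```

Majority voting discards each model's confidence, while soft voting lets a very confident model outweigh two lukewarm ones, which is consistent with the abstract's finding that weighted voting performed best.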
Defect detection is a crucial quality control process in the manufacturing industry, aimed at identifying and classifying imperfections or anomalies in products before they reach customers. Traditional manual inspecti...
The emergent ecosystems of intelligent edge devices in diverse Internet-of-Things (IoT) applications, from automatic surveillance to precision agriculture, increasingly rely on recording and processing a variety of image data. Due to resource constraints, e.g., energy and communication bandwidth requirements, these applications require compressing the recorded images before transmission. For these applications, image compression commonly requires: 1) maintaining features for coarse-grain pattern recognition rather than the high-level details needed for human perception, due to machine-to-machine communication; 2) a high compression ratio that leads to improved energy and transmission efficiency; and 3) a large dynamic range of compression and an easy tradeoff between compression factor and quality of reconstruction, to accommodate a wide diversity of IoT applications as well as their time-varying energy/performance needs. To address these requirements, we propose MAGIC, a novel machine learning (ML)-guided image compression framework that judiciously sacrifices visual quality to achieve much higher compression than traditional techniques, while maintaining accuracy for coarse-grained vision tasks. The central idea is to capture application-specific domain knowledge and efficiently utilize it to achieve high compression. We demonstrate that the MAGIC framework is configurable across a wide range of compression/quality settings and is capable of compressing beyond the standard quality factor limits of both JPEG 2000 and WebP. We perform experiments on representative IoT applications using two vision datasets and show 42.65x compression at similar accuracy with respect to the source. We also highlight the low variance in compression rate across images using our technique as compared to JPEG 2000 and WebP.
ISBN: (Print) 9798350365740; 9798350365757
Image inpainting is the process of reconstructing missing or damaged regions in an image and is an important task in computer vision applications for restoration and enhancement. However, inpainting algorithms are often sensitive to noise and yield suboptimal results. To address this challenge, a new integrated two-stage framework is introduced to improve inpainting performance. In the first stage, an effective Noise2Void (N2V) denoiser is applied to learn meaningful representations of image patches and denoise the input image. The proposed N2V model considers the structural links between pixels and retains contextual information while suppressing noise. In the second stage, an advanced, enhanced DeepFill inpainting model employing deep neural networks is applied. Experimental results showed that the proposed method outperforms traditional inpainting methods: the denoising step improves the accuracy of reconstructing missing areas and greatly improves inpainting quality. Evaluated on large benchmark datasets, N2V integrated with DeepFill outperforms the individual inpainting techniques. Furthermore, we carry out an ablation study to evaluate the contribution of each constituent part of the proposed framework. The outcome underscores the complementary nature of the denoising and inpainting stages and points to the need for noise control before inpainting. Overall, our technique provides a strong and effective approach to image restoration and improves inpainting methods under real-world conditions.
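The denoise-then-inpaint pipeline structure can be sketched as below. This is only a stand-in illustration of where each stage slots in: a median filter stands in for the trained Noise2Void model and a diffusion-style neighbour fill stands in for DeepFill; neither is the paper's actual method.

```python
import numpy as np

def denoise(img, k=1):
    """Stage 1 stand-in: local-median denoising over (2k+1)x(2k+1) patches."""
    h, w = img.shape
    out = img.copy()
    for y in range(h):
        for x in range(w):
            patch = img[max(0, y - k):y + k + 1, max(0, x - k):x + k + 1]
            out[y, x] = np.median(patch)
    return out

def inpaint(img, mask, iters=50):
    """Stage 2 stand-in: iteratively replace masked pixels with the
    average of their 4-neighbours (a simple diffusion fill)."""
    out = img.astype(float)
    for _ in range(iters):
        blurred = (np.roll(out, 1, 0) + np.roll(out, -1, 0) +
                   np.roll(out, 1, 1) + np.roll(out, -1, 1)) / 4.0
        out[mask] = blurred[mask]
    return out

def restore(noisy, mask):
    """Two-stage framework: denoise first, then inpaint the masked holes."""
    return inpaint(denoise(noisy), mask)
```

The point of the ordering, per the abstract's ablation, is that the inpainter sees a cleaner image, so the filled regions are not fitted to noise.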
Object recognition, an essential technique in computer vision, enables machines to identify and understand real-time objects and environments based on input images. The main aim of this technology is to accurately rec...
UAV-based intelligent data acquisition for 3D reconstruction and monitoring of infrastructure has experienced a surge of interest due to recent advancements in image processing and deep learning-based techniques. View planning is an essential part of this task: it dictates the information capture strategy and heavily impacts the quality of the 3D model generated from the captured data. Recent methods have used prior knowledge or partial reconstruction of the target to accomplish view planning for active reconstruction; the former approach poses a challenge for complex or newly identified targets, while the latter is computationally expensive. In this work, we present Bag-of-Views (BoV), a fully appearance-based model used to assign utility to captured views for both offline dataset refinement and online next-best-view (NBV) planning, targeting the task of 3D reconstruction. With this contribution, we also developed the View Planning Toolbox (VPT), a lightweight package for training and testing machine learning-based view planning frameworks, custom view dataset generation of arbitrary 3D scenes, and 3D reconstruction. Through experiments pairing a BoV-based reinforcement learning model with VPT, we demonstrate the efficacy of our model in reducing the number of views required for high-quality reconstructions in dataset refinement and NBV planning.
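The NBV planning loop that such a view-utility model plugs into is typically a greedy selection. The sketch below is a generic version, not the paper's algorithm; the `utility` callable is where a learned scorer like BoV would sit, and all names here are illustrative.

```python
import numpy as np

def next_best_view(candidates, utility, selected, budget):
    """Greedy NBV loop: repeatedly pick the candidate view with the
    highest utility, given the views chosen so far, until the view
    budget is spent. `utility(view, selected)` returns a score."""
    selected = list(selected)
    remaining = list(candidates)
    while remaining and len(selected) < budget:
        scores = [utility(v, selected) for v in remaining]
        best = int(np.argmax(scores))
        selected.append(remaining.pop(best))
    return selected
```

Because the scorer is conditioned on the already-selected views, a good utility model penalizes redundant viewpoints, which is how the view count gets reduced without losing reconstruction quality.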
In this demonstration paper, we present "e2evideo", a versatile Python package composed of domain-independent modules. These modules can be seamlessly customised to suit specialised tasks by modifying specifi...
This paper proposes machine vision-based live traffic monitoring aided by various image processing methods. The application is designed to monitor live traffic on a road or private property premises with features li...
To address problems such as the difficulty of traffic sign detection and recognition under low illumination, a new low-illumination traffic sign detection and recognition algorithm is proposed. The algorithm first uses an illumination judgement algorithm to filter out low-illumination images, then uses a new illumination enhancement algorithm to adjust the brightness and contrast of the low-illumination images, and finally uses Mask R-CNN (mask region-based convolutional neural network) to detect and recognize traffic signs. The new illumination enhancement algorithm is based on the illumination-reflectance model: it first converts the image from RGB space to HSV space, applies guided filtering to the V channel to obtain the illumination component, uses the illumination component to extract the reflectance component, and adjusts the reflectance component by linear stretching. Next, the distribution characteristics of the illumination component are used to adjust a 2D gamma function and obtain an optimized illumination component. Subsequently, the illumination component is used to obtain the detail component. Finally, a hybrid spatial enhancement method is used to obtain the enhanced V channel and reconstruct the image. The experimental results show that the new illumination enhancement algorithm can effectively improve image brightness and sharpness in low-illumination traffic scenes, ensure that the image is not distorted, retain image information, and enhance the prominence of traffic signs. On the ZCTSDB-lightness test set, the combined algorithm of new low-light image enhancement and Mask R-CNN improved object detection mAP(bb) and instance segmentation mAP(seg) by 2.810% and 1.176%, respectively, compared to Mask R-CNN alone. On the ZCTSDB test set, the new low-illumination traffic sign detection and recognition algorithm outperformed all other algorithms.
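The V-channel enhancement chain can be sketched as follows. This is an assumption-laden illustration, not the paper's implementation: a box blur stands in for guided filtering to keep the code self-contained, and the adaptive gamma form 0.5^((m−I)/m) is a common 2D gamma from the low-light enhancement literature; the paper's exact function may differ.

```python
import numpy as np

def box_blur(v, k=7):
    """Stand-in for guided filtering: a k x k box blur used to
    estimate the illumination component from the V channel."""
    pad = k // 2
    p = np.pad(v, pad, mode='edge')
    out = np.zeros_like(v, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += p[dy:dy + v.shape[0], dx:dx + v.shape[1]]
    return out / (k * k)

def enhance_v(v):
    """Enhancement chain on the V channel (values in [0, 1]):
    illumination I via smoothing, reflectance R = V / I, and an
    adaptive 2D gamma correction driven by the illumination mean."""
    v = v.astype(float)
    I = np.clip(box_blur(v), 1e-3, 1.0)      # illumination component
    R = np.clip(v / I, 0.0, 1.0)             # reflectance component
    m = I.mean()
    gamma = np.power(0.5, (m - I) / m)       # gamma < 1 in dark regions
    I_adj = np.power(I, gamma)               # optimized illumination
    return np.clip(I_adj * R, 0.0, 1.0)      # reconstructed V channel
```

The key property is that gamma varies per pixel: dark regions (I below the mean) get gamma < 1 and are brightened, while already-bright regions are left near their original level, so the image is enhanced without washing out.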