Anomaly detection (AD) is a challenging problem in computer vision. Particularly in the field of medical imaging, AD poses even more challenges for a number of reasons, including the insufficient availability of ground-truth (annotated) data. In recent years, AD models based on generative adversarial networks (GANs) have made significant progress. However, their effectiveness in biomedical imaging remains underexplored. In this paper, we present an overview of using GANs for AD, together with an investigation of state-of-the-art GAN-based AD methods for biomedical imaging and a detailed account of the challenges they encounter. We have also specifically investigated the advantages and limitations of AD methods on medical image datasets, conducting experiments with three AD methods on seven medical imaging datasets spanning different modalities and organs/tissues. Given the widely varying findings across these experiments, we further analyzed the results from both data-centric and model-centric points of view. The results showed that none of the methods performed reliably for detecting abnormalities in medical images. Factors such as the number of training samples, the subtlety of the anomaly, and the dispersion of the anomaly within the images are among those that strongly impact the performance of AD models. The obtained results were highly variable (AUC: 0.475-0.991; Sensitivity: 0.17-0.98; Specificity: 0.14-0.97). In addition, we provide recommendations for the deployment of AD models in medical imaging and outline important research directions.
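The three metrics this survey reports (AUC, sensitivity, specificity) can be made concrete. Below is a minimal, self-contained sketch of how they are computed from an anomaly detector's scores; the labels, scores, and threshold are hypothetical, with higher score meaning "more anomalous":

```python
# Sketch: evaluating an anomaly detector with AUC, sensitivity, and specificity.
# All data below is illustrative, not from the surveyed experiments.

def auc(labels, scores):
    """Mann-Whitney AUC: probability a random anomaly outscores a random normal."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sensitivity_specificity(labels, scores, threshold):
    """Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP) at a score threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

labels = [0, 0, 0, 1, 1, 0, 1, 0]                     # 1 = anomalous image
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.9, 0.75]   # detector outputs
print(round(auc(labels, scores), 3))                  # 0.933
sens, spec = sensitivity_specificity(labels, scores, threshold=0.6)
print(sens, spec)                                     # 1.0 0.8
```

Because sensitivity and specificity depend on the chosen threshold while AUC does not, the wide per-metric ranges quoted above are not directly comparable across methods unless the thresholding rule is also fixed.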
Printed circuit boards (PCBs) are becoming more complex as technology advances, with new components being added and their architecture changing. One of the most crucial quality control procedures is PCB surface inspection, since eve...
Large Vision-Language Models (LVLMs) have shown remarkable performance on many vision-language tasks. However, these models still suffer from multimodal hallucination, which means the generation of objects or content ...
Electromyography (EMG) signals have been used in designing muscle-machine interfaces (MuMIs) for various applications, ranging from entertainment (EMG-controlled games) to human assistance and human augmentation (EMG-controlled prostheses and exoskeletons). For this, classical machine learning methods such as Random Forest (RF) models have been used to decode EMG signals. However, these methods depend on several stages of signal pre-processing and the extraction of hand-crafted features to obtain the desired output. In this work, we propose EMG-based frameworks for decoding object motions in the execution of dexterous, in-hand manipulation tasks, using raw EMG signals as input and two novel deep learning (DL) techniques called Temporal Multi-Channel Transformers and Vision Transformers. The results obtained are compared, in terms of accuracy and speed of decoding the motion, with RF-based models and Convolutional Neural Networks as a benchmark. The models are trained for 11 subjects in motion-object-specific and motion-object-generic ways, using a 10-fold cross-validation procedure. This study shows that the performance of MuMIs can be improved by employing DL-based models with raw myoelectric activations instead of developing DL or classic machine learning models with hand-crafted features.
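The 10-fold cross-validation procedure used above can be sketched in isolation. This is a generic fold splitter, not the authors' code; the sample count is an arbitrary assumption:

```python
# Sketch: k-fold cross-validation index generation, as used to train the
# motion-object-specific and motion-object-generic EMG models. Sample count
# and fold count below are illustrative.

def k_fold_indices(n_samples, n_folds=10):
    """Yield (train_idx, test_idx) partitions; fold sizes differ by at most 1."""
    indices = list(range(n_samples))
    fold_size, remainder = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        stop = start + fold_size + (1 if fold < remainder else 0)
        test = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, test
        start = stop

folds = list(k_fold_indices(25, n_folds=10))
print(len(folds))                               # 10 train/test splits
print(sorted(i for _, t in folds for i in t))   # each sample tested exactly once
```

For EMG data, folds would typically be built over windows or trials rather than individual samples, so that temporally adjacent frames do not leak between train and test splits.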
Moving Object Segmentation (MOS) is a fundamental task in computer vision. Due to undesirable variations in the background scene, MOS becomes very challenging for static and moving camera sequences. Several deep learning methods have been proposed for MOS with impressive performance. However, these methods show performance degradation on unseen videos, and deep learning models usually require large amounts of data to avoid overfitting. Recently, graph learning has attracted significant attention in many computer vision applications, since it provides tools to exploit the geometrical structure of data. In this work, concepts of graph signal processing are introduced for MOS. First, we propose a new algorithm composed of segmentation, background initialization, graph construction, unseen sampling, and a semi-supervised learning method inspired by the theory of recovery of graph signals. Second, theoretical developments are introduced, showing one bound for the sample complexity in semi-supervised learning and two bounds for the condition number of the Sobolev norm. Our algorithm has the advantage of requiring less labeled data than deep learning methods while achieving competitive results on both static and moving camera videos. Our algorithm is also adapted for Video Object Segmentation (VOS) tasks and is evaluated on six publicly available datasets, outperforming several state-of-the-art methods in challenging conditions.
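The semi-supervised recovery step can be illustrated with a simplified stand-in: Tikhonov (Laplacian-smoothness) recovery of a graph signal from a few labeled nodes, a special case of the Sobolev-norm recovery the paper builds on. The graph, edge weights, and seed labels below are toy assumptions:

```python
# Sketch: recover a label signal on a graph from two labeled nodes by
# minimizing  sum_{i in S} (x_i - y_i)^2 + alpha * x^T L x  via gradient
# descent. Two clusters joined by a weak bridge; all values illustrative.

edges = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 0.1, (3, 4): 1.0}
n = 5
labeled = {0: 1.0, 4: -1.0}   # seeds: +1 = "object", -1 = "background"

def laplacian_times(x):
    """Compute L @ x for the weighted graph Laplacian L = D - W."""
    out = [0.0] * n
    for (i, j), w in edges.items():
        out[i] += w * (x[i] - x[j])
        out[j] += w * (x[j] - x[i])
    return out

def recover(alpha=1.0, step=0.05, iters=5000):
    x = [0.0] * n
    for _ in range(iters):
        lx = laplacian_times(x)
        grad = [2 * alpha * lx[i] for i in range(n)]  # smoothness term
        for i, y in labeled.items():
            grad[i] += 2 * (x[i] - y)                 # data fidelity on labeled nodes
        x = [x[i] - step * grad[i] for i in range(n)]
    return x

x = recover()
print([round(v, 2) for v in x])   # ≈ [0.87, 0.73, 0.6, -0.73, -0.87]
```

The weak 0.1 bridge lets the two seed labels dominate their own clusters, so unlabeled nodes 1-2 are pulled positive and node 3 negative; the full method additionally uses the Sobolev norm x^T (L + εI)^β x and principled sampling of the unlabeled set.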
Violence-related incidents have recently surged in areas including footpaths, sports stadiums, remote roads, liquor stores, and elevators, and are tragically discovered only after some time. In exploring this issue, video analysis models capable of detecting violent acts from sequences of video clips have evolved. However, recent studies on violence detection mostly rely on traditional hand-crafted features, achieve limited detection accuracy, and do not make full use of deep learning advances in computer vision. The proposed system puts forth a violence detection framework based on a Convolutional Neural Network (CNN) with a Long Short-Term Memory (LSTM) feature extraction process, fine-tuning the image-frame hyperparameters from the extracted features using a Random Forest classifier whose weight scores are updated through the Weighted Least Squares (WLS) algorithm. The model first undergoes the feature extraction phase, in which the image frames are segmented through a mosaicking pre-processing step with a 30:20 enlargement ratio for the image mosaics, helping to generate time-consistent outcomes and improving the algorithm's performance by minimizing the search space. The integrated CNN-LSTM framework reduces the complexity of the feature-learning process, with the LSTM network correlating feature values with past information and retaining memory. A dynamic weighting scheme is proposed with the WLS method, and the resulting weight score is assigned to the most probable class in the decision tree. Hyperparameters are tuned through the Random Forest classifier, which dynamically categorizes the outcomes as non-fight or fight clips. The comparative performance evaluation of the proposed framework (DFE-WLSRF, Deep Feature Extraction - Weighted Least Squares Random Forest classifier) demonstrated outperforming, high-accuracy results in comparison to o
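The WLS weighting idea can be shown in isolation. The sketch below is not the DFE-WLSRF pipeline, only a minimal weighted-least-squares fit on toy 1-D data, showing how down-weighting an unreliable sample changes the fitted model (data and weights are assumptions):

```python
# Sketch: weighted least squares, y = b0 + b1*x, minimizing
# sum_i w_i * (y_i - b0 - b1*x_i)^2 in closed form. Toy data only.

def weighted_least_squares(xs, ys, ws):
    sw = sum(ws)
    sx = sum(w * x for w, x in zip(ws, xs))
    sy = sum(w * y for w, y in zip(ws, ys))
    sxx = sum(w * x * x for w, x in zip(ws, xs))
    sxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
    # Solve the 2x2 normal equations [sw sx; sx sxx] [b0; b1] = [sy; sxy]
    det = sw * sxx - sx * sx
    b1 = (sw * sxy - sx * sy) / det
    b0 = (sy - sx * b1) / sw
    return b0, b1

xs = [0.0, 1.0, 2.0, 3.0]
ys = [0.1, 0.9, 2.1, 9.0]          # last point is an outlier
uniform = weighted_least_squares(xs, ys, [1, 1, 1, 1])
downweighted = weighted_least_squares(xs, ys, [1, 1, 1, 0.05])
print(uniform, downweighted)       # down-weighting the outlier flattens the slope
```

In the framework above, an analogous weight score is attached to the most probable class in the decision trees rather than to regression samples, but the mechanism, re-weighting contributions by reliability, is the same.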
Although image captioning has a vast array of applications, it has not reached its full potential in languages other than English. Arabic, for instance, although the native language of more than 400 million people, re...
This paper presents a comprehensive comparative analysis of image partitioning and compression mechanisms, two fundamental techniques in image processing and data compression. Image partitioning involves dividing an i...
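The partitioning step described above, dividing an image into fixed-size blocks, can be sketched minimally; the "image" here is a toy nested list and the block size is an arbitrary assumption:

```python
# Sketch: non-overlapping block partitioning of a 2D image, the first stage
# of many block-based compression schemes. Toy 4x6 image; block size assumed.

def partition(image, bh, bw):
    """Split a 2D array into non-overlapping bh x bw blocks, row-major order."""
    h, w = len(image), len(image[0])
    assert h % bh == 0 and w % bw == 0, "block size must tile the image exactly"
    return [
        [row[c:c + bw] for row in image[r:r + bh]]
        for r in range(0, h, bh)
        for c in range(0, w, bw)
    ]

image = [[r * 6 + c for c in range(6)] for r in range(4)]  # 4x6 gradient
blocks = partition(image, 2, 3)
print(len(blocks))        # 4 blocks, each 2x3
print(blocks[0])          # top-left block: [[0, 1, 2], [6, 7, 8]]
```

Each block can then be transformed and quantized independently, which is what makes partitioning a natural front-end for compression.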
ISBN (digital): 9798331529505
ISBN (print): 9798331529512
High Dynamic Range (HDR) imaging has become a significant technological advancement in visual data processing, allowing for the capture of a wider dynamic range of luminance levels in images. This paper explores various HDR processing techniques and their potential applications in automation and machine vision. By using methods such as multiple image fusion, image registration, and tone mapping, the paper demonstrates how HDR processing can enhance visual data in automated systems, improving accuracy in environments with complex lighting conditions. This work applies HDR algorithms to real-world scenarios, showcasing their potential in industrial automation and robotics, where accurate visual data plays a crucial role.
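The tone-mapping stage mentioned above can be illustrated with the classic global Reinhard operator, L_d = L / (1 + L), which compresses an unbounded luminance range into [0, 1). The input luminances below are illustrative, not from the paper's experiments:

```python
# Sketch: global tone mapping with the Reinhard operator, one of the standard
# tone-mapping choices in HDR pipelines. Luminance values are illustrative.

def reinhard_tonemap(luminances):
    """Map scene luminance L >= 0 to display luminance L / (1 + L) in [0, 1)."""
    return [L / (1.0 + L) for L in luminances]

hdr = [0.01, 0.5, 4.0, 100.0, 10000.0]   # ~6 orders of magnitude of luminance
ldr = reinhard_tonemap(hdr)
print([round(v, 4) for v in ldr])        # monotone, all values in [0, 1)
```

The operator preserves ordering while spending most of the output range on mid-tones, which is why bright highlights (100 vs 10000) map to nearly the same display value; practical pipelines often add a per-image key scaling before this step.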
Computer vision and biometrics benefit from recent advances in pattern recognition and artificial intelligence, which tend to make model-based face recognition more efficient. Deep learning combined with data augmentation also tends to enrich the training sets used for learning tasks. Nevertheless, face recognition is still challenging, especially because of imaging issues that occur in practice, such as changes in lighting, appearance, head posture, and facial expression. In order to increase the reliability of face recognition, we propose a novel supervised appearance-based face recognition method which creates a low-dimensional orthogonal subspace that enforces face class separability. The proposed approach uses data augmentation to mitigate the problem of training sample scarcity. Unlike most face recognition approaches, it is capable of efficiently handling grayscale and color face images, as well as low- and high-resolution face images. Moreover, the proposed supervised method preserves class structure better than typical unsupervised approaches, and also preserves data better than typical supervised approaches, since it obtains an orthogonal discriminating subspace that is not affected by the singularity problem common in such cases. Furthermore, a soft-margin Support Vector Machine classifier is learnt in the low-dimensional subspace and tends to be robust to the noise and outliers commonly found in practical face recognition. To validate the proposed method, an extensive set of face identification experiments was conducted on three challenging public face databases, comparing the proposed method with methods representative of the state of the art. The proposed method tends to present higher recognition rates on all databases. In addition, the experiments suggest that data augmentation also plays an essential role in appearance-based face recognition, and that the CIELAB color space (L*a*b) is generally mor
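The orthogonality of the discriminating subspace is the key structural property here. A minimal sketch of how an orthonormal basis is built from candidate directions, via classical Gram-Schmidt, is below; the 3-D vectors are toy stand-ins and the discriminant criterion that would choose them is omitted:

```python
# Sketch: Gram-Schmidt orthonormalization of candidate subspace directions,
# the kind of orthogonalization an orthogonal discriminating subspace relies
# on. Input vectors are illustrative toy directions.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Return an orthonormal basis spanning the input vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:                      # subtract components along earlier vectors
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > 1e-12:                     # drop linearly dependent directions
            basis.append([wi / norm for wi in w])
    return basis

B = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]])
print(len(B))                     # 2 orthonormal basis vectors
print(round(abs(dot(B[0], B[1])), 6))   # near zero: mutually orthogonal
```

Projecting face images onto such a basis gives the low-dimensional representation in which the soft-margin SVM is then trained; orthogonality keeps the projected features uncorrelated in the geometric sense and sidesteps the singularity issues the abstract mentions.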