Topological Data Analysis (TDA) uses ideas from topology to study the "shape" of data. It provides a set of tools to extract features, such as holes, voids, and connected components, from complex high-dimensional data. This thesis presents an introductory exposition of the mathematics underlying the two main tools of TDA: Persistent Homology and the MAPPER algorithm. Persistent Homology detects topological features that persist over a range of resolutions, capturing both local and global geometric information. The MAPPER algorithm is a visualization tool that provides a form of dimensionality reduction, preserving topological properties of the data by projecting them onto lower-dimensional simplicial complexes. Furthermore, this thesis explores recent applications of these tools to natural language processing and computer vision. These applications fall into two main approaches. In the first, TDA is used to extract features from data that are then used as input for a variety of machine learning tasks, such as image classification or visualizing the semantic structure of text documents. The second applies the tools of TDA to the machine learning algorithms themselves, for example using MAPPER to study how structure emerges in the weights of a trained neural network. Finally, the results of several experiments are presented, including using Persistent Homology for image classification and using MAPPER to visualize the global structure of these data sets. Most notably, the MAPPER algorithm is used to visualize vector representations of contextualized word embeddings as they move through the encoding layers of the BERT-base transformer model.
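For readers who want to see these tools in action, here is a minimal sketch (not part of the thesis) that computes persistence diagrams for a noisy circle, assuming the third-party `ripser` and `numpy` packages; the single long-lived H1 feature corresponds to the circle's one essential loop.

```python
# Minimal sketch: persistent homology of a noisy circle (illustrative only).
import numpy as np
from ripser import ripser

rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 200)
points = np.c_[np.cos(theta), np.sin(theta)] + 0.05 * rng.normal(size=(200, 2))

# Persistence diagrams for H0 (connected components) and H1 (loops).
dgms = ripser(points, maxdim=1)["dgms"]

# One long-lived H1 interval indicates the single essential loop of the circle.
h1 = dgms[1]
lifetimes = h1[:, 1] - h1[:, 0]
print("most persistent H1 lifetime:", lifetimes.max())
```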
Computer vision is a subfield of artificial intelligence that relies on training computers to obtain a high level of understanding of vision data. A computer vision system aims at identifying objects through the acqui...
Fusarium wilt disease (FWD), caused by Fusarium oxysporum f. sp. ciceris (Padwick), is the most important disease affecting chickpea yield among biotic stresses. Fusarium wilt is a vascular disease that causes permanent ...
Vision in the deep sea is attracting increasing interest from many fields, as the deep seafloor represents the largest surface portion on Earth. Unlike common shallow underwater imaging, deep sea imaging requires artificial lighting to illuminate the scene in perpetual darkness. Deep sea images suffer from degradation caused by scattering, attenuation and the effects of artificial light sources, and have a very different appearance to images taken in shallow water or on land. This impairs transferring current vision methods to deep sea applications. Developing adequate algorithms requires data with ground truth in order to evaluate the methods. However, it is practically impossible to capture the same deep sea scene without water or artificial lighting effects. This situation impairs progress in deep sea vision research, where even synthesized images with ground truth would be a good solution. Most current methods either render a virtual 3D model, or use atmospheric image formation models to convert real-world scenes so that they appear as shallow-water scenes illuminated by sunlight. Currently, there is a lack of image datasets dedicated to deep sea vision evaluation. This paper introduces a pipeline to synthesize deep sea images using existing real-world RGB-D benchmarks, and, as an example, generates deep sea twin datasets for the well-known Middlebury stereo benchmarks. They can be used both for testing underwater stereo matching methods and for training and evaluating underwater image processing algorithms. This work aims towards establishing an image benchmark intended particularly for deep sea vision developments.
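As a rough illustration of the kind of image formation model mentioned above, the sketch below applies a simplified direct-attenuation-plus-backscatter model to an RGB-D pair; the per-channel coefficients are made-up placeholders, and this is not the paper's synthesis pipeline.

```python
# Illustrative sketch: simplified underwater image formation on an RGB-D pair.
import numpy as np

def synthesize_underwater(rgb, depth, beta=(0.40, 0.20, 0.10), backlight=(0.05, 0.25, 0.35)):
    """rgb: HxWx3 float image in [0, 1]; depth: HxW range map in metres."""
    beta = np.asarray(beta)            # per-channel attenuation; red fades fastest (placeholder values)
    backlight = np.asarray(backlight)  # veiling-light colour (placeholder values)
    t = np.exp(-beta[None, None, :] * depth[..., None])     # per-pixel transmission
    return rgb * t + backlight[None, None, :] * (1.0 - t)   # direct signal + backscatter
```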
ISBN:
(Print) 9798350307184
Medical image arbitrary-scale super-resolution (MI-ASSR) has recently gained widespread attention, aiming to super-sample medical volumes at arbitrary scales via a single model. However, existing MI-ASSR methods face two major limitations: (i) reliance on high-resolution (HR) volumes and (ii) limited generalization ability, which restricts their applications in various scenarios. To overcome these limitations, we propose Cube-based Neural Radiance Field (CuNeRF), a zero-shot MI-ASSR framework that is able to yield medical images at arbitrary scales and free viewpoints in a continuous domain. Unlike existing MISR methods that only fit the mapping between low-resolution (LR) and HR volumes, CuNeRF focuses on building a continuous volumetric representation from each LR volume without knowledge of the corresponding HR one. This is achieved by the proposed differentiable modules: cube-based sampling, isotropic volume rendering, and cube-based hierarchical rendering. Through extensive experiments on magnetic resonance imaging (MRI) and computed tomography (CT) modalities, we demonstrate that CuNeRF can synthesize high-quality SR medical images, outperforming state-of-the-art MISR methods and achieving better visual verisimilitude with fewer objectionable artifacts. Compared to existing MISR methods, our CuNeRF is more applicable in practice.
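For context, the sketch below shows the generic NeRF-style volume-rendering quadrature that such rendering modules build on; it is a standard compositing formula, not a reproduction of CuNeRF's cube-based sampling or hierarchical rendering.

```python
# Generic volume-rendering quadrature along one ray (standard NeRF-style compositing).
import numpy as np

def composite(sigmas, values, deltas):
    """Per-sample densities, intensities and segment lengths along one ray."""
    alpha = 1.0 - np.exp(-sigmas * deltas)                          # opacity of each segment
    trans = np.cumprod(np.concatenate(([1.0], 1.0 - alpha[:-1])))   # accumulated transmittance
    weights = trans * alpha
    return np.sum(weights * values)                                 # rendered intensity for this ray
```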
Author:
Alyami, Jaber
King Abdulaziz Univ, Fac Appl Med Sci, Dept Radiol Sci, Jeddah 21589, Saudi Arabia
King Abdulaziz Univ, King Fahd Med Res Ctr, Jeddah 21589, Saudi Arabia
King Abdulaziz Univ, Smart Med Imaging Res Grp, Jeddah 21589, Saudi Arabia
King Abdulaziz Univ, Ctr Modern Math Sci & its Applicat, Med Imaging & Artificial Intelligence Res Unit, Jeddah 21589, Saudi Arabia
Radiological image analysis using machine learning has been extensively applied to enhance biopsy diagnosis accuracy and assist radiologists in delivering precise treatment. With improvements in the medical industry and its technology, computer-aided diagnosis (CAD) systems have become essential in detecting early cancer signs in patients that could not be observed physically, without introducing errors. CAD is a detection system that combines artificially intelligent techniques with image processing applications through computer vision. Several manual procedures for cancer diagnosis, such as CT scans, radiography, and MRI scans, are reported in the state of the art; still, they are costly, time-consuming and detect cancer only in late stages. In this research, numerous state-of-the-art approaches to multi-organ detection using clinical practices are evaluated, covering cancer, neurological, psychiatric, cardiovascular and abdominal imaging. Additionally, numerous sound approaches are clustered together and their results are assessed and compared on benchmark datasets. Standard metrics such as accuracy, sensitivity, specificity and false-positive rate are employed to check the validity of the current models reported in the literature. Finally, existing issues are highlighted and possible directions for future work are suggested.
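For reference, the metrics named above have standard definitions in terms of a binary confusion matrix; the sketch below simply spells them out (the function name is illustrative, not taken from any of the surveyed works).

```python
# Standard binary-classification metrics from confusion-matrix counts.
def diagnostic_metrics(tp, fp, tn, fn):
    accuracy    = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)          # true-positive rate (recall)
    specificity = tn / (tn + fp)          # true-negative rate
    fpr         = fp / (fp + tn)          # false-positive rate = 1 - specificity
    return {"accuracy": accuracy, "sensitivity": sensitivity,
            "specificity": specificity, "false_positive_rate": fpr}
```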
ISBN:
(Print) 9781665451093
QR codes are widely used in different applications, and their detection is currently done in software. However, hardware detection using FPGAs offers real-time processing ability, which makes it attractive for time-critical applications such as high-precision robotics and augmented reality. In light of this, an FPGA algorithm for QR code detection is proposed in this paper. It operates with a maximum latency of 12.2 ms to detect a QR code when the input image resolution is 640x480, which offers an 85.3% performance boost over the best state-of-the-art software detector according to benchmarks. To the best of the authors' knowledge, this is the first work that explores the use of FPGAs in QR code detection.
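The paper's contribution is a hardware design, which cannot be reproduced here; as a point of comparison only, the following sketch shows a minimal software baseline using OpenCV's built-in QR detector (the input file name is hypothetical).

```python
# Software baseline for timing comparisons only; not the paper's FPGA design.
import time
import cv2

img = cv2.imread("frame_640x480.png")      # hypothetical 640x480 test frame
detector = cv2.QRCodeDetector()

start = time.perf_counter()
found, points = detector.detect(img)       # locate QR code corners (no decoding)
latency_ms = (time.perf_counter() - start) * 1000.0
print(f"detected={found}, latency={latency_ms:.1f} ms")
```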
ISBN:
(Digital) 9798350372816
ISBN:
(Print) 9798350372816
The goal of visual implants is to create artificial vision that can partially restore function. They can enhance the quality of life of visually challenged individuals by allowing them to perceive light, even after years of darkness, through the use of 60 microelectrodes implanted in the retina. The artificial vision made possible by current visual system stimulators has very poor resolution because of their small number of microelectrodes. Numerous researchers have sought to enhance the artificial vision produced by low-resolution implants through the application of machine learning and image processing techniques. Because phosphene images have low resolution, users report dissatisfaction with the Retinal Prosthesis System. This underscores the pressing need for targeted research aimed at improving visual clarity and overall user satisfaction. This research proposes simulating artificial vision in which the visually impaired user receives information synthesized by the system through a low-resolution image delivered by a visual implant. Using a Vision Transformer, the technique gathers useful information about people in the immediate vicinity of the visually impaired person, including their number, familiarity, gender, approximate ages, facial emotions, nearby items, and approximate distances. The information obtained from the camera frames of the user's glasses is used to create signals that are then sent to a visual stimulator, offering a potentially effective way to improve the visual experience for those who are visually impaired. In order to facilitate economical real-time implementation in an independent portable system, the algorithm that best suits each feature is chosen based on its accuracy and time complexity. The proposed approach uses audio to provide crucial information about those in close proximity to a visually impaired person, enabling them to converse with others more comfortably. This paper can thus be taken into consideration for some next-generation v...
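As a loose illustration of what a low-resolution phosphene percept looks like, the sketch below pools a camera frame down to a 10x6 electrode grid and blurs it back up; the grid size and rendering choices are assumptions for illustration, not the system described in the paper.

```python
# Illustrative phosphene-percept simulation (assumed parameters, not the paper's system).
import cv2

def phosphene_simulation(frame_gray, grid=(10, 6), out_size=(640, 480)):
    """frame_gray: HxW uint8 frame; grid: (cols, rows) electrode layout, 60 sites by default."""
    coarse = cv2.resize(frame_gray, grid, interpolation=cv2.INTER_AREA)       # average onto the electrode grid
    percept = cv2.resize(coarse, out_size, interpolation=cv2.INTER_NEAREST)   # blow back up for display
    return cv2.GaussianBlur(percept, (31, 31), 0)                             # soften into blob-like phosphenes
```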
In recent years, the field of image captioning has gained substantial attention, posing a complex challenge that necessitates the integration of computer vision (CV), natural language processing (NLP), and machine lea...
ISBN:
(Print) 9781665458245
Automatic License Plate Recognition systems aim to provide a solution for detecting, localizing, and recognizing license plate characters from vehicles appearing in video frames. However, deploying such systems in the real world requires real-time performance in low-resource environments. In our paper, we propose a two-stage detection pipeline paired with a vision API that provides real-time inference speed along with consistently accurate detection and recognition performance. We used a Haar cascade classifier as a filter on top of our backbone MobileNet SSDv2 detection model. This reduces inference time by focusing only on high-confidence detections and using them for recognition. We also impose a temporal frame separation strategy to distinguish between multiple vehicle license plates in the same clip. Furthermore, since there are no publicly available Bangla license plate datasets, we created an image dataset and a video dataset containing license plates in the wild. We trained our models on the image dataset, achieving an AP(0.5) score of 86%, and tested our pipeline on the video dataset, observing reasonable detection and recognition performance (82.7% detection rate and 60.8% OCR F1 score) with real-time processing speed (27.2 frames per second).
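A hedged sketch of the two-stage idea described above: a cheap Haar cascade pre-filter gates each frame before the heavier detector/OCR stage runs. The cascade file and the `run_ssd_and_ocr` callable are placeholders, not the authors' released code.

```python
# Two-stage gating sketch: Haar cascade pre-filter before a heavier detection/OCR stage.
import cv2

plate_cascade = cv2.CascadeClassifier("plate_cascade.xml")   # hypothetical cascade model file

def process_frame(frame_bgr, run_ssd_and_ocr):
    """Run the cheap pre-filter first; call the heavy stage only on promising frames."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    candidates = plate_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=4)
    if len(candidates) == 0:
        return []                        # skip expensive detection/recognition on this frame
    return run_ssd_and_ocr(frame_bgr)    # placeholder for the MobileNet SSDv2 + OCR stage
```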