ISBN (print): 9798350307184
Medical image arbitrary-scale super-resolution (MI-ASSR) has recently gained widespread attention, aiming to super-sample medical volumes at arbitrary scales via a single model. However, existing MI-ASSR methods face two major limitations: (i) reliance on high-resolution (HR) volumes and (ii) limited generalization ability, which restricts their applications in various scenarios. To overcome these limitations, we propose the Cube-based Neural Radiance Field (CuNeRF), a zero-shot MI-ASSR framework able to yield medical images at arbitrary scales and free viewpoints in a continuous domain. Unlike existing MISR methods that only fit the mapping between low-resolution (LR) and HR volumes, CuNeRF focuses on building a continuous volumetric representation from each LR volume without knowledge of the corresponding HR one. This is achieved by the proposed differentiable modules: cube-based sampling, isotropic volume rendering, and cube-based hierarchical rendering. Through extensive experiments on magnetic resonance imaging (MRI) and computed tomography (CT) modalities, we demonstrate that CuNeRF can synthesize high-quality SR medical images, outperforming state-of-the-art MISR methods with better visual verisimilitude and fewer objectionable artifacts. Compared to existing MISR methods, CuNeRF is more applicable in practice.
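The isotropic volume rendering module named above follows the standard NeRF-style emission-absorption integral along a ray. A minimal NumPy sketch of that compositing step (variable names and sample values are illustrative, not CuNeRF's actual implementation):

```python
import numpy as np

def render_ray(sigmas, values, deltas):
    """Composite per-sample densities and intensities along one ray using
    standard emission-absorption volume rendering, as in NeRF-style models."""
    alphas = 1.0 - np.exp(-sigmas * deltas)                          # per-segment opacity
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))   # accumulated transmittance
    weights = trans * alphas                                         # contribution of each sample
    return np.sum(weights * values), weights

# Toy samples along a single ray
sigmas = np.array([0.1, 1.5, 3.0, 0.2])   # densities
values = np.array([0.2, 0.8, 0.9, 0.1])   # intensities at the samples
deltas = np.full(4, 0.25)                 # spacing between samples
intensity, w = render_ray(sigmas, values, deltas)
```

The rendering weights are also what a hierarchical scheme (such as the cube-based hierarchical rendering mentioned above) would use to place a second, finer round of samples.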
Deep image prior (DIP), proposed in recent research, has revealed the inherent ability of convolutional neural networks (CNNs) to capture substantial low-level image statistics priors. This framework efficiently addresses inverse problems in image processing and has found extensive applications in various domains. In this paper, we propose the self-reinforcement deep image prior (SDIP) as an improved version of the original DIP. We observed that the changes in the DIP network's input and output are highly correlated during each iteration. SDIP exploits this observation in a reinforcement-learning manner: the current iteration's output is used by a steering algorithm to update the network input for the next iteration, guiding the algorithm towards improved results. Experimental results across multiple applications demonstrate that the proposed SDIP framework improves upon the original DIP method, especially when the corresponding inverse problem is highly ill-posed.
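The described feedback loop can be sketched in a few lines. The toy below stands in a smoothing operator for the DIP network and uses a simple convex-blend steering rule; both the stand-in "network" and the step size `alpha` are assumptions for illustration, not the paper's steering algorithm:

```python
import numpy as np

def smooth(z):
    """Toy stand-in for the DIP network: a 1-D moving average."""
    return np.convolve(z, np.array([0.25, 0.5, 0.25]), mode="same")

def sdip_like(shape, n_iter=50, alpha=0.3, seed=0):
    """Sketch of the SDIP idea: instead of keeping the network input fixed,
    steer it toward the current output after every iteration."""
    z = np.random.default_rng(seed).normal(size=shape)  # DIP-style random input
    out = smooth(z)
    for _ in range(n_iter):
        out = smooth(z)                     # "network" forward pass
        z = (1 - alpha) * z + alpha * out   # steering update: feed output back
    return out

x = sdip_like((64,))
```

In a real setting, `smooth` would be a CNN trained against the degraded measurement at each iteration; only the input-update line is the SDIP-specific ingredient.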
Vision in the deep sea is attracting increasing interest from many fields, as the deep seafloor represents the largest surface portion on Earth. Unlike common shallow underwater imaging, deep sea imaging requires artificial lighting to illuminate the scene in perpetual darkness. Deep sea images suffer from degradation caused by scattering, attenuation and the effects of artificial light sources, and have a very different appearance from images taken in shallow water or on land. This impairs transferring current vision methods to deep sea applications. Developing adequate algorithms requires data with ground truth in order to evaluate the methods. However, it is practically impossible to capture the same deep sea scene without water or artificial lighting effects. This situation impairs progress in deep sea vision research, where synthesized images with ground truth could be a good solution. Most current methods either render a virtual 3D model or use atmospheric image formation models to make real-world scenes appear as if in shallow water illuminated by sunlight. Currently, there is a lack of image datasets dedicated to deep sea vision evaluation. This paper introduces a pipeline to synthesize deep sea images using existing real-world RGB-D benchmarks, and exemplarily generates deep sea twin datasets for the well-known Middlebury stereo benchmarks. They can be used both for testing underwater stereo matching methods and for training and evaluating underwater image processing algorithms. This work aims towards establishing an image benchmark intended particularly for deep sea vision developments.
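The image formation models mentioned above typically combine per-channel attenuation with backscattered veiling light: I_c = J_c * exp(-beta_c * d) + B_c * (1 - exp(-beta_c * d)). A minimal sketch applying this to an RGB-D image; the coefficients `beta` and veiling light `veil` are illustrative values, not those of the paper's pipeline:

```python
import numpy as np

def synthesize_underwater(rgb, depth,
                          beta=(0.40, 0.12, 0.08),   # red attenuates fastest
                          veil=(0.05, 0.25, 0.35)):  # blue-green veiling light
    """Apply a simple attenuation + backscatter formation model to an
    RGB-D image: I = J * t + B * (1 - t), with t = exp(-beta * depth)."""
    beta, veil = np.asarray(beta), np.asarray(veil)
    t = np.exp(-depth[..., None] * beta)   # per-pixel, per-channel transmission
    return rgb * t + veil * (1.0 - t)

rgb = np.full((4, 4, 3), 0.8)                        # flat gray scene
depth = np.linspace(0.5, 5.0, 16).reshape(4, 4)      # meters from the camera
out = synthesize_underwater(rgb, depth)
```

A deep-sea variant would additionally model the falloff and cone of an artificial light source rather than uniform sunlight, which is the harder part of the paper's pipeline.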
Computer vision is a subfield of artificial intelligence that relies on training computers to obtain a high level of understanding of vision data. A computer vision system aims at identifying objects through the acqui...
ISBN (print): 9783031667428; 9783031667435
Body movements are an essential part of non-verbal communication, as they help to express and interpret human emotions. The potential of Body Emotion Recognition (BER) is immense, as it can provide insights into user preferences, automate real-time exchanges and enable machines to respond to human emotions. BER finds applications in customer service, healthcare, entertainment, emotion-aware robots, and other areas. While facial expression-based techniques are extensively researched, detecting emotions from body movements in the real world presents several challenges, including variations in body posture, occlusions, and background. Recent research has established the efficacy of transformer deep-learning models beyond the language domain for solving video- and image-related problems. A key component of transformers is the self-attention mechanism, which captures relationships among features across different spatial locations, allowing contextual information extraction. In this study, we aim to understand the role of body movements in emotion expression and to explore the use of transformer networks for body emotion recognition. Our method proposes a novel linear projection function for the visual transformer, which transforms 2D joint coordinates into a conventional matrix representation. Using an original method of contextual information learning, the developed approach enables more accurate recognition of emotions by establishing unique correlations between an individual's body motions over time. Our results demonstrate that the self-attention mechanism achieves high accuracy in predicting emotions from body movements, surpassing the performance of other recent deep-learning methods. In addition, the impact of dataset size and frame rate on classification performance is analyzed.
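A linear projection of 2D joint coordinates into transformer tokens can look like the following sketch, where each frame's joints are flattened and multiplied by a learned matrix; the joint count, embedding dimension and the per-frame tokenization are assumptions for illustration, not the paper's exact function:

```python
import numpy as np

def joints_to_tokens(joints, w):
    """Project per-frame 2-D joint coordinates into token embeddings.
    joints: (frames, num_joints, 2); w: (num_joints * 2, embed_dim)."""
    t, j, _ = joints.shape
    flat = joints.reshape(t, j * 2)   # one coordinate vector per frame
    return flat @ w                   # (frames, embed_dim) token sequence

rng = np.random.default_rng(0)
joints = rng.normal(size=(30, 17, 2))          # 30 frames, 17 COCO-style joints
w = rng.normal(size=(34, 64)) / np.sqrt(34)    # assumed embedding dim of 64
tokens = joints_to_tokens(joints, w)
```

The resulting (frames, embed_dim) matrix is the "conventional matrix representation" a standard transformer encoder expects, so self-attention can then relate poses across time.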
Fusarium wilt disease (FWD) caused by Fusarium oxysporum f. sp. ciceris (Padwick) is the most important disease affecting chickpea yield among biotic stresses. Fusarium wilt is a vascular disease that causes permanent ...
ISBN (print): 9781665458221
While Deep Neural Network (DNN) models have transformed machine vision capabilities, their extremely high computational complexity and model sizes present a formidable deployment roadblock for AIoT applications. We show that the complexity-vs-accuracy-vs-communication tradeoffs for such DNN models can be significantly addressed via a novel, lightweight form of "collaborative machine intelligence" that requires only runtime changes to the inference process. In our proposed approach, called ComAI, the DNN pipelines of different vision sensors share intermediate processing state with one another, effectively providing hints about objects located within their mutually overlapping Fields-of-View (FoVs). ComAI uses two novel techniques: (a) a secondary shallow ML model that uses features from early layers of a peer DNN to predict object confidence values in the image, and (b) a pipelined sharing of such confidence values by collaborators, which is then used to bias a reference DNN's outputs. We demonstrate that ComAI (a) can boost the accuracy (recall) of DNN inference by 20-50%, (b) works across heterogeneous DNN models and deployments, and (c) incurs negligible processing and bandwidth overheads compared to non-collaborative baselines.
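The second technique, biasing a reference DNN's outputs with a peer's confidence values, can be illustrated with a simple convex blend; the blend form and the `weight` parameter are assumptions for illustration, the paper's biasing mechanism is more involved:

```python
import numpy as np

def bias_with_peer(local_scores, peer_conf, weight=0.5):
    """Blend a reference DNN's per-object confidence scores with hints from
    a collaborating sensor whose field of view overlaps this one."""
    return (1 - weight) * local_scores + weight * peer_conf

local = np.array([0.30, 0.80, 0.10])  # local detector scores per object
peer = np.array([0.90, 0.75, 0.05])   # peer's confidence for the same objects
fused = bias_with_peer(local, peer)
```

Here the first object, which the local detector would have missed at a 0.5 threshold, is recovered thanks to the peer's strong confidence, which is the recall boost the abstract describes.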
Authors:
Alyami, Jaber (King Abdulaziz Univ: Fac Appl Med Sci, Dept Radiol Sci; King Fahd Med Res Ctr; Smart Med Imaging Res Grp; Ctr Modern Math Sci & its Applicat, Med Imaging & Artificial Intelligence Res Unit; Jeddah 21589, Saudi Arabia)
Radiological image analysis using machine learning has been extensively applied to enhance biopsy diagnosis accuracy and assist radiologists with precise treatment. With improvements in the medical industry and its technology, computer-aided diagnosis (CAD) systems have become essential in detecting early cancer signs in patients that could not be observed physically, without introducing errors. CAD is a detection system that combines artificial-intelligence techniques with image processing applications through computer vision. Several manual procedures for cancer diagnosis, such as CT scans, radiography, and MRI scans, are reported in the state of the art; still, they are costly, time-consuming, and diagnose cancer at late stages. In this research, numerous state-of-the-art approaches to multi-organ detection using clinical practices are evaluated, covering cancer, neurological, psychiatric, cardiovascular and abdominal imaging. Additionally, numerous sound approaches are clustered together and their results are assessed and compared on benchmark datasets. Standard metrics such as accuracy, sensitivity, specificity and false-positive rate are employed to check the validity of the current models reported in the literature. Finally, existing issues are highlighted and possible directions for future work are suggested.
In recent years, the field of image captioning has gained substantial attention, posing a complex challenge that necessitates the integration of computer vision (CV), natural language processing (NLP), and machine lea...
ISBN (digital): 9783031611377
ISBN (print): 9783031611360; 9783031611377
Convolutional neural networks (CNNs) play an important role in an increasing number of image processing tasks. There is an obvious demand to improve their classification performance and efficiency. Current research in this area tends to focus on developing increasingly complex models and algorithms to achieve this end, while research into computer vision techniques and data augmentation tends to be neglected. This paper demonstrates that even a very simple CNN model achieves high performance in surface defect classification on the NEU dataset thanks to image preprocessing and data augmentation. The initial F1-score of 0.9646 without image preprocessing increases to 0.9727 when preprocessing is carried out. The simple CNN then achieves an F1-score of 0.9854 after data augmentation.
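Label-preserving augmentations of the kind that drive such F1 gains can be sketched as follows; the specific transforms (flips and 90-degree rotations) and probabilities are illustrative assumptions, not necessarily the ones used in the paper:

```python
import numpy as np

def augment(img, rng):
    """Apply random label-preserving transforms to a single-channel image:
    horizontal/vertical flips and a random multiple of 90-degree rotation."""
    if rng.random() < 0.5:
        img = img[:, ::-1]          # horizontal flip
    if rng.random() < 0.5:
        img = img[::-1, :]          # vertical flip
    k = int(rng.integers(0, 4))     # 0, 90, 180 or 270 degrees
    return np.rot90(img, k)

rng = np.random.default_rng(0)
img = np.arange(64, dtype=float).reshape(8, 8)          # toy defect patch
batch = [augment(img, rng) for _ in range(4)]           # augmented copies
```

Because surface-defect classes are typically orientation-invariant, these geometric transforms multiply the effective training set size without changing any labels.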