Search Results - Inner Mongolia University Library

2nd IEEE International Conference on Image Processing and Computer Applications, ICIPCA 2024

Authors: Hu, Xixing Zheng, Weiyan Su, Fang Zhu, Chaoyue Zhejiang Dayou Industrial Co. Ltd. Hangzhou Science and Technology Development Branch Hangzhou China

ISBN: (print) 9798350360240

In this paper, a multi-feature detection method based on graph cut for photovoltaic panels is proposed. Combined with multi-dimensional features such as optical flow field and light intensity, an interactive feature recognition method is constructed. The features are combined into a pixel feature vector that is fed into a random forest classifier. The interaction information between adjacent pixel pairs is extracted. The energy function and adaptive algorithm of object-based occlusion detection are constructed. An undirected graph is then built, and the object is segmented by the graph-cut principle. The experimental results indicate that the new technique surpasses traditional methods in precision and offers enhanced responsiveness for detecting obstructions. © 2024 IEEE.

Keywords: Optical flows
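As a rough illustration of the graph-cut principle mentioned in the abstract above, the sketch below labels a short row of pixels as object or background by computing a minimum s-t cut with the Edmonds-Karp max-flow algorithm. It is not the paper's method: the per-pixel costs, the smoothness weight, and the function name `min_cut_labels` are all hypothetical, chosen only to show how an energy with unary and pairwise terms maps onto an undirected graph.

```python
from collections import deque


def min_cut_labels(cost_object, cost_background, pair_weight):
    """Label a 1-D row of pixels: 1 = object, 0 = background (hypothetical toy setup).

    cost_object[i]     : cost of labelling pixel i as object
    cost_background[i] : cost of labelling pixel i as background
    pair_weight        : penalty when two neighbouring pixels get different labels
    """
    n = len(cost_object)
    source, sink = n, n + 1
    size = n + 2
    # Residual capacities stored in a dense matrix (fine for a toy example).
    cap = [[0] * size for _ in range(size)]
    for i in range(n):
        cap[source][i] = cost_background[i]  # cut if pixel i ends on the background side
        cap[i][sink] = cost_object[i]        # cut if pixel i ends on the object side
    for i in range(n - 1):
        cap[i][i + 1] = pair_weight          # undirected neighbour link,
        cap[i + 1][i] = pair_weight          # stored as two directed edges

    def find_path():
        """Breadth-first search for an augmenting path; returns a parent map or None."""
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in range(size):
                if v not in parent and cap[u][v] > 0:
                    parent[v] = u
                    if v == sink:
                        return parent
                    queue.append(v)
        return None

    while True:
        parent = find_path()
        if parent is None:
            break
        # Find the bottleneck capacity along the path, then push that much flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u

    # Pixels still reachable from the source in the residual graph are labelled object.
    reachable = {source}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in range(size):
            if v not in reachable and cap[u][v] > 0:
                reachable.add(v)
                queue.append(v)
    return [1 if i in reachable else 0 for i in range(n)]


# Two pixels that look like the object followed by two that look like background.
labels = min_cut_labels([1, 2, 8, 9], [9, 8, 2, 1], 3)
print(labels)
```

In this toy setup the cut separates the first two pixels from the last two, since that labelling pays only one neighbour penalty plus the four smallest per-pixel costs.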

Fundus Imaging-Based Healthcare: Present and Future

ACM TRANSACTIONS ON COMPUTING FOR HEALTHCARE, 2023, Vol. 4, No. 3, pp. 1-34

Authors: Kumar, Vijay Paul, Kolin IIT Delhi Amar Nath & Shashi Khosla Sch Informat Technol Delhi 110016 India IIT Delhi Dept Comp Sci & Engn New Delhi 110016 Delhi India

A fundus image is a two-dimensional pictorial representation of the membrane at the rear of the eye that consists of blood vessels, the optic disc, optic cup, macula, and fovea. Ophthalmologists use it during eye examinations to screen, diagnose, and monitor the progress of retinal diseases or conditions such as diabetes, age-related macular degeneration (AMD), glaucoma, retinopathy of prematurity (ROP), and many more ocular ailments. Developments in ocular optical systems, image acquisition, processing, and management techniques over the past few years have contributed to the use of fundus images to monitor eye conditions and other related health complications. This review summarizes the various state-of-the-art technologies related to the fundus imaging device, analysis techniques, and their potential applications for ocular diseases such as diabetic retinopathy, glaucoma, AMD, cataracts, and ROP. We also present potential opportunities for fundus imaging-based affordable, noninvasive devices for scanning, monitoring, and predicting ocular health conditions and providing other physiological information, for example, heart rate (HR), blood components, pulse rate, heart rate variability (HRV), retinal blood perfusion, and more. In addition, we present different types of technological, economical, and sociological factors that impact the growth of the fundus imaging-based technologies for health monitoring.

Keywords: Fundus image; image analysis; ophthalmology; healthcare information systems; medical technologies; computer vision and machine learning; eye diseases

HRInversion: High-Resolution GAN Inversion for Cross-Domain Image Synthesis

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, Vol. 33, No. 5, pp. 2147-2161

Authors: Zhou, Peng Xie, Lingxi Ni, Bingbing Liu, Lin Tian, Qi Shanghai Jiao Tong Univ Dept Elect Engn Shanghai 200240 Peoples R China Huawei Cloud BU Guangdong 518129 Shenzhen Peoples R China Univ Sci & Technol China Dept Elect Engn & Informat Sci Hefei 230052 Anhui Peoples R China

We investigate GAN inversion problems of using pre-trained GANs to reconstruct real images. Recent methods for such problems typically employ a VGG perceptual loss to measure the difference between images. While the perceptual loss has achieved remarkable success in various computer vision tasks, it may cause unpleasant artifacts and is sensitive to changes in input scale. This paper delivers an important message that algorithm details are crucial for achieving satisfying performance. In particular, we propose two important but undervalued design principles: (i) not down-sampling the input of the perceptual loss to avoid high-frequency artifacts; and (ii) calculating the perceptual loss using convolutional features which are robust to scale. Integrating these designs derives the proposed framework, HRInversion, that achieves superior performance in reconstructing image details. We validate the effectiveness of HRInversion on a cross-domain image synthesis task and propose a post-processing approach named local style optimization (LSO) to synthesize clean and controllable stylized images. For the evaluation of the cross-domain images, we introduce a metric named ID retrieval which captures the similarity of face identities of stylized images to content images. We also test HRInversion on non-square images. Equipped with implicit neural representation, HRInversion applies to ultra-high resolution images with more than 10 million pixels. Furthermore, we show applications of style transfer and 3D-aware GAN inversion, paving the way for extending the application range of HRInversion.

Keywords: Image reconstruction; Image resolution; Generative adversarial networks; Task analysis; Semantics; Generators; Image synthesis; GAN inversion; perceptual loss

MFA-DAF: Unsupervised Multimodal Medical Image Fusion via Multiscale Fourier Attention and Detail-Aware Fusion Strategy

2nd International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023

Authors: Xie, Xinyu Zhang, Xiaozhi Xiong, Dongping Ouyang, Lijun University of South China School of Electrical Engineering Hengyang China University of South China School of Computing/Software Hengyang China

ISBN: (print) 9798350331417

Multimodal medical image fusion is vital for extracting complementary information and generating comprehensive images in clinical applications. However, existing deep learning-based fusion approaches face challenges in effectively utilizing frequency-domain information, designing appropriate integration strategies and modelling long-range context correlation. To address these issues, we propose a novel unsupervised multimodal medical image fusion method called Multiscale Fourier Attention and Detail-Aware Fusion (MFA-DAF). Our approach employs a multiscale Fourier attention encoder to extract rich features, followed by a detail-aware fusion strategy for comprehensive integration. The fused image is obtained using a nested connected Fourier attention decoder. We adopt a two-stage training strategy and design new loss functions for each stage. Experimental results demonstrate that our model outperforms other state-of-the-art methods, producing fused images with enhanced texture information and superior visual quality. © 2023 IEEE.

Keywords: Detail-aware; Fine-grained information; Fourier Transform; Multimodal medical image fusion

LoLI-Street: Benchmarking Low-Light Image Enhancement and Beyond

17th Asian Conference on Computer Vision, ACCV 2024

Authors: Islam, Md Tanvir Alam, Inzamamul Woo, Simon S. Anwar, Saeed Lee, Ik Hyun Muhammad, Khan Department of Software Sungkyunkwan University Suwon Korea Republic of The Australian National University Canberra Australia Department of Mechatronics Engineering Tech University of Korea Siheung Korea Republic of Department of Human-AI Interaction Sungkyunkwan University Seoul Korea Republic of

ISBN: (print) 9789819609161

Low-light image enhancement (LLIE) is essential for numerous computer vision tasks, including object detection, tracking, segmentation, and scene understanding. Despite substantial research on improving low-quality images captured in underexposed conditions, clear vision remains critical for autonomous vehicles, which often struggle with low-light scenarios, signifying the need for continuous research. However, paired datasets for LLIE are scarce, particularly for street scenes, limiting the development of robust LLIE methods. Despite using advanced transformers and/or diffusion-based models, current LLIE methods struggle in real-world low-light conditions and lack training on street-scene datasets, limiting their effectiveness for autonomous vehicles. To bridge these gaps, we introduce a new dataset "LoLI-Street" (Low-Light images of Streets) with 33k paired low-light and well-exposed images from street scenes in developed cities, covering 19k object classes for object detection. The LoLI-Street dataset also features 1,000 real low-light test images for testing LLIE models under real-life conditions. Furthermore, we propose a transformer and diffusion-based LLIE model named "TriFuse". Leveraging the LoLI-Street dataset, we train and evaluate our TriFuse and SOTA models to benchmark on our dataset. Comparing various models, our dataset’s generalization feasibility is evident in testing across different mainstream datasets by significantly enhancing images and object detection for practical applications in autonomous driving and surveillance systems. The complete code and dataset are available at https://***/tanvirnwu/TriFuse. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.

Keywords: Image segmentation

Voice Based Navigation Assistance for the Blind

7th International Conference on Circuit Power and Computing Technologies, ICCPCT 2024

Authors: Chidvilash, Nallabothu Shyam, Padala Praneeth, Kopparapu Prasad Goud, Thokku Ram Angel, T.S. Amrita Vishwa Vidyapeetham Department of Electrical and Electronics Engineering India

ISBN: (print) 9798350372816

This paper outlines the creation of a blind-assistance application that detects the path in real time and helps visually challenged people navigate their surroundings safely. The web application uses the Flask framework to communicate between the client and the server. The core of the application is a YOLO object detection model capable of identifying objects in real time from a live image stream. After capturing and processing images from the client's webcam, the web app sends directional instructions to the client, and the client's browser processes these instructions to play the appropriate audio files, which are loaded from the server. The web application's integration of image processing, audio feedback, and web-based interaction offers an accessible solution for visually challenged people in their daily lives. © 2024 IEEE.

Keywords: Vision aids

Precision Agriculture Advancements: A Comprehensive Integrated System for Disease Prediction and Crop Yield Estimation Using Image Analysis and Environmental Data

2nd International Conference on Artificial Intelligence and Machine Learning Applications, AIMLA 2024

Authors: Nithiya, A. Navina, N. Thoshitha, D. Suvetha, R. Thirilosana, J. M. Kumarasamy College of Engineering Department of Information Technology Tamilnadu Karur 639113 India

ISBN: (digital) 9798350349221

ISBN: (print) 9798350349221

The primary problem facing agriculture, which is essential to ensuring the world's food security, is maximizing crop productivity while reducing the effects of plant diseases. Advanced technologies have the potential to completely transform agricultural methods, particularly in the areas of computer vision and machine learning. This study uses meteorological datasets and fruit image analysis to create an integrated agricultural decision support system for crop yield estimation and disease prediction. By offering early plant disease detection and precise crop yield estimates, the system seeks to improve precision agriculture techniques. A variety of datasets with plant photos labelled with disease information are gathered for the study, and meteorological data is integrated to capture environmental variables. The technology includes advanced image processing techniques to extract relevant features from plant pictures. The suggested method analyses images using a convolutional neural network to predict the disease in affected fruits. It then recommends natural fertilizers based on the disease detected. The Multilayer Perceptron algorithm is used to train the model using a large dataset that contains historical meteorological data, allowing it to identify patterns and connections between environmental conditions. Lastly, farmers receive an SMS notification with the prediction details. © 2024 IEEE.

Keywords: Fruits

12th EAI International Conference on Context-Aware Systems and Applications, ICCASA 2023

12th EAI International Conference on Context-Aware Systems and Applications, ICCASA 2023

ISBN: (print) 9783031588778

The proceedings contain 14 papers. The special focus in this conference is on Context-Aware Systems and Applications. The topics include: User-Based Collaborative Filtering Multi-criteria Recommender System Based on Interaction Between Criteria, Criteria Set with Choquet Integral; Application of Machine Learning Techniques to Classify Intention to Pay for Forest Ecosystem Services; Anomaly Detection in Univariate Time Series: HOT SAX versus LSTM-Based Method; Application of Machine Learning Models for Predicting Glucose-Level in the Pure Fluid with Algorithm for Reducing Data Dimension Based on Data Series Extraction; Comprehensive Survey On Remote Sensing Image Processing Techniques for Image Classification; Item-Based Energy Clustering Recommendation; General Evaluation of EtherCAT-Based Techniques in Various Industrial Systems: Review and Applications; Towards an IoT-Based Unmanned Surface Vehicle Design for Environment Monitoring in Mekong Delta; 3D CNN with BERT and Vision Transformer for Video Recognition; Identify Tumors on Lung CT Images; A Context-Aware Application to Monitor the Air Quality; Applying Guided Discovery Learning to Enhance the Achievement of Information Technology Team.

Generation of flattop beams from a distorted optical field by the wavefront shaping technique

JOURNAL OF THE OPTICAL SOCIETY OF AMERICA A-OPTICS IMAGE SCIENCE AND VISION, 2023, Vol. 40, No. 10, pp. 1926-1932

Authors: Sun, Hang Li, Haoran Chen, Ziyang Wu, Xiaoyan Liu, Guodong Pu, Jixiong Huaqiao Univ Coll Informat Sci & Engn Fujian Key Lab Light Propagat & Transformat Xiamen 361021 Fujian Peoples R China China Acad Engn Phys Inst Fluid Phys Mianyang 621900 Peoples R China China Acad Engn Phys Key Lab Sci & Technol High Energy Laser Mianyang 621900 Peoples R China

Uniform laser beams with controllable patterns are crucial for various applications, including laser processing and inertial confinement fusion. While some methods have been proposed to generate flattop beams, they often require complex optical systems that can become ineffective because of the misalignment of the system or the imperfection of optical elements. To overcome these issues, we utilized feedback-based wavefront shaping (FWS) technology to generate flattop beams with desired patterns from disordered light. To solve the multi-goal optimization problem, we propose some modifications based on the Non-dominated Sorting Genetic Algorithm II (NSGA2) and successfully generate focal beams with a uniform intensity distribution and controllable beam shape from the disordered light field. © 2023 Optica Publishing Group

Keywords: Diffractive optical elements; Genetic algorithms; Laser beams; Laser fusion; Laser materials processing; Optical systems
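The abstract above names the Non-dominated Sorting Genetic Algorithm II as the basis for its multi-goal optimization. The sketch below shows only the non-dominated sorting idea, in a simple front-peeling form rather than the faster bookkeeping used in standard NSGA-II implementations, and it does not reproduce the authors' modifications. The two objectives (both minimised) and the sample values are hypothetical stand-ins for quantities such as intensity non-uniformity and shape error.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one.

    Both objective vectors are assumed to be minimised.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated_sort(points):
    """Group candidate solutions into Pareto fronts.

    Returns a list of fronts; each front is a sorted list of indices into points.
    Front 0 holds solutions that no other solution dominates.
    """
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Hypothetical (non_uniformity, shape_error) pairs for five candidate phase masks.
candidates = [(1, 9), (5, 5), (9, 1), (6, 6), (10, 10)]
print(non_dominated_sort(candidates))
```

The first three candidates trade one objective against the other and so share the first front; the fourth is beaten by `(5, 5)` on both objectives, and the last is beaten by every other candidate.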

Multi-Modal Learning with Joint Image-Text Embeddings and Decoder Networks

IEEE 7th International Conference on Industrial Cyber-Physical Systems (ICPS)

Authors: Chemmanam, Ajai John Jose, Bijoy A. Moopan, Asif Cochin Univ Sci & Technol CPS Lab Dept Elect Cochin Kerala India Vuelogix Technol Pvt Ltd Kochi Kerala India

ISBN: (print) 9798350363029; 9798350363012

Advances in machine learning and neural networks have transformed natural language processing (NLP) and computer vision (CV) applications. Recent research efforts have begun to bridge the gap between the two domains. In this work, we propose a semi-supervised Multi-Modal Encoder Decoder Network (MMEDN) to capture the relationship between images and textual descriptions, allowing us to generate meaningful descriptions of images and retrieve images from a database using cross-modality search. The semi-supervised training approach, which combines ground truth text descriptions and pseudo-text generated by the text decoder within the model, requires far fewer image-text pairs in the training data and can directly add new raw images without manual text labelling for training. This approach is particularly useful for active learning environments, where labels are expensive and hard to obtain. We show that our model performs well with qualitative evaluations. We applied our model for finding images of a person from large databases and generating descriptions of people involved in an event for adding to an automatically generated report. The model was able to retrieve relevant images and generate accurate descriptions, demonstrating its applicability to more practical use cases.

Keywords: Multi-modal learning; Cross-modal retrieval; Encoder-decoder architectures; Computer vision; Natural Language Processing
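The cross-modality search described in the abstract above can be pictured as nearest-neighbour lookup in a shared embedding space. The sketch below assumes that a text query and each image have already been mapped to vectors of the same length; the vectors, image names, and the `retrieve` function are hypothetical placeholders, not outputs or components of MMEDN.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve(query_embedding, image_embeddings, top_k=2):
    """Return the names of the top_k images most similar to the query embedding."""
    ranked = sorted(
        image_embeddings.items(),
        key=lambda item: cosine_similarity(query_embedding, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]


# Hand-written vectors standing in for encoder outputs (assumption for illustration).
image_embeddings = {
    "img_a": [1.0, 0.0, 0.0],
    "img_b": [0.0, 1.0, 0.0],
    "img_c": [0.7, 0.7, 0.0],
}
text_query_embedding = [0.9, 0.1, 0.0]
print(retrieve(text_query_embedding, image_embeddings))
```

Cosine similarity is used here because it ignores vector length and compares direction only, which is a common choice for joint embeddings; the paper's actual similarity measure is not stated in the abstract.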
