Search Results - Inner Mongolia University Library

International Conference on Image Processing, Computer Vision and Machine Learning (ICICML)

Authors: Xue, Song; Liu, Yongfeng; Xu, Chao; Li, Jun. Affiliations: Army Acad Artillery & Air Def Dept Weap Engn Hefei Peoples R China; Army Acad Artillery & Air Def Postgrad Team Hefei Peoples R China

ISBN: (print) 9781665464680

At present, common object detection methods often perform poorly on missile-borne images. Because experimental data for missile-borne images is difficult to obtain, the targets are small, and the imaging environment is complex and random, building an appropriate object detection model for such images is a challenge. Based on the classic YOLOv5, the paper constructs YOLO v5mb, an aerial-platform image target detection model suited to missile-borne images. The model can accurately detect targets in single-mode visible or infrared missile-borne images. In addition, the fusion layer architecture in YOLO v5mb makes it suitable for object detection in multi-mode fused visible and infrared missile-borne images.

Keywords: Missile-borne image; infrared and visible image fusion; object detection

Near-optimal clustering in the k-machine model

THEORETICAL COMPUTER SCIENCE, 2022, vol. 899, pp. 80-97

Authors: Bandyapadhyay, Sayan; Inamdar, Tanmay; Pai, Shreyas; Pemmaraju, Sriram V. Affiliations: Univ Bergen Dept Informat Bergen Norway; Univ Iowa Dept Comp Sci Iowa City IA 52242 USA

The clustering problem, in its many variants, has numerous applications in operations research and computer science (e.g., in bioinformatics, image processing, social network analysis, etc.). As sizes of data sets have grown rapidly, researchers have focused on designing algorithms for clustering problems in models of computation suited for large-scale computation such as MapReduce, Pregel, and streaming models. The k-machine model (Klauck et al., (SODA 2015) [8]) is a simple, message-passing model for large-scale distributed graph processing. This paper considers three of the most prominent examples of clustering problems: the uncapacitated facility location problem, the p-median problem, and the p-center problem, and presents $O(1)$-factor approximation algorithms for these problems running in $\tilde{O}(n/k)$ rounds in the k-machine model. These algorithms are optimal up to polylogarithmic factors because this paper also shows $\tilde{\Omega}(n/k)$ lower bounds for obtaining polynomial-factor approximation algorithms for these problems. These are the first results for clustering problems in the k-machine model. We assume that the metric provided as input for these clustering problems is only implicitly provided, as an edge-weighted graph. In a nutshell, our main technical contribution is to show that constant-factor approximation algorithms for all three clustering problems can be obtained by learning only a small portion of the input metric. © 2021 Elsevier B.V. All rights reserved.

Keywords: Clustering; Facility location; k-median; k-center; k-machine model; Large-scale clustering; Distributed clustering

Video Based Car Parking Management and Monitoring Using Computer Vision and Machine Learning

2025 International Conference on Multi-Agent Systems for Collaborative Intelligence, ICMSCI 2025

Authors: Sujitha, Bulla; Ponraj, Anitha; Chakkarapani, V.; Parabrahmachari, Sriram; Lakshmi, T.V. Hyma; Annamani, T. Affiliations: V. R. Siddhartha Engineering College Department of Ece AP Vijayawada India; Kalasalingam Academy of Research and Education Kalasalingam University Department of Computer Science and Engineering Tamil Nadu Krishnankoil India; School of Electrical and Electronics Sathyabama Institute of Science and Technology Tamil Nadu Chennai India; Guru Nanak Institutions Department of Cse Technical Campus Telangana Ibrahimpatnam India; S.R.K.R. Engineering College Department of Ece A.P Bhimavaram India; Anurag University Department of Ece Telangana Hyderabad India

ISBN: (print) 9798331509828

Each day, countless individuals around the globe are affected by the extensive and complex problem of traffic congestion. This growing challenge is becoming increasingly serious across the country, driven by factors such as rising population, rapid urban development, insufficient infrastructure, and changing socioeconomic requirements. This paper presents a novel global video-based parking approach designed to improve the monitoring and management of parking spaces using real-time aerial statistics. Using advanced image processing and computer vision techniques, the system detects and analyzes the occupancy status of parking slots. In our implementation, we assessed a total of 48 parking spaces, identifying 7 as unoccupied and 41 as occupied. The use of Python, OpenCV, and machine learning supported real-time monitoring, allowing precise counting and display of parking availability. The procedure comprised several key steps, including Gaussian blur filtering to reduce noise and improve edge detection, followed by binary thresholding to separate vehicles and parking spots from the background. Furthermore, a median filter was used to improve the clarity of vehicle contours. These image enhancement methods significantly improved detection accuracy and facilitated precise recognition of the Region of Interest (ROI). The results show the system's ability to provide reliable, real-time data for dynamic parking management, thereby addressing the challenges of urban parking. © 2025 IEEE.

Keywords: Traffic congestion
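
The preprocessing chain named in the abstract above (Gaussian blur, then binary thresholding, then a median filter, then a per-slot occupancy decision) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract mentions OpenCV, but this sketch uses pure Python so it runs without dependencies, and the 3x3 kernel, the threshold of 128, the `min_fill` ratio, and all function names are assumptions made for the example.

```python
from statistics import median

# 3x3 Gaussian kernel; the weights sum to 16 (assumed kernel size).
GAUSS_3X3 = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]


def _window(img, r, c):
    """Return the 3x3 window around (r, c), replicating border pixels."""
    h, w = len(img), len(img[0])
    return [[img[min(max(r + dr, 0), h - 1)][min(max(c + dc, 0), w - 1)]
             for dc in (-1, 0, 1)] for dr in (-1, 0, 1)]


def gaussian_blur(img):
    """Smooth a grayscale image (list of rows) to suppress noise."""
    out = []
    for r in range(len(img)):
        row = []
        for c in range(len(img[0])):
            win = _window(img, r, c)
            total = sum(GAUSS_3X3[i][j] * win[i][j]
                        for i in range(3) for j in range(3))
            row.append(total / 16)
        out.append(row)
    return out


def binary_threshold(img, thresh):
    """Map pixels above thresh to 1 (foreground) and the rest to 0."""
    return [[1 if px > thresh else 0 for px in row] for row in img]


def median_filter(img):
    """Replace each pixel by the median of its 3x3 neighbourhood."""
    return [[median(v for line in _window(img, r, c) for v in line)
             for c in range(len(img[0]))] for r in range(len(img))]


def slot_occupied(mask, slot, min_fill):
    """A slot (top, left, height, width) counts as occupied when the
    fraction of foreground pixels inside it reaches min_fill."""
    top, left, height, width = slot
    filled = sum(mask[r][c]
                 for r in range(top, top + height)
                 for c in range(left, left + width))
    return filled / (height * width) >= min_fill


def count_slots(img, slots, thresh=128, min_fill=0.3):
    """Return (occupied, free) counts for the given slot rectangles."""
    mask = median_filter(binary_threshold(gaussian_blur(img), thresh))
    occupied = sum(slot_occupied(mask, s, min_fill) for s in slots)
    return occupied, len(slots) - occupied


if __name__ == "__main__":
    # Toy frame: a bright "vehicle" on the left half, empty asphalt on the right.
    frame = [[200] * 6 + [10] * 6 for _ in range(6)]
    print(count_slots(frame, [(0, 0, 6, 6), (0, 6, 6, 6)]))
```

In a real deployment the three filters would more likely be library calls, and the slot rectangles would come from a calibration step; the sketch only shows the order of operations and how an occupancy count falls out of the final mask.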

AI powered detection and assessment of onychomycosis: A spotlight on yellow and deep learning

JEADV CLINICAL PRACTICE, 2025, vol. 4, no. 1, pp. 156-165

Authors: Agostini, C.; Ranjan, R.; Molnarova, M.; Hadzic, A.; Kubesh, O.; Schnidar, V.; Schnidar, H. Affiliations: SCARLETRED Holding GmbH Maria Jacobi Gasse 18th Floor A-1030 Vienna Austria

Background: Despite significant advances in computer-aided diagnostics, onychomycosis, a widespread fungal nail infection, lacks an automated approach for objective analysis and *** study aimed to develop and validate automated machine learning models to accurately detect and classify onychomycosis-affected areas in *** images in this study were captured using the Scarletred (R) Vision mobile app and SkinPatch, a CE certified medical device system working seamlessly together to deliver auto-color calibrated, high-resolution clinical images. Considering a total of 1687 images from 440 subjects, the research explores various degrees of onychomycosis and evaluates the infection extent in the toenails detected. We developed an advanced machine learning algorithm for precise segmentation and classification of onychomycosis-affected toenails, utilizing expert annotations and advanced post-processing techniques. Additionally, an analysis of nail growth was performed, and a comparison graph with the percentage of infection was *** advanced machine learning algorithms, we successfully detected toenails, enabling detailed analysis of intricate structures within the images. We achieved a final validation loss of 0.0236 and an F1 score of 0.8566 for accurate toenail detection, while the Random Forest algorithm demonstrated 81% accuracy in classifying and distinguishing between infected and healthy toenail areas. Our applied superpixel method furthermore improved the algorithm's precision in identifying the infected *** AI-powered image analysis method, initially focused on the big toe's toenail, shows great promise for broader validation on comprehensive datasets, enabling more detailed assessments of onychomycosis severity and disease dynamics. The potential impact of limited patient diversity, particularly with darker skin tones, needs further assessment. Proven to measure nail growth and assess trea

Keywords: Artificial Intelligence (AI); computer-aided diagnostics; deep learning; dermatology; eHealth; machine learning (ML); nail fungus; neural network; onychomycosis; Scarletred (R) Vision; Standardized Erythema Value (SEV); toenail detection

IPMV 2023 - Proceedings of 2023 5th International Conference on Image Processing and Machine Vision

5th International Conference on Image Processing and Machine Vision, IPMV 2023

ISBN: (print) 9781450397926

The proceedings contain 16 papers. The topics discussed include: performance evaluation of recent object detection models for traffic safety applications on edge; tracking of artillery shell using optical flow; action recognition with non-uniform key frame selector; a view direction-driven approach for automatic room mapping in mixed reality; automatic gait gender classification using convolutional neural networks; deep 3D-2D convolutional neural networks combined with Mobinenetv2 for hyperspectral image classification; attention based BiGRU-2DCNN with hunger game search technique for low-resource document-level sentiment classification; strategies of multi-step-ahead forecasting for chaotic time series using autoencoder and LSTM neural networks: a comparative study; semi-supervised defect segmentation with uncertainty-aware pseudo-labels from multi-branch network; and security analysis of visual based share authentication and algorithms for invalid shares generation in malicious model.

IMPLICIT CHANNEL LEARNING FOR MACHINE LEARNING APPLICATIONS IN 6G WIRELESS NETWORKS

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Elbir, Ahmet M.; Shi, Wei; Mishra, Kumar Vijay; Papazafeiropoulos, Anastasios K.; Chatzinotas, Symeon. Affiliations: Univ Luxembourg Interdisciplinary Ctr Secur Reliabil & Trust Luxembourg Luxembourg; Carleton Univ Sch Informat Technol Ottawa ON Canada; US DEVCOM Army Res Lab Adelphi MD USA; Univ Hertfordshire Hatfield Herts England

ISBN: (print) 9798350302615

With the deployment of the fifth generation (5G) wireless systems gathering momentum across the world, possible technologies for 6G are under active research discussions. In particular, the role of machine learning (ML) in 6G is expected to enhance and aid emerging applications such as virtual and augmented reality, vehicular autonomy, computer vision, and internet of everything. This will result in large segments of wireless data traffic comprising image, video, and speech. The ML algorithms process these for classification/recognition/estimation through the learning models located on cloud servers. This requires wireless transmission of data from edge devices to the cloud server. Channel estimation, handled separately from the recognition step, is critical for accurate learning performance. Toward combining the learning for both the channel and the ML data, we introduce implicit channel learning to perform the ML tasks without estimating the wireless channel. Here, the ML models are trained with channel-corrupted datasets in place of nominal data. Without channel estimation, the proposed approach exhibits approximately 60% improvement in image and speech classification tasks for diverse scenarios such as millimeter wave and IEEE 802.11p vehicular channels.

Keywords: machine learning; channel estimation; artificial intelligence; wireless communications
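
The idea of training on channel-corrupted data rather than nominal data can be illustrated with a small data-preparation sketch. The abstract above does not specify a channel model, so everything here is an assumption made for illustration: a single complex Rayleigh-distributed gain per sample (flat block fading), additive white Gaussian noise at a chosen SNR, and the hypothetical function name `corrupt_with_channel`. It covers only the dataset corruption step, not the learning model.

```python
import random


def corrupt_with_channel(samples, snr_db=10.0, seed=0):
    """Return a channel-corrupted copy of a dataset.

    samples: list of sequences of complex symbols (one sequence per example).
    Each example is multiplied by one random complex gain h (assumed flat
    fading, unit average power) and perturbed by complex Gaussian noise whose
    power is set relative to the example's average symbol power.
    """
    rng = random.Random(seed)
    corrupted = []
    for x in samples:
        # Complex Gaussian gain with E[|h|^2] = 1 (Rayleigh magnitude).
        h = complex(rng.gauss(0, 1), rng.gauss(0, 1)) / 2 ** 0.5
        power = sum(abs(v) ** 2 for v in x) / len(x)
        # Split the noise power equally between real and imaginary parts.
        noise_std = (power / 10 ** (snr_db / 10) / 2) ** 0.5
        y = [h * v + complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))
             for v in x]
        corrupted.append(y)
    return corrupted


if __name__ == "__main__":
    nominal = [[1 + 0j, -1 + 0j, 1j, -1j]]
    # A model would be fit on `train` (with the original labels) instead of
    # on `nominal`, so no separate channel-estimation step is needed.
    train = corrupt_with_channel(nominal, snr_db=10.0, seed=1)
    print(train[0])
```

Labels are left untouched; only the inputs pass through the simulated channel, which is what lets the model absorb the channel effect during training.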

Image Captioning with Reinforcement Learning

2nd IEEE International Conference on Computer Vision and Machine Intelligence, CVMI 2023

Authors: Verma, Anand; Agarwal, Saurabh; Arya, K.V.; Petrlik, Ivan; Esparza, Roberto; Rodriguez, Ciro. Affiliations: ABV-IIITM Gwalior Department of Information Technology Madhya Pradesh 474015 India; ABV-Indian Institute of Information Technology and Management Multimedia and Information Security Research Group Department of Computer Science and Engineering Gwalior 474015 India; National University Federico Villarreal Faculty of Industrial and Systems Engineering Lima Peru; National University Mayor de San Marcos Faculty of Systems Engineering and Informatics Peru

ISBN: (print) 9798350305142

Image captioning involves generating a natural language description that accurately represents the content and context of an image. To achieve this, image captioning draws on machine learning techniques from fields such as computer vision and natural language processing. Many advances have been made with encoder-decoder models and reinforcement learning algorithms. However, there is still a mismatch between training and testing, as reinforcement learning optimizes only a single comparison metric such as CIDEr, SPICE, or BLEU and does not perform well on multiple metrics at once, which is why generated captions lack diversity. This work proposes a general technique for collaborative updating that can bridge the gap between evaluation measures and test metrics to produce captions that are more human-like. To increase the precision of image captions, the approach uses a compiled reward system that considers multiple evaluation metrics when comparing the generated sentence with the provided sentences. We will evaluate the model's performance and the reward updating process on standard datasets like MS COCO. © 2023 IEEE.

Keywords: Reinforcement learning

Building Detection Using Very High Resolution SAR Images with Multi-Direction Based on Weighted-Morphological Indexes

12th Iranian/2nd International Conference on Machine Vision and Image Processing, MVIP 2022

Authors: Amjadipour, Fateme; Ghassemian, Hassan; Imani, Maryam. Affiliations: Tarbiat Modares University Image Processing and Information Analysis Lab Tehran Iran

ISBN: (print) 9781665412162

Today, technological advances have produced radar images with high spatial resolution, and the growing availability of these images has driven significant growth in the interpretation and processing of high-resolution radar images. Building extraction in urban areas is one of the most challenging applications of VHR SAR images and is used to estimate population and urban development. Detection of individual buildings in the urban context receives considerable attention from researchers because radar images are complex to interpret in these settings. One of the main sources of complexity in the scattering received from buildings is the change in the direction of the building relative to the horizon, which is correlated with the look angle. Other influential parameters are geometric distortions, which include layover and shadow effects. In some cases, the shadow effect is an auxiliary parameter that increases detection accuracy for these targets. In this paper, we extract buildings from high spatial resolution SAR images using fuzzy fusion of two morphological indicators, SI and DI, which represent the shadow and bright area, respectively. Because SAR imaging geometry affects ground targets, structural elements of different sizes and directions were applied to the image. This work proposes weighting the indicators according to the different sizes. In an experiment on a TerraSAR-X image, the detection ratio is 95.3%. © 2022 IEEE.

Keywords: Morphology

RASHT: A Partially Reconfigurable Architecture for Efficient Implementation of CNNs

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2022, vol. 30, no. 7, pp. 860-868

Authors: Darbani, Paria; Rohbani, Nezam; Beitollahi, Hakem; Lotfi-Kamran, Pejman. Affiliations: Iran Univ Sci & Technol Sch Comp Engn Tehran *** Iran; Inst Res Fundamental Sci IPM Sch Comp Sci Tehran 193955531 Iran

Convolutional neural networks (CNNs) are widely used in machine learning (ML) applications such as image processing. CNNs require heavy computation to provide significant accuracy for many ML tasks. Therefore, implementing CNNs efficiently, improving performance with limited resources and without reducing accuracy, is a challenge for ML systems. One of the architectures for the efficient execution of CNNs is the array-based accelerator, which consists of an array of similar processing elements (PEs). Array accelerators are popular as a high-performance architecture because they exploit parallel computing and data reuse. These accelerators are optimized for a set of CNN layers, not for individual layers. Using the same accelerator dimensions to compute all CNN layers with varying shapes and sizes leads to the resource underutilization problem. We propose a flexible and scalable architecture for array-based accelerators that increases resource utilization by resizing PEs to better match the different shapes of CNN layers. The low-cost partial reconfiguration improves resource utilization and performance, resulting in a 23.2% reduction in the computation time of GoogLeNet compared to state-of-the-art accelerators. The proposed architecture decreases the on-chip memory access rate by 26.5% with no accuracy loss.

Keywords: Computer architecture; Convolutional neural networks; Arrays; Resource management; System-on-chip; Computational modeling; Very large scale integration; Array accelerator; convolutional neural network (CNN); image processing and computer vision; machine learning (ML); reconfigurable hardware

FashionGPT: A Large Vision-Language Model for Enhancing Fashion Understanding

33rd International Conference on Artificial Neural Networks and Machine Learning (ICANN)

Authors: Song, Duanxiao; Gao, Dehong; Liu, Gongshen; Li, Xiaoyong. Affiliations: Shanghai Jiao Tong Univ Shanghai Peoples R China; Northwestern Polytech Univ Xian Peoples R China

ISBN: (print) 9783031723438; 9783031723445

Fashion understanding is a challenging multi-modal task of interpreting multiple aspects of fashion images. While traditional computer vision or multi-modal algorithms fall short in providing a comprehensive understanding, the Large Vision-Language Model (LVLM) offers a new approach. However, directly using LVLMs presents four major limitations, highlighting the need for a fashion-specific LVLM. Existing fashion datasets also reveal limitations in providing a coherent natural input that fits the LVLMs. To address this bottleneck, we introduce the FUND dataset featuring meticulously annotated textual descriptions for fashion images. Specifically, we build a fashion knowledge base and collect fashion images in various categories online. By leveraging an image segmentation model and GPT4, we refine the pre-annotations through manual modifications. Through instruct-tuning with FUND, we develop FashionGPT, a GPT-assisted LVLM based on a solid architecture with exceptional performance on fashion understanding. It is capable of generating coherent and multi-aspect descriptions for fashion images and greatly alleviates the four limitations. Extensive experiments quantitatively and qualitatively demonstrate the effectiveness of FashionGPT and the benefits of FUND, and showcase the broad applications in more tasks.

Keywords: Fashion Understanding; Large Vision-Language Model; Instruct Tuning
