Search results - Inner Mongolia University Library

Real-time detection of wood defects based on SPP-improved YOLO algorithm

MULTIMEDIA TOOLS AND APPLICATIONS, 2023, vol. 82, no. 14, pp. 21031-21044

Authors: Cui, Yuming; Lu, Shuochen; Liu, Songyong. Affiliations: Jiangsu Normal Univ, Sch Mechatron Engn, Xuzhou 221116, Peoples R China; China Univ Min & Technol, Sch Mechatron Engn, Xuzhou 221116, Peoples R China

Wood processing is one of the most widely used processes in agriculture and industry. Low precision and high time delay of machine learning in wood defect detection are currently the main factors restricting the production efficiency and product quality of the wood processing industry. An SPP-improved deep learning method was proposed to detect wood defects based on the basic framework of the YOLO V3 network to improve accuracy and real-time performance. The extended dataset was first established by image data enhancement and preprocessing based on the limited samples of the wood defect dataset. Anchor box scale re-clustering of the wood defect dataset was carried out according to the defect features. The spatial pyramid pooling (SPP) network was applied to improve the feature pyramid (FP) network in YOLO V3. The validity and real-time performance of the proposed algorithm were verified by a randomly selected test set. The results show that the overall detection accuracy rate on the wood defect test dataset reaches 93.23% while the detection time for each image is within 13 ms.

Keywords: Transfer learning; Wood defects detection; Real-time detection; Full convolutional neural network
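The abstract above mentions anchor box re-clustering without naming the procedure. A common choice in YOLO-style detectors is k-means over box widths and heights with 1 - IoU as the distance; the sketch below assumes that choice and is not taken from the paper. The function names `iou_wh` and `cluster_anchors` and the sample boxes are hypothetical.

```python
import random


def iou_wh(box, anchor):
    """IoU of two boxes given as (width, height), both placed at the origin."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def cluster_anchors(boxes, k, iters=100, seed=0):
    """k-means over (w, h) pairs, assigning each box to its highest-IoU anchor."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)  # start from k distinct labeled boxes
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, anchors[i]))
            groups[best].append(box)
        new_anchors = []
        for i, group in enumerate(groups):
            if not group:
                new_anchors.append(anchors[i])  # keep the anchor of an empty cluster
                continue
            w = sum(b[0] for b in group) / len(group)
            h = sum(b[1] for b in group) / len(group)
            new_anchors.append((w, h))
        if new_anchors == anchors:  # assignments have stabilized
            break
        anchors = new_anchors
    return sorted(anchors, key=lambda a: a[0] * a[1])


# Hypothetical defect boxes: three small, three large.
boxes = [(10, 10), (12, 9), (9, 11), (100, 50), (110, 55), (90, 45)]
anchors = cluster_anchors(boxes, k=2)
```

Using IoU rather than Euclidean distance keeps large boxes from dominating the clustering, which is why it is the usual choice for anchor priors.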

Enhancing Contrastive Learning With Positive Pair Mining for Few-Shot Hyperspectral Image Classification

IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2024, vol. 17, pp. 8509-8526

Authors: Braham, Nassim Ait Ali; Mairal, Julien; Chanussot, Jocelyn; Mou, Lichao; Zhu, Xiao Xiang. Affiliations: Tech Univ Munich TUM, Chair Data Sci Earth Observat SiPEO, D-80333 Munich, Germany; German Aerosp Ctr DLR, Remote Sensing Technol Inst IMF, D-82234 Wessling, Germany; Univ Grenoble Alpes, Inria, CNRS, Grenoble INPLJK, F-38000 Grenoble, France

In recent years, deep learning has emerged as the dominant approach for hyperspectral image (HSI) classification. However, deep neural networks require large annotated datasets to generalize well. This limits the applicability of deep learning for real-world HSI classification problems, as manual labeling of thousands of pixels per scene is costly and time consuming. In this article, we tackle the problem of few-shot HSI classification by leveraging state-of-the-art self-supervised contrastive learning with an improved view-generation approach. Traditionally, contrastive learning algorithms heavily rely on hand-crafted data augmentations tailored for natural imagery to generate positive pairs. However, these augmentations are not directly applicable to HSIs, limiting the potential of self-supervised learning in the hyperspectral domain. To overcome this limitation, we introduce two positive pair-mining strategies for contrastive learning on HSIs. The proposed strategies mitigate the need for high-quality data augmentations, providing an effective solution for few-shot HSI classification. Through extensive experiments, we show that the proposed approach improves accuracy and label efficiency on four popular HSI classification benchmarks. Furthermore, we conduct a thorough analysis of the impact of data augmentation in contrastive learning, highlighting the advantage of our positive pair-mining approach.

Keywords: Contrastive learning; hyperspectral image (HSI) classification; positive pair mining; self-supervised learning

Autonomous Landing on a Moving Platform Using Vision-Based Deep Reinforcement Learning

IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, vol. 9, no. 5, pp. 4575-4582

Authors: Ladosz, Pawel; Mammadov, Meraj; Shin, Heejung; Shin, Woojae; Oh, Hyondong. Affiliations: Univ Manchester, Dept Mech Aerosp & Civil Engn, Manchester M13 9PL, England; Ulsan Natl Inst Sci & Technol, Dept Mech Engn, Ulsan 44610, South Korea

This letter describes autonomous landing of an unmanned aircraft system on a moving platform using vision and deep reinforcement learning. Landing on the moving platform offers several benefits, such as more mission flexibility and reduced flight time. In particular, the end-to-end vision approach (i.e., an input to the reinforcement learning is a raw image from the camera) with the deep regularized Q algorithm and custom designed reward is utilized. The custom reward was specifically devised to encourage useful feature extraction from the state space. Additionally, the proposed reinforcement learning algorithm has full 3D velocity control including the vertical channel. The simulation results show that the proposed approach can outperform existing approaches which use high-level extracted features (such as relative position and velocity of the landing pad). The simulation results are then successfully transferred to the real-world experiment by utilizing domain randomization.

Keywords: AI-enabled robotics; aerial systems: applications; reinforcement learning; vision-based navigation

Space-time image velocimetry in blurred scenes based on BSTI-DCGAN data augmentation

MEASUREMENT SCIENCE AND TECHNOLOGY, 2024, vol. 35, no. 8, p. 085302

Authors: Hu, Qiming; Jiang, Dongjin; Zhang, Guo; Zhang, Ya; Wang, Jianping. Affiliations: Kunming Univ Sci & Technol, Fac Informat Engn & Automat, Kunming 650500, Peoples R China; Minist Water Resources, Nanjing Inst Water Resources & Hydrol Automat, Nanjing 210000, Peoples R China

Due to the limited sample quantity and the complex data collection process of the blurred space-time image (BSTI) dataset, deep learning-based space-time image velocimetry (STIV) results in larger errors when applied to blurry videos. To enhance the measurement accuracy, we propose the use of STIV in blurred scenes based on BSTI-deep convolutional generative adversarial network (DCGAN) data augmentation. Firstly, BSTI-DCGAN is developed based on the DCGAN. This network utilizes a bilinear interpolation-convolution module for upsampling and integrates coordinated attention and multi-concatenation attention to enhance the resemblance between generated and real images. Next, the dataset is further expanded using artificially synthesized space-time images. Subsequently, all space-time images are transformed into spectrograms to create a training dataset for the classification network. Finally, the primary spectral direction is detected using the classification network. The experimental results indicate that our approach effectively augments the dataset and improves the accuracy of practical measurements. Under the condition of video blur, the relative errors of the average flow velocity and discharge are 3.92% and 2.72%, respectively.

Keywords: STIV; data augmentation; DCGAN; image velocimetry

Hybrid ultra-short term solar irradiation forecasting using resource-efficient multi-step long-short term memory

RENEWABLE ENERGY, 2025, vol. 247

Authors: Barancsuk, Lilla; Groma, Veronika; Kocziha, Barnabas. Affiliations: Budapest Univ Technol & Econ, Dept Elect Power Engn, Muegyet Quay 3, H-1111 Budapest, Hungary; HUN REN Ctr Energy Res, Konkoly Thege Miklos St 29-33, H-1121 Budapest, Hungary

Accurate forecasting of solar irradiance is a key tool for optimizing the efficiency and service quality of solar energy systems. In this paper, a novel approach is proposed for multi-step solar irradiation forecasting using deep learning models optimized for low computational resource environments. Traditional forecasting models often lack accuracy, and modern, deep-learning based models, while accurate, require substantial computational resources, making them impractical for real-time or resource-constrained environments. Our method uniquely combines dimensionality reduction via image processing with an LSTM-based architecture, achieving significant input data reduction by a factor of 4250 while preserving essential sky condition information, resulting in a lightweight neural network architecture that balances prediction accuracy with computational efficiency. The forecasts are generated simultaneously for multiple time steps: 1 minute, 5 minutes, 10 minutes and 20 minutes. Models are evaluated against a custom dataset, spanning across more than 3 years, containing 1 min samples encompassing both all-sky imagery and meteorological measurements. The approach is demonstrated to achieve better forecasting accuracy, namely a forecast skill of 10 % compared to persistence, and a significantly reduced computational overhead compared to benchmark ConvLSTM models. Moreover, utilizing the preprocessed image features reduces input size by a factor of 6 compared to the raw images. Our findings suggest that the proposed models are well-suited for deployment in embedded systems, remote sensors, and other scenarios where computational resources are limited.

Keywords: Solar irradiation forecast; Multistep forecasting; Deep learning; LSTM; Image features; Resource-efficient; Total sky imager
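The abstract above reports a forecast skill of 10 % compared to persistence but does not state the underlying error metric. The sketch below assumes the widely used RMSE-based definition, skill = 1 - RMSE_model / RMSE_persistence, where the persistence forecast for time t + h is simply the value observed at time t. The function names and the toy series are hypothetical.

```python
import math


def rmse(pred, actual):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))


def forecast_skill(model_pred, series, horizon):
    """Skill of model_pred relative to persistence at the given horizon (in samples).

    model_pred[i] is the model's forecast for series[i + horizon].
    """
    actual = series[horizon:]
    persistence = series[:-horizon]  # value at t reused as the forecast for t + horizon
    return 1.0 - rmse(model_pred, actual) / rmse(persistence, actual)


# Toy irradiance-like series with a one-sample horizon.
series = [0.0, 1.0, 2.0, 3.0, 4.0]
half_error_pred = [1.5, 2.5, 3.5, 4.5]  # off by 0.5, persistence is off by 1.0
skill = forecast_skill(half_error_pred, series, horizon=1)
```

Under this definition a skill of 0 means the model is no better than persistence, and a skill of 0.10 means its RMSE is 10 % lower.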

Improving Deep Learning Models Considering the Time Lags between Explanatory and Response Variables

JOURNAL OF INFORMATION PROCESSING SYSTEMS, 2024, vol. 20, no. 3, pp. 345-359

Authors: Kim, Chaehyeon; Lee, Ki Yong. Affiliations: Sookmyung Womens Univ, Dept Comp Sci, Seoul, South Korea; Univ Penn, Dept Comp & Informat Sci, Philadelphia, PA, USA

A regression model represents the relationship between explanatory and response variables. In real life, explanatory variables often affect a response variable with a certain time lag, rather than immediately. For example, the marriage rate affects the birth rate with a time lag of 1 to 2 years. Although deep learning models have been successfully used to model various relationships, most of them do not consider the time lags between explanatory and response variables. Therefore, in this paper, we propose an extension of deep learning models, which automatically finds the time lags between explanatory and response variables. The proposed method finds out which of the past values of the explanatory variables minimize the error of the model, and uses the found values to determine the time lag between each explanatory variable and response variables. After determining the time lags between explanatory and response variables, the proposed method trains the deep learning model again by reflecting these time lags. Through various experiments applying the proposed method to a few deep learning models, we confirm that the proposed method can find a more accurate model whose error is reduced by more than 60% compared to the original model.

Keywords: Deep learning; Model optimization; Regression model; Time lag
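The lag search described in the abstract above can be illustrated on a single explanatory variable. The sketch below is a simplification under stated assumptions: it swaps the deep learning model for an ordinary least-squares line, tries every lag from 0 to `max_lag`, and keeps the lag with the lowest mean squared error. The function names and the toy data are hypothetical and not from the paper.

```python
def fit_line_mse(xs, ys):
    """Fit y = a * x + b by least squares and return the mean squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / n


def best_lag(x, y, max_lag):
    """Return (lag, errors) where lag minimizes the error of y[t] ~ x[t - lag]."""
    errors = {}
    ys = y[max_lag:]  # same evaluation window for every candidate lag
    for lag in range(max_lag + 1):
        xs = x[max_lag - lag: len(x) - lag]  # x[t - lag] for t = max_lag .. end
        errors[lag] = fit_line_mse(xs, ys)
    return min(errors, key=errors.get), errors


# Toy data: the response follows the explanatory variable with a lag of 2 steps.
x = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
y = [0, 0] + [2 * x[t - 2] + 1 for t in range(2, len(x))]
lag, errors = best_lag(x, y, max_lag=3)
```

Evaluating all lags on the same window keeps the errors comparable; a model would then be retrained with each variable shifted by its selected lag.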

Transformer-Based Dog Behavior Classification With Motion Sensors

IEEE SENSORS JOURNAL, 2024, vol. 24, no. 20, pp. 33816-33825

Authors: Or, Barak. Affiliations: MetaOr Artificial Intelligence, CEO Off, IL-3349602 Haifa, Israel; Reichman Univ, Google Reichman Tech Sch, IL-4610101 Herzliyya, Israel

This article deals with classifying dog behavior using motion sensors, leveraging a transformer-based deep neural network (DNN) model. Understanding dog behavior is essential for fostering positive relationships between dogs and humans and ensuring their well-being. Traditional methods often fall short in capturing temporal dependencies and efficiently processing high-dimensional sensor data. Our proposed architecture, inspired by its success in natural language processing (NLP), utilizes the self-attention mechanism of the transformer to effectively identify relevant features across various time scales, making it ideal for real-time applications. The architecture includes only the encoder part with a classifier's head to output probabilities of dog behavior. We used an open-access dataset focusing on seven different dog behaviors, captured by motion sensors on top of the dog's back. Through experimentation and optimization, our model demonstrates superior performance with an impressive accuracy rate of 98.5%, outperforming time series DNN models. The model's efficiency is further highlighted by its reduced computational complexity, lower latency, and smaller size, making it well-suited for deployment in resource-constrained environments.

Keywords: Dogs; Transformers; Motion detection; Sensors; Computational modeling; Data models; Computer architecture; Accelerometer; attention mechanism; deep neural network (DNN); dog activity detection; dog behavior; gyroscope; inertial sensors; long short-term memory (LSTM); machine learning; mode recognition; motion sensors; pet activity detection (PAD); real-time; supervised learning; transformers
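The self-attention mechanism named in the abstract above can be shown in a few lines. The sketch below is a deliberately reduced version: it uses scaled dot-product attention with Q = K = V equal to the input window, with no learned projection matrices, multiple heads, or positional encoding, so it is not the paper's architecture. The function names and the sample window are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention over a sequence of feature vectors.

    x is a list of time steps, each a list of d sensor features.
    Assumes Q = K = V = x (no learned weights).
    """
    d = len(x[0])
    out = []
    for query in x:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in x]
        weights = softmax(scores)  # how much this step attends to every step
        out.append([sum(w * value[j] for w, value in zip(weights, x)) for j in range(d)])
    return out


# Hypothetical window of three time steps with two features each.
window = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(window)
```

Each output row is a weighted average of all time steps, which is how the mechanism lets one step draw on context at any distance within the window.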

Resource-aware strategies for real-time multi-person pose estimation

IMAGE AND VISION COMPUTING, 2025, vol. 155

Authors: Esmail, Mohammed A.; Wang, Jinlei; Wang, Yihao; Sun, Li; Zhu, Guoliang; Zhang, Guohe. Affiliations: Xi An Jiao Tong Univ, Sch Microelect, Xian 710049, Peoples R China; Beijing Inst Technol, Beijing 100076, Peoples R China; Beijing Res Inst Telemetry, Beijing 100076, Peoples R China

When using deep learning applications for human posture estimation (HPE), especially on devices with limited resources, accuracy and efficiency must be balanced. Common deep-learning architectures have a propensity to use a large amount of processing power while yielding low accuracy. This work proposes the implementation of Efficient YoloPose, a new architecture based on You Only Look Once version 8 (YOLOv8)-Pose, in an attempt to address these issues. Advanced lightweight methods like Depthwise Convolution, Ghost Convolution, and the C3Ghost module are used by Efficient YoloPose to replace traditional convolution and C2f (a quicker implementation of the Cross Stage Partial Bottleneck). This approach greatly decreases the inference time, parameter count, and computing complexity. To improve posture estimation even further, Efficient YoloPose integrates the Squeeze Excitation (SE) attention method into the network. The main focus of this process during posture estimation is the significant areas of an image. Experimental results show that the suggested model performs better than the current models on the COCO and OCHuman datasets. The proposed model lowers the inference time from 1.1 milliseconds (ms) to 0.9 ms, the computational complexity from 9.2 Giga Floating-point operations (GFlops) to 4.8 GFlops and the parameter count from 3.3 million to 1.3 million when compared to YOLOv8-Pose. In addition, this model maintains an average precision (AP) score of 78.8 on the COCO dataset. The source code for Efficient YoloPose has been made publicly available at [https://***/malareeqi/Efficient-YoloPose].

Keywords: Deep learning; Human pose estimation (HPE); Efficient YoloPose; Lightweight techniques; Computational efficiency
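To see why replacing traditional convolution with depthwise variants reduces the parameter count, as the abstract above reports, it helps to compare the weight counts directly. The sketch below assumes a depthwise convolution followed by a 1 x 1 pointwise convolution (the usual depthwise separable pairing) and ignores biases; the exact blocks in Efficient YoloPose are not specified here, and the channel sizes are hypothetical.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution plus a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out  # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 kernel.
standard = standard_conv_params(64, 128, 3)
separable = depthwise_separable_params(64, 128, 3)
ratio = standard / separable
```

For this layer the separable form needs roughly one eighth of the weights, and the saving grows with the kernel size and the number of output channels.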

Learning to Control Camera Exposure via Reinforcement Learning

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Lee, Kyunghyun; Shin, Ukcheol; Lee, Byeong-Uk. Affiliations: LG AI Res, Seoul, South Korea; CMU, Pittsburgh, PA, USA; KRAFTON, Seoul, South Korea

ISBN: (print) 9798350353013; 9798350353006

Adjusting camera exposure in arbitrary lighting conditions is the first step to ensure the functionality of computer vision applications. Poorly adjusted camera exposure often leads to critical failure and performance degradation. Traditional camera exposure control methods require multiple convergence steps and time-consuming processes, making them unsuitable for dynamic lighting conditions. In this paper, we propose a new camera exposure control framework that rapidly controls camera exposure while performing real-time processing by exploiting deep reinforcement learning. The proposed framework consists of four contributions: 1) a simplified training ground to simulate real-world's diverse and dynamic lighting changes, 2) flickering and image attribute-aware reward design, along with lightweight state design for real-time processing, 3) a static-to-dynamic lighting curriculum to gradually improve the agent's exposure-adjusting capability, and 4) domain randomization techniques to alleviate the limitation of the training ground and achieve seamless generalization in the wild. As a result, our proposed method rapidly reaches a desired exposure level within five steps with real-time processing (1ms). Also, the acquired images are well-exposed and show superiority in various computer vision tasks, such as feature extraction and object detection.

Keywords: auto exposure control; reinforcement learning

Breast Cancer Image Classification Method Based on Deep Transfer Learning

1st International Conference on Image Processing, Machine Learning and Pattern Recognition

Authors: Wang, Weimin; Li, Yufeng; Yan, Xu; Xiao, Mingxuan; Gao, Min. Affiliations: Hong Kong Univ Sci & Technol, China Hong Kong, Peoples R China; Univ Southampton, Southampton, Hants, England; Trine Univ, Angola, IN, USA; Southwest Jiaotong Univ, Chengdu, Peoples R China

ISBN: (print) 9798400707032

To address the issues of limited samples, time-consuming feature design, and low accuracy in detection and classification of breast cancer pathological images, a breast cancer image classification model algorithm combining deep learning and transfer learning is proposed. This algorithm is based on the Densely Connected Convolutional Networks (DenseNet) structure of deep neural networks, and constructs a network model by introducing attention mechanisms, and trains the enhanced dataset using multi-level transfer learning. Experimental results demonstrate that the algorithm achieves an efficiency of over 84.0% in the test set, with a significantly improved classification accuracy compared to previous models, making it applicable to medical breast cancer detection tasks.

Keywords: Breast cancer; Medical image classification; Transfer learning; DenseNet
