Preservation of crops depends on early and accurate detection of pests, as they cause several diseases that decrease crop production and quality. Several deep-learning techniques have been applied to overcome the issue of pest detection on crops. We have developed the YOLOCSP-PEST model for pest localization and classification. With the Cross Stage Partial Network (CSPNet) backbone, the proposed model is a modified version of You Only Look Once version 7 (YOLOv7) intended primarily for pest localization and classification. Our proposed model gives exceptionally good results under conditions that are very challenging for comparable models, especially where the luminance or the orientation of the images is an issue. It helps farmers working on crops in distant areas to determine any infestation quickly and accurately, which improves the quality and quantity of the production yield. The model has been trained and tested on two datasets, namely the IP102 dataset and a local crop dataset, and has shown exceptional results on both. It achieved a mean average precision (mAP) of 88.40%, a precision of 85.55%, and a recall of 84.25% on the IP102 dataset, and a mAP of 97.18%, a precision of 97.50%, and a recall of 94.88% on the local dataset. These findings demonstrate that the proposed model is very effective in real-life detection scenarios and can help improve crop yield quality and quantity.
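The CSPNet backbone mentioned above rests on a simple split-transform-merge idea: only part of the feature channels pass through the expensive transform stage, and the untouched part is concatenated back at the end. A hypothetical minimal sketch (toy scalar "channels", not the model's real convolutional blocks):

```python
def csp_block(channels, transform):
    """Cross Stage Partial idea (illustrative sketch only):
    split the feature channels into two halves, run just one half
    through the (expensive) transform stage, then merge both halves.
    This reduces computation and duplicated gradient paths."""
    mid = len(channels) // 2
    part1, part2 = channels[:mid], channels[mid:]
    processed = [transform(c) for c in part2]
    return part1 + processed  # concatenation of untouched and processed halves

# toy usage: "channels" are scalars, "transform" doubles them
out = csp_block([1, 2, 3, 4], lambda c: c * 2)  # → [1, 2, 6, 8]
```

In the real backbone the two partial paths are tensors and the transform is a stack of convolutions, but the partitioning logic is the same.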
Web Navigation Prediction (WNP) has been popularly used for finding future probable web pages. Obtaining relevant information from a large web is challenging, as its size is growing with every second. Web data may con...
Facial expression recognition is a challenging task when neural network is applied to pattern recognition. Most of the current recognition research is based on single source facial data, which generally has the disadv...
Efficient botnet detection is of great security importance and has been the focus of researchers in recent years. Botnet detection is also a difficult task due to the difficulty in distinguishing it from normal traffi...
Regular exercise is a crucial aspect of daily life, as it enables individuals to stay physically active, lowers the likelihood of developing illnesses, and enhances life expectancy. The recognition of workout actions in video streams holds significant importance in computer vision research, as it aims to enhance exercise adherence, enable instant recognition, advance fitness tracking technologies, and optimize fitness routines. However, existing action datasets often lack diversity and specificity for workout actions, hindering the development of accurate recognition models. To address this gap, the Workout Action Video dataset (WAVd) has been introduced as a significant contribution. WAVd comprises a diverse collection of labeled workout action videos, meticulously curated to encompass various exercises performed by numerous individuals in different settings. This research proposes an innovative framework based on the Attention-driven Residual Deep Convolutional-Gated Recurrent Unit (ResDC-GRU) network for workout action recognition in video streams. Unlike image-based action recognition, videos contain spatio-temporal information, making the task more complex and challenging. While substantial progress has been made in this area, challenges persist in detecting subtle and complex actions, handling occlusions, and managing the computational demands of deep learning approaches. The proposed ResDC-GRU Attention model demonstrated exceptional classification performance with 95.81% accuracy in classifying workout action videos and also outperformed various state-of-the-art models. The method also yielded 81.6%, 97.2%, 95.6%, and 93.2% accuracy on established benchmark datasets, namely HMDB51, YouTube Actions, UCF50, and UCF101, respectively, showcasing its superiority and robustness in action recognition. The findings suggest practical implications in real-world scenarios where precise video action recognition is paramount, addressing the persisting challenges in the field. The WAVd datas...
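The GRU at the heart of the ResDC-GRU framework carries spatio-temporal state across video frames via update and reset gates. A minimal scalar sketch of one standard GRU step (hypothetical scalar weights; the real model uses weight matrices over per-frame feature vectors):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x, h, p):
    """One scalar GRU update (the recurrent unit inside ResDC-GRU).
    p holds hypothetical scalar weights; real layers use matrices."""
    z = sigmoid(p["wz"] * x + p["uz"] * h)               # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h)               # reset gate
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h))  # candidate state
    return (1.0 - z) * h + z * h_cand                    # blended new state

params = {"wz": 1.0, "uz": 0.0, "wr": 1.0, "ur": 0.0, "wh": 1.0, "uh": 1.0}
h = 0.0
for frame in [0.5, 0.1, -0.3]:  # a toy sequence of per-frame features
    h = gru_step(frame, h, params)
```

Because `tanh` bounds the candidate state and the update gate only blends, the hidden state stays in (-1, 1) regardless of sequence length, which is what makes the unit stable over long video streams.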
Nowadays, social media applications and websites have become a crucial part of people’s lives: for sharing their moments, contacting their families and friends, or even for their jobs. However, the fact that these val...
Palmprint recognition is an emerging biometrics technology that has attracted increasing attention in recent years. Many palmprint recognition methods have been proposed, including traditional methods and deep learning-based methods. Among the traditional methods, those based on directional features are mainstream because they have high recognition rates and are robust to illumination changes and small noise. However, to date, the stability of the palmprint directional response has not been deeply studied in these methods. In this paper, we analyse the problem of directional response instability in palmprint recognition methods based on directional features. We then propose a novel palmprint directional response stability measurement (DRSM) to judge the stability of the directional feature of each pixel. After filtering the palmprint image with a filter bank, we design DRSM according to the relationship between the maximum response value and the other response values at each pixel. Using DRSM, we can identify pixels with unstable directional responses and encode them with a specially designed encoding mode related to the specific method. We insert the DRSM mechanism into seven classical methods based on directional features and conduct extensive experiments on six public palmprint databases. The experimental results show that the DRSM mechanism can effectively improve the performance of these methods. In the field of palmprint recognition, this work is the first in-depth study of the stability of the palmprint directional response, so this paper has strong reference value for research on palmprint recognition methods based on directional features.
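The abstract describes DRSM as a function of the relationship between a pixel's maximum filter-bank response and its other responses. One plausible way to realize that relationship, assuming a relative-margin criterion and a threshold `tau` that are illustrative inventions, not the paper's exact formulation:

```python
def drsm(responses, tau=0.1):
    """Hypothetical directional-response stability check for one pixel.
    `responses` are the filter-bank outputs over all sampled directions.
    The dominant direction is treated as stable when the top response
    exceeds the runner-up by a clear relative margin (tau is an assumed
    parameter, not the paper's exact definition)."""
    ordered = sorted(responses, reverse=True)
    top, second = ordered[0], ordered[1]
    margin = (top - second) / (abs(top) + 1e-9)  # relative gap to runner-up
    return margin, margin >= tau

# a peaked (stable) direction profile vs. a nearly flat (unstable) one
stable_margin, stable = drsm([0.9, 0.2, 0.1, 0.15])
flat_margin, flat = drsm([0.31, 0.30, 0.29, 0.30])
```

Pixels flagged unstable would then be routed to the method-specific encoding mode the paper describes, rather than being encoded with their (unreliable) dominant direction.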
In this work, we present DocPedia, a novel large multimodal model (LMM) for versatile OCR-free document understanding, capable of parsing images at up to 2560 × 2560 resolution. Unlike existing studies that either struggle with high-resolution documents or give up the large language model and are thus constrained in vision or language ability, DocPedia directly processes visual input in the frequency domain rather than the pixel space. This unique characteristic enables DocPedia to capture a greater amount of visual and textual information using a limited number of visual tokens. To consistently enhance both the perception and comprehension abilities of DocPedia, we develop a dual-stage training strategy and enrich the instructions/annotations of all training tasks, covering multiple document types. Extensive quantitative and qualitative experiments are conducted on various publicly available benchmarks, and the results confirm the mutual benefits of jointly learning perception and comprehension tasks. The results provide further evidence of the effectiveness and superior performance of DocPedia over other methods.
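The frequency-domain intuition is that most of a document image's energy concentrates in a few low-frequency coefficients, so a small coefficient budget can summarize a large pixel block. A toy stand-in (pure-Python 2-D DCT-II on a small block; DocPedia's actual transform and tokenizer may differ):

```python
import math

def dct2_lowfreq(block, keep):
    """2-D DCT-II of a small square image block, keeping only the
    top-left `keep` x `keep` low-frequency coefficients -- a toy
    illustration of representing many pixels with few frequency
    'tokens'. Illustrative only, not DocPedia's actual pipeline."""
    n = len(block)
    def dct1(v):  # unnormalized 1-D DCT-II
        return [sum(v[i] * math.cos(math.pi * (i + 0.5) * k / n)
                    for i in range(n)) for k in range(n)]
    rows = [dct1(r) for r in block]                   # DCT along rows
    cols = list(zip(*rows))
    full = list(zip(*[dct1(list(c)) for c in cols]))  # then along columns
    return [[full[r][c] for c in range(keep)] for r in range(keep)]

# a flat 4x4 block: all energy collapses into the single DC coefficient
coeffs = dct2_lowfreq([[1.0] * 4 for _ in range(4)], keep=2)
```

For the flat block, only the DC term is nonzero, so 16 pixels compress into one meaningful coefficient; textured regions spread energy across more coefficients, but low frequencies still dominate for typical document content.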
To enhance the efficiency and accuracy of environmental perception for autonomous vehicles, we propose GDMNet, a unified multi-task perception network for autonomous driving, capable of performing drivable area segmentation, lane detection, and traffic object detection. First, in the encoding stage, features are extracted, and the Generalized Efficient Layer Aggregation Network (GELAN) is utilized to enhance feature extraction and gradient flow. Next, in the decoding stage, specialized detection heads are designed: the drivable area segmentation head employs DySample to expand feature maps, and the lane detection head merges early-stage features and processes the output through the Focal Modulation Network (FMN). Lastly, the Minimum Point Distance IoU (MPDIoU) loss function is employed to compute the matching degree between traffic object detection boxes and predicted boxes, facilitating model training. Experimental results on the BDD100K dataset demonstrate that the proposed network achieves a drivable area segmentation mean intersection over union (mIoU) of 92.2%, lane detection accuracy and intersection over union (IoU) of 75.3% and 26.4%, respectively, and traffic object detection recall and mAP of 89.7% and 78.2%, respectively. The detection performance surpasses that of other single-task or multi-task algorithm models.
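MPDIoU scores box agreement as plain IoU minus penalties for the squared distances between the two boxes' top-left corners and bottom-right corners, normalized by the image size; the training loss is then 1 minus this score. A sketch of that formulation (verify details against the MPDIoU paper):

```python
def mpdiou(box_a, box_b, img_w, img_h):
    """Minimum Point Distance IoU between two (x1, y1, x2, y2) boxes.
    IoU is penalized by the squared top-left and bottom-right corner
    distances, normalized by the squared image diagonal (a sketch of
    the MPDIoU formulation, not GDMNet's exact implementation)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    iou = inter / (area_a + area_b - inter)
    d1 = (box_a[0] - box_b[0]) ** 2 + (box_a[1] - box_b[1]) ** 2  # top-left
    d2 = (box_a[2] - box_b[2]) ** 2 + (box_a[3] - box_b[3]) ** 2  # bottom-right
    norm = img_w ** 2 + img_h ** 2
    return iou - d1 / norm - d2 / norm

# identical boxes: IoU = 1 and both corner distances are zero
score = mpdiou((10, 10, 50, 50), (10, 10, 50, 50), 640, 640)  # → 1.0
```

Unlike plain IoU, the corner-distance terms still give a useful gradient when predicted and ground-truth boxes do not overlap at all.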
The increasing dependence on smartphones with advanced sensors has highlighted the imperative of precise transportation mode classification, pivotal for domains like health monitoring and urban planning. This research is motivated by the pressing demand to enhance transportation mode classification, leveraging the potential of smartphone sensors, notably the accelerometer, magnetometer, and gyroscope. In response to this challenge, we present a novel automated classification model rooted in deep reinforcement learning. Our model stands out for its innovative approach of harnessing enhanced features through artificial neural networks (ANNs) and framing the classification task as a structured series of decision-making events. Our model adopts an improved differential evolution (DE) algorithm for initializing weights, coupled with a specialized agent-environment relationship. Every correct classification earns the agent a reward, with additional emphasis on the accurate categorization of less frequent modes through a distinct reward strategy. The Upper Confidence Bound (UCB) technique is used for action selection, promoting deep-seated knowledge and minimizing reliance on chance. A notable innovation in our work is the introduction of a cluster-centric mutation operation within the DE algorithm. This operation strategically identifies optimal clusters in the current DE population and forges potential solutions using a pioneering update mechanism. When assessed on the extensive HTC dataset, which includes 8311 hours of data gathered from 224 participants over two years, the model achieved an accuracy of 0.88±0.03 and an F-measure of 0.87±0.02, underscoring the efficacy of our approach for large-scale transportation mode classification tasks. This work introduces an innovative strategy in the realm of transportation mode classification, emphasizing both precision and reliability, addressing the pressing need for enhanced classification mechanisms in an eve...
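The UCB action selection mentioned above balances exploiting the best-scoring class against exploring rarely tried ones by adding a count-based bonus to each action's mean reward. A minimal UCB1-style sketch (the paper's exact agent design may differ):

```python
import math

def ucb_select(values, counts, t, c=2.0):
    """UCB action selection: pick the action (transport mode) that
    maximizes mean reward plus an exploration bonus shrinking with
    visit count. Untried actions are chosen first. Illustrative
    sketch; parameters here are assumptions, not the paper's."""
    best, best_score = None, float("-inf")
    for a, (q, n) in enumerate(zip(values, counts)):
        if n == 0:
            return a  # explore untried actions first
        score = q + c * math.sqrt(math.log(t) / n)
        if score > best_score:
            best, best_score = a, score
    return best

# action 1 has the highest mean, but rarely tried action 2 earns a
# large exploration bonus and gets picked
choice = ucb_select([0.5, 0.8, 0.6], [100, 100, 2], t=202)  # → 2
```

This is what the abstract means by "minimizing reliance on chance": exploration is driven deterministically by visit counts rather than by random action sampling.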