ISBN: 9798350355413 (digital)
ISBN: 9798350355420 (print)
Despite the great success of deep learning in object detection, its performance and efficiency for small target detection in UAV images remain unsatisfactory. To improve small target detection, this paper proposes MS-YOLO, an algorithm based on YOLOv8 that improves the backbone network and the detection head. First, SPD (Space-to-Depth Convolution), a feature extraction layer suited to low-resolution images and small targets, is integrated into the shallow layers of the backbone network. This layer uses neither strided convolution nor pooling, so it better retains small target feature information. Second, a detection head dedicated to the small target feature layer is added. The improved algorithm achieves state-of-the-art detection results on the VisDrone dataset, with two reported average precision (AP) values of 46.5 and 28.0 and a model parameter count of 16.8M, improving detection accuracy while remaining lightweight.
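The space-to-depth rearrangement that gives the SPD layer its name can be sketched as follows. This is a minimal illustration on nested lists, not the MS-YOLO implementation: the function name `space_to_depth`, the `scale` parameter, and the channel ordering are assumptions, and the convolution that would follow the rearrangement is omitted.

```python
def space_to_depth(fmap, scale=2):
    """Rearrange a feature map of shape (C, H, W) into (C*scale*scale, H//scale, W//scale).

    Every pixel is kept: spatial resolution is traded for extra channels,
    unlike strided convolution or pooling, which discard or merge pixels.
    """
    channels = len(fmap)
    height, width = len(fmap[0]), len(fmap[0][0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")
    out = []
    # One output channel per (row offset, column offset, input channel).
    for dy in range(scale):
        for dx in range(scale):
            for ch in range(channels):
                out.append([
                    [fmap[ch][y][x] for x in range(dx, width, scale)]
                    for y in range(dy, height, scale)
                ])
    return out


# A single-channel 4x4 map becomes four 2x2 maps.
feature_map = [[[0, 1, 2, 3],
                [4, 5, 6, 7],
                [8, 9, 10, 11],
                [12, 13, 14, 15]]]
rearranged = space_to_depth(feature_map)
```

Because the output holds exactly the same values as the input, no fine detail is lost before the next layer, which is the property the abstract credits for retaining small target features.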
Home service robots operating indoors, such as in houses and offices, must identify and locate target objects accurately and in real time to perform service tasks efficiently. However, images captured by visual sensors while the robot is moving usually contain varying degrees of blur, which presents a significant challenge for object detection. In particular, daily-life scenes contain small objects such as fruit and tableware, which are often occluded, further complicating object recognition and positioning. A dynamic, real-time object detection algorithm is proposed for home service robots. It is composed of an image deblurring algorithm and an object detection algorithm. To improve the clarity of motion-blurred images, the DA-Multi-DCGAN algorithm is proposed. It comprises an embedded dynamic adjustment mechanism and a multimodal multiscale fusion structure based on robot motion and surrounding environmental information, enabling the deblurring of images captured under different motion states. Compared with DeblurGAN, DA-Multi-DCGAN improved the Peak Signal-to-Noise Ratio (PSNR) by 5.07 and the Structural Similarity (SSIM) by 0.022. An AT-LI-YOLO method is proposed for small and occluded object detection. Based on depthwise separable convolution, this method highlights key areas and integrates salient features by embedding an attention module in the AT-Resblock, improving sensitivity and detection precision for small and partially occluded objects. It also employs a lightweight network unit, Lightblock, to reduce the network's parameters and computational complexity, which improves its computational efficiency. Compared with YOLOv3, the mean average precision (mAP) of AT-LI-YOLO increased by 3.19%, and the detection precision for small objects, such as apples and oranges, and for partially occluded objects increased by 19.12% and 29.52%, respectively. Moreover, the model inference efficiency had a 7 ms red
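For readers unfamiliar with the deblurring metric quoted above, PSNR is conventionally computed from the mean squared error between a reference image and a restored one and expressed in decibels. The sketch below is a generic illustration, not the evaluation code of the paper: the function name `psnr`, the 8-bit peak value of 255, and the toy pixel values are all assumptions.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equally sized grayscale images."""
    ref_pixels = [p for row in reference for p in row]
    res_pixels = [p for row in restored for p in row]
    if not ref_pixels or len(ref_pixels) != len(res_pixels):
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref_pixels, res_pixels)) / len(ref_pixels)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 images: a sharp reference and a deblurred estimate.
sharp = [[10, 10], [10, 10]]
deblurred = [[12, 8], [10, 10]]
score = psnr(sharp, deblurred)
```

A higher value means the restored image is closer to the reference, so a gain in PSNR over a baseline indicates a smaller pixel-wise error after deblurring.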
Without agriculture, human existence would be inconceivable. A large percentage of the world's population relies on agriculture for its daily needs. In addition, it creates a large number of jobs in the area. Usin...
The potato is grown worldwide and is the fourth largest food crop. Each of us knows the potato as a vegetable. If we look at other countries, it is clear that the potato is the most popular vegetable in the world, as ...
Deep learning models (DLMs) frequently achieve accurate segmentation and classification of tumors from medical images. However, DLMs lacking feedback on their image segmentation mechanisms such as Dice coefficients an...
ISBN: 9783031070044 (print)
The proceedings contain 33 papers presented at a virtual meeting. The special focus of this conference is on Recent Trends in Image Processing and Pattern Recognition. The topics include: Real-Time Face Recognition for Organisational Attendance Systems; Harnessing Sustainable Development in Image Recognition Through No-Code AI Applications: A Comparative Analysis; Evaluating Performance of Adam Optimization by Proposing Energy Index; An Alignment-Free Fingerprint Template Protection Technique Based on Minutiae Triplets; Early Prediction of Complex Business Processes Using Association Rule Based Mining; A Framework for Masked-Image Recognition System in COVID-19 Era; A Deep-Learning Based Automated COVID-19 Physical Distance Measurement System Using Surveillance Video; Detection of Male Fertility Using AI-Driven Tools; Face Mask Detection Using Deep Hybrid Network Architectures; A Super Feature Transform for Small-Size Image Forgery Detection; UHTelHwCC: A Dataset for Telugu Off-line Handwritten Character Recognition; Inflectional and Derivational Hybrid Stemmer for Sentiment Analysis: A Case Study with Marathi Tweets; Adaptive Threshold-Based Database Preparation Method for Handwritten Image Classification; A Graph-Based Holistic Recognition of Handwritten Devanagari Words: An Approach Based on Spectral Graph Embedding; Imagined Object Recognition Using EEG-Based Neurological Brain Signals; A Fast and Efficient K-Nearest Neighbor Classifier Using a Convex Envelope; Single Channel Speech Enhancement Using Masking Based on Sinusoidal Modeling; Extraction of Temporal Features on Fibonacci Space for Audio Based Vehicle Classification; An Empirical Study of Vision Transformers for Cervical Precancer Detection; An Improved Technique for Preliminary Diagnosis of COVID-19 via Cough Audio Analysis; Agricultural Field Analysis Using Satellite Hyperspectral Data and Autoencoder; Development of NDVI Prediction Model Using Artificial Neural Networks; Time Series Forecasting of Soil Moisture Using Sa
ISBN: 9798350355093 (digital)
ISBN: 9798350355109 (print)
Yoga pose detection is challenging in computer vision due to variations in body postures and environmental conditions. Recent advances in deep learning (DL) models have shown encouraging results in this field. This study integrates DL and machine learning (ML) techniques to detect and monitor 20 yoga postures in a real-time application. DL techniques such as OpenPose, PoseNet, and PIFPAF are applied to the image and video dataset to obtain keypoint features. These features are combined and used to train various ML classifiers for yoga posture detection. Augmentation with Generative Adversarial Networks (GANs) plays a crucial role in improving the robustness and accuracy of the models. GANs are employed to generate synthetic data that mimics real-world variations in yoga poses and environments. By generating realistic variations in poses, backgrounds, lighting, and body shapes, the GANs helped the models become more resilient to complex poses and diverse environmental conditions, enhancing their generalization capabilities. All classifiers improved with augmentation, and the Random Forest classifier performed best on all metrics. Finally, the model was deployed with a webcam feed to estimate the practitioner's yoga pose and indicate an accuracy level.
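One plausible way to combine keypoints from several pose estimators into a single classifier input is plain concatenation, sketched below. The abstract does not say how the features are combined, so everything here is an assumption: the helper names, the `(x, y)` keypoint format, the centring step, and the sample coordinates are hypothetical, and no real OpenPose, PoseNet, or PIFPAF API is called.

```python
def flatten_keypoints(keypoints):
    """Turn [(x, y), ...] into [x0, y0, x1, y1, ...], centred on the mean point.

    Centring is an assumed normalisation that makes the features
    independent of where the person stands in the frame.
    """
    if not keypoints:
        return []
    cx = sum(x for x, _ in keypoints) / len(keypoints)
    cy = sum(y for _, y in keypoints) / len(keypoints)
    flat = []
    for x, y in keypoints:
        flat.extend([x - cx, y - cy])
    return flat


def combine_features(per_model_keypoints):
    """Concatenate the flattened keypoints produced by several pose estimators."""
    features = []
    for name in sorted(per_model_keypoints):  # fixed order keeps columns stable
        features.extend(flatten_keypoints(per_model_keypoints[name]))
    return features


# Hypothetical outputs of two estimators for one frame (two keypoints each).
sample = {
    "openpose": [(0.0, 0.0), (2.0, 2.0)],
    "posenet": [(1.0, 1.0), (3.0, 1.0)],
}
feature_vector = combine_features(sample)
```

Each frame then yields one fixed-length row, which is the shape a tabular classifier such as a Random Forest expects as input.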
Synthetic Aperture Radar (SAR) is one of the main sources of remote sensing data today. SAR raw data focusing is complicated and time-consuming and is therefore mostly done offline with sophisticated algorithms. Deep L...
The silkworm industry holds great potential for intelligence and automation. This study aims to enhance the intelligence of cocoon processing and increase economic benefits, exploring the application of deep vision te...
Currently, tourists tend to plan travel routes and itineraries by searching for information on tourist attractions via the Internet and intelligent terminals. However, it is difficult to achieve good retrieval results for tourist attraction images using text labels. Deep learning-based visual location identification suffers from defects such as frequent mismatching, a high probability of weak matching, and long execution time. To address these defects, this paper puts forward a novel method for location identification and personalized recommendation of tourist attractions based on image processing. Specifically, the authors detail the ideas and steps of the location identification algorithm for tourist attractions. The algorithm, grounded on hash retrieval, encompasses two stages: an offline stage and an online stage. In addition, a personalized recommendation model for tourist attractions is established based on geographical location and time period. Finally, experiments show that the proposed algorithm and model are accurate and effective.
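The two-stage structure of hash retrieval can be illustrated with a toy example: an offline stage stores a binary code per attraction image, and an online stage ranks stored codes by Hamming distance to the query code. This is a generic sketch, not the authors' algorithm. The abstract does not describe how the hash codes are produced, so the 8-bit codes, the attraction names, and the function names below are all hypothetical.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary hash codes."""
    return bin(a ^ b).count("1")


def build_index(codes_by_place):
    """Offline stage: store (place, code) pairs for all known attraction images."""
    return list(codes_by_place.items())


def query(index, code, top_k=1):
    """Online stage: return the top_k places whose codes are closest to the query."""
    ranked = sorted(index, key=lambda item: hamming(item[1], code))
    return [name for name, _ in ranked[:top_k]]


# Hypothetical 8-bit codes; a real system would derive them from image features.
index = build_index({
    "tower": 0b11110000,
    "bridge": 0b00001111,
    "temple": 0b10101010,
})
best_match = query(index, 0b11110001)
top_two = query(index, 0b11110001, top_k=2)
```

Comparing short binary codes is much cheaper than comparing raw image features, which is the usual motivation for moving the expensive encoding work into the offline stage.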