Search Results - Inner Mongolia University Library

Generative data augmentation by conditional inpainting for multi-class object detection in infrared images

PATTERN RECOGNITION, 2024, vol. 153

Authors: Wang, Peng; Ma, Zhe; Dong, Bo; Liu, Xiuhua; Ding, Jishiyu; Sun, Kewu; Chen, Ying

Affiliations: CASIC Intelligent Sci & Technol Acad Ltd, Beijing 100043, Peoples R China; Key Lab Aerosp Def Intelligent Syst & Technol, Beijing 100043, Peoples R China; Ludwig Maximilians Univ Munchen, Univ Hosp, Inst Stroke & Dementia Res, D-81377 Munich, Germany; Harbin Inst Technol, Dept Control Sci & Engn, Harbin 150006, Peoples R China

Multi-class object detection in infrared images is important in military and civilian use. Deep learning methods can obtain high accuracy but require a large-scale dataset. We propose a generative data augmentation framework, DOCI-GAN, for infrared multi-class object detection with limited data. The contributions of this paper are fourfold. First, DOCI-GAN is designed as a conditional image inpainting framework, yielding paired infrared multi-class object images and annotations. Second, a text-to-image converter is formulated to transform text-format object annotations into bounding box mask images, turning the augmentation into a mask-image-to-raw-image translation. Third, a multiscale morphological erosion-based loss is created to alleviate the intensity inconsistency between inpainted local backgrounds and the global background. Finally, to generate diverse images, artificial multi-class object annotations are integrated with real ones during augmentation. Experimental results demonstrated that DOCI-GAN augments the dataset with high-quality infrared multi-class object images, consequently improving the accuracy of object detection baselines.

Keywords: Data augmentation; Image inpainting; GAN; Infrared image; Multi-class object detection
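The DOCI-GAN abstract above mentions converting text-format object annotations into bounding box mask images. The abstract does not give the annotation format or the converter itself, so the following is only a minimal sketch under assumptions: each annotation line is taken to be `label x_min y_min x_max y_max` in pixels, and the names `parse_annotations`, `boxes_to_mask`, and `class_ids` are hypothetical, not the authors' code.

```python
def parse_annotations(text):
    """Parse lines of the ASSUMED form 'label x_min y_min x_max y_max' (pixels)."""
    boxes = []
    for line in text.strip().splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        label = parts[0]
        x_min, y_min, x_max, y_max = (int(p) for p in parts[1:])
        boxes.append((label, x_min, y_min, x_max, y_max))
    return boxes


def boxes_to_mask(boxes, width, height, class_ids):
    """Rasterise boxes into a height x width grid: 0 is background, k > 0 a class id."""
    mask = [[0] * width for _ in range(height)]
    for label, x_min, y_min, x_max, y_max in boxes:
        value = class_ids[label]
        # Clip to the image; x_max and y_max are treated as exclusive bounds.
        for y in range(max(0, y_min), min(height, y_max)):
            for x in range(max(0, x_min), min(width, x_max)):
                mask[y][x] = value
    return mask


# Hypothetical annotation text and label-to-id mapping, for illustration only.
annotation_text = """
car 1 1 4 3
person 5 0 7 4
"""
class_ids = {"car": 1, "person": 2}

mask = boxes_to_mask(parse_annotations(annotation_text), width=8, height=4,
                     class_ids=class_ids)
for row in mask:
    print("".join(str(v) for v in row))
```

Encoding the class as the pixel value is one possible design choice; a one-channel-per-class mask would serve equally well as conditioning input.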

Multi-class object detection using Faster R-CNN and estimation of shaking locations for automated shake-and-catch apple harvesting

COMPUTERS AND ELECTRONICS IN AGRICULTURE, 2020, vol. 173, pp. 105384-105384

Authors: Zhang, Jing; Karkee, Manoj; Zhang, Qin; Zhang, Xin; Yaqoob, Majeed; Fu, Longsheng; Wang, Shumao

Affiliations: Capital Univ Econ & Business, Coll Management Engn, Beijing, Peoples R China; Washington State Univ, Ctr Precis & Automated Agr Syst, Pullman, WA 99164 USA; Washington State Univ, Dept Biol Syst Engn, Pullman, WA 99164 USA; Northwest A&F Univ, Coll Mech & Elect Engn, Xain, Peoples R China; China Agr Univ, Coll Engn, Beijing, Peoples R China

To address the challenge of labor shortages and to reduce the costs of apple harvesting, a targeted shake-and-catch technique is being developed at Washington State University for fresh market apple harvesting. This technique is showing promising results for some varieties of apples trained to a formal, fruiting wall tree architecture. However, the operators are still required to manually engage the shaker on target branches. To further improve the shake-and-catch apple harvesting system, a multi-class object detection algorithm was developed in this study for automatically detecting apples, branches and trunks in the natural environment using a Faster R-CNN (Regions-Convolutional Neural Network) model. This study deployed transfer learning and fine-tuning for the pre-trained networks (AlexNet, VGG16 and VGG19) and activated the features of different layers to realize the detection of these objects. The Precision and Recall (PR) curve, F1-score and mean Average Precision (mAP) were used to evaluate the performance of Faster R-CNN in detecting different object classes. VGG19 achieved the highest mAP of 82.4%, which was 10.8% higher than AlexNet and 0.4% higher than VGG16. The computational time consumed by the entire algorithm was also assessed in this study; Faster R-CNN completed the detection of one image, on average, in 0.45 s. Based on the multi-class object detection results, a polynomial fitting method was used to predict the skeleton equation of branches and trunks. The average Goodness of Fit (R²), Root Mean Squared Error (RMSE) and correlation coefficient (r) between the predicted and reference skeleton were calculated to represent the accuracy of skeleton fitting. VGG16 and VGG19 both achieved higher accuracy than AlexNet for the skeleton fitting of branches and trunks. An algorithm was then developed to estimate shaking locations on the branches using the results of previous steps. Compared with the human experts' input, a total of 72.7%

Keywords: Shake-and-catch apple harvesting; Faster R-CNN; Multi-class object detection; Skeleton equation fitting; Shaking location estimation
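The abstract above reports R², RMSE and the correlation coefficient r between a predicted polynomial skeleton and a reference skeleton. How the authors computed them is not stated in the record, so the sketch below only shows the standard textbook definitions. The coefficients and reference points are invented for illustration, and `poly_eval` and `fit_metrics` are hypothetical helper names.

```python
import math


def poly_eval(coeffs, x):
    """Evaluate a polynomial with Horner's rule; coeffs are highest degree first."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result


def fit_metrics(reference, predicted):
    """Return (RMSE, R^2, Pearson r) for two equal-length, non-constant sequences."""
    n = len(reference)
    mean_ref = sum(reference) / n
    mean_pred = sum(predicted) / n
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    cov = sum((r - mean_ref) * (p - mean_pred)
              for r, p in zip(reference, predicted))
    rmse = math.sqrt(ss_res / n)
    r_squared = 1.0 - ss_res / ss_tot
    pearson_r = cov / math.sqrt(ss_tot * ss_pred)
    return rmse, r_squared, pearson_r


# Illustrative data only: a made-up quadratic skeleton y = 0.5*x^2 + 1
# compared against made-up reference points along a branch.
coeffs = [0.5, 0.0, 1.0]
xs = [0.0, 1.0, 2.0, 3.0]
reference_ys = [1.0, 1.4, 3.2, 5.4]
predicted_ys = [poly_eval(coeffs, x) for x in xs]

rmse, r_squared, pearson_r = fit_metrics(reference_ys, predicted_ys)
print(f"RMSE={rmse:.3f}  R^2={r_squared:.3f}  r={pearson_r:.3f}")
```

Here R² is computed as one minus the ratio of residual to total sum of squares, which is a common convention; other definitions exist, and the paper may use a different one.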

Multi-class Object Learning with Application to Fabric Defects Detection

AATCC JOURNAL OF RESEARCH, 2021, vol. 8, no. 1_suppl, pp. 165-172

Authors: Wei, Bing; Gao, Lei; Tang, Xue-song; Hao, Kuangrong

Affiliations: Donghua Univ, Shanghai, Peoples R China; Commonwealth Sci & Ind Res Org CSIRO, Canberra, ACT, Australia

Deep convolutional neural networks (CNNs) have shown great success in single-class fabric image detection. However, real-world fabric defect images generally contain several types of defects in one image. Accurately recognizing and classifying multi-class fabric defect images is still an unsolved issue due to the complexity of intersecting defects, as well as the difficulty of distinguishing small defects. To address these challenges, this study develops a methodology based on the deep learning feature pyramid network (FPN) approach to detect multi-class fabric defects. To evaluate the proposed detection model, we built a unique multi-class fabric defects database (DHU-MO1000), where multi-class defect images are generated by industrial monitors from a textile factory. We used the dataset as the benchmark for training and testing the FPN on multi-class defect detection. Furthermore, we conducted extensive experimental validations for various design choices. The experimental results show that the model outperformed existing multi-class object detection methods.

Keywords: Computer Vision; Convolutional Neural Networks; Fabric Defect Detection; Multi-class Object Detection

BOOSTED MULTI-CLASS OBJECT DETECTION WITH PARALLEL HARDWARE IMPLEMENTATION FOR REAL-TIME APPLICATIONS

IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Authors: Yang, Yao-Tsung; Chiu, Ching-Te

Affiliations: Natl Tsing Hua Univ, Inst Commun Engn, Taipei, Taiwan

ISBN: (print) 9781479928934

Real-time multi-class object detection has become popular for various applications such as vehicle vision systems, computer vision and image processing. Boosted cascades achieve fast and reliable object detection for one object class, but require parallel usage of multiple cascades for multi-class detection. The multi-class capable cascade splits the root cascade into sub-cascades iteratively until each sub-cascade contains one class. That requires a huge number of classifiers in the generated hierarchy of interlinked cascades. In this paper, we propose a boosted multi-class object cascade that splits only one object class from the upper-level cascade when building the sub-cascades. Since only one object class is split at a time, the number of classifiers in each stage can be reduced. In simulation, the boosted multi-class object detection reduces the number of weak classifiers by 46% compared to the multi-class capable cascade for the MIT CBCL database. The proposed method achieves a high detection rate (95.54%) and a low false positive rate (1.94%). We implement our proposed algorithm with a parallel architecture to accelerate the detection operation using TSMC 90 nm CMOS technology. The implementation results show that the design achieves an operating frequency of 100 MHz, processing 160 x 120 images at 30 fps.

Keywords: Multi-class object detection; Weak classifiers; Boosted cascade

Two-level Fuzzy Logic Evaluation System for Surgeon's Hand Movement Using Object Detection

IEEE Symposium Series on Computational Intelligence (IEEE SSCI)

Authors: Fathabadi, Fatemeh Rashidi; Grantner, Janos L.; Shebrain, Saad A.; Abdel-Qader, Ikhlas

Affiliations: Western Michigan Univ, Elect & Comp Engn, Kalamazoo, MI 49008 USA; Western Michigan Univ, Homer Stryker MD Sch Med, Surg, Kalamazoo, MI 49008 USA

ISBN: (print) 9781665487689

One significant aspect of surgical education and training is autonomous surgical skill assessment with feedback. In this paper, an autonomous two-level fuzzy logic assessment system is proposed for tracking and evaluating the tooltip movements of laparoscopic instruments in the FLS peg transfer task. The surgeon's left and right-hand movements are detected by using an Artificial Intelligence Network through instrument tooltip detection and position coordinate calculations. A first-of-its-kind custom laparoscopic box trainer dataset was built from video recordings of experimental peg transfer tasks, which were carried out by 9 doctors and OB/GYN residents of the Homer Stryker M.D. School of Medicine, WMU, in the Intelligent Fuzzy Controllers Laboratory, WMU. A multi-class object detection algorithm, based on Deep Neural Networks, was developed.

Keywords: Laparoscopic surgical skill assessment; Multi-class object detection; Fuzzy logic-based decision support system

3D Autonomous Surgeon's Hand Movement Assessment Using a Cascaded Fuzzy Supervisor in Multi-Thread Video Processing

SENSORS, 2023, vol. 23, no. 5, pp. 2623-2623

Authors: Fathabadi, Fatemeh Rashidi; Grantner, Janos L. L.; Shebrain, Saad A. A.; Abdel-Qader, Ikhlas

Affiliations: Western Michigan Univ, Elect & Comp Engn Dept, Kalamazoo, MI 49008 USA; Western Michigan Univ, Homer Stryker MD Sch Med, Dept Surg, Kalamazoo, MI 49008 USA

The purpose of the Fundamentals of Laparoscopic Surgery (FLS) training is to develop laparoscopic surgery skills by using simulation experiences. Several advanced training methods based on simulation have been created to enable training in a non-patient environment. Laparoscopic box trainers, which are cheap, portable devices, have been deployed for a while to offer training opportunities, competence evaluations, and performance reviews. However, the trainees must be under the supervision of medical experts who can evaluate their abilities, which is an expensive and time-consuming operation. Thus, a high level of surgical skill, determined by assessment, is necessary to prevent any intraoperative issues and malfunctions during a real laparoscopic procedure and during human intervention. To guarantee that the use of laparoscopic surgical training methods results in surgical skill improvement, it is necessary to measure and assess surgeons' skills during tests. We used our intelligent box-trainer system (IBTS) as a platform for skill training. The main aim of this study was to monitor the movement of the surgeon's hands within a predefined field of interest. To evaluate the movement of the surgeon's hands in 3D space, an autonomous evaluation system using two cameras and multi-thread video processing is proposed. This method works by detecting laparoscopic instruments and using a cascaded fuzzy logic assessment system. It is composed of two fuzzy logic systems executing in parallel. The first level assesses the left and right-hand movements simultaneously. Its outputs are cascaded by the final fuzzy logic assessment at the second level. This algorithm is completely autonomous and removes the need for any human monitoring or intervention. The experimental work included nine physicians (surgeons and residents) from the surgery and obstetrics/gynecology (OB/GYN) residency programs at WMU Homer Stryker MD School of Medicine (WMed) with different levels of laparoscopic skills and experience. They

Keywords: Laparoscopic surgical skill assessment; Multi-class object detection; Fuzzy logic-based decision support system; Intelligent box-trainer system

Efficient Military Aircraft Target Detection Model Based on Federated Meta-Learning

20th International Conference on Intelligent Computing (ICIC)

Authors: Pan, Zhongjie; Wang, Xiaotian

Affiliations: Nankai Univ, Coll Software Engn, Tianjin, Peoples R China; China Earthquake Adm, Monitoring & Applicat Ctr 1, Tianjin, Peoples R China

ISBN: (print) 9789819756148; 9789819756155

Military aircraft detection holds critical significance in defense operations, ensuring accurate identification and classification of aircraft for effective decision-making. However, existing methodologies face challenges due to disparate data collection, limited data availability, and the complexity of aggregating remote datasets. In response to these challenges, we propose a novel approach, FedMATD, which uses Federated Meta-Learning techniques to address the difficulties in data collection. To address the limited scale of the dataset, we integrate Federated Meta-Learning with a strategy focusing on training with small sample sizes. This fusion aims to enhance target detection accuracy by leveraging the advantages of federated learning while mitigating the limitations posed by insufficient data quantities and remote data aggregation complexities. Our proposed method is evaluated using one open-source dataset, and our results demonstrate that FedMATD achieves better performance.

Keywords: Federated Meta-Learning; Small Sample Training; Aircraft Detection; Multi-class Object Detection; YOLOv5s

In the Search for the Balance Between Real and Synthetic Images in Multi-class Detection Systems

IEEE 19th Conference on Industrial Electronics and Applications (ICIEA)

Authors: Cecchetti, Vitoria Biz; Rudek, Marcelo; Freire, Roberto Z.

Affiliations: Pontifical Catholic Univ Parana PUCPR, PPGEPS, Ind & Syst Engn, Curitiba, Parana, Brazil; Univ Tecnol Fed Parana UTFPR, Ind & Syst Engn Grad Program PPGEPS, Curitiba, Parana, Brazil

ISBN: (print) 9798350360875; 9798350360868

Intelligent systems focused on traffic management have been prominent in recent years, and applications related to vehicle detection and tracking, speed estimation, and traffic flow identification have become an interesting research topic. For these tasks, a large amount of data has to be gathered to train deep learning algorithms, but collecting that data can be time- and resource-consuming. Therefore, the use of synthetic data has become a viable option that helps to minimize data acquisition problems, but when misused, it can negatively impact the model's quality. This paper presents a systematic literature review of the use of synthetic images to train object detection models in urban scenarios, aiming to identify the ideal ratio between real and synthetic images that can benefit those models and the best methods to produce synthetic images. This study found no consensus on the number of synthetic images that can help to generate a more accurate model, owing to the low number of papers addressing this relationship. However, it was noted that generative adversarial networks (GANs) can create synthetic images that are more similar to real images, which benefits the training of detection models, although without identifying how images generated by this method affect the ratio between synthetic and real images.

Keywords: Generative Adversarial Networks; Multi-class object detection; Object detection systems; Synthetic data; Traffic monitoring

Effective Complex Airport Object Detection in Remote Sensing Images Based on Improved End-to-End Convolutional Neural Network

IEEE ACCESS, 2020, vol. 8, pp. 172652-172663

Authors: Han, Yongsai; Ma, Shiping; Xu, Yuelei; He, Linyuan; Li, Shuai; Zhu, Mingming

Affiliations: Air Force Engn Univ, Grad Sch, Xian 710038, Peoples R China; Air Force Engn Univ, Aeronaut Engn Coll, Xian 710038, Peoples R China; Northwestern Polytech Univ, Unmanned Syst Technol Inst, Xian 710038, Peoples R China

Airport objects are hotspots in the field of image object detection because of their specific features and value for applications. In this study, we developed a complex object detection method based on an improved Faster R-CNN to achieve higher precision in detecting seven types of objects in remote sensing images of airport areas under complex conditions such as different scales, different visual angles, and different backgrounds. When building the network, we used deeper basic networks and feature fusion components to extract more robust features. At the same time, we also modified the selection of positive and negative samples to reduce sample imbalance. The main improvements in the algorithm concern the anchor size generation rule and the addition of an a priori judgment network. The effectiveness of the improved algorithm was verified in experiments. Compared with the original Faster R-CNN, the improved network brings a 12.7% increase in mAP, with a detection time of 0.307 s. Finally, the model with trained weights was used to test the detection of the seven types of objects in airport areas on different data sets, and comparisons were conducted with other algorithms. The experimental results showed that the method improved the average detection accuracy and performed well in remote sensing airport object detection tasks.

Keywords: Feature extraction; Airports; Remote sensing; Object detection; Convolution; Convolutional neural networks; Roads; Airport object; image processing; multi-class object detection; pattern recognition; remote sensing

Virtual Multi-modal Object Detection and Classification with Deep Convolutional Neural Networks

Conference on Wavelets and Sparsity XVIII

Authors: Mitsakos, Nikolaos; Papadakis, Manos

Affiliations: Univ Houston, Houston, TX 77004 USA

ISBN: (digital) 9781510629707

ISBN: (print) 9781510629707

In this paper we demonstrate how the post-processing of gray-scale images with algorithms that have a singularity enhancement effect can assume the role of auxiliary modalities, as in the case where an intelligent system fuses information from multiple physical modalities. We show that, as in multimodal AI fusion, "virtual" multimodal inputs can improve the performance of object detection. We design, implement and test a novel Convolutional Neural Network architecture, based on the Faster R-CNN network, for multi-class object detection and classification. Our architecture combines deep feature representations of the input images, generated by networks trained independently on physical and virtual imaging modalities. Using an analog of the ROC curve, the Average Recall over Precision curve, we show that the fusion of certain virtual modality inputs, capable of enhancing singularities and neutralizing illumination, improves recognition accuracy.

Keywords: Multi-class object detection; Region Proposal Network; Convolutional Neural Networks; Multi-modal Feature Fusion; Image Processing; Wavelets; Retinex
