ISBN (digital): 9798350384574
ISBN (print): 9798350384581
Amid rapid advances in industrial automation, vision-based robotic grasping plays an increasingly crucial role. Improving visual recognition accuracy requires large-scale datasets from which models can acquire implicit knowledge about handling various objects. Creating datasets from scratch is a time- and labor-intensive process. Moreover, existing datasets often contain errors because annotations were automated for expediency, which makes improving these datasets a substantial research challenge. Consequently, several issues have been identified in the annotation of grasp bounding boxes within the popular Jacquard Grasp Dataset [1]. We propose a Human-In-The-Loop (HIL) method to enhance dataset quality. This approach relies on backbone deep learning networks to predict object positions and orientations for robotic grasping. Predictions with Intersection over Union (IoU) values below 0.2 are assessed by human operators. After their evaluation, the data is categorized into False Negatives (FN) and True Negatives (TN). FN are then subcategorized into either missing annotations or catastrophic labeling errors. Images lacking labels are augmented with valid grasp bounding box information, whereas images afflicted by catastrophic labeling errors are removed entirely. The open-source tool Labelbee was employed for 53,026 iterations of HIL dataset enhancement, leading to the removal of 2,884 images and the incorporation of ground truth information for 30,292 images. The enhanced dataset, named the Jacquard V2 Grasping Dataset, served as the training data for a range of neural networks. We have empirically demonstrated that these dataset improvements significantly enhance the training and prediction performance of the same network, resulting in an increase of 7.1% across most popular detection architectures for ten iterations. This refined dataset will be accessible on One Drive and Baidu N
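The IoU-based triage step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes axis-aligned boxes for simplicity (grasp rectangles are generally oriented), and the names `Box`, `iou`, and `needs_human_review` are hypothetical. Only the 0.2 threshold comes from the abstract.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixels (a simplifying assumption, not the dataset's format)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        return max(0.0, self.x_max - self.x_min) * max(0.0, self.y_max - self.y_min)


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    inter_w = min(a.x_max, b.x_max) - max(a.x_min, b.x_min)
    inter_h = min(a.y_max, b.y_max) - max(a.y_min, b.y_min)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # boxes do not overlap
    inter = inter_w * inter_h
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


REVIEW_THRESHOLD = 0.2  # threshold stated in the abstract


def needs_human_review(pred: Box, labels: Sequence[Box]) -> bool:
    """Flag a prediction whose best IoU against all existing labels is below the threshold."""
    best = max((iou(pred, gt) for gt in labels), default=0.0)
    return best < REVIEW_THRESHOLD
```

A flagged prediction would then go to a human operator, who decides whether the image is missing a valid label, carries a catastrophic labeling error, or the prediction is simply wrong.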
ISBN (digital): 9798350378658
ISBN (print): 9798350378665
Numerous process items and procedures in the assembly of a diesel engine have a direct and significant impact on its performance, and accurate and effective assembly process control is a key technical bottleneck in improving the quality consistency of batch-produced diesel engines. Accurate prediction of assembly quality can support online control of assembly quality. However, because of the significant class imbalance between qualified and unqualified products in diesel engine assembly, traditional imbalanced-learning methods struggle to capture the small differences between classes. Therefore, an improved random balance sampling method is proposed and combined with the LightGBM classifier to create a quality prediction model. First, the training dataset is class-balanced by generating random weights for each class and fusing oversampling and undersampling. Then, a LightGBM-based quality prediction model is built. Finally, the effectiveness of this approach is validated using real industrial diesel engine assembly data. The comparison results demonstrate that the proposed method outperforms traditional resampling techniques. This study provides an accurate quality prediction model, laying the groundwork for an assembly quality control model for diesel engines.
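The resampling idea (random per-class weights, with undersampling and oversampling fused) can be illustrated with a rough sketch. This is not the paper's improved method: the function `random_balance` is hypothetical, and oversampling here is plain random duplication, which is an assumption made to keep the example dependency-free.

```python
import random
from collections import defaultdict


def random_balance(samples, labels, rng):
    """Resample a dataset to randomly drawn class proportions.

    Classes that exceed their random target are undersampled without
    replacement; classes below it are oversampled by random duplication
    (an assumption made for this sketch).
    """
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    classes = sorted(by_class)
    total = len(samples)
    # One random weight per class in (0, 1], normalised to proportions below.
    weights = [1.0 - rng.random() for _ in classes]
    scale = sum(weights)

    out_x, out_y = [], []
    for cls, w in zip(classes, weights):
        pool = by_class[cls]
        target = max(1, round(total * w / scale))
        if target <= len(pool):
            chosen = rng.sample(pool, target)  # undersample
        else:
            chosen = pool + rng.choices(pool, k=target - len(pool))  # oversample
        out_x.extend(chosen)
        out_y.extend([cls] * target)
    return out_x, out_y
```

Each call yields a differently balanced training set. The resampled data could then be passed to a gradient-boosting classifier such as `lightgbm.LGBMClassifier`, which is omitted here to keep the sketch self-contained.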
To reduce manual patrols and improve the efficiency of meter reading in substations, many researchers have in recent years begun to read pointer meters using image recognition technology. At present...
ISBN (digital): 9798331535087
ISBN (print): 9798331535094
Multi-object recognition and grasping is a critical technology in laboratory automation. However, most existing methods are limited to structured environments and perform poorly in complex, unstructured, and occluded settings. To overcome these challenges in chemical experiment scenarios, this work proposes a multi-object grasping planner based on vision-language models (VLM-MOGP) for robust object recognition and intelligent grasping. First, an efficient multi-task detection network (EMDN) is designed and applied for multi-object recognition and localization. Then, the image and the detected objects are fed into the VLM. The planner uses the multimodal reasoning capability of the VLM to dynamically analyze occlusion relationships and generate a collision-free grasping plan. For visually similar objects of the same category, VLM-MOGP extracts visual features through prompting to disambiguate the objects. Experimental results demonstrate that VLM-MOGP significantly outperforms existing methods in both object recognition accuracy and grasping success rate. Additionally, experiments on real mobile robots validate the proposed approach.
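One way to see how occlusion relationships translate into a grasping plan is a topological ordering: an object should be grasped only after everything that occludes it has been removed. The sketch below is an illustration only. In the paper the relationships come from VLM reasoning, whereas here they are a hand-written dictionary, and `grasp_order` is a hypothetical helper.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def grasp_order(occludes: Dict[str, Set[str]]) -> List[str]:
    """Return an order in which every occluder is grasped before what it hides.

    occludes[a] is the set of objects that object `a` occludes.
    Raises graphlib.CycleError if the relationships are circular.
    """
    # TopologicalSorter expects node -> predecessors, so invert the relation:
    # each occluded object must wait for all of its occluders.
    blockers: Dict[str, Set[str]] = {}
    for top, hidden in occludes.items():
        blockers.setdefault(top, set())
        for obj in hidden:
            blockers.setdefault(obj, set()).add(top)
    return list(TopologicalSorter(blockers).static_order())
```

For example, if a beaker occludes a flask and the flask occludes a test tube, the beaker is scheduled first and the test tube last.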
ISBN (digital): 9798331506056
ISBN (print): 9798331506063
Natural gas load forecasting is crucial for the safe operation of urban natural gas systems. This paper studies short-term natural gas load forecasting based on the system identification method. In a multi-stage identification framework, the initial values of the model parameters are estimated using the least squares and instrumental variables methods, and the parameters are then further optimized by the prediction error method to improve identification accuracy. The proposed method is validated by simulation on public natural gas data from the U.S., demonstrating the algorithm's effectiveness in forecasting the daily natural gas load.
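The least-squares stage that supplies initial parameter values can be sketched on a toy model. The first-order ARX structure y[t] = a*y[t-1] + b*u[t-1] and the function `fit_arx1` are assumptions chosen for illustration. The abstract does not state the model order, and the instrumental variables and prediction error stages are not shown.

```python
def fit_arx1(y, u):
    """Least-squares estimate of (a, b) in y[t] = a*y[t-1] + b*u[t-1].

    Solves the 2x2 normal equations directly. The model structure is an
    illustrative assumption, not the one used in the paper.
    """
    s_yy = s_yu = s_uu = r_y = r_u = 0.0
    for t in range(1, len(y)):
        prev_y, prev_u, target = y[t - 1], u[t - 1], y[t]
        s_yy += prev_y * prev_y
        s_yu += prev_y * prev_u
        s_uu += prev_u * prev_u
        r_y += prev_y * target
        r_u += prev_u * target

    det = s_yy * s_uu - s_yu * s_yu
    if abs(det) < 1e-12:
        raise ValueError("regressors are (nearly) collinear; cannot identify a and b")
    a = (r_y * s_uu - s_yu * r_u) / det
    b = (s_yy * r_u - s_yu * r_y) / det
    return a, b


def predict_next(a, b, last_y, last_u):
    """One-step-ahead forecast from the fitted model."""
    return a * last_y + b * last_u
```

In a multi-stage scheme, estimates such as these would serve only as a starting point for the later refinement stages.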
ISBN (digital): 9798350364194
ISBN (print): 9798350364200
Solving linear matrix inequalities (LMIs) is crucial across diverse fields, and the emergence of zeroing neural networks (ZNNs) offers a novel solution to the time-varying LMI (TV-LMI) problem. However, the application of ZNNs to the time-varying complex-valued LMI (TVCV-LMI) problem remains unexplored. Therefore, we introduce a novel fuzzy-parameter ZNN (FP-ZNN) model in this study to tackle the TVCV-LMI problem. With the introduction of a fuzzy logic system (FLS), the FP-ZNN model adjusts the fuzzy convergence parameter (FCP) in real time, responding to any change in the system error to achieve the best performance. We also use an exponential activation function (EAF), which makes the FP-ZNN model fixed-time stable. To verify and illustrate the superior features of the FP-ZNN model, a detailed theoretical analysis and numerical experiments are provided, and the results further confirm the fixed-time stability and adaptiveness of the model. This paper thus provides a novel and elegant solution to the TVCV-LMI problem.
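For readers unfamiliar with ZNNs, the basic design idea is to define an error e(t) and force it to decay through the dynamics de/dt = -gamma * phi(e). The sketch below applies this to a far simpler problem than the paper's: a real scalar time-varying equation a(t) x(t) = b(t), with a fixed gamma and a linear activation. The fuzzy parameter adjustment, exponential activation, complex values, and inequality handling of FP-ZNN are not reproduced, and all function names and test signals are hypothetical.

```python
import math


def a(t):
    return 2.0 + math.sin(t)


def a_dot(t):
    return math.cos(t)


def b(t):
    return math.cos(t)


def b_dot(t):
    return -math.sin(t)


def simulate_znn(gamma=10.0, dt=1e-3, t_end=2.0, x0=0.0):
    """Euler-integrate a scalar ZNN tracking the solution of a(t) x = b(t).

    With e = a*x - b, imposing de/dt = -gamma * e gives
        a * dx/dt = b_dot - a_dot * x - gamma * e.
    Returns the final state and the final residual error.
    """
    x, t = x0, 0.0
    for _ in range(int(round(t_end / dt))):
        e = a(t) * x - b(t)
        x_dot = (b_dot(t) - a_dot(t) * x - gamma * e) / a(t)
        x += dt * x_dot
        t += dt
    return x, a(t) * x - b(t)
```

A larger `gamma` drives the error to zero faster. Adapting that parameter online to the current error is the role the abstract assigns to the fuzzy logic system.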
ISBN (print): 9781728190778
This work proposes an RGB-D SLAM system specifically designed for structured environments, aiming at improved tracking and mapping accuracy by relying on geometric features extracted from the surroundings. In addition to points, structured environments offer an abundance of geometric features such as lines and planes, which we exploit to design both the tracking and mapping components of our SLAM system. For the tracking part, we explore geometric relationships between these features based on the assumption of a Manhattan World (MW). We propose a decoupling-refinement method based on points, lines, and planes, as well as the use of Manhattan relationships in an additional pose refinement module. For the mapping part, different levels of maps from sparse to dense are reconstructed at a low computational cost. We propose an instance-wise meshing strategy to build a dense map by meshing plane instances independently. The overall performance in terms of pose estimation and reconstruction is evaluated on public benchmarks and shows improvement over state-of-the-art methods. The code is released at https://github.com/yanyan-li/PlanarSLAM.
ISBN (digital): 9798331542047
ISBN (print): 9798331542054
The proliferation of pornographic content online challenges content moderation efforts, especially in sensitive contexts. Traditional detection methods often require extensive labeled datasets and struggle with nuanced content. This paper proposes a zero-shot classification approach using vision-language models (VLMs) such as CLIP and OpenCLIP, which leverage visual-textual alignment to classify pornographic content without task-specific training. Evaluated on the LSPD dataset, our research examined various aspects of using VLMs, including the effects of keyword choice, prompt construction, model size, and pre-training data resolution. Our method achieved comparable or better performance than traditional models, with the best accuracy reaching 91.6% using the keyword “erotica” and specific descriptive prompts. This approach reduces dependency on large datasets, offering a robust solution for detecting explicit content.
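Zero-shot classification with a vision-language model reduces to comparing an image embedding with the embeddings of one text prompt per class. The sketch below shows only that comparison step, assuming the embeddings have already been produced by some encoder. The toy vectors, the function `zero_shot_classify`, and the scale of 100 are illustrative assumptions, not values from the paper.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def zero_shot_classify(
    image_emb: List[float],
    prompt_embs: Dict[str, List[float]],
    scale: float = 100.0,  # assumed temperature; real models learn their own
) -> Tuple[str, Dict[str, float]]:
    """Pick the class whose prompt embedding is most similar to the image.

    Returns the winning label and softmax probabilities over all labels.
    """
    logits = {label: scale * cosine(image_emb, emb) for label, emb in prompt_embs.items()}
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(v - peak) for label, v in logits.items()}
    total = sum(exps.values())
    probs = {label: v / total for label, v in exps.items()}
    return max(probs, key=probs.get), probs
```

Because the class is determined entirely by the prompt text, changing the keyword or the prompt wording changes the prompt embeddings and therefore the decision, which is why the abstract treats keyword choice and prompt construction as experimental variables.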
In this paper, we mainly improved the grasp detection network based on the grasp pose detection (GPD) algorithm. Three Network in Network (NIN) structure blocks are used as feature extraction modules, and a fully conn...
Industrial Internet of Things (IIoT) may allow for the state of equipment to be monitored and defects to be detected before they become serious, helping smart factories keep up with the rising demands for safety and e...