ISBN: 9781665417907 (print)
Object detection is one of the most fundamental and challenging problems in computer vision and has become a major research topic in recent years. Over the past two decades, it has developed rapidly from its beginnings to applications in many aspects of life, with improvements in both detection accuracy and detection speed. In this paper, we first review the research progress of deep learning-based object detection algorithms in terms of their technical evolution and applications. Second, we compare and analyze the two-stage and single-stage detection frameworks, from the R-CNN family of algorithms to the YOLO family, and introduce the common datasets and evaluation metrics as well as the application of these algorithms to the detection of pedestrians, faces, text, medical images, sign language, and so on. Finally, we discuss the prospects of deep learning-based object detection algorithms in light of the existing challenges.
This study presents a dipper-throated-based ant colony optimization (DTACO) combined with the Seasonal Auto-Regressive Integrated Moving Average with eXogenous factors (SARIMAX) model (DTACO+SARIMAX) to forecast monkeypox cases. The work optimizes the SARIMAX model using grid search cross-validation and fine-tunes its hyperparameters using DTACO to improve prediction accuracy. The proposed model shows considerable consistency and accuracy compared with previous studies, and comparisons with state-of-the-art models validate its predictions. DTACO+SARIMAX can be used to monitor monkeypox and support disease control. By offering accurate predictions, it can help healthcare organizations and governments manage and track the pandemic's course, reduce public panic, and plan an effective response. Analysis of Variance (ANOVA) and Wilcoxon signed-rank tests are conducted on the proposed DTACO+SARIMAX model and the compared models.
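The Wilcoxon signed-rank comparison mentioned above can be illustrated with a minimal sketch. It computes only the signed rank sums and the test statistic for paired values, not a p-value. The function name `wilcoxon_signed_rank` and the sample numbers are hypothetical and are not taken from the study.

```python
def wilcoxon_signed_rank(errors_a, errors_b):
    """Return (W_plus, W_minus, W) for two paired samples.

    Zero differences are dropped, and tied absolute differences
    share the average of the ranks they span.
    """
    diffs = [a - b for a, b in zip(errors_a, errors_b) if a != b]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        # Find the end of the run of tied absolute differences.
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus, min(w_plus, w_minus)


# Hypothetical paired forecast errors for two models (illustrative only).
baseline_errors = [10, 12, 9, 14, 11]
proposed_errors = [8, 13, 9, 10, 9]
w_plus, w_minus, w_stat = wilcoxon_signed_rank(baseline_errors, proposed_errors)
print(w_plus, w_minus, w_stat)
```

A small statistic `W` relative to the number of non-zero pairs indicates that the differences lean consistently in one direction. A real analysis would also derive a p-value, for example with a statistics library.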
This paper studies a fault identification and location technique for optical cable lines based on the transient traveling-wave modulus maxima method. It consists of the following steps: (1) real-time monitoring and synchronous upload of flow information; (2) information collection and storage; (3) cable fault identification, which comprises drawing the current oscillogram, calculating the fractal box dimension, transforming the current information into mode space, applying the discrete wavelet transform and calculating the wavelet coefficients of the flow information in mode space, checking the maximum point of the initial traveling-wave modulus of the flow information, and identifying the cable fault; and (4) cable fault location, which comprises drawing the current oscillogram, transforming the current information into mode space, calculating the β-mode voltage component and the load-side mode voltage component of the flow information using the discrete wavelet transform and wavelet coefficients, checking the maximum point of the initial traveling-wave modulus of the flow information, and locating the cable fault. The technical process and the calculations are simple, fault identification and location are efficient and accurate, and the technique is comprehensive and widely applicable.
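The modulus-maximum check and the location step can be sketched as follows. This is only an illustration under stated assumptions: the abstract names neither the wavelet nor the location formula, so the sketch uses undecimated Haar detail coefficients and the standard two-terminal traveling-wave formula. All function names and sample values are hypothetical.

```python
import math


def haar_detail(signal):
    """Undecimated Haar detail coefficients (scaled first differences)."""
    return [(signal[n + 1] - signal[n]) / math.sqrt(2)
            for n in range(len(signal) - 1)]


def modulus_maximum_index(signal):
    """Index of the largest-magnitude detail coefficient.

    Taken here as the arrival of the initial traveling wavefront.
    """
    details = haar_detail(signal)
    return max(range(len(details)), key=lambda n: abs(details[n]))


def two_terminal_distance(line_length, wave_speed, t_near, t_far):
    """Fault distance from the near terminal (assumed two-terminal method).

    From t_near - t_far = (2 * d - line_length) / wave_speed.
    """
    return (line_length + wave_speed * (t_near - t_far)) / 2


# Hypothetical sampled mode-space signal with a wavefront after sample 3.
samples = [0, 0, 0, 0, 5, 5, 4, 4]
arrival = modulus_maximum_index(samples)

# Hypothetical line: 10000 m long, wave speed 200 m/us,
# wavefront arrivals at 10 us (near end) and 40 us (far end).
distance = two_terminal_distance(10000, 200, 10, 40)
print(arrival, distance)
```

In practice the arrival index would be converted to a timestamp using the sampling rate before it is passed to the location formula.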
The complex background in soil images collected in the natural field environment affects subsequent soil image recognition based on machine vision. Segmenting the central soil area from the image eliminates the influence of the complex background and is an important preprocessing step for soil image recognition. Deep learning is applied to soil image segmentation for the first time, and the Mask R-CNN model is selected to locate and segment the soil in the images. A soil image dataset is constructed from the collected soil images, the EISeg annotation tool is used to label the soil area as "soil" and save the annotation information, and a Mask R-CNN instance segmentation model for soil images is trained. The trained model obtains accurate segmentation results and performs well on soil images collected in different environments. It reaches a loss of 0.1999 on the training set and a segmentation mAP (IoU=0.5) of 0.8804 on the validation set, and it takes only 0.06 s to segment an image with GPU acceleration, which is fast enough for real-time segmentation and detection of soil images in the field under natural conditions. The code is available via the Conclusions section.
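The IoU=0.5 criterion behind the reported mAP can be illustrated with a minimal sketch of mask intersection over union. It is not the authors' evaluation code. The function names and the tiny masks are hypothetical.

```python
def mask_iou(predicted, truth):
    """Intersection over union of two binary masks of equal shape."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(predicted, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    return intersection / union if union else 0.0


def is_match(predicted, truth, threshold=0.5):
    """A predicted mask counts as correct when its IoU reaches the threshold."""
    return mask_iou(predicted, truth) >= threshold


# Hypothetical 2x3 masks: 1 marks a soil pixel, 0 marks background.
predicted_mask = [[1, 1, 0],
                  [1, 1, 0]]
truth_mask = [[0, 1, 1],
              [0, 1, 1]]
print(mask_iou(predicted_mask, truth_mask), is_match(predicted_mask, truth_mask))
```

Average precision at IoU=0.5 is then computed from which predictions pass this match test once they are ranked by confidence.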
What makes a talk successful? Is it the content or the presentation? We try to estimate the contribution of the speaker's oratory skills to the talk's success, while ignoring the content of the talk. By orator...
This paper describes a new approach to the problem of interception of wireless communication channels between legitimate users. Physical layer (PHY) security (PLS) is a new topic that enhances the secrecy performance of a multi-user multiple-input multiple-output (MU-MIMO) system for wireless communication from a single base station to many users. Beamforming techniques such as Pre-Maximum Ratio Combining (P-MRC) and Pre-Equal Gain Combining (P-EGC), under the assumption of perfect channel state information (CSI), are considered in order to achieve secure transmission between legitimate users. Along with beamforming, artificial noise (AN) is also studied. The achievable secrecy rate, from the worst case to the best value (the secrecy rate threshold), is plotted against the power allocation fraction (the power splitting factor). The secrecy rate thresholds achieved for the implemented MU-MIMO system with the matched filter (MF) beamforming schemes P-MRC and P-EGC suggest that P-MRC is the optimal beamforming technique for achieving the best secrecy rate, or the best secrecy rate threshold, for the information user (IU) nearest to the eavesdropper (Eve).
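The dependence of the secrecy rate on the power splitting factor can be sketched with a simplified single-user model. This is an assumption for illustration, not the paper's MU-MIMO setup: the artificial noise is taken to be perfectly nulled at the legitimate user and to act as interference only at the eavesdropper. The function name, the channel gains, and the power values are hypothetical.

```python
import math


def secrecy_rate(phi, total_power, gain_user, gain_eve, gain_an_eve, noise=1.0):
    """Secrecy rate in bit/s/Hz for a power splitting factor phi in [0, 1].

    phi is the fraction of power given to the information signal;
    the remainder is spent on artificial noise (AN).
    """
    signal_power = phi * total_power
    an_power = (1.0 - phi) * total_power
    # Assumption: AN is nulled at the legitimate user (perfect CSI).
    sinr_user = signal_power * gain_user / noise
    # Assumption: AN adds to the noise floor seen by the eavesdropper.
    sinr_eve = signal_power * gain_eve / (an_power * gain_an_eve + noise)
    rate = math.log2(1.0 + sinr_user) - math.log2(1.0 + sinr_eve)
    return max(0.0, rate)


# Sweep the power splitting factor with hypothetical unit channel gains.
fractions = [i / 10 for i in range(11)]
rates = [secrecy_rate(phi, 10.0, 1.0, 1.0, 1.0) for phi in fractions]
best_phi = fractions[rates.index(max(rates))]
print(best_phi, max(rates))
```

Under these assumptions the rate vanishes at both extremes: with no signal power nothing is transmitted, and with no artificial noise the eavesdropper sees the same channel quality as the user.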
The arapaima, also known as paiche or pirarucu, is one of the largest freshwater fish in the Amazon, and its cultivation has been growing steadily due to consumer demand. This research presents the design of a real-time monitoring and control system for water parameters in paiche farming, which applies mechatronics concepts following the VDI 2206 methodology. The system monitors and controls the water's temperature, dissolved oxygen, and pH. Sensors measure each parameter, a PLC processes the values, and an HMI displays them. The setpoints are based on information provided by FONDEPES, which states that for paiche cultivation the water temperature must be between 26 °C and 31 °C, the pH between 6 and 8, and the dissolved oxygen greater than or equal to 4 mg/L. In addition, the proposed mechatronic system aims to be user-friendly, with the monitoring and control of water parameters for paiche cultivation as its primary function. The system functioned correctly, which supports the feasibility of its implementation and its contribution to the development of new research.
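The acceptable ranges quoted above translate into a simple limit check. The sketch below shows only that logic in Python; the described system runs on a PLC, and the dictionary keys and function name are hypothetical.

```python
# Acceptable ranges as stated in the text (attributed to FONDEPES).
LIMITS = {
    "temperature_c": (26.0, 31.0),
    "ph": (6.0, 8.0),
    "dissolved_oxygen_mg_l": (4.0, float("inf")),  # lower bound only
}


def out_of_range(readings):
    """Return the names of the parameters outside their acceptable range."""
    alarms = []
    for name, (low, high) in LIMITS.items():
        value = readings[name]
        if not (low <= value <= high):
            alarms.append(name)
    return alarms


# Hypothetical sensor readings: warm water with low dissolved oxygen.
sample = {"temperature_c": 32.0, "ph": 7.0, "dissolved_oxygen_mg_l": 3.5}
print(out_of_range(sample))
```

A controller would use the returned names to decide which actuator to drive, for example an aerator when dissolved oxygen is low.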
ISBN: 9789897584886 (print)
The ear, as an important part of the human head, has received much less attention than the human face in the area of computer vision. Inspired by previous work on monocular 3D face reconstruction that uses an autoencoder structure to achieve self-supervised learning, we aim to use such a framework to tackle the 3D ear reconstruction task, where more subtle and difficult curves and features are present in the 2D ear input images. Our Human Ear Reconstruction Autoencoder (HERA) system predicts 3D ear poses and shape parameters for 3D ear meshes without any supervision of these parameters. To make our approach cover the variance of in-the-wild images, even grayscale images, we propose an in-the-wild ear colour model. The resulting end-to-end self-supervised model is then evaluated on both 2D landmark localisation performance and the appearance of the reconstructed 3D ears.
As construction projects enter a new era, building crack width monitoring has entered a stage of high-quality development. Image processing approaches that better meet the needs of crack width monitoring for building safety are needed, with deep learning-based image segmentation at their core. Based on the evolution of image processing and the internal logic of the convolutional neural network, a theoretical analysis framework for the development of image segmentation is constructed. It explains the image segmentation development mechanism that arises jointly from the image processing mechanism and the optimization loop involved in image sampling and segmentation. From the perspective of quality change and practical deduction in image segmentation development, the possibility of reaching the high-quality development goal of building crack width monitoring is further explored. The purpose of building crack width monitoring is to provide crack width estimates that meet the expected standards of construction projects, continuously improving the quality of crack width estimates and satisfaction with the monitoring. It is therefore necessary to strengthen image segmentation control based on the quality of the inner loop of the convolutional neural network. Establishing an interaction and feedback mechanism between image segmentation and the perceived quality of a construction project, together with an evaluation system for image segmentation and building crack width monitoring, will achieve high-quality development of building crack width monitoring, promote construction project safety, and meet the needs of construction projects.
Appearance-based gaze estimation has gained increasing attention because of its generality, robustness, and subject independence. Deep learning, which has achieved great success in computer vision, has also greatly improved the accuracy of appearance-based gaze estimation. To further reduce the gaze estimation error, we focus on extracting better feature information from eye and face images. In this paper, we propose a novel multimodal fusion gaze estimation model based on ConvNeXt and dilated convolution. The model takes an eye image and a face image as input. The ConvNeXt network extracts features from the face image, a dilated convolution-based network extracts features from the eye image, and the feature maps of the two images are fused by a fully connected layer to perform gaze estimation. In the experiments, the model is evaluated on the public MPIIGaze dataset and compared with other gaze estimation models. The results show that the proposed method greatly improves the accuracy of gaze estimation on MPIIGaze compared with related work and achieves state-of-the-art results on this dataset.
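Dilated convolution, which the eye branch relies on, can be illustrated in one dimension. The sketch below is not the paper's network. It only shows how dilation spaces out the kernel taps and widens the receptive field without adding weights. The function names and sample values are hypothetical.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1D dilated convolution (cross-correlation convention).

    Each kernel tap is applied `dilation` samples apart.
    """
    span = (len(kernel) - 1) * dilation
    return [
        sum(w * signal[start + i * dilation] for i, w in enumerate(kernel))
        for start in range(len(signal) - span)
    ]


# Hypothetical input row and a 3-tap difference kernel.
row = [1, 2, 3, 4, 5]
kernel = [1, 0, -1]
print(effective_kernel_size(3, 2))
print(dilated_conv1d(row, kernel, dilation=1))
print(dilated_conv1d(row, kernel, dilation=2))
```

With a dilation of 2, the same three weights cover five input samples, which is why dilated layers can capture wider context at the same parameter count.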