ISBN (digital): 9781665408530
ISBN (print): 9781665408530; 9781665408523
To address the high error rate and poor real-time performance of workpiece sorting with traditional industrial robotic arms, this paper designs a vision robotic arm test platform with real-time processing capability and proposes a workpiece sorting method for the vision robotic arm based on an improved YOLOv5. The Focus layer in the YOLOv5 backbone network is replaced and a coordinate attention module is embedded, which re-weights the feature maps along the channel and spatial dimensions and improves the object detection accuracy of the YOLOv5 model. The workpiece sorting test platform consists of an NVIDIA Jetson Nano controller and a vision robotic arm. The hand-eye calibration of the robotic arm is completed using Zhang Zhengyou's calibration method and the Tsai-Lenz method. Workpiece target images were collected, labeled, and augmented to create the target workpiece dataset. TensorRT is used to accelerate model inference to meet the requirements of the hardware platform. Tests show that the improved YOLOv5 model keeps the test platform running stably and improves the accuracy and real-time performance of workpiece target recognition.
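As a rough illustration of what the hand-eye calibration result is used for, the sketch below maps a detected workpiece position from the camera frame into the robot base frame with a homogeneous transform. The transform values, the helper names `make_transform` and `camera_to_base`, and the rotation-about-Z simplification are all assumptions made for this example; they are not taken from the paper.

```python
import math


def make_transform(yaw_rad, translation):
    """Build a 4x4 homogeneous transform: rotation about Z, then translation."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    tx, ty, tz = translation
    return [
        [c, -s, 0.0, tx],
        [s, c, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def camera_to_base(t_base_cam, point_cam):
    """Map a 3D point from the camera frame into the robot base frame."""
    x, y, z = point_cam
    p = [x, y, z, 1.0]
    # Only the first three rows are needed; the last row is always [0, 0, 0, 1].
    return tuple(sum(t_base_cam[r][k] * p[k] for k in range(4)) for r in range(3))


# Hypothetical calibration result: camera rotated 90 degrees about Z and offset
# from the robot base (values in metres, chosen only for illustration).
T_BASE_CAM = make_transform(math.pi / 2, (0.20, 0.00, 0.35))
workpiece_in_base = camera_to_base(T_BASE_CAM, (0.10, 0.00, 0.00))
```

In a real setup the transform would come from the calibration procedure rather than being written by hand, and the camera-frame point would come from the detector output combined with depth or a known work-plane height.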
Multiplane images (MPIs) have proven to be excellent scene representations for synthesizing new scene views. MPIs can model challenging occlusions and reflections, and they allow novel images to be rendered in real time and with angular consistency. However, their memory footprint is their major limitation. In this work, we propose a learning-based method that computes compact and adaptive MPIs. Our network promotes sparsity in the MPIs so that only the necessary scene information is kept. In addition, we adapt the depth sampling to the given scene to make better use of the available memory and increase the synthesis quality with a restricted number of planes. Moreover, in contrast to recent work, our approach does not need individual training per scene and generalizes well to unseen scenarios. An extensive evaluation shows the superiority of our approach with respect to the state of the art on diverse view synthesis datasets.
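For readers unfamiliar with MPIs, the sketch below shows the standard back-to-front "over" compositing of RGBA planes that turns a stack of planes into an image. It is a minimal illustration only: the function name `composite_mpi` and the flat pixel-list layout are assumptions for this example, and the per-plane warping needed to render a novel viewpoint is omitted.

```python
def composite_mpi(planes):
    """Composite RGBA planes, ordered back to front, with the 'over' operator.

    Each plane is a list of pixels; each pixel is (r, g, b, alpha) in [0, 1].
    Returns a list of (r, g, b) pixels.
    """
    num_pixels = len(planes[0])
    out = [(0.0, 0.0, 0.0)] * num_pixels
    for plane in planes:  # farthest plane first, nearest plane last
        blended = []
        for (r, g, b, a), (pr, pg, pb) in zip(plane, out):
            # Nearer plane covers what is behind it in proportion to its alpha.
            blended.append((
                a * r + (1.0 - a) * pr,
                a * g + (1.0 - a) * pg,
                a * b + (1.0 - a) * pb,
            ))
        out = blended
    return out


# One-pixel example: opaque red background, half-transparent blue foreground.
back = [(1.0, 0.0, 0.0, 1.0)]
front = [(0.0, 0.0, 1.0, 0.5)]
image = composite_mpi([back, front])
```

A pixel with zero alpha leaves the result unchanged, which is one way to see why planes that are mostly transparent carry little information and why sparsity can reduce the memory footprint.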
Driver behavior recognition (DBR) helps to ensure driver safety by alerting drivers about potential hazards and minimizing them. In this paper, we use deep-learning-based neural architecture search (NAS) to classify d...
With the continuous development and maturity of deep learning technologies, security issues in deep learning are also getting more and more attention. The generation of adversarial examples makes scholars more aware o...
In day-to-day life, it can be difficult for a person to find time to attend yoga classes, and in yoga sessions each participant may not receive individual attention. Incorrect muscle usage while performing poses might lead to long-term muscle pain, back pain, or other deformities. To address these problems, a web application is built with which a person can correct their yoga pose. The proposed methodology uses the TensorFlow Lite pose detection Python module to recognize human actions through yoga pose classification, using image processing and deep learning. The objective of pose estimation is to monitor the movement of the human pose across distinct exercises. Yoga poses are recognized in the backend, and wrongly performed poses are corrected through the frontend. A real-time test was carried out with a group of 5 people (three men and two women), and the accuracy attained is around 90%. The accuracy of the proposed deep learning model is evaluated by fitting the training data and predicting on the testing data, and is estimated to be around 98%.
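One common way to turn detected keypoints into pose feedback is to compare joint angles against reference values. The abstract does not say whether the application works this way, so the sketch below is only an assumed illustration; `joint_angle`, `needs_correction`, the keypoint coordinates, and the 15-degree tolerance are all hypothetical.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at keypoint b, formed by segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    norm = math.hypot(*v1) * math.hypot(*v2)
    if norm == 0.0:
        raise ValueError("keypoints must not coincide with the joint")
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / norm
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


def needs_correction(angle_deg, reference_deg, tolerance_deg=15.0):
    """Flag a joint whose angle deviates from the reference by more than the tolerance."""
    return abs(angle_deg - reference_deg) > tolerance_deg


# Hypothetical 2D keypoints (x, y) for shoulder, elbow, and wrist.
shoulder, elbow, wrist = (0.0, 1.0), (0.0, 0.0), (1.0, 0.0)
elbow_angle = joint_angle(shoulder, elbow, wrist)
# Suppose the reference pose expects a straight arm (180 degrees).
flag = needs_correction(elbow_angle, 180.0)
```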
In this work, we explored the possibility of developing a deep learning model which can detect potholes on roads in real time with maximum accuracy and minimum inference delay. We compared the results of image process...
ISBN (print): 9781510633940
Longitudinal brain alignment is critical for disease monitoring and adaptive treatment planning in glioblastoma (GBM) patients. However, current methods are either non-adaptive to pathological brains or time- and labor-intensive. Here, we aim to develop a novel deep-learning-based framework for longitudinal postoperative brain GBM scan registration. The proposed pathology adaptive registration framework (PARF) adopts a double UNET architecture: a 2D 7-level UNET, NETseg, for pathology segmentation, and a 3D 5-level UNET, NETreg, for unsupervised image registration, connected through a spatial transformer and a volume combiner. NETseg was first trained separately and then combined with NETreg for pathology adaptive registration training. In aggregated registration testing of PARF, 36 registrations from 18 intra-subject pairs of postoperative follow-up MR scans were selected, and the results were compared to those from current state-of-the-art methods as well as the non-adaptive NETreg alone. PARF is significantly faster and more accurate than the comparison methods in terms of sum-of-squared differences, segmentation alignment Dice coefficients, and landmark misalignment errors. PARF may pave the way for various clinical and research applications that depend on the accurate registration of longitudinal GBM images.
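Two of the evaluation metrics named above have simple standard definitions, sketched here on flattened images and masks. The function names and the toy inputs are assumptions for illustration and say nothing about how the authors computed their results.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap 2|A ∩ B| / (|A| + |B|) between two binary masks."""
    intersection = sum(1 for x, y in zip(mask_a, mask_b) if x and y)
    total = sum(1 for x in mask_a if x) + sum(1 for y in mask_b if y)
    # Two empty masks are treated as a perfect match by convention here.
    return 1.0 if total == 0 else 2.0 * intersection / total


def sum_squared_differences(image_a, image_b):
    """Sum of squared intensity differences between two aligned images."""
    return sum((x - y) ** 2 for x, y in zip(image_a, image_b))


# Toy flattened masks and images (hypothetical values).
dice = dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0])
ssd = sum_squared_differences([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Higher Dice and lower sum of squared differences both indicate better alignment between the registered scan and its target.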
Ultrasound imaging has been widely used for clinical diagnosis. However, inherent speckle noise degrades the quality of ultrasound images. Existing despeckling methods cannot deliver sufficient speckle reduction while preserving image details under heavy noise corruption, and they cannot perform ultrasound image denoising in real time. With the popularity of deep learning, supervised learning for image denoising has recently attracted considerable attention. In this paper, we propose a novel residual UNet using a mixed-attention mechanism (MARU) for real-time ultrasound image despeckling. In view of the signal-dependent characteristics of speckle noise, we design an encoder-decoder network that reconstructs the despeckled image by extracting features from the noisy image. Furthermore, a lightweight mixed-attention block is proposed to enhance the image features and suppress some speckle noise during the encoding phase, using a separation and re-fusion strategy for channel and spatial attention. In addition, we grade the speckle noise levels at a fixed interval and design an algorithm to estimate the noise level for despeckling real ultrasound images. Experiments were conducted on natural images, a synthetic image, an image simulated using Field II, and real ultrasound images. Compared with existing despeckling methods, the proposed network achieves state-of-the-art despeckling performance in terms of subjective human vision and quantitative indexes such as peak signal-to-noise ratio (PSNR), structural similarity (SSIM), equivalent number of looks (ENL), and contrast-to-noise ratio (CNR).
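Two of the quantitative indexes listed above can be written in a few lines using their usual definitions: PSNR as 10·log10(MAX² / MSE), and ENL as the squared mean over the variance of a homogeneous region. The function names, the 8-bit default for `max_value`, and the toy inputs are assumptions for this sketch.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two flattened images."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def enl(region):
    """Equivalent number of looks: mean^2 / variance over a homogeneous region."""
    n = len(region)
    mean = sum(region) / n
    variance = sum((v - mean) ** 2 for v in region) / n
    return math.inf if variance == 0.0 else mean * mean / variance


# Toy inputs (hypothetical values).
psnr_db = psnr([0.0, 0.0], [1.0, 1.0])
enl_value = enl([1.0, 3.0])
```

A higher ENL in a region that should be uniform indicates stronger speckle suppression, while PSNR requires a clean reference and is therefore mainly usable on synthetic or simulated data.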
With the proliferation of video surveillance devices, the value of computer-assisted detection of anomalous occurrences in video streams has increased. Abnormal prevalence can also be viewed as an abnormal dip compare...
The rapidly evolving field of photoacoustic tomography utilizes endogenous chromophores to extract both functional and structural information from deep within tissues. It is this power to perform precise quantitative measurements in vivo, with endogenous or exogenous contrast, that makes photoacoustic tomography highly promising for clinical translation in functional brain imaging, early cancer detection, real-time surgical guidance, and the visualization of dynamic drug responses. Given that photoacoustic tomography has benefited from numerous engineering innovations, it is no surprise that many of its current cutting-edge developments incorporate advances from the equally novel field of artificial intelligence. More specifically, alongside the growth and prevalence of graphics processing unit capabilities in recent years, an offshoot of artificial intelligence known as deep learning has emerged. Rooted in the solid foundation of signal processing, deep learning typically uses an optimization method known as gradient descent to minimize a loss function and update model parameters. There are already a number of innovative efforts in photoacoustic tomography that use deep learning techniques for a variety of purposes, including resolution enhancement, reconstruction artifact removal, undersampling correction, and improved quantification. Most of these efforts have proven highly promising in addressing long-standing technical obstacles where traditional solutions either fail completely or make only incremental progress. This concise review focuses on the history of applied artificial intelligence in photoacoustic tomography, presents recent advances at this multifaceted intersection of fields, and outlines the most exciting advances that will likely propagate into promising future innovations.
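The gradient descent procedure mentioned above can be illustrated on a deliberately tiny problem: fitting a line y = w·x + b by repeatedly stepping the parameters against the gradient of a mean-squared-error loss. The function name `fit_line`, the learning rate, the step count, and the data are assumptions chosen for this sketch; deep learning models apply the same update rule to far more parameters.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to reduce the loss.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Noise-free toy data generated from y = 2x + 1.
w_fit, b_fit = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```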