Generalisability of grasping algorithms, grasping-target recognition rates, and real-time grasping judgements remain modern industrial challenges in robot grasping tasks. From analytical heuristics to recent deep learning strategies, grasping in complex scenarios is still the aim of many works proposing distinct approaches. In this context, this paper addresses the real-time robot grasping task and proposes a real-time contact-point resolution method for two-dimensional form closure. The method relies on traditional image processing, and tests show that the two-dimensional form-closure point resolution runs stably at 100 fps. For resource-constrained scenarios with strict efficiency requirements, the method is computationally efficient, builds on an easily constructed mathematical model, and offers strong interpretability.
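The abstract does not spell out the contact-resolution algorithm itself, so purely as an illustration, below is a minimal sketch of the standard frictionless 2D form-closure test (the contact wrenches must positively span R^3). The contact points, normals, and the `is_form_closure` helper are hypothetical; in the paper's pipeline the contacts would come from the image-processing stage.

```python
# Minimal sketch of a frictionless 2D form-closure test (illustrative only).
import numpy as np
from scipy.optimize import linprog

def is_form_closure(points, normals):
    """points: (k, 2) contact locations; normals: (k, 2) inward normals.

    A planar frictionless grasp is form-closed iff its contact wrenches
    have rank 3 and admit a strictly positive combination summing to zero."""
    wrenches = []
    for p, n in zip(points, normals):
        n = n / np.linalg.norm(n)
        torque = p[0] * n[1] - p[1] * n[0]      # 2D cross product p x n
        wrenches.append([n[0], n[1], torque])
    W = np.asarray(wrenches).T                   # shape (3, k)

    if np.linalg.matrix_rank(W) < 3:
        return False

    k = W.shape[1]
    # Feasibility LP: find coefficients a >= 1 with W a = 0.
    res = linprog(c=np.zeros(k), A_eq=W, b_eq=np.zeros(3),
                  bounds=[(1.0, None)] * k, method="highs")
    return res.status == 0

# Four offset contacts on a 2x2 square (pinwheel arrangement) are form-closed.
pts = np.array([[1, 0.5], [-1, -0.5], [0.5, 1], [-0.5, -1]], float)
nrm = np.array([[-1, 0], [1, 0], [0, -1], [0, 1]], float)
print(is_form_closure(pts, nrm))   # True
```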
It is said that health is wealth. Here, health refers to both physical health and mental health. People take various measures to take care of their physical health but ignore their mental health, which can lead to depr...
Coronavirus disease 2019 (COVID-19) has caused massive destruction of human lives and capital around the world. The latest variant, Omicron, has proved to be the most infectious of all its previous counterparts, Alpha, Beta, and Delta. Various measures have been identified, tested, and implemented to minimize the attack on humans. Face masks are one such measure and have been shown to be very effective in containing the infection; however, their use requires continuous monitoring for enforcement. In the present manuscript, a detailed research investigation using different ablation studies is carried out to develop a framework for face mask recognition using pre-trained deep convolutional neural network (DCNN) models in conjunction with a fast single-layer feed-forward neural network (SLFNN), commonly known as an Extreme Learning Machine (ELM), as the classification technique. The ELM is well known for its real-time data-processing capabilities and has been successfully applied to both regression and classification problems in the image-processing and biomedical domains. This paper is the first to propose the use of an ELM as the classifier for face mask detection. As a precursor, six pre-trained DCNNs, namely Xception, VGG16, VGG19, ResNet50, ResNet101, and ResNet152, are tested for feature selection. The best testing accuracy is obtained with the ResNet152 transfer-learning model combined with the ELM classifier. Performance evaluation through different ablation studies on testing accuracy shows that the ResNet152-ELM hybrid architecture is not only the best among the selected transfer-learning models but also outperforms several other classifiers applied to face mask detection. Through this investigation, the novelty of the ResNet152 + ELM framework for real-time face mask detection is established.
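For readers unfamiliar with the ELM classifier used above, here is a minimal sketch of the idea: a random, untrained hidden layer followed by a closed-form least-squares solve for the output weights. The hidden-layer size, regularisation term, and the stand-in 2048-dimensional ResNet152-style features are illustrative assumptions, not the paper's settings.

```python
# Minimal ELM classifier sketch (random hidden layer + closed-form output weights).
import numpy as np

class ELMClassifier:
    def __init__(self, n_hidden=512, reg=1e-3, seed=0):
        self.n_hidden = n_hidden
        self.reg = reg                         # ridge term stabilising the solve
        self.rng = np.random.default_rng(seed)

    def fit(self, X, y):
        n_features = X.shape[1]
        n_classes = int(y.max()) + 1
        # Hidden-layer weights are random and never trained (core ELM idea).
        self.W = self.rng.standard_normal((n_features, self.n_hidden))
        self.b = self.rng.standard_normal(self.n_hidden)
        H = self._hidden(X)                    # (n_samples, n_hidden)
        T = np.eye(n_classes)[y]               # one-hot targets
        # Output weights via regularised least squares (closed form).
        A = H.T @ H + self.reg * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def predict(self, X):
        return np.argmax(self._hidden(X) @ self.beta, axis=1)

    def _hidden(self, X):
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))   # sigmoid activation

# Usage on stand-ins for frozen-DCNN features (e.g. 2048-d pooled ResNet152 outputs)
# for a binary mask / no-mask problem; random data used here for illustration.
X_train = np.random.rand(500, 2048)
y_train = np.random.randint(0, 2, 500)
clf = ELMClassifier().fit(X_train, y_train)
print(clf.predict(X_train[:5]))
```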
ISBN (digital): 9798331541842
ISBN (print): 9798331541859
Points of gaze (PoGs) and motor behaviors impact sport climbing performance. A large dataset of global PoGs and climbing holds (CHs) is needed. Recent eye-tracking devices capture only local views, leading to time-consuming global localization. This study aims to automate global PoG and CH computation. A wireless eye-tracking device records PoGs and CHs during climbs. Artificial landmarks aid in mapping to global space. A CNN-based framework detects and classifies the landmarks. Local PoGs and CHs are transformed to global coordinates using a homography transform. Cross-validation assessed the method's success rates and accuracies. The optimal framework computed global PoGs and CHs for 2,460 climbing cases. CH success rates were 80.90% ± 13.98%, with mean Euclidean distance errors of 0.0239 ± 0.0216 m. PoG success rates were 80.79% ± 10.74%. Processing time per frame averaged 115.14 ± 6.80 ms. The datasets will be used to analyze the effects of gaze behavior on climbing outcomes and to inform a decision-support system for sport climbing.
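As a rough illustration of the homography step described above (not the authors' implementation), the sketch below maps a local point of gaze into wall coordinates from assumed landmark correspondences. The landmark detection and classification (the CNN stage) is assumed to have happened already, and all coordinates are placeholders.

```python
# Minimal local-to-global PoG mapping via a homography (illustrative values).
import numpy as np
import cv2

# Landmark centres detected in the current scene-camera frame (pixels).
local_landmarks = np.array([[212., 118.], [965., 131.],
                            [941., 689.], [188., 672.]], dtype=np.float32)
# The same landmarks' known positions on the climbing wall (metres).
global_landmarks = np.array([[0.5, 3.0], [2.5, 3.0],
                             [2.5, 1.0], [0.5, 1.0]], dtype=np.float32)

# Estimate the homography from local (camera) to global (wall) coordinates.
H, _ = cv2.findHomography(local_landmarks, global_landmarks)

# Map the current PoG; the same call would map climbing-hold centres.
pog_local = np.array([[[640., 400.]]], dtype=np.float32)   # shape (1, 1, 2)
pog_global = cv2.perspectiveTransform(pog_local, H)
print("PoG on wall (m):", pog_global.ravel())
```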
Efficiently processing medical images, such as whole slide images in digital pathology, is essential for timely diagnosing high-risk diseases. However, this demands advanced computing infrastructure, e.g., GPU servers...
Recent advances in real-time Magnetic Resonance Imaging (rtMRI) provide an invaluable tool to study speech articulation. In this paper, we present an effective deep learning approach for supervised detection and tracking of vocal tract contours in a sequence of rtMRI frames. We train a single-input multiple-output deep temporal regression network (DTRN) to detect the vocal tract (VT) contour and the separation boundaries between different articulators. The DTRN learns the non-linear mapping from an overlapping fixed-length sequence of rtMRI frames to the corresponding articulatory movements, where a blend of the overlapping contour estimates defines the detected VT contour. The detected contour is refined at a post-processing stage using an appearance model to further improve the accuracy of VT contour detection. The proposed VT contour tracking model is trained and evaluated on the USC-TIMIT dataset. Performance evaluation is carried out using three objective assessment metrics for separating-landmark detection, contour tracking, and temporal stability of the contour landmarks, in comparison with three baseline approaches from the recent literature. Results indicate significant improvements with the proposed method over the state-of-the-art baselines.
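To make the windowed prediction and blending step concrete, here is a minimal sketch under assumed settings: the window length, stride, and landmark count are made up, and a random stub stands in for the trained DTRN. Each frame's final contour is the average of the estimates from every overlapping window that covers it.

```python
# Minimal sketch of blending overlapping per-window contour estimates.
import numpy as np

def blend_overlapping(frames, predict_window, win_len=8, stride=2, n_landmarks=170):
    n = len(frames)
    acc = np.zeros((n, n_landmarks, 2))   # accumulated landmark estimates
    cnt = np.zeros(n)                      # how many windows cover each frame
    for start in range(0, n - win_len + 1, stride):
        est = predict_window(frames[start:start + win_len])   # (win_len, n_landmarks, 2)
        acc[start:start + win_len] += est
        cnt[start:start + win_len] += 1
    return acc / cnt[:, None, None]        # per-frame blended contour

# Random stub standing in for the trained DTRN.
def fake_dtrn(window):
    return np.random.rand(len(window), 170, 2)

frames = np.zeros((40, 68, 68))            # 40 rtMRI frames, 68x68 pixels (assumed size)
contours = blend_overlapping(frames, fake_dtrn)
print(contours.shape)                      # (40, 170, 2)
```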
Soft computing is undergoing a rapid evolution thanks to the development of artificial intelligence, especially deep learning. With soft-computing video surveillance technologies such as image processing, computer vision, and pattern recognition combined with cloud computing, the construction of smart cities can be maintained and greatly enhanced. In this article, we focus on the online detection of action start in video understanding and analysis, which is critical to multimedia security in smart cities. We propose a novel model to tackle this problem and achieve state-of-the-art results on the benchmark THUMOS14 dataset.
In situ exploration of planets beyond Mars will largely depend on autonomous robotic agents for the foreseeable future. These autonomous planetary explorers need to perceive and understand their surroundings in order to make decisions that maximize science return and minimize risk. Deep learning has demonstrated strong performance on a variety of computer vision and image processing tasks, and has become the main approach for powering terrestrial autonomous systems from robotic vacuum cleaners to self-driving cars. However, deep learning systems require significant volumes of annotated data to optimize the models' parameters, a luxury not afforded by in situ missions to new locations in our Solar System. Moreover, space-qualified hardware used on robotic space missions relies on legacy technologies due to power constraints and extensive flight-qualification requirements (e.g., radiation tolerance), resulting in computational limitations that prevent the use of deep learning models for real-time robotic perception tasks (e.g., obstacle detection, terrain segmentation). In this paper, we address these two challenges by leveraging self-supervised distillation to train small, efficient deep learning models that can match or outperform state-of-the-art results obtained by significantly larger models on Mars image classification and terrain segmentation tasks. Using a set of 100,000 unlabeled images taken by Curiosity and large self-supervised vision models, we distill a variety of small model architectures and evaluate their performance on the published test sets for the MSL classification benchmark and the AI4Mars segmentation benchmark. Experimental results show that on the MSL v2.1 classification task, the best-performing student ResNet-18 model achieves a model compression ratio of 5.2 when distilled from a pretrained ResNet-152 teacher model. In addition, we show that using in-domain images for distillation and increasing the dataset size for distillation...
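As a rough sketch of feature distillation in the spirit described above (not the paper's exact recipe), the code below trains a ResNet-18 student plus a linear projection to reproduce the pooled embeddings of a frozen ResNet-152 teacher on unlabeled images. Using an ImageNet-pretrained teacher here is only a stand-in for the paper's large self-supervised vision models, and all hyperparameters are illustrative.

```python
# Minimal feature-distillation sketch: small student mimics a frozen large teacher.
import torch
import torch.nn as nn
from torchvision import models

teacher = models.resnet152(weights=models.ResNet152_Weights.DEFAULT)
teacher.fc = nn.Identity()        # expose 2048-d pooled features
teacher.eval()
for p in teacher.parameters():
    p.requires_grad_(False)       # teacher is frozen

student = models.resnet18(weights=None)
student.fc = nn.Identity()        # 512-d pooled features
project = nn.Linear(512, 2048)    # match teacher embedding size

opt = torch.optim.AdamW(list(student.parameters()) + list(project.parameters()), lr=1e-4)
loss_fn = nn.MSELoss()

def distill_step(images):         # images: a batch of unlabeled frames
    with torch.no_grad():
        target = teacher(images)             # teacher embeddings
    pred = project(student(images))          # student embeddings, projected
    loss = loss_fn(pred, target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

print(distill_step(torch.randn(4, 3, 224, 224)))
```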