Generalisability of grasping algorithms, grasping-target recognition rates, and real-time grasping judgements remain modern industrial challenges in robot grasping tasks. From analytical heuristics to recent deep learning strategies, grasping in complex scenarios is still the aim of many works proposing distinct approaches. In this context, this paper addresses the real-time robot grasping task and proposes a real-time contact-point resolution method for two-dimensional form closure. The method relies on traditional image processing, and tests show that the two-dimensional form-closure point resolution runs stably at 100 fps. For resource-constrained scenarios with strict efficiency requirements, the method is computationally efficient, builds on an easily constructed mathematical model, and offers strong interpretability.
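The abstract does not spell out the contact-resolution algorithm itself, so purely as an illustration, below is a minimal sketch of the standard frictionless 2D form-closure test (the contact wrenches must positively span R^3). The contact points, normals, and the `is_form_closure` helper are hypothetical; in the paper's pipeline the contacts would come from the image-processing stage.

```python
# Minimal sketch of a frictionless 2D form-closure test (illustrative only).
import numpy as np
from scipy.optimize import linprog

def is_form_closure(points, normals):
    """points: (k, 2) contact locations; normals: (k, 2) inward normals.

    A planar frictionless grasp is form-closed iff its contact wrenches
    have rank 3 and admit a strictly positive combination summing to zero."""
    wrenches = []
    for p, n in zip(points, normals):
        n = n / np.linalg.norm(n)
        torque = p[0] * n[1] - p[1] * n[0]      # 2D cross product p x n
        wrenches.append([n[0], n[1], torque])
    W = np.asarray(wrenches).T                   # shape (3, k)

    if np.linalg.matrix_rank(W) < 3:
        return False

    k = W.shape[1]
    # Feasibility LP: find coefficients a >= 1 with W a = 0.
    res = linprog(c=np.zeros(k), A_eq=W, b_eq=np.zeros(3),
                  bounds=[(1.0, None)] * k, method="highs")
    return res.status == 0

# Four offset contacts on a 2x2 square (pinwheel arrangement) are form-closed.
pts = np.array([[1, 0.5], [-1, -0.5], [0.5, 1], [-0.5, -1]], float)
nrm = np.array([[-1, 0], [1, 0], [0, -1], [0, 1]], float)
print(is_form_closure(pts, nrm))   # True
```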
It is said that health is wealth. Here, health refers to both physical health and mental health. People take various measures to take care of their physical health but ignore their mental health, which can lead to depr...
Coronavirus disease 2019 (COVID-19) has caused massive destruction of human lives and capital around the world. The latest variant, Omicron, has proved to be the most infectious of all its previous counterparts, Alpha, Beta, and Delta. Various measures have been identified, tested, and implemented to minimize the attack on humans. Face masks are one such measure and have been shown to be very effective in containing the infection; however, their use requires continuous monitoring for enforcement. In the present manuscript, a detailed research investigation using different ablation studies is carried out to develop a framework for face mask recognition using pre-trained deep convolutional neural network (DCNN) models in conjunction with a fast single-layer feed-forward neural network (SLFNN), commonly known as an Extreme Learning Machine (ELM), as the classification technique. The ELM is well known for its real-time data-processing capabilities and has been successfully applied to both regression and classification problems in the image-processing and biomedical domains. This paper is the first to propose the use of an ELM as the classifier for face mask detection. As a precursor, six pre-trained DCNNs, namely Xception, VGG16, VGG19, ResNet50, ResNet101, and ResNet152, are tested for feature selection. The best testing accuracy is obtained with the ResNet152 transfer-learning model combined with the ELM classifier. Performance evaluation through different ablation studies on testing accuracy shows that the ResNet152-ELM hybrid architecture is not only the best among the selected transfer-learning models but also outperforms several other classifiers applied to face mask detection. Through this investigation, the novelty of the ResNet152 + ELM framework for real-time face mask detection is established.
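For readers unfamiliar with the ELM classifier used above, here is a minimal sketch of the idea: a random, untrained hidden layer followed by a closed-form least-squares solve for the output weights. The hidden-layer size, regularisation term, and the stand-in 2048-dimensional ResNet152-style features are illustrative assumptions, not the paper's settings.

```python
# Minimal ELM classifier sketch (random hidden layer + closed-form output weights).
import numpy as np

class ELMClassifier:
    def __init__(self, n_hidden=512, reg=1e-3, seed=0):
        self.n_hidden = n_hidden
        self.reg = reg                         # ridge term stabilising the solve
        self.rng = np.random.default_rng(seed)

    def fit(self, X, y):
        n_features = X.shape[1]
        n_classes = int(y.max()) + 1
        # Hidden-layer weights are random and never trained (core ELM idea).
        self.W = self.rng.standard_normal((n_features, self.n_hidden))
        self.b = self.rng.standard_normal(self.n_hidden)
        H = self._hidden(X)                    # (n_samples, n_hidden)
        T = np.eye(n_classes)[y]               # one-hot targets
        # Output weights via regularised least squares (closed form).
        A = H.T @ H + self.reg * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def predict(self, X):
        return np.argmax(self._hidden(X) @ self.beta, axis=1)

    def _hidden(self, X):
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))   # sigmoid activation

# Usage on stand-ins for frozen-DCNN features (e.g. 2048-d pooled ResNet152 outputs)
# for a binary mask / no-mask problem; random data used here for illustration.
X_train = np.random.rand(500, 2048)
y_train = np.random.randint(0, 2, 500)
clf = ELMClassifier().fit(X_train, y_train)
print(clf.predict(X_train[:5]))
```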
ISBN (digital): 9798331541842
ISBN (print): 9798331541859
Points of gaze (PoGs) and motor behaviors impact sport climbing performance. A large dataset of global PoGs and climbing holds (CHs) is needed. Recent eye-tracking devices capture only local views, leading to time-consuming global localization. This study aims to automate global PoG and CH computation. A wireless eye-tracking device records PoGs and CHs during climbs. Artificial landmarks aid in mapping to global space. A CNN-based framework detects and classifies the landmarks. Local PoGs and CHs are transformed to global coordinates using a homography transform. Cross-validation assessed the method's success rates and accuracies. The optimal framework computed global PoGs and CHs for 2,460 climbing cases. CH success rates were 80.90% ± 13.98%, with mean Euclidean distance errors of 0.0239 ± 0.0216 m. PoG success rates were 80.79% ± 10.74%. Processing time per frame averaged 115.14 ± 6.80 ms. The datasets will be used to analyze the effects of gaze behavior on climbing outcomes and to inform a decision-support system for sport climbing.
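As a rough illustration of the homography step described above (not the authors' implementation), the sketch below maps a local point of gaze into wall coordinates from assumed landmark correspondences. The landmark detection and classification (the CNN stage) is assumed to have happened already, and all coordinates are placeholders.

```python
# Minimal local-to-global PoG mapping via a homography (illustrative values).
import numpy as np
import cv2

# Landmark centres detected in the current scene-camera frame (pixels).
local_landmarks = np.array([[212., 118.], [965., 131.],
                            [941., 689.], [188., 672.]], dtype=np.float32)
# The same landmarks' known positions on the climbing wall (metres).
global_landmarks = np.array([[0.5, 3.0], [2.5, 3.0],
                             [2.5, 1.0], [0.5, 1.0]], dtype=np.float32)

# Estimate the homography from local (camera) to global (wall) coordinates.
H, _ = cv2.findHomography(local_landmarks, global_landmarks)

# Map the current PoG; the same call would map climbing-hold centres.
pog_local = np.array([[[640., 400.]]], dtype=np.float32)   # shape (1, 1, 2)
pog_global = cv2.perspectiveTransform(pog_local, H)
print("PoG on wall (m):", pog_global.ravel())
```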
Efficiently processing medical images, such as whole slide images in digital pathology, is essential for timely diagnosing high-risk diseases. However, this demands advanced computing infrastructure, e.g., GPU servers...
Recent advances in real-time Magnetic Resonance Imaging (rtMRI) provide an invaluable tool to study speech articulation. In this paper, we present an effective deep learning approach for supervised detection and tracking of vocal tract contours in a sequence of rtMRI frames. We train a single-input multiple-output deep temporal regression network (DTRN) to detect the vocal tract (VT) contour and the separation boundaries between different articulators. The DTRN learns the non-linear mapping from an overlapping fixed-length sequence of rtMRI frames to the corresponding articulatory movements, where a blend of the overlapping contour estimates defines the detected VT contour. The detected contour is refined at a post-processing stage using an appearance model to further improve the accuracy of VT contour detection. The proposed VT contour tracking model is trained and evaluated on the USC-TIMIT dataset. Performance evaluation is carried out using three objective assessment metrics for separating-landmark detection, contour tracking, and temporal stability of the contour landmarks, in comparison with three baseline approaches from the recent literature. Results indicate significant improvements with the proposed method over the state-of-the-art baselines.
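To make the windowed prediction and blending step concrete, here is a minimal sketch under assumed settings: the window length, stride, and landmark count are made up, and a random stub stands in for the trained DTRN. Each frame's final contour is the average of the estimates from every overlapping window that covers it.

```python
# Minimal sketch of blending overlapping per-window contour estimates.
import numpy as np

def blend_overlapping(frames, predict_window, win_len=8, stride=2, n_landmarks=170):
    n = len(frames)
    acc = np.zeros((n, n_landmarks, 2))   # accumulated landmark estimates
    cnt = np.zeros(n)                      # how many windows cover each frame
    for start in range(0, n - win_len + 1, stride):
        est = predict_window(frames[start:start + win_len])   # (win_len, n_landmarks, 2)
        acc[start:start + win_len] += est
        cnt[start:start + win_len] += 1
    return acc / cnt[:, None, None]        # per-frame blended contour

# Random stub standing in for the trained DTRN.
def fake_dtrn(window):
    return np.random.rand(len(window), 170, 2)

frames = np.zeros((40, 68, 68))            # 40 rtMRI frames, 68x68 pixels (assumed size)
contours = blend_overlapping(frames, fake_dtrn)
print(contours.shape)                      # (40, 170, 2)
```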
Soft computing is undergoing a rapid evolution thanks to the development of artificial intelligence, especially deep learning. With soft-computing video surveillance technologies such as image processing, computer vision, and pattern recognition combined with cloud computing, the construction of smart cities can be maintained and greatly enhanced. In this article, we focus on the online detection of action start in video understanding and analysis, which is critical to multimedia security in smart cities. We propose a novel model to tackle this problem and achieve state-of-the-art results on the benchmark THUMOS14 dataset.
In situ exploration of planets beyond Mars will largely depend on autonomous robotic agents for the foreseeable future. These autonomous planetary explorers need to perceive and understand their surroundings in order to make decisions that maximize science return and minimize risk. Deep learning has demonstrated strong performance on a variety of computer vision and image processing tasks, and has become the main approach for powering terrestrial autonomous systems from robotic vacuum cleaners to self-driving cars. However, deep learning systems require significant volumes of annotated data to optimize the models' parameters, a luxury not afforded by in situ missions to new locations in our Solar System. Moreover, space-qualified hardware used on robotic space missions relies on legacy technologies due to power constraints and extensive flight-qualification requirements (e.g., radiation tolerance), resulting in computational limitations that prevent the use of deep learning models for real-time robotic perception tasks (e.g., obstacle detection, terrain segmentation). In this paper, we address these two challenges by leveraging self-supervised distillation to train small, efficient deep learning models that can match or outperform state-of-the-art results obtained by significantly larger models on Mars image classification and terrain segmentation tasks. Using a set of 100,000 unlabeled images taken by Curiosity and large self-supervised vision models, we distill a variety of small model architectures and evaluate their performance on the published test sets for the MSL classification benchmark and the AI4Mars segmentation benchmark. Experimental results show that on the MSL v2.1 classification task, the best-performing student ResNet-18 model achieves a model compression ratio of 5.2 when distilled from a pretrained ResNet-152 teacher model. In addition, we show that using in-domain images for distillation and increasing the dataset size for distillation...
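As a rough sketch of feature distillation in the spirit described above (not the paper's exact recipe), the code below trains a ResNet-18 student plus a linear projection to reproduce the pooled embeddings of a frozen ResNet-152 teacher on unlabeled images. Using an ImageNet-pretrained teacher here is only a stand-in for the paper's large self-supervised vision models, and all hyperparameters are illustrative.

```python
# Minimal feature-distillation sketch: small student mimics a frozen large teacher.
import torch
import torch.nn as nn
from torchvision import models

teacher = models.resnet152(weights=models.ResNet152_Weights.DEFAULT)
teacher.fc = nn.Identity()        # expose 2048-d pooled features
teacher.eval()
for p in teacher.parameters():
    p.requires_grad_(False)       # teacher is frozen

student = models.resnet18(weights=None)
student.fc = nn.Identity()        # 512-d pooled features
project = nn.Linear(512, 2048)    # match teacher embedding size

opt = torch.optim.AdamW(list(student.parameters()) + list(project.parameters()), lr=1e-4)
loss_fn = nn.MSELoss()

def distill_step(images):         # images: a batch of unlabeled frames
    with torch.no_grad():
        target = teacher(images)             # teacher embeddings
    pred = project(student(images))          # student embeddings, projected
    loss = loss_fn(pred, target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

print(distill_step(torch.randn(4, 3, 224, 224)))
```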