Search results - Inner Mongolia University Library

Autonomous embodied navigation task generation from natural language dialogues

Science China (Information Sciences), 2025, Vol. 68, No. 5, pp. 119-132

Authors: Haifeng XU, Yongchang LI, Lumeng MA, Chunwen LI, Yanzhi DONG, Xiaohu YUAN, Huaping LIU. Affiliations: Department of Automation, Tsinghua University; School of Physics and Electronic Information, Yantai University; School of Electrical and Electronic Engineering, Shanghai Institute of Technology; Department of Computer Science and Technology, Tsinghua University

Robots are increasingly being deployed in densely populated environments, such as homes, hotels, and office buildings, where they rely on explicit instructions from humans to perform tasks. However, complex tasks often require multiple instructions and prolonged monitoring, which can be time-consuming and demanding for users. Despite this, there is limited research on enabling robots to autonomously generate tasks based on real-life scenarios. Advanced intelligence requires robots to observe and analyze their environment autonomously and then generate tasks that fulfill human requirements without explicit commands. To address this gap, we propose the autonomous generation of navigation tasks using natural language dialogues. Specifically, a robot autonomously generates tasks by analyzing dialogues involving multiple persons in a real office environment to facilitate the completion of item transportation between various *** propose the leveraging of a large language model (LLM) through chain-of-thought prompting to generate a navigation sequence for a robot from dialogues. We also construct a benchmark dataset consisting of 625 multiperson dialogues using the generation capability of LLMs. Evaluation results and real-world experiments in an office building demonstrate the effectiveness of the proposed method.

Keywords: proactive robot; robot navigation; service robot; large language model

Hybrid CNN-ViT architecture to exploit spatio-temporal feature for fire recognition trained through transfer learning

Multimedia Tools and Applications, 2025, Vol. 84, No. 8, pp. 4703-4732

Authors: Shahid, Mohammad; Wang, Hong-Cyuan; Chen, Yung-Yao; Hua, Kai-Lung. Affiliations: Dept. of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan; Dept. of Electronic and Computer Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan

Fires are becoming one of the major natural hazards that threaten ecology, the economy, human life, and more worldwide. Therefore, early fire detection systems are crucial to prevent fires from spreading out of control and causing destruction. Based on vision sensors, many fire detection techniques have evolved with the recent surge of interest in deep learning, which exploits the spatial features of individual images. However, fire can take different forms and scales, and different combustion materials can produce different colors, making accurate fire detection from an image challenging. Small fires captured from long-distance cameras lack salient features, further complicating detection. This paper proposes a hybrid structure that uses attention-enhanced convolutional neural networks and vision transformers (CNN-ViT) to detect fire. The proposed CNN-ViT first pays spatial attention to every frame and then aggregates temporal contextual information from neighboring frames to improve detection performance. Due to the limited availability of training fire datasets, the study employs deep transfer learning for feature extraction using a pre-trained CNN. We used various metrics to examine the efficacy of the proposed approach. The results showed that the CNN-ViT method outperformed previous models based on spatial-temporal characteristics by achieving a relative improvement in accuracy and F1 score. The satisfactory results on images contaminated with different intensities of noise confirm the robustness of the approach. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.

Keywords: Fires

AI-enabled dental caries detection using transfer learning and gradient-based class activation mapping

Journal of Ambient Intelligence and Humanized Computing, 2024, Vol. 15, No. 7, pp. 3009-3033

Authors: Inani, Hardik; Mehta, Veerangi; Bhavsar, Drashti; Gupta, Rajeev Kumar; Jain, Arti; Akhtar, Zahid. Affiliations: Department of Computer Science and Engineering, Pandit Deendayal Energy University, Gujarat, Gandhinagar, India; Department of Computer Science and Engineering and Information Technology, Jaypee Institute of Information Technology, Uttar Pradesh, Noida, India; Department of Network and Computer Security, State University of New York Polytechnic Institute, NY, United States

Dental caries detection holds the key to unlocking brighter smiles and healthier lives by identifying one of the most common oral health issues early on. This vital topic sheds light on innovative ways to combat tooth decay, empowering individuals to take control of their oral health and maintain radiant smiles. This research paper delves into the realm of transfer learning techniques, aiming to elevate the precision and efficacy of dental caries diagnosis. Utilizing Keras ImageDataGenerator, a rich and balanced dataset is crafted by augmenting teeth images from the Kaggle teeth dataset. Five cutting-edge pre-trained architectures are harnessed in the transfer learning approach: EfficientNetV2B3, VGG19, InceptionResNetV2, Xception, and ResNet50, with each model initialized using ImageNet weights and given tailored top layers. A comprehensive set of evaluation metrics, encompassing accuracy, precision, recall, F1-score, and false negative rates, is employed to gauge the performance of these architectures. The findings unveil the unique advantages and drawbacks of each model, illuminating the path to an optimal choice for dental caries detection using Grad-CAM (Gradient-weighted Class Activation Mapping). The testing accuracies achieved by EfficientNetV2B3, VGG19, InceptionResNetV2, Xception, and ResNet50 models stand at 95.89%, 96.58%, 93.15%, 93.15%, and 94.18%, respectively. The training accuracies stood at 100%, 99.91%, 100%, 100%, and 100%, while the validation accuracies were 97.63%, 96.68%, 98.82%, 96.68%, and 100% for the EfficientNetV2B3, VGG19, InceptionResNetV2, Xception, and ResNet50 models, respectively. Capitalizing on transfer learning and juxtaposing diverse pre-trained architectures, this research paper paves the way for substantial advancements in dental diagnostic capabilities, culminating in enhanced patient outcomes and superior oral health. © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2024.

Keywords: Chemical activation

Adversarial attacks and defenses for digital communication signals identification

Digital Communications and Networks, 2024, Vol. 10, No. 3, pp. 756-764

Authors: Qiao Tian, Sicheng Zhang, Shiwen Mao, Yun Lin. Affiliations: College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China; College of Information and Communication Engineering, Harbin Engineering University, Harbin 150000, China; Department of Electrical and Computer Engineering, Auburn University, Auburn, AL 36849, USA

As modern communication technology advances apace, the digital communication signals identification plays an important role in cognitive radio networks, the communication monitoring and management *** has become a promising solution to this problem due to its powerful modeling capability, which has become a consensus in academia and ***, because of the data-dependence and inexplicability of AI models and the openness of electromagnetic space, the physical layer digital communication signals identification model is threatened by adversarial *** examples pose a common threat to AI models, where well-designed and slight perturbations added to input data can cause wrong ***, the security of AI models for the digital communication signals identification is the premise of its efficient and credible *** this paper, we first launch adversarial attacks on the end-to-end AI model for automatic modulation classification, and then we explain and present three defense mechanisms based on the adversarial *** we present more detailed adversarial indicators to evaluate attack and defense ***, a demonstration verification system is developed to show that the adversarial attack is a real threat to the digital communication signals identification model, which should be paid more attention in future research.

Keywords: Digital communication signals identification; AI model; Adversarial attacks; Adversarial defenses; Adversarial indicators

Enhancing Security and Privacy in Distributed Face Recognition Systems through Blockchain and GAN Technologies

Computers, Materials & Continua, 2024, Vol. 79, No. 5, pp. 2609-2623

Authors: Muhammad Ahmad Nawaz Ul Ghani, Kun She, Muhammad Arslan Rauf, Shumaila Khan, Javed Ali Khan, Eman Abdullah Aldakheel, Doaa Sami Khafaga. Affiliations: School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China; Department of Computer Science, University of Science & Technology, Bannu 28100, Pakistan; Department of Computer Science, University of Hertfordshire, Hatfield AL10 9AB, UK; Department of Computer Sciences, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh 11671, Saudi Arabia

The use of privacy-enhanced facial recognition has increased in response to growing concerns about data security and privacy in the digital age. This trend is spurred by rising demand for face recognition technology in a variety of industries, including access control, law enforcement, surveillance, and internet communication. However, the growing usage of face recognition technology has created serious concerns about data monitoring and user privacy preferences, especially in context-aware systems. In response to these problems, this study provides a novel framework that integrates sophisticated approaches such as Generative Adversarial Networks (GANs), Blockchain, and distributed computing to solve privacy concerns while maintaining exact face recognition. The framework’s painstaking design and execution strive to strike a compromise between precise face recognition and protecting personal data integrity in an increasingly interconnected environment. Using cutting-edge tools like Dlib for face analysis, Ray Cluster for distributed computing, and Blockchain for decentralized identity verification, the proposed system provides scalable and secure facial analysis while protecting user privacy. The study’s contributions include the creation of a sustainable and scalable solution for privacy-aware face recognition, the implementation of flexible privacy computing approaches based on Blockchain networks, and the demonstration of higher performance over previous methods. Specifically, the proposed StyleGAN model has an outstanding accuracy rate of 93.84% while processing high-resolution images from the CelebA-HQ dataset, beating other evaluated models such as Progressive GAN 90.27%, CycleGAN 89.80%, and MGAN 80.80%. With improvements in accuracy, speed, and privacy protection, the framework has great promise for practical use in a variety of fields that need face recognition technology. This study paves the way for future research in privacy-enhanced face recognition systems, emphasizing t

Keywords: Facial recognition; privacy protection; blockchain; GAN; distributed systems

RoadFormer: Duplex Transformer for RGB-Normal Semantic Road Scene Parsing

IEEE Transactions on Intelligent Vehicles, 2024, Vol. 9, No. 7, pp. 1-10

Authors: Li, Jiahang; Zhan, Yikang; Yun, Peng; Zhou, Guangliang; Chen, Qijun; Fan, Rui. Affiliations: College of Electronic and Information Engineering, Tongji University, Shanghai, China; Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong

The recent advancements in deep convolutional neural networks have shown significant promise in the domain of road scene parsing. Nevertheless, the existing works focus primarily on freespace detection, with little attention given to hazardous road defects that could compromise both driving safety and comfort. In this article, we introduce RoadFormer, a novel Transformer-based data-fusion network developed for road scene parsing. RoadFormer utilizes a duplex encoder architecture to extract heterogeneous features from both RGB images and surface normal information. The encoded features are subsequently fed into a novel heterogeneous feature synergy block for effective feature fusion and recalibration. The pixel decoder then learns multi-scale long-range dependencies from the fused and recalibrated heterogeneous features, which are subsequently processed by a Transformer decoder to produce the final semantic prediction. Additionally, we release SYN-UDTIRI, the first large-scale road scene parsing dataset that contains over 10,407 RGB images, dense depth images, and the corresponding pixel-level annotations for both freespace and road defects of different shapes and sizes. Extensive experimental evaluations conducted on our SYN-UDTIRI dataset, as well as on three public datasets, including KITTI road, CityScapes, and ORFD, demonstrate that RoadFormer outperforms all other state-of-the-art networks for road scene parsing. Specifically, RoadFormer ranks first on the KITTI road benchmark. Our source code, created dataset, and demo video are publicly available at ***/RoadFormer. IEEE

Keywords: Semantics

An Empirical Study of Nature-Inspired Algorithms for Feature Selection in Medical Applications

Annals of Data Science, 2024, pp. 1-46

Authors: Arora, Varun; Agarwal, Parul. Affiliation: Department of Computer Science and Information Technology, Jaypee Institute of Information Technology, Sector-62, Noida 201301, India

Nature-inspired algorithms (NIA) are proven to be the potential tool for solving intricate optimization problems and aid in the development of better computational techniques. In recent years, these algorithms have raised considerable interest to optimize feature selection problems. In literature, NIA is found to select relevant features among available features in the diagnosis of many chronic diseases. In this paper, a comprehensive review of existing nature-inspired feature selection techniques is presented. Along with this, the fundamental definitions of feature selection and the usage of NIA to optimize feature selection are shown. We have given a review showcasing the NIA application for selecting feature subsets from the available features in the domain of medical applications. The paper reviews and analyzes numerous relevant papers from 2008 to 2022 on feature selection through NIA on biomedical applications. Moreover, to find the best optimization algorithm for feature selection, we have conducted experiments among four well-known nature-inspired algorithms on ten benchmark datasets of the biomedical domain for classification. We have reported results on various state-of-the-art evaluation measures and presented the convergence graphs for analysis. Based on the average rank of fitness values, Particle Swarm Optimization is found to be better than Harris Hawk Optimization, Grey Wolf Optimization, and Whale Optimization. In this paper, we have also presented some open challenges of this research area to guide researchers as well as experts of computational intelligence for future work. The paper will help future researchers understand the use and implementation of nature-inspired algorithms for feature selection in the medical domain. © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2024.

Keywords: Feature Selection

Image-based rice leaf disease detection using CNN and generative adversarial network

Neural Computing and Applications, 2025, Vol. 37, No. 1, pp. 439-456

Authors: Ramadan, Syed Taha Yeasin; Islam, Md Shafiqul; Sakib, Tanjim; Sharmin, Nusrat; Rahman, Md. Mokhlesur; Rahman, Md. Mahbubur. Affiliation: Department of Computer Science and Engineering, Military Institute of Science and Technology, Dhaka, Bangladesh

Rice is a major crop and staple food for more than half of the world’s population and plays a vital role in ensuring food security as well as the global economy. Pests and diseases pose a threat to the production of rice and have a substantial impact on the yield and quality of the crop. In recent times, deep learning methods have gained prominence in predicting rice leaf diseases. Despite the increasing use of these methods, there are notable limitations in existing approaches. These include a scarcity of extensive and diverse collections of leaf disease images, lower accuracy rates, higher time complexity, and challenges in real-time leaf disease detection. To address the limitations, we explicitly investigate various data augmentation approaches using different generative adversarial networks (GANs) for rice leaf disease detection. Along with the GAN model, advanced CNN-based classifiers have been applied to classify the images with improved data augmentation. Our approach involves employing various GANs to generate high-quality synthetic images. This strategy aims to tackle the challenges posed by limited and imbalanced datasets in the identification of leaf diseases. The key benefit of incorporating GANs in leaf disease detection lies in their ability to create synthetic images, effectively augmenting the dataset’s size, enhancing diversity, and reducing the risk of overfitting. For dataset augmentation, we used three distinct GAN architectures, namely simple GAN, CycleGAN, and DCGAN. Our experiments demonstrated that models utilizing the GAN-augmented dataset generally outperformed those relying on the non-augmented dataset. Notably, the CycleGAN architecture exhibited the most favorable outcomes, with the MobileNet model achieving an accuracy of 98.54%. These findings underscore the significant potential of GAN models in improving the performance of detection models for rice leaf diseases, suggesting their promising role in future research within this doma

Keywords: Generative adversarial networks

A Generative Image Steganography Based on Disentangled Attribute Feature Transformation and Invertible Mapping Rule

Computers, Materials & Continua, 2025, Vol. 83, No. 4, pp. 1149-1171

Authors: Xiang Zhang, Shenyan Han, Wenbin Huang, Daoyong Fu. Affiliations: School of Computer Science, Nanjing University of Information Science and Technology, Nanjing 210044, China; Engineering Research Center of Digital Forensics, Nanjing University of Information Science and Technology, Ministry of Education, Nanjing 210044, China

Generative image steganography is a technique that directly generates stego images from secret *** traditional methods, it theoretically resists steganalysis because there is no cover ***, the existing generative image steganography methods generally have good steganography performance, but there is still potential room for enhancing both the quality of stego images and the accuracy of secret information ***, this paper proposes a generative image steganography algorithm based on attribute feature transformation and invertible mapping ***, the reference image is disentangled by a content and an attribute encoder to obtain content features and attribute features, ***, a mean mapping rule is introduced to map the binary secret information into a noise vector, conforming to the distribution of attribute *** noise vector is input into the generator to produce the attribute transformed stego image with the content feature of the reference ***, we design an adversarial loss, a reconstruction loss, and an image diversity loss to train the proposed *** results demonstrate that the stego images generated by the proposed method are of high quality, with an average extraction accuracy of 99.4% for the hidden ***, since the stego image has a uniform distribution similar to the attribute-transformed image without secret information, it effectively resists both subjective and objective steganalysis.

Keywords: Image information hiding; generative information hiding; disentangled attribute feature transformation; invertible mapping rule; steganalysis resistance

Exploiting Retina Biometric Fused with Encoded Hash for Designing Watermarked Convolutional Hardware IP Against Piracy

SN Computer Science, 2024, Vol. 5, No. 8, pp. 1-17

Authors: Chaurasia, Rahul; Sengupta, Anirban. Affiliations: Department of Computer Science and Engineering, Indian Institute of Information Technology Bhopal, Bhopal, India; Department of Computer Science and Engineering, Indian Institute of Technology Indore, India

The convolution layer in a convolutional neural network (CNN) is highly computationally intensive. It is crucial to design reusable low-cost hardware IP for the convolutional layer to enable hardware-based feature extraction. However, the involvement of a fake IP vendor or untrustworthy broker in the integrated circuit (IC) supply chain makes these IPs susceptible to the threat of piracy. The proposed approach presents a high-level synthesis (HLS) driven watermarking methodology for designing low-cost and secure convolutional hardware IP. The presented watermarking approach employs compiler-driven high-level transformation and exploits a retinal signature fused with the encoded hash as a detective countermeasure against piracy. The proposed approach, therefore, first performs compiler-driven high-level transformation in order to optimize the design latency, followed by embedding the watermark of an authentic IP vendor. The generated watermark in the form of encoded hardware watermarking constraints (digital evidence) is covertly embedded into the resulting optimized design during the register allocation module of HLS. The proposed approach achieves the following: (i) optimized and secure design for convolutional hardware IP, (ii) robust detection of pirated IP at zero design cost overhead, (iii) significantly lower probability of coincidence (in the range of 1.3E−06 to 1.2E−09) indicating stronger digital evidence and higher tamper tolerance (in the range of 2.64E+460 to 9.60E+698) than recent approaches. © The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd. 2024.

Keywords: Convolutional layer; Encoded hash; Hardware security; HLS; IP design; Retina biometrics
