Search Results - Inner Mongolia University Library

Findings of the 62nd Annual Meeting of the Association for Computational Linguistics, ACL 2024

Authors: Dang, Cuong; Le, Dung D.; Le, Thai. Affiliations: FPT Software AI Center, Viet Nam; College of Engineering and Computer Science, VinUniversity, Viet Nam; Department of Computer Science, Indiana University, United States

ISBN: (print) 9798891760998

Existing works have shown that fine-tuned textual transformer models achieve state-of-the-art prediction performance but are also vulnerable to adversarial text perturbations. Traditional adversarial evaluation is often done only after fine-tuning the models and ignores the training data. In this paper, we want to prove that there is also a strong correlation between training data and model robustness. To this end, we extract 13 different features representing a wide range of input fine-tuning corpora properties and use them to predict the adversarial robustness of the fine-tuned models. Focusing mostly on the encoder-only transformer models BERT and RoBERTa, with additional results for BART, ELECTRA, and GPT2, we provide diverse evidence to support our argument. First, empirical analyses show that (a) extracted features can be used with a lightweight classifier such as Random Forest to predict the attack success rate effectively, and (b) the features with the most influence on model robustness have a clear correlation with the robustness. Second, our framework can be used as a fast and effective additional tool for robustness evaluation since it (a) saves 30x-193x runtime compared to the traditional technique, (b) is transferable across models, (c) can be used under adversarial training, and (d) is robust to statistical randomness. Our code is publicly available at https://***/CaptainCuong/RobustText_ACL2024. © 2024 Association for Computational Linguistics.

Keywords: Adversarial machine learning

How far are we to GPT-4V? Closing the gap to commercial multimodal models with open-source suites

Science China (Information Sciences), 2024, Vol. 67, No. 12, pp. 5-22

Authors: Zhe CHEN, Weiyun WANG, Hao TIAN, Shenglong YE, Zhangwei GAO, Erfei CUI, Wenwen TONG, Kongzhi HU, Jiapeng LUO, Zheng MA, Ji MA, Jiaqi WANG, Xiaoyi DONG, Hang YAN, Hewei GUO, Conghui HE, Botian SHI, Zhenjiang JIN, Chao XU, Bin WANG, Xingjian WEI, Wei LI, Wenjian ZHANG, Bo ZHANG, Pinlong CAI, Licheng WEN, Xiangchao YAN, Min DOU, Lewei LU, Xizhou ZHU, Tong LU, Dahua LIN, Yu QIAO, Jifeng DAI, Wenhai WANG. Affiliations: State Key Laboratory for Novel Software Technology, Nanjing University; Shanghai AI Laboratory; School of Computer Science, Fudan University; SenseTime Research; Department of Information Engineering, The Chinese University of Hong Kong; Department of Electronic Engineering, Tsinghua University

In this paper, we introduce InternVL 1.5, an open-source multimodal large language model (MLLM) to bridge the capability gap between open-source and proprietary commercial models in multimodal understanding. We introduce three simple improvements. (1) Strong vision encoder: we explored a continuous learning strategy for the large-scale vision foundation model, InternViT-6B, boosting its visual understanding capabilities so that it can be transferred and reused in different LLMs. (2) Dynamic high-resolution: we divide images into 1 to 40 tiles of 448×448 pixels according to the aspect ratio and resolution of the input images, which supports up to 4K resolution input. (3) High-quality bilingual dataset: we carefully collected a high-quality bilingual dataset that covers common scenes and document images, and annotated it with English and Chinese question-answer pairs, significantly enhancing performance in optical character recognition (OCR) and Chinese-related tasks. We evaluate InternVL 1.5 through a series of benchmarks and comparative studies. Compared to both open-source and proprietary commercial models, InternVL 1.5 shows competitive performance, achieving state-of-the-art results in 8 of 18 multimodal benchmarks. Code and models are available at https://***/OpenGVLab/InternVL.

Keywords: multimodal model; open-source; vision encoder; dynamic resolution; bilingual dataset
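
The dynamic high-resolution step in the abstract above (1 to 40 tiles of 448×448 pixels, chosen from the aspect ratio and resolution of the input) can be illustrated with a minimal sketch. The selection rule below is an assumption made for illustration, not the authors' released implementation: the function names `choose_tile_grid` and `tile_boxes` are hypothetical, and the tie-break on pixel area is one plausible choice among several.

```python
def choose_tile_grid(width, height, tile=448, max_tiles=40):
    """Pick a (cols, rows) grid of tile-sized cells for an image.

    Hypothetical selection rule: among all grids with at most max_tiles
    cells, take the one whose aspect ratio is closest to the image's, and
    break ties by how closely the grid's pixel area matches the image's.
    """
    target = width / height
    best_key, best_grid = None, None
    for cols in range(1, max_tiles + 1):
        for rows in range(1, max_tiles // cols + 1):
            ratio_gap = abs(cols / rows - target)
            area_gap = abs(cols * rows * tile * tile - width * height)
            key = (ratio_gap, area_gap)
            if best_key is None or key < best_key:
                best_key, best_grid = key, (cols, rows)
    return best_grid


def tile_boxes(cols, rows, tile=448):
    """Return (left, top, right, bottom) crop boxes, row by row, for an
    image already resized to (cols * tile) x (rows * tile) pixels."""
    return [
        (c * tile, r * tile, (c + 1) * tile, (r + 1) * tile)
        for r in range(rows)
        for c in range(cols)
    ]


if __name__ == "__main__":
    for size in [(448, 448), (896, 448), (3840, 2160)]:
        cols, rows = choose_tile_grid(*size)
        print(size, "->", cols, "x", rows, "=", cols * rows, "tiles")
```

Under this rule a square 448×448 input maps to a single tile, while a 3840×2160 input maps to a 7×4 grid, the closest ratio to 16:9 that fits within 40 tiles.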

Twin attention based multi-task convolutional bidirectional long short term memory for facial expression recognition

Multimedia Tools and Applications, 2025, Vol. 84, No. 10, pp. 8037-8070

Authors: Sreenivas, Velagapudi; Sivaneasan, B.; Vani, K. Suvarna; Chakrabarti, Prasun. Affiliations: Singapore Institute of Technology, Singapore 138683, Singapore; Engineering Cluster, Singapore Institute of Technology, Singapore 138683, Singapore; Department of Computer Science and Engineering, AI/ML Research Group Head, VR Siddhartha Engineering College, Andhra Pradesh, Vijayawada 520007, India; Department of Computer Science and Engineering, Sir Padampat Singhania University, Rajasthan, Udaipur 313601, India

Facial Expression Recognition (FER) aims to detect the emotional state shown in facial images. It plays an increasingly important role in several application areas, including human–computer interaction (HCI), video transcription, and social communication. This article presents an attention-based multi-task deep learning method for facial expression recognition. First, the input videos are collected from the RAVDESS and MELD datasets. Then, the input videos are converted into keyframes using a threshold-based keyframe extraction algorithm. Next, the keyframes are pre-processed using the Adaptive Pixel Density Median Filtering (A-PDMF) method. Key features such as shape, color and texture are extracted from the pre-processed images. Finally, the facial expressions are recognized by a novel twin attention-based multi-task convolutional bidirectional long short-term memory method (TA-MC-BiLSTM). In addition, the classification parameters are tuned using the Extended Artificial Hummingbird Algorithm (EX-AHA). The proposed model reduces the size of the facial feature set while accurately identifying a wide range of facial expressions. The method is implemented in Python, and results are analyzed on the RAVDESS and MELD datasets. The simulation results show that the proposed model outperforms existing models, with an accuracy of 99.6% on RAVDESS and 99.4% on MELD. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.

Keywords: Pixels
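
The abstract above names a threshold-based keyframe extraction step but does not specify it. The sketch below shows one common form of the idea, assuming frames are flat lists of pixel intensities and that a frame becomes a keyframe when its mean absolute difference from the previous keyframe exceeds a threshold. Both the criterion and the names `mean_abs_diff` and `extract_keyframes` are illustrative assumptions, not the paper's algorithm.

```python
def mean_abs_diff(frame_a, frame_b):
    """Mean absolute per-pixel difference between two equally sized frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)


def extract_keyframes(frames, threshold):
    """Return indices of keyframes.

    The first frame is always kept. Each later frame is kept only if it
    differs from the most recently kept frame by more than `threshold`
    (assumed criterion; the paper's exact rule is not given in the abstract).
    """
    if not frames:
        return []
    kept = [0]
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i], frames[kept[-1]]) > threshold:
            kept.append(i)
    return kept


if __name__ == "__main__":
    # Toy 4-pixel "frames": two near-duplicates are skipped.
    video = [
        [0, 0, 0, 0],
        [1, 0, 0, 0],
        [10, 10, 10, 10],
        [10, 11, 10, 10],
        [50, 50, 50, 50],
    ]
    print(extract_keyframes(video, threshold=5))
```

Comparing against the last kept frame rather than the immediately preceding one avoids missing slow drifts that never exceed the threshold between consecutive frames.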

Machine Learning-based Immune Cell Classification for Enhanced Immunological Analysis

3rd International Conference on Electronics and Renewable Systems, ICEARS 2025

Authors: Naveen Pranouv, S.S.; Isravel, Deva Priya; Dhas, Julia Punitha Malar. Affiliation: Karunya Institute of Technology and Sciences, Division of Computer Science and Engineering, Coimbatore, India

ISBN: (print) 9798331509675

The automated classification of immune cells plays a vital role in advancing immunological research, diagnostics, and therapeutic monitoring. This paper leverages machine learning and image processing techniques to accurately classify various immune cell types from microscopic images by analyzing cellular features such as shape, size, and texture. The proposed Convolutional Neural Network (CNN) based system integrates with digital pathology workflows to provide a reliable and efficient tool for immune cell classification, reducing reliance on manual inspection. The model is trained on a diverse dataset of labeled immune cell images to ensure high classification accuracy and adaptability across different cell types and imaging conditions. The proposed model provides a proactive solution to immunology research challenges, enhancing both the precision and efficiency of immune cell classification. © 2025 IEEE.

Keywords: Convolutional neural networks

Neuroimaging-based Machine Learning for Early Alzheimer's Disease Prediction

3rd International Conference on Electronics and Renewable Systems, ICEARS 2025

Authors: Jerome, Jim; Isravel, Deva Priya; Dhas, Julia Punitha Malar. Affiliation: Karunya Institute of Technology and Sciences, Division of Computer Science and Engineering, Coimbatore, India

ISBN: (print) 9798331509675

Early detection of Alzheimer's disease (AD) is crucial for timely intervention and slowing its progression. This research leverages neuroimaging-based machine learning to classify cognitive impairment levels using MRI images. The proposed approach involves preprocessing MRI images, converting them into a numerical format, and training a BEiT (Bidirectional Encoder representation from Image Transformers) model to classify individuals into one of four categories: No Impairment, Very Mild, Mild, or Moderate Impairment. A web interface, developed using Streamlit, enables user-friendly access, allowing real-time predictions by uploading MRI images. This system demonstrates the potential of integrating advanced machine learning techniques and neuroimaging for early Alzheimer's diagnosis, paving the way for personalized interventions and improved outcomes. © 2025 IEEE.

Keywords: Adversarial machine learning

IAOMDR: Improved Aquila Optimized Hybrid Deep Learning Model for Classification of Diabetic Retinopathy

2nd International Conference on Emerging Trends in Information Technology and Engineering, ic-ETITE 2024

Authors: Saju, Binju; Barwal, Sahil; Asha, V.; Kumar, Shubham; Tressa, Neethu; Nikhil, T. Affiliations: Adi Shankara Institute of Engineering & Technology, Department of Computer Science-AI Engineering, Kalady, India; New Horizon College of Engineering, Department of Master of Computer Applications, Bengaluru, India

ISBN: (print) 9798350328202

Diabetic Retinopathy (DR) is one of the most feared illnesses that cause irreversible blindness. Early detection of Diabetic Retinopathy can therefore help to preserve vision. The study proposes a hybrid model to classify Diabetic Retinopathy images. Pre-processing consists of resizing the images, converting them to greyscale, and removing noise with a combination of a Wiener filter and a median filter. The images are then enhanced using Gaussian Mixture based histogram equalization. Features of the images are extracted using a Dense Convolutional Neural Network. An optimized ResNet101 model with improved Aquila optimization is used for classification. The performance of the model is compared with other studies in the literature. The proposed model achieves an F1 score of 98.98%, accuracy of 99.2%, sensitivity of 98.9%, specificity of 99.1%, and precision of 99%. © 2024 IEEE.

Keywords: Median filters
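
The median filtering mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a greyscale image stored as a list of rows, a fixed 3×3 window, and edge pixels handled by clamping coordinates to the image border. The paper does not state its window size or border handling, so these are assumptions, and `median_filter_3x3` is a hypothetical name.

```python
from statistics import median


def median_filter_3x3(image):
    """Apply a 3x3 median filter to a greyscale image (list of rows).

    Border pixels reuse the nearest in-bounds pixel (coordinate clamping),
    which is one of several common border conventions.
    """
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                image[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            result[y][x] = median(window)
    return result


if __name__ == "__main__":
    # A single bright "salt" pixel in a flat region is removed.
    noisy = [
        [10, 10, 10],
        [10, 255, 10],
        [10, 10, 10],
    ]
    for row in median_filter_3x3(noisy):
        print(row)
```

Because the median ignores isolated extreme values, this kind of filter suits impulse noise, whereas a Wiener filter targets additive noise with known or estimated statistics, which may be why the two are combined.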

A Comparative Analysis of Pancreatic Tumor Detection using VGG16, ResNet, and DenseNet

3rd International Conference on Applied Artificial Intelligence and Computing, ICAAIC 2024

Authors: Deepthi, Gannamani Anusha; Bamini, A.M.; Praveen, Yenumala Joseph. Affiliation: Karunya Institute of Technology and Sciences, Division of Computer Science and Engineering, India

ISBN: (print) 9798350375190

This study explores the potential of image classification architectures such as VGG16, ResNet, and DenseNet for the early detection of pancreatic tumors using medical imaging. Pancreatic cancer is one of the main causes of cancer-related deaths globally. However, pancreatic cancer detected at early stages can be cured. This study navigates through the evolution of CNNs in medical image analysis and their specific applications in pancreatic cancer detection. Pancreatic cancer poses a significant global health challenge due to its high mortality rates, often attributed to late-stage detection. The primary objective is to enhance diagnostic accuracy and enable timely intervention. The obtained results demonstrated promising accuracy levels for each model. VGG16 achieved an accuracy of 96%, ResNet 98.31%, and DenseNet 99%. Precision values for VGG16, ResNet, and DenseNet were 95%, 96%, and 95.2%, respectively. Recall values were 98% for VGG16, 95% for ResNet, and 97% for DenseNet. The F1 scores were 98% for VGG16, 97% for ResNet, and 98.1% for DenseNet. The study concludes that VGG16, ResNet, and DenseNet are valuable tools for pancreatic tumor detection, with each architecture exhibiting unique advantages. By aiding in the early detection of pancreatic cancer, the suggested models may enhance patient outcomes. The research provides insights into the comparative performance of these deep learning architectures, guiding future developments in medical image analysis. © 2024 IEEE.

Keywords: Image classification
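
For readers comparing the precision, recall, and F1 figures quoted in the abstract above, the standard definitions of these metrics for a binary classifier can be sketched as follows. The confusion-matrix counts in the usage example are invented toy values, not data from the paper, and `precision_recall_f1` is a hypothetical helper name.

```python
def precision_recall_f1(true_pos, false_pos, false_neg):
    """Compute precision, recall, and F1 from binary confusion-matrix counts.

    precision = TP / (TP + FP)
    recall    = TP / (TP + FN)
    F1        = harmonic mean of precision and recall
    Each value falls back to 0.0 when its denominator is zero.
    """
    predicted_pos = true_pos + false_pos
    actual_pos = true_pos + false_neg
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Toy counts for illustration only.
    p, r, f1 = precision_recall_f1(true_pos=95, false_pos=5, false_neg=2)
    print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Since F1 is a harmonic mean, for a single class it always lies between the precision and recall it is computed from. Averaged multi-class scores need not follow that pattern.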

Zero-dynamics attack detection based on data association in feedback pathway

Cognitive Robotics, 2025, Vol. 5, pp. 126-139

Authors: Zhang, Zeyu; Li, Hongran; Todo, Yuki. Affiliations: Division of Electrical Engineering and Computer Science, Kanazawa University, Ishikawa, Kanazawa 9201192, Japan; School of Computer Engineering, Jiangsu Ocean University, Jiangsu, Lianyungang 222005, China; Faculty of Electrical Information and Communication Engineering, Kanazawa University, Ishikawa, Kanazawa 9201192, Japan

This paper considers the security of non-minimum phase systems, a typical kind of cyber-physical system. Non-minimum phase systems are characterized by unstable zeros in their transfer functions, making them particularly susceptible to disturbances and attacks. They are more vulnerable to zero-dynamics attacks (ZDAs) than minimum phase systems. A ZDA is a stealthy attack strategy that exploits the internal dynamics of a system, remaining undetectable while causing gradual system destabilization. Recent cyber incidents have demonstrated the increasing risk of such hidden attacks in critical infrastructures, such as power grids and transportation systems. This paper first demonstrates that the existing ZDA has the limitation of falling into local convergence, and then proposes an enhanced zero-dynamics attack (EZDA), which overcomes local convergence by diverging system data. Furthermore, this paper presents an autoregressive model which can build the data association between the original data and the forged data. By observing the fluctuations in state values, the presented model can detect not only ZDA, but also EZDA. Finally, numerical simulations and an application example are provided to verify the theoretical results. © 2025

Keywords: Cyber attacks
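
The abstract above describes detecting attacks with an autoregressive model by observing fluctuations in state values. As a loose illustration of that general idea only, the sketch below fits a first-order autoregressive model to trusted data and flags time steps whose one-step prediction error exceeds a threshold. The model order, the least-squares fit, the threshold rule, and the names `fit_ar1` and `flag_anomalies` are all assumptions; this is not the detector proposed in the paper.

```python
def fit_ar1(series):
    """Least-squares estimate of a in x[t+1] = a * x[t] (no intercept)."""
    numerator = sum(prev * nxt for prev, nxt in zip(series, series[1:]))
    denominator = sum(prev * prev for prev in series[:-1])
    if denominator == 0:
        raise ValueError("series has no variation to fit")
    return numerator / denominator


def flag_anomalies(series, coeff, threshold):
    """Return time indices where |x[t] - coeff * x[t-1]| exceeds threshold."""
    return [
        t
        for t in range(1, len(series))
        if abs(series[t] - coeff * series[t - 1]) > threshold
    ]


if __name__ == "__main__":
    trusted = [1.0, 0.5, 0.25, 0.125]          # decaying state, toy data
    coeff = fit_ar1(trusted)
    monitored = [1.0, 0.5, 0.25, 2.0, 1.0]     # jump injected at t = 3
    print(coeff, flag_anomalies(monitored, coeff, threshold=0.1))
```

A residual test of this kind only reacts to data that breaks the learned one-step relationship, so a forged signal that continues to follow the same dynamics after a jump is flagged once, at the jump itself.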

A Confirmation Based Accident Detection System Using IoT for Smart Vehicles

3rd IEEE World Conference on Applied Intelligence and Computing, AIC 2024

Authors: Ninan, Bettina. Affiliation: Karunya Institute of Technology and Sciences, Division of Computer Science and Engineering, India

ISBN: (print) 9798350384598

Accidents on the road continue to pose a significant threat to life and safety, necessitating innovative solutions to improve emergency response and minimize injuries. The proposed approach introduces an IoT-based Accident Detection System that integrates seamlessly with vehicle airbags, a confirmation button, GPS tracking, SOS calls, and a beeping system. The system harnesses a network of sensors to detect accidents accurately. When an accident occurs, it automatically deploys airbags to protect the passengers in the vehicle. A critical feature in the system is the manual confirmation button, which lets the occupants confirm the severity of the accident. If no confirmation is acquired within a set timeframe, the system proceeds to the next steps to save the occupant. The system uses GPS tracking with an accuracy of 96% to determine the location of the accident. Simultaneously, it initiates SOS calls to nearby hospitals and emergency services, ensuring seamless medical assistance. In addition, the system emits alerts to caution passersby and authorities about the *** system enhances emergency response, reducing the time between an accident occurring and medical attention. It not only minimizes injuries but also improves overall road safety. The proposed system aims to provide a brief overview of its multi-faceted capabilities, emphasizing its potential to save lives and reduce the severity of injuries in road accidents. © 2024 IEEE.

Keywords: Automobile air bags

Comparative analysis of twelve transfer learning models for the prediction and crack detection in concrete dams, based on borehole images

Frontiers of Structural and Civil Engineering, 2024, Vol. 18, No. 10, pp. 1507-1523

Authors: Umer Sadiq KHAN, Muhammad ISHFAQUE, Saif Ur Rehman KHAN, Fang XU, Lerui CHEN, Yi LEI. Affiliations: School of Computer and Information Science, Hubei Engineering University, Xiaogan 432000, China; Institute for AI Industrial Technology Research, Hubei Engineering University, Xiaogan 432000, China; College of Water Conservancy and Hydropower Engineering, Hohai University, Nanjing 210098, China; School of Computer Science and Engineering, Central South University, Changsha 410083, China; College of Aviation, Zhongyuan University of Technology, Zhengzhou 451191, China; School of Civil Engineering, Central South University, Changsha 410083, China

Disaster-resilient dams require accurate crack detection, but machine learning methods cannot capture dam structural reaction temporal patterns and *** research uses deep learning, convolutional neural networks, and transfer learning to improve dam crack *** deep-learning models are trained on 192 crack *** research aims to provide up-to-date detecting techniques to solve dam crack *** finding shows that the EfficientNetB0 model performed better than others in classifying borehole concrete crack surface tiles and normal (undamaged) surface tiles with 91% *** study's pre-trained designs help to identify and to determine the specific locations of cracks.

Keywords: concrete dam; borehole closed-circuit television; deep learning models; crack detection; water resources management
