The proliferation of cooking videos on the internet necessitates the conversion of lengthy video content into concise text recipes. Many online platforms now host a large number of cooking videos, and viewers face the challenge of extracting comprehensive recipes from lengthy visual content. Effective summarization is needed to translate the abundance of culinary knowledge found in videos into text recipes that are easy to read and follow. This simplifies the cooking process for individuals searching for precise step-by-step cooking instructions. Such a system satisfies the needs of a broad spectrum of learners while also improving accessibility and ease of use. As the need for easy-to-follow recipes derived from cooking videos grows, researchers are investigating automated summarization using advanced techniques. One such approach is presented in our work, which combines simple image-based models, audio processing, and GPT-based models to create a system that turns long culinary videos into detailed recipe texts. A systematic workflow is adopted to achieve this objective. Initially, focus is given to frame summary generation, which employs a combination of two convolutional neural networks and a GPT-based model. A pre-trained CNN model, Inception-V3, is fine-tuned on a food image dataset for dish recognition, and another custom CNN is trained on ingredient images for ingredient recognition. A GPT-based model then combines the results produced by the two CNN models to yield the frame summary in the desired format. Subsequently, audio summary generation is tackled by performing speech-to-text conversion in Python. A GPT-based model is then used to summarize the resulting textual representation of the audio in the desired format. Finally, another GPT-based model is used to refine the summaries obtained from the visual and auditory content.
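As a rough illustration of the frame-summary stage, the sketch below fine-tunes a pre-trained Inception-V3 backbone for dish recognition using Keras. The dataset directory, class count, and training settings are illustrative assumptions rather than details taken from the work; the ingredient-recognition CNN and the GPT-based combination step would slot in analogously, with the two models' predicted labels passed to the language model in a prompt.

```python
# Hedged sketch: fine-tuning Inception-V3 for dish recognition (Keras).
# Paths, class count, and training settings are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras import layers, models

NUM_DISH_CLASSES = 101          # assumption, e.g. a Food-101-style dataset
IMG_SIZE = (299, 299)           # Inception-V3's expected input size

train_ds = tf.keras.utils.image_dataset_from_directory(
    "food_images/train",        # hypothetical dataset directory
    image_size=IMG_SIZE,
    batch_size=32,
)

base = InceptionV3(weights="imagenet", include_top=False, pooling="avg")
base.trainable = False          # freeze the backbone; only train the new head

model = models.Sequential([
    layers.Rescaling(1.0 / 127.5, offset=-1.0),  # Inception-style input scaling
    base,
    layers.Dropout(0.3),
    layers.Dense(NUM_DISH_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, epochs=5)
```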
In recent decades, brain tumors have been regarded as a severe illness that causes significant damage to an individual's health and ultimately results in death. Hence, Brain Tumor Segmentation and Classification (BTSC) has gained increasing attention among research communities. BTSC is the process of finding brain tumor tissues and classifying them based on tumor type. Manual tumor segmentation is error-prone and time-consuming. A precise and fast BTSC model is developed in this manuscript based on a transfer learning-based Convolutional Neural Network (CNN) model. A variant of CNN is utilized because of its superiority in distinct tasks. In the initial phase, the Magnetic Resonance Imaging (MRI) brain images are acquired from the Brain Tumor Image Segmentation Challenge (BRATS) 2019, 2020 and 2021 databases. Image augmentation is then performed on the gathered images using zoom-in, rotation, zoom-out, flipping, scaling, and shifting methods, which effectively reduce overfitting in the classification model. The augmented images are segmented using the layers of the Visual-Geometry-Group (VGG-19) model. Feature extraction using an Attribute Aware Attention (AWA) methodology is then carried out on the segmented images following the segmentation block in the VGG-19 model. The crucial features are selected using the attribute category reciprocal attention phase. These features are fed to the Model Agnostic Concept Extractor (MACE) to generate relevance scores between the features, assisting the final classification process. The relevance scores obtained from the MACE are provided to the max-pooling layer of the VGG-19 model. The final classified output is then obtained from the modified VGG-19 architecture. The implemented relevance score with the AWA-based VGG-19 model is used to classify the tumor as whole tumor, enhanced tumor, and tumor core. In the classification section, the proposed
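A minimal sketch of the conventional parts of this pipeline is given below, assuming a Keras setup: the listed augmentations (zoom, rotation, flip, shift) and a frozen VGG-19 backbone feeding a three-class head for whole tumor, enhanced tumor, and tumor core. The AWA attention block and the MACE relevance-score computation are specific to the proposed model and are not reproduced here.

```python
# Hedged sketch of the standard components: MRI image augmentation and a
# frozen VGG-19 backbone with an illustrative three-class head. The paper's
# AWA attention and MACE relevance-score blocks are not reproduced.
import tensorflow as tf
from tensorflow.keras.applications import VGG19
from tensorflow.keras import layers, models

augment = tf.keras.Sequential([
    layers.RandomZoom(0.2),               # zoom-in / zoom-out
    layers.RandomRotation(0.1),           # rotation
    layers.RandomFlip("horizontal"),      # flipping
    layers.RandomTranslation(0.1, 0.1),   # shifting
])

base = VGG19(weights="imagenet", include_top=False, pooling="max",
             input_shape=(224, 224, 3))
base.trainable = False                    # transfer learning: freeze backbone

model = models.Sequential([
    augment,
    layers.Rescaling(1.0 / 255),          # simplified preprocessing
    base,
    layers.Dense(3, activation="softmax"),  # whole tumor, enhanced tumor, tumor core
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.build(input_shape=(None, 224, 224, 3))
model.summary()
```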
In the healthcare sector, the protection of patient information is an essential factor in terms of secrecy and privacy. This information is very useful for industrial and research purposes. Electronic medical records ...
Cardiovascular disease (CVD) remains a leading global health challenge due to its high mortality rate and the complexity of early diagnosis, driven by risk factors such as hypertension, high cholesterol, and irregular pulse rate. Traditional diagnostic methods often struggle with the nuanced interplay of these risk factors, making early detection difficult. In this research, we propose a novel artificial intelligence-enabled (AI-enabled) framework for CVD risk prediction that integrates machine learning (ML) with eXplainable AI (XAI) to provide both high-accuracy predictions and transparent, interpretable insights. Compared to existing studies that typically focus on either optimizing ML performance or using XAI separately for local or global explanations, our approach uniquely combines both local and global interpretability using Local Interpretable Model-Agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP). This dual integration enhances the interpretability of the model and helps clinicians comprehensively understand not just what the model predicts but also why those predictions are made, by identifying the contribution of different risk factors, which is crucial for transparent and informed decision-making in healthcare. The framework uses ML techniques such as K-nearest neighbors (KNN), gradient boosting, random forest, and decision tree, trained on a cardiovascular dataset. Additionally, the integration of LIME and SHAP provides patient-specific insights alongside global trends, ensuring that clinicians receive comprehensive and actionable information. The experimental results achieve 98% accuracy with the Random Forest model, with precision, recall, and F1-scores of 97%, 98%, and 98%, respectively. This innovative combination of SHAP and LIME sets a new benchmark in CVD prediction by integrating advanced ML accuracy with robust interpretability, and fills a critical gap in existing research. The framework paves the way for more explainable and transparent decision-making in healthcare.
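The modelling pattern described above can be sketched as follows, assuming a scikit-learn Random Forest with SHAP for global explanations and LIME for patient-level ones; the file name, column names, and hyperparameters are placeholders, not the study's actual configuration.

```python
# Hedged sketch: Random Forest classifier with global SHAP and local LIME
# explanations. Dataset path, target column, and settings are placeholders.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
import shap
from lime.lime_tabular import LimeTabularExplainer

df = pd.read_csv("cardio_dataset.csv")              # hypothetical file
X, y = df.drop(columns=["target"]), df["target"]    # hypothetical target column
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

# Global interpretability: SHAP values for the whole test set.
shap_values = shap.TreeExplainer(model).shap_values(X_test)

# Local interpretability: LIME explanation for a single patient.
lime_explainer = LimeTabularExplainer(
    X_train.values,
    feature_names=list(X.columns),
    class_names=["no CVD", "CVD"],
    mode="classification",
)
explanation = lime_explainer.explain_instance(
    X_test.values[0], model.predict_proba, num_features=5)
print(explanation.as_list())
```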
The popularity of the Internet and digital consumer gadgets has fundamentally changed our society and daily lives by making digital data collection, transmission, and storage exceedingly easy and convenient. However, ...
The event management mechanism matches messages that have been subscribed to and events that have been published. To identify the subscriptions that correspond to the occurrence inside the category, it must first run ...
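As a generic illustration of this kind of category-based matching (not the paper's specific mechanism), a minimal publish/subscribe sketch in Python might look like the following, where matching only scans the subscriptions registered under the published event's category.

```python
# Minimal, hypothetical sketch of category-based subscription matching.
from collections import defaultdict

class EventBroker:
    def __init__(self):
        # Subscriptions are grouped by category so that matching an event
        # only considers subscriptions registered under that category.
        self._subs = defaultdict(list)   # category -> [(predicate, handler)]

    def subscribe(self, category, predicate, handler):
        self._subs[category].append((predicate, handler))

    def publish(self, category, event):
        for predicate, handler in self._subs[category]:
            if predicate(event):         # deliver only to matching subscriptions
                handler(event)

broker = EventBroker()
broker.subscribe("orders", lambda e: e["amount"] > 100,
                 lambda e: print("large order:", e))
broker.publish("orders", {"amount": 250})
```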
Phishing attacks are among the persistent threats that are dynamically evolving and demand advanced detection mechanisms to counter more sophisticated techniques. Traditional detection approaches are usually based on ...
This review paper explores emerging threats to information privacy and security within the dynamic landscape of Online Social Networks (OSNs), which serve as repositories of vast amounts of user data. The rise in soci...
In this work, we introduce an adaptive hierarchical framework for efficient 3D object detection from point cloud data, designed to dynamically balance computational efficiency and detection performance. Our approach e...
With the exponential increase in data consumption from activities like streaming, gaming, and IoT applications, existing network infrastructure is under significant strain. To address the growing demand for higher dat...