Search Results - Inner Mongolia University Library

Automatic summarization of cooking videos using transfer learning and transformer-based models

Discover Artificial Intelligence, 2025, Vol. 5, No. 1, pp. 1-20

Authors: Sadique, P. M. Alen; Aswiga, R. V. (School of Computer Science and Engineering, Vellore Institute of Technology, Tamil Nadu, Chennai 600127, India)

The proliferation of cooking videos on the internet necessitates converting lengthy video content into concise text recipes. Many online platforms now host a large number of cooking videos, and viewers find it challenging to extract complete recipes from lengthy visual content. Effective summarization is needed to translate the abundance of culinary knowledge found in videos into text recipes that are easy to read and follow. This makes cooking easier for individuals searching for precise step-by-step instructions. Such a system serves a broad spectrum of learners while also improving accessibility and ease of use. As the need grows for easy-to-follow recipes derived from cooking videos, researchers are looking into automated summarization using advanced techniques. One such approach is presented in our work, which combines simple image-based models, audio processing, and GPT-based models to create a system that makes it easier to turn long culinary videos into in-depth recipe texts. A systematic workflow is adopted to achieve this objective. Initially, the focus is on frame summary generation, which employs a combination of two convolutional neural networks and a GPT-based model. A pre-trained CNN model, Inception-V3, is fine-tuned on a food image dataset for dish recognition, and a custom CNN is built with ingredient images for ingredient recognition. A GPT-based model then combines the results produced by the two CNN models to give the frame summary in the desired format. Subsequently, audio summary generation is tackled by performing speech-to-text conversion in Python. A GPT-based model is then used to summarize the resulting transcript in the desired format. Finally, to refine the summaries obtained from visual and auditory content, another GPT-based model is used

Keywords: Convolutional neural networks

Color Image Compression and Encryption Algorithm Based on 2D Compressed Sensing and Hyperchaotic System

Computers, Materials & Continua, 2024, Vol. 78, No. 2, pp. 1977-1993

Authors: Zhiqing Dong; Zhao Zhang; Hongyan Zhou; Xuebo Chen (School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan 114051, China; School of Electronic and Information Engineering, University of Science and Technology Liaoning, Anshan 114051, China)

With the advent of the information security era, it is necessary to guarantee the privacy, accuracy, and dependable transfer of *** study presents a new approach to the encryption and compression of color *** is predicated on 2D compressed sensing (CS) and the hyperchaotic ***, an optimized Arnold scrambling algorithm is applied to the initial color images to ensure strong ***, the processed images are concurrently encrypted and compressed using 2D *** them, chaotic sequences replace traditional random measurement matrices to increase the system’s ***, the processed images are re-encrypted using a combination of permutation and diffusion *** addition, the 2D projected gradient with an embedding decryption (2DPG-ED) algorithm is used to reconstruct *** with the traditional reconstruction algorithm, the 2DPG-ED algorithm can improve security and reduce computational ***, it has better *** experimental outcome and the performance analysis indicate that this algorithm can withstand malicious attacks and prove the method is effective.

Keywords: Image encryption; image compression; hyperchaotic system; compressed sensing
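
The abstract above names Arnold scrambling as the first stage of the pipeline. As a rough illustration only, the following Python sketch implements the classic, unoptimized Arnold cat map on a square grid; it is not the optimized variant the authors describe. The function names `arnold_scramble` and `arnold_unscramble` and the `rounds` parameter are hypothetical choices made for this example.

```python
def arnold_scramble(img, rounds=1):
    """Apply the classic Arnold cat map to a square 2D list `rounds` times."""
    n = len(img)
    if any(len(row) != n for row in img):
        raise ValueError("image must be square")
    for _ in range(rounds):
        out = [[None] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                # Forward map: (x, y) -> (x + y, x + 2y) mod n
                out[(x + y) % n][(x + 2 * y) % n] = img[x][y]
        img = out
    return img


def arnold_unscramble(img, rounds=1):
    """Invert arnold_scramble by applying the inverse map `rounds` times."""
    n = len(img)
    if any(len(row) != n for row in img):
        raise ValueError("image must be square")
    for _ in range(rounds):
        out = [[None] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                # Inverse map: (x, y) -> (2x - y, y - x) mod n
                out[(2 * x - y) % n][(y - x) % n] = img[x][y]
        img = out
    return img
```

The map is a bijection on the pixel grid, so it only permutes positions and never changes pixel values. It is also periodic: on a 4x4 grid, three rounds return the original image, so the number of rounds has to be chosen with the image size in mind.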

Advancing differential diagnosis: a comprehensive review of deep learning approaches for differentiating tuberculosis, pneumonia, and COVID-19

Multimedia Tools and Applications, 2025, Vol. 84, No. 13, pp. 11871-11906

Authors: Kansal, Kajal; Chandra, Tej Bahadur; Singh, Akansha (School of Computer Science Engineering and Technology, Bennett University, Uttar Pradesh, Greater Noida, India)

In medical diagnostics, and particularly in differential diagnosis, where distinguishing between illnesses with comparable symptoms is essential, deep learning has gained importance. Recent developments in deep learning have demonstrated considerable promise for revolutionizing medical diagnostics by using artificial intelligence (AI) to interpret radiological images accurately. In this in-depth review, we examine the most recent deep learning techniques used for the differential diagnosis of tuberculosis, pneumonia, and COVID-19. The study presents a critical review of several state-of-the-art (SOTA) studies on the differential diagnosis of respiratory abnormalities such as TB, pneumonia, and COVID-19. In addition, the approaches, the datasets employed by each method, the diagnostic tests, the assessment measures used, and the performance obtained are summarized and compared to assist future research. By critically analyzing the current literature and outlining the limitations and potential of this field, we suggest a pathway for future research and development of deep learning solutions for differential diagnosis. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.

Keywords: COVID-19

Focusing on your subject: Deep subject-aware image composition recommendation networks

Computational Visual Media, 2023, Vol. 9, No. 1, pp. 87-107

Authors: Guo-Ye Yang; Wen-Yang Zhou; Yun Cai; Song-Hai Zhang; Fang-Lue Zhang (BNRist, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China; School of Engineering and Computer Science, Victoria University of Wellington, Wellington 6012, New Zealand)

Photo composition is one of the most important factors in the aesthetics of *** a popular application, composition recommendation for a photo focusing on a specific subject has been ignored by recent deep-learning-based composition recommendation *** this paper, we propose a subject-aware image composition recommendation method, SAC-Net, which takes an RGB image and a binary subject window mask as input, and returns good compositions as crops containing the *** model first determines candidate scores for all possible coarse cropping *** crops with high candidate scores are selected and further refined by regressing their corner points to generate the output recommended cropping *** final scores of the refined crops are predicted by a final score regression *** existing methods that need to preset several cropping windows, our network is able to automatically regress cropping windows with arbitrary aspect ratios and *** propose novel stability losses for maximizing smoothness when changing cropping windows along with view *** results show that our method outperforms state-of-the-art methods not only on the subject-aware image composition recommendation task, but also for general purpose composition *** also have designed a multistage labeling scheme so that a large amount of ranked pairs can be produced *** use this scheme to propose the first subject-aware composition dataset SACD, which contains 2777 images, and more than 5 million composition ranked *** SACD dataset is publicly available at https://***/SACD/.

Keywords: subject-aware image composition; image cropping; deep learning; recommendation

Brain tumor segmentation and classification using transfer learning based CNN model with model agnostic concept interpretation

Multimedia Tools and Applications, 2025, Vol. 84, No. 5, pp. 2509-2538

Authors: Nancy, A. Maria; Maheswari, R. (School of Computer Science and Engineering, Vellore Institute of Technology, Tamil Nadu, Chennai 632014, India)

In recent decades, brain tumors have been regarded as a severe illness that causes significant damage to an individual's health and can ultimately result in death. Hence, Brain Tumor Segmentation and Classification (BTSC) has gained attention in the research community. BTSC is the process of finding brain tumor tissues and classifying them by tumor type. Manual tumor segmentation is error-prone and time-consuming. This manuscript develops a precise and fast BTSC model based on a transfer learning-based Convolutional Neural Network (CNN). A CNN variant is used because of its superiority in distinct tasks. In the initial phase, Magnetic Resonance Imaging (MRI) brain images are acquired from the Brain Tumor Image Segmentation Challenge (BRATS) 2019, 2020, and 2021 databases. Image augmentation is then performed on the gathered images using zoom-in, rotation, zoom-out, flipping, scaling, and shifting, which reduces overfitting in the classification model. The augmented images are segmented using the layers of the Visual Geometry Group (VGG-19) model. Feature extraction using an Attribute Aware Attention (AWA) methodology is then carried out on the segmented images, following the segmentation block in the VGG-19 model. The crucial features are selected in the attribute category reciprocal attention phase. These features are input to the Model Agnostic Concept Extractor (MACE), which generates relevance scores between the features to assist the final classification. The relevance scores from MACE are provided to the max-pooling layer of the VGG-19 model, and the final classified output is obtained from the modified VGG-19 architecture. The implemented relevance score with the AWA-based VGG-19 model classifies the tumor as whole tumor, enhanced tumor, or tumor core. In the classification section, the proposed

Keywords: Magnetic resonance imaging

Event-based nonsingular fixed-time containment control for nonlinear multiagent systems with dynamic uncertainties

Science China (Information Sciences), 2025, Vol. 68, No. 5, pp. 413-414

Authors: Yuanbo SU; Qihe SHAN; Tieshan LI; C. L. Philip CHEN (Navigation College, Dalian Maritime University; School of Automation Engineering, University of Electronic Science and Technology of China; School of Computer Science and Engineering, South China University of Technology)

Owing to the extensive applications in many areas such as networked systems, formation flying of unmanned air vehicles, and coordinated manipulation of multiple robots, the distributed containment control for nonlinear multiagent systems (MASs) has received considerable attention, for example [1,2]. Although the valued studies in [1,2] investigate containment control problems for MASs subject to nonlinearities, the proposed distributed nonlinear protocols only achieve the asymptotic *** a crucial performance indicator for distributed containment control of MASs, the fast convergence is conducive to achieving better control accuracy [3]. The work in [4] first addresses the backstepping-based adaptive fuzzy fixed-time containment tracking problem for nonlinear high-order MASs with unknown external ***, the designed fixed-time control protocol [4] cannot escape the singularity problem in the backstepping-based adaptive control *** is well known, the singularity problem has become an inherent problem in the adaptive fixed-time control design, which may cause the unbounded control inputs and even the instability of controlled ***, how to solve the nonsingular fixed-time containment control problem for nonlinear MASs is still open and awaits breakthrough to the best of our knowledge.

A vision-based hybrid ensemble learning approach for classification of gait disorders

Multimedia Tools and Applications, 2025, Vol. 84, No. 17, pp. 17597-17644

Authors: Kour, Navleen; Gupta, Sunanda; Arora, Sakshi (School of Computer Science and Engineering, Shri Mata Vaishno Devi University, Katra 182320, India)

Computer vision-based (VB) gait analysis has become a popular platform for detecting Knee Osteoarthritis (KOA) and Parkinson’s disease (PD). A review of the literature revealed heavy use of sensor-based and markerless platforms, which involve issues such as exposure to harmful radiation, wearing discomfort, and background requirements. Further, previous studies lack several aspects, including exploration of the marker-based (MB) approach, experimentation on disease severity levels using enhanced learning techniques, and comparison of abnormal and normal (NM) gait. Therefore, this research aims to predict pathological and NM gait using the MB VB platform. In this paper, first, a VB gait dataset named "KOA-PD-NM" is used, which covers KOA at three stages, i.e., Early (EL), Moderate (MD), and Severe (SV); PD at three stages, i.e., Mild (ML), MD, and SV; and NM subjects, giving a total of seven labels. Then, an improved technique named Color Segmentation based Fractional Order Darwinian Particle Swarm Optimization (CS-FODPSO) is employed to segment the region of interest (ROI). Next, a hybrid ensemble using k-nearest neighbor (KNN), Decision Tree (DT), and Naive Bayes (NB) is proposed to predict the gait patterns of the considered groups. The efficiency of the proposed methodology is evaluated using performance metrics. The evaluation shows that the presented segmentation and hybrid ensemble approaches achieve the highest results in less time than other techniques and the state of the art. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.

Keywords: Gait analysis
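
The abstract does not say how the KNN, DT, and NB predictions are combined into the hybrid ensemble. One common option, shown here purely as an assumption, is hard majority voting over the three predicted labels. The functions `majority_vote` and `ensemble_predict`, the tie-breaking rule, and the label strings are all hypothetical and are not taken from the paper.

```python
from collections import Counter

# Seven class labels described in the abstract (the strings are illustrative).
LABELS = ["KOA-EL", "KOA-MD", "KOA-SV", "PD-ML", "PD-MD", "PD-SV", "NM"]


def majority_vote(votes):
    """Return the most frequent label; ties go to the earliest vote in the list."""
    if not votes:
        raise ValueError("at least one vote is required")
    counts = Counter(votes)
    # max() returns the first maximal element, so earlier classifiers win ties.
    return max(votes, key=lambda label: counts[label])


def ensemble_predict(knn_preds, dt_preds, nb_preds):
    """Combine per-sample predictions from three classifiers by hard voting."""
    if not (len(knn_preds) == len(dt_preds) == len(nb_preds)):
        raise ValueError("prediction lists must have the same length")
    return [majority_vote(list(v)) for v in zip(knn_preds, dt_preds, nb_preds)]
```

With three voters and seven labels, a three-way disagreement is possible, so the tie-breaking rule matters; here the first classifier in the argument order decides.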

Progress on the Fault Diagnosis Approach for Lithium-ion Battery Systems: Advances, Challenges, and Prospects

Protection and Control of Modern Power Systems, 2024, Vol. 9, No. 5, pp. 16-41

Authors: Hanxiao Liu; Luan Zhang; Liwei Li (School of Control Science and Engineering, Shandong University, Jinan 250061, China; School of Computer Science and Technology, Shandong University, Qingdao 266237, China)

Because of their advantages of high energy and power density, low self-discharge rate, and long lifespan, lithium-ion batteries (LIBs) have been widely used in many applications such as electric vehicles, energy storage systems, smart grids, ***, lithium-ion battery systems (LIBSs) frequently malfunction because of complex working conditions, harsh operating environment, battery inconsistency, and inherent defects in battery ***, safety of LIBSs has become a prominent problem and has attracted wide ***, efficient and accurate fault diagnosis for LIBs is very *** paper provides a comprehensive review of the latest research progress in fault diagnosis for ***, the types of battery faults are comprehensively introduced and the characteristics of each fault are ***, the fault diagnosis methods are systematically elaborated, including model-based, data processing-based, machine learning-based and knowledge-based *** latest research is discussed and existing issues and challenges are presented, while future developments are also *** aim is to promote further research into efficient and advanced fault diagnosis methods for more reliable and safer LIBs.

Keywords: Battery management system; battery safety; fault diagnosis; lithium-ion battery system

BA-GNN: Behavior-aware graph neural network for session-based recommendation

Frontiers of Computer Science, 2023, Vol. 17, No. 6, pp. 135-144

Authors: Yongquan LIANG; Qiuyu SONG; Zhongying ZHAO; Hui ZHOU; Maoguo GONG (School of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China; School of Electronic Engineering, Xidian University, Xi’an 710071, China)

Session-based recommendation is a popular research topic that aims to predict users’ next possible interactive item by exploiting anonymous *** existing studies mainly focus on making predictions by considering users’ single interactive *** recent efforts have been made to exploit multiple interactive behaviors, but they generally ignore the influences of different interactive behaviors and the noise in interactive *** address these problems, we propose a behavior-aware graph neural network for session-based ***, different interactive sequences are modeled as directed ***, the item representations are learned via graph neural ***, a sparse self-attention module is designed to remove the noise in behavior ***, the representations of different behavior sequences are aggregated with the gating mechanism to obtain the session *** results on two public datasets show that our proposed method outperforms all competitive *** source code is available on GitHub.

Keywords: session-based recommendation; multiple interactive behaviors; graph neural networks
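
The abstract says that representations of different behavior sequences are aggregated with a gating mechanism but does not give its form. The sketch below assumes the simplest version, a single scalar sigmoid gate that blends two vectors; the function `gated_fusion`, its weight arguments, and the scalar gate are hypothetical and may differ from what BA-GNN actually uses.

```python
import math


def gated_fusion(a, b, w_a, w_b, bias=0.0):
    """Blend two equal-length vectors with a scalar sigmoid gate.

    Returns (fused, g) where fused = g * a + (1 - g) * b and
    g = sigmoid(w_a . a + w_b . b + bias).
    """
    if not (len(a) == len(b) == len(w_a) == len(w_b)):
        raise ValueError("all vectors must have the same length")
    # Gate logit: a linear function of both input representations.
    z = sum(x * w for x, w in zip(a, w_a)) + sum(x * w for x, w in zip(b, w_b)) + bias
    g = 1.0 / (1.0 + math.exp(-z))
    fused = [g * x + (1.0 - g) * y for x, y in zip(a, b)]
    return fused, g
```

In a trained model the weights would be learned, so the gate can favor one behavior sequence over the other per session; with all-zero weights the gate is 0.5 and the result is a plain average.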

Robust Regularization Design of Graph Neural Networks Against Adversarial Attacks Based on Lyapunov Theory

Chinese Journal of Electronics, 2024, Vol. 33, No. 3, pp. 732-741

Authors: Wenjie YAN; Ziqi LI; Yongjun QI (School of Artificial Intelligence, Hebei University of Technology; School of Computer Science and Engineering of North China Institute of Aerospace Engineering)

The robustness of graph neural networks (GNNs) is a critical research topic in deep *** researchers have designed regularization methods to enhance the robustness of neural networks, but there is a lack of theoretical analysis on the principle of *** order to tackle the weakness of current robustness designing methods, this paper gives new insights into how to guarantee the robustness of GNNs. A novel regularization strategy named Lya-Reg is designed to guarantee the robustness of GNNs by Lyapunov *** results give new insights into how regularization can mitigate the various adversarial effects on different graph *** experiments on various public datasets demonstrate that the proposed regularization method is more robust than the state-of-the-art methods such as L1-norm, L2-norm, L2-norm, Pro-GNN, PA-GNN and GARNET against various types of graph adversarial attacks.

Keywords: Deep learning; Graph neural network; Robustness; Lyapunov; Regularization
