The proliferation of cooking videos on the internet necessitates the conversion of lengthy video content into concise text recipes. Many online platforms now host a large number of cooking videos, and viewers face the challenge of extracting comprehensive recipes from lengthy visual content. Effective summarization is needed to translate the abundance of culinary knowledge found in videos into text recipes that are easy to read and follow. This simplifies the cooking process for individuals searching for precise step-by-step cooking instructions. Such a system satisfies the needs of a broad spectrum of learners while also improving accessibility and ease of use. As the need for easy-to-follow recipes derived from cooking videos grows, researchers are investigating automated summarization using advanced techniques. One such approach is presented in our work, which combines simple image-based models, audio processing, and GPT-based models to create a system that turns long culinary videos into detailed recipe texts. A systematic workflow is adopted to achieve this objective. Initially, focus is given to frame summary generation, which employs a combination of two convolutional neural networks and a GPT-based model. A pre-trained CNN model, Inception-V3, is fine-tuned on a food image dataset for dish recognition, and another custom CNN is trained on ingredient images for ingredient recognition. A GPT-based model then combines the results produced by the two CNN models to yield the frame summary in the desired format. Subsequently, audio summary generation is tackled by performing speech-to-text conversion in Python. A GPT-based model is then used to summarize the resulting textual representation of the audio in the desired format. Finally, another GPT-based model is used to refine the summaries obtained from the visual and auditory content.
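As a rough illustration of the frame-summary stage, the sketch below fine-tunes a pre-trained Inception-V3 backbone for dish recognition using Keras. The dataset directory, class count, and training settings are illustrative assumptions rather than details taken from the work; the ingredient-recognition CNN and the GPT-based combination step would slot in analogously, with the two models' predicted labels passed to the language model in a prompt.

```python
# Hedged sketch: fine-tuning Inception-V3 for dish recognition (Keras).
# Paths, class count, and training settings are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras import layers, models

NUM_DISH_CLASSES = 101          # assumption, e.g. a Food-101-style dataset
IMG_SIZE = (299, 299)           # Inception-V3's expected input size

train_ds = tf.keras.utils.image_dataset_from_directory(
    "food_images/train",        # hypothetical dataset directory
    image_size=IMG_SIZE,
    batch_size=32,
)

base = InceptionV3(weights="imagenet", include_top=False, pooling="avg")
base.trainable = False          # freeze the backbone; only train the new head

model = models.Sequential([
    layers.Rescaling(1.0 / 127.5, offset=-1.0),  # Inception-style input scaling
    base,
    layers.Dropout(0.3),
    layers.Dense(NUM_DISH_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, epochs=5)
```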
In recent decades, brain tumors have been regarded as a severe illness that causes significant damage to an individual's health and ultimately results in death. Hence, Brain Tumor Segmentation and Classification (BTSC) has gained increasing attention among research communities. BTSC is the process of finding brain tumor tissues and classifying them based on tumor type. Manual tumor segmentation is error-prone and time-consuming. A precise and fast BTSC model is developed in this manuscript based on a transfer learning-based Convolutional Neural Network (CNN) model. A variant of CNN is utilized because of its superiority in distinct tasks. In the initial phase, the Magnetic Resonance Imaging (MRI) brain images are acquired from the Brain Tumor Image Segmentation Challenge (BRATS) 2019, 2020 and 2021 databases. Image augmentation is then performed on the gathered images using zoom-in, rotation, zoom-out, flipping, scaling, and shifting methods, which effectively reduce overfitting in the classification model. The augmented images are segmented using the layers of the Visual-Geometry-Group (VGG-19) model. Feature extraction using an Attribute Aware Attention (AWA) methodology is then carried out on the segmented images following the segmentation block in the VGG-19 model. The crucial features are selected using the attribute category reciprocal attention phase. These features are fed to the Model Agnostic Concept Extractor (MACE) to generate relevance scores between the features, assisting the final classification process. The relevance scores obtained from the MACE are provided to the max-pooling layer of the VGG-19 model. The final classified output is then obtained from the modified VGG-19 architecture. The implemented relevance score with the AWA-based VGG-19 model is used to classify the tumor as whole tumor, enhanced tumor, and tumor core. In the classification section, the proposed
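A minimal sketch of the conventional parts of this pipeline is given below, assuming a Keras setup: the listed augmentations (zoom, rotation, flip, shift) and a frozen VGG-19 backbone feeding a three-class head for whole tumor, enhanced tumor, and tumor core. The AWA attention block and the MACE relevance-score computation are specific to the proposed model and are not reproduced here.

```python
# Hedged sketch of the standard components: MRI image augmentation and a
# frozen VGG-19 backbone with an illustrative three-class head. The paper's
# AWA attention and MACE relevance-score blocks are not reproduced.
import tensorflow as tf
from tensorflow.keras.applications import VGG19
from tensorflow.keras import layers, models

augment = tf.keras.Sequential([
    layers.RandomZoom(0.2),               # zoom-in / zoom-out
    layers.RandomRotation(0.1),           # rotation
    layers.RandomFlip("horizontal"),      # flipping
    layers.RandomTranslation(0.1, 0.1),   # shifting
])

base = VGG19(weights="imagenet", include_top=False, pooling="max",
             input_shape=(224, 224, 3))
base.trainable = False                    # transfer learning: freeze backbone

model = models.Sequential([
    augment,
    layers.Rescaling(1.0 / 255),          # simplified preprocessing
    base,
    layers.Dense(3, activation="softmax"),  # whole tumor, enhanced tumor, tumor core
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.build(input_shape=(None, 224, 224, 3))
model.summary()
```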
In the healthcare sector, the protection of patient information is an essential factor in terms of secrecy and privacy. This information is very useful for industrial and research purposes. Electronic medical records ...
Cardiovascular disease (CVD) remains a leading global health challenge due to its high mortality rate and the complexity of early diagnosis, driven by risk factors such as hypertension, high cholesterol, and irregular pulse rate. Traditional diagnostic methods often struggle with the nuanced interplay of these risk factors, making early detection difficult. In this research, we propose a novel artificial intelligence-enabled (AI-enabled) framework for CVD risk prediction that integrates machine learning (ML) with eXplainable AI (XAI) to provide both high-accuracy predictions and transparent, interpretable insights. Compared to existing studies that typically focus on either optimizing ML performance or using XAI separately for local or global explanations, our approach uniquely combines both local and global interpretability using Local Interpretable Model-Agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP). This dual integration enhances the interpretability of the model and helps clinicians comprehensively understand not just what the model predicts but also why those predictions are made, by identifying the contribution of different risk factors, which is crucial for transparent and informed decision-making in healthcare. The framework uses ML techniques such as K-nearest neighbors (KNN), gradient boosting, random forest, and decision tree, trained on a cardiovascular dataset. Additionally, the integration of LIME and SHAP provides patient-specific insights alongside global trends, ensuring that clinicians receive comprehensive and actionable information. The experimental results achieve 98% accuracy with the Random Forest model, with precision, recall, and F1-scores of 97%, 98%, and 98%, respectively. This innovative combination of SHAP and LIME sets a new benchmark in CVD prediction by integrating advanced ML accuracy with robust interpretability, and fills a critical gap in existing research. The framework paves the way for more explainable and transparent decision-making in healthcare.
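The modelling pattern described above can be sketched as follows, assuming a scikit-learn Random Forest with SHAP for global explanations and LIME for patient-level ones; the file name, column names, and hyperparameters are placeholders, not the study's actual configuration.

```python
# Hedged sketch: Random Forest classifier with global SHAP and local LIME
# explanations. Dataset path, target column, and settings are placeholders.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
import shap
from lime.lime_tabular import LimeTabularExplainer

df = pd.read_csv("cardio_dataset.csv")              # hypothetical file
X, y = df.drop(columns=["target"]), df["target"]    # hypothetical target column
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

# Global interpretability: SHAP values for the whole test set.
shap_values = shap.TreeExplainer(model).shap_values(X_test)

# Local interpretability: LIME explanation for a single patient.
lime_explainer = LimeTabularExplainer(
    X_train.values,
    feature_names=list(X.columns),
    class_names=["no CVD", "CVD"],
    mode="classification",
)
explanation = lime_explainer.explain_instance(
    X_test.values[0], model.predict_proba, num_features=5)
print(explanation.as_list())
```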
The popularity of the Internet and digital consumer gadgets has fundamentally changed our society and daily lives by making digital data collection, transmission, and storage exceedingly easy and convenient. However, ...
The event management mechanism matches messages that have been subscribed to and events that have been published. To identify the subscriptions that correspond to the occurrence inside the category, it must first run ...
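As a generic illustration of this kind of category-based matching (not the paper's specific mechanism), a minimal publish/subscribe sketch in Python might look like the following, where matching only scans the subscriptions registered under the published event's category.

```python
# Minimal, hypothetical sketch of category-based subscription matching.
from collections import defaultdict

class EventBroker:
    def __init__(self):
        # Subscriptions are grouped by category so that matching an event
        # only considers subscriptions registered under that category.
        self._subs = defaultdict(list)   # category -> [(predicate, handler)]

    def subscribe(self, category, predicate, handler):
        self._subs[category].append((predicate, handler))

    def publish(self, category, event):
        for predicate, handler in self._subs[category]:
            if predicate(event):         # deliver only to matching subscriptions
                handler(event)

broker = EventBroker()
broker.subscribe("orders", lambda e: e["amount"] > 100,
                 lambda e: print("large order:", e))
broker.publish("orders", {"amount": 250})
```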
Phishing attacks are among the persistent threats that are dynamically evolving and demand advanced detection mechanisms to counter more sophisticated techniques. Traditional detection approaches are usually based on ...
This review paper explores emerging threats to information privacy and security within the dynamic landscape of Online Social Networks (OSNs), which serve as repositories of vast amounts of user data. The rise in soci...
In this work, we introduce an adaptive hierarchical framework for efficient 3D object detection from point cloud data, designed to dynamically balance computational efficiency and detection performance. Our approach e...
With the exponential increase in data consumption from activities like streaming, gaming, and IoT applications, existing network infrastructure is under significant strain. To address the growing demand for higher dat...