The proliferation of cooking videos on the internet calls for converting lengthy video content into concise text recipes. Many online platforms now host large numbers of cooking videos, yet viewers find it difficult to extract complete recipes from lengthy visual content. Effective summarization is needed to translate the abundance of culinary knowledge found in videos into text recipes that are easy to read and follow. This makes cooking easier for people who want precise, step-by-step instructions. Such a system serves a broad spectrum of learners while also improving accessibility and ease of use. Given the growing need for easy-to-follow recipes derived from cooking videos, researchers are investigating automated summarization using advanced techniques. Our work presents one such approach, which combines simple image-based models, audio processing, and GPT-based models into a system that turns long culinary videos into in-depth recipe texts. The workflow is systematic. First, frame summaries are generated using two convolutional neural networks (CNNs) and a GPT-based model. A pre-trained CNN, Inception-V3, is fine-tuned on a food image dataset for dish recognition, and a custom CNN is trained on ingredient images for ingredient recognition. A GPT-based model then combines the outputs of the two CNNs to produce the frame summary in the desired format. Next, audio summaries are generated by first applying speech-to-text conversion in Python. A GPT-based model then summarizes the resulting transcript in the desired format. Finally, to refine the summaries obtained from visual and auditory content, another GPT-based model is used
With the advent of the information security era, it is necessary to guarantee the privacy, accuracy, and dependable transfer of *** study presents a new approach to the encryption and compression of color *** is predicated on 2D compressed sensing (CS) and the hyperchaotic ***, an optimized Arnold scrambling algorithm is applied to the initial color images to ensure strong ***, the processed images are concurrently encrypted and compressed using 2D *** them, chaotic sequences replace traditional random measurement matrices to increase the system’s ***, the processed images are re-encrypted using a combination of permutation and diffusion *** addition, the 2D projected gradient with an embedding decryption (2DPG-ED) algorithm is used to reconstruct *** with the traditional reconstruction algorithm, the 2DPG-ED algorithm can improve security and reduce computational ***, it has better *** experimental outcome and the performance analysis indicate that this algorithm can withstand malicious attacks and that the method is effective.
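As a rough illustration of the scrambling step, the sketch below implements the classic Arnold cat map on a square image stored as a list of rows. This is the textbook transform, not the optimized variant the abstract refers to, whose details are not given here; the function names `arnold_scramble` and `arnold_unscramble` are illustrative.

```python
def arnold_scramble(img, rounds=1):
    """Apply the classic Arnold cat map to a square N x N image (list of rows)."""
    n = len(img)
    if any(len(row) != n for row in img):
        raise ValueError("Arnold scrambling needs a square image")
    for _ in range(rounds):
        out = [[None] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                # Forward map: (x, y) -> (x + y, x + 2y) mod n.
                # The matrix [[1, 1], [1, 2]] has determinant 1, so this is a permutation.
                out[(x + y) % n][(x + 2 * y) % n] = img[x][y]
        img = out
    return img


def arnold_unscramble(img, rounds=1):
    """Invert arnold_scramble by reading each pixel back from its mapped position."""
    n = len(img)
    if any(len(row) != n for row in img):
        raise ValueError("Arnold scrambling needs a square image")
    for _ in range(rounds):
        out = [[None] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                # The pixel now at (x + y, x + 2y) originally sat at (x, y).
                out[x][y] = img[(x + y) % n][(x + 2 * y) % n]
        img = out
    return img


if __name__ == "__main__":
    image = [[r * 5 + c for c in range(5)] for r in range(5)]
    scrambled = arnold_scramble(image, rounds=3)
    restored = arnold_unscramble(scrambled, rounds=3)
    print(restored == image)
```

The map is periodic, so the number of rounds acts as a weak key on its own; that is presumably one reason the scheme layers compressed sensing, permutation, and diffusion on top of it.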
In the realm of medical diagnostics, particularly in differential diagnosis, where differentiating between illnesses or ailments with comparable symptoms is essential, deep learning has gained importance. Recent devel...
Photo composition is one of the most important factors in the aesthetics of *** a popular application, composition recommendation for a photo focusing on a specific subject has been ignored by recent deep-learning-based composition recommendation *** this paper, we propose a subject-aware image composition recommendation method, SAC-Net, which takes an RGB image and a binary subject window mask as input, and returns good compositions as crops containing the *** model first determines candidate scores for all possible coarse cropping *** crops with high candidate scores are selected and further refined by regressing their corner points to generate the output recommended cropping *** final scores of the refined crops are predicted by a final score regression *** existing methods that need to preset several cropping windows, our network is able to automatically regress cropping windows with arbitrary aspect ratios and *** propose novel stability losses for maximizing smoothness when changing cropping windows along with view *** results show that our method outperforms state-of-the-art methods not only on the subject-aware image composition recommendation task, but also for general purpose composition *** also have designed a multistage labeling scheme so that a large number of ranked pairs can be produced *** use this scheme to propose the first subject-aware composition dataset SACD, which contains 2777 images, and more than 5 million composition ranked *** SACD dataset is publicly available at https://***/SACD/.
In recent decades, brain tumors have been regarded as a severe illness that significantly damages an individual's health and can ultimately lead to death. Hence, Brain Tumor Segmentation and Classification (BTSC) has gained increasing attention in the research community. BTSC is the process of locating brain tumor tissue and classifying it by tumor type. Manual tumor segmentation is error-prone and time-consuming. This manuscript develops a precise and fast BTSC model based on a transfer-learning Convolutional Neural Network (CNN). A CNN variant is used because of its superiority in distinct tasks. In the initial phase, Magnetic Resonance Imaging (MRI) brain images are acquired from the Brain Tumor Image Segmentation Challenge (BRATS) 2019, 2020, and 2021 databases. Image augmentation is then performed on the gathered images using zoom-in, rotation, zoom-out, flipping, scaling, and shifting, which reduces overfitting in the classification model. The augmented images are segmented using the layers of the Visual Geometry Group (VGG-19) model. Feature extraction using an Attribute Aware Attention (AWA) methodology is then carried out on the segmented images, following the segmentation block in the VGG-19 model. The crucial features are then selected using the attribute category reciprocal attention phase. These features are fed to the Model Agnostic Concept Extractor (MACE) to generate relevance scores between the features, which assist the final classification. The relevance scores from the MACE are provided to the max-pooling layer of the VGG-19 model, and the final classified output is obtained from the modified VGG-19 architecture. The implemented relevance-score and AWA-based VGG-19 model classifies the tumor as whole tumor, enhanced tumor, or tumor core. In the classification section, the proposed
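The augmentation step mentioned in this abstract can be sketched in a few lines. The example below is a minimal pure-Python illustration of three of the listed operations (flipping, rotation, shifting) on a 2D image stored as a list of rows; the helper names, the 90-degree rotation, and the zero fill value are assumptions made for illustration, not the authors' pipeline.

```python
def flip_horizontal(img):
    """Mirror each row left to right."""
    return [row[::-1] for row in img]


def rotate_90(img):
    """Rotate clockwise by 90 degrees: reverse the row order, then transpose."""
    return [list(col) for col in zip(*img[::-1])]


def shift_right(img, pixels, fill=0):
    """Shift every row right by a non-negative pixel count, padding with `fill`."""
    shifted = []
    for row in img:
        k = min(pixels, len(row))
        shifted.append([fill] * k + row[:len(row) - k])
    return shifted


def augment(img):
    """Return the original image plus three augmented copies."""
    return [img, flip_horizontal(img), rotate_90(img), shift_right(img, 1)]


if __name__ == "__main__":
    sample = [[1, 2], [3, 4]]
    for variant in augment(sample):
        print(variant)
```

Each variant keeps the same label as its source image, which is how augmentation enlarges the training set without new annotation.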
Owing to its extensive applications in many areas such as networked systems, formation flying of unmanned air vehicles, and coordinated manipulation of multiple robots, distributed containment control for nonlinear multiagent systems (MASs) has received considerable attention; see, for example, [1,2]. Although the valuable studies in [1,2] investigate containment control problems for MASs subject to nonlinearities, the proposed distributed nonlinear protocols only achieve the asymptotic *** a crucial performance indicator for distributed containment control of MASs, fast convergence is conducive to achieving better control accuracy [3]. The work in [4] first addresses the backstepping-based adaptive fuzzy fixed-time containment tracking problem for nonlinear high-order MASs with unknown external ***, the designed fixed-time control protocol [4] cannot escape the singularity problem in the backstepping-based adaptive control *** is well known, the singularity problem has become an inherent problem in adaptive fixed-time control design, which may cause unbounded control inputs and even instability of the controlled ***, to the best of our knowledge, how to solve the nonsingular fixed-time containment control problem for nonlinear MASs is still open and awaits a breakthrough.
Computer vision-based (VB) gait analysis has become a popular platform for detecting Knee Osteoarthritis (KOA) and Parkinson’s disease (PD). A review of the literature revealed the heavy usage of sensor a...
Because of their advantages of high energy and power density, low self-discharge rate, and long lifespan, lithium-ion batteries (LIBs) have been widely used in many applications such as electric vehicles, energy storage systems, smart grids, ***, lithium-ion battery systems (LIBSs) frequently malfunction because of complex working conditions, harsh operating environment, battery inconsistency, and inherent defects in battery ***, safety of LIBSs has become a prominent problem and has attracted wide ***, efficient and accurate fault diagnosis for LIBs is very *** paper provides a comprehensive review of the latest research progress in fault diagnosis for ***, the types of battery faults are comprehensively introduced and the characteristics of each fault are ***, the fault diagnosis methods are systematically elaborated, including model-based, data processing-based, machine learning-based and knowledge-based *** latest research is discussed and existing issues and challenges are presented, while future developments are also *** aim is to promote further research into efficient and advanced fault diagnosis methods for more reliable and safer LIBs.
Session-based recommendation is a popular research topic that aims to predict users’ next possible interactive item by exploiting anonymous *** existing studies mainly focus on making predictions by considering users’ single interactive *** recent efforts have been made to exploit multiple interactive behaviors, but they generally ignore the influences of different interactive behaviors and the noise in interactive *** address these problems, we propose a behavior-aware graph neural network for session-based ***, different interactive sequences are modeled as directed ***, the item representations are learned via graph neural ***, a sparse self-attention module is designed to remove the noise in behavior ***, the representations of different behavior sequences are aggregated with the gating mechanism to obtain the session *** results on two public datasets show that our proposed method outperforms all competitive *** source code is available on GitHub.
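To make the final aggregation step concrete, the sketch below fuses two behavior-sequence representations with an element-wise sigmoid gate. The abstract does not specify the gate's form, so the formula g = sigmoid(w_a * h_a + w_b * h_b + bias), the two-behavior setting, and all names here are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def gated_fusion(h_a, h_b, w_a, w_b, bias=0.0):
    """Blend two equal-length vectors element-wise with a learned gate.

    For each dimension, g = sigmoid(w_a * a + w_b * b + bias) and the
    output is g * a + (1 - g) * b, so g controls how much of h_a is kept.
    """
    if not (len(h_a) == len(h_b) == len(w_a) == len(w_b)):
        raise ValueError("all vectors must have the same length")
    fused = []
    for a, b, wa, wb in zip(h_a, h_b, w_a, w_b):
        g = sigmoid(wa * a + wb * b + bias)
        fused.append(g * a + (1.0 - g) * b)
    return fused


if __name__ == "__main__":
    # Hypothetical representations of two behavior sequences in one session.
    primary = [1.0, 3.0]
    secondary = [3.0, 1.0]
    print(gated_fusion(primary, secondary, [0.0, 0.0], [0.0, 0.0]))
```

With zero weights and zero bias the gate sits at 0.5 and the output is a plain average; in a trained model the weights would be learned so that more informative behaviors dominate the session representation.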
The robustness of graph neural networks (GNNs) is a critical research topic in deep *** researchers have designed regularization methods to enhance the robustness of neural networks, but there is a lack of theoretical analysis on the principle of *** order to tackle the weakness of current robustness designing methods, this paper gives new insights into how to guarantee the robustness of GNNs. A novel regularization strategy named Lya-Reg is designed to guarantee the robustness of GNNs by Lyapunov *** results give new insights into how regularization can mitigate the various adversarial effects on different graph *** experiments on various public datasets demonstrate that the proposed regularization method is more robust than the state-of-the-art methods such as L1-norm, L2-norm, L2-norm, Pro-GNN, PA-GNN and GARNET against various types of graph adversarial attacks.