This research aims to develop a new approach to increase the safety and reliability of Autonomous Vehicle (AV) through the proposed risk assessment framework, supported by the trust evaluation approach derived from a ...
详细信息
The proliferation of cooking videos on the internet these days necessitates the conversion of these lengthy video contents into concise text recipes. Many online platforms now have a large number of cooking videos, in...
详细信息
The proliferation of cooking videos on the internet these days necessitates the conversion of these lengthy video contents into concise text recipes. Many online platforms now have a large number of cooking videos, in which, there is a challenge for viewers to extract comprehensive recipes from lengthy visual content. Effective summary is necessary in order to translate the abundance of culinary knowledge found in videos into text recipes that are easy to read and follow. This will make the cooking process easier for individuals who are searching for precise step by step cooking instructions. Such a system satisfies the needs of a broad spectrum of learners while also improving accessibility and user simplicity. As there is a growing need for easy-to-follow recipes made from cooking videos, researchers are looking on the process of automated summarization using advanced techniques. One such approach is presented in our work, which combines simple image-based models, audio processing, and GPT-based models to create a system that makes it easier to turn long culinary videos into in-depth recipe texts. A systematic workflow is adopted in order to achieve the objective. Initially, Focus is given for frame summary generation which employs a combination of two convolutional neural networks and a GPT-based model. A pre-trained CNN model called Inception-V3 is fine-tuned with food image dataset for dish recognition and another custom-made CNN is built with ingredient images for ingredient recognition. Then a GPT based model is used to combine the results produced by the two CNN models which will give us the frame summary in the desired format. Subsequently, Audio summary generation is tackled by performing Speech-to-text functionality in python. A GPT-based model is then used to generate a summary of the resulting textual representation of audio in our desired format. Finally, to refine the summaries obtained from visual and auditory content, Another GPT-based model is used
Foundation models(FMs) [1] have revolutionized software development and become the core components of large software systems. This paradigm shift, however, demands fundamental re-imagining of software engineering theo...
Foundation models(FMs) [1] have revolutionized software development and become the core components of large software systems. This paradigm shift, however, demands fundamental re-imagining of software engineering theories and methodologies [2]. Instead of replacing existing software modules implemented by symbolic logic, incorporating FMs' capabilities to build software systems requires entirely new modules that leverage the unique capabilities of ***, while FMs excel at handling uncertainty, recognizing patterns, and processing unstructured data, we need new engineering theories that support the paradigm shift from explicitly programming and maintaining user-defined symbolic logic to creating rich, expressive requirements that FMs can accurately perceive and implement.
A large number of Web APIs have been released as services in mobile communications,but the service provided by a single Web API is usually *** enrich the services in mobile communications,developers have combined Web ...
详细信息
A large number of Web APIs have been released as services in mobile communications,but the service provided by a single Web API is usually *** enrich the services in mobile communications,developers have combined Web APIs and developed a new service,which is known as a *** emergence of mashups greatly increases the number of services in mobile communications,especially in mobile networks and the Internet-of-Things(IoT),and has encouraged companies and individuals to develop even more mashups,which has led to the dramatic increase in the number of *** a trend brings with it big data,such as the massive text data from the mashups themselves and continually-generated usage ***,the question of how to determine the most suitable mashups from big data has become a challenging *** this paper,we propose a mashup recommendation framework from big data in mobile networks and the *** proposed framework is driven by machine learning techniques,including neural embedding,clustering,and matrix *** employ neural embedding to learn the distributed representation of mashups and propose to use cluster analysis to learn the relationship among the *** also develop a novel Joint Matrix Factorization(JMF)model to complete the mashup recommendation task,where we design a new objective function and an optimization *** then crawl through a real-world large mashup dataset and perform *** experimental results demonstrate that our framework achieves high accuracy in mashup recommendation and performs better than all compared baselines.
In the realm of medical diagnostics, particularly in differential diagnosis, where differentiating between illnesses or ailments with comparable symptoms is essential, deep learning has gained importance. Recent devel...
详细信息
Predicting stock market movements presents a formidable challenge due to the inherent nonlinearity and ever-changing nature of financial markets. In this research endeavor, we employ an innovative approach, harnessing...
详细信息
In recent years, academics have placed a high value on multi-modal emotion identification, as well as extensive research has been conducted in the areas of video, text, voice, and physical signal emotion detection. Th...
详细信息
With the development of Industry 4.0 and big data technology,the Industrial Internet of Things(IIoT)is hampered by inherent issues such as privacy,security,and fault tolerance,which pose certain challenges to the rapi...
详细信息
With the development of Industry 4.0 and big data technology,the Industrial Internet of Things(IIoT)is hampered by inherent issues such as privacy,security,and fault tolerance,which pose certain challenges to the rapid development of *** technology has immutability,decentralization,and autonomy,which can greatly improve the inherent defects of the *** the traditional blockchain,data is stored in a Merkle *** data continues to grow,the scale of proofs used to validate it grows,threatening the efficiency,security,and reliability of blockchain-based ***,this paper first analyzes the inefficiency of the traditional blockchain structure in verifying the integrity and correctness of *** solve this problem,a new Vector Commitment(VC)structure,Partition Vector Commitment(PVC),is proposed by improving the traditional VC ***,this paper uses PVC instead of the Merkle tree to store big data generated by *** can improve the efficiency of traditional VC in the process of commitment and ***,this paper uses PVC to build a blockchain-based IIoT data security storage mechanism and carries out a comparative analysis of *** mechanism can greatly reduce communication loss and maximize the rational use of storage space,which is of great significance for maintaining the security and stability of blockchain-based IIoT.
Mobile Ad hoc Network (MANET) is broadly applicable in various sectors within a short amount of time, which is connected to mobile developments. However, the communication in the MANET faces several issues like synchr...
详细信息
In recent decades, brain tumors have been regarded as a severe illness that causes significant damage to the health of the individual, and finally it results to death. Hence, the Brain Tumor Segmentation and Classific...
详细信息
In recent decades, brain tumors have been regarded as a severe illness that causes significant damage to the health of the individual, and finally it results to death. Hence, the Brain Tumor Segmentation and Classification (BTSC) has gained more attention among researcher communities. BTSC is the process of finding brain tumor tissues and classifying the tissues based on the tumor types. Manual tumor segmentation from is prone to error and a time-consuming task. A precise and fast BTSC model is developed in this manuscript based on a transfer learning-based Convolutional Neural Networks (CNN) model. The utilization of a variant of CNN is because of its superiority in distinct tasks. In the initial phase, the Magnetic Resonance Imaging (MRI) brain images are acquired from the Brain Tumor Image Segmentation Challenge (BRATS) 2019, 2020 and 2021 databases. Then the image augmentation is performed on the gathered images by using zoom-in, rotation, zoom-out, flipping, scaling, and shifting methods that effectively reduce overfitting issues in the classification model. The augmented images are segmented using the layers of the Visual-Geometry-Group (VGG-19) model. Then feature extraction using An Attribute Aware Attention (AWA) methodology is carried out on the segmented images following the segmentation block in the VGG-19 model. The crucial features are then selected using the attribute category reciprocal attention phase. These features are inputted to the Model Agnostic Concept Extractor (MACE) to generate the relevance score between the features for assisting in the final classification process. The obtained relevance scores from the MACE are provided to the max-pooling layer of the VGG-19 model. Then, the final classified output is obtained from the modified VGG-19 architecture. The implemented Relevance score with the AWA-based VGG-19 model is used to classify the tumor as the whole tumor, enhanced tumor, and tumor core. In the classification section, the proposed
暂无评论