A large number of Web APIs have been released as services in mobile communications, but the service provided by a single Web API is usually *** enrich the services in mobile communications, developers have combined Web APIs and developed a new service, which is known as a *** emergence of mashups greatly increases the number of services in mobile communications, especially in mobile networks and the Internet-of-Things (IoT), and has encouraged companies and individuals to develop even more mashups, which has led to the dramatic increase in the number of *** a trend brings with it big data, such as the massive text data from the mashups themselves and continually-generated usage ***, the question of how to determine the most suitable mashups from big data has become a challenging *** this paper, we propose a mashup recommendation framework from big data in mobile networks and the *** proposed framework is driven by machine learning techniques, including neural embedding, clustering, and matrix *** employ neural embedding to learn the distributed representation of mashups and propose to use cluster analysis to learn the relationship among the *** also develop a novel Joint Matrix Factorization (JMF) model to complete the mashup recommendation task, where we design a new objective function and an optimization *** then crawl through a real-world large mashup dataset and perform *** experimental results demonstrate that our framework achieves high accuracy in mashup recommendation and performs better than all compared baselines.
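For readers unfamiliar with matrix factorization as a recommendation technique, the following is a minimal sketch of plain single-matrix factorization trained with stochastic gradient descent. It is not the JMF model described in the abstract, whose objective function and optimization procedure are not given here. The function names, hyperparameters, and the toy usage triples are all illustrative assumptions.

```python
import random


def factorize(ratings, n_rows, n_cols, rank=2, lr=0.05, reg=0.01,
              epochs=500, seed=0):
    """Plain SGD matrix factorization (illustrative, not the paper's JMF).

    ratings: list of (row, col, value) triples, e.g. (user, mashup, usage).
    Returns two factor matrices P (n_rows x rank) and Q (n_cols x rank).
    """
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_rows)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_cols)]
    for _ in range(epochs):
        for i, j, r in ratings:
            pred = sum(P[i][k] * Q[j][k] for k in range(rank))
            err = r - pred
            for k in range(rank):
                p, q = P[i][k], Q[j][k]
                # Gradient step on squared error with L2 regularization.
                P[i][k] += lr * (err * q - reg * p)
                Q[j][k] += lr * (err * p - reg * q)
    return P, Q


def predict(P, Q, i, j):
    """Predicted score for row i and column j (dot product of factors)."""
    return sum(p * q for p, q in zip(P[i], Q[j]))


# Hypothetical toy data: 3 users x 3 mashups, value 1.0 means "used".
observed = [(0, 0, 1.0), (0, 1, 1.0), (1, 0, 1.0),
            (1, 2, 1.0), (2, 1, 1.0), (2, 2, 1.0)]
P, Q = factorize(observed, n_rows=3, n_cols=3)
```

Unobserved pairs such as `predict(P, Q, 0, 2)` can then be ranked to produce recommendations; a joint model would additionally tie these factors to other data sources, such as the text embeddings and clusters the abstract mentions.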
Deep learning approaches have attained remarkable success across various artificial intelligence applications, spanning healthcare, finance, and autonomous vehicles, profoundly impacting human existence. However, thei...
The proliferation of cooking videos on the internet necessitates converting this lengthy video content into concise text recipes. Many online platforms now host a large number of cooking videos, and viewers find it challenging to extract comprehensive recipes from lengthy visual content. Effective summarization is necessary to translate the abundance of culinary knowledge found in videos into text recipes that are easy to read and follow. This will make the cooking process easier for individuals who are searching for precise step-by-step cooking instructions. Such a system satisfies the needs of a broad spectrum of learners while also improving accessibility and ease of use. As there is a growing need for easy-to-follow recipes made from cooking videos, researchers are looking into automated summarization using advanced techniques. One such approach is presented in our work, which combines simple image-based models, audio processing, and GPT-based models to create a system that makes it easier to turn long culinary videos into in-depth recipe texts. A systematic workflow is adopted to achieve this objective. Initially, the focus is on frame summary generation, which employs a combination of two convolutional neural networks (CNNs) and a GPT-based model. A pre-trained CNN model called Inception-V3 is fine-tuned on a food image dataset for dish recognition, and another custom-made CNN is built with ingredient images for ingredient recognition. A GPT-based model is then used to combine the results produced by the two CNN models, yielding the frame summary in the desired format. Subsequently, audio summary generation is tackled by performing speech-to-text conversion in Python. A GPT-based model is then used to summarize the resulting transcript in the desired format. Finally, to refine the summaries obtained from visual and auditory content, another GPT-based model is used
Disaster-resilient dams require accurate crack detection, but machine learning methods cannot capture dam structural reaction temporal patterns and *** research uses deep learning, convolutional neural networks, and transfer learning to improve dam crack *** deep-learning models are trained on 192 crack *** research aims to provide up-to-date detecting techniques to solve dam crack *** finding shows that the EfficientNetB0 model performed better than others in classifying borehole concrete crack surface tiles and normal (undamaged) surface tiles with 91% *** study’s pre-trained designs help to identify and to determine the specific locations of cracks.
Person Image Synthesis has been widely used in fashion with extensive application *** point of this task is how to synthesise person image from a single source image under arbitrary *** methods generate the person image with target pose well; however, they fail to preserve the fine style details of the source *** address this problem, a robust style injection (RSI) model is proposed, which is a coarse-to-fine framework to synthesise target the person *** develops a simple and efficient cross-attention based module to fuse the features of both source semantic styles and target pose for achieving the coarse aligned *** adaptive instance normalisation is employed to enhance the aligned features in conjunction with source semantic ***, source semantic styles are further injected into the positional normalisation scheme to avoid the fine style details erosion caused by massive *** training losses, optimal transport theory in the form of energy distance is introduced to constrain data distribution to refine the texture style ***, the authors’ model is capable of editing the shape and texture of garments to the target style *** experiments demonstrate that the authors’ RSI achieves better performance over the state-of-the-art methods.
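As background on the adaptive instance normalisation step named above, the sketch below shows the standard AdaIN operation on a single feature channel: the content features are normalised to zero mean and unit variance, then rescaled to the mean and standard deviation of the style features. This is the generic formulation, not the authors' RSI module. The flattened-list representation and the function names are assumptions made for illustration.

```python
import math


def mean_std(values, eps=1e-5):
    """Mean and (eps-stabilised) standard deviation of one channel."""
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / len(values)
    return m, math.sqrt(var + eps)


def adain(content, style):
    """Standard AdaIN on one channel, given as flat lists of floats.

    Output = std(style) * (content - mean(content)) / std(content)
             + mean(style)
    """
    mc, sc = mean_std(content)
    ms, ss = mean_std(style)
    return [ss * (v - mc) / sc + ms for v in content]


# Hypothetical single-channel features for illustration only.
aligned = adain([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0])
```

After the call, `aligned` keeps the relative ordering of the content values but carries the first- and second-order statistics of the style channel, which is how style information is injected without changing spatial layout.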
In the realm of medical diagnostics, particularly in differential diagnosis, where differentiating between illnesses or ailments with comparable symptoms is essential, deep learning has gained importance. Recent devel...
In the evolving landscape of surveillance and security applications, the task of person re-identification (re-ID) has significant importance, but also presents notable difficulties. This task entails accurately matching and identifying persons across several camera views that do not overlap with one another. This is of utmost importance to video surveillance, public safety, and person-tracking applications. However, vision-related difficulties, such as variations in appearance, occlusions, viewpoint changes, cloth changes, scalability, limited robustness to environmental factors, and lack of generalization, still hinder the development of reliable person re-ID methods. A few approaches that address these difficulties have been developed using traditional deep-learning techniques. Nevertheless, transformer-based methods have recently gained widespread adoption in various domains owing to their unique architectural properties. Recently, a few transformer-based person re-ID methods have been developed to address these difficulties and have achieved good results. To develop reliable solutions for person re-ID, a comprehensive analysis of transformer-based methods is necessary. However, few studies have examined transformer-based techniques in depth. This review presents recent literature on transformer-based approaches, examining their effectiveness, advantages, and potential challenges. This review is the first of its kind to provide insights into the revolutionary transformer-based methodologies used to tackle many obstacles in person re-ID, providing a forward-thinking outlook on current research and potentially guiding the creation of viable applications in real-world scenarios. The main objective is to provide a useful resource for academics and practitioners engaged in person re-ID. IEEE
In recent decades, brain tumors have been regarded as a severe illness that causes significant damage to the health of the individual and can ultimately result in death. Hence, Brain Tumor Segmentation and Classification (BTSC) has gained more attention in the research community. BTSC is the process of finding brain tumor tissues and classifying the tissues based on the tumor types. Manual tumor segmentation is error-prone and time-consuming. A precise and fast BTSC model is developed in this manuscript based on a transfer learning-based Convolutional Neural Network (CNN) model. A CNN variant is used because of its superior performance across distinct tasks. In the initial phase, the Magnetic Resonance Imaging (MRI) brain images are acquired from the Brain Tumor Image Segmentation Challenge (BRATS) 2019, 2020 and 2021 databases. Image augmentation is then performed on the gathered images using zoom-in, rotation, zoom-out, flipping, scaling, and shifting methods, which effectively reduce overfitting in the classification model. The augmented images are segmented using the layers of the Visual-Geometry-Group (VGG-19) model. Feature extraction using an Attribute Aware Attention (AWA) methodology is then carried out on the segmented images, following the segmentation block in the VGG-19 model. The crucial features are then selected using the attribute category reciprocal attention phase. These features are input to the Model Agnostic Concept Extractor (MACE) to generate the relevance score between the features, which assists the final classification process. The relevance scores obtained from the MACE are provided to the max-pooling layer of the VGG-19 model. The final classified output is then obtained from the modified VGG-19 architecture. The implemented relevance-score and AWA-based VGG-19 model is used to classify the tumor as the whole tumor, enhanced tumor, and tumor core. In the classification section, the proposed
Computer vision-based (VB) gait analysis has become a popular platform for detecting Knee Osteoarthritis (KOA) and Parkinson’s disease (PD). A review of the literature revealed the heavy usage of sensor a...
The Computational Visual Media (CVM) conference series is intended to provide a prominent international forum for exchanging innovative research ideas and significant computational methodologies that either underpin or apply visual media.