Traditional visual Simultaneous Localization and Mapping (SLAM) algorithms work well in static scenes, but mismatches occur in dynamic scenes, which makes the positioning and mapping of the SLAM...
Blockchain technologies offer a promising way to implement inter-organizational processes. Most current research translates the execution logic of process models into smart contracts, which can run independently on the blockchain without an external process engine. However, these approaches usually suffer from execution and storage costs, since the translation must be done when the processes are deployed. In this paper, we customize a process engine that executes inter-organizational business processes via a blockchain-style procedure: checking the validity of transactions, adding the valid transactions to the blockchain through the consensus mechanism, and then updating the process states according to the committed transactions. We then build a blockchain system by embedding the customized process engine into the blockchain nodes. Moreover, to enable interaction between the inter-organizational processes running on the blockchain and services outside it, we propose a blockchain-based approach for service registration, binding, and invocation, and design a lease-based concurrency control protocol that logically isolates transactions from each other when they invoke services simultaneously. Finally, we implement a prototype system based on the permissioned blockchain platform Hyperledger Fabric and the process engine Activiti. The experimental results show that the proposed blockchain system can execute inter-organizational processes correctly and efficiently.
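The three-step procedure named in this abstract (validate, commit, update state) can be illustrated with a minimal sketch. This is not the authors' engine: the class `ToyProcessEngine`, the linear task order, and the task names are hypothetical, and the consensus step is replaced by a plain append to an in-memory list.

```python
from dataclasses import dataclass, field


@dataclass
class Transaction:
    process_id: str
    task: str  # task the sender claims to have completed


@dataclass
class ToyProcessEngine:
    """Hypothetical engine: each process is a fixed, linear sequence of tasks."""
    task_order: list
    state: dict = field(default_factory=dict)   # process_id -> index of next task
    ledger: list = field(default_factory=list)  # committed transactions, in order

    def is_valid(self, tx: Transaction) -> bool:
        # Step 1: a transaction is valid only if it completes the next expected task.
        next_index = self.state.get(tx.process_id, 0)
        return (next_index < len(self.task_order)
                and self.task_order[next_index] == tx.task)

    def submit(self, tx: Transaction) -> bool:
        if not self.is_valid(tx):
            return False
        # Step 2: commit. A real system would run consensus here (assumption:
        # this sketch simply appends to a local list).
        self.ledger.append(tx)
        # Step 3: update the process state from the committed transaction.
        self.state[tx.process_id] = self.state.get(tx.process_id, 0) + 1
        return True


engine = ToyProcessEngine(task_order=["order", "ship", "pay"])
results = [
    engine.submit(Transaction("p1", "order")),
    engine.submit(Transaction("p1", "pay")),   # out of order, rejected
    engine.submit(Transaction("p1", "ship")),
]
print(results)  # [True, False, True]
```

Only valid transactions reach the ledger, and the process state is derived solely from committed transactions, which is the property the blockchain-style procedure relies on.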
Federated Learning, as a popular paradigm for collaborative training, is vulnerable to privacy attacks. Different privacy levels regarding users’ attitudes need to be satisfied locally, while a strict privacy gu...
Visual grounding aims to ground an image region through natural language, which heavily relies on cross-modal alignment. Most existing methods transfer visual/linguistic knowledge separately by fully fine-tuning uni-m...
Zero-shot learning aims to recognize unseen classes using samples from seen classes as the training set. It is challenging because feature representations of unseen-class samples are unavailable. Existing methods transfer the mapping from seen classes to unseen classes using class correlation as a bridge, with semantic representations used to discriminate between classes. However, the unavailability of visual representations for unseen classes and the insufficient discriminative power of semantic representations make zero-shot learning difficult. Therefore, visual representations are learned as complements to the semantic representations to construct a multi-modal knowledge graph (KG), and a zero-shot learning method based on the multi-modal KG is proposed in this paper. Specifically, a semantic KG is introduced to capture the correlation between classes, and with this correlation, the visual feature representations of all classes are learned. Then, the discriminative visual representations and the semantic representations are used together to construct a multi-modal KG. With the multi-modal KG, the classifier for seen classes is transferred to unseen classes. Extensive experimental results show the effectiveness of our method.
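The general idea of transferring a seen-class classifier to unseen classes through class correlation can be sketched as follows. This is a deliberately simplified stand-in, not the multi-modal KG method of the abstract: it assumes linear classifiers, uses cosine similarity between semantic vectors as the correlation, and all names and data (`transfer_classifier`, the toy vectors) are made up for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def transfer_classifier(unseen_semantic, seen_semantics, seen_weights):
    """Build an unseen-class weight vector as a similarity-weighted
    average of seen-class weight vectors (assumed scheme, for illustration)."""
    # Negative similarities are clipped to zero so they do not contribute.
    sims = {c: max(cosine(unseen_semantic, s), 0.0)
            for c, s in seen_semantics.items()}
    total = sum(sims.values())
    if total == 0:
        raise ValueError("unseen class is not similar to any seen class")
    dim = len(next(iter(seen_weights.values())))
    return [sum(sims[c] * seen_weights[c][i] for c in sims) / total
            for i in range(dim)]


# Toy, made-up data: 2-d semantic vectors and 2-d linear classifier weights.
seen_semantics = {"horse": [1.0, 0.0], "tiger": [0.0, 1.0]}
seen_weights = {"horse": [2.0, 0.0], "tiger": [0.0, 2.0]}
zebra_weights = transfer_classifier([1.0, 1.0], seen_semantics, seen_weights)
print(zebra_weights)
```

Because the toy unseen class is equally similar to both seen classes, its weight vector is the plain average of the two seen-class weight vectors.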
Most modern recommender systems predict users’ preferences with two components: user and item embedding learning, followed by the user-item interaction modeling. By utilizing the auxiliary review information accompan...
ISBN: (print) 9781713871088
Vision and diverse languages are important information sources in our living world. A model that understands multiple modalities and multiple languages can be applied to a wider range of real-life scenarios. To build such a multimodal and multilingual model, existing works try to combine vision-language data from multiple languages in pre-training. However, due to the large number of languages, these works often require huge computing resources and cannot be flexibly extended to new languages. In this work, we propose a Multi-Lingual Acquisition (MLA) framework that can easily empower a monolingual Vision-Language Pre-training (VLP) model with multilingual capability. Specifically, we design a lightweight language acquisition encoder based on state-of-the-art monolingual VLP models. We further propose a two-stage training strategy to optimize the language acquisition encoder, namely the Native Language Transfer stage and the Language Exposure stage. With much less multilingual training data and computing resources, our model achieves state-of-the-art performance on multilingual image-text and video-text retrieval benchmarks.
Cross-network node classification aims to use a labeled source network to classify nodes in an unlabeled target network. Most existing cross-network node classification methods learn network representations by capturing node neighborhoods and train the classifier on these representations. Their performance depends heavily on high-quality neighborhoods in the network. However, in applications, node degrees generally follow a long-tail distribution, i.e., a significant proportion of nodes are tail nodes with sparse neighborhoods, which poses a challenge to existing methods. To this end, a structure similarity graph method for cross-network node classification (SCNC) is proposed in this paper. First, potential links between nodes are predicted with a structural similarity metric to construct a structure similarity graph, which enriches the neighborhoods of tail nodes. Then, embedding representations of the structure similarity graph are learned to capture more neighborhood information. Finally, adversarial learning is used to learn domain-invariant representations that address cross-network divergence. Extensive experimental results show that SCNC outperforms state-of-the-art methods.
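The first step, predicting potential links with a structural similarity metric to enrich tail-node neighborhoods, can be sketched as below. The abstract does not say which metric SCNC uses, so this example assumes Jaccard similarity of neighbor sets; the function name `enrich_tail_nodes`, the degree cutoff, and the similarity threshold are all hypothetical.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two neighbor sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def enrich_tail_nodes(adj: dict, max_tail_degree: int = 1,
                      threshold: float = 0.5) -> dict:
    """Return a copy of adj with predicted links added for low-degree nodes."""
    enriched = {node: set(neigh) for node, neigh in adj.items()}
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # already linked in the original graph
        # Only pairs involving at least one tail node are considered.
        is_tail = (len(adj[u]) <= max_tail_degree
                   or len(adj[v]) <= max_tail_degree)
        # Similarity is computed on the original graph, not the enriched one.
        if is_tail and jaccard(adj[u], adj[v]) >= threshold:
            enriched[u].add(v)
            enriched[v].add(u)
    return enriched


graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},  # tail node sharing neighbor "a" with "b" and "c"
    "e": {"f"},
    "f": {"e"},
}
new_graph = enrich_tail_nodes(graph)
print(sorted(new_graph["d"]))  # ['a', 'b', 'c']
```

Here the tail node `d` gains links to `b` and `c`, with which it shares a neighbor, while the isolated pair `e` and `f` is left unchanged.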
Label noise constitutes a major challenge for facial expression recognition in the wild due to the ambiguity of facial expressions worsened by low-quality images. To deal with this problem, we propose a simple but eff...
Kolmogorov-Arnold Networks (KAN) are an emerging neural network architecture in machine learning. The research community has shown great interest in whether KAN can be a promising alternative to the commonly used M...