Physics-informed neural networks (PINNs) have recently been shown to be effective for the numerical solution of differential equations, with the advantage of requiring little real labelled data. However, the perfor...
Spectral Kernel Networks (SKNs) have emerged as a promising approach in machine learning, combining the solid theoretical foundations of spectral kernels with the representational power of hierarchical architectures. At their core, the spectral density function plays a pivotal role by revealing essential patterns in data distributions, thereby offering insight into the underlying structure of real-world tasks. Nevertheless, prevailing designs of the spectral density often overlook the intricate interactions within data structures. As a result, they leave large regions of the hypothesis space unexplored, which limits the performance of SKNs. This paper addresses these issues with a novel approach, the Copula-Nested Spectral Kernel Network (CokeNet). Concretely, we first redefine the spectral density in the form of copulas to enhance the diversity of spectral densities. Next, the specific expression of the copula module is designed to allow complex dependence structures to be uncovered. Finally, a unified kernel network is proposed by integrating the corresponding spectral kernel and the copula module. Through rigorous theoretical analysis and experimental verification, CokeNet demonstrates superior performance over state-of-the-art (SOTA) algorithms in the field. Copyright 2024 by the author(s)
In recent years, the merging of vast datasets with powerful computational resources has led to the emergence of large pre-trained models in the field of deep learning. However, the common practices often overgeneraliz...
Image segmentation is a crucial task in the field of computer vision. Markov random field (MRF)-based image segmentation methods can effectively capture intricate relationships among pixels. However, MRF typically req...
Deep learning has advanced through the combination of large datasets and computational power, leading to the development of large pretrained models such as Vision Transformers (ViTs). However, these models often assume a one-size-fits-all utility and cannot initialize models at elastic scales tailored to the resource constraints of specific downstream tasks. To address these issues, we propose Probabilistic Expansion from LearnGene (PEG) for mixture sampling and elastic initialization of Vision Transformers. Specifically, PEG uses a probabilistic mixture approach to sample Multi-Head Self-Attention layers and Feed-Forward Networks from a large ancestry model into a more compact part termed the learngene. Theoretically, we demonstrate that these learngenes can approximate the parameter distribution of the original ancestry model, thereby preserving its significant knowledge. Next, PEG expands the sampled learngenes through non-linear mapping, enabling the initialization of descendant models with elastic scales to suit various resource constraints. Our extensive experiments demonstrate the effectiveness of PEG, which outperforms traditional initialization strategies. Copyright 2024 by the author(s)
Traditional supervised medical image segmentation models require large amounts of labeled data for training; however, obtaining such large-scale labeled datasets in the real world is extremely challenging. Recent semi-...
Massive amounts of multimedia content are now exchanged in daily life, and tampered images are flooding social networks. Tampering detection is therefore becoming increasingly important for multi...
Vision Transformers (ViTs) are widely used in a variety of applications, while they usually have a fixed architecture that may not match the varying computational resources of different deployment environments. Thus, ...
Authors:
You, Shuai; Chen, Cuiqun; Feng, Yujian; Liu, Hai; Ji, Yimu; Ye, Mang
School of Internet of Things, Nanjing, China
Anhui University, School of Computer Science and Technology, Hefei, China
South China Normal University, School of Computer, Guangdong, China
NJUPT, School of Computer Science, Nanjing, China
Wuhan University, National Engineering Research Center for Multimedia Software, Hubei Key Laboratory of Multimedia and Network Communication Engineering, Institute of Artificial Intelligence, School of Computer Science, Wuhan 430072, China
Text-based Person Retrieval (TPR) plays a pivotal role in video surveillance systems for safeguarding public safety. As a fine-grained retrieval task, TPR faces the significant challenge of precisely capturing highly ...
Generating the periodic structure of stable materials is a long-standing challenge for the material design community. This task is difficult because stable materials only exist in a low-dimensional subspace of all pos...