With the rapid development of integrated circuits, automatic detection of PCB defects has become a difficult task in the electronics industry. A good detection method can effectively improve production efficiency and reduce th...
Purpose: Bronchoscopic intervention is a widely used clinical technique for pulmonary diseases, which requires an accurate and topologically complete airway map for its localization and guidance. The airway map could be...
ISBN: (print) 9781728176055; 9781728176062
Learning reliable motion representations between consecutive frames, such as optical flow, has proven to greatly benefit video understanding. However, the TV-L1 method, an effective optical flow solver, is time-consuming, and caching the extracted optical flow is expensive in storage. To fill the gap, we propose UF-TSN, a novel end-to-end action recognition approach enhanced with an embedded lightweight unsupervised optical flow estimator. UF-TSN estimates motion cues from adjacent frames in a coarse-to-fine manner and focuses on small displacements at each level by extracting a feature pyramid and warping one frame's features toward the other's according to the flow estimated at the previous level. Because action datasets lack labeled motion, we constrain the flow prediction with multi-scale photometric consistency and edge-aware smoothness. Compared with state-of-the-art unsupervised motion representation learning methods, our model achieves better accuracy while maintaining efficiency, and it is competitive with some supervised or more complicated approaches.
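The abstract names edge-aware smoothness without giving a formula. As a minimal sketch, assuming a common first-order form (flow gradients weighted by `exp(-alpha * |image gradient|)`), the term could look like the following. The function name `edge_aware_smoothness`, the weight `alpha`, and the single-component, grayscale setup are illustrative assumptions, not details taken from UF-TSN.

```python
import math


def edge_aware_smoothness(flow, image, alpha=10.0):
    """Average edge-weighted flow gradient over neighbouring pixel pairs.

    flow, image: 2-D lists (H x W) of floats. `flow` is one flow component
    and `image` is grayscale. `alpha` is a hypothetical edge sensitivity.
    """
    h, w = len(flow), len(flow[0])
    total, count = 0.0, 0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:  # horizontal neighbour
                df = abs(flow[y][x + 1] - flow[y][x])
                di = abs(image[y][x + 1] - image[y][x])
                total += df * math.exp(-alpha * di)
                count += 1
            if y + 1 < h:  # vertical neighbour
                df = abs(flow[y + 1][x] - flow[y][x])
                di = abs(image[y + 1][x] - image[y][x])
                total += df * math.exp(-alpha * di)
                count += 1
    return total / count if count else 0.0


# Toy image with a vertical edge between columns 1 and 2.
image = [[0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0]]
# Flow discontinuity on the image edge: barely penalised.
flow_on_edge = [[0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0]]
# Flow discontinuity inside a flat image region: fully penalised.
flow_off_edge = [[0.0, 1.0, 1.0, 1.0], [0.0, 1.0, 1.0, 1.0]]

loss_on_edge = edge_aware_smoothness(flow_on_edge, image)
loss_off_edge = edge_aware_smoothness(flow_off_edge, image)
```

In this toy case the flow jump that coincides with the image edge costs far less than the same jump in a flat region, which is the behaviour an edge-aware term is meant to encourage.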
The hypergraph, an expressive structure flexible enough to model higher-order correlations among entities, has recently attracted increasing attention from various research domains. Despite the success of Graph Neura...
Optical flow estimation is an essential step for many real-world computer vision tasks. Existing deep networks have achieved satisfactory results by mostly employing a pyramidal coarse-to-fine paradigm, where a key pr...
Personalized federated learning aims to address data heterogeneity across local clients in federated learning. However, current methods blindly incorporate either full model parameters or predefined partial parameters...
The generative adversarial network (GAN) was first proposed in 2014. This kind of network model is a machine learning system that can learn to measure a given distribution of data, and one of its most important applications is style transfer. Style transfer is a class of vision and graphics problems where the goal is to learn the mapping between an input image and an output image. CYCLE-GAN is a classic GAN model with a wide range of application scenarios in style transfer. With its unsupervised learning characteristics, the mapping between an input image and an output image is easy to learn. However, it is difficult for CYCLE-GAN to converge and generate high-quality images. In order to solve this problem, spectral normalization is introduced into each convolutional kernel of the ***. With spectral normalization added, the convolutional kernel satisfies the Lipschitz stability constraint and its value is limited to [0, 1], which promotes the training process of the proposed model. In addition, we use a pretrained model (VGG16) to control the loss of image content in the position of the l1 loss. To avoid overfitting, both an l1 regularization term and an l2 regularization term are used in the objective loss function. In terms of Frechet Inception Distance (FID) score evaluation, our proposed model achieves outstanding performance and preserves more discriminative ***. The results show that the proposed model converges faster and achieves better FID scores than the state of the art.
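The abstract does not say how spectral normalization is computed. A minimal sketch of the usual approach, assuming the convolutional kernel has already been flattened into a 2-D matrix (rows = output channels) and using power iteration to estimate the largest singular value, is shown below. The helper names (`spectral_norm`, `spectral_normalize`) and the iteration count are hypothetical and are not taken from the paper.

```python
import math
import random


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def normalize(v, eps=1e-12):
    """Scale v to unit Euclidean length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / (n + eps) for x in v]


def spectral_norm(W, n_iters=50, seed=0):
    """Estimate the largest singular value of W by power iteration."""
    rng = random.Random(seed)
    Wt = [list(col) for col in zip(*W)]  # transpose of W
    u = normalize([rng.gauss(0.0, 1.0) for _ in W])
    v = normalize(matvec(Wt, u))
    for _ in range(n_iters):
        u = normalize(matvec(W, v))
        v = normalize(matvec(Wt, u))
    # sigma is approximately u^T W v
    return sum(a * b for a, b in zip(u, matvec(W, v)))


def spectral_normalize(W, n_iters=50):
    """Divide W by its estimated spectral norm."""
    sigma = spectral_norm(W, n_iters)
    return [[w / sigma for w in row] for row in W]


# Toy "flattened kernel" whose largest singular value is 3.
W = [[3.0, 0.0], [0.0, 1.0]]
W_sn = spectral_normalize(W)
sigma_before = spectral_norm(W)
sigma_after = spectral_norm(W_sn)
```

After normalization the matrix has a spectral norm of about 1, so the corresponding linear map is 1-Lipschitz. Deep learning frameworks typically run only one power-iteration step per training update and reuse the vector `u` between updates; the fixed 50 iterations here are just for a standalone demonstration.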
With the development of neural networks and the increasing popularity of autonomous driving, the calibration of the LiDAR and the camera has attracted more and more attention. This calibration task is multi-modal, wher...
In this paper, a self-attention-based Vision Transformer (VIT) method is introduced to estimate human head pose parameters. First, the head pose image is divided into 32×32 patches, and each image patch is regarded as...