Recently, segmentation-based scene text detection has drawn wide research interest due to its flexibility in describing scene text instances of arbitrary shapes, such as curved text. However, existing methods usually need complex post-processing stages to handle ambiguous labels, i.e., the labels of pixels near the text boundary, which may belong to either the text or the background. In this paper, we present a framework for segmentation-based scene text detection that learns from ambiguous labels. We use the label distribution learning method to handle the label ambiguity of text annotation, which achieves good performance without additional post-processing. Experiments on benchmark datasets demonstrate that our method produces better results than state-of-the-art methods for segmentation-based scene text detection.
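The abstract does not say how the label distributions are constructed. As a rough illustration only, one way to picture label ambiguity near a text boundary is to replace the hard text/background label of each pixel with a two-class distribution that depends on the pixel's signed distance to the boundary, and to compare predictions against it with a KL divergence. The function names, the sigmoid mapping, and the `bandwidth` parameter below are assumptions made for this sketch, not the authors' method.

```python
import math


def soft_label(signed_distance, bandwidth=2.0):
    """Return (p_text, p_background) for one pixel.

    signed_distance > 0 means the pixel lies inside the annotated text region,
    < 0 means outside, and 0 means exactly on the boundary. The sigmoid mapping
    and the bandwidth are illustrative assumptions.
    """
    # Clamp the exponent so math.exp cannot overflow for far-away pixels.
    x = max(-50.0, min(50.0, signed_distance / bandwidth))
    p_text = 1.0 / (1.0 + math.exp(-x))
    return (p_text, 1.0 - p_text)


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted) for two discrete distributions of equal length."""
    return sum(t * math.log((t + eps) / (p + eps)) for t, p in zip(target, predicted))


if __name__ == "__main__":
    # Pixels deep inside, on the boundary, and far outside the text region.
    for d in (6.0, 0.0, -6.0):
        print(d, soft_label(d))
```

With this kind of target, a pixel on the boundary is assigned equal mass for both classes instead of being forced into one of them, which is the intuition behind learning from ambiguous labels.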
Although deep neural networks have achieved remarkable success, they often exhibit a significant deficiency in reliable uncertainty calibration. This paper focuses on model calibratability, which assesses how amenable a model is to being well recalibrated post-hoc. We find that the widely used weight decay regularizer detrimentally affects model calibratability, subsequently leading to a decline in final calibration performance after post-hoc calibration. To identify the underlying causes of poor calibratability, we delve into the calibratability of intermediate features across the hidden layers. We observe a U-shaped trend in the calibratability of intermediate features from the bottom to the top layers, which indicates that over-compression of the top representation layers significantly hinders model calibratability. Based on these observations, this paper introduces a weak classifier hypothesis, i.e., given a weak classification head that has not been over-trained, the representation module can be better learned to produce more calibratable features. Consequently, we propose a progressively layer-peeled training (PLP) method to exploit this hypothesis, thereby enhancing model calibratability. Our comparative experiments show the effectiveness of our method, which improves model calibration and also yields competitive predictive performance. Copyright 2024 by the author(s)
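For readers unfamiliar with post-hoc calibration, the sketch below shows two standard ingredients: the expected calibration error (ECE), a common measure of miscalibration, and temperature scaling, a common post-hoc recalibration method. It does not implement PLP, and the abstract does not state which calibrator or metric the authors use; the function names, the equal-width binning, and the grid search over temperatures are assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax of logits divided by a temperature (T > 1 softens, T < 1 sharpens)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def expected_calibration_error(confidences, correct, n_bins=10):
    """Sample-weighted mean of |accuracy - confidence| over equal-width bins."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    n = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        ece += len(bucket) / n * abs(accuracy - avg_conf)
    return ece


def nll(logit_rows, labels, temperature):
    """Mean negative log-likelihood of the true labels at a given temperature."""
    losses = [-math.log(softmax(row, temperature)[y]) for row, y in zip(logit_rows, labels)]
    return sum(losses) / len(losses)


def fit_temperature(logit_rows, labels, grid=None):
    """Pick the temperature with the lowest NLL on held-out data (simple grid search)."""
    grid = grid or [0.5 + 0.1 * i for i in range(46)]  # 0.5, 0.6, ..., 5.0
    return min(grid, key=lambda t: nll(logit_rows, labels, t))
```

In this framing, a model is more "calibratable" when the ECE measured after fitting the temperature on held-out data is lower, regardless of its ECE before recalibration.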
The opportunistic network is a type of ad hoc network that relies on the chance encounters between nodes to transmit messages. It also uses store-and-carry-forward techniques for data transfer between nodes. Developin...
With the rapid development of the Internet of Things (IoT), the amount of data from intelligent devices is growing at an unprecedented scale. Meanwhile, machine learning (ML), which relies heavily on such data, is revolutionizing many aspects of our lives [1]. However, conventional centralized ML offers little scalability for efficiently processing this huge amount of data.
Opportunistic networks are a kind of ad hoc network that relies on chance encounters between nodes to transmit messages. Acting as an effective supplement to 4G and 5G networks in some special scenarios where hardware ...
The UAVs' deployment decision and task computation offloading decision in the UAV-assisted edge computing network significantly impact the operating efficiency of the edge network. On the basis of this, the Optimizati...
With the development of Live-Virtual-Constructive (LVC) simulation technology, numerous LVC simulation resources have been developed. To construct larger-scale LVC simulations, it is necessary to integrate existing...
1 Introduction and main […]. Video alignment, especially human action alignment, is a challenging task. Its purpose is to develop an algorithm to match the same process so that videos can be aligned. Despite the progress that has been made in image recognition, distinguishing between similar frames for alignment without context information remains a difficult problem.
Video-text retrieval (VTR) is an essential task in multimodal learning, aiming to bridge the semantic gap between visual and textual data. Effective video frame sampling plays a crucial role in improving retrieval performance, as it determines the quality of the visual content representation. Traditional sampling methods, such as uniform sampling and optical flow-based techniques, often fail to capture the full semantic range of videos, leading to redundancy and inefficiencies. In this work, we propose CLIP4Video-Sampling: Global Semantics-Guided Multi-Granularity Frame Sampling for Video-Text Retrieval, a global semantics-guided multi-granularity frame sampling strategy designed to optimize both computational efficiency and retrieval accuracy. By integrating multi-scale global and local temporal sampling and leveraging the CLIP (Contrastive Language-Image Pre-training) model’s powerful feature extraction capabilities, our method significantly outperforms existing approaches in both zero-shot and fine-tuned video-text retrieval tasks on popular datasets. CLIP4Video-Sampling reduces redundancy, ensures keyframe coverage, and serves as an adaptable pre-processing module for multimodal models.
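To make the contrast with uniform sampling concrete, the sketch below compares it with a simple feature-driven alternative: greedy farthest-point selection over per-frame feature vectors, which tends to skip near-duplicate frames. This is not the CLIP4Video-Sampling algorithm, whose details the abstract does not give. The function names are hypothetical, and the small hand-written vectors stand in for frame embeddings that would, in practice, come from an image encoder such as CLIP.

```python
import math


def uniform_sample(num_frames, k):
    """Indices of k frames, one from the centre of each of k equal segments."""
    return [int((i + 0.5) * num_frames / k) for i in range(k)]


def cosine_distance(a, b):
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def diverse_sample(features, k):
    """Greedy farthest-point selection: repeatedly add the frame whose feature
    is farthest (in cosine distance) from every frame selected so far."""
    n = len(features)
    selected = [0]  # start from the first frame (an arbitrary choice)
    min_dist = [cosine_distance(f, features[0]) for f in features]
    while len(selected) < min(k, n):
        candidates = [i for i in range(n) if i not in selected]
        nxt = max(candidates, key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, f in enumerate(features):
            min_dist[i] = min(min_dist[i], cosine_distance(f, features[nxt]))
    return sorted(selected)


if __name__ == "__main__":
    # Four near-duplicate frames followed by two visually distinct ones.
    feats = [[1, 0], [1, 0.01], [1, 0.02], [1, 0.01], [0, 1], [1, 1]]
    print(uniform_sample(len(feats), 3))
    print(diverse_sample(feats, 3))
```

On the toy input, uniform sampling spends most of its budget on the near-duplicate frames, while the feature-driven selection covers the distinct ones, which is the kind of redundancy the abstract refers to.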
A mobile social network (MSN) is an opportunistic network that considers the social attributes of nodes and also uses the 'store-carry-forward' model to transfer data between nodes. The community nature ...