As a global marine disaster, red tides pose serious threats to marine ecology and the blue economy, making their monitoring crucial for preventing harmful algal blooms (HABs) and protecting the marine environment. In this study, satellite remote sensing is used to provide timely, large-scale, and continuous observation, overcoming the high cost and the spatial and temporal limitations of in situ monitoring. However, existing remote sensing-based methods often exhibit coarse segmentation granularity and suffer from high computational complexity. To overcome these challenges, we propose a novel bimodal multispectral dynamic offset binary quantization visual transformer (DoBi-SWiP-ViT) that uses the ViT for global feature aggregation and parameter quantization for efficient segmentation. With the bimodal Swin-ViT with unified perceptual parsing (UPP) architecture, our model integrates data from multiple spectral bands to achieve fine-grained segmentation of large-scale remote sensing images. Additionally, we introduce a dynamic magnitude offset binary quantization ViT block to reduce parameter redundancy and improve computational efficiency. We validate the performance of our model through extensive comparative experiments on high-resolution imagery datasets of sea surface red tides collected from different satellite platforms. The results show that the proposed DoBi-SWiP-ViT significantly improves the mean accuracy (mAcc) of the segmentation results: for the two test areas acquired from different satellite platforms, the improvements are 8.78% and 10.18%, respectively. This demonstrates the superior performance of our model in detecting red tides from high-resolution visible images, highlighting its effectiveness in capturing complex patterns and subtle features in multispectral imagery.
Building multiple hash tables is a very successful technique for indexing gigantic data sets, as it can guarantee both search accuracy and efficiency. However, most existing multitable indexing solutions lack informative hash codes and strong table complementarity, and therefore suffer heavily from table redundancy. To address the problem, we propose a complementary binary quantization (CBQ) method for jointly learning multiple tables and the corresponding informative hash functions in a centralized way. Based on CBQ, we further design a distributed learning algorithm (D-CBQ) to accelerate training over large-scale distributed data sets. The proposed (D-)CBQ exploits the power of prototype-based incomplete binary coding to align the data distributions in the original space and the Hamming space, and further uses the nature of multi-index search to jointly reduce the quantization loss. (D-)CBQ possesses several attractive properties, including extensibility for generating long hash codes in the product space and scalability with linear training time. Extensive experiments on two popular large-scale tasks, Euclidean and semantic nearest neighbor search, demonstrate that the proposed (D-)CBQ enjoys efficient computation, informative binary quantization, and strong table complementarity, which together help it significantly outperform the state of the art, with relative performance gains of up to 57.76%.
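To make the multi-table idea concrete, here is a minimal sketch of a lookup over several hash tables. It is not the CBQ algorithm: the binary codes are hand-written toy values rather than learned ones, and the names `build_tables` and `query` are hypothetical. It only shows why complementary tables matter: the second table returns a neighbor that the first one misses.

```python
from collections import defaultdict

def build_tables(codes_per_table):
    """codes_per_table[t][i] is the binary code (a '0'/'1' string) of item i in table t."""
    tables = []
    for codes in codes_per_table:
        table = defaultdict(list)
        for item_id, code in enumerate(codes):
            table[code].append(item_id)  # bucket items by their code
        tables.append(table)
    return tables

def query(tables, query_codes):
    """Union the buckets that the query falls into, one bucket per table."""
    candidates = set()
    for table, code in zip(tables, query_codes):
        candidates.update(table.get(code, []))
    return candidates

# Toy codes for four items in two tables (assumed values, not learned).
codes_per_table = [
    ["00", "01", "01", "11"],  # table 0
    ["10", "10", "00", "11"],  # table 1
]
tables = build_tables(codes_per_table)

# Table 0 yields items {1, 2}; table 1 adds item 0.
result = query(tables, ["01", "10"])
```

If both tables bucketed the items identically, the second lookup would add nothing, which is the redundancy the abstract describes.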
Hashing has proven to be an attractive technique for fast nearest neighbor search over big data. Compared with projection-based hashing methods, prototype-based ones have greater power to generate discriminative binary codes for data with complex intrinsic structure. However, existing prototype-based methods, such as spherical hashing and K-means hashing, still suffer from ineffective coding that uses the complete set of binary codes in a hypercube. To address this problem, we propose an adaptive binary quantization (ABQ) method that learns a discriminative hash function with prototypes associated with a small set of unique binary codes. Our alternating optimization adaptively and efficiently discovers the prototype set and a code set of varying size, which together robustly approximate the data relations. Our method generalizes naturally to the product space for long hash codes, and its training time is linear in the number of training samples. We further devise a distributed framework for large-scale learning, which can significantly speed up the training of ABQ in the distributed environments that are now widely deployed. Extensive experiments on four large-scale data sets (up to 80 million items) demonstrate that our method significantly outperforms state-of-the-art hashing methods, with relative performance gains of up to 58.84%.
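The encoding step of prototype-based hashing can be sketched as follows. This is only an illustration under assumptions: the prototypes and their codes are fixed by hand here, whereas ABQ learns both through alternating optimization, and the function names are hypothetical. Note that only three of the four possible 2-bit codes are used, in the spirit of coding with a subset of the hypercube.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def encode(point, prototypes, codes):
    """Return the binary code attached to the prototype nearest to `point`."""
    nearest = min(range(len(prototypes)),
                  key=lambda k: squared_distance(point, prototypes[k]))
    return codes[nearest]

def hamming(a, b):
    """Number of positions where two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

# Assumed toy prototypes and codes; "11" is deliberately left unused.
prototypes = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
codes = ["00", "01", "10"]

code_a = encode((0.9, 0.1), prototypes, codes)  # nearest to (1.0, 0.0)
code_b = encode((0.1, 0.8), prototypes, codes)  # nearest to (0.0, 1.0)
distance = hamming(code_a, code_b)
```

Search then compares points by the Hamming distance between their codes instead of by their original coordinates.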
ISBN: (print) 9781612848334
A noise removal algorithm for meteorological facsimile maps with zero-mean Gaussian noise is proposed based on binary ***, according to the time and location of the ship and the frequency loaded in the meteorological electrograph, the type and the geographic information of the meteorological facsimile map is *** the noised map subtracts the geographic information (also called the bottom map) after map ***, quantize the noised maps to binary values via optimal *** perform refining on key words and marks to obtain the refined map and add the subtracted bottom *** prove that the proposed algorithm has better performance for noise removal in meteorological facsimile maps.
The objective of large-scale object retrieval systems is to search for images that contain the target object in an image database. Whereas state-of-the-art approaches rely on global image representations to conduct searches, we consider many boxes per image as candidates and search locally within a picture. In this paper, a feature quantization algorithm called binary quantization is proposed. In binary quantization, a scale-invariant feature transform (SIFT) feature is quantized into a descriptive and discriminative bit-vector, which fits the classic inverted file structure for box indexing. The inverted file, which stores the bit-vector and the ID of the box in which the SIFT feature is located, is compact and can be loaded into main memory for efficient box indexing. We evaluate our approach on available object retrieval datasets. Experimental results demonstrate that the proposed approach is fast and achieves excellent search quality. Therefore, the proposed approach is an improvement over state-of-the-art approaches for object retrieval. (C) 2015 SPIE and IS&T
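A rough sketch of how a descriptor might be turned into a bit-vector and stored in an inverted file is given below. The abstract does not state the quantization rule, so thresholding each dimension at the descriptor's median is an assumption, as are the visual-word key `word_id`, the 4-dimensional toy descriptors (real SIFT descriptors have 128 dimensions), and all function names.

```python
from collections import defaultdict
from statistics import median

def to_bits(descriptor):
    """Pack a descriptor into an int: bit is 1 where the value exceeds the median."""
    m = median(descriptor)
    bits = 0
    for value in descriptor:
        bits = (bits << 1) | (1 if value > m else 0)
    return bits

def hamming(a, b):
    """Hamming distance between two bit-vectors stored as ints."""
    return bin(a ^ b).count("1")

# Inverted file: visual word ID -> list of (box ID, bit-vector) postings.
inverted_file = defaultdict(list)

def index_feature(word_id, box_id, descriptor):
    inverted_file[word_id].append((box_id, to_bits(descriptor)))

def search(word_id, descriptor, max_distance):
    """Return box IDs under the same word whose bit-vector is close to the query's."""
    q = to_bits(descriptor)
    return [box_id for box_id, bits in inverted_file.get(word_id, [])
            if hamming(q, bits) <= max_distance]

index_feature(7, "img1_box0", [10, 200, 30, 180])
index_feature(7, "img2_box3", [220, 15, 190, 20])
matches = search(7, [12, 190, 25, 170], max_distance=1)
```

Each posting holds only a box ID and a short bit-vector, which is consistent with the abstract's point that the inverted file is compact enough to keep in main memory.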
With the growing informatization of IoT devices, nonterrestrial networks (NTNs) are becoming increasingly important. NTNs, including air and space networks, face challenges such as high computational complexity, bandwidth requirements, and memory constraints. An intelligent automatic modulation classification (AMC) mechanism based on neural networks plays a pivotal role in enhancing spectrum efficiency, throughput, and link reliability. Past work in AMC has evolved from likelihood-based and feature-based methods to traditional machine learning techniques and, more recently, to deep neural networks (DNNs). However, existing DNN architectures pose challenges for NTNs due to their high computational complexity, bandwidth requirements, and memory consumption. To address these problems, we propose a spiking transformer-based model for AMC that exploits temporal dynamics for enhanced performance. Biologically inspired spiking neural networks let us exploit the sparse and binarized activations of spiking neurons, allowing us to build AMC models with high energy efficiency and high availability that can be used in NTN systems. Furthermore, we introduce a weight binarization method to reduce the model size, which further reduces the bandwidth and memory requirements of AMC in NTN edge deployment. Experimental results demonstrate the superiority of our approach over state-of-the-art methods, with the binarized model achieving comparable accuracy at a fraction of the size.
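The two ingredients named in the abstract, binary spiking activations and binarized weights, can be illustrated with a small sketch. Neither function is taken from the paper: the leaky integrate-and-fire (LIF) update with a hard reset and the sign-plus-mean-magnitude weight scheme are common textbook choices assumed here, and the parameter values are arbitrary.

```python
def lif_neuron(inputs, threshold=1.0, decay=0.5):
    """Leaky integrate-and-fire neuron: emits 1 when the membrane potential
    reaches the threshold, then resets the potential to zero."""
    potential = 0.0
    spikes = []
    for current in inputs:
        potential = decay * potential + current  # leak, then integrate
        if potential >= threshold:
            spikes.append(1)
            potential = 0.0  # hard reset after a spike
        else:
            spikes.append(0)
    return spikes

def binarize_weights(weights):
    """Replace each weight by its sign; keep one scale (mean magnitude) per tensor."""
    scale = sum(abs(w) for w in weights) / len(weights)
    signs = [1 if w >= 0 else -1 for w in weights]
    return signs, scale

spikes = lif_neuron([0.6, 0.8, 0.1, 1.2, 0.0])
signs, scale = binarize_weights([0.5, -0.25, 0.75, -1.0])
```

The spike train is sparse and binary, and each weight needs one bit plus a shared scale, which is where the energy and memory savings described in the abstract would come from.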
Recent years have witnessed an explosion of online multimedia data. For example, abundant photos recording every aspect of human life are available through social media. Among this tremendous number of photos, a significant fraction contains human faces, which are usually salient features of the photos. To understand and extract useful information from such a gigantic data corpus, efficient and effective retrieval algorithms are needed. Most face retrieval techniques rely on low-level image features to compare faces based on visual similarity. However, humans tend to simplify the recognition task by using attributes such as gender or race to help differentiate people at a higher semantic level. In this paper, we propose to use human attributes as high-level semantic cues to determine people's identities. To this end, we develop discriminative image features with attribute information encoded to achieve more accurate face image retrieval. To guarantee scalability, we propose a binary coding scheme for the proposed attribute-based features. A re-ranking step after initial retrieval is incorporated to further improve retrieval performance. We demonstrate the superiority of the proposed method over state-of-the-art methods on the LFW and PubFig face datasets. (C) 2015 Elsevier B.V. All rights reserved.
Pseudo-random number generators (PRNGs) have been widely used in digital image encryption and secure communication. This paper reports a novel PRNG based on a generalized Sprott-A system that is conservative. To validate whether the system can produce high-quality chaotic signals, we numerically investigate its conservative chaotic dynamics and its complexity based on the approximate entropy algorithm. In this PRNG, we first select an initial value as a key to generate a conservative chaotic sequence; a scrambling operation is then introduced to enhance the complexity of the sequence, which is quantized by the binary quantization method. The National Institute of Standards and Technology (NIST) statistical test suite is used to test the randomness of the scrambled sequence, and we also analyze its correlation, keyspace, key sensitivity, linear complexity, information entropy, and histogram. The numerical results show that the binary random sequence produced by the PRNG algorithm has a large keyspace, high sensitivity, and good randomness. Moreover, an improved finite precision period calculation (FPPC) algorithm is proposed to calculate the repetition rate of the sequence and to further discuss the relationship between the repetition rate and fixed-point accuracy; the proposed FPPC algorithm can be used to set the fixed-point notation for the proposed PRNG and to avoid degradation of the chaotic system due to limited data precision.
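The pipeline of "chaotic sequence, then binary quantization, then a randomness test" can be sketched in a few lines. Two substitutions are made and should be read as assumptions: the logistic map stands in for the generalized Sprott-A system (the logistic map is dissipative, not conservative, and the paper's scrambling step is omitted), and the fixed threshold of 0.5 is an illustrative quantization rule. The p-value follows the frequency (monobit) test from the NIST suite.

```python
import math

def logistic_sequence(x0, n, r=3.99):
    """Stand-in chaotic source: iterate x -> r * x * (1 - x) from the key x0."""
    xs = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        xs.append(x)
    return xs

def binary_quantize(xs, threshold=0.5):
    """Map each real sample to one bit by comparing it with a threshold."""
    return [1 if x > threshold else 0 for x in xs]

def monobit_p_value(bits):
    """Frequency (monobit) test: p-value for the balance of ones and zeros."""
    s = sum(2 * b - 1 for b in bits)       # +1 per one, -1 per zero
    s_obs = abs(s) / math.sqrt(len(bits))
    return math.erfc(s_obs / math.sqrt(2))

key = 0.123456789                          # initial value used as the key
bits = binary_quantize(logistic_sequence(key, 1000))
p_value = monobit_p_value(bits)
```

In the NIST suite, a sequence passes this particular test when the p-value is at least 0.01; a full evaluation runs the remaining tests in the suite as well.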
Binary neural networks (BNNs) use the sign function to binarize weights and activations, which requires gradient estimators to overcome its non-differentiability and inevitably introduces gradient errors during backpropagation. Although many hand-designed soft functions have been proposed as gradient estimators to better approximate gradients, their mechanism is not clear, and there are still large performance gaps between binary models and their full-precision counterparts. To address these issues and reduce gradient error, we propose to treat network binarization as a binary classification problem and to use a multi-layer perceptron (MLP) as the classifier in the forward pass and as the gradient estimator in the backward pass. Thanks to the MLP's theoretical capability to fit any continuous function, it can be adaptively learned to binarize networks and backpropagate gradients without any prior knowledge of soft functions. From this perspective, we further show empirically that even a simple linear function can outperform previous complex soft functions. Extensive experiments demonstrate that the proposed method yields surprisingly strong performance in both image classification and human pose estimation tasks. Specifically, we achieve 65.7% top-1 accuracy with ResNet-34 on the ImageNet dataset, an absolute improvement of 2.6%. Moreover, we use binarization as a lightweighting approach for pose estimation models and propose two well-designed binary pose estimation networks, SBPN and BHRNet. When evaluated on the challenging Microsoft COCO keypoint dataset, the proposed method enables binary networks to achieve an mAP of up to 60.6 for the first time. Experiments conducted on real platforms demonstrate that the BNN achieves a better balance between performance and computational complexity, especially when computational resources are extremely limited.
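For context, the baseline that the abstract argues against can be sketched as follows: the sign function in the forward pass, and a hand-designed estimator in the backward pass. The estimator shown is the widely used clipped straight-through estimator, chosen here as an assumption for illustration; it is not the MLP-based estimator the paper proposes, and the function names are hypothetical.

```python
def binarize(values):
    """Forward pass: sign function, mapping each value to +1.0 or -1.0."""
    return [1.0 if v >= 0 else -1.0 for v in values]

def ste_backward(values, upstream_grads):
    """Backward pass with a clipped straight-through estimator: the sign
    function is treated as the identity for |v| <= 1 and as flat elsewhere."""
    return [g if abs(v) <= 1.0 else 0.0
            for v, g in zip(values, upstream_grads)]

latent = [0.3, -1.7, -0.2, 2.4]            # full-precision latent weights
forward = binarize(latent)
grads = ste_backward(latent, [0.5, 0.5, -1.0, 2.0])
```

The true derivative of the sign function is zero almost everywhere, so any such estimator feeds an approximate gradient to the latent weights; the mismatch is the gradient error the abstract refers to.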
The quanta image sensor (QIS) is a binary imaging device envisioned to be the next-generation image sensor after CCD and CMOS. Equipped with a massive number of single-photon detectors, the sensor has a threshold q above which the number of arriving photons triggers a binary response "1", and "0" otherwise. Existing methods in the device literature typically assume that q = 1 uniformly. We argue that a spatially and temporally varying threshold can significantly improve the signal-to-noise ratio of the reconstructed image. In this paper, we present an optimal threshold design framework. We make two contributions. First, we derive a set of oracle results that theoretically characterize the maximum achievable performance. We show that the oracle threshold should match the underlying pixel intensity exactly. Second, we show that around the oracle threshold there exists a set of thresholds that give asymptotically unbiased reconstructions. The asymptotic unbiasedness exhibits a phase transition behavior, which allows us to develop a practical threshold update scheme using a bisection method. Experimentally, the new threshold design method achieves a better rate of convergence than existing methods.
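The thresholded binary measurement can be simulated with a short sketch. It assumes Poisson photon arrivals with mean `theta` per measurement and the convention that a count of at least q produces a "1"; both are modeling assumptions, not details given in the abstract. Under those assumptions and q = 1, the probability of a "0" is exp(-theta), which gives the closed-form estimate used in `estimate_intensity_q1`. The paper's threshold design and bisection update are not reproduced here.

```python
import math
import random

def poisson_sample(rng, mean):
    """Draw one Poisson variate with Knuth's multiplication method."""
    limit = math.exp(-mean)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= limit:
            return count
        count += 1

def binary_response(count, q):
    """One-bit measurement: 1 if at least q photons arrived, else 0."""
    return 1 if count >= q else 0

def simulate(theta, q, n, seed=0):
    """Generate n one-bit measurements of a pixel with mean photon count theta."""
    rng = random.Random(seed)
    return [binary_response(poisson_sample(rng, theta), q) for _ in range(n)]

def estimate_intensity_q1(bits):
    """Invert P(bit = 0) = exp(-theta), valid for q = 1 only."""
    fraction_ones = sum(bits) / len(bits)
    if fraction_ones >= 1.0:
        return math.inf  # saturated: every measurement fired
    return -math.log(1.0 - fraction_ones)

bits = simulate(theta=0.7, q=1, n=20000)
theta_hat = estimate_intensity_q1(bits)
```

With q = 1 and a bright pixel, almost every bit is 1 and the estimate saturates, which gives some intuition for why a threshold matched to the intensity would help.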