ISBN (print): 9783031683114; 9783031683121
In image retrieval, efficient methods that pre-compute retrieval-related information and effective methods that rely on re-ranking have both been proposed, but achieving efficiency and effectiveness at the same time remains challenging. To this end, we propose a simple yet effective image retrieval framework, R-DiP (Re-ranking based Diffusion Pre-computation). It incorporates an effective re-ranking model into the pre-computation step of an existing efficient method, Offline Diffusion, which pre-computes the diffusion process in the offline step and performs a simple linear-combination-based retrieval in the online step. Experimental results on standard benchmarks show that R-DiP performs comparably to the state-of-the-art (SOTA) image retrieval method, SuperGlobal, while maintaining the efficiency of Offline Diffusion. Notably, on million-scale datasets, R-DiP improves mAP (mean Average Precision) by about 2.0% and reduces retrieval time by about 75% on average, surpassing SOTA methods. These results indicate that R-DiP is a promising solution to the efficiency-effectiveness trade-off in image retrieval and that it offers the flexibility to incorporate any advanced re-ranking method in the future.
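The online step described above, a linear combination of rows pre-computed offline, might be sketched as follows. This is only an illustration under stated assumptions: the function names, the dictionary layout, and the toy numbers are hypothetical and not taken from R-DiP or Offline Diffusion.

```python
def online_scores(query_sims, precomputed_rows):
    """Combine pre-computed rows with query-dependent weights.

    query_sims: {database_index: similarity of the query to that item},
        restricted to the query's nearest neighbours (hypothetical input).
    precomputed_rows: {database_index: list of scores over the whole
        database}, assumed to have been computed once in the offline step.
    """
    n = len(next(iter(precomputed_rows.values())))
    scores = [0.0] * n
    for idx, weight in query_sims.items():
        row = precomputed_rows[idx]
        for j in range(n):
            scores[j] += weight * row[j]
    return scores


def rank(scores):
    # Database indices sorted by descending score.
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


# Toy database of 4 images; rows for items 0 and 2 are assumed pre-computed.
rows = {
    0: [1.0, 0.6, 0.1, 0.0],
    2: [0.1, 0.2, 1.0, 0.7],
}
scores = online_scores({0: 0.9, 2: 0.3}, rows)
ranking = rank(scores)
```

The point of the sketch is that the online cost is a weighted sum over a few stored rows, so any expensive model used to produce those rows offline does not affect query time.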
ISBN (print): 9789819771806; 9789819771813
Many optimization problems arise in military applications, among which the weapon target assignment (WTA) problem is the most typical and the most widely studied. Many methods based on evolutionary algorithms have been studied for solving it. However, the quality of WTA solutions still leaves considerable room for improvement. We propose a method called the diversity genetic algorithm (DGA), which has three main components for handling WTA. A hybrid crossover strategy combining two operators is introduced to improve DGA's exploration performance. Levy flight mutation is used to control the mutation percentage of offspring chromosomes, which can improve DGA's exploitation. In addition, an enhancement mechanism based on the fitness of the best solutions and Logistic chaotic mapping is put forward to balance the performance of DGA. DGA is evaluated against five representative algorithms on twelve classical benchmark instances. Experimental results indicate that DGA achieves superior performance at a reasonable time cost.
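Two of the ingredients named above, the Logistic chaotic map and Levy flight steps, have standard textbook forms that can be sketched as below. The abstract does not say how DGA uses them, so `mutation_rate` and its parameters are purely hypothetical; the Levy step uses Mantegna's algorithm, which is one common choice and an assumption here.

```python
import math
import random


def logistic_map(x0, steps, r=4.0):
    """Iterate x_{n+1} = r * x_n * (1 - x_n) and return the sequence."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def levy_step(rng, beta=1.5):
    """Draw one heavy-tailed step using Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def mutation_rate(rng, base=0.1, scale=0.05):
    """Hypothetical rule: perturb a base rate by a Levy step, clamp to [0, 1]."""
    return min(1.0, max(0.0, base + scale * abs(levy_step(rng))))


chaos = logistic_map(0.2, 5)
rate = mutation_rate(random.Random(42))
```

Heavy-tailed steps mostly produce small perturbations with occasional large jumps, which is why they are often paired with a chaotic sequence that keeps parameters from settling into a fixed pattern.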
ISBN (digital): 9783031605970
ISBN (print): 9783031605963; 9783031605970
Reduced cost filtering is a technique that filters a constraint using the reduced costs of a linear program that encodes this constraint. Sellmann [16] shows that, when performing a Lagrangian relaxation of a constraint, suboptimal Lagrange multipliers can provide more filtering than optimal ones. Boudreault and Quimper [5] propose an algorithm that locally alters the Lagrange multipliers for the Weighted Circuit constraint to enhance filtering, achieving a speedup of 30%. We seek to design a similar algorithm for the AtMostNValue constraint. Building on the work of Cambazard and Fages [7] on this constraint, we use a subgradient algorithm that takes the reduced costs into consideration to push the Lagrange multipliers in the optimal filtering direction. We test our methods on the dominating queens problem and the p-median problem. On the first, we record an average speedup of 71%. On the second, there are three classes of instances. On the first two classes, we obtain average speedups of 33% and 8%, respectively. On the hardest class, we find better solutions than the previous algorithm on up to 13 of the 30 instances in the class.
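As a rough illustration of the two generic building blocks mentioned above, the sketch below shows the usual reduced-cost filtering rule for a minimisation problem and one projected subgradient update of the multipliers. It is not the algorithm of the paper or of the cited works; the function names, the data layout, and the toy values are assumptions.

```python
def reduced_cost_filter(domains, reduced_costs, lower_bound, upper_bound):
    """Keep only values whose reduced cost does not exceed the bound gap.

    Assigning var = val costs at least lower_bound + reduced_cost, so the
    value can be removed when that quantity exceeds upper_bound.
    """
    filtered = {}
    for var, values in domains.items():
        filtered[var] = {
            val for val in values
            if lower_bound + reduced_costs.get((var, val), 0.0) <= upper_bound
        }
    return filtered


def subgradient_step(multipliers, violations, step):
    """One projected subgradient update for relaxed inequality constraints."""
    return [max(0.0, lam + step * g) for lam, g in zip(multipliers, violations)]


domains = {"x": {1, 2, 3}}
reduced_costs = {("x", 1): 0.0, ("x", 2): 2.0, ("x", 3): 5.0}
filtered = reduced_cost_filter(domains, reduced_costs, 10.0, 13.0)
updated = subgradient_step([0.5, 0.2], [1.0, -1.0], 0.5)
```

Because the reduced costs depend on the multipliers, different multipliers with a similar bound can remove different values, which is the effect that makes suboptimal multipliers useful for filtering.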
ISBN (print): 9783031470752; 9783031470769
Coronary atherosclerosis is a leading cause of morbidity and mortality worldwide. It is often treated by placing stents in the coronary arteries. Inappropriately placed stents, or malappositions, can result in post-interventional complications. Intravascular Ultrasound (IVUS) imaging offers a potential solution by providing real-time endovascular guidance for stent placement. The signature of malapposition is very subtle and requires exploring second-order relationships between blood flow patterns, vessel walls, and stents. In this paper, we perform a comparative study of various deep learning methods and their feature extraction capabilities for building a malapposition detector. Our results highlight the importance of incorporating domain knowledge for improving performance, while also indicating the limitations of current systems in achieving clinically ready performance.
ISBN (print): 9783031705458; 9783031705465
In this paper, we investigate self-supervised pre-training methods for document text recognition. Large unlabeled datasets can now be collected for many research tasks, including text recognition, but annotating them is costly. Methods that exploit unlabeled data are therefore an active research topic. We study self-supervised pre-training methods based on masked label prediction using three different approaches: Feature Quantization, VQ-VAE, and Post-Quantized AE. We also investigate joint-embedding approaches with VICReg and NT-Xent objectives, for which we propose an image shifting technique that prevents a form of model collapse in which the model relies solely on positional encoding and completely ignores the input image. We perform our experiments on historical handwritten (Bentham) and historical printed datasets, mainly to investigate the benefits of the self-supervised pre-training techniques with different amounts of annotated target domain data. We use transfer learning as a strong baseline. The evaluation shows that self-supervised pre-training on data from the target domain is very effective, but it struggles to outperform transfer learning from closely related domains. This paper is one of the first studies exploring self-supervised pre-training in document text recognition, and we believe that it will become a cornerstone for future research in this area. We made our implementation of the investigated methods publicly available at https://***/DCGM/pero-pretraining.
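For readers unfamiliar with the NT-Xent objective named above, a minimal pure-Python sketch of its standard definition is given below. It is not the authors' implementation: the function name, the temperature value, and the toy embeddings are assumptions, and a practical version would operate on batched tensors.

```python
import math


def _normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def nt_xent(view_a, view_b, temperature=0.5):
    """NT-Xent loss for two lists of embeddings; view_a[i] pairs with view_b[i]."""
    z = [_normalize(v) for v in view_a + view_b]
    n = len(view_a)
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same sample
        # Temperature-scaled cosine similarity to every other embedding.
        logits = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(2 * n) if j != i
        }
        log_denominator = math.log(sum(math.exp(l) for l in logits.values()))
        total += log_denominator - logits[pos]
    return total / (2 * n)


aligned = nt_xent([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = nt_xent([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

The loss is lower when the two views of the same sample agree (`aligned`) than when each view matches the wrong sample (`swapped`). A model could also lower it using only input-independent cues such as position, which is the kind of shortcut the image shifting technique is said to prevent.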
ISBN (digital): 9783031127007
ISBN (print): 9783031126994; 9783031127007
In fingerprint-based authentication systems, cancelable fingerprint templates are generated to protect the fingerprint information. In this paper, we propose a novel cancelable fingerprint template using Visual Secret Sharing (VSS). Using VSS, each fingerprint image is encrypted into different shares. These shares are then stored in distinct databases and treated as the fingerprint template. Traditional VSS schemes suffer from pixel expansion and contrast reduction. We use grid-based VSS and data embedding mechanisms to overcome these limitations. The proposed fingerprint templates satisfy the ideal properties of cancelable templates, such as non-invertibility, diversity, and revocability, without altering the performance of the authentication system. To speed up template generation and reconstruction, we perform the operations on a general-purpose graphics processing unit (GPGPU). The experimental evaluation confirms that the reconstructed fingerprints achieve performance equivalent to that of the original fingerprints, with improved security.
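To make the idea of splitting an image into shares without pixel expansion concrete, here is a minimal sketch of a classic (2, 2) random-grid scheme on a binary image. It assumes that "grid-based VSS" refers to random grids, which the abstract does not confirm, and it omits the data embedding step entirely; all names are hypothetical.

```python
import random


def make_shares(secret, rng):
    """Split a binary image (1 = black) into two shares of the same size."""
    # The first share is pure noise and reveals nothing on its own.
    share1 = [[rng.randint(0, 1) for _ in row] for row in secret]
    # The second share copies share1 on white pixels and flips it on black ones.
    share2 = [
        [s1 ^ px for s1, px in zip(row1, row)]
        for row1, row in zip(share1, secret)
    ]
    return share1, share2


def stack(share1, share2):
    # Physically stacking transparencies acts as a pixel-wise OR.
    return [[a | b for a, b in zip(r1, r2)] for r1, r2 in zip(share1, share2)]


def xor_reconstruct(share1, share2):
    # Computational reconstruction recovers the secret exactly.
    return [[a ^ b for a, b in zip(r1, r2)] for r1, r2 in zip(share1, share2)]


secret = [[1, 0, 1], [0, 1, 0]]
s1, s2 = make_shares(secret, random.Random(0))
stacked = stack(s1, s2)
recovered = xor_reconstruct(s1, s2)
```

Each share has the same dimensions as the secret, so there is no pixel expansion. OR-stacking loses contrast on white pixels, whereas XOR reconstruction is exact.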
ISBN (print): 9783031683909; 9783031683916
Side channel evaluations benefit from sound characterisations of adversarial leakage models, which are the determining factor for attack success. Two questions are of interest: can we define and estimate a quantity that captures the ideal adversary (who knows all the distributions that are involved in an attack), and can we define and estimate a quantity that captures a concrete adversary (represented by a given leakage model)? Existing work has led to a proliferation of custom quantities to measure both types of adversaries, which can be data intensive to estimate in the ideal case, even for discrete side channels and especially when the number of dimensions in the side channel traces grows. In this paper, we show how to define the mutual information between carefully chosen variables of interest and how to instantiate a recently suggested mutual information estimator for practical estimation. We apply our results to real-world data sets and are the first to provide a mutual information-based characterisation of ideal and concrete adversaries utilising up to 30 data points.
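As background for the quantity discussed above, the sketch below computes the simple plug-in (empirical frequency) estimate of mutual information between two discrete variables. This is deliberately not the estimator used in the paper, which the abstract does not specify; the function name and toy samples are assumptions.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits from paired discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y), written with raw counts.
        mi += p_xy * math.log2(p_xy * n * n / (px[x] * py[y]))
    return mi


dependent = mutual_information([0, 1, 0, 1], [0, 1, 0, 1])
independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

The plug-in estimate needs enough samples to populate the joint histogram, which becomes data intensive as the number of trace dimensions grows; this is the practical difficulty the abstract alludes to.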
ISBN (digital): 9783031626876
ISBN (print): 9783031626869; 9783031626876
In a discussion on the computational complexity of "parameterized" NL (nondeterministic logarithmic-space complexity class), Syntactic NL, or succinctly SNL, was first introduced in 2017 as a "syntactically"-defined natural subclass of NL using a restricted form of second-order logic in close connection to the so-called linear space hypothesis. We further explore various properties of this complexity class SNL. In particular, we consider the expressibility of "complementary" problems of SNL problems. As a variant of SNL, we also study an optimization version of SNL, called MAXSNL, and its natural subclass, called MAXtSNL.
ISBN (print): 9789819755905; 9789819755912
Accurate identification of forest and grassland pests is crucial for ecosystem stability and biodiversity. Given the characteristics of pests in forest and grassland environments (a wide variety of species, color similarity to the background, small inter-class variability, and large intra-class variability), an improved lightweight target detection model, Pest-YOLO, is proposed. This model addresses the deficiencies of traditional models in terms of recognition accuracy and response speed. First, an online data enhancement method is introduced, applying different image transformation strategies to enable the model to learn more feature representations. Second, a multi-level feature fusion method is proposed and combined with the Ghost model to maintain adaptability while compressing the network size. Additionally, the SERes detection head is proposed to improve the network's ability to detect similar targets. Experimental results show that Pest-YOLO achieves the best detection performance on the PEST27 dataset, with 14.16 M fewer parameters and a 3% improvement in mAP@0.5 compared to the original YOLOv8 algorithm, demonstrating its effectiveness and potential for forest and grassland pest detection tasks.
ISBN (print): 9783031600111; 9783031600128
XR technology is ushering in spatial computing and outlines a clearer form for the immersive experience of the image: "ultimate cinema". The path from the birth of film to interactive film, and on to VR film and games, traces a revolutionary lineage of image technology, media, and immersive experience that ultimately points to immersive theater. From Minimalist space to public art, the physical liberation of performance art provides a theoretical lineage in art history in which the role of the audience is constantly enlarged and bodily experience is advanced. In the development of games, Positive Psychology and Maslow's Hierarchy of Needs provide theoretical support for the occurrence of immersive experience and have achieved remarkable results in practice. Finally, with the integration of technology and theory, immersive theater and the ultimate form of VR, the XR-based holodeck, are becoming more and more visible, allowing us to glimpse the opportunities and difficulties that future immersive experience may face. At the same time, we should remain aware of the original intention behind humanity's pursuit of immersive experience and avoid the domination of technology over people.