Weeds are among the most damaging agricultural pests: they have a major influence on [...], drive up production costs through crop losses, and affect worldwide agricultural [...]. The significance of this concern has motivated the research community to explore the use of technology for detecting weeds at early stages, supporting farmers in agricultural [...]. Several weed-detection methods have been proposed for these fields; however, these algorithms still face challenges, as they were evaluated only under controlled [...]. In this paper, a weed image analysis approach is therefore proposed for the weed [...] system. In this system, a homomorphic filter is exploited for preprocessing to diminish environmental [...]. Then, for feature extraction, an adaptive feature extraction method is proposed that exploits edge detection. The proposed technique estimates the directions of the edges while accounting for non-maximum suppression. The method has several benefits, including its ease of use and its ability to extend to other types of [...]. Low-level details in the form of features are extracted to identify weeds, and additional techniques for detecting cultured weeds are utilized if [...]. In the processing of weed images, certain edges may be verified as a step function, and our technique may outperform other operators such as gradient [...]. The relevant details are extracted to generate a feature vector that is then given to a classifier for weed [...]. Finally, the features are used in logistic regression for weed [...]. The model was assessed, and the logistic regression accurately identified different kinds of weed images in naturalistic [...]. The proposed approach attained a weighted average recognition rate of 98.5% on the weed images [...]. It is therefore assumed that the proposed approach might help in weed classification s...
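The homomorphic preprocessing step mentioned above can be sketched as follows. This is a generic homomorphic filter (log transform, Gaussian high-frequency-emphasis filtering in the Fourier domain, inverse transform), not the authors' exact implementation; all parameter values and the toy image are illustrative assumptions.

```python
import numpy as np

def homomorphic_filter(img, gamma_low=0.5, gamma_high=1.5, cutoff=15.0):
    """Suppress slowly varying illumination (low frequencies) and keep
    reflectance detail (high frequencies) by filtering the log-image."""
    img = img.astype(np.float64)
    log_img = np.log1p(img)                  # log turns illumination*reflectance into a sum
    F = np.fft.fftshift(np.fft.fft2(log_img))
    rows, cols = img.shape
    u = np.arange(rows) - rows / 2
    v = np.arange(cols) - cols / 2
    D2 = u[:, None] ** 2 + v[None, :] ** 2   # squared distance from the DC component
    # Gaussian high-frequency-emphasis transfer function
    H = (gamma_high - gamma_low) * (1 - np.exp(-D2 / (2 * cutoff ** 2))) + gamma_low
    filtered = np.real(np.fft.ifft2(np.fft.ifftshift(H * F)))
    return np.expm1(filtered)                # undo the log transform

# Tiny demo: a flat scene dominated by a strong illumination gradient
x = np.linspace(1, 4, 64)
scene = np.outer(x, x) * 50.0
out = homomorphic_filter(scene)              # gradient is strongly attenuated
```

A real pipeline would apply this per channel (or on intensity only) before edge-based feature extraction.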
This special feature issue covers the intersection of topical areas in artificial intelligence (AI)/machine learning (ML) and optics. The papers broadly span the current state-of-the-art advances in areas including image recognition, signal and image processing, machine inspection/vision and automotive, as well as areas of traditional optical sensing, interferometry and imaging. (C) 2022 Optica Publishing Group
Forest fires are natural disasters that are difficult to control and have a very wide scope, threatening forest ecosystems [1]. In Indonesia itself, forest area decreases every year; one of the causes of the reduction ...
Line segment matching in two or multiple views is helpful for 3D reconstruction and pattern recognition. To fully utilize the geometry constraints of different features for line segment matching, a novel graph-based algorithm denoted GLSM (Graph-based Line Segment Matching) is proposed in this paper, which includes: (1) the employment of three geometry types, i.e., homography, epipolar, and trifocal tensor, to constrain line and point candidates across views; (2) a method for unifying different geometry constraints into a line-point association graph for two or multiple views; and (3) a set of procedures for ranking, assigning, and clustering with the line-point association graph. The experimental results indicate that GLSM can obtain sufficient matches with satisfactory accuracy in both two and multiple views. Moreover, GLSM can be employed on large image datasets. The implementation of GLSM will be available soon at https://***/research/.
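Of the three geometry types GLSM combines, the epipolar constraint is the simplest to illustrate. The sketch below scores a candidate segment match by the epipolar distance of corresponding endpoints; the function names and the rectified-stereo fundamental matrix in the demo are illustrative assumptions, not GLSM's actual code.

```python
import numpy as np

def epipolar_distance(x1, x2, F):
    """Distance from point x2 (view 2) to the epipolar line F @ x1 induced
    by point x1 (view 1); points are homogeneous 3-vectors."""
    l = F @ x1
    return abs(l @ x2) / np.hypot(l[0], l[1])

def line_match_cost(seg1, seg2, F):
    """Illustrative cost for a candidate segment match: mean epipolar distance
    of corresponding endpoints (the kind of constraint a graph-based matcher
    could encode as an edge weight in the association graph)."""
    return float(np.mean([epipolar_distance(p, q, F) for p, q in zip(seg1, seg2)]))

# Rectified-stereo fundamental matrix: epipolar lines are horizontal,
# so corresponding points must share the same y coordinate.
F = np.array([[0., 0.,  0.],
              [0., 0., -1.],
              [0., 1.,  0.]])
seg_v1 = [np.array([10., 20., 1.]), np.array([30., 40., 1.])]
good = [np.array([15., 20., 1.]), np.array([35., 40., 1.])]   # same y: consistent
bad  = [np.array([15., 25., 1.]), np.array([35., 46., 1.])]   # shifted vertically

cost_good = line_match_cost(seg_v1, good, F)   # 0.0 for a geometrically consistent match
cost_bad = line_match_cost(seg_v1, bad, F)
```

GLSM would combine such scores with homography and trifocal-tensor terms before graph clustering.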
Image classification is a hot topic in the field of pattern recognition and artificial intelligence. When there is apparent inter-class similarity and intra-class diversity, such as in the area of remote sensing, imag...
Aerial scene classification is a challenging problem in understanding high-resolution remote sensing images. Most recent aerial scene classification approaches are based on Convolutional Neural Networks (CNNs). These CNN models are trained on a large amount of labeled data and the de facto practice is to use RGB patches as input to the networks. However, the importance of color within the deep learning framework is yet to be investigated for aerial scene classification. In this work, we investigate the fusion of several deep color models, trained using color representations, for aerial scene classification. We show that combining several deep color models significantly improves the recognition performance compared to using the RGB network alone. This improvement in classification performance is, however, achieved at the cost of a high-dimensional final image representation. We propose to use an information theoretic compression approach to counter this issue, leading to a compact deep color feature set without any significant loss in accuracy. Comprehensive experiments are performed on five remote sensing scene classification benchmarks: UC-Merced with 21 scene classes, WHU-RS19 with 19 scene types, RSSCN7 with 7 categories, AID with 30 aerial scene classes, and NWPU-RESISC45 with 45 categories. Our results clearly demonstrate that the fusion of deep color features always improves the overall classification performance compared to the standard RGB deep features. On the large-scale NWPU-RESISC45 dataset, our deep color features provide a significant absolute gain of 4.3% over the standard RGB deep features.
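The fusion-then-compression pipeline can be sketched as below. The PCA step is only a simple stand-in for the paper's information-theoretic compression, and the three hypothetical color-model feature sets and all dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 128                               # images x per-model feature dim (toy sizes)
# Hypothetical deep features from three color-model networks (e.g. RGB, LAB, HSV)
feats = [rng.normal(size=(n, d)) for _ in range(3)]

fused = np.concatenate(feats, axis=1)         # late fusion -> high-dimensional (384-D)

# Compress the fused representation. PCA via SVD is used here purely for
# illustration; the paper uses an information-theoretic compression instead.
mu = fused.mean(axis=0)
U, S, Vt = np.linalg.svd(fused - mu, full_matrices=False)
k = 64
compact = (fused - mu) @ Vt[:k].T             # compact deep color feature set
retained = float(np.sum(S[:k] ** 2) / np.sum(S ** 2))   # fraction of variance kept
```

The compact features would then feed a standard classifier, trading a small accuracy loss for a much smaller representation.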
ISBN (print): 9781450397056
Semantic segmentation of remote sensing images usually faces the problems of unbalanced foreground-background, large variation of object scales, and significant similarity of different classes. The FCN-based fully convolutional encoder-decoder architecture seems to have become the standard for semantic segmentation, and this architecture is also prevalent for remote sensing images. However, because of the limitations of CNNs, the encoder cannot obtain global contextual information, which is extraordinarily important for the semantic segmentation of remote sensing images. In this paper, by contrast, the CNN-based encoder is replaced by a Swin Transformer to obtain rich global contextual information. Besides, for the CNN-based decoder, we propose a multi-level connection module (MLCM) to fuse high-level and low-level semantic information, helping feature maps obtain more semantic information, and use a multi-scale upsample module (MSUM) in the upsampling process to better recover image resolution and produce better segmentation results. The experimental results on the ISPRS Vaihingen and Potsdam datasets demonstrate the effectiveness of our proposed method.
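The core idea of connecting decoder levels, fusing coarse high-level semantics with fine low-level detail, can be sketched minimally as follows. This toy numpy version (nearest-neighbour upsampling plus channel concatenation) only illustrates the multi-level connection idea; it is not the paper's MLCM, and all shapes are illustrative.

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbour 2x spatial upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def fuse_levels(low_level, high_level):
    """Toy multi-level connection: upsample the coarse high-level map to the
    low-level resolution and concatenate along channels. A real module would
    follow this with learned convolutions to mix the two streams."""
    return np.concatenate([low_level, upsample2x(high_level)], axis=0)

low = np.ones((16, 64, 64))     # fine, low-level features from an early stage
high = np.ones((32, 32, 32))    # coarse, high-level semantics from a deeper stage
fused = fuse_levels(low, high)  # combined map at the fine resolution
```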
Authors:
Li, Shangze; Lu, Andong; Huang, Yan; Li, Chenglong; Wang, Liang
Anhui Univ, Sch Comp Sci & Technol, Anhui Prov Key Lab Multimodal Cognit Computat, Hefei 230601, Peoples R China
Anhui Univ, Sch Artificial Intelligence, Informat Mat & Intelligent Sensing Lab Anhui Prov, Anhui Prov Key Lab Multimodal Cognit Computat, Hefei 230601, Peoples R China
Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China
Text-based person search is a challenging cross-modal retrieval task. Existing works reduce the inter-modality and intra-class gaps by aligning local features extracted from the image and text modalities, which easily leads to mismatching problems due to the lack of annotation information. Besides, it is sub-optimal to reduce the two gaps simultaneously in the same feature space. This work proposes a novel joint token and feature alignment framework to reduce the inter-modality and intra-class gaps progressively. Specifically, we first build a dual-path feature learning network to extract features and conduct feature alignment to reduce the inter-modality gap. Second, we design a text generation module to generate token sequences from visual features, and then token alignment is performed to reduce the intra-class gap. Finally, a fusion interaction module is introduced to further eliminate the modality heterogeneity using a strategy of multi-stage feature fusion. Extensive experiments on the CUHK-PEDES dataset demonstrate the effectiveness of our model, which significantly outperforms previous state-of-the-art methods.
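The feature-alignment stage that reduces the inter-modality gap is commonly implemented with a symmetric contrastive objective over matched image-text pairs. The sketch below is such a generic loss, not the paper's exact objective; the temperature value is an illustrative assumption.

```python
import numpy as np

def contrastive_alignment_loss(img_feats, txt_feats, temperature=0.07):
    """Symmetric InfoNCE-style loss pulling matched image/text pairs together
    and pushing mismatched pairs apart (a generic cross-modal alignment term)."""
    img = img_feats / np.linalg.norm(img_feats, axis=1, keepdims=True)
    txt = txt_feats / np.linalg.norm(txt_feats, axis=1, keepdims=True)
    logits = img @ txt.T / temperature        # pairwise cosine similarities
    labels = np.arange(len(img))              # matched pairs lie on the diagonal

    def xent(l):                              # cross-entropy toward the diagonal
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    return 0.5 * (xent(logits) + xent(logits.T))

feats = np.eye(4)                                             # toy orthogonal features
loss_matched = contrastive_alignment_loss(feats, feats)       # aligned pairs: low loss
loss_mismatched = contrastive_alignment_loss(feats, np.roll(feats, 1, axis=0))
```

Minimizing this term drives image and text embeddings of the same identity toward each other, which is one common way to close the inter-modality gap before finer-grained (token-level) alignment.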
ISBN (print): 9798400707674
Remote sensing (RS) images typically exhibit complex spatial distributions, making non-local features critical for achieving high-quality super-resolution (SR). Most existing SR networks extract local and non-local features alternately, forcing the non-local features to be explored at high spatial resolution. This leads to substantial computational costs and limits the performance of these networks. In this paper, we propose an efficient non-local feature extraction strategy to solve this problem. Specifically, we propose a dual-branch super-resolution network (DBSRN) with different branches focusing on local and non-local feature extraction. For the local feature extraction branch (LFEBranch), we design an adaptive feature enhancement block (AFEB) to optimize its processing of local features. For the non-local feature extraction branch (NFEBranch), we propose a non-local feature aggregation block (NFAB) to extract non-local features more efficiently by continuously reducing the spatial resolution of the input. Extensive experiments have demonstrated that the proposed DBSRN can effectively leverage the non-local features of RS images, resulting in superior SR performance compared to state-of-the-art networks.
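The efficiency idea, computing non-local interactions only after reducing spatial resolution, can be sketched as follows. This toy pooled self-attention block illustrates the strategy only; it is not the paper's NFAB, and the pooling factor is an illustrative assumption.

```python
import numpy as np

def pooled_nonlocal(x, pool=4):
    """Toy non-local block at reduced resolution: average-pool the (H, W, C)
    map, attend over the few pooled tokens, upsample the response, and add it
    back. Attention costs (H*W/pool^2)^2 instead of (H*W)^2."""
    H, W, C = x.shape
    h, w = H // pool, W // pool
    tokens = x.reshape(h, pool, w, pool, C).mean(axis=(1, 3)).reshape(-1, C)
    attn = tokens @ tokens.T / np.sqrt(C)         # similarity over pooled tokens
    attn = np.exp(attn - attn.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)       # row-wise softmax
    out = (attn @ tokens).reshape(h, w, C)
    out = out.repeat(pool, axis=0).repeat(pool, axis=1)   # back to full resolution
    return x + out                                # residual connection

y = pooled_nonlocal(np.ones((32, 32, 8)))
```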
ISBN (digital): 9798331515669
ISBN (print): 9798331515676
The Contrastive Language-Image Pre-training (CLIP) model learns to associate image and text content by pre-training on a large number of image-text pairs. CLIP can understand the relationship between image content and a text description, enabling functions such as image recognition, classification, and retrieval across various tasks. Owing to its zero-shot learning ability, it can perform effective task inference even without a large amount of labeled data. The CLIP model has brought new research ideas and application possibilities to the field of multimodal learning and has shown excellent performance in the zero-shot setting. When applying CLIP to few-shot remote sensing image classification, the main challenges are how to fine-tune the knowledge stored in CLIP so that the whole model adapts better to downstream tasks, and how to avoid overfitting when training with few samples. Commonly used CLIP-based fine-tuning methods perform poorly under few-shot conditions. In this paper, we propose a semi-supervised cached-model method: the test set is first classified using a small amount of labeled training data as the cached model, and high-confidence pseudo-labeled test samples are then added as a supplement to jointly classify the remaining low-confidence test samples. Experiments show that our proposed method significantly improves accuracy over previous methods on two benchmark datasets.
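The cached-model procedure can be sketched as follows, assuming image features have already been extracted (e.g. by a CLIP encoder) and L2-normalised. Prototype-based classification here is a simplification of the paper's cache, and all function names and thresholds are illustrative.

```python
import numpy as np

def cached_model_classify(support_feats, support_labels, test_feats,
                          n_classes, conf_thresh=0.8):
    """Sketch of the semi-supervised cache idea: classify once against class
    prototypes built from the few labeled samples, promote high-confidence
    test samples to pseudo-labels, then reclassify with the enlarged cache."""
    def prototypes(feats, labels):
        # Mean feature per class (support samples guarantee every class is covered)
        protos = np.stack([feats[labels == c].mean(axis=0) for c in range(n_classes)])
        return protos / np.linalg.norm(protos, axis=1, keepdims=True)

    def predict(protos, queries):
        return queries @ protos.T             # cosine similarity to each class

    sims = predict(prototypes(support_feats, support_labels), test_feats)
    conf, pseudo = sims.max(axis=1), sims.argmax(axis=1)
    keep = conf >= conf_thresh                # high-confidence pseudo-labels only
    # Second pass: cache = labeled support set + confident pseudo-labeled samples
    cache_f = np.vstack([support_feats, test_feats[keep]])
    cache_y = np.concatenate([support_labels, pseudo[keep]])
    return predict(prototypes(cache_f, cache_y), test_feats).argmax(axis=1)

support = np.array([[1., 0.], [0., 1.]])      # one labeled sample per class
labels = np.array([0, 1])
test = np.array([[0.9, 0.1], [0.1, 0.9], [0.6, 0.4]])
test = test / np.linalg.norm(test, axis=1, keepdims=True)
preds = cached_model_classify(support, labels, test, n_classes=2)
```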