Neural machine translation (NMT) systems have been shown to produce undesirable translations when a small change is made in the source sentence. In this paper, we study the behaviour of NMT systems when multiple changes a...
Optical Character Recognition (OCR) is one of the continuously explored problems. Commercial character recognizers are presently available that report near-100% recognition rates on text in a number of scripts. Despite these advancements, however, OCR systems have yet to mature for cursive scripts like Urdu. This study presents a holistic technique for recognition of Urdu text in the Nastaliq font using "complete" ligatures as recognition units. The term "complete" refers to a partial word including its main body and secondary components (dots and diacritic marks). The Discrete Wavelet Transform (DWT) is employed as the feature extractor, while a separate Hidden Markov Model (HMM) is trained for each ligature considered in our study. More than 2000 frequently used unique Urdu ligatures from the standard CLE (Center of Language Engineering) dataset are considered in our evaluations. The system achieves a promising accuracy of 88.87% on more than 10,000 partial words.
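To make the pipeline above concrete, the sketch below shows one plausible way to pair DWT frame features with per-ligature HMMs, using pywt and hmmlearn. The windowing scheme, wavelet choice, number of HMM states, and the helper names (dwt_frame_features, train_ligature_models, recognise) are illustrative assumptions, not the paper's actual configuration.

```python
# Hypothetical sketch of a DWT-feature / per-ligature-HMM recogniser in the spirit
# of the abstract above; all parameter values are assumptions, not the paper's settings.
import numpy as np
import pywt
from hmmlearn import hmm

def dwt_frame_features(ligature_img, frame_width=8):
    """Slide a fixed-width window over the binarised ligature image and describe
    each frame by its single-level Haar DWT coefficients."""
    h, w = ligature_img.shape
    feats = []
    for x in range(0, w - frame_width + 1, frame_width):
        frame = ligature_img[:, x:x + frame_width].astype(float).ravel()
        cA, cD = pywt.dwt(frame, 'haar')      # approximation / detail coefficients
        feats.append(np.concatenate([cA, cD]))
    return np.array(feats)                     # shape: (n_frames, feat_dim)

def train_ligature_models(samples_by_ligature, n_states=6):
    """Fit one GaussianHMM per ligature class on its training images."""
    models = {}
    for ligature, images in samples_by_ligature.items():
        seqs = [dwt_frame_features(img) for img in images]
        X = np.vstack(seqs)
        lengths = [len(s) for s in seqs]
        m = hmm.GaussianHMM(n_components=n_states, covariance_type='diag', n_iter=50)
        m.fit(X, lengths)
        models[ligature] = m
    return models

def recognise(models, ligature_img):
    """Pick the ligature whose HMM gives the highest log-likelihood."""
    X = dwt_frame_features(ligature_img)
    return max(models, key=lambda lig: models[lig].score(X))
```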
Standard Unsupervised Domain Adaptation (UDA) methods assume the availability of both source and target data during the adaptation. In this work, we investigate Source-free Unsupervised Domain Adaptation (SF-UDA), a specific case of UDA where a model is adapted to a target domain without access to source data. We propose a novel approach for the SF-UDA setting based on a loss-reweighting strategy that brings robustness against the noise that inevitably affects the pseudo-labels. The classification loss is reweighted based on the reliability of the pseudo-labels, which is measured by estimating their uncertainty. Guided by this reweighting strategy, the pseudo-labels are progressively refined by aggregating knowledge from neighbouring samples. Furthermore, a self-supervised contrastive framework is leveraged as a target-space regulariser to enhance such knowledge aggregation. A novel negative-pair exclusion strategy is proposed to identify and exclude negative pairs made of samples sharing the same class, even in the presence of some noise in the pseudo-labels. Our method outperforms previous methods on three major benchmarks by a large margin. We set the new SF-UDA state of the art on VisDA-C and DomainNet with a performance gain of +1.8% on both benchmarks, and on PACS with +12.3% in the single-source setting and +6.6% in multi-target adaptation. Additional analyses demonstrate that the proposed approach is robust to noise, which results in significantly more accurate pseudo-labels compared to state-of-the-art approaches.
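As a rough illustration of the loss-reweighting idea, the PyTorch sketch below down-weights the cross-entropy on pseudo-labels according to an entropy-based reliability score. The entropy-based estimate and the function name reweighted_pseudo_label_loss are assumptions for illustration; the paper's actual uncertainty measure, neighbour aggregation, and contrastive regulariser are not reproduced here.

```python
# Minimal sketch of uncertainty-based loss reweighting for pseudo-labelled target data.
# The reliability score below is an illustrative choice, not the paper's exact estimate.
import torch
import torch.nn.functional as F

def reweighted_pseudo_label_loss(logits, pseudo_labels):
    """Cross-entropy on pseudo-labels, down-weighted for uncertain predictions.

    logits        : (N, C) target-domain classifier outputs
    pseudo_labels : (N,)   current hard pseudo-labels
    """
    probs = F.softmax(logits, dim=1)
    entropy = -(probs * torch.log(probs.clamp_min(1e-8))).sum(dim=1)
    max_entropy = torch.log(torch.tensor(float(logits.size(1))))
    reliability = 1.0 - entropy / max_entropy          # 1 = confident, 0 = uniform
    per_sample_ce = F.cross_entropy(logits, pseudo_labels, reduction='none')
    return (reliability.detach() * per_sample_ce).mean()

# Usage inside an adaptation loop (model and target_batch are assumed to exist):
# logits = model(target_batch)
# pseudo = logits.argmax(dim=1)   # possibly refined via neighbour aggregation
# loss = reweighted_pseudo_label_loss(logits, pseudo)
# loss.backward()
```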
ISBN (print): 9781479961016
This article presents our recent study on fusion of information at the feature and classifier-output levels for improved performance of offline handwritten Devanagari word recognition. We consider two state-of-the-art features, viz., Directional Distance Distribution (DDD) and Gradient-Structural-Concavity (GSC) features, along with multi-class SVM classifiers. We study various combinations of DDD features along with one or more features from the GSC feature set. We experiment by presenting different combined feature vectors as input to SVM classifiers. In addition, the output vectors of different SVM classifiers, each fed with a different feature vector, are combined by another SVM classifier. Combining the outputs of two SVMs, each fed with a different feature vector, outperforms a single SVM classifier fed with the combined feature vector. Experimental results are obtained on a large handwritten Devanagari word sample image database of 100 Indian town names. The recognition results on its test samples show that combining the SVM recognition output of DDD features with the SVM output of GSC features improves the final recognition accuracy significantly.
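The classifier-output fusion can be pictured as a small stacking scheme: one SVM per feature set, with a second-level SVM trained on the concatenated probability outputs. The scikit-learn sketch below is a hedged illustration under that reading; the variable names (X_ddd, X_gsc), the hold-out split, and the kernels are assumptions rather than the paper's exact protocol.

```python
# Illustrative stacking of two feature-specific SVMs (DDD and GSC) with a fusion SVM.
# Feature extraction is assumed to be done elsewhere; names are placeholders.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

def fit_fusion(X_ddd, X_gsc, y):
    # Hold out part of the data so the fusion SVM is not trained on
    # the base classifiers' own training outputs.
    idx = np.arange(len(y))
    base_idx, fuse_idx = train_test_split(idx, test_size=0.3, stratify=y, random_state=0)

    svm_ddd = SVC(kernel='rbf', probability=True).fit(X_ddd[base_idx], y[base_idx])
    svm_gsc = SVC(kernel='rbf', probability=True).fit(X_gsc[base_idx], y[base_idx])

    # Second-level SVM combines the two classifiers' probability outputs.
    Z = np.hstack([svm_ddd.predict_proba(X_ddd[fuse_idx]),
                   svm_gsc.predict_proba(X_gsc[fuse_idx])])
    fusion = SVC(kernel='linear').fit(Z, y[fuse_idx])
    return svm_ddd, svm_gsc, fusion

def predict_fusion(models, X_ddd, X_gsc):
    svm_ddd, svm_gsc, fusion = models
    Z = np.hstack([svm_ddd.predict_proba(X_ddd), svm_gsc.predict_proba(X_gsc)])
    return fusion.predict(Z)
```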
Stereo computation is one of the vision problems where the presence of outliers cannot be neglected. Most standard algorithms make unrealistic assumptions about noise distributions, which leads to erroneous results th...
Most countries use bi-script documents, because every country uses its own national language and English as a second/foreign language. Therefore, bi-lingual document with one language being the English an...
At present, adversarial attacks are designed in a task-specific fashion. However, for downstream computer vision tasks such as image captioning, image segmentation, etc., the current deep learning systems use an image ...
In this paper, a rule-based rough set decision system for the development of a disease inference engine is described. For this purpose, an off-line data acquisition system for paper electrocardiogram (ECG) records is developed using image processing techniques. A QRS detector is developed for detection of the R-R interval from ECG waves. After detection of the R-R interval, the P and T waves are detected based on syntactic approaches and different time-plane features are extracted from every ECG signal. From a knowledge base developed from the feedback of reputed cardiologists and consultation of medical books, the essential time-plane features for ECG interpretation have been selected. Finally, a rule-based rough set decision system is generated for the development of an inference engine for disease identification from these time-plane features.
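The sketch below gives a simplified, hypothetical version of this pipeline: R-peak detection on a digitised ECG trace, R-R interval extraction, and a couple of placeholder if-then rules standing in for the rough-set-derived rule base. scipy.signal.find_peaks is used as a generic substitute for the paper's QRS detector, and all thresholds are illustrative only.

```python
# Hedged sketch: R-R interval extraction followed by simple rule-based inference.
# The peak-detection parameters and the rules are placeholders, not clinical values.
import numpy as np
from scipy.signal import find_peaks

def rr_intervals(ecg, fs):
    """Return R-R intervals (in seconds) from a 1-D ECG signal sampled at fs Hz."""
    # R peaks are assumed to be the dominant positive peaks, at least 0.4 s apart.
    peaks, _ = find_peaks(ecg, height=np.percentile(ecg, 95), distance=int(0.4 * fs))
    return np.diff(peaks) / fs

def infer(ecg, fs):
    """Placeholder inference engine over time-plane features derived from R-R intervals."""
    rr = rr_intervals(ecg, fs)
    if len(rr) == 0:
        return 'insufficient data'
    heart_rate = 60.0 / rr.mean()
    if heart_rate > 100:
        return 'tachycardia suspected'
    if heart_rate < 60:
        return 'bradycardia suspected'
    if rr.std() / rr.mean() > 0.15:            # high R-R variability
        return 'possible arrhythmia'
    return 'normal sinus rhythm (by these simplified rules)'
```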
Form document image processing has become an increasingly essential technology in office automation tasks. One of the problems is that the document image may appear skewed for many reasons. Therefore, the skew estimat...
OCR errors hurt retrieval performance to a great extent. Research has been done on modelling and correction of OCR errors. However, most of the existing systems use language dependent resources or training texts for s...