Answering a query such as when a particular document was printed is quite helpful in practice, especially for forensic purposes. This study attempts to develop a general framework that makes use of image processing and pattern recognition principles for ink age determination in printed documents. The approach first computationally extracts a set of suitable color features and then analyzes them to properly associate them with ink age. Finally, a neural net is designed and trained to determine the ages of unknown samples. The dataset used for the present experiment consists of the cover pages of LIFE magazines published between the 1930s and the 1970s (five decades). Test results show a viable framework for involving machines in assisting human experts in determining the age of printed documents.
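The color-feature extraction step can be illustrated with a minimal sketch. The abstract does not specify which color features are used, so per-channel mean and standard deviation serve here purely as illustrative stand-ins, not as the paper's actual feature set:

```python
import numpy as np

def color_features(rgb_img):
    """Per-channel mean and standard deviation of an RGB image,
    an illustrative stand-in for the color features a classifier
    could associate with ink age (not the paper's exact features)."""
    pixels = rgb_img.reshape(-1, 3).astype(float)
    # Concatenate the three channel means and three channel std devs
    # into a single 6-dimensional feature vector.
    return np.concatenate([pixels.mean(axis=0), pixels.std(axis=0)])
```

A vector of this kind would then be fed, together with a decade label, to the neural network described in the abstract.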
ISBN:
(print) 9781479980482
In this paper a scheme for segmentation of unconstrained handwritten Devanagari and Bangla words into characters and their sub-parts is proposed. Firstly, the region above the headline is identified by counting the number of white-to-black transitions in each row, which is followed by its separation. Then the characters are segmented using fuzzy logic. For each column, the inputs to the fuzzy system are the location of the first white pixel, the thickness of the first black stroke, the count of white pixels, and the run-length count of white pixels.
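The per-row transition count that drives the headline separation can be sketched as follows; this is a minimal illustration assuming a binarised image with 0 for white background and 1 for black ink, and it omits the subsequent thresholding and fuzzy segmentation steps:

```python
import numpy as np

def row_transitions(binary_img):
    """Count white-to-black (0 -> 1) transitions in every row of a
    binarised word image. Per-row counts like these are what the
    scheme above uses to locate the region above the headline
    (matra) before separating it."""
    left, right = binary_img[:, :-1], binary_img[:, 1:]
    # A transition is a white pixel immediately followed by a black one.
    return ((left == 0) & (right == 1)).sum(axis=1)
```

Rows forming the dense headline stroke produce very different counts from the sparse strokes above it, which is what makes the separation possible.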
Word segmentation has become a research topic for improving OCR accuracy in video text recognition, because a video text line suffers from arbitrary orientation, complex background and low resolution. Therefore, for word segmentation from arbitrarily oriented video text lines, in this paper we extract four new gradient directional features for each Canny edge pixel of the input text line image to produce four respective pixel candidate images. The union of the four pixel candidate images is taken to obtain a text candidate image. The sequence of the components in the text candidate image along the text line is determined using a nearest neighbor criterion. Then we propose a two-stage method for segmenting words. In the first stage, we apply K-means clustering with K=2 to the distances between the components to obtain probable word and non-word spacing clusters. Words are segmented based on probable word spacing, and all other components are passed to the second stage for segmenting the correct words. For each segmented and un-segmented word passed to the second stage, the method repeats all the steps up to the K-means clustering step to find probable word and non-word spacing clusters. The method then considers cluster nature and the height and width of the components to identify the correct word spacing. The method is tested extensively on video curved text lines, non-horizontal straight lines, horizontal straight lines and text lines from the ICDAR-2003 competition data. Experimental results and a comparative study show that the results are encouraging and promising.
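The first-stage clustering step can be sketched with a tiny one-dimensional 2-means over the inter-component gaps; this is an illustration of the K=2 idea only, under the assumption that small gaps are intra-word and large gaps mark word boundaries, and it leaves out the second-stage refinement:

```python
import numpy as np

def split_spacings(gaps, iters=20):
    """Cluster 1-D inter-component gaps into 'small' (intra-word,
    label 0) and 'large' (probable word boundary, label 1) groups
    with a minimal 2-means, mirroring the K=2 clustering step
    described above. Illustrative sketch, not the paper's code."""
    gaps = np.asarray(gaps, dtype=float)
    # Initialise the two centres at the smallest and largest gap.
    c = np.array([gaps.min(), gaps.max()])
    for _ in range(iters):
        # Assign each gap to the nearest centre.
        labels = (np.abs(gaps - c[0]) > np.abs(gaps - c[1])).astype(int)
        for k in (0, 1):
            if (labels == k).any():
                c[k] = gaps[labels == k].mean()
    return labels, c

# Small gaps between characters, large gaps between words.
labels, centres = split_spacings([2, 3, 2, 15, 3, 2, 14, 2])
```

Components whose preceding gap falls in the large cluster would be treated as probable word boundaries and the rest forwarded to the second stage.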
ISBN:
(print) 9781479918065
In this paper, a new sequence matching algorithm called Exemplary Sequence Cardinality (ESC) is proposed. ESC combines several abilities of other sequence matching algorithms, e.g. DTW, SSDTW, CDP, FSM, MVM and OSB. Depending on the application domain, ESC can be tuned to behave like these different sequence matching algorithms. Its generality and robustness come from its ability to find subsequences (as in CDP and SSDTW), to skip outliers inside the target sequences (as in MVM and FSM) and also in the query sequence (as in OSB), and from its ability to form many-to-one and one-to-many correspondences (as in DTW) between the elements of the query and the target sequences. Its special characteristic of skipping noisy elements from the query sequence, along with the other aforementioned properties, gives it an edge over FSM. In the word spotting application, the outlier-skipping capability of ESC makes it less sensitive to local variations in the spelling of words, and also to noise present in the query and/or in the target word images. Due to its capability of sub-sequence matching, the ESC algorithm is able to retrieve a query inside a line or a piece of a line. Finally, its multiple matching facilities (many-to-one and one-to-many matching) prove advantageous when target and query sequences have different lengths due to variability in scale, font and type/size factors. By experimenting on printed historical document images, we have demonstrated the interest of the proposed ESC algorithm in specific cases where incorrect word segmentation and word-level local variations occur regularly.
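ESC itself is not reproduced here, but the classic DTW baseline it generalizes can be; the sketch below shows the many-to-one/one-to-many alignment that ESC inherits, without the subsequence matching or outlier skipping that the abstract adds on top:

```python
import numpy as np

def dtw(a, b):
    """Classic dynamic time warping distance between two 1-D
    sequences. Shown only as the DTW baseline referenced above:
    the min over (i-1, j), (i, j-1), (i-1, j-1) is what allows
    many-to-one and one-to-many element correspondences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j],      # one-to-many
                                 D[i, j - 1],      # many-to-one
                                 D[i - 1, j - 1])  # one-to-one
    return D[n, m]
```

Matching [1, 2, 3] against [1, 2, 2, 3] costs nothing because the repeated 2 is absorbed by a many-to-one step, exactly the length tolerance the abstract credits to this family of alignments.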
An approach to QRS complex detection based on the selection of the maximum-area zone of the squared-derivative curve is presented in this paper. First, the ECG signals are extracted from paper ECG records by an automated data acquisition system developed using image processing techniques. Then the QRS complex of each ECG signal is detected, even in the presence of power line interference and baseline drift, to calculate the sampling period for further analysis in the frequency plane for disease identification. A very high accuracy level (∼99.4%) has been achieved in the detection of QRS complexes by this method.
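The core idea, squaring the derivative and looking for the zone of maximum area, can be sketched as below. This is a simplified single-beat illustration: the window length of 100 ms is an assumption, and the real method must handle interference, drift and multiple beats:

```python
import numpy as np

def qrs_candidate(ecg, fs, win_ms=100):
    """Locate a QRS candidate by squaring the first derivative of
    the signal and finding the window of maximum area, a minimal
    sketch of the maximum-area-zone idea described above.
    fs is the sampling frequency in Hz; win_ms is an assumed
    window length, not a value taken from the paper."""
    d = np.diff(ecg)        # first derivative emphasises steep QRS slopes
    energy = d * d          # squaring rectifies and sharpens the peaks
    win = max(1, int(fs * win_ms / 1000))
    # Sliding-window area under the squared-derivative curve.
    area = np.convolve(energy, np.ones(win), mode="same")
    return int(np.argmax(area))   # sample index of the maximum-area zone
```

Because both the slow baseline drift and much of the low-amplitude noise contribute little to the squared derivative, the sharp QRS slopes dominate the area measure.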
ISBN:
(print) 9781479901937
In this paper, we present a new and faster word retrieval approach that is able to deal with heterogeneous document image collections. A set of image features (statistical and Gabor wavelet) is extracted that inherently represents the word images. These features are used to generate a hash table for fast retrieval of similar images from a very large image dataset. The decomposition and embedding of high-dimensional features and complex distance functions into a low-dimensional Hamming space help to search items efficiently. However, existing methods do not apply to high-dimensional kernelized data when the underlying feature embedding for the kernel is unknown. A generalization of locality sensitive hashing (LSH) to arbitrary kernels is presented in this paper. The proposed algorithm provides sub-linear time similarity search and works for a wide class of similarity functions.
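The embedding into a low-dimensional Hamming space can be illustrated with plain random-hyperplane LSH; note this is the generic cosine-similarity variant, not the kernelized generalization the paper proposes:

```python
import numpy as np

def lsh_signatures(X, n_bits=16, seed=0):
    """Random-hyperplane locality sensitive hashing: each row of X
    gets one bit per random hyperplane (which side it falls on),
    so nearby feature vectors receive identical or similar bit
    signatures with high probability. Generic sketch only; the
    paper's contribution is extending LSH to arbitrary kernels."""
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((X.shape[1], n_bits))
    bits = (X @ planes) > 0                 # one sign bit per hyperplane
    weights = 1 << np.arange(n_bits)
    # Pack the bit pattern into a single integer hash-table key.
    return bits.astype(np.int64) @ weights
```

Word images whose feature vectors collide on the same key land in the same hash bucket, which is what yields sub-linear search time over a large collection.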
To date, paper-based examinations are still in use worldwide at all levels of education (e.g. secondary and tertiary levels). However, literature on automatic assessment systems employing off-line handwriting recognition is sparse. This paper proposes an off-line automatic assessment system employing a hybrid feature extraction technique, the newly proposed Modified Direction and Gaussian Grid Feature (MDGGF), along with its enhanced variant. In this study other original feature extraction techniques, together with their enhanced features, were also used for comparing the efficiency of feature extraction techniques. Two classifiers, namely artificial neural networks and support vector machines, were selected for the experiments. Two types of datasets were employed in the experiments for comparing both the accuracy and the efficiency of the feature extraction techniques. The best correct recognition rate of 98.33% with 100% accuracy was obtained when employing the proposed MDGGF in the off-line automatic assessment system.
In this paper, we propose an efficient skew estimation technique based on the Piece-wise Painting Algorithm (PPA) for scanned documents. We first apply the PPA to the document image horizontally and vertically, obtaining two painted images (one horizontally painted and one vertically painted). Next, based on statistical analysis, some regions with a specific height (width) are selected from the horizontally (vertically) painted image, and the top (left), middle (middle) and bottom (right) points of these selected regions are categorized into six separate lists. Using linear regression, a few lines are drawn from the lists of points. A new majority-voting approach is also proposed to find the best-fit line among all the lines. The skew angle of the document image is estimated from the slope of the best-fit line. The proposed technique was tested extensively on a dataset containing various categories of documents. Experimental results showed that the proposed technique achieves more accurate results than state-of-the-art methodologies.
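The final step, turning a regression line into a skew angle, can be sketched as follows; this covers only the slope-to-angle conversion for one list of points, not the PPA painting or the majority voting over candidate lines:

```python
import math
import numpy as np

def skew_from_points(xs, ys):
    """Fit a straight line through one list of representative region
    points by least squares and convert its slope to a skew angle in
    degrees. A sketch of the last step of the technique above; the
    full method votes over several such lines to pick the best fit."""
    slope, _intercept = np.polyfit(xs, ys, 1)   # degree-1 least squares
    return math.degrees(math.atan(slope))
```

Running this on each of the six point lists and taking the majority vote over the resulting lines is how the abstract describes arriving at the final skew estimate.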
ISBN:
(print) 9781424475421
Automatic identification of an individual based on his or her handwriting characteristics is an important forensic tool. In a computational forensic scenario, the presence of a large amount of text/information in a questioned document cannot always be ensured, and compromising the system's reliability in such a situation is not desirable. We propose a system to handle such adverse situations in the context of the Bengali script. Experiments with a discrete directional feature and a gradient feature are reported here, with a Support Vector Machine (SVM) as the classifier. We obtained promising results of 95.19% writer identification accuracy for the first top choice and 99.03% when considering the first three top choices.
ISBN:
(digital) 9783540304999
ISBN:
(print) 9783540239314
It is our great pleasure to welcome you to the 11th International Conference on Neural Information Processing (ICONIP 2004) to be held in Calcutta. ICONIP 2004 is organized jointly by the Indian Statistical Institute (ISI) and Jadavpur University (JU). We are confident that ICONIP 2004, like the previous conferences in this series, will provide a forum for fruitful interaction and the exchange of ideas between the participants coming from all parts of the globe. ICONIP 2004 covers all major facets of computational intelligence, but, of course, with a primary emphasis on neural networks. We are sure that this meeting will be enjoyable academically and otherwise. We are thankful to the track chairs and the reviewers for extending their support in various forms to make a sound technical program. Except for a few cases, where we could get only two review reports, each submitted paper was reviewed by at least three referees, and in some cases the revised versions were again checked by the referees. We had 470 submissions and it was not an easy task for us to select papers for a four-day conference. Because of the limited duration of the conference, based on the review reports we selected only about 40% of the contributed papers. Consequently, it is possible that some good papers were left out. We again express our sincere thanks to all referees for accomplishing a great job. In addition to 186 contributed papers, the proceedings include two plenary presentations, four invited talks and 18 papers in four special sessions. The proceedings are organized into 26 coherent topical groups.