An NLP system for Indian languages should have a lexical subsystem driven by a morphological analyzer. Such an analyzer should be able to parse a word into its constituent morphemes and obtain the lexical projection of the word as a unification of the projections of the constituent morphemes. The lexical projections considered here are f-structures of Lexical Functional Grammar (LFG). A formalism is proposed by which the lexicon writer may specify the lexicon in four levels. To achieve a compact lexical representation, the specifications are compiled into a stored lexical knowledge base on the one hand and a formulation of derivational morphology called Augmented Finite State Automata (AFSA) on the other. The aspects of AFSA, especially its power to parse words morphologically in a computationally attractive manner, are discussed. An additional use of the AFSA as a spelling error corrector is also discussed. Bangla (Bengali) is considered as a case study. Implementation notes based on object-oriented programming principles are provided.
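The idea of obtaining a word's projection by unifying the projections of its morphemes can be illustrated with a minimal sketch. It is not the AFSA formalism from the abstract: the feature structures are flat dictionaries rather than nested f-structures, the split is a plain root-plus-suffix scan, and the lexicon entries (`ROOTS`, `SUFFIXES`) are hypothetical English toy data chosen only for readability.

```python
def unify(f, g):
    """Unify two flat feature structures; return None if any feature clashes."""
    result = dict(f)
    for key, value in g.items():
        if key in result and result[key] != value:
            return None  # conflicting values: unification fails
        result[key] = value
    return result


# Hypothetical toy lexicons (not taken from the paper).
ROOTS = {
    "walk": {"pred": "walk", "cat": "verb"},
    "book": {"pred": "book", "cat": "noun"},
}
SUFFIXES = {
    "": {},
    "ed": {"cat": "verb", "tense": "past"},
    "s": {"cat": "noun", "num": "pl"},
}


def analyze(word):
    """Return every (root, suffix, features) split whose projections unify."""
    analyses = []
    for i in range(1, len(word) + 1):
        root, suffix = word[:i], word[i:]
        if root in ROOTS and suffix in SUFFIXES:
            features = unify(ROOTS[root], SUFFIXES[suffix])
            if features is not None:
                analyses.append((root, suffix, features))
    return analyses
```

Under these assumptions, `analyze("walked")` yields one analysis with the merged features of `walk` and `ed`, while `analyze("booked")` yields none because the noun root and the verbal suffix disagree on `cat`.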
Given a set of points in multi-dimensional space, we propose a new definition for the neighbors of an arbitrary point P. The definition tries to capture the idea that the neighbors should be as near to P and as symmetrically placed around P as possible. In contrast, the conventional nearest-neighbor definition considers only nearness as the criterion for neighborhood. We propose an iterative procedure to compute the neighbors, in which the first neighbor is the nearest neighbor. The second and subsequent neighbors are chosen so that, at every stage, the distance between the centroid of the neighbors chosen so far and P is as small as possible. The centroid criterion takes care of the symmetrical placement of the neighbors. One can use the median instead of the centroid to define the neighbors. The new definition is free from any user-specified parameter and can be used for pattern classification, clustering and low-level description of dot patterns.
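The iterative procedure can be sketched as a greedy selection. This is a minimal illustration under stated assumptions: the abstract does not say how the procedure terminates, so the sketch takes a hypothetical count `k` purely to stop the loop, and it breaks ties by plain distance to P, which is also an assumption. The function name `centroid_neighbors` is illustrative.

```python
from math import dist


def centroid_neighbors(points, p, k):
    """Greedily pick k neighbors of p, keeping the centroid of the picks close to p."""
    remaining = list(points)
    chosen = []
    sums = [0.0] * len(p)  # running coordinate sums of the chosen neighbors
    while remaining and len(chosen) < k:
        n = len(chosen) + 1

        def centroid_dist(q):
            # Distance from p to the centroid that results if q is added.
            centroid = [(s + c) / n for s, c in zip(sums, q)]
            return dist(centroid, p)

        # Assumed tie-break: prefer the candidate nearer to p.
        best = min(remaining, key=lambda q: (centroid_dist(q), dist(q, p)))
        remaining.remove(best)
        chosen.append(best)
        sums = [s + c for s, c in zip(sums, best)]
    return chosen
```

With the points (1, 0), (1.1, 0.1), (-1.2, 0) and (0, 3) around P = (0, 0), the first pick is the nearest point (1, 0). The second pick is (-1.2, 0), because it pulls the centroid back toward P, whereas a conventional two-nearest-neighbor rule would pick (1.1, 0.1) on the same side.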
We present a semi-automatic method for extracting the 3D boundary of the cells in a compact tissue cross-section photographed by a confocal microscope. The confocal microscope provides pictures at different depths of ...
Road networks are important features of satellite imagery. The main contribution of the present road detection method consists of an effective enhancement technique and an efficient segmentation technique that removes...
This paper deals with an OCR error detection and correction technique for a highly inflectional language script like Bangla (a major Indian language). This is the first report of its kind. Using two separate lexicons of root words and suffixes, candidate root-suffix pairs of each input word are detected, their grammatical agreement is tested, and the part (root or suffix) in which the error has occurred is noted. The correction is made on the corresponding erroneous part of the input string by a fast dictionary access technique. To do so, some alternative strings are generated for an erroneous word. Among the alternative strings, those that satisfy root-suffix grammatical agreement and also have the smallest Levenshtein-Damerau distance are finally chosen as the correct ones. The system has an accuracy of 75.61%.
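The final ranking step can be illustrated with a short sketch. The abstract does not say which variant of the distance is used, so this example assumes the common optimal-string-alignment form (insertion, deletion, substitution, and transposition of adjacent characters). The grammatical-agreement test is not modeled: the hypothetical helper `closest_candidates` assumes its candidates have already passed it.

```python
def osa_distance(a, b):
    """Edit distance allowing insert, delete, substitute, and adjacent transposition."""
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i  # delete all of a[:i]
    for j in range(len(b) + 1):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[len(a)][len(b)]


def closest_candidates(word, candidates):
    """Return the candidates at minimum distance from word (all ties are kept)."""
    if not candidates:
        return []
    scored = [(osa_distance(word, c), c) for c in candidates]
    best = min(score for score, _ in scored)
    return [c for score, c in scored if score == best]
```

For instance, `closest_candidates("recieve", ["receive", "recipe", "deceive"])` keeps only `"receive"`, which is one transposition away, while the other two candidates are at distance 2.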
In this paper we describe a texture segmentation approach, based on a multilayer perceptron (MLP) network, that requires no feature computation. Users therefore need not select and then compute a feature set, and hence real-time segmentation may be possible. The basic motivation of the work is the fact that human vision does not consciously compute features for distinguishing different textures in a scene. A single-hidden-layer MLP network has been found to be most suitable, with heuristically chosen input and hidden layer sizes. A method has been used to speed up the learning of the MLP network. Segmentation by a trained network usually produces misclassification in the form of speckles. For the removal of such noise, an edge-preserving noise-smoothing technique is proposed. The final segmentation accuracy is comparable with that of other existing techniques.
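Speckle removal on a label map can be illustrated with a small sketch. The abstract does not describe the proposed edge-preserving technique, so this example swaps in a plain majority (mode) filter instead: a pixel is relabeled only when some other label strictly outnumbers its own label in the surrounding window. The function name `majority_filter` and the `radius` parameter are illustrative assumptions.

```python
from collections import Counter


def majority_filter(labels, radius=1):
    """Relabel each pixel to the strict majority label of its square window."""
    rows, cols = len(labels), len(labels[0])
    out = [row[:] for row in labels]
    for r in range(rows):
        for c in range(cols):
            # Collect labels in the window, clipped at the image border.
            window = [
                labels[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            counts = Counter(window)
            top_label, top_count = counts.most_common(1)[0]
            # On a tie the pixel keeps its current label.
            if top_count > counts[labels[r][c]]:
                out[r][c] = top_label
    return out
```

On a two-region label map with a single misclassified pixel inside one region, the filter removes the speckle and leaves the straight boundary between the regions where it was. A plain majority filter can still round off corners and thin structures, which is one reason a dedicated edge-preserving scheme may be preferred.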
Extraction of skeletal shape from a 2D dot pattern is discussed. We use a self-organizing neural network model to get a piecewise linear approximation of a skeleton of the pattern. It is found that, even without a proper definition of a skeleton, the proposed algorithm is able to produce skeletons that are quite close to what we intuitively feel they should be. In Kohonen's self-organizing model, the set of processors and their neighbourhoods are fixed. We suggest here some modifications of it in which the set of processors and their neighbourhoods change adaptively.