ISBN: 3540473319 (print)
This paper summarises the current literature on immune system function and behaviour, including pattern recognition receptors, danger theory, central and peripheral tolerance, and memory cells. An artificial immune system framework is then presented, based on analogies with these natural system components and on a rule- and feature-based problem representation. A data set for intrusion detection is used to highlight the principles of the framework.
ISBN: 9780898716115 (print)
Document clustering has traditionally been studied as a centralized process. There are scenarios in which centralized clustering does not serve the required purpose; e.g., documents spanning multiple digital libraries need not be clustered in one location, but can instead be clustered at each location and then enriched with information received from other locations. This paper proposes a distributed collaborative approach to document clustering. The main objective is to allow peers in a network to form independent opinions of local document grouping, followed by an exchange of cluster summaries in the form of keyphrase vectors. The nodes then expand and enrich their local solutions by receiving recommended documents from their peers, based on each peer's judgement of the similarity of its local documents to the exchanged cluster summaries. Results show an improvement in the final clustering after merging peer recommendations. The approach allows independent nodes to achieve better local clustering by having access to distributed data without the cost of centralized clustering, while maintaining the structure and coherency of the initial local clustering.
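The recommendation step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes single-word term counts as a stand-in for keyphrase vectors, cosine similarity as the similarity measure, and a fixed threshold. The names `cosine`, `recommend`, and `threshold` are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def recommend(local_docs, peer_summary, threshold=0.3):
    """Return ids of local documents similar enough to a peer's cluster summary.

    Assumption: documents are represented by raw word counts; the paper
    uses keyphrase vectors, which are not reproduced here.
    """
    return [
        doc_id
        for doc_id, text in local_docs.items()
        if cosine(Counter(text.lower().split()), peer_summary) >= threshold
    ]


# Toy data: one peer holds two documents and receives a cluster summary.
local_docs = {
    "d1": "immune system memory cells",
    "d2": "speaker diarization meeting audio",
}
peer_summary = {"speaker": 1.0, "diarization": 1.0, "audio": 0.5}
recommended = recommend(local_docs, peer_summary)
```

In this toy case only `d2` shares terms with the summary, so it is the only document the peer would send back to the summary's owner.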
ISBN: 9781424403660 (print)
Automatic segmentation and classification of recorded meetings provides a basis for effective browsing and querying in a meeting archive. Yet today's approaches are often not robust enough. We therefore strive to improve on this task by introducing a hybrid approach that combines the discriminative abilities of artificial neural nets with the warping capabilities of hidden Markov models. Dividing the task into two layers and defining a proper set of individual actions helps to cope with the lack of data and outperforms conventional single-layered approaches. Extensive test runs on the public M4 Scripted Meeting Corpus demonstrate the large performance gain of the proposed approach over other similar methods.
Classification of data with an imbalanced class distribution significantly limits the performance attainable by most standard classifier learning algorithms, which assume a relatively balanced class distri...
ISBN: 9780898716115 (print)
Recent events have made it clear that some kinds of technical texts, generated by machine and essentially meaningless, can be confused with authentic technical texts written by humans. We identify this as a potential problem, since no existing systems for, say, the web can or do discriminate on this basis. We believe that there are subtle short- and long-range word or even string repetitions present in human texts, but not in many classes of computer-generated texts, that can be used to discriminate based on meaning. In this paper we employ universal lossless source coding to generate features in a high-dimensional space and then apply support vector machines to discriminate between the classes of authentic and inauthentic expository texts. Compression profiles for the two kinds of text are distinct, with the authentic texts bounded by various classes of more compressible or less compressible computer-generated texts. This in turn led to the high prediction accuracy of our models, which supports a conjecture that there exists a relationship between meaning and compressibility. Our results show that the learning algorithm based on the compression profile outperformed standard term-frequency text categorization on several non-trivial classes of inauthentic texts. Availability: http://***/predrag/***.
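The idea of a compression profile can be illustrated with a short sketch. It assumes `zlib` (DEFLATE) as a stand-in for the universal lossless coder and defines the profile as the compression ratio over growing prefixes of the text; both choices, and the function name `compression_profile`, are illustrative assumptions rather than the authors' feature construction. The support vector machine stage is omitted.

```python
import random
import string
import zlib


def compression_profile(text, steps=4):
    """Compression ratio (compressed / original bytes) over growing prefixes.

    Assumption: zlib stands in for the universal source coder, and the
    prefix-based profile is a hypothetical feature definition.
    """
    data = text.encode("utf-8")
    profile = []
    for i in range(1, steps + 1):
        prefix = data[: len(data) * i // steps]
        profile.append(len(zlib.compress(prefix, 9)) / max(len(prefix), 1))
    return profile


# Two synthetic extremes: highly repetitive text and random letters.
repetitive = "the cat sat on the mat. " * 40
rng = random.Random(0)
noisy = "".join(rng.choice(string.ascii_letters) for _ in range(960))

repetitive_profile = compression_profile(repetitive)
noisy_profile = compression_profile(noisy)
```

The repetitive text compresses far better than the random one, which matches the abstract's picture of authentic text lying between more compressible and less compressible machine-generated classes. Each profile could then serve as a feature vector for a classifier.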
ISBN: 9783540692676 (print)
We describe the systems submitted to the NIST RT06s evaluation for the Speech Activity Detection (SAD) and Speaker Diarization (SPKR) tasks. For speech activity detection, a new analysis methodology is presented that generalizes the Detection Error Tradeoff analysis commonly used in speaker detection tasks. The speaker diarization systems are based on the TNO and ICSI systems submitted for RT05s. For the conference room evaluation in the Single Distant Microphone condition, the SAD system performs well at a 4.23% error rate, and the 'HMM-BIC' SPKR system performs competitively at an error rate of 37.2%, including overlapping speech.
The great heterogeneity of web-based learning systems storing and providing digital e-learning data requires the introduction of interoperability aspects in order to resolve integration problems in a flexible and dyna...
ISBN: 3540325298 (print)
The proceedings contain 53 papers. The topics discussed include: a bipolar possibilistic representation of knowledge and preferences and its applications; statistical distribution of chemical fingerprints; fuzzy transformation and their applications to image compression; development of neuro-fuzzy system for image mining; a possibilistic approach to combinatorial optimization problems on fuzzy-valued matroids; possibilistic planning using description logics: a first step; a method for characterizing tractable subsets of qualitative fuzzy temporal algebrae; programming with fuzzy logic and mathematical functions; imprecise temporal interval relations; genetic programming for inductive inference of chaotic series; evaluation of particle swarm optimization effectiveness in classification; adaptive feature selection for classification of microscope images; genetic algorithm against cancer; and active learning with wavelets for microarray data.
ISBN: 3540343792 (print)
In this paper, a novel supervised information feature compression algorithm is proposed. First, drawing on information theory, we analyse the concept and properties of cross entropy, then propose a new concept, symmetric cross entropy (SCE), and show that the SCE is a distance measure that can be used to quantify the difference between two random variables. Second, we adopt the SCE as a class separability criterion for information feature compression and design a novel algorithm based on it. Finally, the experimental results demonstrate that the algorithm is valid and reliable, and that it provides a new research approach for feature compression, data mining, and pattern recognition.
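The abstract does not give the formula for the SCE. A minimal sketch, assuming it is the symmetrised relative entropy D(p||q) + D(q||p) between two discrete distributions, is shown below; the function names `relative_entropy` and `symmetric_cross_entropy` are hypothetical, and the authors' definition may differ.

```python
import math


def relative_entropy(p, q):
    """D(p || q) for discrete distributions; q must be positive where p is."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def symmetric_cross_entropy(p, q):
    """Assumed SCE: the sum of the two directed relative entropies."""
    return relative_entropy(p, q) + relative_entropy(q, p)


# Two example class-conditional distributions over three feature values.
p = [0.7, 0.2, 0.1]
q = [0.1, 0.3, 0.6]
sce_pq = symmetric_cross_entropy(p, q)
sce_qp = symmetric_cross_entropy(q, p)
sce_pp = symmetric_cross_entropy(p, p)
```

Under this assumption the measure is symmetric in its arguments, zero for identical distributions, and grows as the distributions diverge, which is the behaviour a class separability criterion needs.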
ISBN: 3540464840 (print)
Determining the relevant features is a combinatorial task in various fields of machine learning, such as text mining, bioinformatics, and pattern recognition. Several scholars have developed methods to extract the relevant features, but no method is clearly superior. Breiman proposed Random Forest, which classifies a pattern using the CART tree algorithm, and his method yields good results compared to other classifiers. Taking advantage of Random Forest and using the wrapper approach first introduced by Kohavi et al., we propose an algorithm named Dynamic Recursive Feature Elimination (DRFE) to find the optimal subset of features, reducing noise in the data and increasing classifier performance. In our method, we use Random Forest as the induced classifier and develop our own feature elimination function by adding extra terms to the feature scoring. We conducted experiments with two public datasets: Colon cancer and Leukemia cancer. The experimental results on these real-world data showed that the proposed method has a higher prediction rate than the baseline algorithm. The obtained results are comparable to, and sometimes better than, those of widely used classification methods in the feature selection literature.
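The general shape of recursive feature elimination can be sketched as below. This is plain recursive elimination with a pluggable scorer, not DRFE itself: the abstract does not specify the extra scoring terms, and the stand-in scorer `class_mean_gap` (absolute gap between the two class means) replaces the Random Forest scoring used in the paper. All names are hypothetical.

```python
def class_mean_gap(rows, labels, feature):
    """Stand-in relevance score: absolute gap between the two class means.

    Assumption: binary labels 0/1. The paper scores features with a
    Random Forest plus extra terms, which are not reproduced here.
    """
    pos = [r[feature] for r, y in zip(rows, labels) if y == 1]
    neg = [r[feature] for r, y in zip(rows, labels) if y == 0]
    return abs(sum(pos) / len(pos) - sum(neg) / len(neg))


def recursive_elimination(rows, labels, n_keep, score=class_mean_gap):
    """Repeatedly rescore the surviving features and drop the weakest one."""
    features = list(range(len(rows[0])))
    while len(features) > n_keep:
        scores = {f: score(rows, labels, f) for f in features}
        features.remove(min(features, key=scores.get))
    return features


# Toy data: feature 0 separates the classes, features 1 and 2 do not.
rows = [[0.9, 0.1, 5.0], [0.8, 0.2, 5.1], [0.1, 0.15, 5.0], [0.2, 0.1, 5.2]]
labels = [1, 1, 0, 0]
selected = recursive_elimination(rows, labels, n_keep=1)
```

With a wrapper classifier, the scores would change after each removal because the model is retrained on the surviving features; the stand-in scorer here is independent per feature, so only the loop structure carries over.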