ISBN: (print) 3540459162
The two main challenges associated with mining data streams are concept drift and data noise. Current algorithms mainly depend on the robustness of the base classifier or of learning ensembles, and have no active mechanism for dealing with noise. However, noise can still induce drastic drops in accuracy. In this paper, we present a clustering-based method to filter out hard instances and noise instances from data streams. We also propose a trigger to detect concept drift and build RobustBoosting, an ensemble classifier, by boosting the hard instances. We evaluated the RobustBoosting algorithm and the AdaptiveBoosting algorithm [1] on synthetic and real-life data sets. The experimental results show that the proposed method has a substantial advantage over the AdaptiveBoosting algorithm in prediction accuracy, and that it can converge to target concepts efficiently with high accuracy on datasets with noise levels as high as 40%.
Internet users' demand for privacy during their e-activities compels the Internet community to develop techniques that offer privacy to Internet users, known as Privacy Enhancing Technologies (PETs...
Privacy Enhancing Technology (PET) is the technology responsible for hiding the identity of Internet users, whereas network forensics is the technology responsible for revealing the identity of Internet users who p...
ISBN: (print) 1595933751
Software engineering researchers are increasingly relying on the empirical approach to advance the state of the art. The level of empirical rigor and evidence required to guide software engineering research, however, can vary drastically depending on many factors. In this session we identify some of these factors through a discussion of the state of the art in performing empirical studies in software engineering, and we show how we can utilize the notion of empirical maturity to set and adjust the empirical expectations for software engineering research efforts. Regarding the state of the art in performing empirical studies, we will offer perspectives on two classes of study: those concerned with humans utilizing a technology, e.g., a person applying a methodology, a technique, or a tool, where human skills and the ability to interact with the technology are some of the prime issues, and those concerned with the application of the technology to an artifact, e.g., a technique or tool applied to a design or a program. In the first case, the emphasis is typically on issues like feasibility, usefulness, and then on effectiveness. The technology tends to be less well specified and based more on the experience and skills of the technology applier. In the second case, the emphasis is typically on the efficiency and effectiveness of the technology. The technology tends to be well defined and the assumption is that individual skill and experience play a less important role. We will discuss the set of factors that influence the design, implementation, and validity of these studies. Regarding empirical maturity and its implications for the SE community's expectations, we will provide examples of the large spectrum of studies with different maturity levels that can be performed to successfully support software engineering research. We will then identify and analyze the following aspects that are likely to impact a study's maturity level: technology (well-specified vs. unde
Deploying probes easily and dynamically is very important to the success of end-to-end measurement systems. A P2P-like dynamic probe deployment mechanism for large-scale end-to-end network measurement is proposed in t...
ISBN: (print) 1595934472
This paper proposes behavioral footprinting, a new dimension of worm profiling based on worm infection sessions. A worm's infection session contains a number of steps (e.g., for probing, exploitation, and replication) that are exhibited in a certain order in every successful worm infection. Behavioral footprinting complements content-based signatures by enriching a worm's profile, which will be used in worm identification, an important task in post-attack investigation and recovery. We propose an algorithm to extract a worm's behavioral footprint from the worm's traffic traces. Our evaluation with a number of real worms and their variants confirms the existence of worms' behavioral footprints and demonstrates their effectiveness in worm identification. Copyright 2006 ACM.
Image inpainting is an artistic procedure to recover a damaged painting or picture. In this paper, we propose a novel approach for image inpainting. In this approach, the Mumford-Shah (MS) model and the level set meth...
This paper describes a data replication service for large-scale, data-intensive applications whose results can be shared among geographically distributed scientists. We introduce two kinds of data replication techniqu...
Cooperative caching is a very important technique for efficient data dissemination and sharing in mobile ad hoc networks (MANETs). Many applications have requirements on the consistency of the content cached on differ...
The vision system of human beings is very sensitive to lines. Sketches composed of line drawings provide a useful representation for image structures, which can be used for recognition and which is capable of serving ...