ISBN: 9783319108407; 9783319108391 (print)
In recent years, social networks have played a strategic role in the lives of many companies. Therefore, several decision makers have turned to these networks to make better decisions. Furthermore, the increased interaction between social networks and web users has led many companies to use a data warehouse to collect information about their fans. This paper deals with the construction of a multidimensional schema from unstructured data extracted from a social network. The construction is carried out from a Facebook page in order to analyze customers' opinions. A real case study has been developed to illustrate the proposed method and to confirm that social network analysis can predict the chance of success of products.
A Zero-Latency Data Warehouse (ZLDW) cannot be built on a standard ETL process, whose time frames limit access to current data and prevent users' needs from being taken into account. Therefore, after a thorough analysis of this issue and of those related to workload balancing, an innovative system based on a Workload Balancing Unit (WBU) was created. In this paper we present an innovative workload balancing algorithm, CTBE (Choose Transaction By Election), which analyzes all incoming transactions and creates a schema of dependencies between them. In addition, a cache in the WBU makes it possible to store information on incoming transactions and to exchange messages with the systems that transmit updates and users' queries. With this work we intend to present an innovative system designed to support a Zero-Latency Data Warehouse.
In this paper, a methodology for data validation and reconstruction of flow meter sensor data in water networks is presented. The raw data validation is inspired by the Spanish norm (AENOR-UNE norm 500540). The methodology consists of assigning a quality level to the data. These quality levels are assigned according to the number of tests that the data have passed. The methodology takes into account not only spatial models but also temporal models relating the different sensors. The methodology is applied to real data acquired from the ATLLc Water Network. The results demonstrate the performance of the proposed methodology in detecting errors in measurements and in reconstructing them.
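The idea of assigning a quality level from the number of tests passed can be sketched as follows. This is only an illustration: the two tests (`in_physical_range`, `bounded_rate_of_change`) and their thresholds are hypothetical and are not taken from the paper or from the AENOR-UNE norm.

```python
def in_physical_range(value, previous, low=0.0, high=500.0):
    # Hypothetical test: the reading lies within assumed physical limits.
    return low <= value <= high


def bounded_rate_of_change(value, previous, max_step=50.0):
    # Hypothetical test: the reading does not jump too far from the previous one.
    return previous is None or abs(value - previous) <= max_step


def quality_level(value, previous, tests):
    # Quality level = number of validation tests the reading passes.
    return sum(1 for test in tests if test(value, previous))


def label_series(readings, tests):
    # Assign a quality level to every reading of a flow meter series.
    levels = []
    previous = None
    for value in readings:
        levels.append(quality_level(value, previous, tests))
        previous = value
    return levels


if __name__ == "__main__":
    tests = [in_physical_range, bounded_rate_of_change]
    print(label_series([100.0, 110.0, 400.0, -5.0], tests))
```

Readings with a low level would be the natural candidates for reconstruction from the spatial and temporal models mentioned in the abstract.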
A drowsy driver detection system based on a new method for head posture estimation is proposed. In the first part, we introduce six possible models of head positions that can be detected by our algorithm, which is explained in the second part. Three key stages characterize our method. First, we detect the driver's face with the Viola and Jones algorithm. Then, we extract the image reference and the non image reference coordinates from the face's bounding box. Finally, by measuring both the head inclination angle and the distances between the extracted coordinates, we classify the head state (normal or inclined). Test results demonstrate that the proposed system can efficiently measure the aforementioned parameters and detect the head state as a sign of the driver's drowsiness.
We consider an open issue in the design of pattern classifiers, i.e., choosing between selecting the best classifier of a given ensemble and combining all the available ones using a trainable fusion rule. While the latter choice can in principle outperform the former, the actual effectiveness of both is affected by small sample size problems. This raises the need to investigate under which conditions one choice is better than the other. We provide a first contribution by deriving an analytical expression of the expected error probability of best classifier selection, and by comparing it with that of a well-known linear fusion rule, implemented with the Fisher linear discriminant.
In this paper, we propose a novel dictionary learning method for hyperspectral image classification. The proposed method, linear regression Fisher discrimination dictionary learning (LRFDDL), obtains a more discriminative dictionary and a classifier by incorporating a linear regression term and the Fisher discrimination into the objective function during training. The linear regression term makes predicted and actual labels as close as possible, while the Fisher discrimination is imposed on the sparse codes so that they have small within-class scatter but large between-class scatter. Experiments show that LRFDDL significantly improves the performance of hyperspectral image classification.
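To make the Fisher criterion on the sparse codes concrete, the sketch below computes the traces of the standard within-class and between-class scatter matrices for a set of labelled code vectors. It uses the textbook definitions and is not the LRFDDL objective itself; the function names are illustrative.

```python
def mean(vectors):
    # Component-wise mean of a list of equal-length vectors.
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sq_dist(a, b):
    # Squared Euclidean distance between two vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))


def scatter_traces(codes, labels):
    # Returns (tr(S_W), tr(S_B)):
    #   tr(S_W) = sum over classes of squared distances to the class mean
    #   tr(S_B) = sum over classes of n_c * squared distance between the
    #             class mean and the overall mean
    overall = mean(codes)
    within = 0.0
    between = 0.0
    for c in set(labels):
        members = [x for x, l in zip(codes, labels) if l == c]
        m_c = mean(members)
        within += sum(sq_dist(x, m_c) for x in members)
        between += len(members) * sq_dist(m_c, overall)
    return within, between


if __name__ == "__main__":
    codes = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (12.0, 0.0)]
    labels = [0, 0, 1, 1]
    print(scatter_traces(codes, labels))
```

A discriminative set of codes has a small first value and a large second value; the two always sum to the trace of the total scatter.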
Diabetic retinopathy is a chronic progressive eye disease associated with a group of eye problems that arise as a complication of diabetes. This disease may cause severe vision loss or even blindness. Specialists analyze fundus images in order to diagnose it and to give specific treatments. Fundus images are photographs of the retina taken with a retinal camera; this is a noninvasive medical procedure that provides a way to analyze the retina in patients with diabetes. The correct classification of these images depends on the ability and experience of the specialists, and also on the quality of the images. In this paper we present a method for diabetic retinopathy detection. The method is divided into two stages: in the first, we use local binary patterns (LBP) to extract local features, while in the second, we apply artificial neural networks, random forest and support vector machines to the detection task. Preliminary results show that random forest was the best classifier, with an accuracy of 97.46% on a data set of 71 images.
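The feature extraction stage can be illustrated with the basic 3x3 local binary pattern operator. This is a minimal sketch of the classic LBP, assuming a grayscale image stored as a list of rows; the neighbor ordering and the `>=` comparison are one common convention, and the paper may use a different LBP variant.

```python
# Clockwise neighbor offsets starting at the top-left pixel (assumed ordering).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    # Compare each of the 8 neighbors with the center pixel and pack the
    # comparison results into an 8-bit code.
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    # Histogram of LBP codes over all interior pixels; this 256-bin vector
    # is the kind of feature that can be passed to a classifier.
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist


if __name__ == "__main__":
    flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
    print(lbp_code(flat, 1, 1))
```

The resulting histogram would then be the input to the classifiers compared in the second stage.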
Imbalanced data constitutes a great difficulty for most classifier learning algorithms. However, as recent works claim, class imbalance is not a problem in itself, and performance degradation is also associated with other factors related to the distribution of the data, such as the presence of noisy and borderline examples in the areas surrounding class boundaries. This contribution proposes to extend SMOTE with a noise filter called Iterative-Partitioning Filter (IPF), which can overcome these problems. The properties of this proposal are discussed in a controlled experimental study against SMOTE and its most well-known generalizations. The results show that the new proposal performs better than existing SMOTE generalizations for all these different scenarios.
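For readers unfamiliar with SMOTE, the sketch below shows its core interpolation step: a synthetic minority example is placed at a random point on the segment between a minority example and one of its k nearest minority neighbors. It covers only plain SMOTE, not the IPF noise filter proposed in the paper, and the function name and parameters are illustrative.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    # Generate n_new synthetic points from a list of minority-class tuples.
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        x = minority[i]
        # k nearest minority neighbors of x, excluding x itself.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbors = sorted(others, key=lambda p: math.dist(x, p))[:k]
        nn = rng.choice(neighbors)
        # Interpolate: x + u * (nn - x) with u drawn uniformly from [0, 1).
        u = rng.random()
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(x, nn)))
    return synthetic


if __name__ == "__main__":
    minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    for point in smote(minority, n_new=3):
        print(point)
```

Because new points are interpolated between existing ones, noisy or borderline minority examples can spread into the majority region, which is the problem a post-hoc noise filter is meant to address.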
Learning from imbalanced multilabel data is a challenging task that has attracted considerable attention lately. Some resampling algorithms used in traditional classification, such as random undersampling and random oversampling, have already been adapted to work with multilabel datasets. In this paper MLeNN (MultiLabel edited Nearest Neighbor), a heuristic multilabel undersampling algorithm based on the well-known Wilson's Edited Nearest Neighbor Rule, is proposed. The samples to be removed are heuristically selected instead of randomly picked. The ability of MLeNN to improve classification results is experimentally tested, and its performance against multilabel random undersampling is analyzed. As will be shown, MLeNN is a competitive multilabel undersampling alternative, able to significantly enhance classification results.
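As background, the sketch below implements the classic single-label Edited Nearest Neighbor rule on which MLeNN is based: a sample is discarded when the majority label of its k nearest neighbors disagrees with its own label. It is not MLeNN itself, whose multilabel selection heuristic is not described in the abstract.

```python
import math
from collections import Counter


def edited_nearest_neighbor(X, y, k=3):
    # Return the indices of the samples that survive Wilson's editing rule.
    keep = []
    for i, xi in enumerate(X):
        # Distances from sample i to every other sample.
        dists = sorted(
            (math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i
        )
        neighbor_labels = [y[j] for _, j in dists[:k]]
        majority = Counter(neighbor_labels).most_common(1)[0][0]
        # Keep the sample only if its neighbors agree with its label.
        if majority == y[i]:
            keep.append(i)
    return keep


if __name__ == "__main__":
    X = [(0, 0), (0, 1), (1, 0), (0.1, 0.1), (5, 5), (5, 6), (6, 5)]
    y = ["A", "A", "A", "B", "B", "B", "B"]
    print(edited_nearest_neighbor(X, y, k=3))
```

In the example, the "B" sample sitting inside the "A" cluster is the only one removed.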
Finding all frequent itemsets (patterns) in a given database is a challenging process that in general consumes time and space. Time is measured in terms of the number of database scans required to produce all frequent itemsets. Space is consumed by the potential frequent itemsets that end up classified as not frequent. To overcome both limitations, namely space and time, we propose a novel approach for generating all possible frequent itemsets by introducing a new representation of items: items are divided into groups of four, and within each group each item is assigned one of four prime numbers, namely 2, 3, 5, and 7. The reported results demonstrate the applicability and effectiveness of the proposed approach. Our approach scales in terms of both the number of transactions and the number of items.
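One way such a prime-based representation could work is sketched below. It assumes that each group of four items is encoded as the product of the primes of the items present, so that itemset containment reduces to a divisibility check per group. This use of the encoding is an assumption for illustration; the abstract does not describe how the paper exploits it, and all function names are hypothetical.

```python
PRIMES = (2, 3, 5, 7)


def build_codebook(items):
    # Map each item to (group index, prime): items are split into groups of
    # four, and each item in a group gets one of the primes 2, 3, 5, 7.
    ordered = sorted(items)
    return {item: (i // 4, PRIMES[i % 4]) for i, item in enumerate(ordered)}


def encode(itemset, codebook, n_groups):
    # Encode an itemset as one integer per group: the product of the primes
    # of the items present in that group (1 if the group is empty).
    code = [1] * n_groups
    for item in set(itemset):
        group, prime = codebook[item]
        code[group] *= prime
    return tuple(code)


def contains(transaction_code, pattern_code):
    # A transaction contains a pattern iff, in every group, the pattern's
    # product divides the transaction's product.
    return all(t % p == 0 for t, p in zip(transaction_code, pattern_code))


def support(pattern, transactions, codebook, n_groups):
    # Number of transactions that contain the pattern.
    pattern_code = encode(pattern, codebook, n_groups)
    return sum(
        contains(encode(t, codebook, n_groups), pattern_code)
        for t in transactions
    )


if __name__ == "__main__":
    items = ["a", "b", "c", "d", "e"]
    codebook = build_codebook(items)
    n_groups = (len(items) + 3) // 4
    transactions = [{"a", "c", "e"}, {"a", "b"}, {"a", "d", "e"}]
    print(support({"a", "e"}, transactions, codebook, n_groups))
```

Since the largest per-group product is 2 * 3 * 5 * 7 = 210, each group fits in a single byte under this assumed encoding.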