ISBN:
(Print) 9783319210247; 9783319210230
This paper describes a Sentiment Analysis (SA) method to analyze the polarity of tweets and to enable governments to quantitatively describe the opinion of active social network users with respect to topics of interest to the Public Administration. We propose an optimized approach that employs a document-level and a dataset-level supervised machine learning classifier to provide accurate results in both individual and aggregated sentiment classification. The aim of this work is also to identify the types of features that yield the most accurate sentiment classification for a dataset of Italian tweets in the context of a Public Administration event, also taking the size of the training set into account. This work uses a dataset of 1,700 Italian tweets relating to the public event "Lecce 2019 - European Capital of Culture".
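As a rough illustration of the document-level supervised classification step described above (not the paper's actual feature set or model, which are not reproduced here), the following Python sketch trains a tweet-polarity classifier on toy data; TF-IDF word n-grams and a linear SVM are assumptions made for the example.

# Minimal sketch of a document-level supervised tweet-polarity classifier.
# TF-IDF word n-grams and a linear SVM are illustrative assumptions; the toy
# tweets stand in for the 1,700-tweet Italian dataset used in the paper.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

tweets = ["great event in Lecce", "terrible organisation", "loved the concert", "waste of time"]
labels = ["positive", "negative", "positive", "negative"]

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), lowercase=True),
    LinearSVC(),
)
model.fit(tweets, labels)

# Document-level predictions; dataset-level polarity can be aggregated from them.
print(model.predict(["amazing cultural programme", "very disappointing"]))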
ISBN:
(Print) 9788360810668
The paper contributes to problem solving in the semantic browsing and analysis of scientific articles. With reference to the presented visual interface, four of the most popular mapping methods, including the authors' own approach - MDS with spherical topology - have been compared. For the comparison, quantitative measures were applied that allowed the most appropriate mapping to be selected, one that accurately reflects the dynamics of the data. For the quantitative analysis the authors used machine learning and pattern recognition algorithms and computed the clusterization degree, fractal dimension and lacunarity. Local density differences, clusterization, homogeneity, and gappiness were measured to show the most acceptable layout for analysis, perception and exploration. A visual interface for analyzing how computer science has evolved over the last two decades is presented on a website. The results of both the quantitative and the qualitative analysis show good convergence.
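For readers unfamiliar with MDS-based mapping, the sketch below shows a plain planar MDS embedding of a toy article-dissimilarity matrix using scikit-learn; it is only an assumed illustration and does not reproduce the authors' spherical-topology variant or their clusterization-degree, fractal-dimension and lacunarity measures.

# Illustrative sketch only: a planar MDS layout computed from pairwise
# article dissimilarities. The dissimilarity matrix here is random toy data.
import numpy as np
from sklearn.manifold import MDS

rng = np.random.default_rng(0)
d = rng.random((10, 10))
dissimilarity = (d + d.T) / 2          # make the toy matrix symmetric
np.fill_diagonal(dissimilarity, 0.0)   # zero self-dissimilarity

embedding = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = embedding.fit_transform(dissimilarity)
print(coords.shape)  # (10, 2) layout coordinates for a visual interface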
ISBN:
(Print) 9783662482247; 9783662482230
The fiber optic gyro (FOG) is one of the most important components of a fiber inertial measurement unit (FIMU), and its production quality affects the accuracy of both the FOG and the FIMU. After a decade of improvements in craftwork design, the main target has shifted to improving quality control during production. This paper proposes a new methodology for automatic production quality control. The method uses computer vision technology to build a system that classifies real-time fiber ring production images, applying pattern recognition and data mining technology to identify the type of fault shown in a production image. Computer vision pre-processing denoises the real-time images to extract features; on the resulting qualified images, a fast pattern recognition method based on support vector regression (SVR) converges quickly and delivers good results.
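A minimal sketch of the SVR-based recognition idea is given below, under the assumption that each production image is reduced to a simple feature vector (here a plain intensity histogram on synthetic data); the paper's actual computer-vision pre-processing and fault taxonomy are not reproduced.

# Rough sketch, not the paper's pipeline: map simple image features to a
# fault score with support vector regression (SVR).
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)

def histogram_features(image, bins=16):
    """Intensity histogram as a crude, illustrative feature vector."""
    hist, _ = np.histogram(image, bins=bins, range=(0.0, 1.0), density=True)
    return hist

# Synthetic "production images" paired with a known fault score in [0, 1].
images = [rng.random((32, 32)) for _ in range(50)]
scores = rng.random(50)

X = np.array([histogram_features(img) for img in images])
model = SVR(kernel="rbf")
model.fit(X, scores)

print(model.predict(histogram_features(rng.random((32, 32))).reshape(1, -1)))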
ISBN:
(Print) 9783319208046; 9783319208039
We propose an approach for the recognition of mobile devices on interactive surfaces that do not support optical markers. Our system only requires an axis-aligned bounding box of the object placed on the touchscreen, in combination with position data from the mobile device's integrated inertial measurement unit (IMU). We put special emphasis on maximum flexibility in terms of compatibility with varying multi-touch sensor techniques and different kinds of mobile devices. A new device can be added to the system with a short training phase, during which the device is moved across the interactive surface. A device model is automatically created from the recorded data using support vector machines. Different devices of the same size are identified by analyzing their IMU data streams for transitions into a horizontally resting state. The system has been tested in a museum environment.
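The sketch below illustrates, on simulated data, how a per-device model might be trained from bounding-box measurements with a support vector machine; the feature set and kernel are assumptions, and the IMU-based disambiguation of equally sized devices is omitted.

# Hedged sketch: a device model trained from bounding-box measurements
# recorded while devices are moved across the surface. Features are the
# box width and height in millimetres (assumed for illustration).
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
phone = rng.normal(loc=[70, 140], scale=2.0, size=(100, 2))
tablet = rng.normal(loc=[170, 240], scale=3.0, size=(100, 2))

X = np.vstack([phone, tablet])
y = np.array(["phone"] * 100 + ["tablet"] * 100)

classifier = SVC(kernel="rbf")
classifier.fit(X, y)

# New axis-aligned bounding box reported by the interactive surface:
print(classifier.predict([[72, 138]]))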
ISBN:
(Print) 9783319210247; 9783319210230
Despite crucial recent advances, the problem of frequent itemset mining still faces major challenges. This is particularly the case when (i) the mining process must be massively distributed and (ii) the minimum support (MinSup) is very low. In this paper, we study the effectiveness and leverage of specific data placement strategies for improving the performance of parallel frequent itemset mining (PFIM) in MapReduce, a highly distributed computation framework. By offering a clever data placement and an optimal organization of the extraction algorithms, we show that itemset discovery effectiveness does not depend only on the deployed algorithms. We propose ODPR (Optimal Data-Process Relationship), a solution for fast mining of frequent itemsets in MapReduce. Our method allows discovering itemsets from massive datasets where standard solutions from the literature do not scale. Indeed, in a massively distributed environment, the arrangement of both the data and the different processes can make the global job either completely inoperative or very effective. Our proposal has been evaluated using real-world datasets, and the results illustrate a significant scale-up obtained with very low MinSup, which confirms the effectiveness of our approach.
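The toy, single-process Python sketch below illustrates the map/reduce support-counting idea that parallel frequent itemset miners build on; it does not implement ODPR's data placement or process organisation, and the transactions and MinSup value are invented for illustration.

# Toy illustration of the map/reduce support-counting pass behind PFIM.
from collections import Counter
from itertools import combinations

transactions = [
    {"a", "b", "c"},
    {"a", "c"},
    {"b", "c", "d"},
    {"a", "b", "c", "d"},
]
min_sup = 2  # absolute minimum support (assumed value)

def mapper(transaction):
    # "Map" phase: emit (itemset, 1) pairs for 1- and 2-itemsets per transaction.
    for k in (1, 2):
        for itemset in combinations(sorted(transaction), k):
            yield itemset, 1

# "Reduce" phase: sum the counts and keep the frequent itemsets.
counts = Counter()
for t in transactions:
    for itemset, one in mapper(t):
        counts[itemset] += one

frequent = {itemset: c for itemset, c in counts.items() if c >= min_sup}
print(frequent)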
ISBN:
(Print) 9783319210247; 9783319210230
Data mining is the process of extracting useful information from huge amounts of data. One of its most common applications is the use of different algorithms and tools to estimate future events based on previous experience. In this context, many researchers have used data mining techniques to support and solve challenges in higher education. Among the many challenges facing this level of education is helping students choose the right courses to improve their success rate. An early prediction of students' grades may help to solve this problem and improve students' performance, course selection, success rate and retention. In this paper we use different classification techniques to build a performance prediction model based on previous students' academic records. The model can be easily integrated into a recommender system that helps students with their course selection, based on their own grades and those of previously graduated students. Our model uses two of the most recognised decision tree classification algorithms, ID3 and J48. The advantages of such a system are presented along with a comparison of the performance of the two algorithms.
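As an assumed illustration only, the sketch below trains a small decision tree on invented student records; scikit-learn's CART with an entropy criterion stands in for ID3 and J48 (Weka's C4.5), which are the algorithms actually compared in the paper.

# Simplified sketch of a decision-tree grade predictor on invented records.
from sklearn.tree import DecisionTreeClassifier

# Features: [grade in prerequisite A, grade in prerequisite B, attendance %]
X = [
    [85, 78, 90],
    [60, 55, 70],
    [92, 88, 95],
    [45, 50, 60],
    [75, 70, 80],
]
y = ["pass", "fail", "pass", "fail", "pass"]  # outcome in the target course

tree = DecisionTreeClassifier(criterion="entropy", max_depth=3)
tree.fit(X, y)

# Predicted outcome for a new student considering the course:
print(tree.predict([[70, 65, 85]]))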
ISBN:
(Print) 9781467386753
The difficulties of data streams, i.e. their infinite length, the occurrence of concept drift and the possible emergence of novel classes, are topics of high relevance in the field of recognition systems. To overcome these problems, the system should be updated continuously with new data while the processing time is kept small. We propose an incremental Parzen window kernel density estimator (IncPKDE) which addresses the problems of data streaming using a model that is insensitive to the training set size and has the ability to detect novelties within multi-class recognition systems. The results show that the IncPKDE approach has superior properties, especially regarding processing time, and that it is robust to wrongly labelled samples when used in a semi-supervised learning scenario.
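The following naive sketch shows the general Parzen-window idea with incremental updates on a one-dimensional stream; unlike IncPKDE it stores every sample, so it is not insensitive to the training set size, and it includes no novelty detection.

# Naive incremental Parzen-window density estimator with a Gaussian kernel.
import numpy as np

class ParzenWindow:
    def __init__(self, bandwidth=0.5):
        self.h = bandwidth
        self.samples = np.empty((0, 1))

    def update(self, x):
        """Incrementally add new one-dimensional samples from the stream."""
        x = np.asarray(x, dtype=float).reshape(-1, 1)
        self.samples = np.vstack([self.samples, x])

    def density(self, query):
        """Kernel density estimate at the query points."""
        q = np.asarray(query, dtype=float).reshape(-1, 1)
        diffs = (q - self.samples.T) / self.h
        kernels = np.exp(-0.5 * diffs**2) / np.sqrt(2 * np.pi)
        return kernels.mean(axis=1) / self.h

pw = ParzenWindow(bandwidth=0.3)
pw.update(np.random.default_rng(3).normal(size=200))
print(pw.density([0.0, 1.0, 3.0]))  # low density far from the data suggests novelty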
ISBN:
(Digital) 9783319197753; 9783319197760
ISBN:
(Print) 9783319197753; 9783319197760
These proceedings present recent practical applications of Computational Biology and Bioinformatics. They contain the proceedings of the 9th International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB), held at the University of Salamanca, Spain, on June 3rd-5th, 2015. PACBB is an annual international meeting dedicated to emerging and challenging applied research in Bioinformatics and Computational Biology. Biological and biomedical research is increasingly driven by experimental techniques that challenge our ability to analyse, process and extract meaningful knowledge from the underlying data. The impressive capabilities of next-generation sequencing technologies, together with novel and ever-evolving types of omics data technologies, pose an increasingly complex set of challenges for the growing fields of Bioinformatics and Computational Biology. The analysis of the datasets produced, and their integration, call for new algorithms and approaches from fields such as databases, statistics, data mining, machine learning, optimization, computer science and artificial intelligence. Clearly, biology is more and more a science of information, requiring tools from the computational sciences.
ISBN:
(Print) 9783319210247; 9783319210230
Patient-specific models are instance-based learning algorithms that take advantage of the particular features of the patient case at hand to predict an outcome. We introduce two patient-specific algorithms based on the decision tree paradigm that use the AUC as the metric for selecting an attribute. We apply the patient-specific algorithms to predict outcomes in several datasets, including medical datasets. Compared to the standard entropy-based method, the AUC-based patient-specific decision path models performed equivalently in terms of area under the ROC curve (AUC). Our results provide support for patient-specific methods being a promising approach for making clinical predictions.
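A hedged sketch of the AUC-based attribute-selection step is given below: each candidate attribute is scored by the AUC it achieves as a single-feature predictor of the outcome, and the best-scoring attribute would become the next node on the decision path. The data, the attribute names and the handling of the specific patient's values are invented for illustration.

# Score candidate attributes by single-feature AUC and pick the best one.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(4)
n = 200
outcome = rng.integers(0, 2, size=n)
attributes = {
    "lab_test_a": outcome * 0.6 + rng.random(n) * 0.4,  # informative (by construction)
    "lab_test_b": rng.random(n),                        # pure noise
    "symptom_c": outcome * 0.3 + rng.random(n) * 0.7,   # weakly informative
}

def best_attribute_by_auc(attrs, y):
    scores = {name: roc_auc_score(y, values) for name, values in attrs.items()}
    return max(scores, key=scores.get), scores

best, scores = best_attribute_by_auc(attributes, outcome)
print(best, scores)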
ISBN:
(Print) 9783319210247; 9783319210230
In data mining systems that operate on complex data with structural relationships, graphs are often used to represent the basic objects under study. Yet the high representational power of graphs is accompanied by an increased complexity of the associated algorithms. Exact graph similarity or distance, for instance, can only be computed in exponential time. Recently, an algorithmic framework has been presented that allows graph dissimilarity computation in cubic time with respect to the number of nodes. This fast computation comes at the expense, however, of generally overestimating the true distance. The present paper introduces six different post-processing algorithms that can be integrated into this suboptimal graph distance framework. These novel extensions aim at improving the overall distance quality while keeping the low computation time of the approximation. An experimental evaluation clearly shows that the proposed heuristics substantially reduce the overestimation in the existing approximation framework while the computation time remains remarkably low.
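For orientation, the sketch below shows the kind of assignment-based approximation such a cubic-time framework relies on: node substitutions, deletions and insertions are priced in a square cost matrix and a minimum-cost assignment yields an estimate of the graph distance. Edge costs, the unit costs used here and the six post-processing heuristics themselves are omitted or assumed.

# Assignment-based upper bound on the graph edit distance (node labels only).
import numpy as np
from scipy.optimize import linear_sum_assignment

FORBIDDEN = 1e6  # large finite cost used to forbid off-diagonal indel assignments

def approximate_graph_distance(labels_g1, labels_g2, sub_cost=1.0, indel_cost=1.0):
    n, m = len(labels_g1), len(labels_g2)
    cost = np.zeros((n + m, n + m))

    # Substitution block: 0 if the node labels match, sub_cost otherwise.
    for i, a in enumerate(labels_g1):
        for j, b in enumerate(labels_g2):
            cost[i, j] = 0.0 if a == b else sub_cost

    # Deletion block (g1 node -> epsilon) and insertion block (epsilon -> g2 node).
    cost[:n, m:] = FORBIDDEN
    np.fill_diagonal(cost[:n, m:], indel_cost)
    cost[n:, :m] = FORBIDDEN
    np.fill_diagonal(cost[n:, :m], indel_cost)
    # The epsilon-to-epsilon block (bottom right) stays at cost 0.

    rows, cols = linear_sum_assignment(cost)
    return cost[rows, cols].sum()

print(approximate_graph_distance(["C", "N", "O"], ["C", "C", "O", "H"]))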