Authors:
Gong, F; Wang, BT; Chau, FT; Liang, YZ
Hong Kong Polytech Univ, Dept Appl Biol & Chem Technol, Hong Kong, Hong Kong, Peoples R China; Cent S Univ, Coll Chem & Chem Engn, Inst Chemometr & Intelligent Analyt Instruments, Res Ctr Modernizat Chinese Herbal Med, Changsha 410083, Peoples R China
Recently, the fingerprinting approach using chromatography has become one of the most potent tools for quality assessment of herbal medicine. Due to the complexity of chromatographic fingerprints and the irreproducibility of chromatographic instruments and experimental conditions, several chemometric approaches, such as variance analysis, peak alignment, correlation analysis, and pattern recognition, were employed in this work to deal with the chromatographic fingerprint. To facilitate the data preprocessing, a software package named Computer Aided Similarity Evaluation (CASE) was also developed. All chemometric algorithms for CASE were coded in MATLAB 5.3 for Windows. Data loading, removal, cutting, smoothing, compression, background and retention-time-shift correction, normalization, peak identification and matching, variation determination of common peaks/regions, similarity comparison, sample classification, and other data processes associated with chromatographic fingerprints are supported by the software. A case study on high-pressure liquid chromatographic (HPLC) fingerprints of 50 Rhizoma chuanxiong samples from different sources demonstrated that the chemometric approaches investigated in this work are reliable and user-friendly for preprocessing chromatographic fingerprints of herbal medicines for quality assessment.
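As a rough illustration of the kind of preprocessing CASE performs, the sketch below smooths a one-dimensional chromatogram, subtracts a crude linear baseline, and normalizes the result. The original software is MATLAB-based; this Python version, the function name preprocess_fingerprint, and the parameter choices are illustrative assumptions, not the authors' code.

# Hypothetical sketch of CASE-style smoothing, background correction,
# and normalization of a chromatographic fingerprint (illustrative only).
import numpy as np
from scipy.signal import savgol_filter

def preprocess_fingerprint(signal, window=11, polyorder=3):
    """Smooth a 1-D chromatogram, correct its background, and normalize it."""
    smoothed = savgol_filter(signal, window_length=window, polyorder=polyorder)
    # Crude background correction: subtract a linear baseline through the
    # endpoints (a stand-in for CASE's actual correction method).
    baseline = np.linspace(smoothed[0], smoothed[-1], smoothed.size)
    corrected = smoothed - baseline
    # Normalize so fingerprints from different runs are comparable.
    return corrected / np.abs(corrected).sum()

chromatogram = np.abs(np.random.randn(500)).cumsum()  # toy stand-in data
fp = preprocess_fingerprint(chromatogram)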
ISBN (print): 9781509044702
In the field of data science, data are usually considered independently of the problem to be solved. The originality of this paper consists in handling huge instances of combinatorial problems with data-mining technologies in order to reduce the complexity of their treatment. Such a task arises in Web combinatorial optimization problems such as Internet data-packet routing and web clustering. We focus in particular on the satisfiability of Boolean formulae, but the proposed idea could be adopted for any other complex problem. The aim is to explore the satisfiability instance using data-mining techniques in order to reduce its size prior to solving it. An estimated solution for the reduced instance is then computed using a hybrid algorithm based on the DPLL technique and a genetic algorithm, and is compared to the solution of the initial instance in order to validate the method's effectiveness. We performed experiments on the well-known BMC datasets and show the benefits of using data-mining techniques as a pretreatment prior to solving the problem.
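The abstract names DPLL as one half of the hybrid solver. A minimal DPLL sketch over DIMACS-style integer clauses is given below; the genetic-algorithm half and the data-mining reduction step are omitted, and the function is a toy illustration rather than the authors' implementation.

# Minimal DPLL over CNF clauses given as lists of signed integers
# (DIMACS-style literals); illustrative only.
def dpll(clauses, assignment=frozenset()):
    """Return a satisfying set of literals, or None if unsatisfiable."""
    clauses = [c for c in clauses if not any(l in assignment for l in c)]
    clauses = [[l for l in c if -l not in assignment] for c in clauses]
    if not clauses:
        return assignment            # all clauses satisfied
    if any(len(c) == 0 for c in clauses):
        return None                  # an empty clause means a conflict
    units = [c[0] for c in clauses if len(c) == 1]
    if units:                        # unit propagation
        return dpll(clauses, assignment | {units[0]})
    lit = clauses[0][0]              # branch on the first unassigned literal
    return dpll(clauses, assignment | {lit}) or dpll(clauses, assignment | {-lit})

print(dpll([[1, 2], [-1, 3], [-3, -2]]))  # a satisfiable toy instance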
ISBN (print): 9783642286575; 9783642286582
Data preprocessing includes data cleaning, data integration, data transformation, and data reduction. Data cleaning aims to remove unrelated or redundant items through two processes. Data integration involves three main problems, each of which can be solved by several kinds of methods. Data transformation includes data generalization, attribute construction, and standardization; three algorithms can be used to normalize the data. The last step, data reduction, compresses the data in order to improve the quality of mining models. All four steps are interrelated and should not be separated; they work together to improve the final result of data mining.
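The abstract does not name the three normalization algorithms; a common textbook trio is min-max scaling, z-score standardization, and decimal scaling, sketched below under that assumption.

# Three common normalization algorithms (assumed, not confirmed by the paper).
import numpy as np

def min_max(x, lo=0.0, hi=1.0):
    """Rescale values linearly into [lo, hi]."""
    return lo + (x - x.min()) * (hi - lo) / (x.max() - x.min())

def z_score(x):
    """Center to zero mean and scale to unit standard deviation."""
    return (x - x.mean()) / x.std()

def decimal_scaling(x):
    """Divide by the smallest power of ten that brings max |x| to at most 1."""
    j = int(np.ceil(np.log10(np.abs(x).max())))
    return x / 10 ** j

x = np.array([120.0, 55.0, 300.0, 78.0])
print(min_max(x), z_score(x), decimal_scaling(x), sep="\n")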
ISBN (print): 9783319959290; 9783319959306
A large-scale agricultural Internet of Things (IoT) generates a large amount of data at every moment; after a certain period of time, the volume can reach hundreds of millions of records. It is very meaningful to analyze and mine such agricultural big data and to replace artificial experience with the analysis results. However, the agricultural production environment is complex, and the raw data collected contain a variety of anomalies, so they cannot be analyzed and mined directly. In this paper, a data preprocessing method based on time-series analysis is proposed, which can quickly and efficiently obtain a prediction model that is used to fill in and replace the abnormal data. On this basis, we add a data preprocessing layer to the traditional three-layer IoT architecture, located between the application layer and the transmission layer, and design a four-layer agricultural IoT system. The system not only realizes the basic functions of data acquisition, transmission, and storage, but also provides better data sources for subsequent analysis.
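As a hedged sketch of the preprocessing layer's role, the snippet below flags sensor readings that deviate strongly from a trailing moving average and replaces them with the predicted value. The moving-average predictor is a simple stand-in for the paper's actual time-series model, and all names and thresholds are illustrative.

# Replace anomalous readings with a time-series prediction (illustrative).
import numpy as np

def clean_series(values, window=5, threshold=3.0):
    """Replace points deviating more than threshold*sigma from the trailing
    moving average with the predicted (moving-average) value."""
    v = np.asarray(values, dtype=float).copy()
    for i in range(window, len(v)):
        hist = v[i - window:i]
        pred, sigma = hist.mean(), hist.std()
        if sigma > 0 and abs(v[i] - pred) > threshold * sigma:
            v[i] = pred  # fill/replace the abnormal reading
    return v

raw = [20.1, 20.3, 20.2, 20.4, 20.3, 95.0, 20.5, 20.4]  # toy sensor data
print(clean_series(raw))  # the 95.0 spike is replaced by the prediction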
ISBN (print): 9781457705847
Since Intrusion Detection Systems (IDSs) operate in real time, they should be lightweight so as to detect intrusions as fast as possible. Distance-Based Outlier Detection (DBOD) is one of the most widely used techniques for detecting outliers due to its simplicity and efficiency. Additionally, DBOD is an unsupervised approach, which overcomes the lack of training datasets with known intrusions. However, since IDSs usually deal with high-dimensional datasets, DBOD becomes subject to the curse of dimensionality. Furthermore, intrusion datasets should be normalized before calculating pairwise distances between observations. The purpose of this research is to conduct a comparative study of different normalization methods in conjunction with a well-known feature extraction technique, Principal Component Analysis (PCA), so that the efficiency of these methods as data preprocessing techniques can be investigated when applying DBOD to detect intrusions. Experiments were performed using two distance metrics: Euclidean distance and Mahalanobis distance. We further examined PCA using 7 threshold values indicating the number of principal components to retain according to their total contribution to the variability of the features. These approaches were evaluated on the KDD Cup 1999 intrusion detection (KDD-Cup) dataset. The main purpose of this study is to find the best attribute normalization method, along with the correct threshold value for PCA, so that a fast unsupervised IDS can discover intrusions effectively. The results recommend using the log normalization method combined with the Euclidean distance when performing PCA.
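A compact sketch of the evaluated pipeline, log normalization followed by PCA at a variance threshold and Euclidean distance-based outlier scoring, is given below. The threshold and neighbor count are illustrative choices, not the paper's recommended values.

# Log normalization -> PCA (variance threshold) -> distance-based outlier
# scoring with Euclidean distance (illustrative parameters).
import numpy as np
from sklearn.decomposition import PCA

def dbod_scores(X, var_threshold=0.95, k=5):
    Xn = np.log1p(np.abs(X))                  # log normalization
    Z = PCA(n_components=var_threshold).fit_transform(Xn)
    # Score each point by its mean Euclidean distance to its k nearest neighbors.
    d = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
    d.sort(axis=1)
    return d[:, 1:k + 1].mean(axis=1)         # column 0 is the self-distance

X = np.random.rand(200, 10)
X[0] += 5                                     # plant one obvious outlier
print(dbod_scores(X).argmax())                # the planted outlier should score highest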
A prerequisite of any data analysis is the data itself, regardless of the analysis focus (visit-rate analysis, portal optimization, portal personalization, etc.), and the results of a selected analysis highly depend on the quality of the analyzed data. In the case of portal usage analysis, these data can be obtained by monitoring the web server log file. From these data we can create data matrices and a web map that serve for discovering the behaviour patterns of users. Data preparation from the log file represents the most time-consuming phase of the whole analysis. We carried out an experiment to find out which criteria make this time-consuming data preparation necessary, aiming to specify the steps that are indispensable for obtaining valid data from the log file. In particular, we focused on the reconstruction of the activities of the web visitor, an advanced and time-consuming data preprocessing technique. In this article we assess the impact of reconstructing the activities of a web visitor on the quantity and quality of the extracted rules that represent the web users' behaviour patterns. (C) 2010 Published by Elsevier Ltd.
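Reconstructing a visitor's activities is commonly done by grouping log records per visitor and splitting them at an inactivity timeout. The sketch below assumes a 30-minute timeout and an (ip, timestamp, url) record layout, both simplifications of what real log preprocessing requires.

# Session reconstruction from web server log records (illustrative layout).
from collections import defaultdict
from datetime import datetime, timedelta

TIMEOUT = timedelta(minutes=30)  # a common inactivity threshold, assumed here

def reconstruct_sessions(records):
    """records: iterable of (ip, timestamp, url); returns (ip, hits) sessions."""
    by_ip = defaultdict(list)
    for ip, ts, url in sorted(records, key=lambda r: r[1]):
        by_ip[ip].append((ts, url))
    sessions = []
    for ip, hits in by_ip.items():
        current = [hits[0]]
        for prev, nxt in zip(hits, hits[1:]):
            if nxt[0] - prev[0] > TIMEOUT:   # gap too long: start a new session
                sessions.append((ip, current))
                current = []
            current.append(nxt)
        sessions.append((ip, current))
    return sessions

log = [("1.2.3.4", datetime(2010, 5, 1, 10, 0), "/index"),
       ("1.2.3.4", datetime(2010, 5, 1, 10, 5), "/news"),
       ("1.2.3.4", datetime(2010, 5, 1, 12, 0), "/index")]
print(len(reconstruct_sessions(log)))  # 2 sessions for this visitor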
ISBN (print): 9789082797039
The evolution of industry towards the 4.0 paradigm has motivated the adoption of Artificial Neural Networks (ANNs) for applications where predictive and maintenance tasks are performed. These tasks become difficult to carry out when rare events are present, because the resulting data imbalance can bias the training of the ANN. Conventional techniques addressing this problem are mainly based on resampling approaches; however, these are not always feasible when dealing with time-series forecasting tasks in industrial scenarios. For that reason, this work proposes the application of data preprocessing techniques specially designed for this scenario, a problem which has not been sufficiently covered in the state of the art. The considered techniques are applied to time-series data coming from Wastewater Treatment Plants (WWTPs). Our proposal significantly outperforms current strategies, showing a 68% improvement in terms of RMSE when rare events are addressed.
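The paper's preprocessing techniques are not detailed in this abstract. As one hedged illustration of handling rare events without resampling (which would break temporal order), rare-event samples can instead be upweighted before training; the quantile and weight below are illustrative assumptions.

# Upweight rare-event samples instead of resampling a time series
# (an assumed stand-in, not the paper's technique).
import numpy as np

def rare_event_weights(y, quantile=0.95, boost=10.0):
    """Give samples whose target exceeds the given quantile extra weight."""
    cutoff = np.quantile(y, quantile)
    return np.where(y > cutoff, boost, 1.0)

y = np.concatenate([np.random.normal(10, 1, 950), np.random.normal(40, 2, 50)])
w = rare_event_weights(y)
print((w > 1).sum())  # weights like these can be passed as sample_weight to many training APIs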
ISBN (print): 9783037856611
Because the Magnetic Suspension Gyro-total-station is vulnerable to outside interference, its measurements contain random drift for which no mathematical model can be established. The Vondrak filter, which does not require such a model, is therefore used to preprocess the measurements of the Magnetic Suspension Gyro-total-station. In this paper a high-precision astronomical baseline is established in Xi'an, and the gyro azimuth is tested on the baseline eight times. For each test, 40,000 north-seeking torque measurements from the first and second positions are processed with the Vondrak filter. The results show that data burrs are reduced after filtering and that the filtered values reflect the trend of gyro north-seeking. Compared with the root mean square (RMS) of the raw measurements, the RMS after Vondrak filtering is smaller and the data are denser. The Vondrak filter can effectively eliminate the random drift contained in the measurements, retain useful information to the maximum extent, and improve the accuracy of the true-north azimuth.
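The Vondrak filter balances fidelity to the data against the smoothness of third differences. For equally spaced data this is closely related to a Whittaker-style smoother, sketched below as an approximation; it is not the authors' exact implementation, and the smoothing parameter lam is illustrative.

# Whittaker-style smoother with third differences, as a rough stand-in
# for the Vondrak filter on equally spaced data (illustrative only).
import numpy as np

def vondrak_like_smooth(y, lam=100.0):
    n = len(y)
    # Third-difference operator D, shape (n - 3, n).
    D = np.diff(np.eye(n), n=3, axis=0)
    # Solve (I + lam * D^T D) x = y for the smoothed series x;
    # larger lam gives a smoother result.
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, y)

t = np.linspace(0, 4 * np.pi, 400)
noisy = np.sin(t) + 0.2 * np.random.randn(t.size)  # toy torque-like signal
smooth = vondrak_like_smooth(noisy)
print(noisy.std(), (noisy - smooth).std())         # the burrs are reduced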
Many classifiers and methods have been proposed to deal with the letter recognition problem. Among them, clustering is a widely used method, but clustering only once is not adequate. Here, we adopt data preprocessing and a re-kernel clustering method to tackle the letter recognition problem. In order to validate the effectiveness and efficiency of the proposed method, we introduce re-kernel clustering into Kernel Nearest Neighbor classification (KNN), Radial Basis Function Neural Networks (RBFNN), and Support Vector Machines (SVM). Furthermore, we compare re-kernel clustering with one-time kernel clustering, denoted as kernel clustering for short. Experimental results validate that re-kernel clustering forms fewer and more feasible kernels and attains higher classification accuracy.
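The abstract does not spell out how re-kernel clustering works. As one hedged reading, a first round of k-means produces candidate kernel centers and a second round clusters those centers to merge near-duplicates, yielding fewer, more feasible kernels for an RBF network; the cluster counts below are illustrative.

# Two-stage ("re-") clustering to consolidate kernel centers (assumed reading).
import numpy as np
from sklearn.cluster import KMeans

def re_kernel_centers(X, first_k=50, final_k=10, seed=0):
    """Cluster the data, then re-cluster the resulting centers."""
    centers = KMeans(n_clusters=first_k, random_state=seed,
                     n_init=10).fit(X).cluster_centers_
    return KMeans(n_clusters=final_k, random_state=seed,
                  n_init=10).fit(centers).cluster_centers_

X = np.random.rand(1000, 16)   # stand-in for letter feature vectors
kernels = re_kernel_centers(X)
print(kernels.shape)           # (10, 16): consolidated kernel centers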
ISBN (print): 9781728151656
Goal-oriented process enhancement and discovery (GoPED) was recently proposed to take advantage of goal-modeling capabilities in process-mining activities. Conventional process mining aims to discover underlying process models from historical, crowdsourced event logs in an activity-oriented fashion. GoPED, however, infers goal-aligned process models from event logs enhanced with goal-related attributes, selecting the historical behaviors that have yielded sufficient levels of satisfaction for the (often conflicting) goals of different stakeholders. Three algorithms are available to select the subset of event logs from three different perspectives. The main input to all three algorithms is a version of the event log (EnhancedLog) that is (1) structured as a table showing each case and its trace in one row and (2) enhanced with satisfaction levels of different goals for each row. Typical event logs are therefore not ready to be fed as-is to the GoPED algorithms. This paper proposes a scheme for manipulating original event logs and turning them into an EnhancedLog. Two tools were also developed and tested for this scheme: TraceMaker, which structures the log as explained above, and EnhancedLogMaker, which computes the satisfaction levels of goals for all cases in the structured log.
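A hedged sketch of the EnhancedLog shape described above, one row per case with its ordered trace and per-goal satisfaction levels, is given below. The goal-satisfaction function is a placeholder; GoPED's actual computation, and the TraceMaker/EnhancedLogMaker tools, differ from this toy version.

# Build an EnhancedLog-like table: one row per case, trace plus goal
# satisfaction levels (placeholder satisfaction functions).
from collections import defaultdict

def make_enhanced_log(events, goal_fns):
    """events: (case_id, timestamp, activity) tuples;
    goal_fns: {goal_name: trace -> satisfaction level in [0, 1]}."""
    traces = defaultdict(list)
    for case_id, ts, activity in sorted(events, key=lambda e: e[1]):
        traces[case_id].append(activity)
    return [{"case": c, "trace": t,
             **{g: fn(t) for g, fn in goal_fns.items()}}
            for c, t in traces.items()]

events = [("c1", 1, "register"), ("c1", 2, "pay"), ("c2", 1, "register")]
goals = {"completed": lambda t: 1.0 if "pay" in t else 0.0}
for row in make_enhanced_log(events, goals):
    print(row)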