With ongoing technical advances, the volume of big data grows day by day, to the point that traditional software tools struggle to handle it. In addition, the presence of imbalanced data within big data is a major concern for the research community. To ensure effective management of big data and to deal with imbalanced data, this paper proposes a new optimization algorithm. Big data classification is performed using the MapReduce framework, in which the map and reduce functions are based on the proposed optimization algorithm. The algorithm, named the Exponential Bat algorithm (E-Bat), integrates the Exponential Weighted Moving Average (EWMA) with the Bat Algorithm (BA). The map function selects the features, which are then presented for classification in the reducer module using a Neural Network (NN). Thus, big data classification is carried out with the proposed E-Bat-based MapReduce framework, and experiments are conducted on four standard databases: Breast Cancer, Hepatitis, Pima Indian Diabetes, and Heart Disease. The experimental results show that the proposed method achieved a maximal accuracy of 0.8829 and a True Positive Rate (TPR) of 0.9090.
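As a rough illustration only (the abstract does not specify how EWMA enters the bat update), the Python sketch below blends each candidate bat move with its previous position through an exponentially weighted moving average; the smoothing factor alpha, the greedy acceptance rule, and the toy sphere fitness are assumptions for illustration, not the paper's method.

```python
import numpy as np

def e_bat_step(pos, vel, best, fitness, f_min=0.0, f_max=2.0,
               alpha=0.3, rng=None):
    """One illustrative E-Bat iteration: the usual bat-algorithm move followed
    by an EWMA blend of new and old positions (alpha is a hypothetical factor)."""
    rng = rng or np.random.default_rng()
    freq = f_min + (f_max - f_min) * rng.random(len(pos))   # pulse frequencies
    vel = vel + (pos - best) * freq[:, None]
    moved = pos + vel                                        # plain BA update
    blended = alpha * moved + (1.0 - alpha) * pos            # EWMA smoothing
    better = fitness(blended) < fitness(pos)                 # greedy acceptance
    return np.where(better[:, None], blended, pos), vel

sphere = lambda P: np.sum(P ** 2, axis=1)       # toy fitness: minimise ||x||^2
rng = np.random.default_rng(6)
pos, vel = rng.normal(size=(20, 5)), np.zeros((20, 5))
for _ in range(50):
    best = pos[np.argmin(sphere(pos))]
    pos, vel = e_bat_step(pos, vel, best, sphere, rng=rng)
```

In a wrapper feature-selection setting, the continuous positions would be thresholded to binary feature masks and the fitness replaced by the NN's validation error.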
To cope with the rapid evolution of attacks and the growth of computer networks, intelligent intrusion detection systems are regarded as a promising emerging technique for securing computer networks. Individual classification approaches have not provided complete protection; indeed, none of them has proved efficient enough to deliver good detection rates while reducing false alarm rates. In previous work, a comparative study was conducted between neuro-fuzzy and genetic-fuzzy approaches. In this study, a hybrid approach based on the stacking scheme is proposed. It combines the two base classifiers so as to take advantage of each of them. The experimental results show the effectiveness of the proposed approach in maximizing detection rates and reducing false alarm rates.
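A minimal scikit-learn sketch of the stacking scheme, using an MLP and a decision tree as stand-ins for the neuro-fuzzy and genetic-fuzzy base classifiers (the paper's actual base learners are fuzzy systems, and the toy data here is an assumption):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Stand-ins for the two base classifiers: an MLP for the neuro-fuzzy learner
# and a decision tree for the genetic-fuzzy (rule-based) learner.
base_learners = [
    ("neuro_like", MLPClassifier(hidden_layer_sizes=(32,), max_iter=500)),
    ("rule_like", DecisionTreeClassifier(max_depth=6)),
]
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),  # meta-level combiner
    cv=5,  # out-of-fold base-learner predictions feed the meta-learner
)

# Toy, imbalanced binary problem standing in for intrusion vs. normal traffic.
X, y = make_classification(n_samples=600, n_features=20, weights=[0.9, 0.1],
                           random_state=0)
stack.fit(X, y)
print(stack.score(X, y))
```

The cv argument is what makes this a stacking scheme rather than simple averaging: the meta-learner is trained on predictions the base learners did not see during their own training folds.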
This paper proposes an effective classification method named the Rider Chicken Optimization Algorithm-based Recurrent Neural Network (RCOA-based RNN) for big data classification in the Spark architecture. Initially, the input data are collected from the network by the master node and forwarded to the slave nodes, which store the data and perform computations. Features are selected in the slave nodes using the proposed RCOA and forwarded to the master node. Big data classification is then carried out in the master node by an RNN classifier whose training is performed with the proposed RCOA algorithm, an integration of the Rider Optimization Algorithm (ROA) and standard Chicken Swarm Optimization (CSO). Experiments on the Switzerland, Cleveland, Hungarian, and Skin disease datasets show that the proposed RCOA-based RNN attains better performance on quantitative measures, with sensitivity, accuracy, and specificity of 93%, 94%, and 93% on the Hungarian dataset. Existing learning methods fail to address complex classification problems in a reasonable time, a shortcoming the proposed method overcomes.
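A loose PySpark sketch of the slave-side step only: each partition scores its features locally and the master aggregates the scores. Plain variance is used as a stand-in scoring rule (the paper's RCOA-driven selection is not reproduced), and the dataset, partition count, and top-k rule are illustrative assumptions.

```python
import numpy as np
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("slave-side-feature-scoring").getOrCreate()
sc = spark.sparkContext

def score_partition(rows):
    """Score the features seen by one partition (one 'slave-side' block).
    Variance is a stand-in criterion, not the paper's RCOA selection."""
    block = np.array(list(rows), dtype=float)
    if block.size:
        yield block.var(axis=0)          # one score vector per partition

# Hypothetical numeric dataset: rows of three features, split into 4 partitions.
rdd = sc.parallelize([[float(i), float(i % 7), float(i % 3)]
                      for i in range(1000)], 4)
scores = rdd.mapPartitions(score_partition).reduce(lambda a, b: a + b)
top_k = np.argsort(scores)[::-1][:2]     # feature indices the master would keep
print(top_k)
```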
Image processing is currently developing as a unique and inventive field in computer research and applications. Most image processing algorithms produce a large quantity of data as output, termed big data, and process and store this bulky information as either structured or unstructured data. Using big data analytics to mine the data produced by image processing has huge potential in areas such as education, government, medical establishments, production units, finance and banking, and retail. This paper surveys the innovations made in big data analytics and image processing and proposes a novel data classification approach tailored to image analytics. To improve image quality, pre-processing is applied to the gathered data. Then, the most relevant features, such as spatial information, GLCM texture, and color and shape features, are extracted from the pre-processed images. Since the feature dimensionality is large, an adaptive map-reduce framework with Improved Shannon Entropy is introduced to reduce the dimensionality of the extracted features. In the big data classification phase, an optimized deep learning classifier, a deep convolutional neural network (DCNN), is introduced to classify the images accurately. The weights of the DCNN are fine-tuned using the newly proposed Dragonfly Updated Moth Search (DAUMS) algorithm to enhance classification accuracy and solve the optimization problems of this work; DAUMS is a hybrid that combines concepts from the moth search algorithm and the dragonfly algorithm.
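As one concrete example of the texture part of the feature-extraction stage, the sketch below computes GLCM descriptors with scikit-image (graycomatrix/graycoprops, the scikit-image 0.19+ spelling); the chosen distances, angles, and properties are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(gray_image, levels=256):
    """Texture descriptors from a grey-level co-occurrence matrix, one of the
    feature families extracted before the map-reduce dimensionality reduction."""
    glcm = graycomatrix(gray_image, distances=[1], angles=[0, np.pi / 2],
                        levels=levels, symmetric=True, normed=True)
    props = ("contrast", "homogeneity", "energy", "correlation")
    return np.hstack([graycoprops(glcm, p).ravel() for p in props])

# Toy 8-bit image standing in for one pre-processed input image.
demo = (np.random.default_rng(0).random((64, 64)) * 255).astype(np.uint8)
features = glcm_features(demo)   # 8 values: 4 properties x 2 angles
print(features)
```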
The amount of data generated is increasing day by day owing to developments in remote sensors, so improving the accuracy of big data classification deserves attention. Many classification methods are in practice; however, they are limited for reasons such as data loss, time complexity, efficiency, and accuracy. This paper proposes an effective and optimal data classification approach using the proposed Ant Cat Swarm Optimization-enabled Deep Recurrent Neural Network (ACSO-enabled Deep RNN) within a MapReduce framework, where ACSO is the incorporation of the Ant Lion Optimization approach and the Cat Swarm Optimization technique. The MapReduce framework is used to perform feature selection and big data classification. Feature selection is carried out using Pearson correlation-based black hole entropy fuzzy clustering. Classification in the reducer is performed by a Deep RNN trained with the developed ACSO scheme; it classifies the big data from the reduced-dimension features to produce satisfactory results. The proposed ACSO-based Deep RNN showed improved results, with a maximal specificity of 0.884, highest accuracy of 0.893, maximal sensitivity of 0.900, and maximum threat score of 0.827 on the Cleveland dataset.
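A minimal numpy sketch of the Pearson-correlation ingredient of the feature-selection step, scoring each feature by its absolute correlation with the label; the black hole entropy fuzzy clustering and the ACSO training are not reproduced here, and the toy data and top-k cut-off are assumptions.

```python
import numpy as np

def pearson_relevance(X, y):
    """Absolute Pearson correlation between every feature column and the label;
    a simplified stand-in for the Pearson-correlation-based clustering step."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    cov = Xc.T @ yc / len(y)
    denom = X.std(axis=0) * y.std()
    return np.abs(cov / np.where(denom == 0, 1.0, denom))

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 10))
y = (X[:, 3] + 0.1 * rng.normal(size=500) > 0).astype(float)  # label tied to feature 3
keep = np.argsort(pearson_relevance(X, y))[::-1][:4]          # top-4 feature indices
print(keep)
```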
Dimension reduction is a preprocessing step in machine learning for eliminating undesirable features and increasing learning accuracy. To reduce redundant features, various data representation methods exist, each with its own advantages. On the other hand, big data with imbalanced classes is one of the most important issues in pattern recognition and machine learning. In this paper, a method is proposed in the form of a cost-sensitive optimization problem that performs feature selection and feature extraction simultaneously. The feature extraction phase is based on reducing error and maintaining geometric relationships between data points by solving a manifold learning optimization problem. In the feature selection phase, the cost-sensitive optimization problem is formulated to minimize the upper limit of the generalization error. Finally, the optimization problem constituted from the two problems above is solved by adding a cost-sensitive term that balances the classes without manipulating the data. To evaluate the feature reduction results, a multi-class linear SVM classifier is applied to the reduced data. The proposed method is compared with other approaches on 21 datasets from the UCI learning repository, microarray and high-dimensional datasets, and imbalanced datasets from the KEEL repository. The results indicate that the proposed method is significantly more efficient than similar approaches.
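A small scikit-learn sketch of the evaluation idea: a cost-sensitive multi-class linear SVM on an imbalanced toy problem, where class_weight="balanced" plays the role of a cost-sensitive term that re-weights errors instead of resampling. The toy data and hyperparameters are assumptions, and the paper's manifold-learning feature reduction itself is not reproduced.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Imbalanced three-class toy problem standing in for the reduced-dimension data.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=8,
                           n_classes=3, weights=[0.8, 0.15, 0.05],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# class_weight="balanced" acts as the cost-sensitive term: mistakes on minority
# classes are penalised more heavily, without resampling or manipulating the data.
svm = LinearSVC(class_weight="balanced", C=1.0, max_iter=5000)
svm.fit(X_tr, y_tr)
print(balanced_accuracy_score(y_te, svm.predict(X_te)))
```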
The innovation of big data has an intense impact on data context analytics. Big data processing platforms have gained immense popularity for evaluating big data, as they offer the low latency that big data applications need. This paper introduces a novel method for big data classification using a Spark architecture with master and slave nodes. The input data are first partitioned by data gridding in the master node using Black Hole Entropy Fuzzy Clustering (BHEFC). Then, feature selection is executed on each slave node based on Rényi entropy to choose better features for further processing. The features selected by each slave node are concatenated to form the feature vector, which is passed to the classification module in the master node for big data classification. Classification is carried out using a Deep Recurrent Neural Network (DeepRNN) tuned by the novel Water Atom Search Optimization (WASO), which is newly designed by integrating the characteristics of the Water Wave Optimization (WWO) algorithm and Atom Search Optimization (ASO). The proposed WASO-based DeepRNN thus classifies big data effectively from the reduced-dimension features. The proposed method is compared with a Neural Network (NN), Support Vector Machine (SVM), Edited Nearest Neighbor for big data (ENN-BD), Speed-up Dendritic Cell Algorithm (Sp-DCA), Compact Fuzzy Models in big data (CFM-BD), and DeepRNN, and obtained improved results with a specificity of 0.979, accuracy of 0.951, sensitivity of 0.963, and precision of 0.988 on the skin disease dataset.
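An illustrative numpy sketch of the slave-side feature scoring by Rényi entropy, H_a = log(sum_i p_i^a) / (1 - a) over a feature's histogram; the entropy order, bin count, toy data, and keep-top-3 rule are assumptions for illustration.

```python
import numpy as np

def renyi_entropy(feature, order=2.0, bins=32):
    """Rényi entropy H_a = log(sum_i p_i**a) / (1 - a) of a feature's histogram
    (Shannon entropy in the limit a -> 1)."""
    hist, _ = np.histogram(feature, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    if order == 1.0:
        return float(-np.sum(p * np.log(p)))
    return float(np.log(np.sum(p ** order)) / (1.0 - order))

rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 6))                      # toy slave-node data block
scores = np.array([renyi_entropy(X[:, j]) for j in range(X.shape[1])])
selected = np.argsort(scores)[::-1][:3]             # keep the 3 highest-entropy features
print(selected)
```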
Extreme Learning Machine (ELM) and its variants have been widely used in many applications owing to their fast convergence and good generalization performance. Although the distributed ELM* based on the MapReduce framework can handle very large training datasets in big data applications, coping with rapid updates of those datasets remains challenging. Therefore, this paper proposes a novel Elastic Extreme Learning Machine based on the MapReduce framework, named Elastic ELM (E²LM), to cover the shortcoming of ELM*, whose learning ability on updated large-scale training datasets is weak. First, an analysis of ELM* shows that its most computation-expensive part, the matrix multiplications, can be calculated incrementally, decrementally, and correctionally. Next, the Elastic ELM based on MapReduce is developed: it first calculates the intermediate matrix products of the updated training data subset and then updates the accumulated matrix products by modifying the old ones with the intermediate results. The corresponding new output weight vector can then be obtained by centralized computing from the updated matrix products, so massive, rapidly updated training datasets can be learned efficiently. Finally, extensive experiments on synthetic data verify the effectiveness and efficiency of the proposed E²LM in learning massive, rapidly updated training datasets under various experimental settings.
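The additive-update idea can be written down compactly: with hidden-layer output H and targets T, the products U = HᵀH and V = HᵀT are sums over data chunks, so an arriving chunk contributes only H_newᵀH_new and H_newᵀT_new, after which the output weights are re-solved centrally. The numpy sketch below shows that idea on random data; the sigmoid hidden layer, ridge term, and matrix sizes are generic ELM choices, not the paper's exact distributed MapReduce implementation.

```python
import numpy as np

def elm_hidden(X, W, b):
    """Random hidden-layer output H = sigmoid(X W + b) of a basic ELM."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def incremental_update(U, V, H_new, T_new):
    """U = H^T H and V = H^T T are additive over data chunks, so an arriving
    chunk only contributes H_new^T H_new and H_new^T T_new."""
    return U + H_new.T @ H_new, V + H_new.T @ T_new

def output_weights(U, V, reg=1e-3):
    """Centralised solve for beta = (U + reg * I)^-1 V (ridge-regularised)."""
    return np.linalg.solve(U + reg * np.eye(U.shape[0]), V)

rng = np.random.default_rng(3)
d, L, k = 8, 50, 3                                   # inputs, hidden nodes, classes
W, b = rng.normal(size=(d, L)), rng.normal(size=L)   # fixed random hidden layer

X0, T0 = rng.normal(size=(1000, d)), rng.normal(size=(1000, k))
H0 = elm_hidden(X0, W, b)
U, V = H0.T @ H0, H0.T @ T0                          # initial accumulated products

X1, T1 = rng.normal(size=(200, d)), rng.normal(size=(200, k))  # newly arrived chunk
U, V = incremental_update(U, V, elm_hidden(X1, W, b), T1)
beta = output_weights(U, V)                          # refreshed output weights
```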
In this article, a big data classification model is developed with the aid of intelligent techniques. The Parallel Pool MapReduce framework is used to handle big data. The model involves three main phases: (1) feature extraction, (2) optimal feature selection, and (3) classification. For feature extraction, well-known techniques such as principal component analysis, linear discriminant analysis, and linear square regression are used. Since the resulting feature vector tends to be long, choosing the optimal features is a complex task; hence, the proposed model utilizes an optimal feature selection technique, the Lion-based Firefly (L-FF) algorithm. The main objective of this article is to minimize the correlation between the selected features, which provides diverse information about the different classes of data. Once the optimal features are selected, a neural network (NN) classifier is adopted, which classifies the data effectively using the selected features. Furthermore, the proposed L-FF+NN model is compared with traditional methods and proves effective over them. Experimental analysis shows that the proposed L-FF+NN model is 92%, 28%, 87%, 82%, and 78% superior to state-of-the-art models such as GA+NN, FF+NN, PSO+NN, ABC+NN, and LA+NN, respectively.
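A small numpy sketch of the stated objective, a fitness that a swarm search such as L-FF could minimize: the mean absolute pairwise correlation among the currently selected features. The binary mask encoding and the penalty for near-empty subsets are illustrative assumptions, not details from the paper.

```python
import numpy as np

def correlation_fitness(X, mask):
    """Fitness for the swarm search: mean absolute pairwise correlation among
    the selected columns (lower means more diverse, less redundant features)."""
    idx = np.flatnonzero(mask)
    if idx.size < 2:
        return 1.0                      # discourage near-empty subsets
    C = np.corrcoef(X[:, idx], rowvar=False)
    off_diag = C[~np.eye(idx.size, dtype=bool)]
    return float(np.mean(np.abs(off_diag)))

rng = np.random.default_rng(4)
X = rng.normal(size=(300, 12))
X[:, 5] = X[:, 0] + 0.01 * rng.normal(size=300)      # feature 5 duplicates feature 0
print(correlation_fitness(X, np.array([1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0])))  # high
print(correlation_fitness(X, np.array([1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0])))  # low
```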
Big data plays a vital role in information analysis and in the systematic extraction of information from complex or huge datasets. The massive growth of large-scale data causes major issues in big data, so the data must be classified to address data imbalance. Huge data can be explored efficiently by converting it into valuable knowledge, and it can be processed in a distributed environment with different application frameworks. In recent decades, the Spark framework has gained significance in the big data domain owing to its success in incremental and iterative approaches. Because of imbalanced data distributions, big data classification over large datasets is a challenging task for conventional methods, as the imbalance leads to wrong decisions when generating classification results. In this paper, an efficient Shuffled Student Psychology Optimization-based Deep Q Network is proposed for big data classification with the Spark framework to overcome the issues faced by traditional methods. Master and slave nodes perform distinct operations, such as data partitioning, feature fusion, and data augmentation, to accomplish the data classification task of the proposed approach. The developed technique attained a maximum TPR of 0.960, accuracy of 0.942, and TNR of 0.929.
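As a loose illustration of the data augmentation step against class imbalance, the numpy sketch below oversamples the minority class with small Gaussian jitter; the jitter scale, the oversampling factor, and jitter-based augmentation itself are assumptions, since the abstract does not describe the exact augmentation used.

```python
import numpy as np

def augment_minority(X, y, minority_label, factor=2, noise=0.05, rng=None):
    """Oversample the minority class by adding small Gaussian jitter to its
    samples; a simple stand-in for the augmentation step that counters imbalance."""
    rng = rng or np.random.default_rng()
    Xm = X[y == minority_label]
    copies = [Xm + noise * rng.normal(size=Xm.shape) for _ in range(factor)]
    X_aug = np.vstack([X] + copies)
    y_aug = np.concatenate([y, np.full(len(Xm) * factor, minority_label)])
    return X_aug, y_aug

rng = np.random.default_rng(5)
X = rng.normal(size=(1000, 8))
y = (rng.random(1000) < 0.1).astype(int)           # roughly 10% positive samples
X_aug, y_aug = augment_minority(X, y, minority_label=1, rng=rng)
print(X_aug.shape, y_aug.mean())                   # larger set, less skewed labels
```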