ISBN (print): 9781728140421
Searching for frequent itemsets in large, diverse databases is one of the most important data mining problems, and existing algorithms lack mechanisms that enable automatic parallelization, fault tolerance, and data distribution. To address this issue, we design an algorithm using the MapReduce programming model. The overarching aim is to enhance the performance of parallel frequent itemset mining on Hadoop. We incorporate ultra-metric trees to improve the efficiency of mining frequent itemsets, and we compare the Apriori and FP-Growth algorithms on several parameters. We implement the algorithm on a Market Basket Analytics dataset.
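As a rough illustration of the map/reduce split behind such parallel frequent itemset mining, the sketch below simulates one Apriori-style candidate counting pass in plain Python; the transaction data, function names, and single-pass structure are illustrative assumptions, not the paper's implementation.

# Minimal sketch: one MapReduce pass of Apriori-style candidate counting
# over market-basket transactions. Simulated in-process; on Hadoop the
# map/reduce functions would run in parallel over HDFS splits.
from itertools import combinations
from collections import defaultdict

def map_phase(transaction, k):
    """Emit (candidate_itemset, 1) for every k-item subset of a transaction."""
    for itemset in combinations(sorted(transaction), k):
        yield itemset, 1

def reduce_phase(pairs, min_support):
    """Sum counts per itemset and keep those meeting the support threshold."""
    counts = defaultdict(int)
    for itemset, one in pairs:
        counts[itemset] += one
    return {s: c for s, c in counts.items() if c >= min_support}

transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"beer", "bread"},
    {"bread", "milk"},
]
pairs = (p for t in transactions for p in map_phase(t, k=2))
print(reduce_phase(pairs, min_support=2))   # {('bread', 'milk'): 3}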
Big data refers to a collection of massive volumes of data that cannot be processed by conventional data processing tools and technologies. In recent years, data production sources have expanded noticeably, including high-end streaming devices, wireless sensor networks, satellites, and wearable Internet of Things (IoT) devices. These sources generate massive volumes of data in a continuous manner. A large volume of climate data is collected from IoT weather sensor devices and NCEP. In this paper, a big data processing framework is proposed to integrate climate and health data and to find the correlation between climate parameters and the incidence of dengue. The framework is demonstrated with the MapReduce programming model, Hive, HBase, and ArcGIS in a Hadoop Distributed File System (HDFS) environment. The weather parameters collected for the study area of Tamil Nadu, with the help of IoT weather sensor devices and NCEP, are minimum temperature, maximum temperature, wind, precipitation, solar, and relative humidity. The proposed framework focuses only on climate data for 32 districts of Tamil Nadu, where each district contains 157,680 rows, giving 5,045,760 rows in total. Batch-view precomputation of the monthly mean of each climate parameter would require all 5,045,760 rows and would therefore create high latency in query processing. To overcome this issue, batch views can be precomputed over a smaller number of records, with more computation done at query time. An In-Mapper-based MapReduce framework is used to compute the monthly mean of each climate parameter for every latitude and longitude. The experimental results show that the response time of the In-Mapper-based combiner algorithm is lower than that of the existing MapReduce algorithm. (C) 2018 Elsevier B.V. All rights reserved.
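The In-Mapper combining idea described above can be sketched as follows: the mapper accumulates partial (sum, count) aggregates per (latitude, longitude, month) key locally and emits them once per input split, which is what cuts intermediate traffic relative to a plain MapReduce job. The field layout and key choice are assumptions for illustration, and the run is simulated in-process rather than on Hadoop.

# Sketch of an In-Mapper combiner for monthly means of a climate parameter.
# The mapper keeps a local (sum, count) per (lat, lon, month) key and emits
# partial aggregates once per split, instead of one record per input row.
from collections import defaultdict

def in_mapper_combine(rows):
    """rows: iterable of (lat, lon, month, value). Emits partial aggregates."""
    partial = defaultdict(lambda: [0.0, 0])      # key -> [sum, count]
    for lat, lon, month, value in rows:
        acc = partial[(lat, lon, month)]
        acc[0] += value
        acc[1] += 1
    for key, (total, count) in partial.items():
        yield key, (total, count)

def reduce_mean(pairs):
    """Merge partial (sum, count) pairs and compute the final mean per key."""
    merged = defaultdict(lambda: [0.0, 0])
    for key, (total, count) in pairs:
        merged[key][0] += total
        merged[key][1] += count
    return {k: s / c for k, (s, c) in merged.items()}

split = [(11.0, 78.0, 1, 24.5), (11.0, 78.0, 1, 26.1), (13.1, 80.2, 1, 29.0)]
print(reduce_mean(in_mapper_combine(split)))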
The quantification of mineral resources refers to the fractional contribution of endmembers at the pixel level, namely, fractional cover mapping of mineralogy. Over a large area, mineral deposits generally occur in limited quantity, either on a host rock or on a geologic structure. In remote sensing, the purity of a mineral's spectrum is usually perturbed by weathering effects or compositional susceptibility, which may lead to an incorrect fractional map of mineral endmembers. Given these physical complications, the present paper establishes a fractional cover mapping model that incorporates the characterization of endmember variability, an optimization model for endmember extraction (EE), and an inverse model for abundance estimation. In this regard, an EE method was deployed that comprises subproblems on the minimization of endmember variability solved by the alternating direction method. Next, the extracted endmembers were used to estimate abundances with the Hapke model by applying the fully constrained least-squares method. Experimenting on a synthetic image, both a qualitative analysis by correlation measure and a quantitative analysis by statistical error measure were carried out for the proposed fractional cover mapping model. Using airborne visible/infrared imaging spectrometer-next generation hyperspectral imagery, the fractional cover map of a validation area was verified first; then a distributed mapping of the Jahazpur mineralized belt was achieved by MapReduce programming of the proposed model in the Hadoop architecture. (C) 2020 Society of Photo-Optical Instrumentation Engineers (SPIE)
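As a hedged illustration of the abundance estimation step, the sketch below solves a fully constrained least-squares (FCLS) problem for one pixel using the common sum-to-one augmentation trick combined with non-negative least squares; the synthetic endmember matrix and the omission of the Hapke nonlinearity are simplifying assumptions, not the paper's full pipeline.

# Sketch of fully constrained least-squares (FCLS) abundance estimation for
# one pixel: solve min ||y - E a||^2 subject to a >= 0 and sum(a) = 1.
# A common approach appends a heavily weighted sum-to-one row and solves a
# non-negative least-squares problem. Endmember spectra here are synthetic.
import numpy as np
from scipy.optimize import nnls

def fcls_abundances(E, y, delta=1e3):
    """E: (bands x endmembers) spectra, y: (bands,) pixel. Returns abundances."""
    bands, p = E.shape
    E_aug = np.vstack([E, delta * np.ones((1, p))])   # enforces sum-to-one
    y_aug = np.concatenate([y, [delta]])
    a, _ = nnls(E_aug, y_aug)                         # enforces a >= 0
    return a

rng = np.random.default_rng(0)
E = rng.uniform(0.1, 0.9, size=(50, 3))               # 3 synthetic mineral endmembers
true_a = np.array([0.6, 0.3, 0.1])
y = E @ true_a + rng.normal(0, 0.001, size=50)
print(fcls_abundances(E, y))                           # close to [0.6, 0.3, 0.1]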
ISBN (print): 9789896740092
We discuss the features of functional programming related to formal methods and an emerging paradigm, Cloud Computing. Formal methods are useful in developing highly reliable mission-critical software. In lightweight formal methods, however, we do not rely on very rigorous means such as theorem proving. Instead, we use adequately less rigorous means, such as evaluating pre/post conditions and testing specifications, to increase confidence in our specifications. Millions of tests may be conducted when developing highly reliable mission-critical software with a lightweight formal approach. We consider an approach to leveraging lightweight formal methods by using the Cloud. Given a formal specification language with the features of functional programming, such as referential transparency, we can expect the advantages of parallel processing. One of the basic foundations of the VDM specification languages is set theory, and the pre/post conditions and proof obligations may be expressed in terms of set expressions. We can evaluate this kind of expression in a data-parallel style using the MapReduce framework for a huge set of test cases over cloud computing environments. Thus, we expect to greatly reduce the cost of testing specifications in lightweight formal methods.
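A minimal sketch of the data-parallel evaluation of pre/post conditions over a large batch of test cases is given below; the integer square root specification, the pre/post predicates, and the in-process map/reduce are illustrative stand-ins for VDM specifications evaluated on a cluster.

# Minimal sketch: evaluating a pre/post-condition pair over many test cases
# in a map/reduce style. The "specification" here is an illustrative integer
# square root; on a cluster each chunk of cases would be mapped in parallel.
import math

def pre(x):                 # pre-condition of the operation
    return x >= 0

def post(x, result):        # post-condition relating input and result
    return result * result <= x < (result + 1) * (result + 1)

def isqrt_impl(x):          # implementation under test
    return math.isqrt(x)

def map_check(case):
    """Map one test case to a (case, passed) pair."""
    if not pre(case):
        return case, True                 # vacuously satisfied
    return case, post(case, isqrt_impl(case))

def reduce_all(results):
    """Reduce: the specification holds on this batch iff every case passed."""
    return all(passed for _, passed in results)

cases = range(100_000)      # a batch of test cases
print(reduce_all(map(map_check, cases)))   # True if all cases satisfy the spec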
ISBN (print): 9781509036776
Effective indexing schemes are crucial for supporting efficient queries on large datasets from multidimensional Non-ordered Discrete Data Spaces (NDDS) in many applications, such as genome sequence analysis in bioinformatics. Although constructing an index structure for a large dataset in an NDDS via a bulk loading technique is quite efficient (compared to a conventional tuple-loading technique), existing bulk loading techniques cannot meet the scalability requirement posed by the fast-growing sizes of datasets in contemporary NDDS applications. To tackle this challenge, we propose a new bulk loading method for fast construction of an index structure, called the PND-tree, for large datasets in NDDSs. Specifically, utilizing the characteristics of an NDDS and a priori knowledge of the given dataset, we propose an effective multi-way top-down dataset split strategy with a MapReduce implementation for our bulk loading procedure. Experiments demonstrate that the proposed bulk loading method is quite promising in terms of index construction efficiency and resulting index quality, compared to the conventional tuple-loading method and a popular serial bulk loading method for a state-of-the-art index tree in NDDSs.
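The map-side multi-way split can be pictured roughly as below: each record from a non-ordered discrete space is routed to a partition by the symbol in one discriminating dimension, and each reducer would then bulk-load its partition into a subtree. The dimension choice, alphabet, and single-level split are illustrative assumptions; the actual PND-tree split strategy is more elaborate.

# Sketch of a map-side multi-way split for bulk loading: each record from a
# non-ordered discrete space (e.g. genome k-mers) is routed to a partition by
# the symbol in a chosen discriminating dimension; each reducer would then
# bulk-load its partition into a subtree.
from collections import defaultdict

ALPHABET = "ACGT"

def map_split(record, dim):
    """Emit (partition_key, record); the key is the symbol at dimension `dim`."""
    return record[dim], record

def shuffle(pairs):
    """Group records by partition key, as the MapReduce shuffle would."""
    buckets = defaultdict(list)
    for key, rec in pairs:
        buckets[key].append(rec)
    return buckets

records = ["ACGT", "AGGT", "CCGA", "GTCA", "TTAC"]
partitions = shuffle(map_split(r, dim=0) for r in records)
for key in ALPHABET:
    print(key, partitions.get(key, []))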
ISBN (print): 9781509038374
Retrieving documents in response to a user's query is the most common text retrieval task. In our work, we focus mainly on detecting the semantic similarity between queries and documents in large document collections. In this paper, we investigate MapReduce as a framework for managing the distributed processing of dataset patterns and document semantic similarity measures. We then review the state of the art of different approaches for computing the semantic similarity of documents. We propose an approach based on a parallel algorithm for semantic similarity measures using MapReduce and WordNet to detect the documents relevant to a query. Finally, we conduct basic experiments to assess the performance of the proposed approach and note the leverage that Hadoop and MapReduce bring to computing semantic similarity measures between documents.
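As a rough sketch of the map step, each document can be scored against the query with a WordNet-based measure; the snippet below uses NLTK's path similarity averaged over term pairs, which is an assumed stand-in for the paper's exact measure, and requires the WordNet corpus to be downloaded first (nltk.download("wordnet")).

# Sketch: a map step that scores each document against a query with a simple
# WordNet path-similarity measure (best synset similarity per term pair,
# averaged). Illustrative measure only, not the exact one used in the paper.
from nltk.corpus import wordnet as wn

def term_similarity(t1, t2):
    """Best path similarity over all synset pairs of two terms (0 if none)."""
    scores = [s1.path_similarity(s2) or 0.0
              for s1 in wn.synsets(t1) for s2 in wn.synsets(t2)]
    return max(scores, default=0.0)

def map_score(doc_id, doc_terms, query_terms):
    """Map one document to (doc_id, semantic similarity against the query)."""
    pair_scores = [term_similarity(d, q) for d in doc_terms for q in query_terms]
    score = sum(pair_scores) / len(pair_scores) if pair_scores else 0.0
    return doc_id, score

docs = {"d1": ["car", "engine"], "d2": ["banana", "fruit"]}
query = ["automobile", "motor"]
ranked = sorted((map_score(i, t, query) for i, t in docs.items()),
                key=lambda kv: kv[1], reverse=True)
print(ranked)   # "d1" should rank above "d2" for this query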
ISBN (print): 9781538655535
Edge computing has been proposed to remedy the Cloud-only processing architecture for the Internet of Things (IoT), given the massive amounts of IoT data. The challenge is how to deploy and execute data processing tasks on heterogeneous IoT edge networks. As MapReduce is a well-known model for the distributed processing of big data in Cloud computing, this paper devises a MapReduce-based protocol to achieve IoT edge computing. Our design is built upon the novel Information Centric Networking (ICN), which supports function naming and forwarding so as to facilitate task distribution among edge devices. To guarantee the correctness of task execution, a tree topology is formed in our approach to establish the logical connection between different types of edge devices, namely processing-capable nodes and forward-only ones. Moreover, the proposed protocol includes a task maintenance scheme that enables the coexistence of multiple IoT computation jobs. A testbed is implemented on ndnSIM to verify the feasibility of our design. The results show that our approach can significantly decrease network traffic compared with centralized data processing.
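The logical tree between processing-capable and forward-only nodes can be sketched as below, where forward-only devices relay raw readings unchanged and processing-capable devices reduce their children's data before forwarding; ICN naming, forwarding, and the task maintenance scheme are not modeled, and the node classes and the aggregation (a simple mean) are illustrative assumptions.

# Sketch of the logical tree aggregation: forward-only nodes pass readings up
# unchanged, while processing-capable nodes reduce what they receive into a
# compact (sum, count) partial before it travels further toward the cloud.
class Node:
    def __init__(self, can_process, children=(), readings=()):
        self.can_process = can_process
        self.children = list(children)
        self.readings = list(readings)   # raw IoT samples held at this device

    def collect(self):
        """Return raw readings (forward-only) or a reduced (sum, count) partial."""
        raw, partials = list(self.readings), []
        for child in self.children:
            result = child.collect()
            if isinstance(result, tuple):
                partials.append(result)  # already reduced by a capable child
            else:
                raw.extend(result)       # forwarded unchanged
        if not self.can_process:
            # assumption: forward-only nodes only relay raw readings upward
            return raw
        total = sum(raw) + sum(s for s, _ in partials)
        count = len(raw) + sum(c for _, c in partials)
        return total, count

# Two sensor leaves feed a forward-only relay; a processing-capable edge node
# reduces everything it receives before results leave the edge network.
leaf_a = Node(False, readings=[21.0, 22.5])
leaf_b = Node(False, readings=[19.5])
relay = Node(False, children=[leaf_a, leaf_b])
edge = Node(True, children=[relay])
total, count = edge.collect()
print(total / count)                     # mean computed at the edge, not the cloud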
With the evolution of Internet standards and advancements in various Internet and mobile technologies, especially since Web 4.0, more and more web and mobile applications have emerged, such as e-commerce, social networks, online gaming, and Internet of Things-based applications. Due to the deployment of, and concurrent access to, these applications on the Internet and mobile devices, the amount and variety of data generated increase exponentially, and the new era of Big Data has come into existence. Presently available data structures and data analysis algorithms are not capable of handling such Big Data. Hence, there is a need for scalable, flexible, parallel, and intelligent data analysis algorithms to handle and analyze complex massive data. In this article, we propose a novel distributed supervised machine learning algorithm, called MR-DWkNN, based on the MapReduce programming model and the Distance-Weighted k-Nearest Neighbor algorithm, to process and analyze Big Data in a Hadoop cluster environment. The proposed distributed algorithm is based on supervised learning and performs both regression and classification tasks on large volumes of Big Data. Three performance metrics are used to evaluate MR-DWkNN: Root Mean Squared Error (RMSE) and the coefficient of determination (R2) for regression tasks, and Accuracy for classification tasks. Extensive experimental results show an average increase of 3% to 4.5% in prediction and classification performance compared to a standard distributed k-NN algorithm, along with a considerable decrease in RMSE and good parallelism characteristics of scalability and speedup, proving the algorithm's effectiveness in Big Data prediction and classification applications.
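The per-neighborhood prediction step of a distance-weighted k-NN job might look like the sketch below, where neighbours vote or average with weight 1/distance, covering both the classification and regression cases; the function name, tiny in-memory dataset, and inverse-distance weighting form are assumptions for illustration rather than the exact MR-DWkNN formulation.

# Sketch of the distance-weighted k-NN prediction step (roughly the per-query
# logic a reducer would apply in an MR-DWkNN-style job): the k nearest
# training points vote (classification) or average (regression) with weight
# 1/distance, so closer neighbours count more.
import numpy as np

def dwknn_predict(X_train, y_train, x, k=3, classify=True, eps=1e-9):
    """Predict the label (classification) or value (regression) for one query x."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    weights = 1.0 / (dists[nearest] + eps)
    if classify:
        votes = {}
        for idx, w in zip(nearest, weights):
            votes[y_train[idx]] = votes.get(y_train[idx], 0.0) + w
        return max(votes, key=votes.get)
    return float(np.dot(weights, y_train[nearest]) / weights.sum())

X = np.array([[1.0, 1.0], [1.2, 0.9], [8.0, 8.0], [8.2, 7.9]])
y_cls = np.array([0, 0, 1, 1])
print(dwknn_predict(X, y_cls, np.array([1.1, 1.0]), k=3))                  # -> 0
y_reg = np.array([10.0, 11.0, 80.0, 82.0])
print(dwknn_predict(X, y_reg, np.array([8.1, 8.0]), k=3, classify=False))  # ~80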