Software engineering is a data-driven discipline and an integral part of data science. The introduction of big data systems has transformed the architecture, methodologies, knowledge domains, and skills related to software engineering. Accordingly, education programs must now adapt to current developments by first identifying the competencies that big data software engineering requires, so that they meet industrial needs and follow the latest trends. This paper aims to reveal the knowledge domains and skill sets required for big data software engineering and to develop a taxonomy by mapping these competencies. A semi-automatic methodology is proposed for the semantic analysis of the textual contents of online job advertisements related to big data software engineering. This methodology uses latent Dirichlet allocation (LDA), a probabilistic topic-modeling technique, to discover the hidden semantic structures in a given textual corpus. The output of this paper is a systematic competency map comprising the essential knowledge domains, skills, and tools for big data software engineering. The findings are expected to help evaluate and improve IT professionals' vocational knowledge and skills, identify professional roles and competencies in companies' personnel recruitment processes, and meet the skill requirements of the industry through software engineering education programs. Additionally, the proposed model can be extended to blogs, social networks, forums, and other online communities to allow automatic identification of emerging trends and to generate contextual tags.
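As a rough illustration of the topic-modeling step named in the abstract above, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling, using only the Python standard library. It is not the authors' pipeline: the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, the iteration count, and the toy job-ad snippets are all assumptions made for illustration.

```python
import random


def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0, top_n=3):
    """Fit LDA on tokenized documents with collapsed Gibbs sampling.

    Returns (theta, top_words): per-document topic proportions and the
    most frequent words assigned to each topic. All defaults are
    illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables maintained by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Random initial topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Smoothed per-document topic proportions.
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    # Highest-count words per topic.
    top_words = [
        [vocab[v] for v in sorted(range(vocab_size), key=lambda v: -topic_word[k][v])[:top_n]]
        for k in range(num_topics)
    ]
    return theta, top_words


# Hypothetical, pre-tokenized job-ad snippets (not data from the paper).
ads = [
    ["hadoop", "spark", "scala", "cluster"],
    ["spark", "kafka", "streaming", "cluster"],
    ["agile", "scrum", "testing", "requirements"],
    ["testing", "requirements", "agile", "design"],
]
theta, top_words = lda_gibbs(ads, num_topics=2)
```

Each row of `theta` gives one advertisement's mixture over topics, and `top_words` lists candidate labels per topic. In a semi-automatic workflow such as the one the abstract describes, a human analyst would still have to interpret and name the resulting topics.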
ISBN: 9781665437844 (print)
The big data revolution began when the volume, velocity, and variety of data completely overwhelmed the systems used to store, manipulate, and analyze that data. As a result, a new class of software systems emerged, called big data systems. While many have attempted to harness the power of these new systems, it is estimated that approximately 75% of big data projects have failed within the last decade. One of the root causes is the software engineering and architecture aspect of these systems. This paper aims to facilitate big data system development by introducing a software reference architecture (RA). The work provides an event-driven microservices architecture that addresses specific limitations in current big data RAs. The artefact development followed the principles of empirically grounded RAs. The RA was evaluated by developing a prototype that solves a real-world problem in practice. Finally, a successful implementation of the reference architecture is presented. The results showed a good degree of applicability with respect to quality factors.
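The abstract above does not describe the internals of its reference architecture, so the following is only a generic sketch of the event-driven style it names: services that never call each other directly and communicate through published events. The class names `EventBus`, `IngestionService`, and `AnalyticsService`, and the topic name `record.ingested`, are hypothetical and do not come from the paper. A real deployment would likely replace the in-process bus with a message broker.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Event = Dict[str, object]
Handler = Callable[[Event], None]


class EventBus:
    """In-process stand-in for a message broker (illustrative assumption)."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, event: Event) -> None:
        # Deliver the event to every handler registered for the topic.
        for handler in list(self._handlers.get(topic, [])):
            handler(event)


class IngestionService:
    """Hypothetical service that emits an event for each record it accepts."""

    def __init__(self, bus: EventBus) -> None:
        self._bus = bus

    def ingest(self, record: Dict[str, object]) -> None:
        self._bus.publish("record.ingested", {"record": record})


class AnalyticsService:
    """Hypothetical service that reacts to ingestion events."""

    def __init__(self, bus: EventBus) -> None:
        self.count = 0
        bus.subscribe("record.ingested", self._on_ingested)

    def _on_ingested(self, event: Event) -> None:
        self.count += 1


bus = EventBus()
analytics = AnalyticsService(bus)
ingestion = IngestionService(bus)
ingestion.ingest({"id": 1})
ingestion.ingest({"id": 2})
```

Because `IngestionService` knows only the bus and the topic name, further consumers can be added without changing the producer, which is the decoupling that event-driven designs aim for.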
ISBN: 9780769556703 (print)
An important area of work in big data software engineering involves the design and development of software frameworks for data-intensive systems that perform large-scale data collection and analysis. We report on our work to design and develop a software framework for analyzing the collaborative editing behavior of OpenStreetMap users when working on the task of crisis mapping. Crisis mapping occurs after a disaster or humanitarian crisis and involves the coordination of a distributed set of users who collaboratively work to improve the quality of the map for the impacted area in support of emergency response efforts. Our paper presents the challenges related to the analysis of OpenStreetMap and how our software framework tackles those challenges to enable the efficient processing of gigabytes of OpenStreetMap data. Our framework has already been deployed to analyze crisis mapping efforts in 2015 and has an active development community.