ISBN: 9781467387767 (print)
Process Mining, which analyzes event logs generated by various systems, is a highly computation- and I/O-intensive task. Distributed computing and Big Data processing frameworks make it possible to distribute all kinds of computation tasks across multiple computers instead of performing the whole task on a single computer. This paper assesses whether contemporary Big Data processing frameworks that support Structured Query Language (SQL) are mature enough to efficiently distribute the computation of two central Process Mining tasks across two dissimilar clusters of computers providing BPM as a service in the cloud. Tests are performed using a novel automatic testing framework detailed in this paper and its supporting materials. As a result, an assessment is made of how well the selected Big Data processing frameworks manage to process and parallelize the analysis work required by Process Mining tasks.
Users increasingly want to retrieve information from many individual database systems distributed across many virtual organizations. Connecting all these individual database systems into one virtual database system is a significant issue. This paper presents VIRGO_DDBMS, a framework for a distributed database system based on the Virtual Hierarchical Tree Grid Organizations (VIRGO) P2P network. The table space is classified into hierarchical domains, as in DNS. The servers hosting database systems join several groups of the VIRGO network according to the domains of the tables they own. VIRGO_DDBMS is effective for updating and retrieving data using distributed SQL statements. Policies that cache the addresses of the machines hosting the database systems are used to avoid overloading the root nodes of the tree structure with traffic. The proposal presented here can construct a virtual distributed database system that can manage huge volumes of data by connecting the database systems owned by the individual virtual organizations.
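The idea of DNS-like table domains combined with address caching can be illustrated with a minimal sketch. It is not the VIRGO_DDBMS implementation: the class names `DomainNode` and `TableDirectory`, the dotted naming scheme (most specific label first), the host address strings, and the `root_lookups` counter are all assumptions made for illustration. The sketch only shows how a cache of host addresses can keep repeated lookups away from the root of a domain tree.

```python
from typing import Dict, List


class DomainNode:
    """One level of a hypothetical DNS-like table-domain tree."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.children: Dict[str, "DomainNode"] = {}
        self.hosts: List[str] = []  # addresses of servers owning this table


class TableDirectory:
    """Hypothetical directory: resolves a table domain to host addresses."""

    def __init__(self) -> None:
        self.root = DomainNode("")
        self.cache: Dict[str, List[str]] = {}
        self.root_lookups = 0  # how often a lookup had to start at the root

    def register(self, table_domain: str, host: str) -> None:
        # Walk from the top-level label down, e.g. "orders.sales.acme"
        # is stored under acme -> sales -> orders.
        node = self.root
        for label in reversed(table_domain.split(".")):
            node = node.children.setdefault(label, DomainNode(label))
        node.hosts.append(host)

    def resolve(self, table_domain: str) -> List[str]:
        # Serve from the cache when possible so the root is not contacted.
        if table_domain in self.cache:
            return self.cache[table_domain]
        self.root_lookups += 1
        node = self.root
        for label in reversed(table_domain.split(".")):
            child = node.children.get(label)
            if child is None:
                return []  # unknown domain; misses are not cached
            node = child
        # Cache a copy; cache invalidation is out of scope for this sketch.
        self.cache[table_domain] = list(node.hosts)
        return self.cache[table_domain]


if __name__ == "__main__":
    directory = TableDirectory()
    directory.register("orders.sales.acme", "10.0.0.5:5432")
    print(directory.resolve("orders.sales.acme"))
    print(directory.resolve("orders.sales.acme"))  # served from the cache
    print(directory.root_lookups)
```

In this sketch only the first lookup of a table domain walks the tree from the root; later lookups of the same domain are answered from the local cache, which is the traffic-reduction effect the abstract attributes to its caching policies.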