Purpose - The aim of this paper is to propose a web environment for pre- and post-processing of 2D problems in generalized coordinate systems. Design/methodology/approach - The system consists of a web service for client-server communication, a database for user information, simulation requests and results storage, a calculation processing module (front-end) and a graphical interface for visualization of the discretized mesh (back-end). Findings - The web system was able to model real problems and situations, where the user can describe the problem or upload a geometry descriptor file generated from computer graphics software. The web system, programmed for finite difference solutions, was also able to generate a mesh from other, more complex methods, such as the finite element method, adapting it to the proposed web system while respecting the finite difference mesh structure. Research limitations/implications - The proposed web system is limited to solving partial differential equations by finite difference discretization. Further study of refinement and parameter adaptation is needed to handle partial differential equations simulated with other methods. Practical implications - The web system has implications for the development of a powerful simulator of real problems, which is useful for computational physics researchers and engineers. The web system uses several technologies, such as PrimeFaces, JavaScript, jQuery and HTML, to provide an interactive user interface. Originality/value - The main contribution of this work is the availability of a generic web architecture for including other types of coordinate systems and for solving other partial differential equations. Moreover, this paper presents an extended version of the work presented at ICCSA 2014.
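For illustration only, the following Python sketch shows how a client might interact with a web service of this kind: it uploads a geometry descriptor file and polls for the discretized mesh. The base URL, endpoint paths and JSON fields are assumptions made for the sketch and are not taken from the published system.

```python
# Hypothetical client for a pre/post-processing web service of this kind.
# The base URL, endpoint paths and response fields are illustrative only.
import time
import requests

BASE_URL = "http://example.org/mesh-service"  # placeholder address

def submit_simulation(geometry_path: str) -> str:
    """Upload a geometry descriptor and return the server-side job id."""
    with open(geometry_path, "rb") as f:
        resp = requests.post(f"{BASE_URL}/simulations", files={"geometry": f})
    resp.raise_for_status()
    return resp.json()["job_id"]

def wait_for_mesh(job_id: str, poll_seconds: float = 2.0) -> dict:
    """Poll the job until the discretized mesh is available, then fetch it."""
    while True:
        status = requests.get(f"{BASE_URL}/simulations/{job_id}").json()
        if status["state"] == "finished":
            return requests.get(f"{BASE_URL}/simulations/{job_id}/mesh").json()
        time.sleep(poll_seconds)

if __name__ == "__main__":
    job = submit_simulation("domain.geo")
    mesh = wait_for_mesh(job)
    print(len(mesh["nodes"]), "mesh nodes received")
```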
A plethora of modern-day techniques allows the detailed characterization of the transcriptome on a quantitative level. Analyses, based on techniques such as cDNA microarrays or RNA-seq (whole transcriptome shotgun sequencing), are usually genome wide in scope and readily detect small changes in gene expression levels across different biological samples. However, when it comes to spatial localization of gene expression within the context of complex tissues, traditional methods of in situ hybridization remain unparalleled with regard to their cellular resolution. Here we review methods that extend classical in situ hybridization protocols and techniques to the special needs of high-throughput (HT) studies and which can be readily scaled up to a genomic level to cover organs or even whole organisms in great detail. Moreover, we discuss suitable HT instrumentation and address postproduction issues typically arising with HT pipelines such as annotation of expression data and database organization.
Purpose - The purpose of this study is to introduce several metrics that enable universal and fine-grained characterization of arbitrary Linked Data repositories. Publicly accessible SPARQL endpoints contain vast amounts of knowledge from a large variety of domains. However, these endpoints are often not configured to process specific workloads as efficiently as possible. Assisting users in leveraging SPARQL endpoints requires insight into functional and non-functional properties of these knowledge bases. Design/methodology/approach - This study presents comprehensive approaches for deriving these metrics. More specifically, the study utilizes concrete SPARQL queries to determine the corresponding values. Furthermore, it validates and discusses the introduced metrics through extensive evaluation on real-world SPARQL endpoints. Findings - The evaluation determined that endpoints exhibit different characteristics. While it comes as no surprise that latency and throughput are influenced by the network infrastructure, the costs for join operations depend on a number of factors that are not obvious to a data consumer. Moreover, in discussing mean, median and upper quartile values, the author found both endpoints that behave consistently and repositories that offer varying levels of performance. Originality/value - On the one hand, the contribution of this work lies in assisting data consumers in evaluating the quality of service of publicly available SPARQL endpoints. On the other hand, the performance metrics introduced in this study can also be considered as additional input features for distributed query processing frameworks. Moreover, the author provides a universal means for discerning characteristics of different SPARQL endpoints without the need for (synthetic or real-world) query workloads.
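As a minimal sketch of how such metrics can be measured in practice, the snippet below times a simple probe query against a public SPARQL endpoint with the SPARQLWrapper library and reports mean and median latency. The endpoint URL and the probe query are examples only; the concrete metric queries of the study are not reproduced here.

```python
# Rough latency probe for a public SPARQL endpoint (illustrative only;
# the actual metric queries used in the study are not reproduced here).
import time
from statistics import mean, median
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://dbpedia.org/sparql"  # example public endpoint
PROBE_QUERY = "SELECT ?s WHERE { ?s ?p ?o } LIMIT 100"

def measure_latency(runs: int = 5) -> list[float]:
    """Run the probe query several times and record wall-clock latency."""
    client = SPARQLWrapper(ENDPOINT)
    client.setQuery(PROBE_QUERY)
    client.setReturnFormat(JSON)
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        client.query().convert()  # execute the query and parse the response
        latencies.append(time.perf_counter() - start)
    return latencies

if __name__ == "__main__":
    samples = measure_latency()
    print(f"mean {mean(samples):.3f}s, median {median(samples):.3f}s")
```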
The growing number of structured web databases on the web confronts large-scale Deep Web data integration with enormous challenges. Organizing such structured web databases into a hierarchical directory tree is one of the critical steps towards the large-scale integration of the Deep Web. In this paper, a method for automatic classification of web databases is addressed. Firstly, a method for calculating the semantic similarities among web databases based on their interface schemas is proposed, and the task is translated into an extended optimal matching problem on a bipartite graph. Then, based on the resulting similarity matrix, an agglomerative hierarchical clustering algorithm is proposed, which can organize the web databases into a hierarchical tree automatically. Theoretical analysis and experimental results show that the method is efficient.
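The general idea can be sketched as follows: attribute-level similarities between two query interfaces are combined by solving an optimal bipartite matching (here with SciPy's assignment solver), and the resulting schema distances feed an agglomerative hierarchical clustering. The token-overlap attribute similarity and the average-linkage choice are placeholders, not the paper's exact formulation.

```python
# Sketch of schema-level similarity via optimal bipartite matching,
# followed by agglomerative clustering (illustrative; the attribute
# similarity and linkage are placeholders, not the paper's formulation).
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def attr_similarity(a: str, b: str) -> float:
    """Toy attribute similarity: token overlap (Jaccard) of label words."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def schema_similarity(schema_a: list[str], schema_b: list[str]) -> float:
    """Optimal matching between the attribute sets of two query interfaces."""
    cost = np.array([[1.0 - attr_similarity(a, b) for b in schema_b]
                     for a in schema_a])
    rows, cols = linear_sum_assignment(cost)  # minimizing cost maximizes similarity
    matched = (1.0 - cost[rows, cols]).sum()
    return matched / max(len(schema_a), len(schema_b))

def cluster_databases(schemas: list[list[str]], max_dist: float = 0.7):
    """Group web databases by agglomerative clustering of schema distances."""
    n = len(schemas)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = 1.0 - schema_similarity(schemas[i], schemas[j])
    tree = linkage(squareform(dist), method="average")
    return fcluster(tree, t=max_dist, criterion="distance")

if __name__ == "__main__":
    interfaces = [["book title", "author"], ["title", "author name"],
                  ["departure city", "arrival city"]]
    print(cluster_databases(interfaces))  # the two book-search interfaces group together
```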
ISBN:
(Print) 9781450323512
With the proliferation of very large data repositories hidden behind web interfaces, e.g., keyword search, form-like search and hierarchical/graph-based browsing interfaces for ***, ***, etc., efficient ways of searching, exploring and/or mining such web data are of increasing importance. There are two key challenges facing these tasks: how to properly understand web interfaces, and how to bypass the interface restrictions. In this tutorial, we start with a general overview of web search and data mining, including various exciting applications enabled by the effective search, exploration, and mining of web repositories. Then, we focus on the fundamental developments in the field, including web interface understanding, crawling, sampling, and data analytics over web repositories with various types of interfaces. We also discuss the potential changes required for query processing, data mining and machine learning algorithms to be applied to web data. Our goal is two-fold: one is to promote the awareness of existing web data search/exploration/mining techniques among all web researchers who are interested in leveraging web data, and the other is to encourage researchers, especially those who have not previously worked in web search and mining, to initiate their own research in these exciting areas.
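To make the sampling idea concrete, here is a deliberately simplified, self-contained sketch of pool-based sampling through a keyword-search interface over a toy in-memory corpus; the query pool, result cap and uniform pick are illustrative assumptions and ignore the ranking bias and rate limits a real hidden-web sampler must handle.

```python
# Simplified pool-based sampling through a keyword-search interface.
# The in-memory corpus stands in for a repository hidden behind the interface.
import random

RESULT_CAP = 5  # mimic a top-k restriction imposed by the search interface

CORPUS = {
    "d1": "cheap flights to tokyo", "d2": "used car listings",
    "d3": "tokyo hotel deals", "d4": "flights and hotels bundle",
    "d5": "classic car auction",
}

def search(keyword: str) -> list[str]:
    """Keyword interface: return at most RESULT_CAP matching document ids."""
    hits = [doc_id for doc_id, text in CORPUS.items() if keyword in text]
    return hits[:RESULT_CAP]

def sample_documents(query_pool: list[str], n_samples: int, seed: int = 0) -> list[str]:
    """Draw documents by issuing random queries and picking one random hit each time."""
    rng = random.Random(seed)
    samples = []
    while len(samples) < n_samples:
        hits = search(rng.choice(query_pool))
        if hits:
            samples.append(rng.choice(hits))
    return samples

if __name__ == "__main__":
    print(sample_documents(["tokyo", "car", "flights"], n_samples=4))
```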
ISBN:
(Print) 9781479956661
Recent efforts in data cleaning of structured data have focused exclusively on problems like data deduplication, record matching, and data standardization; none of these focus on fixing incorrect attribute values in tuples. Correcting values in tuples is typically performed by a minimum cost repair of tuples that violate static constraints like CFDs (which have to be provided by domain experts, or learned from a clean sample of the database). In this paper, we provide a method for correcting individual attribute values in a structured database using a Bayesian generative model and a statistical error model learned from the noisy database directly. We thus avoid the necessity for a domain expert or clean master data. We also show how to efficiently perform consistent query answering using this model over a dirty database, in case write permissions to the database are unavailable. We evaluate our methods over both synthetic and real data.
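A highly simplified sketch of the underlying idea follows: for each observed attribute value, pick the candidate that maximizes a prior estimated from value frequencies in the noisy column itself times a simple error likelihood. The edit-distance error model and the candidate set used here are placeholder assumptions, not the learned error model of the paper.

```python
# Toy Bayesian value correction: argmax over candidates of
# P(candidate) * P(observed | candidate). The prior comes from value
# frequencies in the (noisy) column itself; the error model below is a
# simple edit-distance likelihood, standing in for a learned model.
import math
from collections import Counter

def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]

def correct_value(observed: str, column_values: list[str], noise: float = 1.0) -> str:
    """Return the candidate value with the highest posterior score."""
    counts = Counter(column_values)
    total = sum(counts.values())
    best, best_score = observed, float("-inf")
    for candidate, freq in counts.items():
        prior = math.log(freq / total)
        likelihood = -noise * edit_distance(observed, candidate)  # log P(obs | cand)
        if prior + likelihood > best_score:
            best, best_score = candidate, prior + likelihood
    return best

if __name__ == "__main__":
    city_column = ["Berlin", "Berlin", "Munich", "Berlim", "Berlin", "Munich"]
    print(correct_value("Berlim", city_column))  # corrected to "Berlin"
```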
Purpose - The purpose of this paper is to activate latent users' posts by modeling user behaviors as transitions among clusters that represent particular posting activities. Twitter has rapidly spread and become an easy and convenient microblog that enables users to exchange instant text messages called tweets. There are many latent users whose posting activities have decreased. Design/methodology/approach - Under this model, two kinds of time-series analysis methods are proposed to clarify the lifecycles of Twitter users. In the first one, all users belong to a cluster consisting of several features at individual time slots and move among the clusters in a time series. In the second one, the posting activities of Twitter users are analyzed by the amount of tweets that vary with time. Findings - A sophisticated evaluation using a large set of actual tweets demonstrated the proposed methods' effectiveness. The authors found a big difference in the state transition diagrams between long- and short-term users. Analysis of short-term users yields effective knowledge for encouraging continued Twitter use. Originality/value - An efficient user behavior model, which describes transitions of posting activities, is proposed. Two kinds of longitudinal time-series analysis methods are evaluated using a large amount of actual tweets.
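As a rough sketch of the first analysis, users can be clustered independently in each time slot on simple posting features and their movement between clusters counted across consecutive slots; the features, the k-means choice and the raw transition counts below are illustrative assumptions rather than the paper's exact model.

```python
# Sketch of the cluster-transition idea: cluster users per time slot on
# posting features, then count how users move between clusters across
# consecutive slots (features and k-means choice are illustrative only).
import numpy as np
from sklearn.cluster import KMeans

def cluster_per_slot(features_by_slot: list, k: int = 3, seed: int = 0):
    """Assign each user to a cluster independently in every time slot."""
    return [KMeans(n_clusters=k, random_state=seed, n_init=10).fit_predict(x)
            for x in features_by_slot]

def transition_matrix(labels_prev, labels_next, k: int = 3):
    """Count user movements between clusters of two consecutive slots."""
    counts = np.zeros((k, k), dtype=int)
    for a, b in zip(labels_prev, labels_next):
        counts[a, b] += 1
    return counts

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two time slots, 100 users, 2 features each (e.g., tweets/day, replies/day).
    slots = [rng.random((100, 2)), rng.random((100, 2))]
    labels = cluster_per_slot(slots)
    print(transition_matrix(labels[0], labels[1]))
```

Note that cluster labels are not aligned across slots in this sketch; in practice the clusters would be matched or characterized by their feature centroids before interpreting the resulting transition diagram.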
With the development of Web 2.0 technology, enormous amounts of data are generated every day. Among these data, there exists considerable uncertainty due to careless data entry, incomplete information, and inconsistency among different data descriptions. Although significant effort has been devoted to finding effective and efficient solutions for managing and mining general uncertain data, little attention has been paid to managing uncertain data on the web. This special issue was proposed to attract research attempts at handling the uncertainty of web data. The special issue attracted 12 submissions; after two rounds of very careful reviews by domain experts, we accepted three excellent papers. These three papers present new ideas to address issues in probabilistic web data management.
We are pleased to announce the publication of this third Database Issue of Plant and Cell Physiology (PCP). It contains four new databases and seven updated databases (Tables 1, 2). Our aim with this issue is to provide a forum for discussion of bioinformatics research, in particular the development and maintenance of the infrastructure of web databases for plant science (Matsuoka and Yano 2010). The databases described in this issue cover a broad range of omics topics. The genome and transcriptome databases permit management of the flood of data from recent high-throughput sequencers, and have been rapidly extended to apply to non-model plants. On the other hand, metabolome and phenome data still require databases and web tools to store, annotate and compare the data. In the following paragraphs, we briefly introduce the 11 databases in this issue and broadly describe their functions.