Extracting multi-records from web pages is useful because it allows us to integrate information from multiple sources to provide value-added services. Existing techniques still have some limitations because of their several ...
In an effort to provide lawful interception for Session Initiation Protocol (SIP) voice over Internet protocol (VoIP), an interception architecture using a session border controller (SBC) is proposed. Moreover, a protot...
Given a set of client locations, a set of facility locations where each facility has a service capacity, and the assumptions that: (i) a client seeks service from its nearest facility; (ii) a facility provides service ...
Audio-Visual Question Answering (AVQA) is a challenging multimodal reasoning task requiring intelligent systems to answer natural language queries based on paired audio-video inputs accurately. However, existing AVQA ...
The volume of RDF data has increased dramatically in recent years, and cloud computing platforms like Hadoop are considered a good choice for processing queries over huge data sets because of their scalability. Previous work on evaluating SPARQL queries with Hadoop mainly focuses on reducing the number of joins through careful splitting of HDFS files and algorithms for generating Map/Reduce jobs. However, the way RDF data is partitioned can also affect system performance. Specifically, a good partitioning solution would greatly reduce or even totally avoid cross-node joins, and significantly cut the cost of query evaluation. Based on HadoopDB, this work processes SPARQL queries in a hybrid architecture, where Map/Reduce takes charge of the computing tasks, and RDF query engines like RDF-3X store the data and execute join operations. Based on an analysis of query workloads, this work proposes a novel algorithm for automatically partitioning RDF data and an approximate solution for physically placing the partitions in order to reduce data redundancy. It also discusses how to make a good trade-off between query evaluation efficiency and data redundancy. All of these proposed approaches have been evaluated by extensive experiments over large RDF data sets.
All-pairs SimRank calculation is a classic SimRank problem. However, all-pairs algorithms suffer from both efficiency and accuracy issues. In this paper, we convert the non-linear SimRank calculation into a new simp...
Excellent performance in short text classification has emerged in the past few years. However, massive short texts with few words, like invoice data, differ from traditional short texts like tweets in their no...
The performance of online analytical processing (OLAP) is critical for meeting the increasing requirements of massive-volume analytical applications. Typical techniques, such as in-memory processing, column storage, and join indexes, focus on high-performance storage media, efficient storage models, and reduced query processing. While they effectively serve OLAP applications, there is a vital limitation: main-memory database based OLAP (MMOLAP) cannot provide high performance for large data sets. In this paper, we propose a novel memory dimension table model, in which the primary keys of the dimension table can be directly mapped to dimensional tuple addresses. To achieve higher performance of dimensional tuple access, we optimize our storage model for dimension tables based on OLAP query workload features. We present directly dimensional tuple accessing (DDTA) based join (DDTA-JOIN), a technique to optimize query processing on the memory dimension table by direct dimensional tuple access. We also propose an optimization of the predicate tree that shortens predicate operation length by pruning useless predicate processing. Our experimental results show that the DDTA-JOIN algorithm is superior to both simulated row-store main memory query processing and the open-source column-store main memory database MonetDB, thanks to the reduced join cost and simple yet efficient query processing.
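The core idea stated in this abstract, mapping dimension primary keys directly to tuple addresses, can be illustrated with a minimal sketch. It is not the paper's DDTA-JOIN implementation: it assumes dense surrogate keys starting at 0 so that a key can serve as an array index, and the table contents and the names `dim_customer`, `fact_rows`, and `direct_access_join` are hypothetical.

```python
# Hypothetical dimension table: the tuple with primary key k is stored at
# index k, so a foreign key in the fact table doubles as the tuple address.
dim_customer = ["ASIA", "EUROPE", "ASIA", "AMERICA"]  # key -> region

# Hypothetical fact table rows: (customer_key, revenue).
fact_rows = [(0, 10.0), (1, 5.0), (2, 7.5), (3, 2.0), (0, 1.0)]


def direct_access_join(facts, dim, predicate):
    """Scan the fact table once and resolve each foreign key by direct
    array indexing instead of building and probing a hash table."""
    totals = {}
    for key, revenue in facts:
        region = dim[key]  # direct tuple access via the key
        if predicate(region):  # dimension predicate applied on the fly
            totals[region] = totals.get(region, 0.0) + revenue
    return totals


# Revenue per region, excluding one region via a dimension predicate.
result = direct_access_join(fact_rows, dim_customer, lambda r: r != "AMERICA")
print(result)
```

Under these assumptions the join needs no hash-build phase; each fact row costs one array lookup on the dimension side.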
Domain adaptation aims to transfer knowledge from the labeled source domain to an unlabeled target domain that follows a similar but different ***, adversarial-based methods have achieved remarkable success due to the excellent performance of domain-invariant feature presentation ***, the adversarial methods learn the transferability at the expense of the discriminability in feature representation, leading to low generalization to the target *** this end, we propose a Multi-view Feature Learning method for the Over-penalty in Adversarial Domain ***, multi-view representation learning is proposed to enrich the discriminative information contained in domain-invariant feature representation, which will counter the over-penalty for discriminability in adversarial ***, the class distribution in the intra-domain is proposed to replace that in the inter-domain to capture more discriminative information in the learning of transferable *** experiments show that our method can improve the discriminability while maintaining transferability and exceeds the most advanced methods on the domain adaptation benchmark datasets.
Existing research on extreme value queries in wireless sensor networks mainly focuses on finding the sensors with the highest metric. Yet in most actual scenarios, people care more about special network regions than det...