Recently, the computational requirements for large-scale data-intensive analysis of scientific data have grown significantly. In High Energy Physics (HEP), for example, the Large Hadron Collider (LHC) produced 13 petabytes of data in 2010. This huge amount of data is processed at more than 140 computing centers distributed across 34 countries. The MapReduce paradigm has emerged as a highly successful programming model for large-scale data-intensive computing applications. However, current MapReduce implementations are developed to operate in single-cluster environments and cannot be leveraged for large-scale distributed data processing across multiple clusters. Workflow systems, on the other hand, are used for distributed data processing across data centers, but the workflow paradigm has been reported to have limitations in reliability and efficiency for this purpose. In this paper, we present the design and implementation of G-Hadoop, a MapReduce framework that aims to enable large-scale distributed computing across multiple clusters. G-Hadoop uses the Gfarm file system as its underlying file system and executes MapReduce tasks across distributed clusters. Experiments with the G-Hadoop framework on distributed clusters show encouraging results.
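As a rough illustration of the MapReduce programming model mentioned in this abstract, the following single-process sketch counts tokens in records that are notionally held by several clusters. It does not use Hadoop, G-Hadoop, or Gfarm; the function names and the `clusters` mapping are hypothetical and chosen only for this example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map step: emit a (token, 1) pair for each whitespace-separated token."""
    return [(token, 1) for token in record.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: sum the counts collected for one key."""
    return key, sum(values)


def run_job(clusters):
    """Run the toy job over a hypothetical {cluster name: [records]} mapping.

    In a real multi-cluster deployment the map tasks would run where the
    data lives; here everything runs in one process for illustration.
    """
    mapped = chain.from_iterable(
        map_phase(record)
        for records in clusters.values()
        for record in records
    )
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())


if __name__ == "__main__":
    demo = {
        "cluster_a": ["muon electron muon"],
        "cluster_b": ["electron photon"],
    }
    print(run_job(demo))
```

The sketch only shows the map, shuffle, and reduce stages; scheduling, data locality, and fault tolerance across clusters are outside its scope.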
We present an efficient controlled quantum perfect teleportation scheme. In our scheme, multiple senders can teleport multiple arbitrary unknown multi-qubit states to a single receiver via a previously shared entangled state with the help of one or more controllers. Furthermore, our scheme performs well in terms of measurement and operation complexity, since it requires only Bell-state and single-particle measurements, together with Controlled-NOT gates and other single-particle unitary operations. In addition, compared with traditional schemes, our scheme needs fewer qubits as quantum resources and exchanges less classical information, and thus achieves higher communication efficiency.
The chronic hyperglycemia of diabetes is associated with long-term damage, dysfunction, and failure of different organs, especially the eyes, kidneys, nerves, heart, and blood vessels. Regular examination of diabetic patients can potentially reduce the risk of vision impairment and, ultimately, blindness. Early detection of diabetic retinopathy enables the application of laser therapy to prevent or delay loss of vision. The diagnosis and detection of diabetic retinopathy is performed manually by specialized ophthalmologists and is an expensive procedure. Automatic exudate detection and retinal image classification would help reduce diabetic retinopathy screening costs and encourage regular examinations. We propose an automated algorithm that applies mathematical modeling to emphasize light intensity levels, which eases exudate detection and supports efficient and correct classification of retinal images. The proposed algorithm is robust to the various appearance changes of retinal fundus images that are usually processed in clinical environments.
Extracting the protocol message format specifications of unknown applications from network traces is important for a variety of applications such as application protocol parsing, vulnerability discovery, and system integration. In this paper, we propose ProDecoder, a network-trace-based protocol message format inference system that exploits the semantics of protocol messages without the executable code of application protocols. ProDecoder is based on the key insight that the n-grams of protocol traces exhibit a highly skewed frequency distribution that can be leveraged for accurate protocol message format inference. In ProDecoder, we discover the latent relationship among n-grams by first grouping protocol messages with the same semantics and then inferring message formats by keyword-based clustering and cluster sequence alignment. We implemented and evaluated ProDecoder to infer message format specifications of SMB (a binary protocol) and SMTP (a textual protocol). Our experimental results show that ProDecoder accurately parses and infers the SMB protocol with 100% precision and recall. For SMTP, ProDecoder achieves approximately 95% precision and recall.
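To make the n-gram observation in this abstract concrete, the sketch below counts byte-level n-grams over a handful of messages so that the most frequent ones can be inspected. It is not ProDecoder itself and covers none of its grouping, clustering, or alignment steps; the function names and the sample messages are hypothetical.

```python
from collections import Counter


def byte_ngrams(message: bytes, n: int):
    """Return every contiguous n-byte slice of a message, in order."""
    return [message[i:i + n] for i in range(len(message) - n + 1)]


def ngram_frequencies(messages, n=3):
    """Count n-gram occurrences across all messages in a trace."""
    counts = Counter()
    for message in messages:
        counts.update(byte_ngrams(message, n))
    return counts


if __name__ == "__main__":
    # Hypothetical textual trace; real traces would come from captured traffic.
    trace = [
        b"HELO example.org",
        b"HELO mail.example.org",
        b"MAIL FROM:<a@example.org>",
    ]
    for gram, count in ngram_frequencies(trace, n=4).most_common(5):
        print(gram, count)
```

In a skewed distribution, a few n-grams (for example, protocol keywords) would be expected to dominate the top of this list, while most n-grams appear only once or twice.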
Association Link Network (ALN) is a kind of Semantic Link Network built by mining the association relations among Web resources to effectively support intelligent Web applications such as Web-based learning, and kn...
With the rapid growth of service scale, there are many services on the Internet with the same functional properties but different non-functional properties. Several global optimizing service selection algorithms have been proposed. However, most of those approaches cannot fully reflect users' preferences or are not fully suitable for large-scale service selection. In this paper, an ant colony optimization (ACO) algorithm is employed for the model of global optimizing service selection with various quality of service (QoS) properties, and a user-preference-based large-scale service selection algorithm is proposed. This algorithm aims at optimizing user-preferred QoS properties and selecting services that meet all user-defined QoS thresholds. Experimental results show that this algorithm is very efficient in this regard.
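The selection criteria described in this abstract (meet every user-defined QoS threshold, then favor user-preferred properties) can be illustrated for a single task with a plain filter and weighted score. This sketch deliberately does not implement ant colony optimization or the authors' algorithm; it assumes hypothetical QoS attributes that are normalized so that higher values are better, and all names are illustrative.

```python
def meets_thresholds(qos, thresholds):
    """True if every thresholded QoS attribute reaches its minimum value."""
    return all(qos[name] >= minimum for name, minimum in thresholds.items())


def preference_score(qos, weights):
    """Weighted sum of QoS attributes; weights encode the user's preference."""
    return sum(weight * qos[name] for name, weight in weights.items())


def select_service(candidates, thresholds, weights):
    """Pick the feasible candidate with the highest preference score.

    Returns None when no candidate satisfies all thresholds.
    """
    feasible = {
        name: qos
        for name, qos in candidates.items()
        if meets_thresholds(qos, thresholds)
    }
    if not feasible:
        return None
    return max(feasible, key=lambda name: preference_score(feasible[name], weights))


if __name__ == "__main__":
    # Hypothetical candidates with normalized QoS values in [0, 1].
    candidates = {
        "s1": {"availability": 0.99, "speed": 0.5},
        "s2": {"availability": 0.90, "speed": 0.9},
        "s3": {"availability": 0.97, "speed": 0.7},
    }
    print(select_service(candidates, {"availability": 0.95},
                         {"availability": 0.3, "speed": 0.7}))
```

An exhaustive filter like this is only practical for one task with few candidates; composing many tasks is where a heuristic search such as ACO becomes relevant.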
Linear dimensionality reduction techniques have attracted wide attention in the fields of computer vision and pattern recognition. In this paper, we propose a novel framework called Sparse Bilinear Prese...
Word Sense Disambiguation (WSD) is one of the fundamental natural language processing tasks. However, the lack of training corpora is a bottleneck in constructing a highly accurate all-words WSD system. Annotating a large-scal...
Recent research usually models POS tagging as a sequential labeling problem, in which only local context features can be used. Due to the lack of morphological inflections, many tagging ambiguities in Chinese are diff...
In this paper, we formalize the task of finding a knowledge base entry that a given named entity mention refers to, namely entity linking, by identifying the most "important" node among the graph nodes repre...