With the rapid development of information storage and networking technologies, quintillions of bytes of data are generated every day from social networks, business transactions, sensors, and many other domains. The increasing data volumes pose significant challenges for traditional data analysis tools in storing, processing, and analyzing such extremely large-scale data. For decades, hashing has been one of the most effective tools for compressing data for fast access and analysis, as well as for verifying information integrity. Hashing techniques have also evolved from simple randomization approaches to advanced adaptive methods that consider locality, structure, label information, and data security. This survey reviews existing hashing techniques and organizes them into a taxonomy, in order to provide a comprehensive view of mainstream hashing techniques for different types of data and applications. The taxonomy also examines what is unique about each method and can therefore serve as a technical reference for understanding the niche of different hashing mechanisms in future development.
The mitochondrial genome (mitogenome) is one of the most widely used markers for phylogenetic analysis. Compared with whole-genome data, mitogenome data are less expensive to obtain and easier to manipulate. However, compositional bias and accelerated evolutionary rate reduce the effectiveness of the mitogenome in determining insect phylogeny. This study shows that mitogenome data are not suitable to reconstruct deep holometabolan evolution, even with a most comprehensive data coding scheme and the more realistic CAT model. For the deep levels of divergence within Holometabola, protein-coding genes only retain weak phylogenetic signals, leading to peculiar interordinal relationships. Consensus relationships in the Holometabola phylogeny, such as the monophyly of Holometabola, the most basal position of Hymenoptera, and the sister group relationship between the Strepsiptera and Coleoptera were rarely resolved in our analyses. The relationships of the holometabolan groups as inferred by mitogenomes are highly vulnerable to gene types, data coding regimes, model choice, and optimality criteria, and no consistent alternative hypothesis of Holometabola’s relationships is supported. Thus, we suggest that the slowly evolving nuclear genes or genome-scale approaches may be better options for resolving deep-level phylogeny of Holometabola.
ISBN: 9781479917778 (print)
A technique to reduce the frequency of write operations on the nonvolatile memory is proposed for reducing the dynamic power dissipation of nonvolatile logic LSI, which shortens its break-even time for power gating. The proposed technique combines a selective write method with a data coding technique. The selective write method compares input words with stored words and rejects redundant write operations. Moreover, the data coding technique shortens the Hamming distance between adjacent words in an input data sequence and reduces the frequency of bit reversal in the nonvolatile memory, which further reduces the power dissipation due to write operations. Through the design and evaluation of a nonvolatile 8-bit counter, it is observed that the proposed technique shortens the break-even time for power gating by up to 85.2% with a small hardware overhead.
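The interaction between selective write and data coding described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the abstract does not name the coding scheme, so reflected Gray code is used here as one well-known code whose adjacent words differ in a single bit, and the function names `to_gray`, `hamming`, and `count_writes` are hypothetical. The model counts word writes and bit reversals only; it does not model power or break-even time.

```python
def to_gray(n: int) -> int:
    """Reflected Gray code (assumed coding scheme, not taken from the paper)."""
    return n ^ (n >> 1)


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def count_writes(words, initial=0):
    """Model a selective write: skip a word equal to the stored word.

    Returns (word_writes, bit_flips), where bit_flips is the total number
    of bit reversals in the memory cell array over the whole sequence.
    """
    stored = initial
    word_writes = 0
    bit_flips = 0
    for word in words:
        if word == stored:
            continue  # redundant write rejected by the comparison step
        bit_flips += hamming(stored, word)
        word_writes += 1
        stored = word
    return word_writes, bit_flips


# An 8-bit counter sequence, stored as plain binary and as Gray code.
sequence = list(range(256))
binary_result = count_writes(sequence)
gray_result = count_writes([to_gray(n) for n in sequence])
print("binary:", binary_result)
print("gray:  ", gray_result)
```

In this model both encodings perform 255 word writes for the counter sequence, but the plain binary sequence causes 502 bit reversals while the Gray-coded sequence causes 255, which is the kind of reduction in Hamming distance between adjacent words that the abstract refers to.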
Medication exposure is an important variable in virtually all clinical research, yet there is great variation in how the data are collected, coded, and analyzed. Coding and classification systems for medication data are heterogeneous in structure, and there is little guidance for implementing them, especially in large research networks and multi-site trials. Current practices for handling medication data in clinical trials have emerged from the requirements and limitations of paper-based data collection, but there are now many electronic tools to enable the collection and analysis of medication data. This paper reviews approaches to coding medication data in multi-site research contexts, and proposes a framework for the classification, reporting, and analysis of medication data. The framework can be used to develop tools for classifying medications in coded data sets to support context-appropriate, explicit, and reproducible data analyses by researchers and secondary users in virtually all clinical research domains. (C) 2014 Elsevier Inc. All rights reserved.
ISBN: 9781479959792 (print)
Wireless Sensor Networks (WSNs) have been attracting increasing interest for supporting a new generation of ubiquitous computing systems with great potential for many applications. However, the communication paradigms in WSNs differ from those associated with traditional wireless networks, triggering the need for efficient wireless communication technology. Several wireless technologies have emerged, ranging from short to medium distance. Bluetooth, ZigBee and Impulse Radio Ultra Wide Band (IR-UWB) are three popular short-range wireless communication technologies. Due to its various features and advantages (especially low power consumption and low complexity), IR-UWB is a very promising wireless communication technology for WSNs. In this paper, we evaluate the main features and advantages of IR-UWB for WSNs in terms of transmission time, data coding and power consumption, compared to Bluetooth and ZigBee. For this evaluation, we used the MiXiM platform under the OMNeT++ simulator.
ISBN: 9781467363969; 9781467363952 (print)
The paper describes the proposed satellite mission BRICsat, for which all electronic parts have been designed. The modes of operation as well as the data formats for telemetry and archive data are described. The results from the evaluation measurements are included in the paper.
ISBN: 9781467355179; 9781467355162 (print)
The paper describes the proposed BRICsat mission, for which a receiver, a transmitter and a cooperating controller have been designed. The telemetry data format as well as the coding used for data compression are presented. The results from the thermal vacuum test of the device are included in the paper.
Political knowledge research faces a problem, perhaps even a crisis. For two decades, the American National Election Studies asked open-ended questions about political knowledge and coded answers using procedures that are neither reliable nor replicable and that were never shown to be optimally valid. Consequently, conclusions based on these widely used measures of the public's competence are in doubt. This article presents several new and overdue methodological improvements: coding knowledge data using formal and specific coding rules based on a substantive rationale for the validity of the codes, recognizing partially correct answers, using multiple coders working independently, using machine coding, and testing reliability and validity. The new methods are an improvement because they are transparent and replicable and they produce valid and extremely reliable knowledge data. Further, machine coding produces codes nearly identical to those from a team of human coders, at much lower cost.
ISBN: 9781479913909 (print)
This paper presents a GPU-based parallel method for computing the max ordinal number in universal combinatorics coding. The core of universal combinatorics coding is ordinal number computation, and the calculation of an ordinal number depends on the value of the max ordinal number. The calculation of the max ordinal number is divided into two parts: multiplication and division of large numbers. This paper focuses on GPU parallel computing of the multiplication part; the speed of the multiplication is increased substantially, which improves the efficiency of the max ordinal number calculation.
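To make the large-number multiplication step more concrete, the sketch below shows one common way such a product can be decomposed into independent pieces. It is not the paper's GPU kernel, and the abstract does not describe the decomposition actually used: the limb size, the column-wise split, and the names `to_limbs`, `column_sum`, and `multiply` are all assumptions made for illustration. Each call to `column_sum` depends only on its own index, so on a GPU each column could in principle be assigned to a separate thread; here the columns are simply evaluated in a loop.

```python
BASE_BITS = 16            # assumed limb width
BASE = 1 << BASE_BITS


def to_limbs(n: int) -> list:
    """Split a non-negative integer into base-2**16 limbs, least significant first."""
    limbs = []
    while n:
        limbs.append(n & (BASE - 1))
        n >>= BASE_BITS
    return limbs or [0]


def column_sum(a: list, b: list, k: int) -> int:
    """Sum of all partial products a[i] * b[j] with i + j == k.

    Each column is independent of the others, which is what makes
    this step a candidate for one-thread-per-column parallelism.
    """
    lo = max(0, k - len(b) + 1)
    hi = min(k, len(a) - 1)
    return sum(a[i] * b[k - i] for i in range(lo, hi + 1))


def multiply(x: int, y: int) -> int:
    """Multiply two non-negative integers via independent column sums."""
    a, b = to_limbs(x), to_limbs(y)
    # Parallelizable phase: one independent sum per output column.
    columns = [column_sum(a, b, k) for k in range(len(a) + len(b) - 1)]
    # Sequential phase: propagate carries from low columns to high columns.
    out, carry = [], 0
    for c in columns:
        total = c + carry
        out.append(total % BASE)
        carry = total // BASE
    while carry:
        out.append(carry % BASE)
        carry //= BASE
    return sum(limb << (BASE_BITS * i) for i, limb in enumerate(out))


x = 12345678901234567890
y = 98765432109876543210
print(multiply(x, y) == x * y)
```

The carry propagation remains sequential in this sketch; a real GPU implementation would typically also parallelize or restructure that phase.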
The point source information system of geology and mineral resources consists of three major parts: reconnaissance data management, reconnaissance data processing, and geological mineral source appraisal. Its structure lies in the combination of technical methods and application models, and its core is the point source database. As the application is used more frequently, the volume of data on the Internet increases drastically, which leads to more and more Internet attacks. Diverse security issues are hampering the normal running of the system. Therefore, guaranteeing an efficient and secure geological mineral information website has become a critical and pressing task.