The exponential growth in data volume has necessitated the adoption of alternative storage solutions, and DNA storage stands out as the most promising one. However, the exorbitant costs of synthesis and sequencing have impeded its development. Pre-compressing the data is recognized as one of the most effective approaches for reducing storage costs. However, different compression methods yield different compression ratios for the same file, and compressing a large number of files with a single method may not achieve the maximum compression ratio. This study proposes a multi-file dynamic compression method based on machine learning classification algorithms that selects an appropriate compression method for each file, minimizing the amount of data stored in DNA. First, four different compression methods are applied to the collected files. Then the optimal compression method for each file is used as the label, and the file type and size are used as features, to train seven machine learning classification algorithms. The results demonstrate that k-nearest neighbor outperforms the other machine learning algorithms on the validation and test sets most of the time, achieving an accuracy of over 85% with less volatility. Additionally, the k-nearest neighbor model achieves a compression rate of 30.85%, an improvement of more than 4.5% over the traditional single compression method, resulting in significant cost savings for DNA storage in the range of $0.48 to 3 billion/TB. Compared with the traditional compression method, the multi-file dynamic compression method compresses multiple files more effectively. It can therefore considerably decrease the cost of DNA storage and facilitate its widespread implementation. *** Abstract: File compression is an important step in DNA storage, and different types of files may have varyin
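The labeling and classification steps described in this abstract can be sketched as follows. This is a minimal illustration, not the study's implementation: the abstract does not name its four compression methods, so the three standard-library compressors below are stand-ins, and the feature encoding (a numeric type id and a log-scaled size) and the function names `best_method` and `knn_predict` are assumptions.

```python
import bz2
import lzma
import zlib
from collections import Counter

# Hypothetical candidate set; the study's four compression methods are not named.
COMPRESSORS = {
    "zlib": zlib.compress,
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}

def best_method(data: bytes) -> str:
    """Return the name of the compressor that yields the smallest output (the label)."""
    sizes = {name: len(fn(data)) for name, fn in COMPRESSORS.items()}
    return min(sizes, key=sizes.get)

def knn_predict(train, query, k=3):
    """Predict a label by majority vote among the k nearest training samples.

    train: list of ((type_id, log_size), label); query: (type_id, log_size).
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy training set (made-up features and labels, for illustration only).
train = [
    ((0, 3.0), "zlib"), ((0, 3.2), "zlib"),
    ((1, 6.0), "lzma"), ((1, 6.1), "lzma"), ((1, 5.9), "lzma"),
]
predicted = knn_predict(train, (1, 6.05))
```

In practice the labels would come from running `best_method` over the collected files, and the classifier would then choose a compressor for a new file from its type and size alone, without trying every method.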
With the rapid development of nanotechnology, researchers can prepare nanomaterials by various methods, and this has greatly promoted the wide application of nanomaterials in detection and catalysis. Metal nanomolecules in particular have good biocompatibility, surface plasmon resonance absorption, surface-enhanced Raman scattering, and other properties, and their applications in detection and catalysis have attracted wide attention. This paper studies a deep-learning-based defect detection and classification algorithm for metal nanomaterials. Through the experimental phenomena, we can understand and master the defect detection methods for metal nanomaterials, and we review the problems and future development directions in the preparation of metal nanomaterials. A deep learning network model, a multi-mode metal defect detection method, and the classification of metal surface defects are investigated. Deep-learning-based defect detection for metal nanomaterials is realized, covering both whole-defect detection and quantitative detection of metal defects. This overcomes the limited detection range of traditional single-mode nondestructive testing and its difficulty in achieving accurate quantitative detection. The results show that there are five characteristic parameters for metal nano surface defect detection. In the deep-learning-based defect detection and classification algorithm, big data technology is used to analyze the complete defect data, environmental data, and working intensity data in order to predict the future development trend of defects. This can play an important role in the maintenance of materials and is also of great significance to the development of metal nano detection technology.
Land cover classification is a vital application area in the satellite image processing domain. Texture is a useful feature in land cover classification. In this paper, we propose a distributed texture-based land cover classification algorithm using a Hidden Markov Model (HMM). Here, the HMM is used for texture-based classification of remotely sensed images. Furthermore, to enhance performance, the data-intensive remotely sensed image is segmented and distributed across parallel sessions. Experiments were conducted on IRS P6 LISS-IV data, and the results were evaluated based on the confusion matrix, classification accuracy, and Kappa statistics. These results indicate that the proposed algorithm achieves a classification accuracy of 88.75%.
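The evaluation measures named in this abstract (classification accuracy and the Kappa statistic, both derived from a confusion matrix) can be computed as in the sketch below. The function name and the example matrix are illustrative assumptions and are not taken from the paper's experiments.

```python
def accuracy_and_kappa(cm):
    """Compute overall accuracy and Cohen's kappa from a square confusion matrix.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = sum(sum(row) for row in cm)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(cm[i][i] for i in range(len(cm))) / n
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(col) for col in zip(*cm)]
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Made-up two-class confusion matrix, for illustration only.
acc, kappa = accuracy_and_kappa([[45, 5], [10, 40]])
```

Kappa discounts the agreement that would occur by chance, which is why it is commonly reported alongside raw accuracy for land cover maps with unevenly sized classes.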
Customer relationship management is a good business strategy for hotels, and applying a classification algorithm to a customer relationship management system can promote a hotel's development. First, a classification algorithm based on a support vector machine is studied, using the nearest-neighbor sample density, and the corresponding mathematical model is constructed. Second, the procedure of a classification algorithm based on a fuzzy support vector machine is designed. Third, a customer acquisition plan based on the classification algorithm is analyzed. Finally, a hotel is taken as the research object and a customer acquisition analysis is carried out. The results show that the new method trains faster and classifies more accurately.
This study focused on a novel approach for classifying hazardous chemicals that could be used for chemical terrorism. We developed a novel algorithm to classify nationally customized chemicals of interest (COI) out of the 325 COI in the USA. The proposed COI classification algorithm aims to identify a key set of factors that reflect nation-wide uniqueness: intentional use, objectives, toxicity, related laws (CWC, ITF-25, CAA, etc.), and responsive counter-actions to terrorism. Although the U.S. manages 325 COI to prevent terrorism, for some nations the management and control of all these hazardous chemicals is beyond their capability. Based on the outcome of this study, the Ministry of the Environment of Korea has made appropriate revisions to the relevant law. As a result, the Korean government has officially added a new set of 13 chemical species to the list of existing hazardous chemicals. This work contributes to protecting people's lives and property from possible chemical accidents, including terror by chemicals.
CO2 huff and puff technology can enhance the recovery of heavy oil in high-water-cut ***, the effectiveness of this method varies significantly under different geological and fluid conditions, which leads to a high-dimensional and small-sample (HDSS) *** is difficult for conventional techniques that identify key factors that influence CO2 huff and puff effects, such as fuzzy mathematics, to manage HDSS datasets, which often contain nonlinear and irremovable abnormal *** accurately pinpoint the primary control factors for heavy oil CO2 huff and puff, four machine learning classification algorithms were *** algorithms were selected to align with the characteristics of HDSS datasets, taking into account algorithmic principles and an analysis of key control *** results demonstrated that logistic regression encounters difficulties when dealing with nonlinear data, whereas the extreme gradient boosting and gradient boosting decision tree algorithms exhibit greater sensitivity to abnormal *** contrast, the random forest algorithm proved to be insensitive to outliers and provided a reliable ranking of factors that influence CO2 huff and puff *** top five control factors identified were the distance between parallel wells, cumulative gas injection volume, liquid production rate of parallel wells, huff and puff timing, and heterogeneous Lorentz *** research findings not only contribute to the precise implementation of heavy oil CO2 huff and puff but also offer valuable insights into selecting classification algorithms for typical HDSS data.
A new classification algorithm for web mining is proposed, based on a general classification algorithm for data mining, in order to implement personalized information services. A tree-building method that detects class thresholds is used to construct the decision tree according to the concept of user expectation, so as to find classification rules in different layers. Compared with the traditional C4.5 algorithm, the overfitting problem of C4.5 is mitigated, so that the classification results have not only much higher accuracy but also statistical significance.
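For context on the C4.5 baseline mentioned in this abstract, the sketch below computes the gain ratio, the criterion that standard C4.5 uses to choose a split attribute. It illustrates the baseline only. The threshold-detection and user-expectation mechanism proposed in the paper is not described in enough detail here to reproduce, and the function names and toy data are assumptions.

```python
import math
from collections import Counter

def entropy(items):
    """Shannon entropy (in bits) of the value distribution in `items`."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())

def gain_ratio(values, labels):
    """Information gain of splitting on `values`, divided by the split information."""
    n = len(labels)
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(v, []).append(lab)
    # Weighted entropy of the labels remaining after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0

# A two-valued attribute that separates the classes perfectly.
clean = gain_ratio(["a", "a", "b", "b"], [0, 0, 1, 1])
# An attribute with a distinct value per sample: same gain, but penalized.
fragmented = gain_ratio(["a", "b", "c", "d"], [0, 0, 1, 1])
```

The second attribute memorizes the training samples, and the split-information denominator halves its score. This is the kind of over-adaptation that pruning and threshold-based variants of C4.5 try to limit further.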
A wide range of classification methods have been used for the early detection of financial risks in recent years. How to select an adequate classifier (or set of classifiers) for a given dataset is an important task in financial risk prediction. Previous studies indicate that classifiers' performances in financial risk prediction may vary using different performance measures and under different circumstances. The main goal of this paper is to develop a two-step approach to evaluate classification algorithms for financial risk prediction. It constructs a performance score to measure the performance of classification algorithms and introduces three multiple criteria decision making (MCDM) methods (i.e., TOPSIS, PROMETHEE, and VIKOR) to provide a final ranking of classifiers. An empirical study is designed to assess various classification algorithms over seven real-life credit risk and fraud risk datasets from six countries. The results show that linear logistic, Bayesian Network, and ensemble methods are ranked as the top-three classifiers by TOPSIS, PROMETHEE, and VIKOR. In addition, this work discusses the construction of a knowledge-rich financial risk management process to increase the usefulness of classification results in financial risk detection. (C) 2010 Elsevier B.V. All rights reserved.
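Of the three MCDM methods named in this abstract, TOPSIS is the simplest to illustrate. The sketch below ranks alternatives by their relative closeness to an ideal solution. It is a generic TOPSIS with vector normalization. The criteria, weights, and scores are made up, and the paper's own performance score is not reproduced.

```python
import math

def topsis(matrix, weights, benefit):
    """Score alternatives with TOPSIS; returns closeness values in [0, 1].

    matrix[i][j]: score of alternative i on criterion j.
    weights[j]: criterion weight; benefit[j]: True if larger is better.
    Assumes the alternatives are not all identical (otherwise both distances are zero).
    """
    n_crit = len(weights)
    # Vector-normalize each criterion column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    cols = list(zip(*v))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    worst = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)   # distance to the ideal solution
        d_neg = math.dist(row, worst)   # distance to the anti-ideal solution
        scores.append(d_neg / (d_pos + d_neg))
    return scores

# Three hypothetical classifiers scored on two benefit criteria (e.g. accuracy, AUC).
scores = topsis([[0.9, 0.8], [0.7, 0.6], [0.8, 0.7]], [0.5, 0.5], [True, True])
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

A classifier that is best on every criterion coincides with the ideal solution and gets closeness 1. Because the measures often disagree in practice, the paper compares the rankings from several MCDM methods rather than relying on one.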
ISBN: 9781509036462 (print)
An epilepsy classification system using electrocardiogram (ECG) data will ease the process of diagnosis. In epileptic patients, seizures affect Heart Rate Variability (HRV), which emphasizes the importance of autonomic function changes in diagnosing epilepsy. The present work proposes an algorithm that classifies a person as epileptic or nonepileptic using the ECG signal. Time Domain Features (TDF) and Frequency Domain Features (FDF), derived from the R-R Intervals (RRI) of the ECG signal, are utilized. In addition, Statistical Features (SF) are derived from the extracted TDF and FDF. A Support Vector Machine (SVM) classifier is used to classify the ECG signal as epileptic or nonepileptic based on the extracted TDF, FDF, and SF. The proposed method achieves a classification accuracy of 97.5%. Analysis on clinical data shows that the proposed combination of TDF, FDF, and statistical HRV features gives excellent classification accuracy. These results indicate that the proposed method can be applied to wearable heart rate measuring devices for diagnostic purposes.
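As a rough illustration of the time-domain part of this feature extraction, the sketch below derives three widely used HRV measures from a list of R-R intervals. The abstract does not list which time-domain features the authors used, so the choice of mean RR, SDNN, and RMSSD is an assumption, and the interval values are made up.

```python
import math
import statistics

def hrv_time_domain(rr_ms):
    """Return basic time-domain HRV features from R-R intervals in milliseconds."""
    # Differences between successive R-R intervals.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return {
        "mean_rr": statistics.fmean(rr_ms),
        # SDNN: sample standard deviation of the R-R intervals.
        "sdnn": statistics.stdev(rr_ms),
        # RMSSD: root mean square of successive differences.
        "rmssd": math.sqrt(statistics.fmean([d * d for d in diffs])),
    }

# Made-up R-R intervals (ms), for illustration only.
features = hrv_time_domain([800, 810, 790, 800])
```

Feature vectors of this kind, combined with frequency-domain and statistical features, would then be passed to the SVM classifier described in the abstract.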
ISBN: 9781509009251 (print)
Early diagnosis of breast cancer makes the disease much easier to treat, so it is necessary to develop techniques that help physicians reach an accurate diagnosis. This study proposes a hybrid classification algorithm based on a Genetic Algorithm (GA) and the k-Nearest Neighbor algorithm (kNN). The GA is used for its primary purpose as an optimization technique for kNN, selecting the best features and optimizing the value of k, while kNN is used for classification. The proposed algorithm is tested on the Wisconsin breast cancer data from the UCI Repository of Machine Learning Databases using two datasets, the Wisconsin Breast Cancer Database (WBCD) and the Wisconsin Diagnosis Breast Cancer (WDBC) dataset, which differ in the number of attributes and the number of instances. The proposed algorithm was compared against different classifier algorithms on the same database. In the evaluation, the proposed algorithm achieved 99% accuracy.
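The GA-plus-kNN idea in this abstract can be sketched as below: each individual encodes a feature mask together with a value of k, and its fitness is the kNN accuracy obtained with that configuration. This is a toy version under stated assumptions. The leave-one-out fitness, truncation selection, one-point crossover, mutation rate, and the tiny synthetic dataset are all illustrative choices and are not the paper's settings or the Wisconsin data.

```python
import random
from collections import Counter

def loo_accuracy(X, y, mask, k):
    """Leave-one-out kNN accuracy using only the features selected by mask."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    correct = 0
    for i, xi in enumerate(X):
        others = [
            (sum((xi[j] - X[m][j]) ** 2 for j in cols), y[m])
            for m in range(len(X)) if m != i
        ]
        others.sort(key=lambda t: t[0])
        votes = Counter(label for _, label in others[:k])
        correct += votes.most_common(1)[0][0] == y[i]
    return correct / len(X)

def ga_select(X, y, k_choices=(1, 3, 5), pop_size=12, generations=15, seed=0):
    """Search feature masks and k jointly with a small genetic algorithm."""
    rng = random.Random(seed)
    n_feat = len(X[0])

    def fitness(ind):
        return loo_accuracy(X, y, ind[0], ind[1])

    # Seed the population with the all-features mask so the baseline is present.
    pop = [([1] * n_feat, k_choices[0])]
    while len(pop) < pop_size:
        pop.append(([rng.randint(0, 1) for _ in range(n_feat)], rng.choice(k_choices)))
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection; the best survive
        children = []
        while len(parents) + len(children) < pop_size:
            (m1, k1), (m2, k2) = rng.sample(parents, 2)
            cut = rng.randrange(1, n_feat) if n_feat > 1 else 0
            mask = m1[:cut] + m2[cut:]  # one-point crossover
            if rng.random() < 0.2:      # occasional single-bit mutation
                mask[rng.randrange(n_feat)] ^= 1
            children.append((mask, rng.choice((k1, k2))))
        pop = parents + children
    best = max(pop, key=fitness)
    return best[0], best[1], fitness(best)

# Synthetic data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 5.2], [1.1, 0.8], [1.2, 9.1]]
y = [0, 0, 0, 1, 1, 1]
mask, k, acc = ga_select(X, y)
```

Because the best individuals are carried over unchanged each generation, the returned accuracy can never fall below that of the all-features baseline placed in the initial population.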