Hydrogen production by dark fermentation, in which hydrogenase enzymes catalyze the oxidation of molecular hydrogen or its evolution from two protons (H⁺) and electrons, is an economical and environmentally friendly technology for producing clean energy. However, long-term operation of continuous anaerobic reactors for fermentative hydrogen production is frequently unstable. In this study, a kernel partial least squares (KPLS) algorithm is employed to develop an online estimator of the key process variables in a biological hydrogen production process by Enterobacter aerogenes, in minimal time and at minimal cost. The KPLS approach is potentially very efficient for predicting key quality variables of nonlinear processes because it maps the original input space into a high-dimensional feature space. The proposed kernel-based algorithm effectively captured the nonlinear relationships among the process variables and predicted the key process variable far better than conventional linear PLS and other nonlinear PLS methods.
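The abstract does not state the kernel, the number of latent components, or the measured variables, so the following is only a minimal sketch of how a KPLS soft sensor might be built. It assumes a common NIPALS-style formulation of kernel PLS on a centered Gram matrix, an RBF kernel, and hypothetical names (`kpls_fit`, `kpls_predict`, `gamma`, `n_components`) that do not come from the paper.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    # Pairwise squared distances, then the Gaussian (RBF) kernel (assumed choice).
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kpls_fit(X, y, n_components=5, gamma=1.0, n_iter=100, tol=1e-10):
    n = X.shape[0]
    Y = np.asarray(y, dtype=float).reshape(n, -1)
    y_mean = Y.mean(axis=0)
    Yc = Y - y_mean
    K = rbf_kernel(X, X, gamma)
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    Kc = J @ K @ J                           # kernel centered in feature space
    Kd, Yd = Kc.copy(), Yc.copy()            # deflated working copies
    T = np.zeros((n, n_components))          # input score vectors
    U = np.zeros((n, n_components))          # output score vectors
    for a in range(n_components):
        u = Yd[:, [0]].copy()
        for _ in range(n_iter):
            t = Kd @ u
            t /= np.linalg.norm(t)
            c = Yd.T @ t
            u_new = Yd @ c
            u_new /= np.linalg.norm(u_new)
            converged = np.linalg.norm(u_new - u) < tol
            u = u_new
            if converged:
                break
        T[:, [a]], U[:, [a]] = t, u
        P = np.eye(n) - t @ t.T
        Kd = P @ Kd @ P                      # deflate the kernel matrix
        Yd = Yd - t @ (t.T @ Yd)             # deflate the response
    # Dual regression coefficients: Y_hat = Kc @ B (+ mean of y)
    B = U @ np.linalg.solve(T.T @ Kc @ U, T.T @ Yc)
    return {"X": X, "K": K, "B": B, "y_mean": y_mean, "gamma": gamma}

def kpls_predict(model, X_new):
    K = model["K"]
    n = K.shape[0]
    Kt = rbf_kernel(X_new, model["X"], model["gamma"])
    J = np.eye(n) - np.ones((n, n)) / n
    # Center the test kernel consistently with the training kernel.
    Kt_c = (Kt - np.ones((Kt.shape[0], n)) @ K / n) @ J
    return Kt_c @ model["B"] + model["y_mean"]
```

In an online-estimation setting, `X` would hold easily measured process variables and `y` the hard-to-measure key variable; `kpls_predict` would then be called on each new measurement vector. The kernel width and component count would normally be chosen by cross-validation.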
ISBN: 9780769551449 (print)
We propose a distributed method to compute the similarity matrices (also known as kernel or Gram matrices) used in various kernel-based machine learning algorithms. Current methods for computing similarity matrices have quadratic time and space complexities, which prevents them from scaling to large data sets. To reduce these quadratic complexities, the proposed method first partitions the data into smaller subsets using various families of locality-sensitive hashing, including random projection and spectral hashing. The method then computes the similarity values among the points within each subset, yielding approximate similarity matrices. We show analytically that the time and space complexities of the proposed method are subquadratic. We implemented the proposed method using the Message Passing Interface (MPI) framework and ran it on a cluster. Our results on large-scale real data sets show that the proposed method does not significantly affect the accuracy of the computed similarity matrices while achieving substantial savings in running time and memory requirements.
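The partition-then-compute idea can be illustrated with a small single-process sketch. It is not the authors' MPI implementation: it assumes sign-of-random-projection hashing, an RBF similarity, and hypothetical names (`lsh_buckets`, `approx_similarity_blocks`, `n_bits`, `gamma`). In a distributed setting, each bucket could plausibly be assigned to a different worker.

```python
import numpy as np
from collections import defaultdict

def lsh_buckets(X, n_bits=4, seed=0):
    # Hash each point by the sign pattern of n_bits random projections.
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((X.shape[1], n_bits))
    bits = (X @ planes) > 0
    keys = bits @ (1 << np.arange(n_bits))   # pack the bits into an integer key
    buckets = defaultdict(list)
    for i, k in enumerate(keys):
        buckets[int(k)].append(i)
    return {k: np.array(v) for k, v in buckets.items()}

def approx_similarity_blocks(X, n_bits=4, gamma=1.0, seed=0):
    # Compute similarities only among points that share a bucket.
    # Cross-bucket similarities are implicitly treated as zero.
    blocks = {}
    for key, idx in lsh_buckets(X, n_bits, seed).items():
        S = X[idx]
        sq = (S**2).sum(1)[:, None] + (S**2).sum(1)[None, :] - 2 * S @ S.T
        blocks[key] = (idx, np.exp(-gamma * np.maximum(sq, 0.0)))
    return blocks
```

Each returned entry pairs the original row indices of a bucket with that bucket's dense similarity block, so the stored entries total the sum of squared bucket sizes rather than the square of the full data set size. More hash bits give smaller buckets and larger savings, at the cost of discarding more cross-bucket similarities.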