Despite high pollution levels in Indian rivers, a comprehensive study on the water quality index (WQI) remains elusive. WQI values were computed, and their classes were determined, using six water quality parameters from an available decennial dataset (n = 3595) on Indian rivers. This study aims to assess the spatial distribution of WQI values and their classes across Indian river systems while exploring the application of machine learning (ML) based models in predicting WQI classes using a reduced number of input parameters. Modeling experiments were designed on five models for predicting WQI classes: Decision Tree (DT), Random Forest (RF), Gradient Boosted Trees (GBT), Artificial Neural Network (ANN), and Support Vector Machine (SVM). Each model was trained on the input parameters and WQI classes of 2990 samples and tested on the remaining 605 samples under different framework sets. Model performance was evaluated by accuracy, weighted mean recall and precision, and F-score. Our study demonstrates that the two largest systems, Ganga and Brahmaputra, lie at the extremes of the mean WQI spectrum, reflecting the impact of contrasting population density, industrial activities, change in land-use-land-cover pattern, and agricultural use on the riverine WQI. Our modeling experiments underscore that with only three input parameters, GBT can predict WQI classes with performance metrics above 80%. With only two input parameters, GBT, RF, and ANN can all provide reliable estimates. Our study highlights that ML models can serve as decision-support tools for water resource policymakers and managers in making effective pollution control and water resource management decisions.
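The evaluation metrics named in this abstract can be illustrated with a minimal sketch. It assumes the common support-weighted definitions of precision, recall and F-score; the abstract does not state the exact formulas used, and the class labels and predictions below are hypothetical, not data from the study.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true samples per class
    n = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n

    precision = recall = f1 = 0.0
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        actual = support[label]
        p_l = tp / predicted if predicted else 0.0
        r_l = tp / actual if actual else 0.0
        f_l = 2 * p_l * r_l / (p_l + r_l) if (p_l + r_l) else 0.0
        weight = actual / n  # each class weighted by its share of true samples
        precision += weight * p_l
        recall += weight * r_l
        f1 += weight * f_l
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical WQI class labels (assumed names, for illustration only).
y_true = ["good", "good", "poor", "poor", "poor", "unfit"]
y_pred = ["good", "poor", "poor", "poor", "unfit", "unfit"]
scores = weighted_metrics(y_true, y_pred)
print(scores)
```

Under these definitions the support-weighted recall always equals the accuracy, so the weighted precision and F-score are the metrics that add information about how errors are spread across classes.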
The effects of pH (5.5, 6.5), temperature (4, 7 and 10 °C) and carbon dioxide (10, 30, 50, 70 and 90%) on the growth and/or survival of a five-strain mixture of Listeria monocytogenes were examined in brain heart infusion broth. All three variables had a major influence on the growth characteristics of the organism. As expected, both the lag time and generation time increased as the CO2 level increased, and as pH and temperature decreased. Growth over a 30-day period was observed at all parameter combinations tested, except at pH 5.5 and 4 °C in the presence of 50, 70 or 90% carbon dioxide. Two primary models, the Gompertz and Baranyi equations, were compared for their ability to describe the growth of L. monocytogenes. In general, the Gompertz model predicted both longer lag times and shorter generation times than the Baranyi model. The Baranyi model appeared to fit the overall data better than the Gompertz model, although the differences were often small. Response surface models were developed for predicting the effects and interactions of the three independent variables on the growth and/or survival of L. monocytogenes in the different modified atmospheres. The results demonstrate the importance of strict temperature control for maintaining the advantages of food shelf life extension in enriched carbon dioxide environments. The information obtained in this study could be used as a guide by manufacturers of modified-atmosphere packaged foods, especially when designing products in which this organism may be a concern.
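How lag time and generation time follow from a fitted Gompertz curve can be sketched as below. This assumes the parameterization common in predictive microbiology, L(t) = A + C·exp(−exp(−B·(t − M))) with L in log10 counts, and assumes the initial count equals the lower asymptote A; the abstract does not state which form was fitted, and the parameter values are hypothetical.

```python
import math


def gompertz_log_count(t, A, C, B, M):
    """Gompertz curve: log10 count at time t.

    A: lower asymptote, C: total increase in log10 count,
    B: relative growth rate at time M, M: time of maximum growth rate.
    """
    return A + C * math.exp(-math.exp(-B * (t - M)))


def gompertz_growth_parameters(C, B, M):
    """Derive lag time and generation time from the fitted parameters."""
    max_rate = B * C / math.e          # slope at the inflection point t = M
    lag_time = M - 1.0 / B             # tangent at t = M meets the level A here
    generation_time = math.log10(2) / max_rate  # time for one doubling
    return lag_time, generation_time


# Hypothetical fitted values (time in hours), not taken from the study.
A, C, B, M = 3.0, 6.0, 0.02, 150.0
lag, gen = gompertz_growth_parameters(C, B, M)
print(f"lag time = {lag:.1f} h, generation time = {gen:.2f} h")
```

The lag time here is the point where the tangent at the inflection point crosses the lower asymptote, which is one reason different primary models can report different lag times for the same data.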
"Lifestyle politics" suggests that political and ideological opinions are strongly connected to our consumption choices, music and food taste, cultural preferences, and other aspects of our daily lives. With growing political polarization, this idea has become all the more relevant to a wide range of social scientists. Empirical research in this domain, however, is confronted with a practical challenge: this type of detailed information on people's lifestyle is very difficult to operationalize, and extremely time-consuming and costly to query in a survey. A potentially valuable alternative data source for capturing these values and lifestyle choices is social media data. In this study, we explore the value of Facebook "like" data as a complement to traditional survey data for studying lifestyle politics. We collect a unique dataset of Facebook likes and survey data from more than 6500 participants in Belgium, a fragmented multi-party system. Based on both types of data, we infer the political and ideological preferences of our respondents. The results indicate that non-political Facebook likes are indicative of political preference and are useful for describing voters in terms of common interests, cultural preferences, and lifestyle features. This shows that social media data can be a valuable complement to traditional survey data for studying lifestyle politics.
Background: In-depth analysis of the regulation networks of genes aberrantly expressed in cancer is essential for better understanding tumors and identifying key genes that could be therapeutically targeted. Results: We developed a quantitative analysis approach to investigate the main biological relationships among different regulatory elements and target genes; we applied it to Ovarian Serous Cystadenocarcinoma and 177 target genes belonging to three main pathways (DNA REPAIR, STEM CELLS and GLUCOSE METABOLISM) relevant for this tumor. Combining data from the ENCODE and TCGA datasets, we built a predictive linear model for the regulation of each target gene, assessing the relationships between its expression, its promoter methylation, the expression of genes in the same or in the other pathways, and the expression of putative transcription factors. We confirmed the reliability and significance of our approach in a similar tumor type (basal-like breast cancer) and with a different existing algorithm (ARACNe), and we obtained experimental confirmation of potentially interesting results. Conclusions: The analysis of the proposed models disclosed the relations between a gene and its related biological processes and the interconnections between the different gene sets, and enabled the evaluation of the relevant regulatory elements at the single-gene level. This led to the identification of already known regulators and/or gene correlations and unveiled a set of still unknown biological relationships of potential interest for pharmacological and clinical use.
Modeling mercury speciation is an important requirement for estimating harmful emissions from coal-fired power plants and developing strategies to reduce them. First-principle models based on chemical, kinetic, and thermodynamic aspects exist, but these are complex and difficult to develop. Modern data-based machine learning techniques, including neural networks, have recently been introduced. Here we propose an alternative approach using abductive networks based on the group method of data handling (GMDH) algorithm, with the advantages of simplified and more automated model synthesis, automatic selection of significant inputs, and more transparent input-output model relationships. Models were developed for predicting three types of mercury speciation (elemental, oxidized, and particulate) using a small dataset containing six input parameters on the composition of the coal used and the boiler operating conditions. Prediction performance compares favourably with neural network models developed using the same dataset, with correlation coefficients as high as 0.97 for training data. Network committees (ensembles) are proposed as a means of improving prediction accuracy, and suggestions are made for future work to further improve performance. (c) 2006 Elsevier B.V. All rights reserved.
Asynchronous algorithms may increase the performance of parallel applications on large-scale HPC platforms due to decreased dependence among processing elements. This work investigates strategies for implementing asynchronous hybrid parallel MPI-OpenMP iterative solvers. Seven different implementations are considered, and results show that striking a balance between communication and computation, so that the number of iterations is balanced across processing elements, benefits performance and solution quality. A predictive performance model that uses kernel density estimation to model the underlying probability density function of the collected data is then developed to optimize execution parameters for a given problem. For the majority of iteration executions, the performance model matches the empirical data to within about 6%. The different hybrid parallel implementations are examined further to find optimal parametric distributions whose parameters can be tuned to the problem at hand. The generalized extreme value distribution was selected based on a combination of quantitative and qualitative comparisons, and for most of the iteration executions, the model matches the data to within about 6.1%. Results from the parametric distribution model are examined along with results of the model on related problems, and possible further extensions to the predictive model are discussed.
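The kernel density estimation step mentioned in this abstract can be sketched as follows. This is a generic Gaussian KDE with a normal-reference bandwidth, both of which are assumptions: the abstract does not say which kernel or bandwidth rule was used, and the iteration counts are hypothetical.

```python
import math


def normal_reference_bandwidth(samples):
    """Bandwidth h = 1.06 * sigma * n^(-1/5), a common rule of thumb."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
    return 1.06 * std * n ** (-0.2)


def gaussian_kde(samples, bandwidth):
    """Return a function estimating the probability density at x."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum of Gaussian bumps, one centred on each observation.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical per-process iteration counts, for illustration only.
samples = [98, 101, 103, 105, 110, 112, 120]
h = normal_reference_bandwidth(samples)
density = gaussian_kde(samples, h)
print(f"bandwidth = {h:.2f}, density at 105 = {density(105):.4f}")
```

A non-parametric estimate like this makes no assumption about the shape of the distribution, which is the trade-off against a parametric family such as the generalized extreme value distribution named later in the abstract.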
A predictive model is constructed for a radiative shock experiment, using a combination of a physics code and experimental measurements. The CRASH code can model the radiation hydrodynamics of the radiative shock launched by the ablation of a Be drive disk and driven down a tube filled with Xe. The code is initialized by a preprocessor that uses data from the Hyades code to model the initial 1.3 ns of the system evolution, with this data fit over seven input parameters by a Gaussian process model. The CRASH code output for shock location from 320 simulations is modeled by another Gaussian process model that combines the simulation data with eight field measurements of a CRASH experiment, and uses this joint model to construct a posterior distribution for the physical parameters of the simulation (model calibration). This model can then be used to explore sensitivity of the system to the input parameters. Comparison of the predicted shock locations in a set of leave-one-out exercises shows that the calibrated model can predict the shock location within experimental uncertainty. (C) 2011 Elsevier Ltd. All rights reserved.
The paper considers the problem of handling short sets of medical data. Effectively solving this problem will make it possible to solve numerous classification and regression tasks when data are limited in health decision support systems. Many similar tasks arise in various fields of medicine. The authors improved the regression method of data analysis based on artificial neural networks by introducing additional elements into the formula for calculating the output signal of the existing RBF-based input-doubling method. This improvement provides averaging of the result, which is typical for ensemble methods, and allows compensating for the errors of different signs in the predicted values. These two advantages make it possible to significantly increase the accuracy of methods of this class. It should be noted that the duration of the training algorithm of the improved method remains the same as for the existing method. Experimental modeling was performed using a real short medical dataset. The regression task in rheumatology was solved based on only 77 observations. The optimal parameters of the method, which provide the highest prediction accuracy based on MAE and RMSE, were selected experimentally. Its efficiency was compared with that of other methods of this class, and the proposed RBF-based additive input-doubling method was found to be the most accurate among those considered. The method can be modified by using other nonlinear artificial intelligence tools to implement its training and application algorithms, and such methods can be applied in various fields of medicine.
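The two error metrics named in this abstract, and the general idea that averaging predictions lets errors of opposite sign cancel, can be illustrated with a small sketch. It is not the input-doubling method itself, whose formulas the abstract does not give; the target values and the two prediction sets are hypothetical.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


# Hypothetical targets and two predictors whose errors have opposite signs.
y_true = [10.0, 20.0, 30.0]
pred_a = [10.5, 19.0, 31.0]
pred_b = [9.5, 21.5, 29.5]

# Averaging the two predictions lets opposite-sign errors partly cancel.
pred_avg = [(a + b) / 2 for a, b in zip(pred_a, pred_b)]

for name, pred in [("A", pred_a), ("B", pred_b), ("average", pred_avg)]:
    print(f"{name}: MAE = {mae(y_true, pred):.3f}, "
          f"RMSE = {rmse(y_true, pred):.3f}")
```

In this toy case the averaged prediction has a lower MAE and RMSE than either input, which is the ensemble-style effect the abstract attributes to the added averaging term.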
The biopharmaceutical industry continually seeks advancements in the commercial manufacturing of therapeutic proteins, where mammalian cell culture plays a pivotal role. The current work presents a novel data-driven predictive modeling application designed to enhance the efficiency and predictability of cell culture processes in biotherapeutic production. The capability of the cloud-based digital data science application, developed using open-source tools, is demonstrated by predicting bioreactor potency from at-line process parameters over a 5-day horizon. The uncertainty in the model's predictions is quantified, providing valuable insights for process control and decision-making. Model validation on unseen data confirms the model's robust generalizability. An interactive dashboard, tailored to process scientists' requirements, is also developed to streamline biopharmaceutical manufacturing processes, ultimately leading to enhanced productivity and product quality.
A predictive creep model is developed which uses the properties of the matrix and reinforcement to predict the creep of polymer/layered silicate nanocomposites. Up to this point, primarily empirical creep models such as the Findley and Burgers models have been used for the creep of polymer/clay nanocomposites. The proposed creep model is based on the elastic-viscoelastic correspondence principle and a stiffness model of these nanocomposites. In addition, the added stiffness of the polymeric matrix due to the constraining effect of layered silicates on polymer chains in the nanocomposite is accounted for by a parameter termed the constraint factor. The results of the proposed model show good agreement with experimental creep data for different clay contents, stresses and temperatures. Comparing the model predictions with experimental data reveals a logical relationship between the method of processing and the constraint factor, which shows that in-situ polymerization can be more efficient than melt processing for improving the creep resistance of polymer/layered silicate nanocomposites. (C) 2011 Elsevier Ltd. All rights reserved.
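For context, the two empirical models named in this abstract can be sketched in their standard textbook forms. This is not the proposed correspondence-principle model, whose equations the abstract does not give; all parameter names and values below are hypothetical.

```python
import math


def findley_strain(t, eps0, A, n):
    """Findley power law: strain = eps0 + A * t**n."""
    return eps0 + A * t ** n


def burgers_strain(t, stress, E_M, eta_M, E_K, eta_K):
    """Burgers model: a Maxwell element in series with a Kelvin-Voigt element.

    E_M, eta_M: Maxwell spring modulus and dashpot viscosity.
    E_K, eta_K: Kelvin-Voigt spring modulus and dashpot viscosity.
    """
    instantaneous = stress / E_M                        # elastic jump at t = 0
    delayed = (stress / E_K) * (1.0 - math.exp(-E_K * t / eta_K))
    viscous = stress * t / eta_M                        # unrecoverable flow
    return instantaneous + delayed + viscous


# Hypothetical values (stress and moduli in MPa, viscosities in MPa*h).
stress, E_M, eta_M, E_K, eta_K = 10.0, 2000.0, 1.0e6, 5000.0, 1.0e4
for t in (0.0, 1.0, 10.0, 100.0):
    print(f"t = {t:6.1f} h, Burgers strain = "
          f"{burgers_strain(t, stress, E_M, eta_M, E_K, eta_K):.5f}")
```

Both forms are fitted to measured creep curves, so their parameters carry no direct information about the matrix or reinforcement, which is the limitation a property-based predictive model is meant to address.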