Residual metal removal is often an inefficient, resource-heavy process that involves multiple washing procedures. In the synthesis of Active Pharmaceutical Ingredients (APIs), transition-metal catalysis enables efficient synthesis of more complex structures; however, a large amount of experimentation is frequently required to control residual transition-metal levels in the final products. We hypothesized that leveraging automation tools would allow generation of a data-driven, universal model for the prediction of palladium removal from typical workstreams. Novel automation methods for the preparation and generation of data were key to quickly understanding which parameters were the most important for the efficient chelation and removal of palladium. Parameters investigated included removal of both Pd(0) and Pd(II) as a function of temperature, organic solvent, pH, chelator identity, and equivalents of the chelator. The model indicates that chelator identity is the most important parameter, followed by pH. This model, made available publicly, enables the prediction of the most successful conditions for palladium removal from organic reaction mixtures, reducing trial-and-error experimentation in the drug development process. Adoption of this methodology, leveraging automation and analytics, will allow for shortened development timelines when investigating cost-effective aqueous metal scavengers.
A recent thrust in turbulence closure modeling research is to incorporate machine learning (ML) elements, such as neural networks, for the purpose of enhancing the predictive capability to cover a broader class of flows. For generalizability to unseen flows, we submit that the data-driven ML approaches must preserve certain fundamental physical principles and closure tenets incumbent in physics-based (PB) models. We propose and investigate three elements to ensure the physical underpinnings of ML turbulence closures: (i) characteristic physical features and constraints that all (PB and ML) closure models must strive to satisfy; (ii) an ML training scheme that infuses and preserves selected PB constraints; and (iii) a physics-guided formulation of the ML loss (objective) function to optimize model predictions. Current ML training and implementation strategies that can potentially cause significant physical incompatibilities and internal inconsistencies are identified. Means of mitigating inconsistencies and improving compatibility between different physical elements of the modeled system are developed. First, key closure constraints dictated by the model system dynamics are derived. Then a closed-loop training procedure for enforcing the constraints in a self-consistent manner is proposed. Finally, the simple test case of turbulent channel flow is used to highlight the deficiencies in current ML methods and demonstrate improvements stemming from the proposed mitigation measures. In summary, this work addresses the need for physics-dictated guidance in the development of ML-enhanced turbulence closure models.
We propose a data-driven forward stochastic reachability analysis algorithm for a system with unknown dynamics. In this letter, we assume that only a limited amount of trajectory data is available and that one cannot obtain additional data from the target system. The proposed algorithm learns the evolution of the state probability density function (pdf) as a Gaussian mixture model (GMM) from the given trajectory data and computes the pdf of the future state at a desired future time instance. We leverage the bootstrapping algorithm to account for the parameter estimation error of the GMM by computing the confidence interval of the estimated parameters. Then, the bootstrapped GMM is synthesized by selecting the optimal parameters within the confidence interval that yield the most informative model, thereby providing more reliable prediction results. The proposed algorithm is demonstrated via both numerical simulations and human subject experiments.
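The bootstrapping step mentioned in the abstract above can be illustrated in a much simpler setting. The following is a minimal sketch, assuming a single Gaussian parameter (the mean) rather than a full GMM; it is a generic percentile bootstrap, not the authors' algorithm, and the function name `bootstrap_mean_ci` and the synthetic data are hypothetical.

```python
import random
import statistics


def bootstrap_mean_ci(samples, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the sample mean.

    Illustrative only: a GMM would need the same resampling loop applied
    to every fitted weight, mean, and covariance.
    """
    rng = random.Random(seed)
    n = len(samples)
    means = []
    for _ in range(n_resamples):
        # Resample the data with replacement and re-estimate the parameter.
        resample = [samples[rng.randrange(n)] for _ in range(n)]
        means.append(statistics.fmean(resample))
    means.sort()
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


# Hypothetical data: 50 noisy observations of a scalar state.
data_rng = random.Random(1)
data = [data_rng.gauss(2.0, 1.0) for _ in range(50)]
low, high = bootstrap_mean_ci(data)
print(f"95% bootstrap CI for the mean: [{low:.3f}, {high:.3f}]")
```

The width of the interval reflects how much the estimate would vary under resampling of the limited data, which is the estimation error the abstract says the bootstrapped model accounts for.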
Data-driven fault detection and classification (FDC) systems play an important role in ensuring the stability and security of modern industry. However, the security of data-driven FDC itself poses new challenges: model predictions can be seriously damaged by maliciously manipulated, imperceptible perturbations, known as adversarial attacks. Since adversarial attacks may threaten the FDC models and even entire safety-critical industrial systems, there is an urgent need to guarantee the security of data-driven models. In this article, a scheme is presented for formally and completely verifying the security properties by treating the target problem as a convex mathematical programming problem. The major methodological contribution is verification under multiple norms via multiobjective optimization with a novel Pareto front approximation algorithm. Moreover, this work studies security verification for both supervised (fault classification) and unsupervised (fault detection) models under all mainstream norms simultaneously. For four basic data-driven models on two industrial datasets, our exclusive verification scheme provides deep and novel security insight into FDC systems. We also compare with related works to validate the performance of the verification and Pareto front approximation algorithms.
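To make the notion of a Pareto front concrete, here is a minimal sketch of a brute-force non-dominated filter. It assumes every objective is to be minimized and uses made-up two-objective candidates; it is a textbook filter, not the Pareto front approximation algorithm proposed in the article.

```python
def pareto_front(points):
    """Return the non-dominated points when every objective is minimized.

    A point q dominates p if q is no worse in every objective and
    differs from p (hence strictly better in at least one objective).
    """
    front = []
    for p in points:
        dominated = any(
            q != p and all(q_i <= p_i for q_i, p_i in zip(q, p))
            for q in points
        )
        if not dominated:
            front.append(p)
    return front


# Hypothetical candidates, each scored on two objectives.
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.5, 3.5)]
front = pareto_front(candidates)
print(front)
```

This filter is quadratic in the number of candidates, which is acceptable for small sets; approximating a front over a continuous feasible region requires a dedicated algorithm.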
The use of computationally demanding knowledge-driven models to optimize a process might encounter substantial numerical challenges. Because a model is an abstraction and approximation of the process, calculating the exact model optimum might not be necessary, since its industrial implementation is bound to be an approximate one. Here we explore an alternative optimization route through a surrogate model. Because one of the decision variables affecting the optimization is time-varying, the Design of Dynamic Experiments is used to estimate the surrogate model. The process considered here is a freeze-drying process widely used in the pharmaceutical industry. The model used is a stochastic model describing the process in great detail. It is shown that the proposed data-driven route calculates the optimum in about 8 h, as opposed to 22 h for the knowledge-driven model, while sacrificing less than 15% in the computed value of the process performance.
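The trade-off described above, a cheaper optimum at the cost of a small loss in performance, can be sketched with a one-dimensional toy problem. The sketch assumes a hypothetical `expensive_model` standing in for the detailed simulation and a quadratic surrogate fitted through three design points; it does not reproduce the Design of Dynamic Experiments or the freeze-drying model.

```python
def expensive_model(x):
    """Hypothetical stand-in for a costly simulation (lower is better)."""
    return (x - 2.0) ** 2 + 0.05 * x ** 3


def fit_quadratic(xs, ys):
    """Coefficients (a, b, c) of the parabola a*x^2 + b*x + c through three points."""
    (x0, x1, x2), (y0, y1, y2) = xs, ys
    d1 = (y1 - y0) / (x1 - x0)   # first divided differences
    d2 = (y2 - y1) / (x2 - x1)
    a = (d2 - d1) / (x2 - x0)    # second divided difference
    b = d1 - a * (x0 + x1)
    c = y0 - a * x0 ** 2 - b * x0
    return a, b, c


# Three "experiments" on the expensive model define the surrogate.
design = (0.0, 2.0, 4.0)
responses = tuple(expensive_model(x) for x in design)
a, b, c = fit_quadratic(design, responses)

# Minimize the surrogate analytically (valid because a > 0 here),
# then clip to the design range.
x_surrogate = min(max(-b / (2 * a), design[0]), design[-1])

# Reference optimum from a dense grid on the expensive model itself.
grid = [i / 1000 for i in range(4001)]
f_reference = min(expensive_model(x) for x in grid)
gap = (expensive_model(x_surrogate) - f_reference) / f_reference
print(f"surrogate optimum x = {x_surrogate:.3f}, relative loss = {gap:.1%}")
```

The surrogate route needs three model evaluations instead of thousands, and the relative loss quantifies what is given up by optimizing the approximation rather than the full model.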
We propose a data-driven framework to increase the computational efficiency of the explicit finite element method in the structural analysis of soft tissue. An encoder-decoder long short-term memory deep neural network is trained based on the data produced by an explicit, distributed finite element solver. We leverage this network to predict synchronized displacements at shared nodes, minimizing the amount of communication between processors. We perform extensive numerical experiments to quantify the accuracy and stability of the proposed synchronization-avoiding algorithm.
In the absence of knowledge about challenging dynamic phenomena involved in batch distillation processes, e.g., complex flow regimes or appearing and vanishing phases, generation of accurate mechanistic models is limited. Real plant data containing this missing information is scarce, also limiting the use of data-driven models. To exploit the information contained in measurement data and a related but inaccurate first-principles model, transfer learning from simulated to real plant data is analyzed. For the use case of a batch distillation column, the adapted model provides more accurate predictions than a data-driven model trained exclusively on scarce real plant data or simulated data. Its enhanced convergence and lower computational cost make it suitable for optimization in real-time.
Data-driven approaches are an effective solution for modeling problems in machining. To increase the service life of hard-turned components, it is important to quantify the correlation between the cutting parameters, such as feed rate, cutting speed, and depth of cut, and the near-surface properties. For obtaining high-quality models with small data sets, different data-driven approaches are investigated in this contribution. Additionally, models that enable uncertainty quantification are crucial for effective decision-making and the adjustment of cutting parameters. Therefore, parametric multiple polynomial regression and Takagi-Sugeno models, as well as non-parametric Gaussian process regression as a Bayesian approach, are considered and compared regarding their capability to predict residual stress and surface roughness values of 51CrV4 specimens after hard-turning. Moreover, a novel method based on optimization of data-driven non-linear models is proposed that allows for identification of cutting parameter combinations that simultaneously lead to satisfactory surface roughness and residual stress states.
The water retention behavior, a critical factor of unsaturated flow in porous media, can be strongly affected by deformation in the solid matrix. However, it remains challenging to model the water retention behavior with explicit consideration of its dependence on deformation. Here, we propose a data-driven approach that can automatically discover an interpretable model describing the water retention behavior of a deformable porous material, which can be as accurate as non-interpretable models obtained by other data-driven approaches. Specifically, we present a divide-and-conquer approach for discovering a mathematical expression that best fits a neural network trained with the data collected from a series of image-based drainage simulations at the pore scale. We validate the predictive capability of the symbolically regressed counterpart of the trained neural network against unseen pore-scale simulations. Further, by incorporating the discovered symbolic function into a continuum-scale simulation, we showcase the inherent portability of the proposed approach: the discovered water retention model can provide results comparable to those from a hierarchical multi-scale model, while bypassing the need for sub-scale simulations at individual material points.
The present contribution is a follow-up to a recently conducted study that derived a data-driven model for the breakage of agglomerates by wall impacts. This time, the collision-induced breakage of agglomerates and concurrently occurring particle agglomeration processes are considered in order to derive a model for Euler-Lagrange methods, in which agglomerates are represented by effective spheres. Although the physical problem is more challenging due to an increased number of influencing parameters, the strategy followed is very similar. In a first step, extensive discrete element simulations are carried out to study a variety of binary inter-agglomerate collision scenarios. These include different collision angles, collision velocities, agglomerate sizes, and powders. The resulting extensive database accounts for back-bouncing, agglomeration, and breakage events. Subsequently, the collision database is used for training artificial neural networks to predict the post-collision number of arising entities, their size distributions, and their velocities. Finally, it is shown how the arising data-driven model can be incorporated into the Euler-Lagrange framework to be used in future studies for efficient computations of flows with high mass loadings. © 2023 The Author(s). Published by Elsevier Ltd on behalf of Institution of Chemical Engineers. This is an open access article under the CC BY license (http://creative