In various industries, optimizing manufacturing parameters is vital for the efficient production of high-quality products. Traditional methods involve costly production trials and process tuning, particularly when dealing with complex processes and materials such as composites. High-fidelity simulations offer a cost-effective alternative, but they can be computationally intensive, which often makes them impractical for iterative optimization. Surrogate model-based optimization (SuMO) addresses this by using efficient, data-driven approximations. However, existing approaches often overlook valuable domain knowledge, such as material behavior, spatial relationships, and the optimization objective. We investigate different types of knowledge that vary in complexity, difficulty of incorporation, and transferability to other domains. In numerical studies on composite manufacturing (specifically, textile draping), we demonstrate that integrating such domain knowledge improves prediction accuracy, reduces the number of optimization iterations, and enhances overall outcomes.
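The basic surrogate-based loop described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: the toy objective `expensive_simulation`, the quadratic least-squares surrogate, and the one-dimensional candidate grid are all hypothetical stand-ins, and the sketch does not include the domain-knowledge integration that the abstract studies.

```python
def expensive_simulation(x):
    """Hypothetical stand-in for a costly high-fidelity simulation."""
    return (x - 0.3) ** 2 + 1.0


def fit_quadratic(xs, ys):
    """Least-squares fit of y ~ a + b*x + c*x^2 via the normal equations."""
    feats = [(1.0, x, x * x) for x in xs]
    # Augmented 3x4 system [A | r] with A = F^T F and r = F^T y.
    m = [
        [sum(f[i] * f[j] for f in feats) for j in range(3)]
        + [sum(f[i] * y for f, y in zip(feats, ys))]
        for i in range(3)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(col + 1, 3):
            factor = m[k][col] / m[col][col]
            for j in range(col, 4):
                m[k][j] -= factor * m[col][j]
    # Back substitution.
    w = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        w[i] = (m[i][3] - sum(m[i][j] * w[j] for j in range(i + 1, 3))) / m[i][i]
    return w


def surrogate_optimize(n_iterations=5):
    """Alternate between fitting the cheap surrogate and querying the simulation."""
    xs = [0.0, 0.5, 1.0]  # initial design (assumed)
    ys = [expensive_simulation(x) for x in xs]
    grid = [i / 100 for i in range(101)]  # candidate parameters in [0, 1]
    for _ in range(n_iterations):
        a, b, c = fit_quadratic(xs, ys)
        # Minimize the surrogate, then spend one expensive evaluation there.
        candidate = min(grid, key=lambda x: a + b * x + c * x * x)
        xs.append(candidate)
        ys.append(expensive_simulation(candidate))
    best = min(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best]


best_x, best_y = surrogate_optimize()
```

Each iteration costs one call to the expensive function, while the search itself runs only on the surrogate; this is the source of the savings that such methods aim for.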
ISBN: (print) 9798331314385
Kernel techniques are among the most influential approaches in data science and statistics. Under mild conditions, the reproducing kernel Hilbert space associated to a kernel is capable of encoding the independence of M ≥ 2 random variables. Probably the most widespread independence measure relying on kernels is the so-called Hilbert-Schmidt independence criterion (HSIC; also referred to as distance covariance in the statistics literature). Despite the various HSIC estimators designed since its introduction close to two decades ago, the fundamental question of the rate at which HSIC can be estimated is still open. In this work, we prove that the minimax optimal rate of HSIC estimation on ℝ^d for Borel measures containing the Gaussians with continuous bounded translation-invariant characteristic kernels is O(n^{-1/2}). Specifically, our result implies the optimality in the minimax sense of many of the most frequently used estimators (including the U-statistic, the V-statistic, and the Nyström-based one) on ℝ^d.
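For readers unfamiliar with the V-statistic estimator mentioned above, a minimal sketch for two scalar variables follows. It assumes the standard form (1/n²)·trace(K H L H) with centering matrix H = I - (1/n)11ᵀ and a Gaussian kernel with a hypothetical bandwidth of 1; the function names are illustrative, and the code is restricted to d = 1 and M = 2 for brevity.

```python
import math
import random


def gaussian_gram(xs, bandwidth=1.0):
    """Gram matrix of k(a, b) = exp(-(a - b)^2 / (2 * bandwidth^2))."""
    return [
        [math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2)) for b in xs]
        for a in xs
    ]


def center(gram):
    """Double-center a symmetric Gram matrix, i.e. compute H K H."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    total_mean = sum(row_means) / n
    return [
        [gram[i][j] - row_means[i] - row_means[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def hsic_v_statistic(xs, ys, bandwidth=1.0):
    """V-statistic HSIC estimate: (1/n^2) * sum_ij (HKH)_ij * (HLH)_ij."""
    n = len(xs)
    k_centered = center(gaussian_gram(xs, bandwidth))
    l_centered = center(gaussian_gram(ys, bandwidth))
    total = sum(
        k_centered[i][j] * l_centered[i][j] for i in range(n) for j in range(n)
    )
    return total / n ** 2


# Usage: a dependent pair should score higher than an independent pair.
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(100)]
unrelated = [random.gauss(0.0, 1.0) for _ in range(100)]
hsic_dependent = hsic_v_statistic(sample, sample)
hsic_independent = hsic_v_statistic(sample, unrelated)
```

This direct implementation costs O(n²) time and memory; the Nyström-based estimator named in the abstract is one way to reduce that cost, and it is not shown here.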
Kernel techniques are among the most popular and powerful approaches of data science. Among the key features that make kernels ubiquitous are (i) the number of domains they have been designed for, (ii) the Hilbert str...
We consider nonstationary multi-armed bandit problems where the model parameters of the arms change over time. We introduce the adaptive resetting bandit (ADR-bandit), a class of bandit algorithms that leverages adaptive windowing techniques from the literature on data streams. We first provide new guarantees on the quality of estimators resulting from adaptive windowing techniques, which are of independent interest. Furthermore, we conduct a finite-time analysis of ADR-bandit in two typical environments: an abrupt environment where changes occur instantaneously and a gradual environment where changes occur progressively. We demonstrate that ADR-bandit has nearly optimal performance when abrupt or gradual changes occur in a coordinated manner that we call global changes, and that forced exploration is unnecessary under such global changes. Unlike existing nonstationary bandit algorithms, ADR-bandit has optimal performance in stationary environments as well as in nonstationary environments with global changes. Our experiments show that the proposed algorithms outperform existing approaches in synthetic and real-world environments.
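The adaptive windowing idea that the abstract builds on can be illustrated with a simplified estimator. This is a sketch in the spirit of ADWIN-style change detection, not the ADR-bandit algorithm itself: the class name `AdaptiveWindow`, the Hoeffding-style threshold, and the default `delta` are assumptions made for illustration, and rewards are assumed to lie in [0, 1].

```python
import math


class AdaptiveWindow:
    """Keeps recent observations and drops old ones when the mean appears to shift."""

    def __init__(self, delta=0.01):
        self.delta = delta  # confidence parameter (assumed value)
        self.window = []

    def add(self, value):
        """Append a value in [0, 1]; return True if old data was dropped."""
        self.window.append(value)
        changed = False
        shrunk = True
        while shrunk and len(self.window) >= 2:
            shrunk = False
            n = len(self.window)
            total = sum(self.window)
            head_sum = 0.0
            # Try every split into an older head and a newer tail.
            for split in range(1, n):
                head_sum += self.window[split - 1]
                n0, n1 = split, n - split
                mean0 = head_sum / n0
                mean1 = (total - head_sum) / n1
                m = 1.0 / (1.0 / n0 + 1.0 / n1)
                # Hoeffding-style cut threshold (an assumption of this sketch).
                eps = math.sqrt(math.log(4.0 * n / self.delta) / (2.0 * m))
                if abs(mean0 - mean1) > eps:
                    self.window = self.window[split:]  # discard the older head
                    changed = shrunk = True
                    break
        return changed

    def mean(self):
        """Current estimate: the average over the retained window."""
        return sum(self.window) / len(self.window)


# Usage: the estimate follows an abrupt shift from 0.1 to 0.9.
detector = AdaptiveWindow()
detections = [detector.add(v) for v in [0.1] * 200 + [0.9] * 200]
```

In a stationary stream the window keeps growing, so the estimate uses all available data; after a shift, the stale prefix is discarded. A bandit algorithm could maintain one such window per arm, though how ADR-bandit combines and resets them is specified in the paper, not here.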
Even though intelligent systems such as Siri or Google Assistant are enjoyable (and useful) dialog partners, users can only access predefined functionality. Enabling end-users to extend the functionality of intelligen...
Traceability information is a fundamental prerequisite for many essential software maintenance and evolution tasks, such as change impact and software reusability analyses. However, manually generating traceability information is costly and error-prone. Therefore, researchers have developed automated approaches that use textual similarities between artifacts to establish trace links. These approaches tend to achieve low precision at reasonable recall levels, as they cannot bridge the semantic gap between high-level natural language requirements and code. We propose to overcome this limitation by leveraging fine-grained (method- and sentence-level) similarities between the artifacts for traceability link recovery. Our approach uses word embeddings and a Word Mover's Distance-based similarity to bridge the semantic gap. The fine-grained similarities are aggregated according to the artifacts' structure and participate in a majority vote to retrieve coarse-grained (requirement-to-class) trace links. In a comprehensive empirical evaluation, we show that our approach outperforms state-of-the-art unsupervised traceability link recovery approaches. Additionally, we illustrate the benefits of fine-grained structural analyses for word embedding-based trace link generation.
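One way the aggregation step could look is sketched below. The abstract does not give the exact voting rule, so everything here is an assumption: the function `vote_trace_links`, the similarity threshold, and the rule that each requirement sentence votes for the class owning its most similar method are hypothetical. The similarities are taken as given inputs; computing them with word embeddings and Word Mover's Distance is not shown.

```python
from collections import Counter


def vote_trace_links(sentence_method_sims, method_to_class, threshold=0.5):
    """Aggregate sentence-to-method similarities into requirement-to-class links.

    sentence_method_sims: {sentence_id: {method_id: similarity in [0, 1]}}
    method_to_class:      {method_id: class_name}
    Returns the class or classes that receive the most sentence votes.
    """
    votes = Counter()
    for sims in sentence_method_sims.values():
        # Keep only methods that are similar enough to this sentence.
        candidates = {m: s for m, s in sims.items() if s >= threshold}
        if not candidates:
            continue  # this sentence abstains
        best_method = max(candidates, key=candidates.get)
        votes[method_to_class[best_method]] += 1
    if not votes:
        return []
    top = max(votes.values())
    return sorted(cls for cls, count in votes.items() if count == top)


# Usage with made-up similarities for one requirement of three sentences.
example_sims = {
    "s1": {"Account.deposit": 0.9, "Logger.write": 0.2},
    "s2": {"Account.withdraw": 0.7, "Logger.write": 0.6},
    "s3": {"Logger.write": 0.8, "Account.deposit": 0.1},
}
example_owner = {
    "Account.deposit": "Account",
    "Account.withdraw": "Account",
    "Logger.write": "Logger",
}
links = vote_trace_links(example_sims, example_owner)
```

Voting at the sentence level means a single strongly matching sentence cannot dominate the result, which is one plausible motivation for aggregating fine-grained similarities before deciding on a coarse-grained link.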
A growing number of machine learning (ML) projects in manufacturing require the collaboration of various experts. In addition to data scientists, stakeholders with production engineering knowledge have to specify and prioritize individual project tasks. Data engineers prepare input data, while machine learning operations (MLOps) engineers ensure that trained models are deployed and monitored within IT landscapes. Existing project management approaches, e.g., Scrum, fall short for ML projects, as they do not consider the various expert roles or ML project stages. We propose a project management approach that defines a Kanban workflow by readjusting the stages of ML development lifecycles, e.g., CRISP-DM. This makes it possible to map expert roles to stages of the Kanban workflow. An adapted Kanban board allows visualizing and reviewing the status of all project tasks. We validate our approach with specific use cases, showing that it facilitates ML project management in manufacturing.