With the recent rapid progress in the study of deep generative models (DGMs), there is a need for a framework that can implement them in a simple and generic way. In this research, we focus on two features of DGMs: (1) deep neural networks are encapsulated by probability distributions, and (2) models are designed and learned based on an objective function. Taking these features into account, we propose Pixyz, a new Python library for implementing DGMs. The library adopts a step-by-step implementation method with three APIs, which allows various DGMs to be implemented more concisely and intuitively. In addition, the library introduces memoization, which avoids duplicate computations in DGMs and so speeds up training. We demonstrate experimentally that this library is faster than existing probabilistic programming languages in training DGMs.
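The memoization idea mentioned above can be illustrated with a minimal, generic sketch. It does not use Pixyz's API, and the abstract does not describe how Pixyz implements its cache; the functions `encode`, `reconstruction_term`, `regularization_term`, and `objective` are hypothetical stand-ins. It only shows how caching a shared intermediate result avoids recomputing it when two terms of an objective need the same value.

```python
from functools import lru_cache

call_count = 0  # counts how often the "expensive" function really runs


@lru_cache(maxsize=None)
def encode(x: float) -> float:
    """Hypothetical stand-in for an expensive network forward pass."""
    global call_count
    call_count += 1
    return 2.0 * x + 1.0


def reconstruction_term(x: float) -> float:
    # First term of the toy objective; needs encode(x).
    return (encode(x) - x) ** 2


def regularization_term(x: float) -> float:
    # Second term; needs the same encode(x) again.
    return 0.5 * encode(x) ** 2


def objective(x: float) -> float:
    # encode(x) is requested twice, but the cache ensures it runs once.
    return reconstruction_term(x) + regularization_term(x)


loss = objective(3.0)
print(loss, call_count)
```

Python's `functools.lru_cache` requires hashable arguments, so a real tensor-based library would need a different cache key; that design detail is outside what the abstract states.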
Researchers have long been concerned with the extrapolation capabilities of machine learning (ML) models, particularly when training data are insufficient. The recently proposed solution-guided machine learning (SGML) method addresses this issue by integrating existing solutions as additional features to supplement limited training data. We have applied this method to address the strong nonlinearity in nanoindentation and to obtain an approximate solution for the tangential entropic force in an asymmetrical two-dimensional bilayer. To make this method more accessible, we developed a user-friendly Python library called SGML, available on GitHub and PyPI. This paper introduces the architecture and functionality of the library, provides a usage example, and discusses its potential impact and applications.
This paper documents datawindow, a Python package that aims to simplify data processing and analysis. Datawindow provides eight methods that allow users to quickly and easily ...
The exploratory spatial data analysis (ESDA) process refers to the use of various functions to gain an initial understanding of a spatial dataset. These include measures of spatial heterogeneity and spatial autocorrelation. Currently, the ESDA process is repetitive and time-consuming. Although different datasets yield different results, the way these results are generated does not change significantly. Results are also generated individually for each variable, which means that they cannot be easily compared or shared. Automating the ESDA process would therefore have multiple benefits: it would save time, and it would allow the data analyst to keep up with the rapid rate at which data are generated. This paper introduces the first iteration of autoESDA, a Python library capable of automating the ESDA process by summarising the results into a single report. We present the high-level requirements defined for the implementation of autoESDA, discuss its dependency libraries, and give a high-level overview of its workflow. The library is then evaluated against the requirements laid out earlier in the study. Semi-structured interviews were carried out, which yielded a wealth of feedback and suggestions from the participants on how the output report could be improved. Finally, a roadmap of proposed further developments and improvements is discussed. The first version demonstrates that the automation of ESDA is possible and lays the foundation for further development in this regard. This is an important contribution to understanding spatial data, as it enables the data analyst to keep up with the volume of data generated on a daily basis.
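As an illustration of one measure of spatial autocorrelation of the kind the abstract mentions, the sketch below computes global Moran's I in plain Python. It is not autoESDA code and the abstract does not say which statistics the library reports; the function name `morans_i` and the four-cell example with neighbour weights are assumptions chosen for illustration.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` given a square spatial weight matrix.

    I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2
    where W is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    squared = sum(d * d for d in dev)
    return (n / total_weight) * (cross / squared)


# Four cells in a row; each cell neighbours the cells directly beside it.
line_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

smooth = morans_i([1, 2, 3, 4], line_weights)       # similar neighbours
alternating = morans_i([1, 4, 1, 4], line_weights)  # dissimilar neighbours
print(smooth, alternating)
```

A positive value indicates that neighbouring cells tend to have similar values, and a negative value indicates that they tend to differ.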
Hyperdimensional computing (HD), also known as vector symbolic architectures (VSA), is a framework for computing with distributed representations by exploiting properties of random high-dimensional vector spaces. The commitment of the scientific community to aggregate and disseminate research in this particularly multidisciplinary area has been fundamental for its advancement. Joining these efforts, we present Torchhd, a high-performance open-source Python library for HD/VSA. Torchhd seeks to make HD/VSA more accessible and serves as an efficient foundation for further research and application development. The easy-to-use library builds on top of PyTorch and features state-of-the-art HD/VSA functionality, clear documentation, and implementation examples from well-known publications. Comparing publicly available code with its corresponding Torchhd implementation shows that experiments can run up to 100× faster. Torchhd is available at: https://***/hyperdimensional-computing/torchhd.
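To make "computing with distributed representations" concrete, here is a minimal sketch of the standard binding and bundling operations on random bipolar hypervectors. It deliberately uses only the Python standard library rather than Torchhd's API; the helper names (`random_hv`, `bind`, `bundle`, `similarity`) and the colour/shape/size record are assumptions made for illustration.

```python
import random

DIM = 10_000            # dimensionality of each hypervector
rng = random.Random(0)  # fixed seed so the run is reproducible


def random_hv():
    """Random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(DIM)]


def bind(a, b):
    """Binding: elementwise multiplication (its own inverse for bipolar vectors)."""
    return [x * y for x, y in zip(a, b)]


def bundle(*hvs):
    """Bundling: elementwise majority vote (ties, if any, resolve to +1)."""
    return [1 if sum(col) >= 0 else -1 for col in zip(*hvs)]


def similarity(a, b):
    """Normalised dot product, in [-1, 1]."""
    return sum(x * y for x, y in zip(a, b)) / DIM


# Role and filler vectors for a tiny record {color: red, shape: circle, size: large}.
color, shape, size = random_hv(), random_hv(), random_hv()
red, circle, large = random_hv(), random_hv(), random_hv()

record = bundle(bind(color, red), bind(shape, circle), bind(size, large))

# Unbinding with the role vector yields a noisy copy of the bound filler.
recovered = bind(record, color)
sim_red = similarity(recovered, red)
sim_circle = similarity(recovered, circle)
print(sim_red, sim_circle)
```

Because random high-dimensional vectors are nearly orthogonal, the recovered vector is clearly similar to `red` and close to orthogonal to the unrelated `circle`.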
Many researchers have used fuzzy set theory and fuzzy logic in a variety of applications related to computer science and engineering, given the capability of fuzzy inference systems to deal with uncertainty, represent vague concepts, and connect human language to numerical data. In this work we propose Simpful, a general-purpose and user-friendly Python library designed to facilitate the definition, analysis, and interpretation of fuzzy inference systems. Simpful provides a lightweight Application Programming Interface that allows users to intuitively define fuzzy sets and fuzzy rules, and to perform fuzzy inference. Notably, fuzzy rules in Simpful are specified as strings of text written in natural language. We provide some practical examples to show that Simpful represents a valuable addition to the open-source software that supports fuzzy reasoning. (C) 2020 The Authors. Published by Atlantis Press B.V.
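The following sketch shows what a small fuzzy inference system computes, using zero-order Sugeno inference with triangular membership functions. It is written in plain Python and is not Simpful's API; in particular, rules here are plain tuples rather than the natural-language strings Simpful accepts. The tipping scenario, set boundaries, and all names are assumptions for illustration.

```python
def triangle(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Fuzzy sets over a service-quality score in [0, 10].
SERVICE_SETS = {
    "poor": (-5.0, 0.0, 5.0),
    "average": (0.0, 5.0, 10.0),
    "good": (5.0, 10.0, 15.0),
}

# Rules of the form "IF service IS <set> THEN tip IS <crisp value in percent>".
RULES = [("poor", 5.0), ("average", 15.0), ("good", 25.0)]


def infer_tip(service):
    """Zero-order Sugeno inference: firing-strength-weighted average of outputs."""
    strengths = [triangle(service, *SERVICE_SETS[name]) for name, _ in RULES]
    total = sum(strengths)
    if total == 0.0:
        raise ValueError("no rule fires for this input")
    return sum(s * tip for s, (_, tip) in zip(strengths, RULES)) / total


tip_low = infer_tip(0.0)
tip_mid = infer_tip(7.5)
tip_high = infer_tip(10.0)
print(tip_low, tip_mid, tip_high)
```

At a score of 7.5 the "average" and "good" rules fire equally, so the output lies halfway between their consequents.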
ISBN: (print) 9781509066315
Tuning hyperparameters for machine learning algorithms is a tedious task, one that is typically done manually. To enable automated hyperparameter tuning, recent works have started to use techniques based on Bayesian optimization. However, significant gaps remain in existing libraries that prevent practical automated tuning for large-scale machine learning training pipelines, including a lack of abstractions, fault tolerance, and flexibility to support scheduling on any distributed computing framework. To address these challenges, we present Mango, a Python library for parallel hyperparameter tuning. Mango enables the use of any distributed scheduling framework, implements intelligent parallel search strategies, and provides rich abstractions for defining complex hyperparameter search spaces that are compatible with scikit-learn. Mango is comparable in performance to Hyperopt [1], another widely used library. Mango is available as open source [2] and is currently used in production at Arm Research to provide state-of-the-art hyperparameter tuning capabilities.
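The general shape of parallel hyperparameter search can be sketched as follows. This is not Mango's API, and it substitutes plain random search for the Bayesian optimization strategies the abstract refers to; the toy `objective`, the search space, and the function names are all hypothetical. The point is only that candidate configurations are independent and can be evaluated concurrently by any scheduler, here a thread pool from the standard library.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def objective(params):
    """Toy score to maximise (hypothetical); peaks at lr=0.1, depth=5."""
    return -((params["lr"] - 0.1) ** 2) - 0.01 * (params["depth"] - 5) ** 2


def sample(rng):
    """Draw one configuration from the search space."""
    return {"lr": rng.uniform(0.001, 1.0), "depth": rng.randint(1, 10)}


def parallel_random_search(n_trials=64, workers=4, seed=0):
    rng = random.Random(seed)
    candidates = [sample(rng) for _ in range(n_trials)]
    # Evaluate all candidates concurrently; map preserves input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(objective, candidates))
    best = max(range(n_trials), key=scores.__getitem__)
    return candidates[best], scores[best]


best_params, best_score = parallel_random_search()
print(best_params, best_score)
```

A Bayesian optimizer would replace the up-front sampling with batches proposed from a surrogate model fitted to earlier scores, while the parallel evaluation step would stay the same.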
ISBN: (print) 9781450370431
In this paper, we present the first exploratory study of deprecated Python library APIs to understand the status quo of API deprecation in the realm of Python libraries. Specifically, we aim to comprehend how deprecated library APIs are declared and documented in practice by their maintainers, and how library users react to them. By thoroughly examining six well-regarded Python libraries and 1,200 GitHub projects, we empirically observe that API deprecation is poorly handled by library contributors, which in turn makes it difficult for Python developers to resolve their usage of deprecated library APIs. This empirical evidence suggests that our community should take immediate action to appropriately handle the deprecation of Python library APIs.
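For context, one common way to declare a deprecated API in Python is to emit a `DeprecationWarning` through the standard `warnings` module and name the replacement in the message. The sketch below shows that pattern; the `deprecated` decorator and the `read_table`/`load_table` functions are hypothetical examples, not taken from the libraries studied in the paper.

```python
import functools
import warnings


def deprecated(replacement):
    """Mark a function as deprecated and point callers to `replacement`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated; use {replacement} instead.",
                category=DeprecationWarning,
                stacklevel=2,  # attribute the warning to the caller's line
            )
            return func(*args, **kwargs)
        return wrapper
    return decorator


def load_table(path):
    """Current API (hypothetical)."""
    return f"table:{path}"


@deprecated("load_table")
def read_table(path):
    """Old API kept for backward compatibility (hypothetical)."""
    return load_table(path)


# Demonstrate that calling the old API still works but emits a warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = read_table("data.csv")

print(result, [str(w.message) for w in caught])
```

Stating the replacement in both the warning message and the docstring gives users a documented migration path.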
Individual-based modelling (IBM) is a powerful tool for simulating complex biological communities. By defining a population as comprising individuals that differ from one another, IBM can simulate the assembly and organisation of complex communities under various eco-evolutionary processes at a large spatial scale, with a very large number of variables or parameters considered simultaneously. IBM disentangles a complex system into various sub-systems interacting with each other, allowing us to develop a unified library with a modular design for a wide range of complex scenarios in community assembly. In such a library, a number of parameter-controlled processes can be coded as primitive sub-systems (or sub-models). Here, we release a Python library as a framework for Metacommunity Individual-based Modelling (MetaIBM). As an open-source library, MetaIBM has several merits: (a) It can be used to simulate a wide range of ecological problems of metacommunities. The metacommunity landscape and its environmental gradients can be designed flexibly by users, who can selectively turn parameter-controlled ecological processes on or off and configure them according to their needs. (b) It adopts optimised algorithms and adapts to high-performance computing devices, so that users can explore a wide range of parameter space simultaneously within a reasonable time. (c) It can be used to simulate a group of communities with up to millions of unique individuals, which gives a plain, direct portrayal of natural communities. To guide potential users, we provide the source code of the library and a user manual. In the present article, we give four examples to demonstrate how to design and model a metacommunity using MetaIBM, simulating community assembly in an islands-mainland model under the metacommunity framework with (a) neutral assumptions, (b) niche assumptions, (c) slow evolution scenarios, and (d) rapid evolution scenarios. The examples showed that the
The increased speed and sensitivity of mass spectrometry-based proteomics have encouraged its use in biomedical research in recent years. Large-scale detection of proteins in cells, tissues, and whole organisms yields highly complex quantitative data, the analysis of which poses significant challenges. Standardized proteomic workflows are necessary to ensure automated, sharable, and reproducible proteomics analysis. Likewise, standardized data processing workflows are essential for the overall reproducibility of results. To this end, we developed PaDuA, a Python package optimized for the processing and analysis of (phospho)proteomics data. PaDuA provides a collection of tools that can be used to build scripted workflows within Jupyter Notebooks to facilitate bioinformatics analysis by both end-users and developers.