There has been an exponential increase in the quantity and type of biodiversity data in recent years, including presence-absence, counts, and presence-only citizen science data. Species Distribution Models (SDMs) have typically been used in ecology to estimate current and future ranges of species and are a common tool used when making conservation prioritization decisions. However, the integration of these data in a model-based framework is needed to address many of the current large-scale threats to biodiversity. Current SDM practice typically underutilizes the large amount of publicly available biodiversity data and does not follow a set of standard best practices. Integrating different data types with open-source tools and reproducible workflows saves time, increases collaboration opportunities, and increases the power of data inference in SDMs. We aim to address this issue by (1) proposing methods and (2) generating a reproducible workflow to integrate different available data types to increase the power of SDMs. We provide the R package intSDM, as well as guidance on how to accommodate users' diverse needs and ecological questions with different data types available on the Global Biodiversity Information Facility (GBIF), the largest biodiversity data aggregator in the world. Finally, we provide a case study of the application of our proposed reproducible workflow by creating SDMs for vascular plants in Norway, integrating presence-only and presence-absence species occurrence data as well as climate data.
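The abstract does not reproduce the intSDM interface, so the sketch below only illustrates the kind of data retrieval such a workflow starts from: pulling presence-only records from GBIF in R with the rgbif package (assumed to be installed). The species name, country code, and record limit are example values, not taken from the case study.

```r
# Illustrative first step only: retrieve presence-only GBIF records for one
# vascular plant species in Norway and keep the columns an SDM would need.
# Uses rgbif::occ_search(); intSDM's own wrapper functions are not shown
# because the abstract does not name them.
library(rgbif)

occ <- occ_search(
  scientificName = "Fraxinus excelsior",  # example species, not from the case study
  country        = "NO",                  # Norway, ISO 3166-1 alpha-2 code
  hasCoordinate  = TRUE,                   # drop records without coordinates
  limit          = 500
)

records <- occ$data[, c("scientificName", "decimalLongitude", "decimalLatitude")]
head(records)
```

Records retrieved this way would then be combined with structured presence-absence surveys and environmental covariates inside the package's integrated modelling workflow.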
Nuclear magnetic resonance (NMR) spectroscopy is a useful tool for detection and identification of molecular structural information, with increasing applications in environmental sciences. NMR instrument outputs are, however, heterogeneous and require extensive post-processing, creating barriers to their use and application by non-specialists. Here, we report on a new open-source R package, nmrrr, that processes and visualizes spectral data obtained from one-dimensional solution-state and solid-state NMR experiments; the package also performs relevant calculations commonly applied in the natural organic matter research community, such as computing the relative abundance of various functional groups. We document the package's installation, dependencies, and functions, and provide a standard workflow for processing NMR data. This package is currently available on CRAN and GitHub, and community contributions are welcome.
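The abstract describes the workflow (import, visualize, bin, compute relative abundances) without listing the package's functions, so the sketch below is illustrative only; the function and bin-set names are assumptions and should be checked against the nmrrr documentation on CRAN.

```r
# Assumed workflow sketch -- the function and bin-set names below are
# placeholders inferred from the described capabilities and may not match
# the real nmrrr API; consult help(package = "nmrrr") before use.
library(nmrrr)

# spectra <- nmr_import_spectra("data/nmr_exports/", method = "topspin")  # read 1D spectra (assumed)
# nmr_plot_spectra(spectra)                                               # overlay spectra for inspection (assumed)
# binned  <- nmr_assign_bins(spectra, binset = bins_Clemente2012)         # group peaks into functional-group bins (assumed)
# relab   <- nmr_relabund(binned)                                         # relative abundance per functional group (assumed)
```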
Agent-based models find wide application in all fields of science where large-scale patterns emerge from properties of individuals. Increasing computing capacity has made it possible in recent years to improve the level of detail and structural realism of next-generation models. However, this comes at the expense of increased model complexity, which requires more efficient tools for model exploration, analysis, and documentation that enable reproducibility, repeatability, and parallelization. NetLogo is a widely used environment for agent-based model development, but it does not provide sufficient built-in tools for extensive model exploration, such as sensitivity analyses. One tool for controlling NetLogo externally is the R package RNetLogo. However, this package is not suited for efficient, reproducible research: it has stability and resource allocation issues, is not straightforward to set up and use on high-performance computing clusters, and does not provide easy-to-use utilities such as storing and exchanging metadata. We present the R package nlrx, which overcomes stability and resource allocation issues by running NetLogo simulations via dynamically created XML experiment files. Class objects make setting up experiments more convenient, and helper functions provide many parameter exploration approaches, such as Latin hypercube designs, Sobol sensitivity analyses, or optimization approaches. Output is automatically collected in user-friendly formats and can be post-processed with provided utility functions. nlrx enables reproducibility by storing all relevant information and simulation output of experiments in one R object, which can conveniently be archived and shared. We provide a detailed description of the nlrx package functions and the overall workflow. We also present a use case scenario using a NetLogo model, for which we performed a sensitivity analysis and a genetic algorithm optimization. The nlrx package is the first framework for documentation and application of reproducible NetLogo simulation experiments in R.
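As a concrete illustration of the described workflow, the sketch below sets up a Latin hypercube experiment for a NetLogo model with nlrx. The NetLogo install path, model path, the "density" parameter, and the reporter metric are placeholders; the nl/experiment/simdesign pattern follows the package's documented usage as we understand it and should be checked against the nlrx manual.

```r
# Minimal sketch of an nlrx Latin hypercube experiment; paths, the model,
# its "density" parameter, and the reporter metric are placeholders.
library(nlrx)

nl <- nl(nlversion = "6.2.0",
         nlpath    = "/path/to/NetLogo 6.2.0",   # placeholder install path
         modelpath = "/path/to/model.nlogo",     # placeholder model path
         jvmmem    = 1024)

nl@experiment <- experiment(expname     = "lhs_example",
                            outpath     = "out/",
                            repetition  = 1,
                            tickmetrics = "true",
                            idsetup     = "setup",
                            idgo        = "go",
                            runtime     = 100,
                            metrics     = c("count turtles"),
                            variables   = list("density" = list(min = 10, max = 90,
                                                                qfun = "qunif")))

nl@simdesign <- simdesign_lhs(nl = nl, samples = 100, nseeds = 3, precision = 3)

results <- run_nl_all(nl)   # run all sampled parameter combinations and seeds
```

Because everything (experiment definition, simulation design, and output) lives in the single `nl` object, archiving that object is what makes the run reproducible and shareable.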
Textbook data is essential for teaching statistics and data science methods because it is clean, allowing the instructor to focus on methodology. Ideally, textbook datasets are refreshed regularly, especially when they are subsets taken from an ongoing data collection. It is also important to use contemporary data for teaching, to convey that the methodology is relevant today. This article describes the trials and tribulations of refreshing a textbook dataset on wages, extracted from the National Longitudinal Survey of Youth (NLSY79) in the early 1990s. The data is useful for teaching modeling and exploratory analysis of longitudinal data. Subsets of NLSY79, including the wages data, can be found in the supplementary materials of numerous textbooks and research articles. The NLSY79 database has been continually updated through 2018, so new records are available. Here we describe our journey to refresh the wages data, and document the process so that the data can be regularly updated in the future. Our journey was difficult because the steps and decisions taken to get from the raw data to the wages textbook subset had not been clearly articulated. We have been diligent in providing a reproducible workflow for others to follow, which we hope also inspires more attempts at refreshing data for teaching. Three new datasets, and the code to produce them, are provided in the open-source R package yowie. Supplementary materials for this article are available online.
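The abstract names the package but not its three datasets, so the sketch below sticks to calls that do not assume any dataset name: installing yowie from CRAN and listing the data objects it bundles.

```r
# Minimal sketch: install the package and enumerate its bundled datasets;
# no dataset names are assumed beyond what the listing itself returns.
install.packages("yowie")
library(yowie)

data(package = "yowie")   # list the refreshed wages datasets shipped with the package
```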
Background: Ambitious initiatives to coordinate genome sequencing of Earth's biodiversity mean that the accumulation of genomic data is growing rapidly. In addition to cataloguing biodiversity, these data provide the basis for understanding biological function and evolution. Accurate and complete genome assemblies offer a comprehensive and reliable foundation upon which to advance our understanding of organismal biology at genetic, species, and ecosystem levels. However, ever-changing sequencing technologies and analysis methods mean that available data are often heterogeneous in quality. To guide forthcoming genome generation efforts and promote efficient prioritization of resources, it is thus essential to define and monitor taxonomic coverage and quality of the data. Findings: Here we present an automated analysis workflow that surveys genome assemblies from the US National Center for Biotechnology Information (NCBI), assesses their completeness using the relevant BUSCO datasets, and collates the results into an interactively browsable resource. We apply our workflow to produce a community resource of available assemblies from the phylum Arthropoda, the Arthropoda Assembly Assessment Catalogue. Using this resource, we survey current taxonomic coverage and assembly quality at the NCBI, examine how key assembly metrics relate to gene content completeness, and compare results from using different BUSCO lineage datasets. Conclusions: These results demonstrate how the workflow can be used to build a community resource that enables large-scale assessments to survey species coverage and data quality of available genome assemblies, and to guide prioritizations for ongoing and future sampling, sequencing, and genome generation initiatives.
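The published workflow itself is not reproduced in the abstract; as an illustration of its central step, the sketch below loops over locally downloaded assemblies and runs BUSCO from R, assuming the BUSCO v5 executable is installed and on the PATH. The directory layout is a placeholder, and arthropoda_odb10 is used only because it matches the phylum-level scope described.

```r
# Illustrative sketch, not the published pipeline: run BUSCO on each downloaded
# assembly and leave one output folder per assembly for later collation.
# Assumes the `busco` executable (v5) is on the PATH; paths are placeholders.
assemblies <- list.files("assemblies", pattern = "\\.fna$", full.names = TRUE)

for (asm in assemblies) {
  out <- paste0("busco_", tools::file_path_sans_ext(basename(asm)))
  system2("busco", args = c("-i", asm,                 # input assembly
                            "-l", "arthropoda_odb10",  # BUSCO lineage dataset
                            "-m", "genome",            # assessment mode
                            "-o", out,                 # output folder name
                            "-c", "4"))                # CPU threads
}
```

The per-assembly short summaries produced this way are what a catalogue like the one described would then parse and present in a browsable form.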
Background: The ever-increasing volume of academic literature necessitates efficient and sophisticated tools for researchers to analyze, interpret, and uncover trends. Traditional search methods, while valuable, often fail to capture the nuance and interconnectedness of vast research domains. Results: TopicTracker, a novel software tool, addresses this gap by providing a comprehensive solution from querying PubMed databases to creating intricate semantic network maps. With it, users can systematically search for relevant literature, analyze trends, and visually represent co-occurrences in a given field. Our case studies, including support for the WHO on ethical considerations in infodemic management and mapping the evolution of ethics pre- and post-pandemic, underscore the tool's applicability and precision. Conclusions: TopicTracker represents a significant advancement in academic research tools for text mining. While it has its limitations, primarily tied to its alignment with PubMed, its benefits far outweigh the constraints. As the landscape of research continues to expand, tools like TopicTracker may be instrumental in guiding scholars in their pursuit of knowledge, ensuring they navigate the vast body of literature with clarity and precision.
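TopicTracker's own interface is not described in the abstract, so the sketch below is not its API; it only illustrates the kind of PubMed query such a pipeline starts from, using the rentrez package in R. The search term and record limit are example values.

```r
# Generic illustration of the querying step (not TopicTracker's API):
# search PubMed from R with rentrez and keep the matching record IDs.
library(rentrez)

res <- entrez_search(db     = "pubmed",
                     term   = "infodemic AND ethics",  # example query, not from the case studies
                     retmax = 100)

res$count       # total number of matching records
res$ids[1:5]    # first few PubMed IDs for downstream co-occurrence mapping
```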