Search results - Inner Mongolia University Library

UTAP2: an enhanced user-friendly transcriptome and epigenome analysis pipeline

BMC Bioinformatics, 2025, vol. 26, no. 1, pp. 1-12

Authors: Lindner, Jordana; Dassa, Bareket; Wigoda, Noa; Stelzer, Gil; Feldmesser, Ester; Prilusky, Jaime; Leshkowitz, Dena. Affiliation: Weizmann Inst Sci Dept Life Sci Core Facil Bioinformat Unit IL-76100 Rehovot Israel

Background: The emergence of next-generation sequencing (NGS) marked a revolution in biological research, enabling comprehensive characterization of the transcriptome and detailed analysis of the epigenome landscape. This technology has made it possible to detect differences across cell types, genotypes, and conditions. Advances in short-read sequencing platforms have produced user-friendly machines that offer high throughput at a reduced cost per base. However, leveraging this data still requires bioinformatics expertise to develop and execute tailored solutions for each specific application. Democratizing access to sequence analysis tools is crucial to empower researchers from diverse fields to harness the full potential of NGS ***2, our enhanced version of the UTAP pipeline published in 2019 (Kohen et al. in BMC Bioinform 20(1):154, 2019), empowers researchers to unlock the mysteries of gene expression and epigenetic modifications with ease. This user-friendly, open-source pipeline, built by unit programmers and deep sequencing analysts, streamlines transcriptome and epigenome data analysis, handling everything from sequences to gene or peak counts and annotation of differentially expressed genes or genomic regions. Results are delivered in organized folders and rich reports packed with plots, tables, and links for effortless interpretation. Since its debut, UTAP has been embraced by many researchers at the Weizmann Institute and has received over 100 citations, thus highlighting its scientific *** User-friendly Transcriptome and Epigenome Analysis Pipeline UTAP2 is available to the broader biomedical research community as an open-source installation. With a single image, it can be installed on both local servers and cloud platforms, allowing users to leverage parallel cluster resources. Once installed, UTAP2 enables researchers, even those with limited bioinformatics skills, to efficiently, accurately and reliably analyse transcriptome and epig

Keywords: NGS (next-generation sequencing); RNA-seq; ATAC-seq; ChIP-seq; Ribo-Seq; Differential gene expression; Peak calling; Gene-set enrichment; Clustering; Transcriptome; Epigenome; RNA-Seq; Sequence analysis pipelines; bioinformatics workflow; Genome mapping; Bulk MARS-Seq; UMI (unique molecular identifier); Gene expression profile; Normalization

Closha 2.0: a bio-workflow design system for massive genome data analysis on high performance cluster infrastructure

BMC Bioinformatics, 2024, vol. 25, no. 1, p. 353

Authors: Ko, Gunhwan; Kim, Pan-Gyu; Yoon, Byung-Ha; Kim, JaeHee; Song, Wangho; Byeon, IkSu; Yoon, JongCheol; Lee, Byungwook; Kim, Young-Kuk. Affiliations: KRIBB Korean Bioinformat Ctr KOBIC 125 Gwahangno Daejeon 34141 South Korea; Chungnam Natl Univ Dept Bio AI Convergence Daejeon 34134 South Korea

Background: The explosive growth of next-generation sequencing data has resulted in ultra-large-scale datasets and significant computational challenges. As the cost of next-generation sequencing (NGS) has decreased, the amount of genomic data has surged globally. However, the cost and complexity of the computational resources required continue to be substantial barriers to leveraging big data. A promising solution to these computational challenges is cloud computing, which provides researchers with the necessary CPUs, memory, storage, and software ***, we present Closha 2.0, a cloud computing service that offers a user-friendly platform for analyzing massive genomic datasets. Closha 2.0 is designed to provide a cloud-based environment that enables all genomic researchers, including those with limited or no programming experience, to easily analyze their genomic data. The new 2.0 version of Closha has more user-friendly features than the previous 1.0 version. First, the workbench features a script editor that supports Python, R, and shell script programming, enabling users to write scripts and integrate them into their pipelines. This functionality is particularly useful for downstream analysis. Second, Closha 2.0 runs on containers, which execute each tool in an independent environment. This provides a stable environment and prevents dependency issues and version conflicts among tools. Additionally, users can execute each step of a pipeline individually, allowing them to test applications at each stage and adjust parameters to achieve the desired results. We also updated a high-speed data transmission tool called GBox that facilitates the rapid transfer of large *** analysis pipelines on Closha 2.0 are reproducible, with all analysis parameters and inputs being permanently recorded. Closha 2.0 simplifies multi-step analysis with drag-and-drop functionality and provides a user-friendly interface for genomic scientists to obtain accur

Keywords: Closha 2.0; Next-generation sequencing (NGS); Cloud computing; bioinformatics workflow; High-performance computing (HPC); Genomic data analysis; User-friendly interface; Data transmission (GBox); Single-cell RNA sequencing (scRNA-Seq)

MAGqual: a stand-alone pipeline to assess the quality of metagenome-assembled genomes

MICROBIOME, 2024, vol. 12, no. 1, p. 226

Authors: Cansdale, Annabel; Chong, James P. J. Affiliation: Univ York Ctr Excellence Anaerob Digest Dept Biol Wentworth Way York YO10 5DD England

Background: Metagenomics, the whole genome sequencing of microbial communities, has provided insight into complex ecosystems. It has facilitated the discovery of novel microorganisms, explained community interactions and found applications in various fields. Advances in high-throughput and third-generation sequencing technologies have further fuelled its popularity. Nevertheless, managing the vast data produced and addressing variable dataset quality remain ongoing challenges. Another challenge arises from the number of assembly and binning strategies used across studies. Comparing datasets and analysis tools is complex as it requires the quantitative assessment of metagenome quality. The inherent limitations of metagenomic sequencing, which often involves sequencing complex communities, mean community members are challenging to interrogate with traditional culturing methods, leading to many lacking reference sequences. MIMAG standards aim to provide a method to assess metagenome quality for comparison but have not been widely *** address the need for simple and quick metagenome quality assignation, here we introduce the pipeline MAGqual (Metagenome-Assembled Genome qualifier) and demonstrate its effectiveness at determining metagenomic dataset quality in the context of the MIMAG *** MAGqual pipeline offers an accessible way to evaluate metagenome quality and generate metadata on a large scale. MAGqual is built in Snakemake to ensure readability and scalability, and its open-source nature promotes accessibility, community development, and ease of updates. MAGqual is built in Snakemake, R, and Python and is available under the MIT license on GitHub at https://***/ac1513/***_9RbKd3CYdEq3zCTHKaq
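The kind of quality assignment the abstract refers to can be illustrated with a minimal Python sketch. It assumes the commonly cited MIMAG draft thresholds (high quality: more than 90% completeness, less than 5% contamination, 5S/16S/23S rRNA genes present and at least 18 tRNAs; medium: at least 50% completeness and less than 10% contamination; low: below 50% completeness and less than 10% contamination). The names `MagStats` and `mimag_quality` are hypothetical and are not taken from MAGqual, whose actual implementation and categories may differ.

```python
from dataclasses import dataclass


@dataclass
class MagStats:
    """Per-bin summary statistics (hypothetical structure for illustration)."""
    completeness: float    # estimated completeness, in percent
    contamination: float   # estimated contamination, in percent
    has_all_rrna: bool     # True if 5S, 16S and 23S rRNA genes were all found
    trna_count: int        # number of distinct tRNAs detected


def mimag_quality(mag: MagStats) -> str:
    """Assign a MIMAG-style draft quality label under the assumed thresholds."""
    if mag.contamination >= 10:
        # Outside every MIMAG draft category as assumed here.
        return "unclassified"
    if (mag.completeness > 90 and mag.contamination < 5
            and mag.has_all_rrna and mag.trna_count >= 18):
        return "high"
    if mag.completeness >= 50:
        return "medium"
    return "low"


if __name__ == "__main__":
    example = MagStats(completeness=95.0, contamination=2.0,
                       has_all_rrna=True, trna_count=20)
    print(mimag_quality(example))
```

Note that a bin with high completeness and low contamination but missing rRNA genes falls into the medium category in this sketch, which is one reason tools report the rRNA and tRNA evidence alongside the label.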

Keywords: Metagenomics; Snakemake; bioinformatics; Pipeline; MAGs; Microbiome; Metagenome-assembled genomes; bioinformatics workflow

Exploring S-RNase diversity in the Andean black cherry (Prunus serotina) using MinION sequencing: a cost-effective approach with increased genotyping resolution

EUPHYTICA, 2023, vol. 219, no. 10, pp. 1-15

Authors: Becerra-Wong, Monica; Gordillo-Romero, Milton; Baus, Lisa C.; Teran-Velastegui, Martin; Torres, Maria de Lourdes; Torres, Andres F. Affiliations: Univ San Francisco Quito Lab Biotecnol Vegetal COCIBA Quito Ecuador; Ludwig Maximilians Univ Munchen Fac Biol Genet D-82152 Martinsried Germany

The Andean black cherry (P. serotina) is an underutilized fruit species that could contribute to the development of sustainable food systems in the Andean region. The species displays gametophytic self-incompatibility (GSI), a mechanism controlled by the multiallelic S-locus which prevents crossbreeding between genetically related individuals and hinders breeding efforts. To design effective crosses, breeders require accurate knowledge of the S-haplotypes of parental lines. However, S-haplotype diversity is commonly evaluated using PCR-based methods that fail to accurately discriminate alleles. To address this limitation, we developed a new method to identify S-alleles in P. serotina using nanopore sequencing technology. Our method uses the Native Barcoding protocol and MinION sequencer from Oxford Nanopore Technologies to enable scalable, multiplex amplicon sequencing. For sequence analysis, we developed a bioinformatic pipeline that uses Porechop for sample demultiplexing, MeshClust for sequence alignment and clustering, and the Ugene Consensus algorithm to determine allelic variants. In this study, we evaluated the S-RNase gene of 24 P. serotina accessions using our nanopore sequencing and bioinformatic workflow. Among these accessions, we identified 12 previously reported and 6 putative new S-alleles that could not be identified with existing S-genotyping methods. Five accessions were classified as homozygous, while the other 19 were heterozygous with two or three alleles. Our results demonstrate that nanopore sequencing provides a cost-effective alternative for S-allele profiling that improves on the accuracy of existing PCR-based methods. Because of the versatility of MinION sequencing, the reported workflow can be used to characterize the diversity of other useful genes in the species, which are of relevance for conservation and breeding efforts.
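The final step of the pipeline described above, deriving one allelic sequence from a cluster of reads, can be sketched in Python as a plain column-wise majority vote. This is an illustrative stand-in, not the Ugene Consensus algorithm: it assumes the reads in a cluster have already been aligned to equal length with `-` as the gap character, and the function name `majority_consensus` is hypothetical.

```python
from collections import Counter


def majority_consensus(aligned_reads):
    """Return the column-wise majority consensus of equal-length aligned reads.

    Columns whose most frequent symbol is the gap character '-' are dropped.
    """
    if not aligned_reads:
        raise ValueError("at least one aligned read is required")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("all aligned reads must have the same length")

    consensus = []
    for column in zip(*aligned_reads):
        # most_common(1) yields the (symbol, count) pair with the highest count.
        symbol, _count = Counter(column).most_common(1)[0]
        if symbol != "-":
            consensus.append(symbol)
    return "".join(consensus)


if __name__ == "__main__":
    cluster = ["ACGT-A", "ACGTTA", "ACCT-A"]
    print(majority_consensus(cluster))
```

Running one such consensus per cluster gives one candidate allele per cluster, so the number of clusters retained for an accession corresponds to the number of alleles called for it.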

Keywords: Nanopore; MinION; Prunus; S-alleles; bioinformatics workflow

Optimizing the Illumina COVIDSeq laboratorial and bioinformatics pipeline on thousands of samples for SARS-CoV-2 Variants of Concern tracking

COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL, 2022, vol. 20, pp. 2558-2563

Authors: Donzelli, Sara; Ciuffreda, Ludovica; Pontone, Martina; Betti, Martina; Massacci, Alice; Mottini, Carla; De Nicola, Francesca; Orlandi, Giulia; Goeman, Frauke; Giuliani, Eugenia; Sperandio, Eleonora; Piaggio, Giulia; Morrone, Aldo; Ciliberto, Gennaro; Fanciulli, Maurizio; Blandino, Giovanni; Pimpinelli, Fulvia; Pallocca, Matteo. Affiliations: IRCCS Regina Elena Natl Canc Inst Oncogen & Epigenet Rome Italy; IRCCS Regina Elena Natl Canc Inst SAFU Unit Rome Italy; IRCCS San Gallicano Dermatol Inst Microbiol & Virol Unit Rome Italy; IRCCS Regina Elena Natl Canc Inst Biostat Bioinformat & Clin Trial Ctr Rome Italy; IRCCS San Gallicano Dermatol Inst Sci Direct Rome Italy; IRCCS Regina Elena Natl Canc Inst Sci Direct Rome Italy

The SARS-CoV-2 Variants of Concern tracking via Whole Genome Sequencing represents a pillar of public health measures for the containment of the pandemic. The ability to track down the lineage distribution on a local and global scale leads to a better understanding of immune escape and to adopting interventions to contain novel outbreaks. This scenario poses a challenge for NGS laboratories worldwide that are pressed to have both a faster turnaround time and a high-throughput processing of swabs for sequencing and analysis. In this study, we present an optimization of the Illumina COVID-seq protocol carried out on thousands of SARS-CoV-2 samples at the wet and dry level. We discuss the unique challenges related to processing hundreds of swabs per week such as the tradeoff between ultra-high sensitivity and negative contamination levels, cost efficiency and bioinformatics quality metrics. (c) 2022 The Author(s). Published by Elsevier B.V. on behalf of Research Network of Computational and Structural Biotechnology. This is an open access article under the CC BY-NC-ND license (http://***/licenses/by-nc-nd/4.0/).

Keywords: Illumina COVID-seq; SARS-CoV-2 genome; SARS-CoV-2 mutation; COVID mutations; SARS-CoV-2 Variants of Concern; bioinformatics workflow; Oncology; Metagenomics

Cancer Predisposition Sequencing Reporter (CPSR): A flexible variant report engine for high-throughput germline screening in cancer

INTERNATIONAL JOURNAL OF CANCER, 2021, vol. 149, no. 11, pp. 1955-1960

Authors: Nakken, Sigve; Saveliev, Vladislav; Hofmann, Oliver; Moller, Pal; Myklebost, Ola; Hovig, Eivind. Affiliations: Oslo Univ Hosp Inst Canc Res Dept Tumor Biol Oslo Norway; Univ Oslo Fac Med Ctr Canc Cell Reprogramming Inst Clin Med Oslo Norway; Univ Melbourne Ctr Canc Res Melbourne Vic Australia; Univ Bergen Dept Clin Sci Bergen Norway; Haukeland Hosp Western Norway Familial Canc Ctr Bergen Norway; Univ Oslo Ctr Bioinformat Dept Informat Oslo Norway

The value of high-throughput germline genetic testing is increasingly recognized in clinical cancer care. Disease-associated germline variants in cancer patients are important for risk management and surveillance and for surgical decisions, and can also have major implications for treatment strategies since many are in DNA repair genes. With the increasing availability of high-throughput DNA sequencing in cancer clinics and research, there is thus a need to provide clinically oriented sequencing reports for germline variants and their potential therapeutic relevance on a per-patient basis. To meet this need, we have developed the Cancer Predisposition Sequencing Reporter (CPSR), an open-source computational workflow that generates a structured report of germline variants identified in known cancer predisposition genes, highlighting markers of therapeutic, prognostic and diagnostic relevance. A fully automated variant classification procedure based on more than 30 refined American College of Medical Genetics and Genomics (ACMG) criteria represents an integral part of the workflow. Importantly, the set of cancer predisposition genes profiled in the report can be flexibly chosen from more than 40 virtual gene panels established by scientific experts, enabling customization of the report for different screening purposes and clinical contexts. The report can be configured to also list actionable secondary variant findings, as recommended by ACMG. CPSR demonstrates comparable sensitivity and specificity for the detection of pathogenic variants when compared to other algorithms in the field. Technically, the tool is implemented in Python/R, and is freely available through Docker technology. Source code, documentation, example reports and installation instructions are accessible via the project GitHub page.

Keywords: bioinformatics workflow; cancer germline testing; clinical decision support; precision cancer medicine; variant interpretation

CellHeap: A Workflow for Optimizing COVID-19 Single-Cell RNA-Seq Data Processing in the Santos Dumont Supercomputer

14th Brazilian Symposium on Bioinformatics (BSB)

Authors: Silva, Vanessa S.; Costa, Maiana O. C.; Castro, Maria Clicia S.; Silva, Helena S.; Walter, Maria Emilia M. T.; Melo, Alba C. M. A.; Ocana, Kary A. C.; dos Santos, Marcelo T.; Nicolas, Marisa F.; Carvalho, Anna Cristina C.; Henriques-Pons, Andrea; Silva, Fabricio A. B. Affiliations: Fundacao Oswaldo Cruz Rio De Janeiro Brazil; Natl Lab Sci Comp Petropolis RJ Brazil; Univ Brasilia Brasilia DF Brazil; Univ Estado Rio De Janeiro Rio De Janeiro Brazil

ISBN: (print) 9783030918132; 9783030918149

Currently, several hundreds of Terabytes of COVID-19 single-cell RNA-seq (scRNA-seq) data are available in public repositories. This data refers to multiple tissues, comorbidities, and conditions. We expect this trend to continue, and it is realistic to predict amounts of COVID-19 scRNA-seq data increasing to several Petabytes in the coming years. However, thoughtful analysis of this data requires large-scale computing infrastructures, and software systems optimized for such platforms to generate biological knowledge. This paper presents CellHeap, a portable and robust workflow for scRNA-seq customizable analyses, with quality control throughout the execution steps and deployable on supercomputers. Furthermore, we present the deployment of CellHeap in the Santos Dumont supercomputer for analyzing COVID-19 scRNA-seq datasets, and discuss a case study that processed dozens of Terabytes of COVID-19 scRNA-seq raw data.

Keywords: Single-cell RNA-seq; bioinformatics workflow; COVID-19; High-performance computing

A detailed workflow to develop QIIME2-formatted reference databases for taxonomic analysis of DNA metabarcoding data

BMC GENOMIC DATA, 2022, vol. 23, no. 1, p. 53

Authors: Dubois, Benjamin; Debode, Frederic; Hautier, Louis; Hulin, Julie; San Martin, Gilles; Delvaux, Alain; Janssen, Eric; Mingeot, Dominique. Affiliations: Walloon Agr Res Ctr Bioengn Unit Life Sci Dept Chaussee Charleroi 234 B-5030 Gembloux Belgium; Walloon Agr Res Ctr Plant & Forest Hlth Unit Life Sci Dept Rue Liroux 2 B-5030 Gembloux Belgium; Walloon Agr Res Ctr Knowledge & Valorizat Agr Prod Dept Qual & Authenticat Unit Chaussee Namur 24 B-5030 Gembloux Belgium; Walloon Agr Res Ctr Protect Control Prod & Residues Unit Knowledge & Valorizat Agr Prod Dept Rue Bordia 11 B-5030 Gembloux Belgium

Background: The DNA metabarcoding approach has become one of the most used techniques to study the taxa composition of various sample types. To deal with the high amount of data generated by the high-throughput sequencing process, a bioinformatics workflow is required and the QIIME2 platform has emerged as one of the most reliable and commonly used. However, only some pre-formatted reference databases dedicated to a few barcode sequences are available to assign taxonomy. If users want to develop a new custom reference database, several bottlenecks still need to be addressed and a detailed procedure explaining how to develop and format such a database is currently missing. In consequence, this work is aimed at presenting a detailed workflow explaining from start to finish how to develop such a curated reference database for any barcode sequence. Results: We developed DB4Q2, a detailed workflow that allowed development of plant reference databases dedicated to ITS2 and rbcL, two commonly used barcode sequences in plant metabarcoding studies. This workflow addresses several of the main bottlenecks connected with the development of a curated reference database. The detailed and commented structure of DB4Q2 offers the possibility of developing reference databases even without extensive bioinformatics skills, and avoids 'black box' systems that are sometimes encountered. Some filtering steps have been included to discard presumably fungal and misidentified sequences. The flexible character of DB4Q2 allows several key sequence processing steps to be included or not, and downloading issues can be avoided. Benchmarking the databases developed using DB4Q2 revealed that they performed well compared to previously published reference datasets. Conclusion: This study presents DB4Q2, a detailed procedure to develop custom reference databases in order to carry out taxonomic analyses with QIIME2, but also with other bioinformatics platforms if desired. This work also provides ready-to-

Keywords: Reference database; QIIME2; bioinformatics workflow; Metabarcoding; High-throughput sequencing; ITS2; rbcL; Plant

aTAP: automated transcriptome analysis platform for processing RNA-seq data by de novo assembly

HELIYON, 2022, vol. 8, no. 8, e10255

Authors: Surachat, Komwit; Taylor, Todd Duane; Wattanamatiphot, Wanicbut; Sukpisit, Sukgamon; Jeenkeawpiam, Kongpop. Affiliations: Prince Songkla Univ Fac Med Dept Biomed Sci & Biomed Engn Hat Yai 90110 Songkhla Thailand; Prince Songkla Univ Fac Med Translat Med Res Ctr Hat Yai 90110 Songkhla Thailand; Prince Songkla Univ Fac Sci Mol Evolut & Computat Biol Res Unit Hat Yai 90110 Songkhla Thailand; RIKEN Ctr Integrat Med Sci Yokohama Kanagawa 2300045 Japan; Prince Songkla Univ Fac Sci Div Computat Sci Hat Yai 90110 Songkhla Thailand

RNA-seq is a sequencing technique that uses next-generation sequencing (NGS) to explore and study the entire transcriptome of a biological sample. NGS-based analyses are mostly performed via command-line interfaces, which is an obstacle for molecular biologists and researchers. Therefore, the higher throughputs from NGS can only be accessed with the help of bioinformatics and computer science expertise. As the cost of sequencing is continuously falling, the use of RNA-seq seems certain to increase. To minimize the problems encountered by biologists and researchers in RNA-seq data analysis, we propose an automated platform with a web application that integrates various bioinformatics pipelines. The platform is intended to enable academic users to more easily analyze transcriptome datasets. Our automated Transcriptome Analysis Platform (aTAP) offers comprehensive bioinformatics workflows, including quality control of raw reads, trimming of low-quality reads, de novo transcriptome assembly, transcript expression quantification, differential expression analysis, and transcript annotation. aTAP has a user-friendly graphical interface, allowing researchers to interact with and visualize results in the web browser. This project offers an alternative way to analyze transcriptome data, by integrating efficient and well-known tools, that is simpler and more accessible to research communities. aTAP is freely available to academic users at https://***/.

Keywords: Transcriptome; RNA-seq; bioinformatics workflow; Differentially expressed genes; Gene expression profile

Workflow for Rapidly Extracting Biological Insights from Complex, Multicondition Proteomics Experiments with WGCNA and PloGO2

JOURNAL OF PROTEOME RESEARCH, 2020, vol. 19, no. 7, pp. 2898-2906

Authors: Wu, Jemma X.; Pascovici, Dana; Wu, Yunqi; Walker, Adam K.; Mirzaei, Mehdi. Affiliations: Macquarie Univ Australian Proteome Anal Facil Sydney NSW 2109 Australia; Univ Queensland Queensland Brain Inst Neurodegenerat Pathobiol Lab St Lucia Qld 4072 Australia; Macquarie Univ Fac Med & Hlth Sci Ctr Motor Neuron Dis Res Dept Biomed Sci Sydney NSW 2109 Australia; Macquarie Univ Dept Clin Sci Sydney NSW 2109 Australia

We describe a useful workflow for characterizing proteomics experiments incorporating many conditions and abundance data using the popular weighted gene correlation network analysis (WGCNA) approach and functional annotation with the PloGO2 R package, the latter of which we have extended and made available to Bioconductor. The approach can use quantitative data from labeled or label-free experiments and was developed to handle multiple files stemming from data partition or multiple pairwise comparisons. The WGCNA approach can similarly produce a potentially large number of clusters of interest, which can also be functionally characterized using PloGO2. Enrichment analysis will identify clusters or subsets of proteins of interest, and the WGCNA network topology scores will produce a ranking of proteins within these clusters or subsets. This can naturally lead to prioritized proteins to be considered for further analysis or as candidates of interest for validation in the context of complex experiments. We demonstrate the use of the package on two published data sets using two different biological systems (plant and human plasma) and proteomics platforms (sequential window acquisition of all theoretical fragment-ion spectra (SWATH) and tandem mass tag (TMT)): an analysis of the effect of drought on rice over time generated using TMT and a pediatric plasma sample data set generated using SWATH. In both, the automated workflow recapitulates key insights or observations of the published papers and provides additional suggestions for further investigation. These findings indicate that the data set analysis using WGCNA combined with the updated PloGO2 package is a powerful method to gain biological insights from complex multifaceted proteomics experiments.

Keywords: proteomics; WGCNA; functional enrichment analysis; pathway; gene ontology; bioinformatics workflow
