Overview

Recent advancements in RNA-sequencing (RNA-Seq) have led to impactful technological breakthroughs in RNA-Seq quantification at high resolution, including single-cell RNA-Seq and spatial transcriptomics. However, bulk RNA-Seq (defined as: RNA-Seq quantification from averaged gene expression ; averaged expression across all cell types from tissue collected without spatial resolution), continues to play an important role in transcriptomics for various reasons, including but not limited to:

High volume of publicly available datasets (a good starting point for initial hypothesis testing)
Cost effectivess
It has been well standardized over the last decade, and thus, molecular and analysis methods are robust for the scope of the technique.

Bulk RNA-Seq as a foundational block to transcriptomics analysis

To data scientists seeking to analyze the latest transcriptomics approaches (e.g.: single-cell, or spatial), bulk RNA-Seq analysis methods are foundational blocks to transcriptomics. For example, familiarity with bulk RNA-Seq methods is critical for the efficient analysis of cell specific pseudobulk data derived from single-cell experiments. Furthermore, many enrichment analysis methods and visualization approaches are shared or built from classical methods from bulk RNA-Seq analysis. Thus, familiarity with bulk transcriptomics is a critical skill to have for any transcriptomics analysis.

Pre-requirements

Workshop participants are strongly encouraged to have familiarity with Unix and R (introductory level).

The Software Carpentry provides excellent introductory level material, which can be found in the following links:

If you have already attended one of the previous U-BDS workshops for R and Unix programming, then you should meet the pre-requirements. If you have not, we heavily encourage participants to go through the materials from the links above prior to attendance.

Scope of the workshop

This workshop will cover the foundational topics and methods to bulk RNA-Seq analysis, including:

Introduction to raw data management and data exploration
Secondary analysis with nf-core/rnaseq
Tertiary analysis topics: quality and control, data normalization and differential gene expression analysis with DESeq2, gene annotation, gene enrichment analysis (gene-ontology and gene set enrichment analysis)
Fundamentals of data visualization for transcriptomics

Authors

Austyn Trull
Bharat Mishra, PhD
Lara Ianov, PhD

A.T., B.M. and L.I. designed workshop content and initial outline. A.T. contributed to installation instructions, data management, and secondary analysis materials. B.M. contributed to Docker container creation and tertiary analysis materials (including DEG analysis, GSEA, GO, visualization etc.). L.I. supervised all work, reviewed all content and managed website design and deployment.

Nilesh Kumar, PhD and Luke Potter, PhD also provided additional edits and revisions to source material.

Additional credits

In addition to U-BDS’s best practices and code written by U-BDS, sections of the teaching material for this workshop (parts of tertiary analysis), contains materials which have been adapted or modified from the following sources (we thank the curators and maintainers of all of these resources for their wonderful contributions, compiling the best practices, and easy to follow training guides for beginners):

Beta phase of carpentries material: https://carpentries-incubator.github.io/bioc-rnaseq/index.html
Love MI, Huber W, Anders S (2014). “Moderated estimation of fold change and dispersion for RNA-Seq data with DESeq2.” Genome Biology, 15, 550. https://doi.org/10.1186/s13059-014-0550-8 ; vignette: https://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html
Additional references and materials:
- https://alexslemonade.github.io/refinebio-examples/03-rnaseq/00-intro-to-rnaseq.html
- https://bioc.ism.ac.jp/packages/3.7/bioc/vignettes/enrichplot/inst/doc/enrichplot.html

We would also like to thank the following groups for support:

UAB’s Research Computing (HPC resources and workshop logistics with resources)
nf-core community (rnaseq pipeline)

We would also like to thank the authors of the dataset which we implement in our workshop:

Koch CM, Chiu SF, Akbarpour M, Bharat A, Ridge KM, Bartom ET, Winter DR. A Beginner’s Guide to Analysis of RNA Sequencing Data. Am J Respir Cell Mol Biol. 2018 Aug;59(2):145-157. doi: 10.1165/rcmb.2017-0430TR. PMID: 29624415; PMCID: PMC6096346.

Lastly, we would also like to thank Kristen Coutinho and the UAB Informatics Club for the dataset suggestion, and preliminary discussions for this workshop.

Workshop instructors

Austyn Trull
Bharat Mishra, PhD
Nilesh Kumar, PhD
Luke Potter, PhD
Lara Ianov, PhD

Citation

If these materials are beneficial to your analysis and research, please cite it with the following DOI:

UAB Biological Data Science Core, Bulk RNA-seq Workshop

U-BDS bulk RNA-Seq workshop