To participate in the workshop, you will need access to the software as described below. In addition, you will need an up-to-date web browser.
If you have questions or run into issues with any of the instructions below, please attend the office hours dedicated to installation troubleshooting. The dates and times for these sessions are listed below, and the Zoom link is available here.
July 2nd: 2pm - 2:30pm
July 8th: 11:30am - 12pm
The Cheaha supercomputer is a resource offered by Research Computing to all members of UAB and is a very useful tool for analyzing large datasets.
Please follow the documentation produced by UAB’s Research Computing Team available here in order to create an account to access Cheaha.
Once your Cheaha account has been created, test that you can use the Interactive File System and Terminal available on UAB Research Computing’s OnDemand Application. To do that, follow the steps below:
1. Click the Files button.
2. Navigate to /scratch/<blazer_id>/, where blazer_id is your blazer id.
3. Click the Open in Terminal button in the upper right of the page.
Performing these steps will confirm that your account has been set up correctly.
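Once the terminal is open, a quick command-line check can complement the steps above. The following is a minimal sketch, not part of the official Research Computing documentation: check_scratch is a hypothetical helper name, and the example call assumes that the $USER_SCRATCH variable used later on this page points to your /scratch/<blazer_id>/ directory.

```shell
# Hypothetical helper (not part of the workshop materials): report whether a
# directory exists and is writable by the current user.
check_scratch() {
    dir="$1"
    if [ -z "$dir" ]; then
        echo "PROBLEM: no directory given"
        return 2
    fi
    if [ -d "$dir" ] && [ -w "$dir" ]; then
        echo "OK: $dir exists and is writable"
        return 0
    fi
    echo "PROBLEM: $dir is missing or not writable"
    return 1
}

# Example call on Cheaha (assumes $USER_SCRATCH is set to your scratch path):
# check_scratch "$USER_SCRATCH"
```

If the helper reports a problem, the installation office hours listed at the top of this page are a good place to ask for help.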
To set up the environment for running the pipeline on Cheaha, we will use a tool called Anaconda. Anaconda is a package manager that allows for quick and easy installation of tools. We will use it to install the tools needed for the initial parts of the workshop.
1. Click the Files button.
2. Navigate to /scratch/<blazer_id>/, where blazer_id is your blazer id.
3. Click the Open in Terminal button in the upper right of the page.
4. Load the Anaconda module:
module load Anaconda3
5. Create the conda environment:
conda create -p $USER_SCRATCH/conda_envs/rnaseq_workshop python=3.12 bioconda::nf-core bioconda::nextflow
6. Type y when prompted (Proceed ([y]/n)?).
This command will create a new conda environment called rnaseq_workshop in your scratch space on Cheaha, under a sub-directory called conda_envs. It is important to give your environments intuitive names to help you remember their purpose.
To activate the environment, run:
conda activate $USER_SCRATCH/conda_envs/rnaseq_workshop
To deactivate it when you are finished, run:
conda deactivate
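To confirm that the environment was created where the conda create command put it, one option is to look for the conda-meta sub-directory that conda places inside every environment. This is only a sketch: env_exists is a hypothetical helper name, and the example call assumes $USER_SCRATCH is set in your Cheaha terminal session.

```shell
# Hypothetical helper: succeed if the given path looks like a conda
# environment, i.e. it contains a conda-meta sub-directory.
env_exists() {
    prefix="$1"
    [ -d "$prefix/conda-meta" ]
}

# Example on Cheaha, using the path from the conda create command:
# if env_exists "$USER_SCRATCH/conda_envs/rnaseq_workshop"; then
#     echo "environment found"
# else
#     echo "environment not found; re-run the conda create command"
# fi
```

With the environment activated, running nextflow -version and nf-core --version should each print a version number if the tools were installed correctly.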
Globus is a web-based tool that can be used for large data transfers between different locations.
As a member of UAB, you can log in using your blazer id. To log in to Globus through UAB, follow the steps below:
The data being used for the workshop comes from the paper below:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6096346/
For this workshop, the data has been pre-downloaded and packaged into a Globus endpoint for ease of transfer. The link for the Globus endpoint can be found here.
If you have never logged in to Globus, please see the instructions here.
In order to download the data, perform the following steps:
/scratch/your_blazerid, where your_blazerid is your blazer id
rnaseq_workshop.
The folders will now be transferred to your scratch space on Cheaha. Following this process will match the directory structure that will be used for the first portion of the workshop.
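After the transfer finishes, you may want to confirm from a Cheaha terminal that the folders actually arrived. The sketch below is illustrative only: count_folders is a hypothetical helper name, and the example path assumes the data landed in an rnaseq_workshop directory inside your scratch space, which may differ from your actual transfer destination.

```shell
# Hypothetical helper: print the number of top-level folders inside a
# directory (prints 0 if the directory is missing or has no sub-folders).
count_folders() {
    dir="$1"
    n=0
    for entry in "$dir"/*/; do
        if [ -d "$entry" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Example on Cheaha (the destination path is an assumption; adjust as needed):
# count_folders "$USER_SCRATCH/rnaseq_workshop"
```

A result of 0 suggests the transfer has not completed or went to a different location than expected.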
For tertiary analysis, the class will be taught using your own local computers, instead of Cheaha.
Please download the data and the teaching_templates folders from the following Box link: https://uab.box.com/s/arbfr7dm5nhe62p5dycau5g9c2xzcoth
Note: please save the folders above to a location on your computer that is easy to find, such as a sub-folder on your Desktop. This directory path will be your working directory in the R session (for a reminder on how to set your working directory in class, please see the following brief overview: https://www.learn-r.org/r-tutorial/setwd-r.php).
R is a programming language that is especially powerful for data exploration, visualization, and statistical analysis. To interact with R, we use RStudio.
Please follow the instructions at this location:
https://carpentries.github.io/workshop-template/install_instructions/#r-1
R has a variety of existing and pre-built packages that ease the burden of analysis. We will be using a few of these packages to complete the lesson taught at this workshop.
Please complete this step only AFTER successfully installing R and RStudio above.
Source Code Panel), type the command below and press the Enter key:
install.packages(c("BiocManager", "remotes"))
If prompted with Update all/some/none? [a/s/n]:, be sure to type a and press the Enter key.
BiocManager::install(c("DESeq2", "tximport", "vsn", "apeglm", "tidyverse", "SummarizedExperiment", "vidger"))
BiocManager::install(c("biomaRt","clusterProfiler","msigdbr","Glimma","simplifyEnrichment"))
BiocManager::install(c("enrichplot", "kableExtra","ggrepel","ggpubr","ComplexHeatmap","ComplexUpset","EnhancedVolcano","RColorBrewer", "hexbin", "cowplot","gplots", "ggplot2"))
install.packages(c("gprofiler2", "ggridges", "rmarkdown"))
library("DESeq2")
library("tximport")
library("vsn")
library("apeglm")
library("tidyverse")
library("SummarizedExperiment")
library("vidger")
library("biomaRt")
library("clusterProfiler")
library("msigdbr")
library("Glimma")
library("simplifyEnrichment")
library("enrichplot")
library("kableExtra")
library("ggrepel")
library("ggpubr")
library("ComplexHeatmap")
library("ComplexUpset")
library("EnhancedVolcano")
library("RColorBrewer")
library("hexbin")
library("cowplot")
library("gplots")
library("ggplot2")
library("gprofiler2")
library("ggridges")
library("rmarkdown")