If you wish to perform the steps illustrated in this tutorial, you’ll need to follow all the instructions below.


If you have questions or run into issues with any of these instructions, please feel free to attend Data Science office hours using the Zoom link below:

https://uab.zoom.us/meeting/register/tZ0rduCuqzotGtHShsawHLyRROqH3Sdz71mf#/registration


Cheaha Account Registration

The Cheaha supercomputer is a resource offered by Research Computing to all members of UAB and is a very useful tool for analyzing large datasets.

To create an account to access Cheaha, please follow the documentation produced by UAB’s Research Computing team, available here.

Once your Cheaha account has been created, test that you can use the interactive file system and terminal available in UAB Research Computing’s OnDemand application. To do that, follow the steps below:

  1. Log in to the OnDemand application at the following URL: https://rc.uab.edu/pun/sys/dashboard/
  2. In the upper left, click the Files button
  3. In the dropdown, click the link that says /scratch/<blazer_id>/ where blazer_id will be your blazer id
  4. In the new window that appears, click the Open in Terminal button in the upper right of the page

Completing these steps confirms that your account has been set up correctly.
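Once the terminal from step 4 is open, a quick check like the one below can confirm that your scratch space is usable. This is only a sketch: the function name `check_scratch_dir` is made up for illustration, and it assumes that the `$USER_SCRATCH` variable is defined in your Cheaha session, falling back to /scratch/<user name> if it is not.

```shell
# Hypothetical helper: succeed if the given directory exists and is writable.
check_scratch_dir() {
    [ -d "$1" ] && [ -w "$1" ]
}

# Assumption: $USER_SCRATCH is set on Cheaha; otherwise guess /scratch/<user name>.
scratch_dir="${USER_SCRATCH:-/scratch/${USER:-$(id -un)}}"

if check_scratch_dir "$scratch_dir"; then
    echo "OK: $scratch_dir exists and is writable"
else
    echo "Problem: $scratch_dir is missing or not writable"
fi
```

If the check reports a problem, the office hours mentioned above are a good place to ask for help.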

Environment Setup

To set up the environment for running the pipeline on Cheaha, we will use a tool called Anaconda. Anaconda is a package manager that allows for quick and easy installation of tools. We will use it to install the tools needed for the initial parts of the workshop.

  1. Log in to the UAB Research Computing OnDemand application at the following URL: https://rc.uab.edu/pun/sys/dashboard/
  2. In the upper left, click the Files button
  3. In the dropdown, click the link that says /scratch/<blazer_id>/ where blazer_id will be your blazer id
  4. In the new window that appears, click the Open in Terminal button in the upper right of the page
  5. Cheaha comes with many tools and software packages pre-installed as ‘modules’. Anaconda is one of them, and we can load it by running the command below in the terminal window that appeared:
    • module load Anaconda3
  6. Anaconda works by creating isolated environments to install packages into. To create one of these environments and install Nextflow and nf-core, use the command below:
    • conda create -p $USER_SCRATCH/conda_envs/nfcore_workshop python=3.12 bioconda::nf-core bioconda::nextflow

    • Type y when prompted (Proceed ([y]/n)?).

    • This command will create a new conda environment called nfcore_workshop in your scratch space on Cheaha, under a sub-directory called conda_envs. Give your environments intuitive names to help you remember their purpose.

  7. We now need to activate the newly created ‘nfcore_workshop’ environment so that the tools installed in it become available.
    • conda activate $USER_SCRATCH/conda_envs/nfcore_workshop
  8. The environment has now been created and the packages installed. To leave the environment, deactivate it:
    • conda deactivate
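Before deactivating the environment in step 8, it can be useful to confirm that both tools are actually on your PATH. The snippet below is a minimal sketch, not part of the original instructions. The helper name `report_tool` is hypothetical, and the snippet assumes the two packages provide executables named `nextflow` and `nf-core`.

```shell
# Hypothetical helper: print whether a command is available on the current PATH.
report_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
    fi
}

# Run this after `module load Anaconda3` and
# `conda activate $USER_SCRATCH/conda_envs/nfcore_workshop`.
for tool in nextflow nf-core; do
    report_tool "$tool"
done
```

If either tool is reported as missing, check that the environment is active and that the `conda create` command in step 6 finished without errors.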

Globus Account Registration

Globus is a web-based tool that can be used for large data transfers between different locations.

As a member of UAB, you can log in using your blazer id. To log in to Globus through UAB, follow the steps below:

  1. Go to the Globus home page.
  2. Click ‘Login’ in the upper right of the page.
  3. In the center of the page, click the drop down and search for ‘University of Alabama at Birmingham’ and click ‘Continue’.
  4. On the next page, enter your blazer id and password for your UAB account.

Download the Data to Cheaha

The data being used for the workshop comes from the paper below:

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6096346/

For this workshop, the data has been pre-downloaded and packaged into a Globus endpoint for ease of transfer. The link for the Globus endpoint can be found here

If you have never logged in to Globus, please see the instructions here

In order to download the data, perform the following steps:

  1. Log in to Globus
  2. Click on ‘File Manager’ in the side bar
  3. In the left pane, click the ‘Collection’ text box
  4. Search for ‘Cheaha cluster on-campus (UAB DMZ)’ and select it
    • If you are off campus, use the ‘Cheaha cluster off-campus (UAB DMZ)’ collection instead
  5. In the ‘Path’ text box type /scratch/your_blazerid where your_blazerid is your blazer id
  6. Towards the middle, click the ‘New Folder’ button and type nfcore_workshop.
    • This will create a new folder named ‘nfcore_workshop’ in your scratch space
  7. Double click the ‘nfcore_workshop’ folder to go inside of it
  8. In the right pane, click the ‘Collection’ text box
  9. Search for ‘nfcore_workshop_data’
  10. In the right pane, click the checkboxes beside the ‘input’ and ‘results’ folders that have appeared
  11. In the middle, click the ‘Transfer or Sync to…’ button.

The folders will now be transferred to your scratch space on Cheaha. Following this process will match the directory structure that will be used for the first portion of the workshop.
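After Globus reports that the transfer has finished, you can check from a Cheaha terminal that both folders arrived where expected. The following is a sketch under a few assumptions: the helper name `check_workshop_dirs` is invented for illustration, and the path relies on `$USER_SCRATCH` pointing to your scratch space, with /scratch/<user name> as a fallback guess.

```shell
# Hypothetical helper: report which expected workshop folders exist under a base directory.
check_workshop_dirs() {
    for name in input results; do
        if [ -d "$1/$name" ]; then
            echo "present: $name"
        else
            echo "absent: $name"
        fi
    done
}

# Assumption: $USER_SCRATCH is set on Cheaha; otherwise guess /scratch/<user name>.
workshop_dir="${USER_SCRATCH:-/scratch/${USER:-$(id -un)}}/nfcore_workshop"
check_workshop_dirs "$workshop_dir"
```

A folder reported as absent usually means the transfer is still running or was sent to a different path.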

Install Visual Studio Code (VSCode)

In order to use Visual Studio Code (VSCode) with Cheaha, please follow the instructions created by UAB’s Research Computing team below:

https://docs.rc.uab.edu/cheaha/open_ondemand/hpc_desktop/#visual-studio-code-remote-tunnel

Install Nextflow Extension for VSCode

  1. Open up the “Visual Studio Code” application you installed in the instructions here
  2. Connect to the code tunnel you are running on Cheaha (instructions on how to create and use a tunnel are noted in the link above).
  3. On the left side menu, click the “Extensions” icon. The icon appears as four squares.
  4. In the search bar of the Extensions view, type “Nextflow”.
  5. Locate the “Nextflow” search result. It’s named “Nextflow” and is published by “Nextflow” (which you can tell by the blue checkmark next to the publisher).
  6. Click the result and click the “Install” button located at the top of the Nextflow extension page.