
‘Omics

The vast amount of data being produced in the life sciences means there are now several hundred web-accessible databases for researchers to navigate. Researchers need to understand how to select and extract reliable data from these complex, large-scale datasets.

These different ‘omics’ data types can only be analysed with specialized computational tools, so interdisciplinary projects in which various teams work together with bioinformaticians to answer health research questions are key. This is even more important when different ‘omics’ data types are integrated.

To work with bioinformaticians, researchers and health professionals need to learn the vocabulary and acquire the basic skills to communicate effectively and to better comprehend the crucial steps in the analysis of complex, large-scale data, allowing effective collaboration.

The ‘Omics pillar offers a series of high-quality workshops that are linked and integrated but can be taken separately and in any order, at two competency levels: beginner and applied. Most workshops are hands-on and include a ‘bring your own data’ component for analysis.

Summary of Modules

Arrows indicate the suggested flow of learning. All modules can be taken as and when required.

Using Spreadsheets: Now Live
Basic R: Now Live
Statistics with R: Now Live
Advanced R: Coming Soon
Unix and IT Infrastructure: Now Live
Sequence Analysis and BLAST: Now Live
Designing experiments for bioinformatic analysis: Coming Soon
Resources for biological data: Coming Soon
NGS Introduction and pre-processing pipeline: Now Live
NGS: RNA-Seq: Now Live
NGS: Single Cell RNA-Seq: Now Live
NGS: ATAC-Seq: Now Live
NGS on the Galaxy Platform: Now Live

Basic R with Data Carpentry

R is one of the most widely used programming languages in data science, used to analyse and visualize data, apply statistics and carry out machine learning analyses. This course is designed for participants with no programming experience and is the most beginner-friendly course within Innovation Scholars. We will use example questions related to health research to learn how to write ad hoc code to manipulate and visualize data using R packages designed for Big Data analyses.

This course is for you if:

You are a complete/near complete beginner to programming
You work in a health-related sector as clinical, administrative or academic staff, for example as a nurse, health administrator, consultant or biomedical researcher
You require a flexible learning format
You need to start using more data for your current or future role
You want to be able to perform basic tasks in R, such as making plots and sorting through tables, based on health data without getting too deep into the theory of coding

By completing this module you will meet 10 of the capability statements in the Artificial Intelligence (AI) and Digital Healthcare Technologies Capability Framework. This framework outlines the skills and capabilities that will allow health and care professionals to work effectively in a digitally enhanced environment.

The content in this module covers skills and capabilities in the following domains:

2.2 Remote consultation and monitoring (m)
5.1.1 Data collection (b,i)
5.1.3 Data Visualisation and Reporting (a,c,d,h)
5.1.4 Data Processing and Analytics (j,m,n)

Using spreadsheets for recording data and metadata

A good experimental design is the first step towards a successful bioinformatics analysis; the second is understanding how to record and organise data. Most life scientists (at any career stage) store data in spreadsheets, so this is where many research projects start and how data are delivered to the bioinformatician. Participants will prepare data for bioinformatics analyses and improve their data organization skills.

This course is for you if:

You are a beginner to using Excel or have experience but would like to use it more effectively
You work in a health-related sector as clinical, administrative or academic staff, for example as a nurse, health administrator, consultant or biomedical researcher
You have some knowledge of basic statistical terms and would like to apply them in Excel
You need to start recording data or using already recorded data for your current or future role
You require a flexible learning format

By completing this module you will meet 4 of the capability statements in the Artificial Intelligence (AI) and Digital Healthcare Technologies Capability Framework. This framework outlines the skills and capabilities that will allow health and care professionals to work effectively in a digitally enhanced environment.

The content in this module covers skills and capabilities in the following domains:

2.2 Remote consultation and monitoring (m)
5.1.1 Data collection (b,i)
5.1.4 Data Processing and Analytics (j)

Statistics with R

This course provides practical examples of how to perform statistical tests using the R software environment. We will explore the most widely used statistical tests and explain the basic concepts behind applying a statistical test in R, so that participants can apply their knowledge to other tests not covered in this course. A basic knowledge of statistical tests and of R and RStudio is essential.

This course is for you if:

You need to apply statistical tests on large datasets in your current or future role
You have a basic or higher level of programming skill in R and RStudio
You have a basic or higher understanding of statistics
You work in a health-related sector as clinical, administrative or academic staff, for example as a nurse, health administrator, consultant or biomedical researcher
You require a flexible learning format

By completing this module you will meet 2 of the capability statements in the Artificial Intelligence (AI) and Digital Healthcare Technologies Capability Framework. This framework outlines the skills and capabilities that will allow health and care professionals to work effectively in a digitally enhanced environment.

The content in this module covers skills and capabilities in the following domains:

5.1.4 Data processing and analytics (m,n)

NGS Introduction and pre-processing pipeline

In this introduction to Next Generation Sequencing (NGS) we will cover the key steps of the NGS pre-processing pipeline. You will gain a comprehensive understanding of how NGS works, how different types of NGS can answer different biological questions, and how raw data processing is common to the different platforms, with a particular focus on bulk RNA-seq and ATAC-seq.

This introductory module is a preparatory course for our RNA-seq, ATAC-seq and variant analysis using Galaxy courses, where we go through the specific analyses that follow the pre-processing steps.

This course is for you if:

You have basic knowledge of the R programming language, which may have been obtained via our Pillar 2: Basic R with Data Carpentry course
You work in a biomedical field and you want to learn how to pre-process NGS data
You require a flexible learning format
You want to gain a better understanding of what NGS is used for
You would like to learn how to carry out differential expression (RNA-seq) or differential accessibility (ATAC-seq) tests for bulk data

NGS: RNA-seq

Once you have completed the NGS Introduction and pre-processing pipeline module, you can take on RNA-seq. On this course you will gain a comprehensive understanding of how to analyse and visualise bulk RNA-seq data.

You can also learn ATAC-seq and variant analysis using Galaxy on our other courses.

This course is for you if:

You have basic knowledge of the R programming language, which may have been obtained via our Pillar 2: Basic R with Data Carpentry course
You work in a biomedical field
You require a flexible learning format
You need to start using RNA-seq data
You want to gain a better understanding of what information you can get from RNA-seq data
You would like to learn how to carry out differential expression analysis for bulk data

NGS: ATAC-seq

Once you have completed the NGS Introduction and pre-processing pipeline module, you can take on ATAC-seq. On this course you will gain a comprehensive understanding of how to analyse and visualise bulk ATAC-seq data.

You can also learn RNA-seq and variant analysis using Galaxy on our other courses.

This course is for you if:

You have basic knowledge of the R programming language, which may have been obtained via our Pillar 2: Basic R with Data Carpentry course
You work in a biomedical field
You require a flexible learning format
You need to start using ATAC-seq data
You want to gain a better understanding of what information you can get from ATAC-seq data
You would like to learn how to carry out differential accessibility analysis for bulk data

NGS on the Galaxy Platform

Once you have completed the NGS Introduction and pre-processing pipeline module, you can take on processing and analysing your NGS data using the Galaxy platform.

This course is designed for those without programming experience who want to process and analyse NGS data. In this course, we will use Galaxy, an open-source, web-based platform for accessible, reproducible, and transparent computational biomedical research. Although we cover only one NGS pipeline here (variant calling), this course will give students enough understanding of how Galaxy works to run bioinformatics tools easily, set up parameters and build larger workflows.

This course is for you if:

You are a complete/near complete beginner to bioinformatics
You are a student, researcher or professional seeking to enhance your skills in analyzing biological sequences and processing NGS data, and you don’t have access to a cluster
You want to be able to perform pre-processing and analysis of NGS data
You are interested in variant calling and annotation
You would like to learn how to carry out differential expression analysis for bulk data

Sequence Alignment and BLAST

The aim of this course is to provide participants with a comprehensive understanding of sequence alignment techniques, focusing on both BLAST and multiple sequence alignment. It will equip them with the skills necessary to analyze biological sequences, uncover evolutionary relationships, identify functional elements, and explore the wide range of applications in fields such as genomics, proteomics, and bioinformatics.

This course is for you if:

You are a complete/near complete beginner to bioinformatics
You are a student, researcher or professional seeking to enhance your skills in analyzing biological sequences, uncovering evolutionary relationships, and exploring the diverse applications of sequence alignment in areas such as genomics, proteomics, and bioinformatics
You want to be able to perform basic tasks in R, such as making plots and sorting through tables, based on health-data without getting too deep into the theory of coding

NGS: Single Cell RNA seq

Single-cell transcriptomics is a powerful tool for studying the heterogeneity of the cellular transcriptome at the single-cell level. This course aims to equip participants with the essential skills to process RNA-sequencing reads from single-cell transcriptomics experiments and to perform downstream analyses of the single-cell gene expression data. This course is designed for participants with some experience in R programming and running command-line programs. We will use published single-cell RNA-sequencing datasets to perform quality control, data normalisation, cell clustering, differential expression and trajectory analyses.

This course is for you if:

You have basic knowledge of using a command-line terminal and running commands in R
You work with, or are about to work with, single-cell transcriptomics experiments
You need to analyse your own single-cell transcriptomics dataset, or analyse publicly available datasets
You require a flexible learning format

Unix and IT Infrastructure

Knowledge of the Unix operating system is fundamental for handling Big Data and for understanding how the different analysis tools operate. Most bioinformatics pipelines involve standalone software that is run via the command line, taking raw data as input and producing tables that contain biological information (which can be imported into R). This module will introduce you to Unix and help you understand the underlying IT infrastructure of many of these command-line programs.

Advanced R

Closed: we are not currently accepting applications for this module.

Designing experiments for successful bioinformatics analysis

An appropriate experimental design is the key step for successful bioinformatics analyses, because it reduces the types and sources of variability. This module covers confounding batch effects, effect size, technical and biological replicates, controls, randomization, and next generation sequencing (NGS) parameters, including the choice of an appropriate number of reads and sequencing depth.

Closed: we are not currently accepting applications for this module.

Online Resources to access published datasets and basic biological data

The vast amount of published data stored in accessible databases represents a resource for data analysis, integration and validation, but it can be overwhelming. This workshop will show how databases are accessed and used, focusing on how to search, retrieve, analyse and interpret publicly available data and how to discriminate between different data types.

Closed: we are not currently accepting applications for this module.
