Galaxy for bioinformatics on GCP

Below are the slides from my talk for the GAME 2017 conference  on ‘Scaling Galaxy for GCP’ to be delivered in Feb 2017 at the University of Melbourne, Australia.  Galaxy is a bioinformatics tool used for genomic research.  A sample screen from Galaxy is shown below.

In this talk, I will show demos and patterns for scaling the Galaxy tool (and also for creating bioinformatics research data pipelines in general) via the Google Cloud Platform.

Patterns include the following:

My particular area of interest is in the application of the results of using genomic sequencing for personalized medicine for  cancer genomics This is the application of the results of the totality of DNA sequence and gene expression differences between (cancer) tumor cells and normal host cells.

Building any type of genomics pipelines is true big data work, with EACH whole genome sequencing result producing 2.9 Billion base pairs (T,A,G,C).

Google has an interesting genomic browser (shown above) that you can use on the reference genomic data that they host on GCP.