BS6220 - Spatial and Multi-omics Data Analytics and Machine Learning

Summary of course content

Data science and machine learning approaches have become indispensable for biological research. The biological “omics” field, which covers large-scale genomics, transcriptomics, proteomics and metabolomics, has been growing rapidly, creating many job opportunities worldwide. However, data analytics and machine learning skills suited to biological datasets remain in short supply. This course aims to equip students with the basic concepts and know-how of data analytics, machine learning, multi-omics and spatial-omics methods, with a specific focus on biological applications. It is especially geared towards students who are new to, or interested in, data science and machine learning. The course will guide each student to understand the basic principles, learn the fundamentals and apply the methods taught to real-life datasets. It does this through lectures by experts in their respective domains, covering cross-disciplinary content and giving students hands-on experience in basic machine learning and data analytics of large-scale omics datasets. Students will be required to demonstrate an understanding of the various approaches taught, and of why and when to use them.

 

Aims and objectives

1. You will learn to apply data science theories in a competitive real-world setting
2. You will learn about the various career paths available to data scientists
3. Through experiential learning in teams of 4-5, you will acquire the soft skills (e.g. teamwork, communication, leadership and grit) necessary for successful outcomes

 

Syllabus

  • Introduction to Multiomics, Spatial Omics and Synthetic Data 
  • Machine learning fundamentals 
  • Generative Models I (Sequential and Diffusion Models)
  • Generative Models II (GAN) 
  • Synthetic Omics Data (VAE and Differential Equation Models) 
  • Introduction and Basic Techniques to Spatial Omics 
  • Advanced Analytical Techniques in Spatial Omics 
  • Spatial Omics for Cancer Data Applications
  • Sparse Modelling Methods (Singular Value Decomposition (SVD), Sparse Principal Component Analysis (sPCA))
  • Integrative multi-omics analyses I (Canonical Correlation Analysis (CCA), Sparse CCA (SCCA), Multi-omics weighted SCCA (WSCCA))
  • Integrative multi-omics analyses II (MOMLIN and DIABLO) 
  • Machine Learning in Synthetic Biology 
  • Data Leakage and Experimental Design  

Assessment

Component                                                            Type        Weighting
Individual progress journal (reflection and technical organization)  Individual  40%
First draft model and results                                        Group       10%
1st Draft presentation                                               Group       15%
2nd presentation                                                     Group       15%
Final presentation                                                   Group       20%
Total                                                                            100%