BS6220 - Spatial and Multi-omics Data Analytics and Machine Learning
Summary of course content
Data science and machine learning approaches have become indispensable for biological research. The biological “omics” field, which covers large-scale genomics, transcriptomics, proteomics and metabolomics, has been growing rapidly, creating many job opportunities worldwide. However, there is a general shortage of data analytics and machine learning skills suited to biological datasets. This course aims to equip students with the basic concepts and know-how in data analytics, machine learning, multi-omics and spatial-omics methods, with a specific focus on biological applications. It is especially geared towards students who are new to, or interested in, data science and machine learning. The course will guide each student to understand the basic principles, learn the fundamentals and apply the methods taught to real-life datasets. It does this through lectures by experts in their respective domains, covering cross-disciplinary content, and through hands-on experience in basic machine learning and data analytics on large-scale omics datasets. Students will be required to demonstrate an understanding of the approaches taught, and of why and when to use each of them.
Aims and objectives
Syllabus
- Introduction to Multi-omics, Spatial Omics and Synthetic Data
- Machine learning fundamentals
- Generative Models I (Sequential and Diffusion Models)
- Generative Models II (GAN)
- Synthetic Omics Data (VAE and Differential Equation Models)
- Introduction to Spatial Omics and Basic Techniques
- Advanced Analytical Techniques in Spatial Omics
- Spatial Omics for Cancer Data Applications
- Sparse modelling methods (singular value decomposition (SVD), sparse principal component analysis (sPCA))
- Integrative multi-omics analyses I (Canonical correlation analysis (CCA), Sparse CCA (SCCA), Multi-omics weighted SCCA (WSCCA))
- Integrative multi-omics analyses II (MOMLIN and DIABLO)
- Machine Learning in Synthetic Biology
- Data Leakage and Experimental Design
Assessment
| Component | Type | Weighting |
| --- | --- | --- |
| Individual progress journal (reflection and technical organization) | Individual | 40% |
| First draft model and results | Group | 10% |
| First draft presentation | Group | 15% |
| Second presentation | Group | 15% |
| Final presentation | Group | 20% |
| Total | | 100% |