MS0003: Introduction to Data Science and Artificial Intelligence
| Academic Units | 3 |
| Semester | 2 |
| Pre-requisite(s) | SC1003; BG2211; CH2107; CV1014; MS1008; MA1008; EE1005; RE1016 |
| Co-requisite(s) | Nil |
Course Instructors
| Associate Professor Kedar Hippalgaonkar | Assistant Professor Ng Wei Tat, Leonard |
Course AIMS
In today's era of Information, ‘Data’ is the new driving force, provided we know how to extract relevant ‘Intelligence’. This course will start with the core principles of Data Science, and will equip you with the basic tools and techniques of data handling, exploratory data analysis, data visualization, data-based inference, and data-focused communication. The course will also introduce you to the fundamentals of Artificial Intelligence: state space representation, uninformed search, and reinforcement learning.
The course will motivate you to work closely with data and make data-driven decisions in your field of study. The course will also touch upon ethical issues in Data Science and Artificial Intelligence, and motivate you to explore the cutting-edge applications in Materials Science related to Big Data, Neural Networks and Deep Learning. Python will be the language of choice to introduce hands-on computational techniques.
Intended Learning Outcomes
By the end of this course, you (as a student) should be able to:
- Identify and define data-oriented problems and data-driven decisions in real life.
- Discuss and illustrate the problems in terms of data exploration and visualization.
- Apply basic machine learning tools to extract inferential information from the data.
- Compose an engaging “data-story” to communicate the problem and the inference.
- Outline the roles and requirements of artificial intelligence in practical applications.
- Discuss and explain fundamentals of state space search and reinforcement learning.
Course Content
- Introduction - Data-Analytic Thinking. What is Data Science? – The core problems and solutions. Extracting Intelligence from Data – formulating problems.
- Basic Data Acquisition and Handling
- Basic Statistics and Exploratory Data Analysis
- Linear Regression
- Classification
- Clustering and Anomalies (CA1)
- Visualization - Clustering and Anomalies
- Neural Networks - Visualization
- Large Language Models - Neural Networks & Large Language Models
- Time Series Modeling (CA2)
- Introduction to Real-Life Datasets (Project Lab I)
- Strategies for improved performance on project datasets I (Project Lab II)
- Strategies for improved performance on project datasets II (Project Lab III)
Reading and References
There is no single textbook for the course. The following books and resources will be used as references, and notes will be provided if necessary.
- Python Data Science Handbook: Jake VanderPlas: O’Reilly (2016)
- An Introduction to Statistical Learning: James, Witten, Hastie, Tibshirani (2021)
Additional resources, if required, will be shared with you in the Lectures and Example Classes.

