Introduction to Data Mining (NEW)

Date & Time
14 October, 21 October, 28 October, 4 November, 11 November, 18 November (Saturdays, 10am to 5pm)
25 November 2023 (Saturday, 10am to 1pm)

Venue/Delivery mode
Live online sessions conducted by NTU Faculty 

Fees and Funding
Standard Programme Fee: SGD 4,860.00
SGD 1,458.00 - Singapore Citizens (SCs) and Permanent Residents (PRs) (Up to 70% funding)
SGD 558.00 - SCs aged ≥ 40 years old & SkillsFuture Mid-career Enhanced Subsidy (MCES) (Up to 90% funding)

*NTU/NIE alumni may utilise their $1,600 Alumni Course Credits


Data mining, also called knowledge discovery in databases, is the process of extracting useful and actionable information from large accumulations of data.  It has been applied in many fields, e.g., finance, retail, telecommunications, and fraud detection.  Data mining is important because many organisations are looking to extract value from their data.  While small datasets can be analysed with Microsoft Excel, massive datasets require the use of sophisticated tools and techniques.  This course is designed to teach you these specialised tools and techniques.

The technical knowledge and skills you will learn in the course can be used to analyse any kind of data.  You will be taken through the broad steps of data mining.  First, you will perform an audit to understand and uncover the different types of data that exist in the organisation.  This step is important as it answers the question: What data are available for analysis?  Second, you will use different methods to collect and load the data into a data warehouse.  Third, you will explore the dataset to understand the granularity of the data and identify the important variables.  Fourth, you will clean and organise the data in a way that allows important questions to be answered.  Fifth, you will use tools like Python or R to uncover meaningful patterns and trends.  Finally, you will present the findings in a manner that makes sense to stakeholders.
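
To give a flavour of the third to fifth steps above (explore, clean, uncover patterns), here is a minimal Python sketch.  It is not taken from the course materials: the records, the column names (customer, segment, amount) and the helper functions are all hypothetical and exist only for illustration.  It uses the Python standard library so that it runs without any extra packages.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy records standing in for rows loaded from a data warehouse.
raw_rows = [
    {"customer": "A", "segment": "retail", "amount": "120.50"},
    {"customer": "B", "segment": "Retail ", "amount": ""},
    {"customer": "C", "segment": "telecom", "amount": "80.00"},
    {"customer": "D", "segment": "telecom", "amount": "100.00"},
]


def explore(rows):
    """Explore: count empty values per column as a first data-quality check."""
    missing = defaultdict(int)
    for row in rows:
        for column, value in row.items():
            if value.strip() == "":
                missing[column] += 1
    return dict(missing)


def clean(rows):
    """Clean: drop rows with no amount, normalise labels, parse numbers."""
    cleaned = []
    for row in rows:
        if row["amount"].strip() == "":
            continue  # skip records that cannot be analysed
        cleaned.append({
            "customer": row["customer"],
            "segment": row["segment"].strip().lower(),
            "amount": float(row["amount"]),
        })
    return cleaned


def average_by_segment(rows):
    """Uncover a simple pattern: the mean amount for each segment."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["segment"]].append(row["amount"])
    return {segment: mean(values) for segment, values in groups.items()}


missing_counts = explore(raw_rows)
clean_rows = clean(raw_rows)
summary = average_by_segment(clean_rows)

print("Missing values per column:", missing_counts)
print("Average amount per segment:", summary)
```

In practice, datasets of the size this course targets would more likely be handled with dedicated data analysis libraries, but the sequence of steps stays the same.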

Both predictive data mining (making predictions about the future) and descriptive data mining (describing the data and identifying patterns and relationships) will be covered.  The techniques you will be exposed to include classification (e.g., decision tree, nearest-neighbour classifier, Bayesian classifier, and SVM), association analysis (e.g., Apriori algorithm), cluster analysis (e.g., K-means, agglomerative hierarchical clustering, and DBSCAN) and anomaly detection (e.g., proximity-based outlier detection).  The focus will be on the application of these techniques to real-world datasets.
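
As a rough illustration of two of the techniques named above, the Python sketch below implements a nearest-neighbour classifier and a simple proximity-based outlier score.  It is an assumption-laden toy, not the course's own implementation: the data points, labels and function names are hypothetical, and the outlier score used here (distance to the k-th nearest neighbour) is only one of several possible proximity-based definitions.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k closest training points."""
    ranked = sorted(
        zip(train_points, train_labels),
        key=lambda pair: euclidean(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def knn_outlier_scores(points, k=2):
    """Score each point by the distance to its k-th nearest neighbour.

    A larger score means the point lies further from the rest of the data.
    """
    scores = []
    for i, p in enumerate(points):
        distances = sorted(
            euclidean(p, q) for j, q in enumerate(points) if j != i
        )
        scores.append(distances[k - 1])
    return scores


# Hypothetical toy data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["low", "low", "low", "high", "high", "high"]

# Predictive use: assign a label to a previously unseen point.
predicted = knn_predict(points, labels, (4.5, 4.7), k=3)

# Anomaly detection: append one far-away point and find the highest score.
scores = knn_outlier_scores(points + [(10.0, 0.0)], k=2)
most_anomalous = scores.index(max(scores))

print("Predicted label:", predicted)
print("Index of most anomalous point:", most_anomalous)
```

Both functions rest on the same distance measure, which is why the choice and scaling of variables matters for these methods.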

In this course, you will acquire the following skills & knowledge: 

  • Data mining concepts.
  • The sort of problems that data mining can be used to solve.
  • How to frame your problem in a way that it can be addressed using data.
  • The different data mining techniques, and the advantages and disadvantages of each.
  • How to apply data mining techniques to real datasets.

This course is suitable for:

  • People who analyse data as part of their job roles.
  • Data analytics and IT professionals.
  • Recent graduates who want to acquire data mining skills.